# discord-json-2021-01
## Manual edits
* images -> placeholders
  * `accounts/avatar.png`
* manually scrub folder names
  * `account/applications/0000000000000`
## Notes about files
* `activity/`
  * All the `.json` files are NDJSON, so some JSON tools don't like them
  * _Massive_ files. They hang `scrub.ts` for a long, long time (had to run these piecemeal)
  * These files also have an _incredible_ number of shapes and variance between them.
    * Instead of outputting all the shapes, I made a sort of "super-object" to capture the shape with `jq -n '[inputs] | add' events-2021-00000-of-00001.json.tmp > unique_shape.json` and then scrubbed `unique_shape.json`
* `messages/`
  * I did these by hand to keep all the ids the same
  * There are multiple types of chats: DMs, guild channels, etc.
  * I did the CSVs by hand as I have no scrubber for them
  * These are only **THE EXPORTING USER'S MESSAGES**, no other user's, just fyi
  * Ids in `messages.csv` are just the id of the message, not of any user
  * There is the potential to derive missing info from a channel via `@` tags sent or possibly via attachments. Maybe...
  * `11111111111111111`
    * This one has a shorter id (it's an older one)
    * Has `type: 0` but there's no guild information in `channel.json`
    * The user name was `null` in `index.json`
    * It's a really odd one
  * `222222222222222222`
    * This was a DM channel (said `direct message with xxx#7777` in `index.json`)
    * Has `type: 1` and there are two recipients (just the ids) in `channel.json`
    * Unfortunately that's all the info in the export
  * `333333333333333333`
    * This was a normal guild channel
    * `type: 0` and there's guild information in `channel.json`
    * I kept a good set of messages around from this one to show how attachments and other stuff work
    * The last message seemed to be a link, not an attachment. Links just seem to be normal text
* `programs/`
  * was empty...
* `servers/`
  * Info about _some_ of the guilds we have ids for
  * `guild.json` didn't really contain anything except the name
  * I kept around the only guild where I noticed an `audit-log.json` with info in it
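
For the `activity/` files, the `jq -n '[inputs] | add'` trick can be sketched in TypeScript roughly as below. This is only an illustration: `mergeNdjsonShapes` is a hypothetical helper (not something in `scrub.ts`), and the sample events are made up. Like `add` in `jq`, it does a shallow merge, so only top-level keys are unioned and later lines win when a key repeats.

```typescript
// Hypothetical helper, not part of scrub.ts: shallow-merges every NDJSON line
// into one "super-object", similar to jq's `[inputs] | add`.
function mergeNdjsonShapes(ndjson: string): Record<string, unknown> {
  let merged: Record<string, unknown> = {};
  for (const line of ndjson.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue; // skip blank lines, e.g. a trailing newline
    const parsed: unknown = JSON.parse(trimmed);
    if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
      throw new Error("expected one JSON object per line");
    }
    // Later lines overwrite earlier values for the same top-level key.
    merged = { ...merged, ...(parsed as Record<string, unknown>) };
  }
  return merged;
}

// Made-up sample events, not taken from a real export.
const sampleNdjson = [
  '{"event_type":"a","user_id":"1"}',
  '{"event_type":"b","ip":"0.0.0.0"}',
].join("\n");

const superObject = mergeNdjsonShapes(sampleNdjson);
console.log(Object.keys(superObject));
```

This version takes the whole file as one string, which is probably not what you want for the _massive_ files. Reading line by line (for example with Node's `readline`) would be the likely next step.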
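
The three `messages/` channels above suggest a rough way to tell channel kinds apart from `channel.json`. The sketch below is an assumption-heavy illustration: the field names `recipients` and `guild`, the `ChannelJson` interface, and `classifyChannel` are all guesses for the example, based only on the `type` values and the presence or absence of guild information noted above.

```typescript
// Assumed shape of channel.json; `recipients` and `guild` are guessed field names.
interface ChannelJson {
  id: string;
  type: number;
  recipients?: string[];
  guild?: { id: string; name?: string };
}

type ChannelKind = "dm" | "guild" | "guild-missing-info" | "unknown";

// Hypothetical classifier based only on the cases seen in this fixture.
function classifyChannel(channel: ChannelJson): ChannelKind {
  if (channel.type === 1) return "dm";
  if (channel.type === 0) {
    // type 0 without guild info is the "really odd one"
    return channel.guild ? "guild" : "guild-missing-info";
  }
  return "unknown";
}

// Placeholder values mirroring the three scrubbed channels.
const exampleChannels: ChannelJson[] = [
  { id: "11111111111111111", type: 0 },
  { id: "222222222222222222", type: 1, recipients: ["0", "1"] },
  { id: "333333333333333333", type: 0, guild: { id: "0", name: "placeholder" } },
];

for (const channel of exampleChannels) {
  console.log(channel.id, classifyChannel(channel));
}
```

Other `type` values may well exist in a real export. They fall through to `"unknown"` here rather than being guessed at.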