### Addition process
* Paste in a new folder
* Manually scan through _folders_ and delete any long lists of files. For example, with Facebook chats, we don't need 100 chat thread `.json` files
* Run `util/scrub.ts` inside that folder
* Fix any warnings from the previous step that require manual edits
* Manually blank out any file/folder names with IDs or PII
* Manually scan through each file for any remaining PII
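
The manual folder scan can be narrowed down with a small script. The following is a minimal sketch, not part of `util/scrub.ts`: `findCrowdedFolders` and its `maxFiles` threshold are hypothetical names chosen for illustration. It only reports folders that directly contain many files, so deciding what to delete stays a manual step.

```typescript
import * as fs from "fs";
import * as path from "path";

// Hypothetical helper (an assumption, not an existing util in this repo).
// Walks `root` and returns every folder that directly contains more than
// `maxFiles` regular files, largest first.
function findCrowdedFolders(
  root: string,
  maxFiles: number
): { folder: string; fileCount: number }[] {
  const crowded: { folder: string; fileCount: number }[] = [];
  const pending: string[] = [root];

  while (pending.length > 0) {
    const folder = pending.pop() as string;
    const entries = fs.readdirSync(folder, { withFileTypes: true });

    let fileCount = 0;
    for (const entry of entries) {
      if (entry.isDirectory()) {
        // Queue subfolders so the whole tree is visited.
        pending.push(path.join(folder, entry.name));
      } else if (entry.isFile()) {
        fileCount++;
      }
    }

    if (fileCount > maxFiles) {
      crowded.push({ folder, fileCount });
    }
  }

  return crowded.sort((a, b) => b.fileCount - a.fileCount);
}
```

For example, `findCrowdedFolders("facebook-json-2025-11-29", 20)` would list folders in that export holding more than 20 files each; the threshold of 20 is arbitrary and should be tuned per export.
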
### Corpus
* `facebook-json-2021-05-01` - Facebook JSON export
* `facebook-json-2025-11-29` - Facebook JSON export