### Addition process
* Paste the new export in as a new folder
* Manually scan through the _folders_ and delete any long lists of files. For example, with Facebook chats, we don't need 100 chat thread `.json` files
* Run `util/scrub.ts` inside that folder
* Fix any warnings from the script that call for manual edits
* Manually blank out any file/folder names that contain IDs or PII
* Manually scan through each file for any remaining PII
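
The final manual scan can be supported by a small helper that flags suspicious strings for review. The following is a minimal sketch, not part of `util/scrub.ts`: the function name `findPossiblePii`, the `PiiHit` type, and both patterns are assumptions made for illustration, and a clean result does not replace reading the files.

```typescript
// Hypothetical helper (not part of util/scrub.ts): flags strings that look
// like PII so a human can review them. Both patterns are illustrative guesses.
interface PiiHit {
  kind: "email" | "long-number";
  match: string;
  line: number; // 1-based line number within the scanned text
}

const PATTERNS: Array<{ kind: PiiHit["kind"]; regex: RegExp }> = [
  { kind: "email", regex: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g },
  // Runs of 9 or more digits are often account IDs or phone numbers.
  { kind: "long-number", regex: /\d{9,}/g },
];

function findPossiblePii(text: string): PiiHit[] {
  const hits: PiiHit[] = [];
  text.split(/\r?\n/).forEach((lineText, index) => {
    for (const { kind, regex } of PATTERNS) {
      // With the `g` flag, String.prototype.match returns every match or null.
      const matches = lineText.match(regex) ?? [];
      for (const match of matches) {
        hits.push({ kind, match, line: index + 1 });
      }
    }
  });
  return hits;
}
```

Calling `findPossiblePii` on the contents of each fixture file and printing the hits gives a short list of lines to inspect first. Expect false positives such as timestamps, which is acceptable for a review aid.
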
### Corpus
* `facebook-json-2021-05-01` - Facebook JSON export
* `facebook-json-2025-11-29` - Facebook JSON export