Coordinates are arbitrary precision floats, so it is silly to compare, say, 35.320366666667 against 35.320367 and have them not be equal. I have yet to test this, but it should speed up importing duplicate location points since it will skip coordinates that are within about 1 meter of each other.
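A minimal sketch of the comparison described above, assuming coordinates are rounded to five decimal places (the precision mentioned elsewhere in this log) before being compared. The names `coordPrecision`, `roundCoord`, and `sameLocation` are hypothetical and are not taken from the codebase.

```go
package main

import (
	"fmt"
	"math"
)

// coordPrecision is the assumed number of decimal places kept when
// comparing coordinates; 5 places is roughly 1.1 meters of latitude.
const coordPrecision = 5

// roundCoord rounds a latitude or longitude to coordPrecision decimal places.
func roundCoord(v float64) float64 {
	scale := math.Pow(10, coordPrecision)
	return math.Round(v*scale) / scale
}

// sameLocation reports whether two points are equal after rounding.
func sameLocation(lat1, lon1, lat2, lon2 float64) bool {
	return roundCoord(lat1) == roundCoord(lat2) &&
		roundCoord(lon1) == roundCoord(lon2)
}

func main() {
	// The two latitudes from the example above round to the same value.
	fmt.Println(sameLocation(35.320366666667, -120.5, 35.320367, -120.5))
	// These differ in the fourth decimal place, so they stay distinct.
	fmt.Println(sameLocation(35.32036, -120.5, 35.32046, -120.5))
}
```

Rounding is cheap and works well for exact-duplicate detection, but two points that sit on either side of a rounding boundary can be less than a meter apart and still compare as different.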
- Somehow I totally forgot to relate sidecar motion photos in Google Photos. (They don't use sidecars on Google phones.)
- Item page now displays entities in the picture even without face coordinates
- Quick unit tests for a function related to Google Takeout archives
- We now combine existing metadata with new according to the update policy, instead of either writing all or none of incoming metadata. This merging happens before the DB update query and is a bit of a special case as the policy is applied per-key.
- Special handling for corrupted timestamp in Google Photos data. This is a singular case I haven't observed more of, but seems like a reasonable heuristic. There might be thousands more out there, who knows.
- Fix job creation time (milliseconds)
- Hopefully make repeated imports faster by skipping duplicate items more intelligently based on update policies.
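The per-key metadata merge mentioned in the list above could look roughly like the following. This is only a sketch: the `updatePolicy` type and its two values are hypothetical stand-ins, and the real update policies may have different names and more cases.

```go
package main

import "fmt"

// updatePolicy is a hypothetical stand-in for the app's update policies.
type updatePolicy int

const (
	keepExisting   updatePolicy = iota // an existing value wins if the key is already set
	preferIncoming                     // the incoming value wins whenever it is present
)

// mergeMetadata returns a new map that combines existing and incoming
// metadata, applying the policy to each key individually rather than
// accepting or rejecting the incoming metadata as a whole.
func mergeMetadata(existing, incoming map[string]any, policy updatePolicy) map[string]any {
	merged := make(map[string]any, len(existing)+len(incoming))
	for k, v := range existing {
		merged[k] = v
	}
	for k, v := range incoming {
		_, exists := merged[k]
		if !exists || policy == preferIncoming {
			merged[k] = v
		}
	}
	return merged
}

func main() {
	existing := map[string]any{"camera": "Pixel 7", "iso": 100}
	incoming := map[string]any{"iso": 200, "lens": "wide"}
	fmt.Println(mergeMetadata(existing, incoming, keepExisting))
	fmt.Println(mergeMetadata(existing, incoming, preferIncoming))
}
```

Under either policy, keys that exist on only one side survive the merge; the policy only decides conflicts.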
This is useful on the gallery page where we do NOT want to show motion pictures. We will also need to block motion pictures from being displayed as separate items on other UI views when they do show non-root items.
The sorting can help imports go faster if we put DB-heavy sources first, when the database is still small.
The data source names were also standardized to use snake_case like most other word-IDs in the app.
- Obfuscation mode enabled would set a fake phone number in smsbackuprestore's DS options, which led to bad data. Now, the UI does not auto-fill that value. But that means we need...
- SMS Backup & Restore: Phone number can now be inferred from the repo owner in the backend if the DS option phone number is empty. This works even with obfuscation enabled.
- Aborting a scheduled job before it starts now stays aborted. (Unless you manually restart it.)
- Added a data validation error modal for DS options on the import page. For now, if smsbackuprestore has no phone number set, and the timeline repo owner doesn't have a phone number, an error will be shown.
- Timestamp year cannot be > 9999 (JSON serialization panics)
- Lat/lon now considered equivalent after a certain number of decimal places, since not all sources have high precision (we choose 5 decimal places for now, or about 1.1 meters)
- Map style must be loaded before source is added, apparently (got this error once)
* Add WhatsApp importer
A first pass at importing WhatsApp chat exports.
Some open questions:
- Do we want to import context messages ("you deleted this message")?
- In WhatsApp it's possible to have groups with the same participants but a different group name. Is it possible to tag a conversation with a "group name" in Timelinize? If not, this may end up with different conversations being interleaved.
- Is it safe to assume the current location for timezone analysis on import? WhatsApp exports use timezoneless timestamps, which (I've confirmed manually) are just "what the time would have been where you are now" (for me, messages sent in summer are in BST, and in winter are GMT)
Annoying quirks of the export format we should find good ways to communicate to users:
- Any caption text sent with an attachment isn't exported by WhatsApp. (The text is lost and unavailable to Timelinize — I've opened a bug with Meta, for all the good that'll do)
- If there are silent members of a group chat, their presence isn't recorded in the data WhatsApp exports
Todo:
- I _think_ it's safe to assume there's only ever one attachment per message. This would change & simplify the way I parse attachment lines. I'll keep exploring my own exports to identify if this is reasonable.
* Include polls & locations in tests
Polls are currently ignored, but I'll move them to being imported as a message, or as some special datatype, after discussion.
* Add text formatting examples, and show they're not processed
* Fix lint issues
* WhatsApp: Add Retrieval keys to messages
The key on the message isn't perfect, as it'll change if the person exporting their chat history has changed the name of one of the participants between exports (this would mean that participant's name would be different between exports, and their retrieval key would be different).
This seems as close as we can get without exported IDs though.
(I can't find a good way to test that the retrieval key is set properly)
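To illustrate why a renamed participant changes the key, here is a sketch of one way such a key could be derived when no exported IDs exist. It is an assumption, not the importer's actual scheme: the function `retrievalKey`, its inputs, and the choice of SHA-256 are all hypothetical.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// retrievalKey derives a stable identifier from fields that appear in the
// export. Fields are joined with a NUL separator so that different splits
// of the same characters cannot collide.
func retrievalKey(timestamp, sender, text string) string {
	joined := strings.Join([]string{timestamp, sender, text}, "\x00")
	sum := sha256.Sum256([]byte(joined))
	return hex.EncodeToString(sum[:])
}

func main() {
	a := retrievalKey("01/02/2024, 10:15:00", "Alice", "See you at 6")
	b := retrievalKey("01/02/2024, 10:15:00", "Alice", "See you at 6")
	c := retrievalKey("01/02/2024, 10:15:00", "Alice (work)", "See you at 6")

	fmt.Println(a == b) // same inputs give the same key across exports
	fmt.Println(a == c) // a renamed contact gives a different key
}
```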
* WhatsApp: Polls, Locations, Metadata
- Correctly parses attachments (even those omitted because they were not available on the device that performed the export)
- Parses Polls (only in English, for now), including adding metadata for the Poll
- Extracts location metadata (Foursquare ID for named locations, or Lat/Long)
- Adds more test data to demonstrate other kinds of messages included in exports
* WhatsApp: Handle other locales
- 🤦‍♂️ The timestamp format changes based on the locale of the device performing the export, which makes it impossible to reliably tell DD/MM/YYYY dates from MM/DD/YYYY dates. This parser will assume a DD/MM/YYYY date if the last set of digits is 4 long. Perhaps we need an import option for "I'm using American dates"?
- Swaps the Poll scraping structure to allow for the localised words used when the exporting phone is set to other locales (e.g. OPCIÓN instead of OPTION)
- Added a chat line test fixture to illustrate this (though normally the entire file would only ever be in a single locale)
* WhatsApp: Correct Poll Structure & fix parsing
I had incorrect POLL lines in the test fixtures; this commit fixes them, and the importer so it can read them properly.
* Use snake case for datasource name
Co-authored-by: Matt Holt <mholt@users.noreply.github.com>
* WhatsApp: Be cautious with matching
Be slightly less confident with matching `_chat.txt` files as WhatsApp exports!
* WhatsApp: Fix lint errors
Fix magic number linting errors
* WhatsApp: swap metadata namespaces
Switch to using "Pin" instead of "Location" to more accurately describe what's being tagged with the metadata.
---------
Co-authored-by: Matt Holt <mholt@users.noreply.github.com>