I wish ...\",\r\n \"points\": null,\r\n \"parent_id\": 27941108,\r\n \"story_id\": 27941108\r\n}\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 952189173, "label": "Use HN algolia endpoint to retrieve trees"}, "performed_via_github_app": null}
{"html_url": "https://github.com/dogsheep/hacker-news-to-sqlite/issues/3#issuecomment-886142671", "issue_url": "https://api.github.com/repos/dogsheep/hacker-news-to-sqlite/issues/3", "id": 886142671, "node_id": "IC_kwDODtX3eM400XbP", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-07-25T03:51:05Z", "updated_at": "2021-07-25T03:51:05Z", "author_association": "MEMBER", "body": "Prototype:\r\n\r\n curl 'https://hn.algolia.com/api/v1/items/27941108' \\\r\n | jq '[recurse(.children[]) | del(.children)]' \\\r\n | sqlite-utils insert hn.db items - --pk id\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 952189173, "label": "Use HN algolia endpoint to retrieve trees"}, "performed_via_github_app": null}
{"html_url": "https://github.com/dogsheep/hacker-news-to-sqlite/issues/2#issuecomment-886140431", "issue_url": "https://api.github.com/repos/dogsheep/hacker-news-to-sqlite/issues/2", "id": 886140431, "node_id": "IC_kwDODtX3eM400W4P", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-07-25T03:12:57Z", "updated_at": "2021-07-25T03:12:57Z", "author_association": "MEMBER", "body": "I'm going to build a general-purpose `hacker-new-to-sqlite search ...` command, where one of the options is to search within the URL.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 952179830, "label": "Command for fetching Hacker News threads from the search API"}, "performed_via_github_app": null}
{"html_url": "https://github.com/dogsheep/hacker-news-to-sqlite/issues/2#issuecomment-886136224", "issue_url": "https://api.github.com/repos/dogsheep/hacker-news-to-sqlite/issues/2", "id": 886136224, "node_id": "IC_kwDODtX3eM400V2g", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-07-25T02:08:29Z", "updated_at": "2021-07-25T02:08:29Z", "author_association": "MEMBER", "body": "Prototype:\r\n\r\n curl \"https://hn.algolia.com/api/v1/search_by_date?query=simonwillison.net&restrictSearchableAttributes=url&hitsPerPage=1000\" | \\\r\n jq .hits | sqlite-utils insert hn.db items - --pk objectID --alter", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 952179830, "label": "Command for fetching Hacker News threads from the search API"}, "performed_via_github_app": null}
{"html_url": "https://github.com/dogsheep/hacker-news-to-sqlite/issues/2#issuecomment-886135922", "issue_url": "https://api.github.com/repos/dogsheep/hacker-news-to-sqlite/issues/2", "id": 886135922, "node_id": "IC_kwDODtX3eM400Vxy", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-07-25T02:06:20Z", "updated_at": "2021-07-25T02:06:20Z", "author_association": "MEMBER", "body": "https://hn.algolia.com/api/v1/search_by_date?query=simonwillison.net&restrictSearchableAttributes=url looks like it does what I want.\r\n\r\nhttps://hn.algolia.com/api/v1/search_by_date?query=simonwillison.net&restrictSearchableAttributes=url&hitsPerPage=1000 - returns 1000 at once.\r\n\r\nOtherwise you have to paginate using `&page=2` etc - up to `nbPages` pages.\r\n\r\nhttps://www.algolia.com/doc/api-reference/api-parameters/hitsPerPage/ says 1000 is the maximum.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 952179830, "label": "Command for fetching Hacker News threads from the search API"}, "performed_via_github_app": null}
{"html_url": "https://github.com/dogsheep/hacker-news-to-sqlite/issues/2#issuecomment-886135562", "issue_url": "https://api.github.com/repos/dogsheep/hacker-news-to-sqlite/issues/2", "id": 886135562, "node_id": "IC_kwDODtX3eM400VsK", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-07-25T02:01:11Z", "updated_at": "2021-07-25T02:01:11Z", "author_association": "MEMBER", "body": "That page doesn't have an API but does look easy to scrape.\r\n\r\nThe other option here is the HN Search API powered by Algolia, documented at https://hn.algolia.com/api", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 952179830, "label": "Command for fetching Hacker News threads from the search API"}, "performed_via_github_app": null}
{"html_url": "https://github.com/dogsheep/healthkit-to-sqlite/issues/12#issuecomment-879477586", "issue_url": "https://api.github.com/repos/dogsheep/healthkit-to-sqlite/issues/12", "id": 879477586, "node_id": "MDEyOklzc3VlQ29tbWVudDg3OTQ3NzU4Ng==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-07-13T23:50:06Z", "updated_at": "2021-07-13T23:50:06Z", "author_association": "MEMBER", "body": "Unfortunately I don't think updating the database is practical, because the export doesn't include unique identifiers which can be used to update existing records and create new ones. Recreating from scratch works around that limitation.\r\n\r\nI've not explored workouts with SpatiaLite but that's a really good idea.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 727848625, "label": "Some workout columns should be float, not text"}, "performed_via_github_app": null}
{"html_url": "https://github.com/dogsheep/github-to-sqlite/issues/64#issuecomment-861042050", "issue_url": "https://api.github.com/repos/dogsheep/github-to-sqlite/issues/64", "id": 861042050, "node_id": "MDEyOklzc3VlQ29tbWVudDg2MTA0MjA1MA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-06-14T22:45:42Z", "updated_at": "2021-06-14T22:45:42Z", "author_association": "MEMBER", "body": "I'm definitely interested in supporting events in this tool - see #14.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 920636216, "label": "feature: support \"events\""}, "performed_via_github_app": null}
{"html_url": "https://github.com/dogsheep/github-to-sqlite/issues/64#issuecomment-861041597", "issue_url": "https://api.github.com/repos/dogsheep/github-to-sqlite/issues/64", "id": 861041597, "node_id": "MDEyOklzc3VlQ29tbWVudDg2MTA0MTU5Nw==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-06-14T22:44:54Z", "updated_at": "2021-06-14T22:44:54Z", "author_association": "MEMBER", "body": "Have you found a way to access events in GraphQL? I can only see way to access a timeline of events for a single issue or a single pull request. See also https://github.community/t/get-event-equivalent-for-v4/13600/2", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 920636216, "label": "feature: support \"events\""}, "performed_via_github_app": null}
{"html_url": "https://github.com/dogsheep/github-to-sqlite/pull/59#issuecomment-844250232", "issue_url": "https://api.github.com/repos/dogsheep/github-to-sqlite/issues/59", "id": 844250232, "node_id": "MDEyOklzc3VlQ29tbWVudDg0NDI1MDIzMg==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-05-19T16:08:10Z", "updated_at": "2021-05-19T16:08:10Z", "author_association": "MEMBER", "body": "Thanks for catching this.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 771872303, "label": "Remove unneeded exists=True for -a/--auth flag."}, "performed_via_github_app": null}
{"html_url": "https://github.com/dogsheep/github-to-sqlite/pull/61#issuecomment-844249385", "issue_url": "https://api.github.com/repos/dogsheep/github-to-sqlite/issues/61", "id": 844249385, "node_id": "MDEyOklzc3VlQ29tbWVudDg0NDI0OTM4NQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-05-19T16:07:06Z", "updated_at": "2021-05-19T16:07:06Z", "author_association": "MEMBER", "body": "Thanks!", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 797108702, "label": "fixing typo in get cli help text"}, "performed_via_github_app": null}
{"html_url": "https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790695126", "issue_url": "https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5", "id": 790695126, "node_id": "MDEyOklzc3VlQ29tbWVudDc5MDY5NTEyNg==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-03-04T15:20:42Z", "updated_at": "2021-03-04T15:20:42Z", "author_association": "MEMBER", "body": "I'm not sure why but my most recent import, when displayed in Datasette, looks like this:\r\n\r\n\r\n\r\nSorting by `id` in the opposite order gives me the data I would expect - so it looks like a bunch of null/blank messages are being imported at some point and showing up first due to ID ordering.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 813880401, "label": "WIP: Add Gmail takeout mbox import"}, "performed_via_github_app": null}
{"html_url": "https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790693674", "issue_url": "https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5", "id": 790693674, "node_id": "MDEyOklzc3VlQ29tbWVudDc5MDY5MzY3NA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-03-04T15:18:36Z", "updated_at": "2021-03-04T15:18:36Z", "author_association": "MEMBER", "body": "I imported my 10GB mbox with 750,000 emails in it, ran this tool (with a hacked fix for the blob column problem) - and now a search that returns 92 results takes 25.37ms! This is fantastic.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 813880401, "label": "WIP: Add Gmail takeout mbox import"}, "performed_via_github_app": null}
{"html_url": "https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790669767", "issue_url": "https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5", "id": 790669767, "node_id": "MDEyOklzc3VlQ29tbWVudDc5MDY2OTc2Nw==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-03-04T14:46:06Z", "updated_at": "2021-03-04T14:46:06Z", "author_association": "MEMBER", "body": "Solution could be to pre-process that string by splitting on `(` and dropping everything afterwards, assuming that the `(...)` bit isn't necessary for correctly parsing the date.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 813880401, "label": "WIP: Add Gmail takeout mbox import"}, "performed_via_github_app": null}
{"html_url": "https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790668263", "issue_url": "https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5", "id": 790668263, "node_id": "MDEyOklzc3VlQ29tbWVudDc5MDY2ODI2Mw==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-03-04T14:43:58Z", "updated_at": "2021-03-04T14:43:58Z", "author_association": "MEMBER", "body": "I added this code to output a message ID on errors:\r\n```diff\r\n print(\"Errors: {}\".format(num_errors))\r\n print(traceback.format_exc())\r\n+ print(\"Message-Id: {}\".format(email.get(\"Message-Id\", \"None\")))\r\n continue\r\n```\r\nHaving found a message ID that had an error, I ran this command to see the context:\r\n\r\n rg --text --context 20 '44F289B0.000001.02100@SCHWARZE-DWFXMI' ~/gmail.mbox\r\n\r\nThis was for the following error:\r\n```\r\n File \"/Users/simon/Dropbox/Development/google-takeout-to-sqlite/google_takeout_to_sqlite/utils.py\", line 102, in get_mbox\r\n message[\"date\"] = get_message_date(email.get(\"Date\"), email.get_from())\r\n File \"/Users/simon/Dropbox/Development/google-takeout-to-sqlite/google_takeout_to_sqlite/utils.py\", line 178, in get_message_date\r\n datetime_tuple = email.utils.parsedate_tz(mail_date)\r\n File \"/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.7/lib/python3.7/email/_parseaddr.py\", line 50, in parsedate_tz\r\n res = _parsedate_tz(data)\r\n File \"/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.7/lib/python3.7/email/_parseaddr.py\", line 69, in _parsedate_tz\r\n data = data.split()\r\nAttributeError: 'Header' object has no attribute 'split'\r\n```\r\nHere's what I spotted in the `ripgrep` output:\r\n```\r\n177133570:Message-Id: <44F289B0.000001.02100@SCHWARZE-DWFXMI>\r\n177133571-Date: Mon, 28 Aug 2006 08:14:08 +0200 (Westeurop\ufffdische Sommerzeit)\r\n177133572-X-Mailer: IncrediMail (5002253)\r\n```\r\nSo it could it be that `_parsedate_tz` is having trouble with that `Mon, 28 Aug 2006 08:14:08 +0200 (Westeurop\ufffdische Sommerzeit)` string.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 813880401, "label": "WIP: Add Gmail takeout mbox import"}, "performed_via_github_app": null}
{"html_url": "https://github.com/dogsheep/google-takeout-to-sqlite/issues/6#issuecomment-790384087", "issue_url": "https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/6", "id": 790384087, "node_id": "MDEyOklzc3VlQ29tbWVudDc5MDM4NDA4Nw==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-03-04T07:22:51Z", "updated_at": "2021-03-04T07:22:51Z", "author_association": "MEMBER", "body": "#3 also mentions the conflicting version with other tools.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 821841046, "label": "Upgrade to latest sqlite-utils"}, "performed_via_github_app": null}
{"html_url": "https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790380839", "issue_url": "https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5", "id": 790380839, "node_id": "MDEyOklzc3VlQ29tbWVudDc5MDM4MDgzOQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-03-04T07:17:05Z", "updated_at": "2021-03-04T07:17:05Z", "author_association": "MEMBER", "body": "Looks like you're doing this:\r\n```python\r\n elif message.get_content_type() == \"text/plain\":\r\n body = message.get_payload(decode=True)\r\n```\r\nSo presumably that decodes to a unicode string?\r\n\r\nI imagine the reason the column is a `BLOB` for me is that `sqlite-utils` determines the column type based on the first batch of items - https://github.com/simonw/sqlite-utils/blob/09c3386f55f766b135b6a1c00295646c4ae29bec/sqlite_utils/db.py#L1927-L1928 - and I got unlucky and had something in my first batch that wasn't a unicode string.\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 813880401, "label": "WIP: Add Gmail takeout mbox import"}, "performed_via_github_app": null}
{"html_url": "https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790379629", "issue_url": "https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5", "id": 790379629, "node_id": "MDEyOklzc3VlQ29tbWVudDc5MDM3OTYyOQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-03-04T07:14:41Z", "updated_at": "2021-03-04T07:14:41Z", "author_association": "MEMBER", "body": "Confirmed: removing the `len()` call does not speed things up, so it's reading through the entire file for some other purpose too.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 813880401, "label": "WIP: Add Gmail takeout mbox import"}, "performed_via_github_app": null}
{"html_url": "https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790378658", "issue_url": "https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5", "id": 790378658, "node_id": "MDEyOklzc3VlQ29tbWVudDc5MDM3ODY1OA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-03-04T07:12:48Z", "updated_at": "2021-03-04T07:12:48Z", "author_association": "MEMBER", "body": "It looks like the `body` is being loaded into a BLOB column - so in Datasette default it looks like this:\r\n\r\n\r\n\r\nIf I `datasette install datasette-render-binary` and then try again I get this:\r\n\r\n\r\n\r\nIt would be great if we could store the `body` as unicode text instead. May have to do something clever to decode it based on some kind of charset header?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 813880401, "label": "WIP: Add Gmail takeout mbox import"}, "performed_via_github_app": null}
{"html_url": "https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790373024", "issue_url": "https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5", "id": 790373024, "node_id": "MDEyOklzc3VlQ29tbWVudDc5MDM3MzAyNA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-03-04T07:01:58Z", "updated_at": "2021-03-04T07:04:06Z", "author_association": "MEMBER", "body": "I got 9 warnings that look like this:\r\n```\r\nErrors: 1\r\nTraceback (most recent call last):\r\n File \"/Users/simon/Dropbox/Development/google-takeout-to-sqlite/google_takeout_to_sqlite/utils.py\", line 103, in get_mbox\r\n message[\"date\"] = get_message_date(email.get(\"Date\"), email.get_from())\r\n File \"/Users/simon/Dropbox/Development/google-takeout-to-sqlite/google_takeout_to_sqlite/utils.py\", line 167, in get_message_date\r\n datetime_tuple = email.utils.parsedate_tz(mail_date)\r\n File \"/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.7/lib/python3.7/email/_parseaddr.py\", line 50, in parsedate_tz\r\n res = _parsedate_tz(data)\r\n File \"/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.7/lib/python3.7/email/_parseaddr.py\", line 69, in _parsedate_tz\r\n data = data.split()\r\nAttributeError: 'Header' object has no attribute 'split'\r\n```\r\nIt would be useful if those warnings told me the message ID (or similar) of the affected message so I could grep for it in the `mbox` and see what was going on.\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 813880401, "label": "WIP: Add Gmail takeout mbox import"}, "performed_via_github_app": null}
{"html_url": "https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790372621", "issue_url": "https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5", "id": 790372621, "node_id": "MDEyOklzc3VlQ29tbWVudDc5MDM3MjYyMQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-03-04T07:01:18Z", "updated_at": "2021-03-04T07:01:18Z", "author_association": "MEMBER", "body": "I'm not sure if it would work, but there is an alternative pattern for showing a progress bar against a really large file that I've used in `healthkit-to-sqlite` - you set the progress bar size to the size of the file in bytes, then update a counter as you read the file.\r\n\r\nhttps://github.com/dogsheep/healthkit-to-sqlite/blob/3eb2b06bfe3b4faaf10e9cf9dfcb28e3d16c14ff/healthkit_to_sqlite/cli.py#L24-L57 and https://github.com/dogsheep/healthkit-to-sqlite/blob/3eb2b06bfe3b4faaf10e9cf9dfcb28e3d16c14ff/healthkit_to_sqlite/utils.py#L4-L19 (the `progress_callback()` bit) is where that happens.\r\n\r\nIt can be a bit of a convoluted pattern, and I'm not at all sure it would work for `mbox` files since it looks like that library has other reasons it needs to do a file scan rather than streaming it through one chunk of bytes at a time. So I imagine this would not work here.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 813880401, "label": "WIP: Add Gmail takeout mbox import"}, "performed_via_github_app": null}
{"html_url": "https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790370485", "issue_url": "https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5", "id": 790370485, "node_id": "MDEyOklzc3VlQ29tbWVudDc5MDM3MDQ4NQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-03-04T06:57:25Z", "updated_at": "2021-03-04T06:57:48Z", "author_association": "MEMBER", "body": "The command takes quite a while to start running, presumably because this line causes it to have to scan the WHOLE file in order to generate a count:\r\n\r\nhttps://github.com/dogsheep/google-takeout-to-sqlite/blob/a3de045eba0fae4b309da21aa3119102b0efc576/google_takeout_to_sqlite/utils.py#L66-L67\r\n\r\nI'm fine with waiting though. It's not like this is a command people run every day - and without that count we can't show a progress bar, which seems pretty important for a process that takes this long.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 813880401, "label": "WIP: Add Gmail takeout mbox import"}, "performed_via_github_app": null}
{"html_url": "https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790369076", "issue_url": "https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5", "id": 790369076, "node_id": "MDEyOklzc3VlQ29tbWVudDc5MDM2OTA3Ng==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-03-04T06:54:46Z", "updated_at": "2021-03-04T06:54:46Z", "author_association": "MEMBER", "body": "The Rich-powered progress bar is pretty:\r\n\r\n![rich](https://user-images.githubusercontent.com/9599/109923307-71f69200-7c73-11eb-9ee2-8f0a240f3994.gif)\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 813880401, "label": "WIP: Add Gmail takeout mbox import"}, "performed_via_github_app": null}
{"html_url": "https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790312268", "issue_url": "https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5", "id": 790312268, "node_id": "MDEyOklzc3VlQ29tbWVudDc5MDMxMjI2OA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-03-04T05:48:16Z", "updated_at": "2021-03-04T05:48:16Z", "author_association": "MEMBER", "body": "Wow, my mbox is a 10.35 GB download!", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 813880401, "label": "WIP: Add Gmail takeout mbox import"}, "performed_via_github_app": null}
{"html_url": "https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-786925280", "issue_url": "https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5", "id": 786925280, "node_id": "MDEyOklzc3VlQ29tbWVudDc4NjkyNTI4MA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-26T22:23:10Z", "updated_at": "2021-02-26T22:23:10Z", "author_association": "MEMBER", "body": "Thanks!\r\n\r\nI requested my Gmail export from takeout - once that arrives I'll test it against this and then merge the PR.", "reactions": "{\"total_count\": 1, \"+1\": 1, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 813880401, "label": "WIP: Add Gmail takeout mbox import"}, "performed_via_github_app": null}
{"html_url": "https://github.com/dogsheep/evernote-to-sqlite/pull/10#issuecomment-777839351", "issue_url": "https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/10", "id": 777839351, "node_id": "MDEyOklzc3VlQ29tbWVudDc3NzgzOTM1MQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-11T22:37:55Z", "updated_at": "2021-02-11T22:37:55Z", "author_association": "MEMBER", "body": "I've merged these changes by hand now, thanks!", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 770712149, "label": "BugFix for encoding and not update info."}, "performed_via_github_app": null}
{"html_url": "https://github.com/dogsheep/evernote-to-sqlite/issues/7#issuecomment-777827396", "issue_url": "https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/7", "id": 777827396, "node_id": "MDEyOklzc3VlQ29tbWVudDc3NzgyNzM5Ng==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-11T22:13:14Z", "updated_at": "2021-02-11T22:13:14Z", "author_association": "MEMBER", "body": "My best guess is that you have an older version of `sqlite-utils` installed here - the `replace=True` argument was added in version 2.0. I've bumped the dependency in `setup.py`.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 743297582, "label": "evernote-to-sqlite on windows 10 give this error: TypeError: insert() got an unexpected keyword argument 'replace'"}, "performed_via_github_app": null}
{"html_url": "https://github.com/dogsheep/evernote-to-sqlite/issues/9#issuecomment-777821383", "issue_url": "https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/9", "id": 777821383, "node_id": "MDEyOklzc3VlQ29tbWVudDc3NzgyMTM4Mw==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-11T22:01:28Z", "updated_at": "2021-02-11T22:01:28Z", "author_association": "MEMBER", "body": "Aha! I think I've figured out what's going on here.\r\n\r\nThe CData blocks containing the notes look like this:\r\n\r\n`