github


html_url	issue_url	id	node_id	user	created_at	updated_at	author_association	body	reactions	issue
https://github.com/dogsheep/healthkit-to-sqlite/issues/12#issuecomment-879477586	https://api.github.com/repos/dogsheep/healthkit-to-sqlite/issues/12	879477586	MDEyOklzc3VlQ29tbWVudDg3OTQ3NzU4Ng==	9599	2021-07-13T23:50:06Z	2021-07-13T23:50:06Z	MEMBER	Unfortunately I don't think updating the database is practical, because the export doesn't include unique identifiers which can be used to update existing records and create new ones. Recreating from scratch works around that limitation. I've not explored workouts with SpatiaLite but that's a really good idea.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	727848625
https://github.com/dogsheep/github-to-sqlite/issues/64#issuecomment-861042050	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/64	861042050	MDEyOklzc3VlQ29tbWVudDg2MTA0MjA1MA==	9599	2021-06-14T22:45:42Z	2021-06-14T22:45:42Z	MEMBER	I'm definitely interested in supporting events in this tool - see #14.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	920636216
https://github.com/dogsheep/github-to-sqlite/issues/64#issuecomment-861041597	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/64	861041597	MDEyOklzc3VlQ29tbWVudDg2MTA0MTU5Nw==	9599	2021-06-14T22:44:54Z	2021-06-14T22:44:54Z	MEMBER	Have you found a way to access events in GraphQL? I can only see a way to access a timeline of events for a single issue or a single pull request. See also https://github.community/t/get-event-equivalent-for-v4/13600/2	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	920636216
https://github.com/dogsheep/github-to-sqlite/pull/59#issuecomment-844250232	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/59	844250232	MDEyOklzc3VlQ29tbWVudDg0NDI1MDIzMg==	9599	2021-05-19T16:08:10Z	2021-05-19T16:08:10Z	MEMBER	Thanks for catching this.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	771872303
https://github.com/dogsheep/github-to-sqlite/pull/61#issuecomment-844249385	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/61	844249385	MDEyOklzc3VlQ29tbWVudDg0NDI0OTM4NQ==	9599	2021-05-19T16:07:06Z	2021-05-19T16:07:06Z	MEMBER	Thanks!	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	797108702
https://github.com/dogsheep/dogsheep-photos/pull/29#issuecomment-739058820	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/29	739058820	MDEyOklzc3VlQ29tbWVudDczOTA1ODgyMA==	9599	2020-12-04T22:32:35Z	2020-12-04T22:32:35Z	MEMBER	Thanks for this!	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	638375985
https://github.com/dogsheep/github-to-sqlite/issues/53#issuecomment-735485677	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/53	735485677	MDEyOklzc3VlQ29tbWVudDczNTQ4NTY3Nw==	9599	2020-11-30T00:36:09Z	2020-11-30T00:36:09Z	MEMBER	Given rate limits (see #51) this command might be better implemented by running a `git clone` into a temporary directory - doing so would retrieve all of the files in one go.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	753000405
https://github.com/dogsheep/github-to-sqlite/issues/51#issuecomment-735484186	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/51	735484186	MDEyOklzc3VlQ29tbWVudDczNTQ4NDE4Ng==	9599	2020-11-30T00:29:31Z	2020-11-30T00:29:31Z	MEMBER	This just caused a failure in deploying the demo: https://github.com/dogsheep/github-to-sqlite/runs/1471304407?check_suite_focus=true ``` File "/opt/hostedtoolcache/Python/3.8.6/x64/bin/github-to-sqlite", line 33, in <module> sys.exit(load_entry_point('github-to-sqlite', 'console_scripts', 'github-to-sqlite')()) File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/click/core.py", line 829, in __call__ return self.main(args, kwargs) File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/click/core.py", line 782, in main rv = self.invoke(ctx) File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/click/core.py", line 1259, in invoke return _process_result(sub_ctx.command.invoke(sub_ctx)) File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/click/core.py", line 1066, in invoke return ctx.invoke(self.callback, ctx.params) File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/click/core.py", line 610, in invoke return callback(args, **kwargs) File "/home/runner/work/github-to-sqlite/github-to-sqlite/github_to_sqlite/cli.py", line 142, in issue_comments for comment in utils.fetch_issue_comments(repo, token, issue): File "/home/runner/work/github-to-sqlite/github-to-sqlite/github_to_sqlite/utils.py", line 380, in fetch_issue_comments for comments in paginate(url, headers): File "/home/runner/work/github-to-sqlite/github-to-sqlite/github_to_sqlite/utils.py", line 472, in paginate raise GitHubError.from_response(response) github_to_sqlite.utils.GitHubError: ('API rate limit exceeded for user ID 9599.', 403) Error: Process completed with exit code 1. 
```	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	703246031
https://github.com/dogsheep/github-to-sqlite/issues/46#issuecomment-735483820	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/46	735483820	MDEyOklzc3VlQ29tbWVudDczNTQ4MzgyMA==	9599	2020-11-30T00:27:47Z	2020-11-30T00:27:47Z	MEMBER	So it looks like anything that pulls reviews needs to pull each review, then for each one pull the comments. I'm going to consider this blocked on smarter rate limit handling in #51.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	664485022
https://github.com/dogsheep/github-to-sqlite/issues/46#issuecomment-735483604	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/46	735483604	MDEyOklzc3VlQ29tbWVudDczNTQ4MzYwNA==	9599	2020-11-30T00:26:50Z	2020-11-30T00:26:50Z	MEMBER	It seems like there's a lot missing from that - those aren't particularly interesting given the data that is returned. From the docs at https://docs.github.com/en/free-pro-team@latest/rest/reference/pulls#reviews it looks like each review consists of multiple comments, and the comments are where the useful material is - https://docs.github.com/en/free-pro-team@latest/rest/reference/pulls#list-comments-for-a-pull-request-review `github-to-sqlite get https://api.github.com/repos/dogsheep/github-to-sqlite/pulls/48/reviews/503368921/comments --accept 'application/vnd.github.v3+json'` ```json [ { "id": 500603838, "node_id": "MDI0OlB1bGxSZXF1ZXN0UmV2aWV3Q29tbWVudDUwMDYwMzgzOA==", "url": "https://api.github.com/repos/dogsheep/github-to-sqlite/pulls/comments/500603838", "pull_request_review_id": 503368921, "diff_hunk": "@@ -0,0 +1,370 @@\n+[\n+ {\n+ \"url\": \"https://api.github.com/repos/simonw/datasette/pulls/571\",\n+ \"id\": 313384926,\n+ \"node_id\": \"MDExOlB1bGxSZXF1ZXN0MzEzMzg0OTI2\",\n+ \"html_url\": \"https://github.com/simonw/datasette/pull/571\",\n+ \"diff_url\": \"https://github.com/simonw/datasette/pull/571.diff\",\n+ \"patch_url\": \"https://github.com/simonw/datasette/pull/571.patch\",\n+ \"issue_url\": \"https://api.github.com/repos/simonw/datasette/issues/571\",\n+ \"number\": 571,\n+ \"state\": \"closed\",\n+ \"locked\": false,\n+ \"title\": \"detect_fts now works with alternative table escaping\",\n+ \"user\": {\n+ \"login\": \"simonw\",\n+ \"id\": 9599,\n+ \"node_id\": \"MDQ6VXNlcjk1OTk=\",\n+ \"avatar_url\": \"https://avatars0.githubusercontent.com/u/9599?v=4\",\n+ \"gravatar_id\": \"\",\n+ \"url\": \"https://api.github.com/users/simonw\",\n+ \"html_url\": \"https://github.com/simonw\",\n+ \"followers_url\": 
\"https://api.github.com/users/simonw/followers\",\n+ \"following_ur…	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	664485022
https://github.com/dogsheep/github-to-sqlite/issues/46#issuecomment-735482546	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/46	735482546	MDEyOklzc3VlQ29tbWVudDczNTQ4MjU0Ng==	9599	2020-11-30T00:22:02Z	2020-11-30T00:22:02Z	MEMBER	As for reviews... here's the output of `github-to-sqlite get https://api.github.com/repos/dogsheep/github-to-sqlite/pulls/48/reviews --accept 'application/vnd.github.v3+json'` ```json [ { "id": 503368921, "node_id": "MDE3OlB1bGxSZXF1ZXN0UmV2aWV3NTAzMzY4OTIx", "user": { "login": "simonw", "id": 9599, "node_id": "MDQ6VXNlcjk1OTk=", "avatar_url": "https://avatars0.githubusercontent.com/u/9599?u=5968723deb1a55b82620e106f5ca58e9b11a0942&v=4", "gravatar_id": "", "url": "https://api.github.com/users/simonw", "html_url": "https://github.com/simonw", "followers_url": "https://api.github.com/users/simonw/followers", "following_url": "https://api.github.com/users/simonw/following{/other_user}", "gists_url": "https://api.github.com/users/simonw/gists{/gist_id}", "starred_url": "https://api.github.com/users/simonw/starred{/owner}{/repo}", "subscriptions_url": "https://api.github.com/users/simonw/subscriptions", "organizations_url": "https://api.github.com/users/simonw/orgs", "repos_url": "https://api.github.com/users/simonw/repos", "events_url": "https://api.github.com/users/simonw/events{/privacy}", "received_events_url": "https://api.github.com/users/simonw/received_events", "type": "User", "site_admin": false }, "body": "", "state": "CHANGES_REQUESTED", "html_url": "https://github.com/dogsheep/github-to-sqlite/pull/48#pullrequestreview-503368921", "pull_request_url": "https://api.github.com/repos/dogsheep/github-to-sqlite/pulls/48", "author_association": "MEMBER", "_links": { "html": { "href": "https://github.com/dogsheep/github-to-sqlite/pull/48#pullrequestreview-503368921" }, "pull_request": { "href": "https://api.github.c…	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 
0 }	664485022
https://github.com/dogsheep/github-to-sqlite/issues/46#issuecomment-735482187	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/46	735482187	MDEyOklzc3VlQ29tbWVudDczNTQ4MjE4Nw==	9599	2020-11-30T00:20:11Z	2020-11-30T00:20:11Z	MEMBER	Pull requests are now added, thanks to @adamjonas.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	664485022
https://github.com/dogsheep/github-to-sqlite/issues/54#issuecomment-735465708	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/54	735465708	MDEyOklzc3VlQ29tbWVudDczNTQ2NTcwOA==	9599	2020-11-29T22:08:46Z	2020-11-29T22:08:46Z	MEMBER	Demo: - https://github-to-sqlite.dogsheep.net/github/steps?_facet=repo - https://github-to-sqlite.dogsheep.net/github/workflows - https://github-to-sqlite.dogsheep.net/github/jobs	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	753026003
https://github.com/dogsheep/github-to-sqlite/issues/54#issuecomment-735464438	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/54	735464438	MDEyOklzc3VlQ29tbWVudDczNTQ2NDQzOA==	9599	2020-11-29T21:57:08Z	2020-11-29T21:57:08Z	MEMBER	Inspired by this tweet from Michael Heap https://twitter.com/mheap/status/1333108608817631238	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	753026003
https://github.com/dogsheep/github-to-sqlite/issues/54#issuecomment-735464493	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/54	735464493	MDEyOklzc3VlQ29tbWVudDczNTQ2NDQ5Mw==	9599	2020-11-29T21:57:32Z	2020-11-29T21:57:32Z	MEMBER	`$ github-to-sqlite workflows github.db simonw/datasette dogsheep/github-to-sqlite`	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	753026003
https://github.com/dogsheep/swarm-to-sqlite/issues/11#issuecomment-727692413	https://api.github.com/repos/dogsheep/swarm-to-sqlite/issues/11	727692413	MDEyOklzc3VlQ29tbWVudDcyNzY5MjQxMw==	9599	2020-11-16T02:15:22Z	2020-11-16T02:15:22Z	MEMBER	Thanks, I'll look into this.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	743400216
https://github.com/dogsheep/dogsheep-beta/issues/29#issuecomment-712266834	https://api.github.com/repos/dogsheep/dogsheep-beta/issues/29	712266834	MDEyOklzc3VlQ29tbWVudDcxMjI2NjgzNA==	9599	2020-10-19T16:01:23Z	2020-10-19T16:01:23Z	MEMBER	Might just be a documented pattern for how to configure this in YAML templates.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	724759588
https://github.com/dogsheep/github-to-sqlite/issues/50#issuecomment-711569063	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/50	711569063	MDEyOklzc3VlQ29tbWVudDcxMTU2OTA2Mw==	9599	2020-10-19T05:01:29Z	2020-10-19T05:01:29Z	MEMBER	Demo of `--accept`: github-to-sqlite get /repos/simonw/datasette/readme --accept 'application/vnd.github.VERSION.html'	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	703218756
https://github.com/dogsheep/dogsheep-beta/issues/28#issuecomment-711089647	https://api.github.com/repos/dogsheep/dogsheep-beta/issues/28	711089647	MDEyOklzc3VlQ29tbWVudDcxMTA4OTY0Nw==	9599	2020-10-17T22:43:13Z	2020-10-17T22:43:13Z	MEMBER	Since my personal Dogsheep uses Datasette authentication, I'm going to need to pass through cookies. https://github.com/simonw/datasette/issues/1020 will solve that in the future but for now I need to solve it explicitly.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	723861683
https://github.com/dogsheep/healthkit-to-sqlite/issues/11#issuecomment-711081703	https://api.github.com/repos/dogsheep/healthkit-to-sqlite/issues/11	711081703	MDEyOklzc3VlQ29tbWVudDcxMTA4MTcwMw==	9599	2020-10-17T21:18:35Z	2020-10-17T21:18:35Z	MEMBER	OK, if you upgrade to the just-released 1.0 this should work (it worked against my Spanish export).	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	723838331
https://github.com/dogsheep/healthkit-to-sqlite/issues/11#issuecomment-711079760	https://api.github.com/repos/dogsheep/healthkit-to-sqlite/issues/11	711079760	MDEyOklzc3VlQ29tbWVudDcxMTA3OTc2MA==	9599	2020-10-17T21:00:05Z	2020-10-17T21:00:05Z	MEMBER	Checking for either `<!DOCTYPE HealthData` or `<HealthData` in the first 1000 bytes should do it.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	723838331
https://github.com/dogsheep/healthkit-to-sqlite/issues/11#issuecomment-711079056	https://api.github.com/repos/dogsheep/healthkit-to-sqlite/issues/11	711079056	MDEyOklzc3VlQ29tbWVudDcxMTA3OTA1Ng==	9599	2020-10-17T20:53:00Z	2020-10-17T20:53:00Z	MEMBER	I think the safest thing is to sniff the first few lines of the file. Those should be the same no matter the language that was used: ```xml <?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE HealthData [ ```	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	723838331
https://github.com/dogsheep/healthkit-to-sqlite/issues/11#issuecomment-711078917	https://api.github.com/repos/dogsheep/healthkit-to-sqlite/issues/11	711078917	MDEyOklzc3VlQ29tbWVudDcxMTA3ODkxNw==	9599	2020-10-17T20:51:55Z	2020-10-17T20:52:03Z	MEMBER	I switched my phone to Spanish and ran an export - I got a file called `exportar.zip`. Unzipped I still got an `apple_health_export` folder but the root contained: ``` electrocardiograms/ export_cda.xml exportar.xml workout-routes/ ``` It looks like `export_cda.xml` does not have a translated name, so maybe I can ignore it and look for the _other_ `.xml` file in that directory.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	723838331
https://github.com/dogsheep/healthkit-to-sqlite/issues/11#issuecomment-711074306	https://api.github.com/repos/dogsheep/healthkit-to-sqlite/issues/11	711074306	MDEyOklzc3VlQ29tbWVudDcxMTA3NDMwNg==	9599	2020-10-17T20:16:22Z	2020-10-17T20:16:22Z	MEMBER	The "first XML file in the root" solution is probably easier though!	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	723838331
https://github.com/dogsheep/healthkit-to-sqlite/issues/11#issuecomment-711074031	https://api.github.com/repos/dogsheep/healthkit-to-sqlite/issues/11	711074031	MDEyOklzc3VlQ29tbWVudDcxMTA3NDAzMQ==	9599	2020-10-17T20:14:01Z	2020-10-17T20:14:01Z	MEMBER	I'd be happy to teach the tool to look for `export.xml` or `eksport.xml` - and then expand that list to other languages.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	723838331
https://github.com/dogsheep/swarm-to-sqlite/issues/8#issuecomment-707332912	https://api.github.com/repos/dogsheep/swarm-to-sqlite/issues/8	707332912	MDEyOklzc3VlQ29tbWVudDcwNzMzMjkxMg==	9599	2020-10-12T20:35:06Z	2020-10-12T20:35:06Z	MEMBER	Shipped a fix for this in [swarm-to-sqlite 0.3.2](https://github.com/dogsheep/swarm-to-sqlite/releases/tag/0.3.2).	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	648245071
https://github.com/dogsheep/evernote-to-sqlite/issues/5#issuecomment-706834800	https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/5	706834800	MDEyOklzc3VlQ29tbWVudDcwNjgzNDgwMA==	9599	2020-10-12T03:24:57Z	2020-10-16T20:16:28Z	MEMBER	Here's my first attempt at a plugin for this: ```python from datasette import hookimpl import jinja2 START = "<en-note" END = "</en-note>" TEMPLATE = """ <div style="max-width: 500px; white-space: normal; overflow-wrap: break-word;">{}</div> """.strip() EN_MEDIA_SCRIPT = """ Array.from(document.querySelectorAll('en-media')).forEach(el => { let hash = el.getAttribute('hash'); let type = el.getAttribute('type'); let path = `/evernote/resources_data/${hash}.json?_shape=array`; fetch(path).then(r => r.json()).then(rows => { let b64 = rows[0].data.encoded; let data = `data:${type};base64,${b64}`; el.innerHTML = `<img style="max-width: 300px" src="${data}">`; }); }); """ @hookimpl def render_cell(value, table): if not table: # Don't render content from arbitrary SQL queries, could be XSS hole return if not value or not isinstance(value, str): return value = value.strip() if value.startswith(START) and value.endswith(END): trimmed = value[len(START) : -len(END)] trimmed = trimmed.split(">", 1)[1] # Replace those horrible double newlines trimmed = trimmed.replace("<div><br /></div>", "<br>") return jinja2.Markup(TEMPLATE.format(trimmed)) @hookimpl def extra_body_script(): return EN_MEDIA_SCRIPT ``` It works! It does however demonstrate that Evernote's "clip this webpage" feature means there is a LOT of weird HTML that can get into a note. It looks like they've filtered out the scripts but I wouldn't bet on it - they certainly don't filter out many of the inline styles. So running Bleach is almost certainly a good idea.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	718938889
https://github.com/dogsheep/evernote-to-sqlite/issues/4#issuecomment-706786548	https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/4	706786548	MDEyOklzc3VlQ29tbWVudDcwNjc4NjU0OA==	9599	2020-10-11T23:39:46Z	2020-10-11T23:39:46Z	MEMBER	Should have used porter stemming for this.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	718938508
https://github.com/dogsheep/evernote-to-sqlite/issues/6#issuecomment-706785201	https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/6	706785201	MDEyOklzc3VlQ29tbWVudDcwNjc4NTIwMQ==	9599	2020-10-11T23:29:39Z	2020-10-11T23:29:39Z	MEMBER	It looks to me like each of those `<item>` blocks has a number of guesses in order of confidence: ```xml <item x="215" y="190" w="187" h="39"> <t w="57">wonders,</t> <t w="55">wanders,</t> <t w="52">wonders ?</t> <t w="45">wonders</t> <t w="42">wonders.</t> </item> ``` So maybe the best approach here is to just take the first `t` element within each `item`.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	718949182
https://github.com/dogsheep/evernote-to-sqlite/issues/6#issuecomment-706785086	https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/6	706785086	MDEyOklzc3VlQ29tbWVudDcwNjc4NTA4Ng==	9599	2020-10-11T23:28:50Z	2020-10-11T23:28:50Z	MEMBER	The XML for the OCR stuff is a bit weird. Currently I'm doing this to it: https://github.com/dogsheep/evernote-to-sqlite/blob/c33d7b043a45eb3e88676e5fa3ce31755199d9f8/evernote_to_sqlite/utils.py#L70-L78 This can produce some odd results, for example: > Sure 'Sure, 'Sure. Sure, Sure. sure sure. sure ? If you If Yau [you live jive In m 1n an area devoid of natural wonders, wanders, wonders ? wonders wonders. your mind will be blown, blown' blown. blown ? -e i ? ,1 IL it ? at ? KY ? fl ft bat at Which came from this image: ![image](https://user-images.githubusercontent.com/9599/95692952-5dd7c880-0bde-11eb-939a-d10b800a4105.png) The XML for that is: ```xml <recoIndex docType="unknown" objType="image" objID="05ffb72b307bf495f064243c7099d94f" engineVersion="6.5.17.7" recoType="service" lang="en" objWidth="1000" objHeight="1504"> <item x="68" y="75" w="104" h="37"> <t w="60">Sure</t> <t w="52">'Sure,</t> <t w="47">'Sure.</t> <t w="33">Sure,</t> <t w="26">Sure.</t> </item> <item x="182" y="83" w="92" h="26"> <t w="62">sure</t> <t w="58">sure.</t> <t w="46">sure ?</t> </item> <item x="69" y="132" w="107" h="45"> <t w="81">If you</t> <t w="64">If Yau</t> <t w="31">[you</t> </item> <item x="186" y="132" w="67" h="35"> <t w="85">live</t> <t w="51">jive</t> </item> <item x="263" y="140" w="36" h="27"> <t w="82">In</t> <t w="56">m</t> <t w="53">1n</t> </item> <item x="309" y="140" w="53" h="27"> <t w="82">an</t> </item> <item x="372" y="141" w="90" h="26"> <t w="94">area</t> </item> <item x="472" y="132" w="138" h="35"> <t w="85">devoid</t> </item> <item x="620" y="132" w="43" h="35"> <t w="82">of</t> </item> <item x="68" y="190" w="137" h="35"> <t w="87">natural</t> </item> <item x="215" y="190" w="187" h="39"> <t w="57">wonders,</t> <t 
w="55">wanders,</t> <t w="52">wonders ?</t> <t w="45">wonders</t> <t w="42">won…	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	718949182
https://github.com/dogsheep/evernote-to-sqlite/issues/4#issuecomment-706784028	https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/4	706784028	MDEyOklzc3VlQ29tbWVudDcwNjc4NDAyOA==	9599	2020-10-11T23:20:32Z	2020-10-11T23:20:32Z	MEMBER	I haven't done the FTS on OCR yet. I'm going to move that to another ticket because it requires more thought.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	718938508
https://github.com/dogsheep/evernote-to-sqlite/issues/5#issuecomment-706776808	https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/5	706776808	MDEyOklzc3VlQ29tbWVudDcwNjc3NjgwOA==	9599	2020-10-11T22:23:14Z	2020-10-11T22:23:14Z	MEMBER	... but it's still important to be able to get to the rendered note directly from the browse notes `/evernote/notes` page. Maybe use a simple `render_cell()` hook that just knows how to generate the link to the rendered note page?	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	718938889
https://github.com/dogsheep/evernote-to-sqlite/issues/5#issuecomment-706776680	https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/5	706776680	MDEyOklzc3VlQ29tbWVudDcwNjc3NjY4MA==	9599	2020-10-11T22:22:16Z	2020-10-11T22:22:16Z	MEMBER	Maybe the best way to do this is with a custom route, `/-/evernote/note-id` - that way I can clean the HTML and resolve the other things in the `<en-note>` structure without using `render_cell()` and the like. My concern about using `render_cell()` is that it could lead to weird security problems when combined with `?sql=` queries.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	718938889
https://github.com/dogsheep/evernote-to-sqlite/issues/5#issuecomment-706776447	https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/5	706776447	MDEyOklzc3VlQ29tbWVudDcwNjc3NjQ0Nw==	9599	2020-10-11T22:20:32Z	2020-10-11T22:20:32Z	MEMBER	Or... I could do this client-side. JavaScript that looks for `<en-media>` tags and fetches the data using `fetch()` wouldn't be too hard to write.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	718938889
https://github.com/dogsheep/evernote-to-sqlite/issues/5#issuecomment-706776242	https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/5	706776242	MDEyOklzc3VlQ29tbWVudDcwNjc3NjI0Mg==	9599	2020-10-11T22:18:30Z	2020-10-11T22:19:48Z	MEMBER	Alternatively, rather than relying on `datasette-media` this could base64-embed the images. `evernote-to-sqlite` could register itself as a Datasette plugin that knows how to do this. Maybe rename the column to `evernote_content` and register a render cell hook that knows how to rewrite those note bodies so that they are visible? Might need to feed them through Bleach too, just in case any nasty code can get into them.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	718938889
https://github.com/dogsheep/evernote-to-sqlite/issues/5#issuecomment-706776180	https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/5	706776180	MDEyOklzc3VlQ29tbWVudDcwNjc3NjE4MA==	9599	2020-10-11T22:17:55Z	2020-10-11T22:17:55Z	MEMBER	We could even do server-side thumbnailing for some of these images, but I'm inclined to serve up the full size ones and set a width on the image element based on the `width` attribute on `<en-media>`.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	718938889
https://github.com/dogsheep/evernote-to-sqlite/issues/1#issuecomment-706775706	https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/1	706775706	MDEyOklzc3VlQ29tbWVudDcwNjc3NTcwNg==	9599	2020-10-11T22:14:00Z	2020-10-11T22:14:00Z	MEMBER	A live demo would be good too.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	718934942
https://github.com/dogsheep/github-to-sqlite/pull/48#issuecomment-704553385	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/48	704553385	MDEyOklzc3VlQ29tbWVudDcwNDU1MzM4NQ==	9599	2020-10-06T21:07:44Z	2020-10-06T21:07:44Z	MEMBER	Sorry for not looking at this sooner, trying it out now - pull request looks great!	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	681228542
https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790695126	https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5	790695126	MDEyOklzc3VlQ29tbWVudDc5MDY5NTEyNg==	9599	2021-03-04T15:20:42Z	2021-03-04T15:20:42Z	MEMBER	I'm not sure why but my most recent import, when displayed in Datasette, looks like this: <img width="574" alt="mbox__mbox_emails__753_446_rows" src="https://user-images.githubusercontent.com/9599/109985836-0ab00080-7cba-11eb-97d5-0631a0835b61.png"> Sorting by `id` in the opposite order gives me the data I would expect - so it looks like a bunch of null/blank messages are being imported at some point and showing up first due to ID ordering.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	813880401
https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790693674	https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5	790693674	MDEyOklzc3VlQ29tbWVudDc5MDY5MzY3NA==	9599	2021-03-04T15:18:36Z	2021-03-04T15:18:36Z	MEMBER	I imported my 10GB mbox with 750,000 emails in it, ran this tool (with a hacked fix for the blob column problem) - and now a search that returns 92 results takes 25.37ms! This is fantastic.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	813880401
https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790669767	https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5	790669767	MDEyOklzc3VlQ29tbWVudDc5MDY2OTc2Nw==	9599	2021-03-04T14:46:06Z	2021-03-04T14:46:06Z	MEMBER	Solution could be to pre-process that string by splitting on `(` and dropping everything afterwards, assuming that the `(...)` bit isn't necessary for correctly parsing the date.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	813880401
https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790668263	https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5	790668263	MDEyOklzc3VlQ29tbWVudDc5MDY2ODI2Mw==	9599	2021-03-04T14:43:58Z	2021-03-04T14:43:58Z	MEMBER	I added this code to output a message ID on errors: ```diff print("Errors: {}".format(num_errors)) print(traceback.format_exc()) + print("Message-Id: {}".format(email.get("Message-Id", "None"))) continue ``` Having found a message ID that had an error, I ran this command to see the context: rg --text --context 20 '44F289B0.000001.02100@SCHWARZE-DWFXMI' ~/gmail.mbox This was for the following error: ``` File "/Users/simon/Dropbox/Development/google-takeout-to-sqlite/google_takeout_to_sqlite/utils.py", line 102, in get_mbox message["date"] = get_message_date(email.get("Date"), email.get_from()) File "/Users/simon/Dropbox/Development/google-takeout-to-sqlite/google_takeout_to_sqlite/utils.py", line 178, in get_message_date datetime_tuple = email.utils.parsedate_tz(mail_date) File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.7/lib/python3.7/email/_parseaddr.py", line 50, in parsedate_tz res = _parsedate_tz(data) File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.7/lib/python3.7/email/_parseaddr.py", line 69, in _parsedate_tz data = data.split() AttributeError: 'Header' object has no attribute 'split' ``` Here's what I spotted in the `ripgrep` output: ``` 177133570:Message-Id: <44F289B0.000001.02100@SCHWARZE-DWFXMI> 177133571-Date: Mon, 28 Aug 2006 08:14:08 +0200 (Westeurop�ische Sommerzeit) 177133572-X-Mailer: IncrediMail (5002253) ``` So it could be that `_parsedate_tz` is having trouble with that `Mon, 28 Aug 2006 08:14:08 +0200 (Westeurop�ische Sommerzeit)` string.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	813880401
https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790312268	https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5	790312268	MDEyOklzc3VlQ29tbWVudDc5MDMxMjI2OA==	9599	2021-03-04T05:48:16Z	2021-03-04T05:48:16Z	MEMBER	Wow, my mbox is a 10.35 GB download!	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	813880401
https://github.com/dogsheep/google-takeout-to-sqlite/issues/6#issuecomment-790384087	https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/6	790384087	MDEyOklzc3VlQ29tbWVudDc5MDM4NDA4Nw==	9599	2021-03-04T07:22:51Z	2021-03-04T07:22:51Z	MEMBER	#3 also mentions the conflicting version with other tools.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	821841046
https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790380839	https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5	790380839	MDEyOklzc3VlQ29tbWVudDc5MDM4MDgzOQ==	9599	2021-03-04T07:17:05Z	2021-03-04T07:17:05Z	MEMBER	Looks like you're doing this: ```python elif message.get_content_type() == "text/plain": body = message.get_payload(decode=True) ``` So presumably that decodes to a unicode string? I imagine the reason the column is a `BLOB` for me is that `sqlite-utils` determines the column type based on the first batch of items - https://github.com/simonw/sqlite-utils/blob/09c3386f55f766b135b6a1c00295646c4ae29bec/sqlite_utils/db.py#L1927-L1928 - and I got unlucky and had something in my first batch that wasn't a unicode string.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	813880401
https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790379629	https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5	790379629	MDEyOklzc3VlQ29tbWVudDc5MDM3OTYyOQ==	9599	2021-03-04T07:14:41Z	2021-03-04T07:14:41Z	MEMBER	Confirmed: removing the `len()` call does not speed things up, so it's reading through the entire file for some other purpose too.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	813880401
https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790378658	https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5	790378658	MDEyOklzc3VlQ29tbWVudDc5MDM3ODY1OA==	9599	2021-03-04T07:12:48Z	2021-03-04T07:12:48Z	MEMBER	It looks like the `body` is being loaded into a BLOB column - so by default in Datasette it looks like this: <img width="1650" alt="mbox__mbox_emails__753_446_rows" src="https://user-images.githubusercontent.com/9599/109924808-b4b96980-7c75-11eb-8c9e-307f2ae32d5a.png"> If I `datasette install datasette-render-binary` and then try again I get this: <img width="1487" alt="mbox__mbox_emails__753_446_rows" src="https://user-images.githubusercontent.com/9599/109924944-ea5e5280-7c75-11eb-9a32-404f3d68455f.png"> It would be great if we could store the `body` as unicode text instead. May have to do something clever to decode it based on some kind of charset header?	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	813880401
https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790373024	https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5	790373024	MDEyOklzc3VlQ29tbWVudDc5MDM3MzAyNA==	9599	2021-03-04T07:01:58Z	2021-03-04T07:04:06Z	MEMBER	I got 9 warnings that look like this: ``` Errors: 1 Traceback (most recent call last): File "/Users/simon/Dropbox/Development/google-takeout-to-sqlite/google_takeout_to_sqlite/utils.py", line 103, in get_mbox message["date"] = get_message_date(email.get("Date"), email.get_from()) File "/Users/simon/Dropbox/Development/google-takeout-to-sqlite/google_takeout_to_sqlite/utils.py", line 167, in get_message_date datetime_tuple = email.utils.parsedate_tz(mail_date) File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.7/lib/python3.7/email/_parseaddr.py", line 50, in parsedate_tz res = _parsedate_tz(data) File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.7/lib/python3.7/email/_parseaddr.py", line 69, in _parsedate_tz data = data.split() AttributeError: 'Header' object has no attribute 'split' ``` It would be useful if those warnings told me the message ID (or similar) of the affected message so I could grep for it in the `mbox` and see what was going on.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	813880401
https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790372621	https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5	790372621	MDEyOklzc3VlQ29tbWVudDc5MDM3MjYyMQ==	9599	2021-03-04T07:01:18Z	2021-03-04T07:01:18Z	MEMBER	I'm not sure if it would work, but there is an alternative pattern for showing a progress bar against a really large file that I've used in `healthkit-to-sqlite` - you set the progress bar size to the size of the file in bytes, then update a counter as you read the file. https://github.com/dogsheep/healthkit-to-sqlite/blob/3eb2b06bfe3b4faaf10e9cf9dfcb28e3d16c14ff/healthkit_to_sqlite/cli.py#L24-L57 and https://github.com/dogsheep/healthkit-to-sqlite/blob/3eb2b06bfe3b4faaf10e9cf9dfcb28e3d16c14ff/healthkit_to_sqlite/utils.py#L4-L19 (the `progress_callback()` bit) is where that happens. It can be a bit of a convoluted pattern, and I'm not at all sure it would work for `mbox` files since it looks like that library has other reasons it needs to do a file scan rather than streaming it through one chunk of bytes at a time. So I imagine this would not work here.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	813880401
https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790370485	https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5	790370485	MDEyOklzc3VlQ29tbWVudDc5MDM3MDQ4NQ==	9599	2021-03-04T06:57:25Z	2021-03-04T06:57:48Z	MEMBER	The command takes quite a while to start running, presumably because this line causes it to have to scan the WHOLE file in order to generate a count: https://github.com/dogsheep/google-takeout-to-sqlite/blob/a3de045eba0fae4b309da21aa3119102b0efc576/google_takeout_to_sqlite/utils.py#L66-L67 I'm fine with waiting though. It's not like this is a command people run every day - and without that count we can't show a progress bar, which seems pretty important for a process that takes this long.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	813880401
https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790369076	https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5	790369076	MDEyOklzc3VlQ29tbWVudDc5MDM2OTA3Ng==	9599	2021-03-04T06:54:46Z	2021-03-04T06:54:46Z	MEMBER	The Rich-powered progress bar is pretty: ![rich](https://user-images.githubusercontent.com/9599/109923307-71f69200-7c73-11eb-9ee2-8f0a240f3994.gif)	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	813880401
https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-786925280	https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5	786925280	MDEyOklzc3VlQ29tbWVudDc4NjkyNTI4MA==	9599	2021-02-26T22:23:10Z	2021-02-26T22:23:10Z	MEMBER	Thanks! I requested my Gmail export from takeout - once that arrives I'll test it against this and then merge the PR.	{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	813880401
https://github.com/dogsheep/evernote-to-sqlite/pull/10#issuecomment-777839351	https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/10	777839351	MDEyOklzc3VlQ29tbWVudDc3NzgzOTM1MQ==	9599	2021-02-11T22:37:55Z	2021-02-11T22:37:55Z	MEMBER	I've merged these changes by hand now, thanks!	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	770712149
https://github.com/dogsheep/evernote-to-sqlite/issues/7#issuecomment-777827396	https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/7	777827396	MDEyOklzc3VlQ29tbWVudDc3NzgyNzM5Ng==	9599	2021-02-11T22:13:14Z	2021-02-11T22:13:14Z	MEMBER	My best guess is that you have an older version of `sqlite-utils` installed here - the `replace=True` argument was added in version 2.0. I've bumped the dependency in `setup.py`.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	743297582
https://github.com/dogsheep/evernote-to-sqlite/issues/9#issuecomment-777821383	https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/9	777821383	MDEyOklzc3VlQ29tbWVudDc3NzgyMTM4Mw==	9599	2021-02-11T22:01:28Z	2021-02-11T22:01:28Z	MEMBER	Aha! I think I've figured out what's going on here. The CData blocks containing the notes look like this: `<![CDATA[<!DOCTYPE en-note SYSTEM "http://xml.evernote.com/pub/enml2.dtd"><en-note><div>This note includes two images.</div><div><br /></div>...` The DTD at http://xml.evernote.com/pub/enml2.dtd includes some entities: ``` <!--=========== External character mnemonic entities ===================--> <!ENTITY % HTMLlat1 PUBLIC "-//W3C//ENTITIES Latin 1 for XHTML//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent"> %HTMLlat1; <!ENTITY % HTMLsymbol PUBLIC "-//W3C//ENTITIES Symbols for XHTML//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml-symbol.ent"> %HTMLsymbol; <!ENTITY % HTMLspecial PUBLIC "-//W3C//ENTITIES Special for XHTML//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml-special.ent"> %HTMLspecial; ``` So I need to be able to handle all of those different entities. I think I can do that using `html.entities.entitydefs` from the Python standard library, which looks a bit like this: ```python {'Aacute': 'Á', 'aacute': 'á', 'Aacute;': 'Á', 'aacute;': 'á', 'Abreve;': 'Ă', 'abreve;': 'ă', 'ac;': '∾', 'acd;': '∿', # ... } ```	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	748372469
https://github.com/dogsheep/evernote-to-sqlite/issues/11#issuecomment-777798330	https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/11	777798330	MDEyOklzc3VlQ29tbWVudDc3Nzc5ODMzMA==	9599	2021-02-11T21:18:58Z	2021-02-11T21:18:58Z	MEMBER	Thanks for the fix!	{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	792851444
https://github.com/dogsheep/github-to-sqlite/issues/60#issuecomment-770071568	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/60	770071568	MDEyOklzc3VlQ29tbWVudDc3MDA3MTU2OA==	9599	2021-01-29T21:56:15Z	2021-01-29T21:56:15Z	MEMBER	I really like the way you're using pipes here - really smart. It's similar to how I build the demo database in this GitHub Actions workflow: https://github.com/dogsheep/github-to-sqlite/blob/62dfd3bc4014b108200001ef4bc746feb6f33b45/.github/workflows/deploy-demo.yml#L52-L82 `twitter-to-sqlite` actually has a mechanism for doing this kind of thing, documented at https://github.com/dogsheep/twitter-to-sqlite#providing-input-from-a-sql-query-with---sql-and---attach It lets you do things like: ``` $ twitter-to-sqlite users-lookup my.db --sql="select follower_id from following" --ids ``` Maybe I should add something similar to `github-to-sqlite`? Feels like it could be really useful.	{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	797097140
https://github.com/dogsheep/twitter-to-sqlite/issues/56#issuecomment-769957751	https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/56	769957751	MDEyOklzc3VlQ29tbWVudDc2OTk1Nzc1MQ==	9599	2021-01-29T17:59:40Z	2021-01-29T17:59:40Z	MEMBER	This is interesting - how did you create that initial table? Was this using the `twitter-to-sqlite import archive.db ~/Downloads/twitter-2019-06-25-b31f2.zip` command, or something else?	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	796736607
https://github.com/dogsheep/swarm-to-sqlite/issues/11#issuecomment-761967094	https://api.github.com/repos/dogsheep/swarm-to-sqlite/issues/11	761967094	MDEyOklzc3VlQ29tbWVudDc2MTk2NzA5NA==	9599	2021-01-18T04:11:13Z	2021-01-18T04:11:13Z	MEMBER	I just got a similar error: ``` File "/home/dogsheep/datasette-venv/lib/python3.8/site-packages/swarm_to_sqlite/utils.py", line 79, in save_checkin checkins_table.m2m("users", user, m2m_table="with", pk="id") File "/home/dogsheep/datasette-venv/lib/python3.8/site-packages/sqlite_utils/db.py", line 2048, in m2m id = other_table.insert(record, pk=pk, replace=True).last_pk File "/home/dogsheep/datasette-venv/lib/python3.8/site-packages/sqlite_utils/db.py", line 1781, in insert return self.insert_all( File "/home/dogsheep/datasette-venv/lib/python3.8/site-packages/sqlite_utils/db.py", line 1899, in insert_all self.insert_chunk( File "/home/dogsheep/datasette-venv/lib/python3.8/site-packages/sqlite_utils/db.py", line 1709, in insert_chunk result = self.db.execute(query, params) File "/home/dogsheep/datasette-venv/lib/python3.8/site-packages/sqlite_utils/db.py", line 226, in execute return self.conn.execute(sql, parameters) pysqlite3.dbapi2.OperationalError: table users has no column named countryCode ```	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	743400216
https://github.com/dogsheep/dogsheep-beta/issues/31#issuecomment-748426877	https://api.github.com/repos/dogsheep/dogsheep-beta/issues/31	748426877	MDEyOklzc3VlQ29tbWVudDc0ODQyNjg3Nw==	9599	2020-12-19T06:16:11Z	2020-12-19T06:16:11Z	MEMBER	Here's why: `if "fts5" in str(e):` But the error being raised here is: `sqlite3.OperationalError: no such column: to` I'm going to attempt the escaped query on every error.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	771316301
https://github.com/dogsheep/dogsheep-beta/issues/31#issuecomment-748426663	https://api.github.com/repos/dogsheep/dogsheep-beta/issues/31	748426663	MDEyOklzc3VlQ29tbWVudDc0ODQyNjY2Mw==	9599	2020-12-19T06:14:06Z	2020-12-19T06:14:06Z	MEMBER	Looks like I already do that here: https://github.com/dogsheep/dogsheep-beta/blob/9ba4401017ac24ffa3bc1db38e0910ea49de7616/dogsheep_beta/__init__.py#L141-L146	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	771316301
https://github.com/dogsheep/dogsheep-beta/issues/31#issuecomment-748426501	https://api.github.com/repos/dogsheep/dogsheep-beta/issues/31	748426501	MDEyOklzc3VlQ29tbWVudDc0ODQyNjUwMQ==	9599	2020-12-19T06:12:22Z	2020-12-19T06:12:22Z	MEMBER	I deliberately added support for advanced FTS in https://github.com/dogsheep/dogsheep-beta/commit/cbb2491b85d7ff416d6d429b60109e6c2d6d50b9 for #13 but that's the cause of this bug.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	771316301
https://github.com/dogsheep/dogsheep-beta/issues/31#issuecomment-748426581	https://api.github.com/repos/dogsheep/dogsheep-beta/issues/31	748426581	MDEyOklzc3VlQ29tbWVudDc0ODQyNjU4MQ==	9599	2020-12-19T06:13:17Z	2020-12-19T06:13:17Z	MEMBER	One fix for this could be to try running the raw query, but if it throws an error run it again with the query escaped.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	771316301
https://github.com/dogsheep/google-takeout-to-sqlite/issues/2#issuecomment-747126777	https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/2	747126777	MDEyOklzc3VlQ29tbWVudDc0NzEyNjc3Nw==	9599	2020-12-17T00:36:52Z	2020-12-17T00:36:52Z	MEMBER	The memory profiler tricks I used in https://github.com/dogsheep/healthkit-to-sqlite/issues/7 could help figure out what's going on here.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	769376447
https://github.com/dogsheep/dogsheep-beta/issues/29#issuecomment-747034481	https://api.github.com/repos/dogsheep/dogsheep-beta/issues/29	747034481	MDEyOklzc3VlQ29tbWVudDc0NzAzNDQ4MQ==	9599	2020-12-16T21:17:05Z	2020-12-16T21:17:05Z	MEMBER	I'm just going to add `q` for the moment.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	724759588
https://github.com/dogsheep/dogsheep-beta/issues/29#issuecomment-747031608	https://api.github.com/repos/dogsheep/dogsheep-beta/issues/29	747031608	MDEyOklzc3VlQ29tbWVudDc0NzAzMTYwOA==	9599	2020-12-16T21:15:18Z	2020-12-16T21:15:18Z	MEMBER	Should I pass any other details to the `display_sql` here as well?	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	724759588
https://github.com/dogsheep/dogsheep-beta/issues/29#issuecomment-747030964	https://api.github.com/repos/dogsheep/dogsheep-beta/issues/29	747030964	MDEyOklzc3VlQ29tbWVudDc0NzAzMDk2NA==	9599	2020-12-16T21:14:54Z	2020-12-16T21:14:54Z	MEMBER	To do this I'll need the search term to be passed to the `display_sql` SQL query: https://github.com/dogsheep/dogsheep-beta/blob/4890ec87b5e2ec48940f32c9ad1f5aae25c75a4d/dogsheep_beta/__init__.py#L164-L171	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	724759588
https://github.com/dogsheep/dogsheep-beta/issues/29#issuecomment-747029636	https://api.github.com/repos/dogsheep/dogsheep-beta/issues/29	747029636	MDEyOklzc3VlQ29tbWVudDc0NzAyOTYzNg==	9599	2020-12-16T21:14:03Z	2020-12-16T21:14:03Z	MEMBER	I think I can do this as a cunning trick in `display_sql`. Consider this example query: https://til.simonwillison.net/tils?sql=select%0D%0A++path%2C%0D%0A++snippet%28til_fts%2C+-1%2C+%27b4de2a49c8%27%2C+%278c94a2ed4b%27%2C+%27...%27%2C+60%29+as+snippet%0D%0Afrom%0D%0A++til%0D%0A++join+til_fts+on+til.rowid+%3D+til_fts.rowid%0D%0Awhere%0D%0A++til_fts+match+escape_fts%28%3Aq%29%0D%0A++and+path+%3D+%27asgi_lifespan-test-httpx.md%27%0D%0A&q=pytest ```sql select path, snippet(til_fts, -1, 'b4de2a49c8', '8c94a2ed4b', '...', 60) as snippet from til join til_fts on til.rowid = til_fts.rowid where til_fts match escape_fts(:q) and path = 'asgi_lifespan-test-httpx.md' ``` The `and path = 'asgi_lifespan-test-httpx.md'` bit means we only get back a specific document - but the snippet highlighting is applied to it.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	724759588
https://github.com/dogsheep/github-to-sqlite/issues/58#issuecomment-746735889	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/58	746735889	MDEyOklzc3VlQ29tbWVudDc0NjczNTg4OQ==	9599	2020-12-16T17:59:50Z	2020-12-16T17:59:50Z	MEMBER	I don't want to add a full HTML parser (like BeautifulSoup) as a dependency for this feature. Since the HTML comes from a single, trusted source (GitHub) I could probably handle this using [regular expressions](https://stackoverflow.com/a/1732454).	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	769150394
https://github.com/dogsheep/github-to-sqlite/issues/58#issuecomment-746734412	https://api.github.com/repos/dogsheep/github-to-sqlite/issues/58	746734412	MDEyOklzc3VlQ29tbWVudDc0NjczNDQxMg==	9599	2020-12-16T17:58:56Z	2020-12-16T17:58:56Z	MEMBER	I'm going to rewrite those `<a href="#filtering-tables">` links to `<a href="#user-content-filtering-tables">` - but only if a corresponding `id="user-content-filtering-tables"` element exists.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	769150394
https://github.com/dogsheep/dogsheep-photos/issues/20#issuecomment-633704127	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/20	633704127	MDEyOklzc3VlQ29tbWVudDYzMzcwNDEyNw==	9599	2020-05-25T20:14:22Z	2020-05-25T20:14:22Z	MEMBER	https://github.com/dogsheep/dogsheep-photos/blob/0.4.1/README.md#serving-photos-locally-with-datasette-media	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	613006393
https://github.com/dogsheep/dogsheep-photos/issues/20#issuecomment-633629944	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/20	633629944	MDEyOklzc3VlQ29tbWVudDYzMzYyOTk0NA==	9599	2020-05-25T15:47:42Z	2020-05-25T15:47:42Z	MEMBER	I'll add a proper section to the README, but for the moment here's how I do this. First, install `datasette` and the `datasette-media` plugin. Create a `metadata.yaml` file with the following content: ```yaml plugins: datasette-media: photo: sql: |- select path as filepath, 200 as resize_height from apple_photos where uuid = :key photo-big: sql: |- select path as filepath, 1024 as resize_height from apple_photos where uuid = :key ``` Now run `datasette -m metadata.yaml photos.db` - thumbnails will be served at http://127.0.0.1:8001/-/media/photo/F4469918-13F3-43D8-9EC1-734C0E6B60AD and larger sizes of the image at http://127.0.0.1:8001/-/media/photo-big/A8B02C7D-365E-448B-9510-69F80C26304D I also made myself two custom pages, one showing recent images and one showing random images. To do this, install the `datasette-template-sql` plugin and then create a `templates/pages` directory and add these files: `recent-photos.html` ```html <h1>Recent photos</h1> <div> {% for photo in sql("select * from apple_photos order by date desc limit 100") %} <img src="/-/media/photo/{{ photo['uuid'] }}"> {% endfor %} </div> ``` `random-photos.html` ```html <h1>Random photos</h1> <div> {% for photo in sql("with foo as (select * from apple_photos order by date desc limit 5000) select * from foo order by random() limit 100") %} <img src="/-/media/photo/{{ photo['uuid'] }}"> {% endfor %} </div> ``` Now run `datasette -m metadata.yaml photos.db --template-dir=templates/` Visit http://127.0.0.1:8001/random-photos to see some random photos or http://127.0.0.1:8001/recent-photos for recent photos. This is using this mechanism: https://datasette.readthedocs.io/en/stable/custom_templates.html#custom-pages	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	613006393
https://github.com/dogsheep/dogsheep-photos/issues/20#issuecomment-633626741	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/20	633626741	MDEyOklzc3VlQ29tbWVudDYzMzYyNjc0MQ==	9599	2020-05-25T15:38:55Z	2020-05-25T15:38:55Z	MEMBER	Sure, I should absolutely document this!	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	613006393
https://github.com/dogsheep/dogsheep-photos/issues/20#issuecomment-633644225	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/20	633644225	MDEyOklzc3VlQ29tbWVudDYzMzY0NDIyNQ==	9599	2020-05-25T16:30:44Z	2020-05-25T16:30:44Z	MEMBER	I'll add docs on using `datasette-json-html` too.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	613006393
https://github.com/dogsheep/dogsheep-photos/issues/20#issuecomment-633643921	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/20	633643921	MDEyOklzc3VlQ29tbWVudDYzMzY0MzkyMQ==	9599	2020-05-25T16:29:44Z	2020-05-25T16:29:44Z	MEMBER	https://github.com/dogsheep/dogsheep-photos/blob/dc43fa8653cb9c7238a36f52239b91d1ec916d5c/README.md#serving-photos-locally-with-datasette-media	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	613006393
https://github.com/dogsheep/dogsheep-photos/issues/26#issuecomment-631229409	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/26	631229409	MDEyOklzc3VlQ29tbWVudDYzMTIyOTQwOQ==	9599	2020-05-20T04:30:40Z	2020-05-20T04:30:40Z	MEMBER	https://pypi.org/project/photos-to-sqlite/ now links to dogsheep-photos.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	621444763
https://github.com/dogsheep/dogsheep-photos/issues/26#issuecomment-631229485	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/26	631229485	MDEyOklzc3VlQ29tbWVudDYzMTIyOTQ4NQ==	9599	2020-05-20T04:31:02Z	2020-05-20T04:31:02Z	MEMBER	https://pypi.org/project/dogsheep-photos/ is live.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	621444763
https://github.com/dogsheep/dogsheep-photos/issues/26#issuecomment-631227245	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/26	631227245	MDEyOklzc3VlQ29tbWVudDYzMTIyNzI0NQ==	9599	2020-05-20T04:21:38Z	2020-05-20T04:21:38Z	MEMBER	I'm going to release 0.4 now.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	621444763
https://github.com/dogsheep/dogsheep-photos/issues/26#issuecomment-631227105	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/26	631227105	MDEyOklzc3VlQ29tbWVudDYzMTIyNzEwNQ==	9599	2020-05-20T04:21:06Z	2020-05-20T04:21:06Z	MEMBER	Then I just need to push a final photos-to-sqlite release that updates the README to tell people about the name change.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	621444763
https://github.com/dogsheep/dogsheep-photos/issues/26#issuecomment-631227020	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/26	631227020	MDEyOklzc3VlQ29tbWVudDYzMTIyNzAyMA==	9599	2020-05-20T04:20:48Z	2020-05-20T04:21:16Z	MEMBER	Next time I push a release it will create `dogsheep-photos` on PyPI.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	621444763
https://github.com/dogsheep/dogsheep-photos/issues/26#issuecomment-631226953	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/26	631226953	MDEyOklzc3VlQ29tbWVudDYzMTIyNjk1Mw==	9599	2020-05-20T04:20:34Z	2020-05-20T04:20:34Z	MEMBER	Huh, it looks like Circle CI picked up the name change automatically. https://app.circleci.com/pipelines/github/dogsheep/dogsheep-photos	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	621444763
https://github.com/dogsheep/dogsheep-photos/issues/26#issuecomment-631226572	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/26	631226572	MDEyOklzc3VlQ29tbWVudDYzMTIyNjU3Mg==	9599	2020-05-20T04:18:52Z	2020-05-20T04:18:52Z	MEMBER	Need to reconfigure Circle CI.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	621444763
https://github.com/dogsheep/dogsheep-photos/issues/26#issuecomment-631226481	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/26	631226481	MDEyOklzc3VlQ29tbWVudDYzMTIyNjQ4MQ==	9599	2020-05-20T04:18:29Z	2020-05-20T04:18:29Z	MEMBER	I just renamed the repository.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	621444763
https://github.com/dogsheep/dogsheep-photos/issues/24#issuecomment-631255206	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/24	631255206	MDEyOklzc3VlQ29tbWVudDYzMTI1NTIwNg==	9599	2020-05-20T06:00:25Z	2020-05-20T06:00:25Z	MEMBER	This needs documentation.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	621323348
https://github.com/dogsheep/dogsheep-photos/issues/25#issuecomment-631253852	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/25	631253852	MDEyOklzc3VlQ29tbWVudDYzMTI1Mzg1Mg==	9599	2020-05-20T05:56:17Z	2020-05-21T22:26:16Z	MEMBER	I have a `deploy-demo.sh` script now: ```bash #!/bin/bash if [ -f public.db ]; then rm public.db fi pipenv run dogsheep-photos create-subset photos.db public.db \ "select sha256 from apple_photos where albums like '%Public%'" pipenv run sqlite-utils create-view public.db photos_on_a_map \ "select date, latitude, longitude, apple_photos.sha256, uploads.ext, json_object( 'title', 'Taken on ' || date, 'image', 'https://photos.simonwillison.net/i/' || uploads.sha256 || '.' || uploads.ext || '?w=400', 'link', 'https://photos.simonwillison.net/i/' || uploads.sha256 || '.' || uploads.ext || '?w=1200' ) as popup from apple_photos join uploads on apple_photos.sha256 = uploads.sha256 where latitude is not null order by date desc" \ --replace pipenv run datasette publish now public.db --project dogsheep-photos \ --about=dogsheep/dogsheep-photos \ --about_url="https://github.com/dogsheep/dogsheep-photos" \ --install=datasette-json-html \ --install=datasette-pretty-json \ --install="datasette-cluster-map>=0.10" \ --title "Dogsheep Photos demo" ```	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	621332242
https://github.com/dogsheep/dogsheep-photos/issues/25#issuecomment-631253248	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/25	631253248	MDEyOklzc3VlQ29tbWVudDYzMTI1MzI0OA==	9599	2020-05-20T05:54:18Z	2020-05-20T05:54:18Z	MEMBER	https://dogsheep-photos.dogsheep.net/	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	621332242
https://github.com/dogsheep/dogsheep-photos/issues/25#issuecomment-631253136	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/25	631253136	MDEyOklzc3VlQ29tbWVudDYzMTI1MzEzNg==	9599	2020-05-20T05:53:58Z	2020-05-20T05:53:58Z	MEMBER	Updated deploy command: ``` datasette publish now public.db --project dogsheep-photos \ --about=dogsheep/dogsheep-photos \ --about_url="https://github.com/dogsheep/dogsheep-photos" \ --install=datasette-json-html \ --install=datasette-cluster-map \ --title "Dogsheep Photos demo" ```	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	621332242
https://github.com/dogsheep/dogsheep-photos/issues/25#issuecomment-631251707	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/25	631251707	MDEyOklzc3VlQ29tbWVudDYzMTI1MTcwNw==	9599	2020-05-20T05:49:27Z	2020-05-21T15:58:42Z	MEMBER	Renaming this demo to `dogsheep-photos.dogsheep.net`	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	621332242
https://github.com/dogsheep/dogsheep-photos/issues/25#issuecomment-631127454	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/25	631127454	MDEyOklzc3VlQ29tbWVudDYzMTEyNzQ1NA==	9599	2020-05-19T22:48:00Z	2020-05-21T15:58:32Z	MEMBER	I built #23 to help with this. $ dogsheep-photos create-subset photos.db public.db \ "select sha256 from apple_photos where albums like '%Public%'" And publish with Vercel: $ datasette publish now public.db --project dogsheep-photos \ --about=dogsheep/dogsheep-photos \ --about_url="https://github.com/dogsheep/dogsheep-photos" \ --install=datasette-json-html \ --install=datasette-cluster-map	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	621332242
https://github.com/dogsheep/dogsheep-photos/issues/23#issuecomment-631120771	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/23	631120771	MDEyOklzc3VlQ29tbWVudDYzMTEyMDc3MQ==	9599	2020-05-19T22:32:48Z	2020-05-19T22:32:48Z	MEMBER	Documentation: https://github.com/dogsheep/photos-to-sqlite/blob/e2fab012551eed05278040b5d57e7373a1b9a0bf/README.md#creating-a-subset-database	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	621280529
https://github.com/dogsheep/dogsheep-photos/issues/22#issuecomment-626941278	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/22	626941278	MDEyOklzc3VlQ29tbWVudDYyNjk0MTI3OA==	9599	2020-05-11T20:25:58Z	2020-05-11T20:25:58Z	MEMBER	Interesting - do you know if there's anything the `exiftool` process handles that `ExifReader` doesn't? I'm actually just going to extract a subset of the EXIF data at first - since the original photo files will always be available I don't feel the need to get everything out for the first step. My plan is to use EXIF to help support photo collections that aren't in Apple Photos - I'm going to build a database table keyed by the `sha256` of each photo that extracts the camera make, lens, a few settings (ISO, aperture etc) and the GPS lat/lon.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	615626118
https://github.com/dogsheep/dogsheep-photos/issues/21#issuecomment-626395781	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/21	626395781	MDEyOklzc3VlQ29tbWVudDYyNjM5NTc4MQ==	9599	2020-05-10T21:57:09Z	2020-05-10T21:57:09Z	MEMBER	Yes, I just recreated my virtual environment from scratch and the error went away. The problem occurred when I ran `pip install datasette-bplist` in the same virtual environment - https://github.com/simonw/datasette-bplist/blob/master/setup.py depends on `bpylist` which is incompatible with `bpylist2`.	{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	615474990
https://github.com/dogsheep/dogsheep-photos/issues/21#issuecomment-626395209	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/21	626395209	MDEyOklzc3VlQ29tbWVudDYyNjM5NTIwOQ==	9599	2020-05-10T21:52:42Z	2020-05-10T21:52:42Z	MEMBER	Aha! It looks like I accidentally installed the old bplist into the same environment: ``` $ pip freeze | grep bpylist bpylist==0.1.4 bpylist2==3.0.0 ```	{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	615474990
https://github.com/dogsheep/dogsheep-photos/issues/21#issuecomment-626395103	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/21	626395103	MDEyOklzc3VlQ29tbWVudDYyNjM5NTEwMw==	9599	2020-05-10T21:51:36Z	2020-05-10T21:51:36Z	MEMBER	@RhetTbull I tried that workaround and it turns out I'm getting this error on ALL of my photos now! It's weird: a few days ago this wasn't happening. Now it's happening to everything. I'm not sure what I might have changed.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	615474990
https://github.com/dogsheep/dogsheep-photos/issues/21#issuecomment-626394989	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/21	626394989	MDEyOklzc3VlQ29tbWVudDYyNjM5NDk4OQ==	9599	2020-05-10T21:50:36Z	2020-05-10T21:50:36Z	MEMBER	https://github.com/Marketcircle/bpylist/pull/2 looks relevant here.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	615474990
https://github.com/dogsheep/dogsheep-photos/issues/21#issuecomment-626388837	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/21	626388837	MDEyOklzc3VlQ29tbWVudDYyNjM4ODgzNw==	9599	2020-05-10T20:59:32Z	2020-05-10T20:59:32Z	MEMBER	So it appears it's possible for `photo.place` to raise that exception. A workaround could be to catch that and treat those photos as not having a place.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	615474990
https://github.com/dogsheep/dogsheep-photos/issues/21#issuecomment-626388764	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/21	626388764	MDEyOklzc3VlQ29tbWVudDYyNjM4ODc2NA==	9599	2020-05-10T20:58:52Z	2020-05-10T20:58:52Z	MEMBER	More from the debugger: ``` > /Users/simon/.local/share/virtualenvs/photos-to-sqlite-0uGSHd6e/lib/python3.8/site-packages/osxphotos/photoinfo.py(614)place() -> self._place = PlaceInfo5(self._info["reverse_geolocation"]) ``` And: ``` > /Users/simon/Dropbox/Development/photos-to-sqlite/photos_to_sqlite/utils.py(91)osxphoto_to_row() -> place = photo.place ```	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	615474990
https://github.com/dogsheep/dogsheep-photos/issues/20#issuecomment-625947133	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/20	625947133	MDEyOklzc3VlQ29tbWVudDYyNTk0NzEzMw==	9599	2020-05-08T18:13:06Z	2020-05-08T18:13:06Z	MEMBER	`datasette-media` will be able to handle this once I implement https://github.com/simonw/datasette-media/issues/3	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	613006393
https://github.com/dogsheep/dogsheep-photos/issues/20#issuecomment-624408738	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/20	624408738	MDEyOklzc3VlQ29tbWVudDYyNDQwODczOA==	9599	2020-05-06T02:21:05Z	2020-05-06T02:21:32Z	MEMBER	Here's rendering code from my hacked-together not-yet-released S3 image proxy: ```python from starlette.responses import Response from PIL import Image, ExifTags import pyheif for ORIENTATION_TAG in ExifTags.TAGS.keys(): if ExifTags.TAGS[ORIENTATION_TAG] == "Orientation": break ... # Load it into Pillow if ext == "heic": heic = pyheif.read_heif(image_response.content) image = Image.frombytes(mode=heic.mode, size=heic.size, data=heic.data) else: image = Image.open(io.BytesIO(image_response.content)) # Does EXIF tell us to rotate it? try: exif = dict(image._getexif().items()) if exif[ORIENTATION_TAG] == 3: image = image.rotate(180, expand=True) elif exif[ORIENTATION_TAG] == 6: image = image.rotate(270, expand=True) elif exif[ORIENTATION_TAG] == 8: image = image.rotate(90, expand=True) except (AttributeError, KeyError, IndexError): pass # Resize based on ?w= and ?h=, if set width, height = image.size w = request.query_params.get("w") h = request.query_params.get("h") if w is not None or h is not None: if h is None: # Set h based on w w = int(w) h = int((float(height) / width) * w) elif w is None: h = int(h) # Set w based on h w = int((float(width) / height) * h) w = int(w) h = int(h) image.thumbnail((w, h)) # ?bw= converts to black and white if request.query_params.get("bw"): image = image.convert("L") # ?q= sets the quality - defaults to 75 quality = 75 q = request.query_params.get("q") if q and q.isdigit() and 1 <= int(q) <= 100: quality = int(q) # Output as JPEG or PNG output_image = io.BytesIO() image_type = "JPEG" kwargs = {"quality": quality} if image.format == "PNG": image_type = "PNG" kwargs = {} …	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	613006393
https://github.com/dogsheep/dogsheep-photos/issues/20#issuecomment-624408370	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/20	624408370	MDEyOklzc3VlQ29tbWVudDYyNDQwODM3MA==	9599	2020-05-06T02:19:27Z	2020-05-06T02:19:27Z	MEMBER	The plugin can be generalized: it can be configured to know how to take the URL path, look it up in ANY table (via a custom SQL query) to get a path on disk and then serve that.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	613006393
https://github.com/dogsheep/dogsheep-photos/issues/20#issuecomment-624408220	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/20	624408220	MDEyOklzc3VlQ29tbWVudDYyNDQwODIyMA==	9599	2020-05-06T02:18:47Z	2020-05-06T02:18:47Z	MEMBER	The `apple_photos` table has an indexed `uuid` column and a `path` column which stores the full path to that photo file on disk. I can write a custom Datasette plugin which takes the `uuid` from the URL, looks up the path, then serves up a thumbnail of the jpeg or heic image file. I'll prototype this as a one-off plugin first, then package it on PyPI for other people to install.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	613006393
