github

html_url	issue_url	id	node_id	user	created_at	updated_at	author_association	body	reactions	issue
https://github.com/simonw/sqlite-utils/issues/159#issuecomment-802032152	https://api.github.com/repos/simonw/sqlite-utils/issues/159	802032152	MDEyOklzc3VlQ29tbWVudDgwMjAzMjE1Mg==	1025224	2021-03-18T15:42:52Z	2021-03-18T15:42:52Z	NONE	I confirm the bug. Happens for me in version 3.6. I use the call to delete all the records: `table.delete_where()` This does not delete anything. I see that `delete()` method DOES use context manager `with self.db.conn:` which should help. You may want to align the code of both methods.	{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	702386948
https://github.com/simonw/sqlite-utils/issues/246#issuecomment-801816980	https://api.github.com/repos/simonw/sqlite-utils/issues/246	801816980	MDEyOklzc3VlQ29tbWVudDgwMTgxNjk4MA==	37962604	2021-03-18T10:40:32Z	2021-03-18T10:43:04Z	NONE	I have found a similar problem, but only when using that type of query (with `*` for doing a prefix search). I'm also building something on top of FTS5/sqlite-utils, and the way I decided to handle it was creating a specific function for prefixes. According to [the docs](https://www2.sqlite.org/fts5.html#fts5_prefix_queries), the query can be done in these 2 ways: ```sql ... MATCH '"one two thr" * ' ... MATCH 'one + two + thr*' ``` I thought I could build a query like the first one using this function: ```python def prefix(query: str): return f'"{query}" *' ``` And then I use the output of that function as the query parameter for the standard `.search()` method in sqlite-utils. However, my use case is different because I'm the one "deciding" when to use a prefix search, not the end user. I also haven't done many tests, but maybe you'll find that useful. One thing I could think of is checking if the query has an `*` at the end, removing it and building the prefix query using the function above. This is just for prefix queries, I think having the escaping function is still useful for other use cases.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	831751367
https://github.com/simonw/sqlite-utils/issues/246#issuecomment-799479175	https://api.github.com/repos/simonw/sqlite-utils/issues/246	799479175	MDEyOklzc3VlQ29tbWVudDc5OTQ3OTE3NQ==	9599	2021-03-15T14:47:31Z	2021-03-15T14:47:31Z	OWNER	This is a smart feature. I have something that does this in Datasette, extracting it out to `sqlite-utils` makes a lot of sense. https://github.com/simonw/datasette/blob/8e18c7943181f228ce5ebcea48deb59ce50bee1f/datasette/utils/__init__.py#L818-L829	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	831751367
https://github.com/simonw/datasette/issues/236#issuecomment-799066252	https://api.github.com/repos/simonw/datasette/issues/236	799066252	MDEyOklzc3VlQ29tbWVudDc5OTA2NjI1Mg==	9599	2021-03-15T03:34:52Z	2021-03-15T03:34:52Z	OWNER	Yeah the Lambda Docker stuff is pretty odd - you still don't get to speak HTTP, you have to speak their custom event protocol instead. https://github.com/glassechidna/serverlessish looks interesting here - it adds a proxy inside the container which allows your existing HTTP Docker image to run within Docker-on-Lambda. I've not tried it out yet though.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	317001500
https://github.com/simonw/datasette/issues/236#issuecomment-799003172	https://api.github.com/repos/simonw/datasette/issues/236	799003172	MDEyOklzc3VlQ29tbWVudDc5OTAwMzE3Mg==	21148	2021-03-14T23:42:57Z	2021-03-14T23:42:57Z	CONTRIBUTOR	Oh, and the container image can be up to 10GB, so the EFS step might not be needed except for pretty big stuff.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	317001500
https://github.com/simonw/datasette/issues/236#issuecomment-799002993	https://api.github.com/repos/simonw/datasette/issues/236	799002993	MDEyOklzc3VlQ29tbWVudDc5OTAwMjk5Mw==	21148	2021-03-14T23:41:51Z	2021-03-14T23:41:51Z	CONTRIBUTOR	Now that [Lambda supports Docker](https://aws.amazon.com/blogs/aws/new-for-aws-lambda-container-image-support/), this probably is a bit easier and may be able to build on top of the existing package command. There are weirdnesses in how the command actually gets invoked; the [aws-lambda-python image](https://hub.docker.com/r/amazon/aws-lambda-python) shows a bit of that. So Datasette would probably need some sort of Lambda-specific entry point to make this work.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	317001500
https://github.com/simonw/datasette/pull/1260#issuecomment-798913090	https://api.github.com/repos/simonw/datasette/issues/1260	798913090	MDEyOklzc3VlQ29tbWVudDc5ODkxMzA5MA==	22429695	2021-03-14T14:01:30Z	2021-03-14T14:01:30Z	NONE	# [Codecov](https://codecov.io/gh/simonw/datasette/pull/1260?src=pr&el=h1) Report > Merging [#1260](https://codecov.io/gh/simonw/datasette/pull/1260?src=pr&el=desc) (90f5fb6) into [main](https://codecov.io/gh/simonw/datasette/commit/8e18c7943181f228ce5ebcea48deb59ce50bee1f?el=desc) (8e18c79) will not change coverage. > The diff coverage is `83.33%`. [![Impacted file tree graph](https://codecov.io/gh/simonw/datasette/pull/1260/graphs/tree.svg?width=650&height=150&src=pr&token=eSahVY7kw1)](https://codecov.io/gh/simonw/datasette/pull/1260?src=pr&el=tree) ```diff @@ Coverage Diff @@ ## main #1260 +/- ## ======================================= Coverage 91.51% 91.51% ======================================= Files 34 34 Lines 4255 4255 ======================================= Hits 3894 3894 Misses 361 361 ``` \| [Impacted Files](https://codecov.io/gh/simonw/datasette/pull/1260?src=pr&el=tree) \| Coverage Δ \| \| \|---\|---\|---\| \| [datasette/inspect.py](https://codecov.io/gh/simonw/datasette/pull/1260/diff?src=pr&el=tree#diff-ZGF0YXNldHRlL2luc3BlY3QucHk=) \| `36.11% <0.00%> (ø)` \| \| \| [datasette/default\_magic\_parameters.py](https://codecov.io/gh/simonw/datasette/pull/1260/diff?src=pr&el=tree#diff-ZGF0YXNldHRlL2RlZmF1bHRfbWFnaWNfcGFyYW1ldGVycy5weQ==) \| `91.17% <50.00%> (ø)` \| \| \| [datasette/app.py](https://codecov.io/gh/simonw/datasette/pull/1260/diff?src=pr&el=tree#diff-ZGF0YXNldHRlL2FwcC5weQ==) \| `95.85% <100.00%> (ø)` \| \| \| [datasette/views/base.py](https://codecov.io/gh/simonw/datasette/pull/1260/diff?src=pr&el=tree#diff-ZGF0YXNldHRlL3ZpZXdzL2Jhc2UucHk=) \| `95.01% <100.00%> (ø)` \| \| \| [datasette/views/table.py](https://codecov.io/gh/simonw/datasette/pull/1260/diff?src=pr&el=tree#diff-ZGF0YXNldHRlL3ZpZXdzL3RhYmxlLnB5) \| `95.88% <100.00%> (ø)` \| \| ------ [Continue to review full report at Codecov](https://codecov.io/gh/simonw/datasette/pull/1260?src=pr&el=contin…	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	831163537
https://github.com/dogsheep/healthkit-to-sqlite/issues/14#issuecomment-798468572	https://api.github.com/repos/dogsheep/healthkit-to-sqlite/issues/14	798468572	MDEyOklzc3VlQ29tbWVudDc5ODQ2ODU3Mg==	1234956	2021-03-13T14:47:31Z	2021-03-13T14:47:31Z	NONE	Ok, new PR works. I'm not `git` enough so I just force-pushed over the old one. I still end up with a lot of activities that are missing an `id` and therefore skipped (since this is used as the primary key). For example: ``` {'workoutActivityType': 'HKWorkoutActivityTypeRunning', 'duration': '35.31666666666667', 'durationUnit': 'min', 'totalDistance': '4.010870267636999', 'totalDistanceUnit': 'mi', 'totalEnergyBurned': '660.3516235351562', 'totalEnergyBurnedUnit': 'Cal', 'sourceName': 'Strava', 'sourceVersion': '22810', 'creationDate': '2020-07-16 13:38:26 -0700', 'startDate': '2020-07-16 06:38:26 -0700', 'endDate': '2020-07-16 07:13:45 -0700'} ``` I also end up with some unhappy characters (in the skipped events), such as: `'sourceName': 'Nathan’s Apple\xa0Watch',`. But it's successfully making it through the file, and the resulting db opens in datasette, so I'd call that progress.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	771608692
https://github.com/dogsheep/healthkit-to-sqlite/issues/14#issuecomment-798436026	https://api.github.com/repos/dogsheep/healthkit-to-sqlite/issues/14	798436026	MDEyOklzc3VlQ29tbWVudDc5ODQzNjAyNg==	1234956	2021-03-13T14:23:16Z	2021-03-13T14:23:16Z	NONE	This PR allows my import to succeed. It looks like some events don't have an `id`, but do have `HKExternalUUID` (which gets turned into `metadata_HKExternalUUID`), so I use this as a fallback. If a record has neither of these, I changed it to just print the record (for debugging) and `return`. For some odd reason this ran fine at first, and now (after removing the generated db and trying again) I'm getting a different error (duplicate column name). Looks like it may have run when I had two successive runs without remembering to delete the db in between. Will try to refactor.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	771608692
https://github.com/simonw/datasette/issues/1259#issuecomment-797827038	https://api.github.com/repos/simonw/datasette/issues/1259	797827038	MDEyOklzc3VlQ29tbWVudDc5NzgyNzAzOA==	9599	2021-03-13T00:15:40Z	2021-03-13T00:15:40Z	OWNER	If all of the facets were being calculated in a single query, I'd be willing to bump the facet time limit up to something a lot higher, maybe even a full second. There's a chance that could work amazingly well with a materialized CTE.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	830567275
https://github.com/simonw/datasette/issues/1259#issuecomment-797804869	https://api.github.com/repos/simonw/datasette/issues/1259	797804869	MDEyOklzc3VlQ29tbWVudDc5NzgwNDg2OQ==	9599	2021-03-12T23:05:05Z	2021-03-12T23:05:05Z	OWNER	I wonder if I could optimize facet suggestion in the same way? One challenge: the query time limit will apply to the full CTE query, not to the individual columns.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	830567275
https://github.com/simonw/datasette/issues/1259#issuecomment-797801075	https://api.github.com/repos/simonw/datasette/issues/1259	797801075	MDEyOklzc3VlQ29tbWVudDc5NzgwMTA3NQ==	9599	2021-03-12T22:53:56Z	2021-03-12T22:55:16Z	OWNER	OK, a better comparison: https://global-power-plants.datasettes.com/global-power-plants?sql=WITH+data+as+%28%0D%0A++select%0D%0A++++%0D%0A++from%0D%0A++++%5Bglobal-power-plants%5D%0D%0A%29%2C%0D%0Acountry_long+as+%28select+%0D%0A++%27country_long%27+as+col%2C+country_long+as+value%2C+count%28%29+as+c+from+data+group+by+country_long%0D%0A++order+by+c+desc+limit+31%0D%0A%29%2C%0D%0Aprimary_fuel+as+%28%0D%0Aselect%0D%0A++%27primary_fuel%27+as+col%2C+primary_fuel+as+value%2C+count%28%29+as+c+from+data+group+by+primary_fuel%0D%0A++order+by+c+desc+limit+31%0D%0A%29%2C%0D%0Aowner+as+%28%0D%0Aselect%0D%0A++%27owner%27+as+col%2C+owner+as+value%2C+count%28%29+as+c+from+data+group+by+owner%0D%0A++order+by+c+desc+limit+31%0D%0A%29%0D%0Aselect++from+primary_fuel+union+select++from+country_long%0D%0Aunion+select++from+owner+order+by+col%2C+c+desc calculates facets against three columns. It takes 78.5ms* (and 34.5ms when I refreshed it, presumably after warming some SQLite caches of some sort). https://global-power-plants.datasettes.com/global-power-plants/global-power-plants?_facet=country_long&_facet=primary_fuel&_trace=1&_size=0 shows those facets with size=0 on the SQL query - and shows a SQL trace at the bottom of the page. The country_long facet query takes 45.36ms, owner takes 38.45ms, primary_fuel takes 49.04ms - so a total of 132.85ms That's against https://global-power-plants.datasettes.com/-/versions says SQLite 3.27.3 - so even on a SQLite version that doesn't materialize the CTEs there's a significant performance boost to doing all three facets in a single CTE query.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	830567275
https://github.com/simonw/datasette/issues/1259#issuecomment-797790017	https://api.github.com/repos/simonw/datasette/issues/1259	797790017	MDEyOklzc3VlQ29tbWVudDc5Nzc5MDAxNw==	9599	2021-03-12T22:22:12Z	2021-03-12T22:22:12Z	OWNER	https://sqlite.org/lang_with.html > Prior to SQLite 3.35.0, all CTEs where treated as if the NOT MATERIALIZED phrase was present It looks like this optimization is completely unavailable on SQLite prior to 3.35.0 (released 12th March 2021). But I could still rewrite the faceting to work in this way, using the exact same SQL - it would just be significantly faster on 3.35.0+ (assuming it's actually faster in practice - would need to benchmark).	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	830567275
https://github.com/simonw/datasette/issues/1193#issuecomment-797159434	https://api.github.com/repos/simonw/datasette/issues/1193	797159434	MDEyOklzc3VlQ29tbWVudDc5NzE1OTQzNA==	9599	2021-03-12T01:01:54Z	2021-03-12T01:01:54Z	OWNER	DuckDB has a read-only mechanism: https://duckdb.org/docs/api/python ```python import duckdb con = duckdb.connect(database="/tmp/blah.db", read_only=True) ```	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	787173276
https://github.com/simonw/datasette/issues/1250#issuecomment-797159221	https://api.github.com/repos/simonw/datasette/issues/1250	797159221	MDEyOklzc3VlQ29tbWVudDc5NzE1OTIyMQ==	9599	2021-03-12T01:01:17Z	2021-03-12T01:01:17Z	OWNER	This is a duplicate of #1193.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	824067604
https://github.com/simonw/datasette/issues/670#issuecomment-797158641	https://api.github.com/repos/simonw/datasette/issues/670	797158641	MDEyOklzc3VlQ29tbWVudDc5NzE1ODY0MQ==	9599	2021-03-12T00:59:49Z	2021-03-12T00:59:49Z	OWNER	> Challenge: what's the equivalent for PostgreSQL of opening a database in read only mode? Will I have to talk users through creating read only credentials? It looks like the answer to this is yes - I'll need users to setup read-only credentials. Here's a TIL about that: https://til.simonwillison.net/postgresql/read-only-postgresql-user	{ "total_count": 1, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 1, "rocket": 0, "eyes": 0 }	564833696
https://github.com/simonw/datasette/pull/1211#issuecomment-796854370	https://api.github.com/repos/simonw/datasette/issues/1211	796854370	MDEyOklzc3VlQ29tbWVudDc5Njg1NDM3MA==	9599	2021-03-11T16:15:29Z	2021-03-11T16:15:29Z	OWNER	Thanks very much for this - it's really comprehensive. I need to bake some of these patterns into my coding habits better!	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	797649915
https://github.com/simonw/datasette/issues/838#issuecomment-795950636	https://api.github.com/repos/simonw/datasette/issues/838	795950636	MDEyOklzc3VlQ29tbWVudDc5NTk1MDYzNg==	79913	2021-03-10T19:24:13Z	2021-03-10T19:24:13Z	NONE	I think this could be solved by one of: 1. Stop generating absolute URLs, e.g. ones that include an origin. Relative URLs with absolute paths are fine, as long as they take `base_url` into account (as they do now, yay!). 2. Extend `base_url` to include the expected frontend origin, and then use that information when generating absolute URLs. 3. Document which HTTP headers the reverse proxy should set (e.g. the `X-Forwarded-*` family of conventional headers) to pass the frontend origin information to Datasette, and then use that information when generating absolute URLs. Option 1 seems like the easiest to me, if you can get away with never having to generate an absolute URL.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	637395097
https://github.com/simonw/datasette/issues/838#issuecomment-795939998	https://api.github.com/repos/simonw/datasette/issues/838	795939998	MDEyOklzc3VlQ29tbWVudDc5NTkzOTk5OA==	79913	2021-03-10T19:16:55Z	2021-03-10T19:16:55Z	NONE	Nod. The problem with the tests is that they're ignoring the origin (hostname, port) of links. In a reverse proxy situation, the frontend request origin is different than the backend request origin. The problem is Datasette generates links with the backend request origin.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	637395097
https://github.com/simonw/datasette/issues/838#issuecomment-795918377	https://api.github.com/repos/simonw/datasette/issues/838	795918377	MDEyOklzc3VlQ29tbWVudDc5NTkxODM3Nw==	9599	2021-03-10T19:01:48Z	2021-03-10T19:01:48Z	OWNER	The biggest challenge here I think is to replicate the exact situation where this happens in a Python unit test. The fix should be easy once we have a test in place.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	637395097
https://github.com/simonw/datasette/issues/838#issuecomment-795895436	https://api.github.com/repos/simonw/datasette/issues/838	795895436	MDEyOklzc3VlQ29tbWVudDc5NTg5NTQzNg==	9599	2021-03-10T18:44:46Z	2021-03-10T18:44:57Z	OWNER	Let's reopen this.	{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	637395097
https://github.com/simonw/datasette/issues/838#issuecomment-795893813	https://api.github.com/repos/simonw/datasette/issues/838	795893813	MDEyOklzc3VlQ29tbWVudDc5NTg5MzgxMw==	79913	2021-03-10T18:43:39Z	2021-03-10T18:43:39Z	NONE	@simonw Unfortunately this issue as I reported it is not actually solved in version 0.55. Every link which is returned by the `Datasette.absolute_url` method is still wrong, because it uses the request URL as the base. This still includes the suggested facet links and pagination links. What I wrote originally still stands: > Although many of the URLs in the pages are correct (presumably because they either use absolute paths which include `base_url` or relative paths), the faceting and pagination links still use fully-qualified URLs pointing at `http://localhost:8001`. > > I looked into this a little in the source code, and it seems to be an issue anywhere `request.url` or `request.path` is used, as these contain the values for the request between the frontend (Apache) and backend (Datasette) server. Those properties are primarily used via the `path_with_…` family of utility functions and the `Datasette.absolute_url` method. Would you prefer to re-open this issue or have me create a new one?	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	637395097
https://github.com/simonw/datasette/pull/1254#issuecomment-795870524	https://api.github.com/repos/simonw/datasette/issues/1254	795870524	MDEyOklzc3VlQ29tbWVudDc5NTg3MDUyNA==	9599	2021-03-10T18:27:45Z	2021-03-10T18:27:45Z	OWNER	What other breaks did you spot?	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	826613352
https://github.com/simonw/datasette/pull/1256#issuecomment-795869144	https://api.github.com/repos/simonw/datasette/issues/1256	795869144	MDEyOklzc3VlQ29tbWVudDc5NTg2OTE0NA==	9599	2021-03-10T18:26:46Z	2021-03-10T18:26:46Z	OWNER	Thanks!	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	827341657
https://github.com/simonw/datasette/pull/1256#issuecomment-795112935	https://api.github.com/repos/simonw/datasette/issues/1256	795112935	MDEyOklzc3VlQ29tbWVudDc5NTExMjkzNQ==	6371750	2021-03-10T08:59:45Z	2021-03-10T08:59:45Z	CONTRIBUTOR	Sorry, I meant "minor typo" not "minor type".	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	827341657
https://github.com/simonw/datasette/pull/1256#issuecomment-795085921	https://api.github.com/repos/simonw/datasette/issues/1256	795085921	MDEyOklzc3VlQ29tbWVudDc5NTA4NTkyMQ==	22429695	2021-03-10T08:35:17Z	2021-03-10T08:35:17Z	NONE	# [Codecov](https://codecov.io/gh/simonw/datasette/pull/1256?src=pr&el=h1) Report > Merging [#1256](https://codecov.io/gh/simonw/datasette/pull/1256?src=pr&el=desc) (4eef524) into [main](https://codecov.io/gh/simonw/datasette/commit/d0fd833b8cdd97e1b91d0f97a69b494895d82bee?el=desc) (d0fd833) will not change coverage. > The diff coverage is `n/a`. [![Impacted file tree graph](https://codecov.io/gh/simonw/datasette/pull/1256/graphs/tree.svg?width=650&height=150&src=pr&token=eSahVY7kw1)](https://codecov.io/gh/simonw/datasette/pull/1256?src=pr&el=tree) ```diff @@ Coverage Diff @@ ## main #1256 +/- ## ======================================= Coverage 91.56% 91.56% ======================================= Files 34 34 Lines 4244 4244 ======================================= Hits 3886 3886 Misses 358 358 ``` ------ [Continue to review full report at Codecov](https://codecov.io/gh/simonw/datasette/pull/1256?src=pr&el=continue). > Legend - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta) > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data` > Powered by [Codecov](https://codecov.io/gh/simonw/datasette/pull/1256?src=pr&el=footer). Last update [d0fd833...4eef524](https://codecov.io/gh/simonw/datasette/pull/1256?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	827341657
https://github.com/simonw/datasette/pull/1254#issuecomment-794518438	https://api.github.com/repos/simonw/datasette/issues/1254	794518438	MDEyOklzc3VlQ29tbWVudDc5NDUxODQzOA==	3200608	2021-03-09T22:04:23Z	2021-03-09T22:04:23Z	NONE	Dang, you're absolutely right. Spatialite 5.0 had been working fine for a plugin I was developing, but it apparently is broken in several other ways.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	826613352
https://github.com/simonw/datasette/pull/1254#issuecomment-794441034	https://api.github.com/repos/simonw/datasette/issues/1254	794441034	MDEyOklzc3VlQ29tbWVudDc5NDQ0MTAzNA==	22429695	2021-03-09T20:54:18Z	2021-03-09T21:12:15Z	NONE	# [Codecov](https://codecov.io/gh/simonw/datasette/pull/1254?src=pr&el=h1) Report > Merging [#1254](https://codecov.io/gh/simonw/datasette/pull/1254?src=pr&el=desc) (b103204) into [main](https://codecov.io/gh/simonw/datasette/commit/d0fd833b8cdd97e1b91d0f97a69b494895d82bee?el=desc) (d0fd833) will decrease coverage by `0.04%`. > The diff coverage is `n/a`. [![Impacted file tree graph](https://codecov.io/gh/simonw/datasette/pull/1254/graphs/tree.svg?width=650&height=150&src=pr&token=eSahVY7kw1)](https://codecov.io/gh/simonw/datasette/pull/1254?src=pr&el=tree) ```diff @@ Coverage Diff @@ ## main #1254 +/- ## ========================================== - Coverage 91.56% 91.51% -0.05% ========================================== Files 34 34 Lines 4244 4244 ========================================== - Hits 3886 3884 -2 - Misses 358 360 +2 ``` \| [Impacted Files](https://codecov.io/gh/simonw/datasette/pull/1254?src=pr&el=tree) \| Coverage Δ \| \| \|---\|---\|---\| \| [datasette/database.py](https://codecov.io/gh/simonw/datasette/pull/1254/diff?src=pr&el=tree#diff-ZGF0YXNldHRlL2RhdGFiYXNlLnB5) \| `92.93% <0.00%> (-0.75%)` \| :arrow_down: \| ------ [Continue to review full report at Codecov](https://codecov.io/gh/simonw/datasette/pull/1254?src=pr&el=continue). > Legend - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta) > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data` > Powered by [Codecov](https://codecov.io/gh/simonw/datasette/pull/1254?src=pr&el=footer). Last update [d0fd833...b103204](https://codecov.io/gh/simonw/datasette/pull/1254?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	826613352
https://github.com/simonw/datasette/pull/1254#issuecomment-794443710	https://api.github.com/repos/simonw/datasette/issues/1254	794443710	MDEyOklzc3VlQ29tbWVudDc5NDQ0MzcxMA==	3200608	2021-03-09T20:56:45Z	2021-03-09T20:56:45Z	NONE	Oh wow I didn't even see that you had opened an issue about this so recently. I'll check on `/dbname` and report back.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	826613352
https://github.com/simonw/datasette/pull/1254#issuecomment-794439632	https://api.github.com/repos/simonw/datasette/issues/1254	794439632	MDEyOklzc3VlQ29tbWVudDc5NDQzOTYzMg==	9599	2021-03-09T20:53:02Z	2021-03-09T20:53:02Z	OWNER	Thanks for catching that documentation update!	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	826613352
https://github.com/simonw/datasette/pull/1254#issuecomment-794437715	https://api.github.com/repos/simonw/datasette/issues/1254	794437715	MDEyOklzc3VlQ29tbWVudDc5NDQzNzcxNQ==	9599	2021-03-09T20:51:19Z	2021-03-09T20:51:19Z	OWNER	Did you see my note on https://github.com/simonw/datasette/issues/1249#issuecomment-792384382 about a weird issue I was having with the `/dbname` page hanging the server? Have you seen anything like that in your work here?	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	826613352
https://github.com/simonw/datasette/pull/1252#issuecomment-793308483	https://api.github.com/repos/simonw/datasette/issues/1252	793308483	MDEyOklzc3VlQ29tbWVudDc5MzMwODQ4Mw==	22429695	2021-03-09T03:06:10Z	2021-03-09T03:06:10Z	NONE	# [Codecov](https://codecov.io/gh/simonw/datasette/pull/1252?src=pr&el=h1) Report > Merging [#1252](https://codecov.io/gh/simonw/datasette/pull/1252?src=pr&el=desc) (d22aa32) into [main](https://codecov.io/gh/simonw/datasette/commit/d0fd833b8cdd97e1b91d0f97a69b494895d82bee?el=desc) (d0fd833) will decrease coverage by `0.04%`. > The diff coverage is `n/a`. [![Impacted file tree graph](https://codecov.io/gh/simonw/datasette/pull/1252/graphs/tree.svg?width=650&height=150&src=pr&token=eSahVY7kw1)](https://codecov.io/gh/simonw/datasette/pull/1252?src=pr&el=tree) ```diff @@ Coverage Diff @@ ## main #1252 +/- ## ========================================== - Coverage 91.56% 91.51% -0.05% ========================================== Files 34 34 Lines 4244 4244 ========================================== - Hits 3886 3884 -2 - Misses 358 360 +2 ``` \| [Impacted Files](https://codecov.io/gh/simonw/datasette/pull/1252?src=pr&el=tree) \| Coverage Δ \| \| \|---\|---\|---\| \| [datasette/database.py](https://codecov.io/gh/simonw/datasette/pull/1252/diff?src=pr&el=tree#diff-ZGF0YXNldHRlL2RhdGFiYXNlLnB5) \| `92.93% <0.00%> (-0.75%)` \| :arrow_down: \| ------ [Continue to review full report at Codecov](https://codecov.io/gh/simonw/datasette/pull/1252?src=pr&el=continue). > Legend - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta) > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data` > Powered by [Codecov](https://codecov.io/gh/simonw/datasette/pull/1252?src=pr&el=footer). Last update [d0fd833...d22aa32](https://codecov.io/gh/simonw/datasette/pull/1252?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	825217564
https://github.com/simonw/datasette/issues/1250#issuecomment-792386484	https://api.github.com/repos/simonw/datasette/issues/1250	792386484	MDEyOklzc3VlQ29tbWVudDc5MjM4NjQ4NA==	9599	2021-03-08T00:29:06Z	2021-03-08T00:29:06Z	OWNER	DuckDB has a read-only mechanism: https://duckdb.org/docs/api/python ```python import duckdb con = duckdb.connect(database="/tmp/blah.db", read_only=True) ```	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	824067604
https://github.com/simonw/datasette/issues/1248#issuecomment-792385274	https://api.github.com/repos/simonw/datasette/issues/1248	792385274	MDEyOklzc3VlQ29tbWVudDc5MjM4NTI3NA==	9599	2021-03-08T00:25:10Z	2021-03-08T00:25:10Z	OWNER	It's not possible yet, unfortunately. This came up on the forums recently: https://github.com/simonw/datasette/discussions/968 I'm leaning further towards making the database connection layer itself work via a plugin hook, which would open up the possibility of supporting DuckDB and other databases as well. I've not committed to doing this yet though.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	823035080
https://github.com/simonw/datasette/issues/1249#issuecomment-792384854	https://api.github.com/repos/simonw/datasette/issues/1249	792384854	MDEyOklzc3VlQ29tbWVudDc5MjM4NDg1NA==	9599	2021-03-08T00:23:38Z	2021-03-08T00:23:38Z	OWNER	One reason to prioritize this issue: Homebrew upgraded to SpatiaLite 5.0 recently https://formulae.brew.sh/formula/spatialite-tools and as a result SpatiaLite databases created on my laptop don't appear to be compatible with Datasette when published using `datasette publish`.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	824064069
https://github.com/simonw/datasette/issues/1249#issuecomment-792384382	https://api.github.com/repos/simonw/datasette/issues/1249	792384382	MDEyOklzc3VlQ29tbWVudDc5MjM4NDM4Mg==	9599	2021-03-08T00:22:02Z	2021-03-08T00:22:02Z	OWNER	I tried this patch against `Dockerfile`: ```diff diff --git a/Dockerfile b/Dockerfile index f4b1414..dd659e1 100644 --- a/Dockerfile +++ b/Dockerfile @@ -1,25 +1,26 @@ -FROM python:3.7.10-slim-stretch as build +FROM python:3.9.2-slim-buster as build # Setup build dependencies RUN apt update \ -&& apt install -y python3-dev build-essential wget libxml2-dev libproj-dev libgeos-dev libsqlite3-dev zlib1g-dev pkg-config git \ - && apt clean + && apt install -y python3-dev build-essential wget libxml2-dev libproj-dev \ + libminizip-dev libgeos-dev libsqlite3-dev zlib1g-dev pkg-config git \ + && apt clean - -RUN wget "https://www.sqlite.org/2020/sqlite-autoconf-3310100.tar.gz" && tar xzf sqlite-autoconf-3310100.tar.gz \ - && cd sqlite-autoconf-3310100 && ./configure --disable-static --enable-fts5 --enable-json1 CFLAGS="-g -O2 -DSQLITE_ENABLE_FTS3=1 -DSQLITE_ENABLE_FTS3_PARENTHESIS -DSQLITE_ENABLE_FTS4=1 -DSQLITE_ENABLE_RTREE=1 -DSQLITE_ENABLE_JSON1" \ +RUN wget "https://www.sqlite.org/2021/sqlite-autoconf-3340100.tar.gz" && tar xzf sqlite-autoconf-3340100.tar.gz \ + && cd sqlite-autoconf-3340100 && ./configure --disable-static --enable-fts5 --enable-json1 \ + CFLAGS="-g -O2 -DSQLITE_ENABLE_FTS3=1 -DSQLITE_ENABLE_FTS3_PARENTHESIS -DSQLITE_ENABLE_FTS4=1 -DSQLITE_ENABLE_RTREE=1 -DSQLITE_ENABLE_JSON1" \ && make && make install -RUN wget "http://www.gaia-gis.it/gaia-sins/freexl-sources/freexl-1.0.5.tar.gz" && tar zxf freexl-1.0.5.tar.gz \ - && cd freexl-1.0.5 && ./configure && make && make install +RUN wget "http://www.gaia-gis.it/gaia-sins/freexl-1.0.6.tar.gz" && tar zxf freexl-1.0.6.tar.gz \ + && cd freexl-1.0.6 && ./configure && make && make install -RUN wget "http://www.gaia-gis.it/gaia-sins/libspatialite-sources/libspatialite-4.4.0-RC0.tar.gz" && tar zxf libspatialite-4.4.0-RC0.tar.gz \ - && cd libspatialite-4.4.0-RC0 && ./configure && make && make install +RUN wget "http://www.gaia-gis.it/gaia-sins/libspatialite-5.0.1.tar.gz" && tar zxf libspatialite-…	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	824064069
https://github.com/simonw/datasette/issues/1249#issuecomment-792383956	https://api.github.com/repos/simonw/datasette/issues/1249	792383956	MDEyOklzc3VlQ29tbWVudDc5MjM4Mzk1Ng==	9599	2021-03-08T00:20:09Z	2021-03-08T00:20:09Z	OWNER	Worth noting that the Docker image used by `datasette publish cloudrun` doesn't actually use that Datasette docker image - it does this: https://github.com/simonw/datasette/blob/d0fd833b8cdd97e1b91d0f97a69b494895d82bee/datasette/utils/__init__.py#L349-L353 Where the apt extras for SpatiaLite are: https://github.com/simonw/datasette/blob/d0fd833b8cdd97e1b91d0f97a69b494895d82bee/datasette/utils/__init__.py#L344-L345 `libsqlite3-mod-spatialite` against that official `python:3.8` image doesn't appear to install SpatiaLite 5.0.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	824064069
https://github.com/simonw/datasette/issues/858#issuecomment-792308036	https://api.github.com/repos/simonw/datasette/issues/858	792308036	MDEyOklzc3VlQ29tbWVudDc5MjMwODAzNg==	1219001	2021-03-07T16:41:54Z	2021-03-07T16:41:54Z	NONE	Apologies if I sound dense but I don't see where you would pass 'shell=True'. I'm using the CLI installed via pip.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	642388564
https://github.com/simonw/datasette/pull/1223#issuecomment-792233255	https://api.github.com/repos/simonw/datasette/issues/1223	792233255	MDEyOklzc3VlQ29tbWVudDc5MjIzMzI1NQ==	9599	2021-03-07T07:41:01Z	2021-03-07T07:41:01Z	OWNER	This is fantastic, thanks so much for tracking this down.	{ "total_count": 1, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 1, "rocket": 0, "eyes": 0 }	806918878
https://github.com/simonw/datasette/issues/858#issuecomment-792230560	https://api.github.com/repos/simonw/datasette/issues/858	792230560	MDEyOklzc3VlQ29tbWVudDc5MjIzMDU2MA==	39445562	2021-03-07T07:14:58Z	2021-03-07T07:14:58Z	NONE	To get it to work I had to: - add `shell=true` to the various commands in datasette - use the name argument of the publish command. (https://docs.datasette.io/en/stable/publish.html)	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	642388564
https://github.com/simonw/datasette/issues/858#issuecomment-792129022	https://api.github.com/repos/simonw/datasette/issues/858	792129022	MDEyOklzc3VlQ29tbWVudDc5MjEyOTAyMg==	1219001	2021-03-07T00:23:34Z	2021-03-07T00:23:34Z	NONE	@smithdc1 Can you tell us what you did to get it to publish in Windows? What commands did you pass?	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	642388564
https://github.com/simonw/datasette/issues/766#issuecomment-791509910	https://api.github.com/repos/simonw/datasette/issues/766	791509910	MDEyOklzc3VlQ29tbWVudDc5MTUwOTkxMA==	6371750	2021-03-05T15:57:35Z	2021-03-05T16:35:21Z	CONTRIBUTOR	Hello, I have the same wildcard search problem with an instance of Datasette. http://crbc-dataset.huma-num.fr/inventaires/fonds_auguste_dupouy_1872_1967?_search=gwerz&_sort=rowid is OK but http://crbc-dataset.huma-num.fr/inventaires/fonds_auguste_dupouy_1872_1967?_search=gwe* is not (FTS is activated on "Reference" "IntituleAnalyse" "NomDuProducteur" "PresentationDuContenu" "Notes"). Note that the SQL query below, run directly in SQLite from the server's shell, does return results: `select * from fonds_auguste_dupouy_1872_1967_fts where IntituleAnalyse MATCH "gwe*";` Thanks,	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	617323873
https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-791530093	https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5	791530093	MDEyOklzc3VlQ29tbWVudDc5MTUzMDA5Mw==	306240	2021-03-05T16:28:07Z	2021-03-05T16:28:07Z	NONE	> I just tried to run this on a small VPS instance with 2GB of memory and it crashed out of memory while processing a 12GB mbox from Takeout. > > Is it possible to stream the emails to sqlite instead of loading it all into memory and upserting at once? @maxhawkins a limitation of the python mbox module is that it loads the entire mbox into memory. I did find another approach to this problem that didn't use the builtin python mbox module and created a generator so that it didn't have to load the whole mbox into memory. I was hoping to use standard library modules, but this might be a good reason to investigate that approach a bit more. My worry is making sure a custom processor handles all the ins and outs of the mbox format correctly. Hm. As I'm writing this, I thought of something. I think I can parse each message one at a time, and then use an mbox function to load each message using the python mbox module. That way the mbox module can still deal with the specifics of the mbox format, but I can use a generator. I'll give that a try. Thanks for the feedback @maxhawkins and @simonw. @simonw can we hold off on merging this until I can test this new approach?	{ "total_count": 3, "+1": 3, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	813880401
https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-791089881	https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5	791089881	MDEyOklzc3VlQ29tbWVudDc5MTA4OTg4MQ==	28565	2021-03-05T02:03:19Z	2021-03-05T02:03:19Z	NONE	I just tried to run this on a small VPS instance with 2GB of memory and it crashed out of memory while processing a 12GB mbox from Takeout. Is it possible to stream the emails to sqlite instead of loading it all into memory and upserting at once?	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	813880401
https://github.com/dogsheep/dogsheep-photos/issues/32#issuecomment-791053721	https://api.github.com/repos/dogsheep/dogsheep-photos/issues/32	791053721	MDEyOklzc3VlQ29tbWVudDc5MTA1MzcyMQ==	6213	2021-03-05T00:31:27Z	2021-03-05T00:31:27Z	NONE	I am getting the same thing for US West (N. California) us-west-1	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	803333769
https://github.com/dogsheep/google-takeout-to-sqlite/issues/4#issuecomment-790934616	https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/4	790934616	MDEyOklzc3VlQ29tbWVudDc5MDkzNDYxNg==	203343	2021-03-04T20:54:44Z	2021-03-04T20:54:44Z	NONE	Sorry for the delay, I got sidetracked after class last night. I am getting the following error: ``` /content# google-takeout-to-sqlite mbox takeout.db Takeout/Mail/gmail.mbox Usage: google-takeout-to-sqlite [OPTIONS] COMMAND [ARGS]... Try 'google-takeout-to-sqlite --help' for help. Error: No such command 'mbox'. ``` On the box, I installed with pip after cloning: https://github.com/UtahDave/google-takeout-to-sqlite.git	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	778380836
https://github.com/simonw/datasette/issues/1238#issuecomment-790857004	https://api.github.com/repos/simonw/datasette/issues/1238	790857004	MDEyOklzc3VlQ29tbWVudDc5MDg1NzAwNA==	79913	2021-03-04T19:06:55Z	2021-03-04T19:06:55Z	NONE	@rgieseke Ah, that's super helpful. Thank you for the workaround for now!	{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	813899472
https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790695126	https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5	790695126	MDEyOklzc3VlQ29tbWVudDc5MDY5NTEyNg==	9599	2021-03-04T15:20:42Z	2021-03-04T15:20:42Z	MEMBER	I'm not sure why but my most recent import, when displayed in Datasette, looks like this: <img width="574" alt="mbox__mbox_emails__753_446_rows" src="https://user-images.githubusercontent.com/9599/109985836-0ab00080-7cba-11eb-97d5-0631a0835b61.png"> Sorting by `id` in the opposite order gives me the data I would expect - so it looks like a bunch of null/blank messages are being imported at some point and showing up first due to ID ordering.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	813880401
https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790693674	https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5	790693674	MDEyOklzc3VlQ29tbWVudDc5MDY5MzY3NA==	9599	2021-03-04T15:18:36Z	2021-03-04T15:18:36Z	MEMBER	I imported my 10GB mbox with 750,000 emails in it, ran this tool (with a hacked fix for the blob column problem) - and now a search that returns 92 results takes 25.37ms! This is fantastic.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	813880401
https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790669767	https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5	790669767	MDEyOklzc3VlQ29tbWVudDc5MDY2OTc2Nw==	9599	2021-03-04T14:46:06Z	2021-03-04T14:46:06Z	MEMBER	Solution could be to pre-process that string by splitting on `(` and dropping everything afterwards, assuming that the `(...)` bit isn't necessary for correctly parsing the date.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	813880401
https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790668263	https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5	790668263	MDEyOklzc3VlQ29tbWVudDc5MDY2ODI2Mw==	9599	2021-03-04T14:43:58Z	2021-03-04T14:43:58Z	MEMBER	I added this code to output a message ID on errors: ```diff print("Errors: {}".format(num_errors)) print(traceback.format_exc()) + print("Message-Id: {}".format(email.get("Message-Id", "None"))) continue ``` Having found a message ID that had an error, I ran this command to see the context: rg --text --context 20 '44F289B0.000001.02100@SCHWARZE-DWFXMI' ~/gmail.mbox This was for the following error: ``` File "/Users/simon/Dropbox/Development/google-takeout-to-sqlite/google_takeout_to_sqlite/utils.py", line 102, in get_mbox message["date"] = get_message_date(email.get("Date"), email.get_from()) File "/Users/simon/Dropbox/Development/google-takeout-to-sqlite/google_takeout_to_sqlite/utils.py", line 178, in get_message_date datetime_tuple = email.utils.parsedate_tz(mail_date) File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.7/lib/python3.7/email/_parseaddr.py", line 50, in parsedate_tz res = _parsedate_tz(data) File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.7/lib/python3.7/email/_parseaddr.py", line 69, in _parsedate_tz data = data.split() AttributeError: 'Header' object has no attribute 'split' ``` Here's what I spotted in the `ripgrep` output: ``` 177133570:Message-Id: <44F289B0.000001.02100@SCHWARZE-DWFXMI> 177133571-Date: Mon, 28 Aug 2006 08:14:08 +0200 (Westeurop�ische Sommerzeit) 177133572-X-Mailer: IncrediMail (5002253) ``` So it could be that `_parsedate_tz` is having trouble with that `Mon, 28 Aug 2006 08:14:08 +0200 (Westeurop�ische Sommerzeit)` string.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	813880401
https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790391711	https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5	790391711	MDEyOklzc3VlQ29tbWVudDc5MDM5MTcxMQ==	306240	2021-03-04T07:36:24Z	2021-03-04T07:36:24Z	NONE	> Looks like you're doing this: > > ```python > elif message.get_content_type() == "text/plain": > body = message.get_payload(decode=True) > ``` > > So presumably that decodes to a unicode string? > > I imagine the reason the column is a `BLOB` for me is that `sqlite-utils` determines the column type based on the first batch of items - https://github.com/simonw/sqlite-utils/blob/09c3386f55f766b135b6a1c00295646c4ae29bec/sqlite_utils/db.py#L1927-L1928 - and I got unlucky and had something in my first batch that wasn't a unicode string. Ah, that's good to know. I think explicitly creating the tables will be a great improvement. I'll add that. Also, I noticed after I opened this PR that the `message.get_payload()` is being deprecated in favor of `message.get_content()` or something like that. I'll see if that handles the decoding better, too. Thanks for the feedback. I should have time tomorrow to put together some improvements.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	813880401
https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790389335	https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5	790389335	MDEyOklzc3VlQ29tbWVudDc5MDM4OTMzNQ==	306240	2021-03-04T07:32:04Z	2021-03-04T07:32:04Z	NONE	> The command takes quite a while to start running, presumably because this line causes it to have to scan the WHOLE file in order to generate a count: > > https://github.com/dogsheep/google-takeout-to-sqlite/blob/a3de045eba0fae4b309da21aa3119102b0efc576/google_takeout_to_sqlite/utils.py#L66-L67 > > I'm fine with waiting though. It's not like this is a command people run every day - and without that count we can't show a progress bar, which seems pretty important for a process that takes this long. The wait is from python loading the mbox file. This happens regardless of whether you're getting the length of the mbox. The mbox module is on the slow side. It is possible to do one's own parsing of the mbox, but I kind of wanted to avoid doing that.	{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	813880401
https://github.com/dogsheep/google-takeout-to-sqlite/issues/6#issuecomment-790384087	https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/6	790384087	MDEyOklzc3VlQ29tbWVudDc5MDM4NDA4Nw==	9599	2021-03-04T07:22:51Z	2021-03-04T07:22:51Z	MEMBER	#3 also mentions the conflicting version with other tools.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	821841046
https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790380839	https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5	790380839	MDEyOklzc3VlQ29tbWVudDc5MDM4MDgzOQ==	9599	2021-03-04T07:17:05Z	2021-03-04T07:17:05Z	MEMBER	Looks like you're doing this: ```python elif message.get_content_type() == "text/plain": body = message.get_payload(decode=True) ``` So presumably that decodes to a unicode string? I imagine the reason the column is a `BLOB` for me is that `sqlite-utils` determines the column type based on the first batch of items - https://github.com/simonw/sqlite-utils/blob/09c3386f55f766b135b6a1c00295646c4ae29bec/sqlite_utils/db.py#L1927-L1928 - and I got unlucky and had something in my first batch that wasn't a unicode string.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	813880401
https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790379629	https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5	790379629	MDEyOklzc3VlQ29tbWVudDc5MDM3OTYyOQ==	9599	2021-03-04T07:14:41Z	2021-03-04T07:14:41Z	MEMBER	Confirmed: removing the `len()` call does not speed things up, so it's reading through the entire file for some other purpose too.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	813880401
https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790378658	https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5	790378658	MDEyOklzc3VlQ29tbWVudDc5MDM3ODY1OA==	9599	2021-03-04T07:12:48Z	2021-03-04T07:12:48Z	MEMBER	It looks like the `body` is being loaded into a BLOB column - so in Datasette default it looks like this: <img width="1650" alt="mbox__mbox_emails__753_446_rows" src="https://user-images.githubusercontent.com/9599/109924808-b4b96980-7c75-11eb-8c9e-307f2ae32d5a.png"> If I `datasette install datasette-render-binary` and then try again I get this: <img width="1487" alt="mbox__mbox_emails__753_446_rows" src="https://user-images.githubusercontent.com/9599/109924944-ea5e5280-7c75-11eb-9a32-404f3d68455f.png"> It would be great if we could store the `body` as unicode text instead. May have to do something clever to decode it based on some kind of charset header?	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	813880401
https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790373024	https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5	790373024	MDEyOklzc3VlQ29tbWVudDc5MDM3MzAyNA==	9599	2021-03-04T07:01:58Z	2021-03-04T07:04:06Z	MEMBER	I got 9 warnings that look like this: ``` Errors: 1 Traceback (most recent call last): File "/Users/simon/Dropbox/Development/google-takeout-to-sqlite/google_takeout_to_sqlite/utils.py", line 103, in get_mbox message["date"] = get_message_date(email.get("Date"), email.get_from()) File "/Users/simon/Dropbox/Development/google-takeout-to-sqlite/google_takeout_to_sqlite/utils.py", line 167, in get_message_date datetime_tuple = email.utils.parsedate_tz(mail_date) File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.7/lib/python3.7/email/_parseaddr.py", line 50, in parsedate_tz res = _parsedate_tz(data) File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.7/lib/python3.7/email/_parseaddr.py", line 69, in _parsedate_tz data = data.split() AttributeError: 'Header' object has no attribute 'split' ``` It would be useful if those warnings told me the message ID (or similar) of the affected message so I could grep for it in the `mbox` and see what was going on.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	813880401
https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790372621	https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5	790372621	MDEyOklzc3VlQ29tbWVudDc5MDM3MjYyMQ==	9599	2021-03-04T07:01:18Z	2021-03-04T07:01:18Z	MEMBER	I'm not sure if it would work, but there is an alternative pattern for showing a progress bar against a really large file that I've used in `healthkit-to-sqlite` - you set the progress bar size to the size of the file in bytes, then update a counter as you read the file. https://github.com/dogsheep/healthkit-to-sqlite/blob/3eb2b06bfe3b4faaf10e9cf9dfcb28e3d16c14ff/healthkit_to_sqlite/cli.py#L24-L57 and https://github.com/dogsheep/healthkit-to-sqlite/blob/3eb2b06bfe3b4faaf10e9cf9dfcb28e3d16c14ff/healthkit_to_sqlite/utils.py#L4-L19 (the `progress_callback()` bit) is where that happens. It can be a bit of a convoluted pattern, and I'm not at all sure it would work for `mbox` files since it looks like that library has other reasons it needs to do a file scan rather than streaming it through one chunk of bytes at a time. So I imagine this would not work here.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	813880401
https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790370485	https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5	790370485	MDEyOklzc3VlQ29tbWVudDc5MDM3MDQ4NQ==	9599	2021-03-04T06:57:25Z	2021-03-04T06:57:48Z	MEMBER	The command takes quite a while to start running, presumably because this line causes it to have to scan the WHOLE file in order to generate a count: https://github.com/dogsheep/google-takeout-to-sqlite/blob/a3de045eba0fae4b309da21aa3119102b0efc576/google_takeout_to_sqlite/utils.py#L66-L67 I'm fine with waiting though. It's not like this is a command people run every day - and without that count we can't show a progress bar, which seems pretty important for a process that takes this long.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	813880401
https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790369076	https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5	790369076	MDEyOklzc3VlQ29tbWVudDc5MDM2OTA3Ng==	9599	2021-03-04T06:54:46Z	2021-03-04T06:54:46Z	MEMBER	The Rich-powered progress bar is pretty: ![rich](https://user-images.githubusercontent.com/9599/109923307-71f69200-7c73-11eb-9ee2-8f0a240f3994.gif)	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	813880401
https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790312268	https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5	790312268	MDEyOklzc3VlQ29tbWVudDc5MDMxMjI2OA==	9599	2021-03-04T05:48:16Z	2021-03-04T05:48:16Z	MEMBER	Wow, my mbox is a 10.35 GB download!	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	813880401
https://github.com/simonw/datasette/pull/1243#issuecomment-790311215	https://api.github.com/repos/simonw/datasette/issues/1243	790311215	MDEyOklzc3VlQ29tbWVudDc5MDMxMTIxNQ==	9599	2021-03-04T05:45:57Z	2021-03-04T05:45:57Z	OWNER	Thanks!	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	815955014
https://github.com/simonw/datasette/issues/268#issuecomment-790257263	https://api.github.com/repos/simonw/datasette/issues/268	790257263	MDEyOklzc3VlQ29tbWVudDc5MDI1NzI2Mw==	649467	2021-03-04T03:20:23Z	2021-03-04T03:20:23Z	NONE	It's kind of an ugly hack, but you can try out what using the fts5 table as an actual datasette-accessible table looks like without changing any datasette code by creating yet another view on top of the fts5 table: `create view proxyview as select *, rank, table_fts as fts from table_fts;` That's now visible from datasette, just like any other view, but you can use `fts match escape_fts(search_string) order by rank`. This is only good as a proof of concept because you're inefficiently going from view -> fts5 external content table -> view -> data table. However, it does show it works.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	323718842
https://github.com/dogsheep/google-takeout-to-sqlite/issues/4#issuecomment-790198930	https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/4	790198930	MDEyOklzc3VlQ29tbWVudDc5MDE5ODkzMA==	203343	2021-03-04T00:58:40Z	2021-03-04T00:58:40Z	NONE	I am just seeing this sorry, yes! I will kick the tires later on tonight. My apologies for the delay.	{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	778380836
https://github.com/simonw/datasette/issues/283#issuecomment-789680230	https://api.github.com/repos/simonw/datasette/issues/283	789680230	MDEyOklzc3VlQ29tbWVudDc4OTY4MDIzMA==	605492	2021-03-03T12:28:42Z	2021-03-03T12:28:42Z	NONE	One note on using this pragma: I got an error on starting datasette, `no such table: pragma_database_list`. I traced this to an older version of sqlite3 (3.14.2), and upgrading to a newer version (3.34.2) fixed the issue.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	325958506
https://github.com/simonw/datasette/issues/268#issuecomment-789409126	https://api.github.com/repos/simonw/datasette/issues/268	789409126	MDEyOklzc3VlQ29tbWVudDc4OTQwOTEyNg==	649467	2021-03-03T03:57:15Z	2021-03-03T03:58:40Z	NONE	In FTS5, I think doing an FTS search is actually much easier than doing a join against the main table like datasette does now. In fact, FTS5 external content tables provide a transparent interface back to the original table or view. Here's what I'm currently doing: * build a view that joins whatever tables I want and rename the columns to non-joiny names (e.g., `chapter.name AS chapter_name` in the view where needed) * Create an FTS5 table with `content="viewname"` * As described in the "external content tables" section (https://www.sqlite.org/fts5.html#external_content_tables), sql queries can be made directly to the FTS table, which behind the covers makes select calls to the content table when the content of the original columns is needed. * In addition, you get "rank" and "bm25()" available to you when you select on the _fts table. Unfortunately, datasette doesn't currently seem happy being coerced into doing a real query on an fts5 table. This works: ```select col1, col2, col3 from table_fts where col1="value" and table_fts match escape_fts("search term") order by rank``` But this doesn't work in the datasette SQL query interface: ```select col1, col2, col3 from table_fts where col1="value" and table_fts match escape_fts(:search) order by rank``` (the "search" input text field doesn't show up) For what datasette is doing right now, I think you could just use contentless fts5 tables (`content=""`), since all you care about is the rowid: all you're doing is a subselect to get the rowid anyway. In fts5, that's just a contentless table. I guess if you want to follow this suggestion, you'd need a somewhat different code path for fts5.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	323718842
https://github.com/simonw/datasette/issues/1238#issuecomment-789186458	https://api.github.com/repos/simonw/datasette/issues/1238	789186458	MDEyOklzc3VlQ29tbWVudDc4OTE4NjQ1OA==	198537	2021-03-02T20:19:30Z	2021-03-02T20:19:30Z	CONTRIBUTOR	A custom `templates/index.html` seems to work, and custom `pages` work too if, as a workaround, you move them to `pages/base_url_dir`.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	813899472
https://github.com/simonw/datasette/issues/1247#issuecomment-787616446	https://api.github.com/repos/simonw/datasette/issues/1247	787616446	MDEyOklzc3VlQ29tbWVudDc4NzYxNjQ0Ng==	9599	2021-03-01T03:50:37Z	2021-03-01T03:50:37Z	OWNER	I like the `.add_memory_database()` option. I also like that it makes it more obvious that this is a capability of Datasette, since I'm excited to see more plugins, features and tests that take advantage of it.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	818430405
https://github.com/simonw/datasette/issues/1247#issuecomment-787616158	https://api.github.com/repos/simonw/datasette/issues/1247	787616158	MDEyOklzc3VlQ29tbWVudDc4NzYxNjE1OA==	9599	2021-03-01T03:49:27Z	2021-03-01T03:49:27Z	OWNER	A couple of options: ```python datasette.add_memory_database("test_json_array") # or make that first argument to add_database() optional and support: datasette.add_database(memory_name="test_json_array") ```	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	818430405
https://github.com/simonw/datasette/issues/1246#issuecomment-787611153	https://api.github.com/repos/simonw/datasette/issues/1246	787611153	MDEyOklzc3VlQ29tbWVudDc4NzYxMTE1Mw==	9599	2021-03-01T03:30:57Z	2021-03-01T03:30:57Z	OWNER	I'm going to try a new pattern for testing this, enabled by #1151 - the test will create a new named in-memory database, write some records to it and then run some test facets against that. This will save me from having to add yet another fixtures table for this.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	817597268
https://github.com/simonw/datasette/issues/1005#issuecomment-787536267	https://api.github.com/repos/simonw/datasette/issues/1005	787536267	MDEyOklzc3VlQ29tbWVudDc4NzUzNjI2Nw==	9599	2021-02-28T22:30:37Z	2021-02-28T22:30:37Z	OWNER	It's out! https://github.com/encode/httpx/releases/tag/0.17.0	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	718259202
https://github.com/simonw/sqlite-utils/issues/242#issuecomment-787532279	https://api.github.com/repos/simonw/sqlite-utils/issues/242	787532279	MDEyOklzc3VlQ29tbWVudDc4NzUzMjI3OQ==	9599	2021-02-28T22:09:37Z	2021-02-28T22:09:37Z	OWNER	Microsoft's playwright Python library solves this problem by code generating both their sync AND their async libraries https://github.com/microsoft/playwright-python/tree/master/scripts	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	817989436
https://github.com/simonw/sqlite-utils/issues/242#issuecomment-787198202	https://api.github.com/repos/simonw/sqlite-utils/issues/242	787198202	MDEyOklzc3VlQ29tbWVudDc4NzE5ODIwMg==	9599	2021-02-27T22:33:58Z	2021-02-27T22:33:58Z	OWNER	Hah or use this trick, which genuinely rewrites the code at runtime using a class decorator! https://github.com/python-happybase/aiohappybase/blob/0990ef45cfdb720dc987afdb4957a0fac591cb99/aiohappybase/sync/_util.py#L19-L32	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	817989436
https://github.com/simonw/sqlite-utils/issues/242#issuecomment-787195536	https://api.github.com/repos/simonw/sqlite-utils/issues/242	787195536	MDEyOklzc3VlQ29tbWVudDc4NzE5NTUzNg==	9599	2021-02-27T22:13:24Z	2021-02-27T22:13:24Z	OWNER	Some other interesting background reading: https://docs.sqlalchemy.org/en/14/orm/extensions/asyncio.html - in particular see how SQLAlchemy has an `await conn.run_sync(meta.drop_all)` mechanism for running methods that haven't themselves been provided in an async version	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	817989436
https://github.com/simonw/sqlite-utils/issues/242#issuecomment-787190562	https://api.github.com/repos/simonw/sqlite-utils/issues/242	787190562	MDEyOklzc3VlQ29tbWVudDc4NzE5MDU2Mg==	9599	2021-02-27T22:04:00Z	2021-02-27T22:04:00Z	OWNER	From the poster here: https://github.com/sethmlarson/pycon-async-sync-poster/blob/master/poster.pdf <img width="624" alt="pycon-async-sync-poster_poster_pdf_at_master_·_sethmlarson_pycon-async-sync-poster" src="https://user-images.githubusercontent.com/9599/109401634-9f0a1400-7904-11eb-8b3a-37df0678b8dc.png">	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	817989436
https://github.com/simonw/sqlite-utils/issues/242#issuecomment-787186826	https://api.github.com/repos/simonw/sqlite-utils/issues/242	787186826	MDEyOklzc3VlQ29tbWVudDc4NzE4NjgyNg==	9599	2021-02-27T22:01:54Z	2021-02-27T22:01:54Z	OWNER	`unasync` is an implementation of the exact pattern I was talking about above - it uses the `tokenize` module from the Python standard library to apply some clever rules to transform an async codebase into a sync one. https://unasync.readthedocs.io/en/latest/ - implementation here: https://github.com/python-trio/unasync/blob/v0.5.0/src/unasync/__init__.py	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	817989436
https://github.com/simonw/sqlite-utils/issues/242#issuecomment-787175126	https://api.github.com/repos/simonw/sqlite-utils/issues/242	787175126	MDEyOklzc3VlQ29tbWVudDc4NzE3NTEyNg==	9599	2021-02-27T21:55:05Z	2021-02-27T21:55:05Z	OWNER	"how to use some new tools to more easily maintain a codebase that supports both async and synchronous I/O and multiple async libraries" - yeah that's exactly what I need, thank you!	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	817989436
https://github.com/simonw/sqlite-utils/issues/242#issuecomment-787150276	https://api.github.com/repos/simonw/sqlite-utils/issues/242	787150276	MDEyOklzc3VlQ29tbWVudDc4NzE1MDI3Ng==	37962604	2021-02-27T21:27:26Z	2021-02-27T21:27:26Z	NONE	I had this resource by Seth Michael Larson saved https://github.com/sethmlarson/pycon-async-sync-poster I haven't had a look at it, but it may contain useful info. On twitter, I mentioned passing an aiosqlite connection during the `Database` creation. I'm not 100% familiar with the `sqlite-utils` codebase, so I may be wrong here, but maybe decorating internal functions could be an option? Then they are awaited or not inside the decorator depending on how they are called.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	817989436
https://github.com/simonw/sqlite-utils/issues/242#issuecomment-787144523	https://api.github.com/repos/simonw/sqlite-utils/issues/242	787144523	MDEyOklzc3VlQ29tbWVudDc4NzE0NDUyMw==	9599	2021-02-27T21:18:46Z	2021-02-27T21:18:46Z	OWNER	Here's a really wild idea: I wonder if it would be possible to run a source transformation against either the sync or the async versions of the code to produce the equivalent for the other paradigm? Could that even be as simple as a set of regular expressions against the `await ...` version that strips out or replaces the `await` and `async def` and `async for` statements? If so... I could maintain just the async version, generate the sync version with a script and rely on robust unit testing to guarantee that this actually works.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	817989436
https://github.com/simonw/sqlite-utils/issues/242#issuecomment-787142066	https://api.github.com/repos/simonw/sqlite-utils/issues/242	787142066	MDEyOklzc3VlQ29tbWVudDc4NzE0MjA2Ng==	9599	2021-02-27T21:17:10Z	2021-02-27T21:17:10Z	OWNER	I have a hunch this is actually going to be quite difficult, due to the internal complexity of some of the `sqlite-utils` API methods. Consider `db[table].extract(...)` for example. It does a whole bunch of extra queries inside the method - each of those would need to be turned into an `await` call for the async version. Here's the method body today: https://github.com/simonw/sqlite-utils/blob/09c3386f55f766b135b6a1c00295646c4ae29bec/sqlite_utils/db.py#L1060-L1152 Writing this method twice - looking similar but with `await ...` tucked in before every internal method it calls that needs to execute SQL - is going to be pretty messy. One thing that would help a LOT is figuring out how to share the majority of the test code. If the exact same tests could run against both the sync and async versions with a bit of test trickery, maintaining parallel implementations would at least be a bit more feasible.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	817989436
https://github.com/simonw/sqlite-utils/issues/242#issuecomment-787121933	https://api.github.com/repos/simonw/sqlite-utils/issues/242	787121933	MDEyOklzc3VlQ29tbWVudDc4NzEyMTkzMw==	25778	2021-02-27T19:18:57Z	2021-02-27T19:18:57Z	CONTRIBUTOR	I think HTTPX gets it exactly right, with a clear separation between sync and async clients, each with a basically identical API. (I'm about to switch [feed-to-sqlite](https://github.com/eyeseast/feed-to-sqlite) over to it, from Requests, to eventually make way for async support.)	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	817989436
https://github.com/simonw/sqlite-utils/issues/242#issuecomment-787120136	https://api.github.com/repos/simonw/sqlite-utils/issues/242	787120136	MDEyOklzc3VlQ29tbWVudDc4NzEyMDEzNg==	9599	2021-02-27T19:04:47Z	2021-02-27T19:04:47Z	OWNER	Another option here would be to add https://github.com/omnilib/aiosqlite/blob/main/aiosqlite/core.py as a dependency - it's four years old now and actively maintained, and the code is pretty small so it looks like a solid, stable, reliable dependency.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	817989436
https://github.com/simonw/sqlite-utils/issues/242#issuecomment-787118691	https://api.github.com/repos/simonw/sqlite-utils/issues/242	787118691	MDEyOklzc3VlQ29tbWVudDc4NzExODY5MQ==	9599	2021-02-27T18:53:23Z	2021-02-27T18:53:23Z	OWNER	Datasette has its own implementation of a write queue for exactly this purpose - and there's no reason at all that should stay in Datasette rather than being extracted out and moved over here to `sqlite-utils`. One small concern I have is around the API design. I'd want to keep supporting the existing synchronous API while also providing a similar API with await-based methods. What are some good examples of libraries that do this? I like how https://www.python-httpx.org/ handles it, maybe that's a good example to imitate?	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	817989436
https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-786925280	https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5	786925280	MDEyOklzc3VlQ29tbWVudDc4NjkyNTI4MA==	9599	2021-02-26T22:23:10Z	2021-02-26T22:23:10Z	MEMBER	Thanks! I requested my Gmail export from takeout - once that arrives I'll test it against this and then merge the PR.	{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	813880401
https://github.com/simonw/datasette/issues/1238#issuecomment-786849095	https://api.github.com/repos/simonw/datasette/issues/1238	786849095	MDEyOklzc3VlQ29tbWVudDc4Njg0OTA5NQ==	9599	2021-02-26T19:29:38Z	2021-02-26T19:29:38Z	OWNER	Here's the test I wrote: ```diff git diff tests/test_custom_pages.py diff --git a/tests/test_custom_pages.py b/tests/test_custom_pages.py index 6a23192..5a71f56 100644 --- a/tests/test_custom_pages.py +++ b/tests/test_custom_pages.py @@ -2,11 +2,19 @@ import pathlib import pytest from .fixtures import make_app_client +TEST_TEMPLATE_DIRS = str(pathlib.Path(__file__).parent / "test_templates") + @pytest.fixture(scope="session") def custom_pages_client(): + with make_app_client(template_dir=TEST_TEMPLATE_DIRS) as client: + yield client + + +@pytest.fixture(scope="session") +def custom_pages_client_with_base_url(): with make_app_client( - template_dir=str(pathlib.Path(__file__).parent / "test_templates") + template_dir=TEST_TEMPLATE_DIRS, config={"base_url": "/prefix/"} ) as client: yield client @@ -23,6 +31,12 @@ def test_request_is_available(custom_pages_client): assert "path:/request" == response.text +def test_custom_pages_with_base_url(custom_pages_client_with_base_url): + response = custom_pages_client_with_base_url.get("/prefix/request") + assert 200 == response.status + assert "path:/prefix/request" == response.text + + def test_custom_pages_nested(custom_pages_client): response = custom_pages_client.get("/nested/nest") assert 200 == response.status ```	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	813899472
https://github.com/simonw/datasette/issues/1238#issuecomment-786848654	https://api.github.com/repos/simonw/datasette/issues/1238	786848654	MDEyOklzc3VlQ29tbWVudDc4Njg0ODY1NA==	9599	2021-02-26T19:28:48Z	2021-02-26T19:28:48Z	OWNER	I added a debug line just before `for regex, wildcard_template` here: https://github.com/simonw/datasette/blob/afed51b1e36cf275c39e71c7cb262d6c5bdbaa31/datasette/app.py#L1148-L1155 And it showed that for some reason `request.path` is `/prefix/prefix/request` here - the prefix got doubled somehow.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	813899472
https://github.com/simonw/datasette/issues/1238#issuecomment-786841261	https://api.github.com/repos/simonw/datasette/issues/1238	786841261	MDEyOklzc3VlQ29tbWVudDc4Njg0MTI2MQ==	9599	2021-02-26T19:13:44Z	2021-02-26T19:13:44Z	OWNER	Sounds like a bug - thanks for reporting this.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	813899472
https://github.com/simonw/datasette/issues/1246#issuecomment-786840734	https://api.github.com/repos/simonw/datasette/issues/1246	786840734	MDEyOklzc3VlQ29tbWVudDc4Njg0MDczNA==	9599	2021-02-26T19:12:39Z	2021-02-26T19:12:47Z	OWNER	Could I take this part: ```python suggested_facet_sql = """ select distinct json_type({column}) from ({sql}) """.format( column=escape_sqlite(column), sql=self.sql ) ``` And add `where {column} is not null and {column} != ''` perhaps?	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	817597268
https://github.com/simonw/datasette/issues/1246#issuecomment-786840425	https://api.github.com/repos/simonw/datasette/issues/1246	786840425	MDEyOklzc3VlQ29tbWVudDc4Njg0MDQyNQ==	9599	2021-02-26T19:11:56Z	2021-02-26T19:11:56Z	OWNER	Relevant code: https://github.com/simonw/datasette/blob/afed51b1e36cf275c39e71c7cb262d6c5bdbaa31/datasette/facets.py#L271-L295	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	817597268
https://github.com/simonw/sqlite-utils/issues/239#issuecomment-786830832	https://api.github.com/repos/simonw/sqlite-utils/issues/239	786830832	MDEyOklzc3VlQ29tbWVudDc4NjgzMDgzMg==	9599	2021-02-26T18:52:40Z	2021-02-26T18:52:40Z	OWNER	Could this handle lists of objects too? That would be pretty amazing - if the column has a `[{...}, {...}]` list in it could turn that into a many-to-many.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	816526538
https://github.com/simonw/datasette/issues/1240#issuecomment-786813506	https://api.github.com/repos/simonw/datasette/issues/1240	786813506	MDEyOklzc3VlQ29tbWVudDc4NjgxMzUwNg==	9599	2021-02-26T18:19:46Z	2021-02-26T18:19:46Z	OWNER	Linking to rows from custom queries is a lot harder - because given an arbitrary string of SQL it's difficult to analyze it and figure out which (if any) of the returned columns represent a primary key. It's possible to manually write a SQL query that returns a column that will be treated as a link to another page using this plugin, but it's not particularly straight-forward: https://datasette.io/plugins/datasette-json-html	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	814591962
https://github.com/simonw/datasette/issues/1240#issuecomment-786812716	https://api.github.com/repos/simonw/datasette/issues/1240	786812716	MDEyOklzc3VlQ29tbWVudDc4NjgxMjcxNg==	9599	2021-02-26T18:18:18Z	2021-02-26T18:18:18Z	OWNER	Agreed, this would be extremely useful. I'd love to be able to facet against custom queries. It's a fair bit of work to implement but it's not impossible. Closing this as a duplicate of #972.	{ "total_count": 1, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 1, "rocket": 0, "eyes": 0 }	814591962
https://github.com/simonw/sqlite-utils/issues/239#issuecomment-786795132	https://api.github.com/repos/simonw/sqlite-utils/issues/239	786795132	MDEyOklzc3VlQ29tbWVudDc4Njc5NTEzMg==	9599	2021-02-26T17:45:53Z	2021-02-26T17:45:53Z	OWNER	If there's no primary key in the JSON could use the `hash_id` mechanism.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	816526538
https://github.com/simonw/sqlite-utils/issues/239#issuecomment-786794435	https://api.github.com/repos/simonw/sqlite-utils/issues/239	786794435	MDEyOklzc3VlQ29tbWVudDc4Njc5NDQzNQ==	9599	2021-02-26T17:44:38Z	2021-02-26T17:44:38Z	OWNER	This came up in office hours!	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	816526538
https://github.com/simonw/datasette/issues/1244#issuecomment-786786645	https://api.github.com/repos/simonw/datasette/issues/1244	786786645	MDEyOklzc3VlQ29tbWVudDc4Njc4NjY0NQ==	9599	2021-02-26T17:30:38Z	2021-02-26T17:30:38Z	OWNER	New paragraph at the top of https://docs.datasette.io/en/latest/writing_plugins.html > Want to start by looking at an example? The [Datasette plugins directory](https://datasette.io/plugins) lists more than 50 open source plugins with code you can explore. The [plugin hooks](https://docs.datasette.io/en/latest/plugin_hooks.html#plugin-hooks) page includes links to example plugins for each of the documented hooks.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	817528452
https://github.com/simonw/sqlite-utils/issues/237#issuecomment-786050562	https://api.github.com/repos/simonw/sqlite-utils/issues/237	786050562	MDEyOklzc3VlQ29tbWVudDc4NjA1MDU2Mg==	9599	2021-02-25T16:57:56Z	2021-02-25T16:57:56Z	OWNER	`sqlite-utils create-view` currently has a `--ignore` option, so adding that to `sqlite-utils drop-view` and `sqlite-utils drop-table` makes sense as well.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	815554385
https://github.com/simonw/sqlite-utils/issues/237#issuecomment-786049686	https://api.github.com/repos/simonw/sqlite-utils/issues/237	786049686	MDEyOklzc3VlQ29tbWVudDc4NjA0OTY4Ng==	9599	2021-02-25T16:56:42Z	2021-02-25T16:56:42Z	OWNER	So: ```python db["my_table"].drop(ignore=True) ```	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	815554385
https://github.com/simonw/sqlite-utils/issues/237#issuecomment-786049394	https://api.github.com/repos/simonw/sqlite-utils/issues/237	786049394	MDEyOklzc3VlQ29tbWVudDc4NjA0OTM5NA==	9599	2021-02-25T16:56:14Z	2021-02-25T16:56:14Z	OWNER	Other methods (`db.create_view()` for example) have `ignore=True` to mean "don't throw an error if this causes a problem", so I'm good with adding that to `.drop_view()`. I don't like using it as the default partly because that would be a very minor breaking API change, but mainly because I don't want to hide mistakes people make - e.g. if you mistype the name of the table you are trying to drop.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	815554385
https://github.com/simonw/sqlite-utils/issues/240#issuecomment-786037219	https://api.github.com/repos/simonw/sqlite-utils/issues/240	786037219	MDEyOklzc3VlQ29tbWVudDc4NjAzNzIxOQ==	9599	2021-02-25T16:39:23Z	2021-02-25T16:39:23Z	OWNER	Example from the docs: ```pycon >>> db = sqlite_utils.Database(memory=True) >>> db["dogs"].insert({"name": "Cleo"}) >>> for pk, row in db["dogs"].pks_and_rows_where(): ... print(pk, row) 1 {'rowid': 1, 'name': 'Cleo'} >>> db["dogs_with_pk"].insert({"id": 5, "name": "Cleo"}, pk="id") >>> for pk, row in db["dogs_with_pk"].pks_and_rows_where(): ... print(pk, row) 5 {'id': 5, 'name': 'Cleo'} >>> db["dogs_with_compound_pk"].insert( ... {"species": "dog", "id": 3, "name": "Cleo"}, ... pk=("species", "id") ... ) >>> for pk, row in db["dogs_with_compound_pk"].pks_and_rows_where(): ... print(pk, row) ('dog', 3) {'species': 'dog', 'id': 3, 'name': 'Cleo'} ```	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	816560819
https://github.com/simonw/sqlite-utils/issues/240#issuecomment-786036355	https://api.github.com/repos/simonw/sqlite-utils/issues/240	786036355	MDEyOklzc3VlQ29tbWVudDc4NjAzNjM1NQ==	9599	2021-02-25T16:38:07Z	2021-02-25T16:38:07Z	OWNER	Documentation: https://sqlite-utils.datasette.io/en/latest/python-api.html#listing-rows-with-their-primary-keys	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	816560819

