{"html_url": "https://github.com/simonw/sqlite-utils/issues/246#issuecomment-799479175", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/246", "id": 799479175, "node_id": "MDEyOklzc3VlQ29tbWVudDc5OTQ3OTE3NQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-03-15T14:47:31Z", "updated_at": "2021-03-15T14:47:31Z", "author_association": "OWNER", "body": "This is a smart feature. I have something that does this in Datasette, extracting it out to `sqlite-utils` makes a lot of sense.\r\n\r\nhttps://github.com/simonw/datasette/blob/8e18c7943181f228ce5ebcea48deb59ce50bee1f/datasette/utils/__init__.py#L818-L829", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 831751367, "label": "Escaping FTS search strings"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/236#issuecomment-799066252", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/236", "id": 799066252, "node_id": "MDEyOklzc3VlQ29tbWVudDc5OTA2NjI1Mg==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-03-15T03:34:52Z", "updated_at": "2021-03-15T03:34:52Z", "author_association": "OWNER", "body": "Yeah the Lambda Docker stuff is pretty odd - you still don't get to speak HTTP, you have to speak their custom event protocol instead.\r\n\r\nhttps://github.com/glassechidna/serverlessish looks interesting here - it adds a proxy inside the container which allows your existing HTTP Docker image to run within Docker-on-Lambda. I've not tried it out yet though.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 317001500, "label": "datasette publish lambda plugin"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1259#issuecomment-797827038", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1259", "id": 797827038, "node_id": "MDEyOklzc3VlQ29tbWVudDc5NzgyNzAzOA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-03-13T00:15:40Z", "updated_at": "2021-03-13T00:15:40Z", "author_association": "OWNER", "body": "If all of the facets were being calculated in a single query, I'd be willing to bump the facet time limit up to something a lot higher, maybe even a full second. 
There's a chance that could work amazingly well with a materialized CTE.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 830567275, "label": "Research using CTEs for faster facet counts"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1259#issuecomment-797804869", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1259", "id": 797804869, "node_id": "MDEyOklzc3VlQ29tbWVudDc5NzgwNDg2OQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-03-12T23:05:05Z", "updated_at": "2021-03-12T23:05:05Z", "author_association": "OWNER", "body": "I wonder if I could optimize facet suggestion in the same way?\r\n\r\nOne challenge: the query time limit will apply to the full CTE query, not to the individual columns.\r\n\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 830567275, "label": "Research using CTEs for faster facet counts"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1259#issuecomment-797801075", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1259", "id": 797801075, "node_id": "MDEyOklzc3VlQ29tbWVudDc5NzgwMTA3NQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-03-12T22:53:56Z", "updated_at": "2021-03-12T22:55:16Z", "author_association": "OWNER", "body": "OK, a better comparison:\r\n\r\nhttps://global-power-plants.datasettes.com/global-power-plants?sql=WITH+data+as+%28%0D%0A++select%0D%0A++++*%0D%0A++from%0D%0A++++%5Bglobal-power-plants%5D%0D%0A%29%2C%0D%0Acountry_long+as+%28select+%0D%0A++%27country_long%27+as+col%2C+country_long+as+value%2C+count%28*%29+as+c+from+data+group+by+country_long%0D%0A++order+by+c+desc+limit+31%0D%0A%29%2C%0D%0Aprimary_fuel+as+%28%0D%0Aselect%0D%0A++%27primary_fuel%27+as+col%2C+primary_fuel+as+value%2C+count%28*%29+as+c+from+data+group+by+primary_fuel%0D%0A++order+by+c+desc+limit+31%0D%0A%29%2C%0D%0Aowner+as+%28%0D%0Aselect%0D%0A++%27owner%27+as+col%2C+owner+as+value%2C+count%28*%29+as+c+from+data+group+by+owner%0D%0A++order+by+c+desc+limit+31%0D%0A%29%0D%0Aselect+*+from+primary_fuel+union+select+*+from+country_long%0D%0Aunion+select+*+from+owner+order+by+col%2C+c+desc calculates facets against three columns. 
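Decoded from that URL, the combined facet query has this shape - one CTE for the base data plus one per facet column, unioned together:

```sql
WITH data as (
  select * from [global-power-plants]
),
country_long as (
  select 'country_long' as col, country_long as value, count(*) as c
  from data group by country_long order by c desc limit 31
),
primary_fuel as (
  select 'primary_fuel' as col, primary_fuel as value, count(*) as c
  from data group by primary_fuel order by c desc limit 31
),
owner as (
  select 'owner' as col, owner as value, count(*) as c
  from data group by owner order by c desc limit 31
)
select * from primary_fuel union select * from country_long
union select * from owner order by col, c desc
```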
It takes **78.5ms** (and 34.5ms when I refreshed it, presumably after warming some SQLite caches).\r\n\r\nhttps://global-power-plants.datasettes.com/global-power-plants/global-power-plants?_facet=country_long&_facet=primary_fuel&_trace=1&_size=0 shows those facets with size=0 on the SQL query - and shows a SQL trace at the bottom of the page.\r\n\r\nThe country_long facet query takes 45.36ms, owner takes 38.45ms, primary_fuel takes 49.04ms - so a total of 132.85ms.\r\n\r\nThat's against SQLite 3.27.3, according to https://global-power-plants.datasettes.com/-/versions - so even on a SQLite version that doesn't materialize the CTEs there's a significant performance boost to doing all three facets in a single CTE query.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 830567275, "label": "Research using CTEs for faster facet counts"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1259#issuecomment-797790017", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1259", "id": 797790017, "node_id": "MDEyOklzc3VlQ29tbWVudDc5Nzc5MDAxNw==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-03-12T22:22:12Z", "updated_at": "2021-03-12T22:22:12Z", "author_association": "OWNER", "body": "https://sqlite.org/lang_with.html\r\n\r\n> Prior to SQLite 3.35.0, all CTEs were treated as if the NOT MATERIALIZED phrase was present\r\n\r\nIt looks like this optimization is completely unavailable on SQLite prior to 3.35.0 (released 12th March 2021). But I could still rewrite the faceting to work in this way, using the exact same SQL - it would just be significantly faster on 3.35.0+ (assuming it's actually faster in practice - would need to benchmark).", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 830567275, "label": "Research using CTEs for faster facet counts"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1193#issuecomment-797159434", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1193", "id": 797159434, "node_id": "MDEyOklzc3VlQ29tbWVudDc5NzE1OTQzNA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-03-12T01:01:54Z", "updated_at": "2021-03-12T01:01:54Z", "author_association": "OWNER", "body": "DuckDB has a read-only mechanism: https://duckdb.org/docs/api/python\r\n\r\n```python\r\nimport duckdb\r\ncon = duckdb.connect(database=\"/tmp/blah.db\", read_only=True)\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 787173276, "label": "Research plugin hook for alternative database backends"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1250#issuecomment-797159221", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1250", "id": 797159221, "node_id": "MDEyOklzc3VlQ29tbWVudDc5NzE1OTIyMQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-03-12T01:01:17Z", "updated_at": "2021-03-12T01:01:17Z", "author_association": "OWNER", "body": "This is a duplicate of #1193.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 
824067604, "label": "Research: Plugin hook for alternative database connections"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/670#issuecomment-797158641", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/670", "id": 797158641, "node_id": "MDEyOklzc3VlQ29tbWVudDc5NzE1ODY0MQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-03-12T00:59:49Z", "updated_at": "2021-03-12T00:59:49Z", "author_association": "OWNER", "body": "> Challenge: what's the equivalent for PostgreSQL of opening a database in read only mode? Will I have to talk users through creating read only credentials?\r\n\r\nIt looks like the answer to this is yes - I'll need users to setup read-only credentials. Here's a TIL about that: https://til.simonwillison.net/postgresql/read-only-postgresql-user", "reactions": "{\"total_count\": 1, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 1, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 564833696, "label": "Prototoype for Datasette on PostgreSQL"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1211#issuecomment-796854370", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1211", "id": 796854370, "node_id": "MDEyOklzc3VlQ29tbWVudDc5Njg1NDM3MA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-03-11T16:15:29Z", "updated_at": "2021-03-11T16:15:29Z", "author_association": "OWNER", "body": "Thanks very much for this - it's really comprehensive. I need to bake some of these patterns into my coding habits better!", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 797649915, "label": "Use context manager instead of plain open"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/838#issuecomment-795918377", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/838", "id": 795918377, "node_id": "MDEyOklzc3VlQ29tbWVudDc5NTkxODM3Nw==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-03-10T19:01:48Z", "updated_at": "2021-03-10T19:01:48Z", "author_association": "OWNER", "body": "The biggest challenge here I think is to replicate the exact situation here this happens in a Python unit test. 
The fix should be easy once we have a test in place.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 637395097, "label": "Incorrect URLs when served behind a proxy with base_url set"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/838#issuecomment-795895436", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/838", "id": 795895436, "node_id": "MDEyOklzc3VlQ29tbWVudDc5NTg5NTQzNg==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-03-10T18:44:46Z", "updated_at": "2021-03-10T18:44:57Z", "author_association": "OWNER", "body": "Let's reopen this.", "reactions": "{\"total_count\": 1, \"+1\": 1, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 637395097, "label": "Incorrect URLs when served behind a proxy with base_url set"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1254#issuecomment-795870524", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1254", "id": 795870524, "node_id": "MDEyOklzc3VlQ29tbWVudDc5NTg3MDUyNA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-03-10T18:27:45Z", "updated_at": "2021-03-10T18:27:45Z", "author_association": "OWNER", "body": "What other breaks did you spot?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 826613352, "label": "Update Docker Spatialite version to 5.0.1 + add support for Spatialite topology functions"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1256#issuecomment-795869144", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1256", "id": 795869144, "node_id": "MDEyOklzc3VlQ29tbWVudDc5NTg2OTE0NA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-03-10T18:26:46Z", "updated_at": "2021-03-10T18:26:46Z", "author_association": "OWNER", "body": "Thanks!", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 827341657, "label": "Minor type in IP adress"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1254#issuecomment-794439632", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1254", "id": 794439632, "node_id": "MDEyOklzc3VlQ29tbWVudDc5NDQzOTYzMg==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-03-09T20:53:02Z", "updated_at": "2021-03-09T20:53:02Z", "author_association": "OWNER", "body": "Thanks for catching that documentation update!", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 826613352, "label": "Update Docker Spatialite version to 5.0.1 + add support for Spatialite topology functions"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1254#issuecomment-794437715", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1254", "id": 794437715, "node_id": "MDEyOklzc3VlQ29tbWVudDc5NDQzNzcxNQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-03-09T20:51:19Z", "updated_at": "2021-03-09T20:51:19Z", "author_association": "OWNER", "body": "Did 
you see my note on https://github.com/simonw/datasette/issues/1249#issuecomment-792384382 about a weird issue I was having with the `/dbname` page hanging the server? Have you seen anything like that in your work here?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 826613352, "label": "Update Docker Spatialite version to 5.0.1 + add support for Spatialite topology functions"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1250#issuecomment-792386484", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1250", "id": 792386484, "node_id": "MDEyOklzc3VlQ29tbWVudDc5MjM4NjQ4NA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-03-08T00:29:06Z", "updated_at": "2021-03-08T00:29:06Z", "author_association": "OWNER", "body": "DuckDB has a read-only mechanism: https://duckdb.org/docs/api/python\r\n\r\n```python\r\nimport duckdb\r\ncon = duckdb.connect(database=\"/tmp/blah.db\", read_only=True)\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 824067604, "label": "Research: Plugin hook for alternative database connections"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1248#issuecomment-792385274", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1248", "id": 792385274, "node_id": "MDEyOklzc3VlQ29tbWVudDc5MjM4NTI3NA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-03-08T00:25:10Z", "updated_at": "2021-03-08T00:25:10Z", "author_association": "OWNER", "body": "It's not possible yet, unfortunately. This came up on the forums recently: https://github.com/simonw/datasette/discussions/968\r\n\r\nI'm leaning further towards making the database connection layer itself work via a plugin hook, which would open up the possibility of supporting DuckDB and other databases as well. 
I've not committed to doing this yet though.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 823035080, "label": "duckdb database (very low performance in SQLite)"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1249#issuecomment-792384854", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1249", "id": 792384854, "node_id": "MDEyOklzc3VlQ29tbWVudDc5MjM4NDg1NA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-03-08T00:23:38Z", "updated_at": "2021-03-08T00:23:38Z", "author_association": "OWNER", "body": "One reason to prioritize this issue: Homebrew upgraded to SpatiaLite 5.0 recently https://formulae.brew.sh/formula/spatialite-tools and as a result SpatiaLite databases created on my laptop don't appear to be compatible with Datasette when published using `datasette publish`.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 824064069, "label": "Updated Dockerfile with SpatiaLite version 5.0"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1249#issuecomment-792384382", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1249", "id": 792384382, "node_id": "MDEyOklzc3VlQ29tbWVudDc5MjM4NDM4Mg==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-03-08T00:22:02Z", "updated_at": "2021-03-08T00:22:02Z", "author_association": "OWNER", "body": "I tried this patch against `Dockerfile`:\r\n```diff\r\ndiff --git a/Dockerfile b/Dockerfile\r\nindex f4b1414..dd659e1 100644\r\n--- a/Dockerfile\r\n+++ b/Dockerfile\r\n@@ -1,25 +1,26 @@\r\n-FROM python:3.7.10-slim-stretch as build\r\n+FROM python:3.9.2-slim-buster as build\r\n \r\n # Setup build dependencies\r\n RUN apt update \\\r\n-&& apt install -y python3-dev build-essential wget libxml2-dev libproj-dev libgeos-dev libsqlite3-dev zlib1g-dev pkg-config git \\\r\n- && apt clean\r\n+ && apt install -y python3-dev build-essential wget libxml2-dev libproj-dev \\\r\n+ libminizip-dev libgeos-dev libsqlite3-dev zlib1g-dev pkg-config git \\\r\n+ && apt clean\r\n \r\n-\r\n-RUN wget \"https://www.sqlite.org/2020/sqlite-autoconf-3310100.tar.gz\" && tar xzf sqlite-autoconf-3310100.tar.gz \\\r\n- && cd sqlite-autoconf-3310100 && ./configure --disable-static --enable-fts5 --enable-json1 CFLAGS=\"-g -O2 -DSQLITE_ENABLE_FTS3=1 -DSQLITE_ENABLE_FTS3_PARENTHESIS -DSQLITE_ENABLE_FTS4=1 -DSQLITE_ENABLE_RTREE=1 -DSQLITE_ENABLE_JSON1\" \\\r\n+RUN wget \"https://www.sqlite.org/2021/sqlite-autoconf-3340100.tar.gz\" && tar xzf sqlite-autoconf-3340100.tar.gz \\\r\n+ && cd sqlite-autoconf-3340100 && ./configure --disable-static --enable-fts5 --enable-json1 \\\r\n+ CFLAGS=\"-g -O2 -DSQLITE_ENABLE_FTS3=1 -DSQLITE_ENABLE_FTS3_PARENTHESIS -DSQLITE_ENABLE_FTS4=1 -DSQLITE_ENABLE_RTREE=1 -DSQLITE_ENABLE_JSON1\" \\\r\n && make && make install\r\n \r\n-RUN wget \"http://www.gaia-gis.it/gaia-sins/freexl-sources/freexl-1.0.5.tar.gz\" && tar zxf freexl-1.0.5.tar.gz \\\r\n- && cd freexl-1.0.5 && ./configure && make && make install\r\n+RUN wget \"http://www.gaia-gis.it/gaia-sins/freexl-1.0.6.tar.gz\" && tar zxf freexl-1.0.6.tar.gz \\\r\n+ && cd freexl-1.0.6 && ./configure && make && make install\r\n \r\n-RUN wget \"http://www.gaia-gis.it/gaia-sins/libspatialite-sources/libspatialite-4.4.0-RC0.tar.gz\" && tar zxf 
libspatialite-4.4.0-RC0.tar.gz \\\r\n- && cd libspatialite-4.4.0-RC0 && ./configure && make && make install\r\n+RUN wget \"http://www.gaia-gis.it/gaia-sins/libspatialite-5.0.1.tar.gz\" && tar zxf libspatialite-5.0.1.tar.gz \\\r\n+ && cd libspatialite-5.0.1 && ./configure --disable-rttopo && make && make install\r\n \r\n RUN wget \"http://www.gaia-gis.it/gaia-sins/readosm-sources/readosm-1.1.0.tar.gz\" && tar zxf readosm-1.1.0.tar.gz && cd readosm-1.1.0 && ./configure && make && make install\r\n \r\n-RUN wget \"http://www.gaia-gis.it/gaia-sins/spatialite-tools-sources/spatialite-tools-4.4.0-RC0.tar.gz\" && tar zxf spatialite-tools-4.4.0-RC0.tar.gz \\\r\n- && cd spatialite-tools-4.4.0-RC0 && ./configure && make && make install\r\n+RUN wget \"http://www.gaia-gis.it/gaia-sins/spatialite-tools-5.0.0.tar.gz\" && tar zxf spatialite-tools-5.0.0.tar.gz \\\r\n+ && cd spatialite-tools-5.0.0 && ./configure --disable-rttopo && make && make install\r\n \r\n \r\n # Add local code to the image instead of fetching from pypi.\r\n@@ -27,7 +28,7 @@ COPY . /datasette\r\n \r\n RUN pip install /datasette\r\n \r\n-FROM python:3.7.10-slim-stretch\r\n+FROM python:3.9.2-slim-buster\r\n \r\n # Copy python dependencies and spatialite libraries\r\n COPY --from=build /usr/local/lib/ /usr/local/lib/\r\n```\r\n\r\nI had to use `--disable-rttopo` from the tip in https://github.com/OSGeo/gdal/pull/3443 and also needed to install `libminizip-dev`.\r\n\r\nThis works, sort of... I'm getting a weird issue where the `/dbname` page is hanging some of the time instead of loading correctly. Other than that it seems to work, but a hanging page is bad!", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 824064069, "label": "Updated Dockerfile with SpatiaLite version 5.0"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1249#issuecomment-792383956", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1249", "id": 792383956, "node_id": "MDEyOklzc3VlQ29tbWVudDc5MjM4Mzk1Ng==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-03-08T00:20:09Z", "updated_at": "2021-03-08T00:20:09Z", "author_association": "OWNER", "body": "Worth noting that the Docker image used by `datasette publish cloudrun` doesn't actually use that Datasette docker image - it does this:\r\n\r\nhttps://github.com/simonw/datasette/blob/d0fd833b8cdd97e1b91d0f97a69b494895d82bee/datasette/utils/__init__.py#L349-L353\r\n\r\nWhere the apt extras for SpatiaLite are: https://github.com/simonw/datasette/blob/d0fd833b8cdd97e1b91d0f97a69b494895d82bee/datasette/utils/__init__.py#L344-L345\r\n\r\n`libsqlite3-mod-spatialite` against that official `python:3.8` image doesn't appear to install SpatiaLite 5.0.\r\n\r\n\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 824064069, "label": "Updated Dockerfile with SpatiaLite version 5.0"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1223#issuecomment-792233255", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1223", "id": 792233255, "node_id": "MDEyOklzc3VlQ29tbWVudDc5MjIzMzI1NQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-03-07T07:41:01Z", "updated_at": "2021-03-07T07:41:01Z", "author_association": "OWNER", "body": "This is fantastic, thanks so much 
for tracking this down.", "reactions": "{\"total_count\": 1, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 1, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 806918878, "label": "Add compile option to Dockerfile to fix failing test (fixes #696)"}, "performed_via_github_app": null} {"html_url": "https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790695126", "issue_url": "https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5", "id": 790695126, "node_id": "MDEyOklzc3VlQ29tbWVudDc5MDY5NTEyNg==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-03-04T15:20:42Z", "updated_at": "2021-03-04T15:20:42Z", "author_association": "MEMBER", "body": "I'm not sure why but my most recent import, when displayed in Datasette, looks like this:\r\n\r\n\"mbox__mbox_emails__753_446_rows\"\r\n\r\nSorting by `id` in the opposite order gives me the data I would expect - so it looks like a bunch of null/blank messages are being imported at some point and showing up first due to ID ordering.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 813880401, "label": "WIP: Add Gmail takeout mbox import"}, "performed_via_github_app": null} {"html_url": "https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790693674", "issue_url": "https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5", "id": 790693674, "node_id": "MDEyOklzc3VlQ29tbWVudDc5MDY5MzY3NA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-03-04T15:18:36Z", "updated_at": "2021-03-04T15:18:36Z", "author_association": "MEMBER", "body": "I imported my 10GB mbox with 750,000 emails in it, ran this tool (with a hacked fix for the blob column problem) - and now a search that returns 92 results takes 25.37ms! 
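For context on that speed: a sketch of the kind of full-text index sqlite-utils can build over the imported table (the database filename and column names here are hypothetical):

```python
import sqlite_utils

db = sqlite_utils.Database("gmail.db")  # hypothetical filename
# Build an FTS index over the text columns and keep it updated via triggers
db["mbox_emails"].enable_fts(["subject", "body"], create_triggers=True)
# Searches like this are then served by the index rather than a table scan
for row in db["mbox_emails"].search("datasette"):
    print(row)
```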
This is fantastic.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 813880401, "label": "WIP: Add Gmail takeout mbox import"}, "performed_via_github_app": null} {"html_url": "https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790669767", "issue_url": "https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5", "id": 790669767, "node_id": "MDEyOklzc3VlQ29tbWVudDc5MDY2OTc2Nw==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-03-04T14:46:06Z", "updated_at": "2021-03-04T14:46:06Z", "author_association": "MEMBER", "body": "A solution could be to pre-process that string by splitting on `(` and dropping everything afterwards, assuming that the `(...)` bit isn't necessary for correctly parsing the date.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 813880401, "label": "WIP: Add Gmail takeout mbox import"}, "performed_via_github_app": null} {"html_url": "https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790668263", "issue_url": "https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5", "id": 790668263, "node_id": "MDEyOklzc3VlQ29tbWVudDc5MDY2ODI2Mw==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-03-04T14:43:58Z", "updated_at": "2021-03-04T14:43:58Z", "author_association": "MEMBER", "body": "I added this code to output a message ID on errors:\r\n```diff\r\n print(\"Errors: {}\".format(num_errors))\r\n print(traceback.format_exc())\r\n+ print(\"Message-Id: {}\".format(email.get(\"Message-Id\", \"None\")))\r\n continue\r\n```\r\nHaving found a message ID that had an error, I ran this command to see the context:\r\n\r\n rg --text --context 20 '44F289B0.000001.02100@SCHWARZE-DWFXMI' ~/gmail.mbox\r\n\r\nThis was for the following error:\r\n```\r\n File \"/Users/simon/Dropbox/Development/google-takeout-to-sqlite/google_takeout_to_sqlite/utils.py\", line 102, in get_mbox\r\n message[\"date\"] = get_message_date(email.get(\"Date\"), email.get_from())\r\n File \"/Users/simon/Dropbox/Development/google-takeout-to-sqlite/google_takeout_to_sqlite/utils.py\", line 178, in get_message_date\r\n datetime_tuple = email.utils.parsedate_tz(mail_date)\r\n File \"/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.7/lib/python3.7/email/_parseaddr.py\", line 50, in parsedate_tz\r\n res = _parsedate_tz(data)\r\n File \"/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.7/lib/python3.7/email/_parseaddr.py\", line 69, in _parsedate_tz\r\n data = data.split()\r\nAttributeError: 'Header' object has no attribute 'split'\r\n```\r\nHere's what I spotted in the `ripgrep` output:\r\n```\r\n177133570:Message-Id: <44F289B0.000001.02100@SCHWARZE-DWFXMI>\r\n177133571-Date: Mon, 28 Aug 2006 08:14:08 +0200 (Westeurop\ufffdische Sommerzeit)\r\n177133572-X-Mailer: IncrediMail (5002253)\r\n```\r\nSo it could be that `_parsedate_tz` is having trouble with that `Mon, 28 Aug 2006 08:14:08 +0200 (Westeurop\ufffdische Sommerzeit)` string.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 813880401, "label": "WIP: Add Gmail takeout mbox import"}, "performed_via_github_app": null} {"html_url": 
"https://github.com/dogsheep/google-takeout-to-sqlite/issues/6#issuecomment-790384087", "issue_url": "https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/6", "id": 790384087, "node_id": "MDEyOklzc3VlQ29tbWVudDc5MDM4NDA4Nw==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-03-04T07:22:51Z", "updated_at": "2021-03-04T07:22:51Z", "author_association": "MEMBER", "body": "#3 also mentions the conflicting version with other tools.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 821841046, "label": "Upgrade to latest sqlite-utils"}, "performed_via_github_app": null} {"html_url": "https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790380839", "issue_url": "https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5", "id": 790380839, "node_id": "MDEyOklzc3VlQ29tbWVudDc5MDM4MDgzOQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-03-04T07:17:05Z", "updated_at": "2021-03-04T07:17:05Z", "author_association": "MEMBER", "body": "Looks like you're doing this:\r\n```python\r\n elif message.get_content_type() == \"text/plain\":\r\n body = message.get_payload(decode=True)\r\n```\r\nSo presumably that decodes to a unicode string?\r\n\r\nI imagine the reason the column is a `BLOB` for me is that `sqlite-utils` determines the column type based on the first batch of items - https://github.com/simonw/sqlite-utils/blob/09c3386f55f766b135b6a1c00295646c4ae29bec/sqlite_utils/db.py#L1927-L1928 - and I got unlucky and had something in my first batch that wasn't a unicode string.\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 813880401, "label": "WIP: Add Gmail takeout mbox import"}, "performed_via_github_app": null} {"html_url": "https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790379629", "issue_url": "https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5", "id": 790379629, "node_id": "MDEyOklzc3VlQ29tbWVudDc5MDM3OTYyOQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-03-04T07:14:41Z", "updated_at": "2021-03-04T07:14:41Z", "author_association": "MEMBER", "body": "Confirmed: removing the `len()` call does not speed things up, so it's reading through the entire file for some other purpose too.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 813880401, "label": "WIP: Add Gmail takeout mbox import"}, "performed_via_github_app": null} {"html_url": "https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790378658", "issue_url": "https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5", "id": 790378658, "node_id": "MDEyOklzc3VlQ29tbWVudDc5MDM3ODY1OA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-03-04T07:12:48Z", "updated_at": "2021-03-04T07:12:48Z", "author_association": "MEMBER", "body": "It looks like the `body` is being loaded into a BLOB column - so in Datasette default it looks like this:\r\n\r\n\"mbox__mbox_emails__753_446_rows\"\r\n\r\nIf I `datasette install datasette-render-binary` and then try again I get this:\r\n\r\n\"mbox__mbox_emails__753_446_rows\"\r\n\r\nIt would be great if we could store the `body` as unicode text instead. 
May have to do something clever to decode it based on some kind of charset header?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 813880401, "label": "WIP: Add Gmail takeout mbox import"}, "performed_via_github_app": null} {"html_url": "https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790373024", "issue_url": "https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5", "id": 790373024, "node_id": "MDEyOklzc3VlQ29tbWVudDc5MDM3MzAyNA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-03-04T07:01:58Z", "updated_at": "2021-03-04T07:04:06Z", "author_association": "MEMBER", "body": "I got 9 warnings that look like this:\r\n```\r\nErrors: 1\r\nTraceback (most recent call last):\r\n File \"/Users/simon/Dropbox/Development/google-takeout-to-sqlite/google_takeout_to_sqlite/utils.py\", line 103, in get_mbox\r\n message[\"date\"] = get_message_date(email.get(\"Date\"), email.get_from())\r\n File \"/Users/simon/Dropbox/Development/google-takeout-to-sqlite/google_takeout_to_sqlite/utils.py\", line 167, in get_message_date\r\n datetime_tuple = email.utils.parsedate_tz(mail_date)\r\n File \"/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.7/lib/python3.7/email/_parseaddr.py\", line 50, in parsedate_tz\r\n res = _parsedate_tz(data)\r\n File \"/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.7/lib/python3.7/email/_parseaddr.py\", line 69, in _parsedate_tz\r\n data = data.split()\r\nAttributeError: 'Header' object has no attribute 'split'\r\n```\r\nIt would be useful if those warnings told me the message ID (or similar) of the affected message so I could grep for it in the `mbox` and see what was going on.\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 813880401, "label": "WIP: Add Gmail takeout mbox import"}, "performed_via_github_app": null} {"html_url": "https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790372621", "issue_url": "https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5", "id": 790372621, "node_id": "MDEyOklzc3VlQ29tbWVudDc5MDM3MjYyMQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-03-04T07:01:18Z", "updated_at": "2021-03-04T07:01:18Z", "author_association": "MEMBER", "body": "I'm not sure if it would work, but there is an alternative pattern for showing a progress bar against a really large file that I've used in `healthkit-to-sqlite` - you set the progress bar size to the size of the file in bytes, then update a counter as you read the file.\r\n\r\nhttps://github.com/dogsheep/healthkit-to-sqlite/blob/3eb2b06bfe3b4faaf10e9cf9dfcb28e3d16c14ff/healthkit_to_sqlite/cli.py#L24-L57 and https://github.com/dogsheep/healthkit-to-sqlite/blob/3eb2b06bfe3b4faaf10e9cf9dfcb28e3d16c14ff/healthkit_to_sqlite/utils.py#L4-L19 (the `progress_callback()` bit) is where that happens.\r\n\r\nIt can be a bit of a convoluted pattern, and I'm not at all sure it would work for `mbox` files since it looks like that library has other reasons it needs to do a file scan rather than streaming it through one chunk of bytes at a time. 
So I imagine this would not work here.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 813880401, "label": "WIP: Add Gmail takeout mbox import"}, "performed_via_github_app": null} {"html_url": "https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790370485", "issue_url": "https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5", "id": 790370485, "node_id": "MDEyOklzc3VlQ29tbWVudDc5MDM3MDQ4NQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-03-04T06:57:25Z", "updated_at": "2021-03-04T06:57:48Z", "author_association": "MEMBER", "body": "The command takes quite a while to start running, presumably because this line causes it to have to scan the WHOLE file in order to generate a count:\r\n\r\nhttps://github.com/dogsheep/google-takeout-to-sqlite/blob/a3de045eba0fae4b309da21aa3119102b0efc576/google_takeout_to_sqlite/utils.py#L66-L67\r\n\r\nI'm fine with waiting though. It's not like this is a command people run every day - and without that count we can't show a progress bar, which seems pretty important for a process that takes this long.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 813880401, "label": "WIP: Add Gmail takeout mbox import"}, "performed_via_github_app": null} {"html_url": "https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790369076", "issue_url": "https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5", "id": 790369076, "node_id": "MDEyOklzc3VlQ29tbWVudDc5MDM2OTA3Ng==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-03-04T06:54:46Z", "updated_at": "2021-03-04T06:54:46Z", "author_association": "MEMBER", "body": "The Rich-powered progress bar is pretty:\r\n\r\n![rich](https://user-images.githubusercontent.com/9599/109923307-71f69200-7c73-11eb-9ee2-8f0a240f3994.gif)\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 813880401, "label": "WIP: Add Gmail takeout mbox import"}, "performed_via_github_app": null} {"html_url": "https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790312268", "issue_url": "https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5", "id": 790312268, "node_id": "MDEyOklzc3VlQ29tbWVudDc5MDMxMjI2OA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-03-04T05:48:16Z", "updated_at": "2021-03-04T05:48:16Z", "author_association": "MEMBER", "body": "Wow, my mbox is a 10.35 GB download!", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 813880401, "label": "WIP: Add Gmail takeout mbox import"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1243#issuecomment-790311215", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1243", "id": 790311215, "node_id": "MDEyOklzc3VlQ29tbWVudDc5MDMxMTIxNQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-03-04T05:45:57Z", "updated_at": "2021-03-04T05:45:57Z", "author_association": "OWNER", "body": "Thanks!", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, 
\"rocket\": 0, \"eyes\": 0}", "issue": {"value": 815955014, "label": "fix small typo"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1247#issuecomment-787616446", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1247", "id": 787616446, "node_id": "MDEyOklzc3VlQ29tbWVudDc4NzYxNjQ0Ng==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-03-01T03:50:37Z", "updated_at": "2021-03-01T03:50:37Z", "author_association": "OWNER", "body": "I like the `.add_memory_database()` option. I also like that it makes it more obvious that this is a capability of Datasette, since I'm excited to see more plugins, features and tests that take advantage of it.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 818430405, "label": "datasette.add_memory_database() method"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1247#issuecomment-787616158", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1247", "id": 787616158, "node_id": "MDEyOklzc3VlQ29tbWVudDc4NzYxNjE1OA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-03-01T03:49:27Z", "updated_at": "2021-03-01T03:49:27Z", "author_association": "OWNER", "body": "A couple of options:\r\n```python\r\ndatasette.add_memory_database(\"test_json_array\")\r\n# or make that first argument to add_database() optional and support:\r\ndatasette.add_database(memory_name=\"test_json_array\")\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 818430405, "label": "datasette.add_memory_database() method"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1246#issuecomment-787611153", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1246", "id": 787611153, "node_id": "MDEyOklzc3VlQ29tbWVudDc4NzYxMTE1Mw==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-03-01T03:30:57Z", "updated_at": "2021-03-01T03:30:57Z", "author_association": "OWNER", "body": "I'm going to try a new pattern for testing this, enabled by #1151 - the test will create a new named in-memory database, write some records to it and then run some test facets against that. This will save me from having to add yet another fixtures table for this.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 817597268, "label": "Suggest for ArrayFacet possibly confused by blank values"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1005#issuecomment-787536267", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1005", "id": 787536267, "node_id": "MDEyOklzc3VlQ29tbWVudDc4NzUzNjI2Nw==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-28T22:30:37Z", "updated_at": "2021-02-28T22:30:37Z", "author_association": "OWNER", "body": "It's out! 
https://github.com/encode/httpx/releases/tag/0.17.0", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 718259202, "label": "Remove xfail tests when new httpx is released"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/242#issuecomment-787532279", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/242", "id": 787532279, "node_id": "MDEyOklzc3VlQ29tbWVudDc4NzUzMjI3OQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-28T22:09:37Z", "updated_at": "2021-02-28T22:09:37Z", "author_association": "OWNER", "body": "Microsoft's playwright Python library solves this problem by code-generating both their sync AND their async libraries https://github.com/microsoft/playwright-python/tree/master/scripts", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 817989436, "label": "Async support"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/242#issuecomment-787198202", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/242", "id": 787198202, "node_id": "MDEyOklzc3VlQ29tbWVudDc4NzE5ODIwMg==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-27T22:33:58Z", "updated_at": "2021-02-27T22:33:58Z", "author_association": "OWNER", "body": "Hah, or use this trick, which genuinely rewrites the code at runtime using a class decorator! https://github.com/python-happybase/aiohappybase/blob/0990ef45cfdb720dc987afdb4957a0fac591cb99/aiohappybase/sync/_util.py#L19-L32", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 817989436, "label": "Async support"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/242#issuecomment-787195536", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/242", "id": 787195536, "node_id": "MDEyOklzc3VlQ29tbWVudDc4NzE5NTUzNg==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-27T22:13:24Z", "updated_at": "2021-02-27T22:13:24Z", "author_association": "OWNER", "body": "Some other interesting background reading: https://docs.sqlalchemy.org/en/14/orm/extensions/asyncio.html - in particular see how SQLAlchemy has an `await conn.run_sync(meta.drop_all)` mechanism for running methods that haven't themselves been provided in an async version", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 817989436, "label": "Async support"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/242#issuecomment-787190562", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/242", "id": 787190562, "node_id": "MDEyOklzc3VlQ29tbWVudDc4NzE5MDU2Mg==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-27T22:04:00Z", "updated_at": "2021-02-27T22:04:00Z", "author_association": "OWNER", "body": "From the poster here: https://github.com/sethmlarson/pycon-async-sync-poster/blob/master/poster.pdf\r\n\r\n\"pycon-async-sync-poster_poster_pdf_at_master_\u00b7_sethmlarson_pycon-async-sync-poster\"\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 
0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 817989436, "label": "Async support"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/242#issuecomment-787186826", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/242", "id": 787186826, "node_id": "MDEyOklzc3VlQ29tbWVudDc4NzE4NjgyNg==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-27T22:01:54Z", "updated_at": "2021-02-27T22:01:54Z", "author_association": "OWNER", "body": "`unasync` is an implementation of the exact pattern I was talking about above - it uses the `tokenize` module from the Python standard library to apply some clever rules to transform an async codebase into a sync one. https://unasync.readthedocs.io/en/latest/ - implementation here: https://github.com/python-trio/unasync/blob/v0.5.0/src/unasync/__init__.py", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 817989436, "label": "Async support"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/242#issuecomment-787175126", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/242", "id": 787175126, "node_id": "MDEyOklzc3VlQ29tbWVudDc4NzE3NTEyNg==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-27T21:55:05Z", "updated_at": "2021-02-27T21:55:05Z", "author_association": "OWNER", "body": "\"how to use some new tools to more easily maintain a codebase that supports both async and synchronous I/O and multiple async libraries\" - yeah that's exactly what I need, thank you!", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 817989436, "label": "Async support"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/242#issuecomment-787144523", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/242", "id": 787144523, "node_id": "MDEyOklzc3VlQ29tbWVudDc4NzE0NDUyMw==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-27T21:18:46Z", "updated_at": "2021-02-27T21:18:46Z", "author_association": "OWNER", "body": "Here's a really wild idea: I wonder if it would be possible to run a source transformation against either the sync or the async versions of the code to produce the equivalent for the other paradigm?\r\n\r\nCould that even be as simple as a set of regular expressions against the `await ...` version that strips out or replaces the `await` and `async def` and `async for` statements?\r\n\r\nIf so... 
I could maintain just the async version, generate the sync version with a script and rely on robust unit testing to guarantee that this actually works.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 817989436, "label": "Async support"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/242#issuecomment-787142066", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/242", "id": 787142066, "node_id": "MDEyOklzc3VlQ29tbWVudDc4NzE0MjA2Ng==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-27T21:17:10Z", "updated_at": "2021-02-27T21:17:10Z", "author_association": "OWNER", "body": "I have a hunch this is actually going to be quite difficult, due to the internal complexity of some of the `sqlite-utils` API methods.\r\n\r\nConsider `db[table].extract(...)` for example. It does a whole bunch of extra queries inside the method - each of those would need to be turned into an `await` call for the async version. Here's the method body today:\r\n\r\nhttps://github.com/simonw/sqlite-utils/blob/09c3386f55f766b135b6a1c00295646c4ae29bec/sqlite_utils/db.py#L1060-L1152\r\n\r\nWriting this method twice - looking similar but with `await ...` tucked in before every internal method it calls that needs to execute SQL - is going to be pretty messy.\r\n\r\nOne thing that would help a LOT is figuring out how to share the majority of the test code. If the exact same tests could run against both the sync and async versions with a bit of test trickery, maintaining parallel implementations would at least be a bit more feasible.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 817989436, "label": "Async support"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/242#issuecomment-787120136", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/242", "id": 787120136, "node_id": "MDEyOklzc3VlQ29tbWVudDc4NzEyMDEzNg==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-27T19:04:47Z", "updated_at": "2021-02-27T19:04:47Z", "author_association": "OWNER", "body": "Another option here would be to add https://github.com/omnilib/aiosqlite/blob/main/aiosqlite/core.py as a dependency - it's four years old now and actively maintained, and the code is pretty small so it looks like a solid, stable, reliable dependency.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 817989436, "label": "Async support"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/242#issuecomment-787118691", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/242", "id": 787118691, "node_id": "MDEyOklzc3VlQ29tbWVudDc4NzExODY5MQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-27T18:53:23Z", "updated_at": "2021-02-27T18:53:23Z", "author_association": "OWNER", "body": "Datasette has its own implementation of a write queue for exactly this purpose - and there's no reason at all why that should stay in Datasette rather than being extracted out and moved over here to `sqlite-utils`.\r\n\r\nOne small concern I have is around the API design. 
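Returning to the regex transformation idea above, a toy sketch - real tools like unasync work at the token level precisely because plain regexes are fragile around strings and comments:

```python
import re

# Naive async-to-sync source transform: strip the async keywords and
# await expressions. Illustrative only - do not use on real code.
RULES = [
    (re.compile(r"\basync def\b"), "def"),
    (re.compile(r"\basync for\b"), "for"),
    (re.compile(r"\basync with\b"), "with"),
    (re.compile(r"\bawait\s+"), ""),
]

def unasyncify(source):
    for pattern, replacement in RULES:
        source = pattern.sub(replacement, source)
    return source

print(unasyncify("async def go():\n    return await fetch()"))
# def go():
#     return fetch()
```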
I'd want to keep supporting the existing synchronous API while also providing a similar API with await-based methods.\r\n\r\nWhat are some good examples of libraries that do this? I like how https://www.python-httpx.org/ handles it, maybe that's a good example to imitate?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 817989436, "label": "Async support"}, "performed_via_github_app": null} {"html_url": "https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-786925280", "issue_url": "https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5", "id": 786925280, "node_id": "MDEyOklzc3VlQ29tbWVudDc4NjkyNTI4MA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-26T22:23:10Z", "updated_at": "2021-02-26T22:23:10Z", "author_association": "MEMBER", "body": "Thanks!\r\n\r\nI requested my Gmail export from takeout - once that arrives I'll test it against this and then merge the PR.", "reactions": "{\"total_count\": 1, \"+1\": 1, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 813880401, "label": "WIP: Add Gmail takeout mbox import"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1238#issuecomment-786849095", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1238", "id": 786849095, "node_id": "MDEyOklzc3VlQ29tbWVudDc4Njg0OTA5NQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-26T19:29:38Z", "updated_at": "2021-02-26T19:29:38Z", "author_association": "OWNER", "body": "Here's the test I wrote:\r\n```diff\r\ngit diff tests/test_custom_pages.py\r\ndiff --git a/tests/test_custom_pages.py b/tests/test_custom_pages.py\r\nindex 6a23192..5a71f56 100644\r\n--- a/tests/test_custom_pages.py\r\n+++ b/tests/test_custom_pages.py\r\n@@ -2,11 +2,19 @@ import pathlib\r\n import pytest\r\n from .fixtures import make_app_client\r\n \r\n+TEST_TEMPLATE_DIRS = str(pathlib.Path(__file__).parent / \"test_templates\")\r\n+\r\n \r\n @pytest.fixture(scope=\"session\")\r\n def custom_pages_client():\r\n+ with make_app_client(template_dir=TEST_TEMPLATE_DIRS) as client:\r\n+ yield client\r\n+\r\n+\r\n+@pytest.fixture(scope=\"session\")\r\n+def custom_pages_client_with_base_url():\r\n with make_app_client(\r\n- template_dir=str(pathlib.Path(__file__).parent / \"test_templates\")\r\n+ template_dir=TEST_TEMPLATE_DIRS, config={\"base_url\": \"/prefix/\"}\r\n ) as client:\r\n yield client\r\n \r\n@@ -23,6 +31,12 @@ def test_request_is_available(custom_pages_client):\r\n assert \"path:/request\" == response.text\r\n \r\n \r\n+def test_custom_pages_with_base_url(custom_pages_client_with_base_url):\r\n+ response = custom_pages_client_with_base_url.get(\"/prefix/request\")\r\n+ assert 200 == response.status\r\n+ assert \"path:/prefix/request\" == response.text\r\n+\r\n+\r\n def test_custom_pages_nested(custom_pages_client):\r\n response = custom_pages_client.get(\"/nested/nest\")\r\n assert 200 == response.status\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 813899472, "label": "Custom pages don't work with base_url setting"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1238#issuecomment-786848654", "issue_url": 
"https://api.github.com/repos/simonw/datasette/issues/1238", "id": 786848654, "node_id": "MDEyOklzc3VlQ29tbWVudDc4Njg0ODY1NA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-26T19:28:48Z", "updated_at": "2021-02-26T19:28:48Z", "author_association": "OWNER", "body": "I added a debug line just before `for regex, wildcard_template` here:\r\n\r\nhttps://github.com/simonw/datasette/blob/afed51b1e36cf275c39e71c7cb262d6c5bdbaa31/datasette/app.py#L1148-L1155\r\n\r\nAnd it showed that for some reason `request.path` is `/prefix/prefix/request` here - the prefix got doubled somehow.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 813899472, "label": "Custom pages don't work with base_url setting"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1238#issuecomment-786841261", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1238", "id": 786841261, "node_id": "MDEyOklzc3VlQ29tbWVudDc4Njg0MTI2MQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-26T19:13:44Z", "updated_at": "2021-02-26T19:13:44Z", "author_association": "OWNER", "body": "Sounds like a bug - thanks for reporting this.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 813899472, "label": "Custom pages don't work with base_url setting"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1246#issuecomment-786840734", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1246", "id": 786840734, "node_id": "MDEyOklzc3VlQ29tbWVudDc4Njg0MDczNA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-26T19:12:39Z", "updated_at": "2021-02-26T19:12:47Z", "author_association": "OWNER", "body": "Could I take this part:\r\n```python\r\n suggested_facet_sql = \"\"\" \r\n select distinct json_type({column}) \r\n from ({sql}) \r\n \"\"\".format( \r\n column=escape_sqlite(column), sql=self.sql \r\n ) \r\n```\r\nAnd add `where {column} is not null and {column} != ''` perhaps?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 817597268, "label": "Suggest for ArrayFacet possibly confused by blank values"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1246#issuecomment-786840425", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1246", "id": 786840425, "node_id": "MDEyOklzc3VlQ29tbWVudDc4Njg0MDQyNQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-26T19:11:56Z", "updated_at": "2021-02-26T19:11:56Z", "author_association": "OWNER", "body": "Relevant code: https://github.com/simonw/datasette/blob/afed51b1e36cf275c39e71c7cb262d6c5bdbaa31/datasette/facets.py#L271-L295", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 817597268, "label": "Suggest for ArrayFacet possibly confused by blank values"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/239#issuecomment-786830832", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/239", "id": 786830832, "node_id": 
"MDEyOklzc3VlQ29tbWVudDc4NjgzMDgzMg==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-26T18:52:40Z", "updated_at": "2021-02-26T18:52:40Z", "author_association": "OWNER", "body": "Could this handle lists of objects too? That would be pretty amazing - if the column has a `[{...}, {...}]` list in it could turn that into a many-to-many.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 816526538, "label": "sqlite-utils extract could handle nested objects"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1240#issuecomment-786813506", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1240", "id": 786813506, "node_id": "MDEyOklzc3VlQ29tbWVudDc4NjgxMzUwNg==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-26T18:19:46Z", "updated_at": "2021-02-26T18:19:46Z", "author_association": "OWNER", "body": "Linking to rows from custom queries is a lot harder - because given an arbitrary string of SQL it's difficult to analyze it and figure out which (if any) of the returned columns represent a primary key.\r\n\r\nIt's possible to manually write a SQL query that returns a column that will be treated as a link to another page using this plugin, but it's not particularly straight-forward: https://datasette.io/plugins/datasette-json-html", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 814591962, "label": "Allow facetting on custom queries"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1240#issuecomment-786812716", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1240", "id": 786812716, "node_id": "MDEyOklzc3VlQ29tbWVudDc4NjgxMjcxNg==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-26T18:18:18Z", "updated_at": "2021-02-26T18:18:18Z", "author_association": "OWNER", "body": "Agreed, this would be extremely useful. I'd love to be able to facet against custom queries. It's a fair bit of work to implement but it's not impossible. 
Closing this as a duplicate of #972.", "reactions": "{\"total_count\": 1, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 1, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 814591962, "label": "Allow facetting on custom queries"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/239#issuecomment-786795132", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/239", "id": 786795132, "node_id": "MDEyOklzc3VlQ29tbWVudDc4Njc5NTEzMg==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-26T17:45:53Z", "updated_at": "2021-02-26T17:45:53Z", "author_association": "OWNER", "body": "If there's no primary key in the JSON could use the `hash_id` mechanism.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 816526538, "label": "sqlite-utils extract could handle nested objects"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/239#issuecomment-786794435", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/239", "id": 786794435, "node_id": "MDEyOklzc3VlQ29tbWVudDc4Njc5NDQzNQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-26T17:44:38Z", "updated_at": "2021-02-26T17:44:38Z", "author_association": "OWNER", "body": "This came up in office hours!", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 816526538, "label": "sqlite-utils extract could handle nested objects"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1244#issuecomment-786786645", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1244", "id": 786786645, "node_id": "MDEyOklzc3VlQ29tbWVudDc4Njc4NjY0NQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-26T17:30:38Z", "updated_at": "2021-02-26T17:30:38Z", "author_association": "OWNER", "body": "New paragraph at the top of https://docs.datasette.io/en/latest/writing_plugins.html\r\n\r\n> Want to start by looking at an example? The [Datasette plugins directory](https://datasette.io/plugins) lists more than 50 open source plugins with code you can explore. 
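The `hash_id` mechanism mentioned in the comment above already exists in sqlite-utils: the primary key is derived from a hash of the record's own contents, so JSON with no natural key still gets a stable, deduplicating id. A small demonstration (table and column names chosen for illustration):

```python
import sqlite_utils

db = sqlite_utils.Database(memory=True)
# No natural key here - hash_id="id" derives the primary key from a hash
# of the record itself
db["people"].insert({"name": "Cleo", "score": 5}, hash_id="id")
print(next(db["people"].rows)["id"])  # a stable hex digest for this record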
The [plugin hooks](https://docs.datasette.io/en/latest/plugin_hooks.html#plugin-hooks) page includes links to example plugins for each of the documented hooks.\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 817528452, "label": "Plugin tip: look at the examples linked from the hooks page"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/237#issuecomment-786050562", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/237", "id": 786050562, "node_id": "MDEyOklzc3VlQ29tbWVudDc4NjA1MDU2Mg==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-25T16:57:56Z", "updated_at": "2021-02-25T16:57:56Z", "author_association": "OWNER", "body": "`sqlite-utils create-view` currently has a `--ignore` option, so adding that to `sqlite-utils drop-view` and `sqlite-utils drop-table` makes sense as well.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 815554385, "label": "db[\"my_table\"].drop(ignore=True) parameter, plus sqlite-utils drop-table --ignore and drop-view --ignore"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/237#issuecomment-786049686", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/237", "id": 786049686, "node_id": "MDEyOklzc3VlQ29tbWVudDc4NjA0OTY4Ng==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-25T16:56:42Z", "updated_at": "2021-02-25T16:56:42Z", "author_association": "OWNER", "body": "So:\r\n```python\r\n db[\"my_table\"].drop(ignore=True)\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 815554385, "label": "db[\"my_table\"].drop(ignore=True) parameter, plus sqlite-utils drop-table --ignore and drop-view --ignore"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/237#issuecomment-786049394", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/237", "id": 786049394, "node_id": "MDEyOklzc3VlQ29tbWVudDc4NjA0OTM5NA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-25T16:56:14Z", "updated_at": "2021-02-25T16:56:14Z", "author_association": "OWNER", "body": "Other methods (`db.create_view()` for example) have `ignore=True` to mean \"don't throw an error if this causes a problem\", so I'm good with adding that to `.drop_view()`.\r\n\r\nI don't like using it as the default partly because that would be a very minor breaking API change, but mainly because I don't want to hide mistakes people make - e.g. 
if you mistype the name of the table you are trying to drop.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 815554385, "label": "db[\"my_table\"].drop(ignore=True) parameter, plus sqlite-utils drop-table --ignore and drop-view --ignore"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/240#issuecomment-786037219", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/240", "id": 786037219, "node_id": "MDEyOklzc3VlQ29tbWVudDc4NjAzNzIxOQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-25T16:39:23Z", "updated_at": "2021-02-25T16:39:23Z", "author_association": "OWNER", "body": "Example from the docs:\r\n```pycon\r\n>>> db = sqlite_utils.Database(memory=True)\r\n>>> db[\"dogs\"].insert({\"name\": \"Cleo\"})\r\n>>> for pk, row in db[\"dogs\"].pks_and_rows_where():\r\n...     print(pk, row)\r\n1 {'rowid': 1, 'name': 'Cleo'}\r\n\r\n>>> db[\"dogs_with_pk\"].insert({\"id\": 5, \"name\": \"Cleo\"}, pk=\"id\")\r\n>>> for pk, row in db[\"dogs_with_pk\"].pks_and_rows_where():\r\n...     print(pk, row)\r\n5 {'id': 5, 'name': 'Cleo'}\r\n\r\n>>> db[\"dogs_with_compound_pk\"].insert(\r\n...     {\"species\": \"dog\", \"id\": 3, \"name\": \"Cleo\"},\r\n...     pk=(\"species\", \"id\")\r\n... )\r\n>>> for pk, row in db[\"dogs_with_compound_pk\"].pks_and_rows_where():\r\n...     print(pk, row)\r\n('dog', 3) {'species': 'dog', 'id': 3, 'name': 'Cleo'}\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 816560819, "label": "table.pks_and_rows_where() method returning primary keys along with the rows"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/240#issuecomment-786036355", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/240", "id": 786036355, "node_id": "MDEyOklzc3VlQ29tbWVudDc4NjAzNjM1NQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-25T16:38:07Z", "updated_at": "2021-02-25T16:38:07Z", "author_association": "OWNER", "body": "Documentation: https://sqlite-utils.datasette.io/en/latest/python-api.html#listing-rows-with-their-primary-keys", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 816560819, "label": "table.pks_and_rows_where() method returning primary keys along with the rows"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/239#issuecomment-786035142", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/239", "id": 786035142, "node_id": "MDEyOklzc3VlQ29tbWVudDc4NjAzNTE0Mg==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-25T16:36:17Z", "updated_at": "2021-02-25T16:36:17Z", "author_association": "OWNER", "body": "WIP in a pull request.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 816526538, "label": "sqlite-utils extract could handle nested objects"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/240#issuecomment-786016380", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/240", "id": 786016380, "node_id": 
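A rough sketch of how the `drop(ignore=True)` behaviour discussed above could work, keeping the default strict so a mistyped table name still raises. This is not the sqlite-utils implementation, just the try/except shape the proposal implies:

```python
import sqlite3

def drop_table(conn, name, ignore=False):
    # Errors surface by default; ignore=True opts in to swallowing them
    try:
        conn.execute("drop table [{}]".format(name))
    except sqlite3.OperationalError:
        if not ignore:
            raise

conn = sqlite3.connect(":memory:")
drop_table(conn, "no_such_table", ignore=True)   # silently does nothing
# drop_table(conn, "no_such_table")              # would raise OperationalError
```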
"MDEyOklzc3VlQ29tbWVudDc4NjAxNjM4MA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-25T16:10:01Z", "updated_at": "2021-02-25T16:10:01Z", "author_association": "OWNER", "body": "I prototyped this and I like it:\r\n```\r\nIn [1]: import sqlite_utils\r\nIn [2]: db = sqlite_utils.Database(\"/Users/simon/Dropbox/Development/datasette/fixtures.db\")\r\nIn [3]: list(db[\"compound_primary_key\"].pks_and_rows_where())\r\nOut[3]: [(('a', 'b'), {'pk1': 'a', 'pk2': 'b', 'content': 'c'})]\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 816560819, "label": "table.pks_and_rows_where() method returning primary keys along with the rows"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/240#issuecomment-786007209", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/240", "id": 786007209, "node_id": "MDEyOklzc3VlQ29tbWVudDc4NjAwNzIwOQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-25T15:57:50Z", "updated_at": "2021-02-25T15:57:50Z", "author_association": "OWNER", "body": "`table.pks_and_rows_where(...)` is explicit and I think less ambiguous than the other options.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 816560819, "label": "table.pks_and_rows_where() method returning primary keys along with the rows"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/240#issuecomment-786006794", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/240", "id": 786006794, "node_id": "MDEyOklzc3VlQ29tbWVudDc4NjAwNjc5NA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-25T15:57:17Z", "updated_at": "2021-02-25T15:57:28Z", "author_association": "OWNER", "body": "I quite like `pks_with_rows_where(...)` - but grammatically it suggests it will return the primary keys that exist where their rows match the criteria - \"pks with rows\" can be interpreted as \"pks for the rows that...\" as opposed to \"pks accompanied by rows\"", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 816560819, "label": "table.pks_and_rows_where() method returning primary keys along with the rows"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/240#issuecomment-786005078", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/240", "id": 786005078, "node_id": "MDEyOklzc3VlQ29tbWVudDc4NjAwNTA3OA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-25T15:54:59Z", "updated_at": "2021-02-25T15:56:16Z", "author_association": "OWNER", "body": "Is `pk_rows_where()` a good name? It sounds like it returns \"primary key rows\" which isn't a thing. 
It actually returns rows along with their primary key.\r\n\r\nOther options:\r\n\r\n- `table.rows_with_pk_where(...)` - should this return `(row, pk)` rather than `(pk, row)`?\r\n- `table.rows_where_pk(...)`\r\n- `table.pk_and_rows_where(...)`\r\n- `table.pk_with_rows_where(...)`\r\n- `table.pks_with_rows_where(...)` - because rows is pluralized, so pks should be pluralized too?\r\n- `table.pks_rows_where(...)`", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 816560819, "label": "table.pks_and_rows_where() method returning primary keys along with the rows"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/240#issuecomment-786001768", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/240", "id": 786001768, "node_id": "MDEyOklzc3VlQ29tbWVudDc4NjAwMTc2OA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-25T15:50:28Z", "updated_at": "2021-02-25T15:52:12Z", "author_association": "OWNER", "body": "One option: `.rows_where()` could grow an `ensure_pk=True` option which checks to see if the table is a `rowid` table and, if it is, includes that in the `select`.\r\n\r\nOr... how about you can call `.rows_where(..., pks=True)` and it will yield `(pk, rowdict)` tuple pairs instead of just returning the sequence of dictionaries?\r\n\r\nI'm always a little bit nervous of methods that vary their return type based on their arguments. Maybe this would be a separate method instead?\r\n```python\r\nfor pk, row in table.pk_rows_where(...):\r\n    # ...\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 816560819, "label": "table.pks_and_rows_where() method returning primary keys along with the rows"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/239#issuecomment-785992158", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/239", "id": 785992158, "node_id": "MDEyOklzc3VlQ29tbWVudDc4NTk5MjE1OA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-25T15:37:04Z", "updated_at": "2021-02-25T15:37:04Z", "author_association": "OWNER", "body": "Here's the current implementation of `.extract()`: https://github.com/simonw/sqlite-utils/blob/806c21044ac8d31da35f4c90600e98115aade7c6/sqlite_utils/db.py#L1049-L1074\r\n\r\nTricky detail here: I create the lookup table first, based on the types of the columns that are being extracted.\r\n\r\nI need to do this because extraction currently uses unique tuples of values, so the table has to be created in advance.\r\n\r\nBut if I'm using these new expand functions to figure out what's going to be extracted, I don't know the names of the columns and their types in advance. I'm only going to find those out during the transformation.\r\n\r\nThis may turn out to be incompatible with how `.extract()` works at the moment. I may need a new method, `.extract_expand()` perhaps? 
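A rough sketch of the `(pk, row)` pairing logic the naming discussion above is circling: single primary keys yield scalars, compound keys yield tuples, and rowid tables fall back to the rowid. This is not the actual sqlite-utils implementation, just the core idea as a standalone function:

```python
def pks_and_rows(rows, pk_columns):
    # rows: iterable of dicts; pk_columns: e.g. ["rowid"], ["id"] or ["species", "id"]
    for row in rows:
        if len(pk_columns) == 1:
            yield row[pk_columns[0]], row
        else:
            yield tuple(row[c] for c in pk_columns), row

rows = [{"species": "dog", "id": 3, "name": "Cleo"}]
print(list(pks_and_rows(rows, ["species", "id"])))
# [(('dog', 3), {'species': 'dog', 'id': 3, 'name': 'Cleo'})]
```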
It could be simpler - work only against a single column for example.\r\n\r\nI can still use the existing `sqlite-utils extract` CLI command though, with a `--json` flag and a rule that you can't run it against multiple columns.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 816526538, "label": "sqlite-utils extract could handle nested objects"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/239#issuecomment-785983837", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/239", "id": 785983837, "node_id": "MDEyOklzc3VlQ29tbWVudDc4NTk4MzgzNw==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-25T15:25:21Z", "updated_at": "2021-02-25T15:28:57Z", "author_association": "OWNER", "body": "Problem with calling this argument `transform=` is that the term \"transform\" already means something else in this library.\r\n\r\nI could use `convert=` instead.\r\n\r\n... but that doesn't instantly make me think of turning a value into multiple columns.\r\n\r\nHow about `expand=`? I've not used that term anywhere yet.\r\n\r\n    db[\"Reports\"].extract([\"Reported by\"], expand={\"Reported by\": json.loads})\r\n\r\nI think that works. You're expanding a single value into several columns of information.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 816526538, "label": "sqlite-utils extract could handle nested objects"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/239#issuecomment-785983070", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/239", "id": 785983070, "node_id": "MDEyOklzc3VlQ29tbWVudDc4NTk4MzA3MA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-25T15:24:17Z", "updated_at": "2021-02-25T15:24:17Z", "author_association": "OWNER", "body": "I'm going to go with last-wins - so if multiple transform functions return the same key the last one will over-write the others.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 816526538, "label": "sqlite-utils extract could handle nested objects"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/239#issuecomment-785980813", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/239", "id": 785980813, "node_id": "MDEyOklzc3VlQ29tbWVudDc4NTk4MDgxMw==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-25T15:21:02Z", "updated_at": "2021-02-25T15:23:47Z", "author_association": "OWNER", "body": "Maybe the Python version takes an optional dictionary mapping column names to transformation functions? It could then merge all of those results together - and maybe throw an error if the same key is produced by more than one column.\r\n\r\n```python\r\ndb[\"Reports\"].extract([\"Reported by\"], transform={\"Reported by\": json.loads})\r\n```\r\nOr it could have an option for different strategies if keys collide: first wins, last wins, throw exception, add a prefix to the new column name. 
That feels a bit too complex for an edge-case though.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 816526538, "label": "sqlite-utils extract could handle nested objects"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/239#issuecomment-785980083", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/239", "id": 785980083, "node_id": "MDEyOklzc3VlQ29tbWVudDc4NTk4MDA4Mw==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-25T15:20:02Z", "updated_at": "2021-02-25T15:20:02Z", "author_association": "OWNER", "body": "It would be OK if the CLI version only allows you to specify a single column if you are using the `--json` option.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 816526538, "label": "sqlite-utils extract could handle nested objects"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/239#issuecomment-785979769", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/239", "id": 785979769, "node_id": "MDEyOklzc3VlQ29tbWVudDc4NTk3OTc2OQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-25T15:19:37Z", "updated_at": "2021-02-25T15:19:37Z", "author_association": "OWNER", "body": "For the Python version I'd like to be able to provide a transformation callback function - which can be `json.loads` but could also be anything else which accepts the value of the current column and returns a Python dictionary of columns and their values to use in the new table.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 816526538, "label": "sqlite-utils extract could handle nested objects"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/239#issuecomment-785979192", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/239", "id": 785979192, "node_id": "MDEyOklzc3VlQ29tbWVudDc4NTk3OTE5Mg==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-25T15:18:46Z", "updated_at": "2021-02-25T15:18:46Z", "author_association": "OWNER", "body": "Likewise the `sqlite-utils extract` command takes one or more columns:\r\n```\r\nUsage: sqlite-utils extract [OPTIONS] PATH TABLE COLUMNS...\r\n\r\n  Extract one or more columns into a separate table\r\n\r\nOptions:\r\n  --table TEXT      Name of the other table to extract columns to\r\n  --fk-column TEXT  Name of the foreign key column to add to the table\r\n  --rename ... 
Rename this column in extracted table\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 816526538, "label": "sqlite-utils extract could handle nested objects"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/239#issuecomment-785978689", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/239", "id": 785978689, "node_id": "MDEyOklzc3VlQ29tbWVudDc4NTk3ODY4OQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-25T15:18:03Z", "updated_at": "2021-02-25T15:18:03Z", "author_association": "OWNER", "body": "The Python `.extract()` method currently starts like this:\r\n```python\r\ndef extract(self, columns, table=None, fk_column=None, rename=None):\r\n    rename = rename or {}\r\n    if isinstance(columns, str):\r\n        columns = [columns]\r\n    if not set(columns).issubset(self.columns_dict.keys()):\r\n        raise InvalidColumns(\r\n            \"Invalid columns {} for table with columns {}\".format(\r\n                columns, list(self.columns_dict.keys())\r\n            )\r\n        )\r\n    ...\r\n```\r\nNote that it takes a list of columns (and treats a string as a single item list). That's because it can be called with a list of columns and it will use them to populate another table of unique tuples of those column values.\r\n\r\nSo a new mechanism that can instead read JSON values from a single column needs to be compatible with that existing design.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 816526538, "label": "sqlite-utils extract could handle nested objects"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/238#issuecomment-785972074", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/238", "id": 785972074, "node_id": "MDEyOklzc3VlQ29tbWVudDc4NTk3MjA3NA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-25T15:08:36Z", "updated_at": "2021-02-25T15:08:36Z", "author_association": "OWNER", "body": "I bet the bug is in here: https://github.com/simonw/sqlite-utils/blob/806c21044ac8d31da35f4c90600e98115aade7c6/sqlite_utils/db.py#L593-L602", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 816523763, "label": ".add_foreign_key() corrupts database if column contains a space"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1241#issuecomment-784567547", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1241", "id": 784567547, "node_id": "MDEyOklzc3VlQ29tbWVudDc4NDU2NzU0Nw==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-23T22:45:56Z", "updated_at": "2021-02-23T22:46:12Z", "author_association": "OWNER", "body": "I really like the way the Share feature on Stack Overflow works: https://stackoverflow.com/questions/18934149/how-can-i-use-postgresqls-text-column-type-in-django\r\n", "reactions": "{\"total_count\": 1, \"+1\": 1, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 814595021, "label": "Share button for copying current URL"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1241#issuecomment-784334931", "issue_url": 
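Per the transformation-callback comment above, the callback only has to map the current column value to a dictionary of columns for the new table; it does not have to be `json.loads`. A hypothetical example that splits "City, Country" strings instead (note `expand=` is a design sketch from this thread, not a released API):

```python
# Anything value -> dict works as the callback
def split_location(value):
    city, _, country = value.partition(", ")
    return {"city": city, "country": country or None}

print(split_location("San Francisco, USA"))
# {'city': 'San Francisco', 'country': 'USA'}

# Proposed usage from the discussion (hypothetical table and column names):
# db["reports"].extract(["location"], expand={"location": split_location})
```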
"https://api.github.com/repos/simonw/datasette/issues/1241", "id": 784334931, "node_id": "MDEyOklzc3VlQ29tbWVudDc4NDMzNDkzMQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-23T16:37:26Z", "updated_at": "2021-02-23T16:37:26Z", "author_association": "OWNER", "body": "A \"Share link\" button would only be needed on the table page and the arbitrary query page I think - and maybe on the row page, especially as that page starts to grow more features in the future.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 814595021, "label": "Share button for copying current URL"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1241#issuecomment-784333768", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1241", "id": 784333768, "node_id": "MDEyOklzc3VlQ29tbWVudDc4NDMzMzc2OA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-23T16:35:51Z", "updated_at": "2021-02-23T16:35:51Z", "author_association": "OWNER", "body": "This can definitely be done with a plugin.\r\n\r\nAdding to Datasette itself is an interesting idea. I think it's possible that many users these days no longer assume they can paste a URL from the browser address bar (if they ever understood that at all) because to many apps are SPAs with broken URLs.\r\n\r\nThe shareable URLs are actually a key feature of Datasette - so maybe they should be highlighted in the default UI?\r\n\r\nI built a \"copy to clipboard\" feature for `datasette-copyable` and wrote up how that works here: https://til.simonwillison.net/javascript/copy-button", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 814595021, "label": "Share button for copying current URL"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1239#issuecomment-783774084", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1239", "id": 783774084, "node_id": "MDEyOklzc3VlQ29tbWVudDc4Mzc3NDA4NA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-23T00:18:56Z", "updated_at": "2021-02-23T00:19:18Z", "author_association": "OWNER", "body": "Bug is here: https://github.com/simonw/datasette/blob/42caabf7e9e6e4d69ef6dd7de16f2cd96bc79d5b/datasette/filters.py#L149-L165\r\n\r\nThose `json_each` lines should be:\r\n\r\n select {t}.rowid from {t}, json_each([{t}].[{c}]) j", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 813978858, "label": "JSON filter fails if column contains spaces"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1237#issuecomment-783676548", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1237", "id": 783676548, "node_id": "MDEyOklzc3VlQ29tbWVudDc4MzY3NjU0OA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-22T21:10:19Z", "updated_at": "2021-02-22T21:10:25Z", "author_association": "OWNER", "body": "This is another change which is a little bit hard to figure out because I haven't solved #878 yet.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 812704869, "label": 
"?_pretty=1 option for pretty-printing JSON output"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1234#issuecomment-783674659", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1234", "id": 783674659, "node_id": "MDEyOklzc3VlQ29tbWVudDc4MzY3NDY1OQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-22T21:06:28Z", "updated_at": "2021-02-22T21:06:28Z", "author_association": "OWNER", "body": "I'm not going to work on this for a while, but if anyone has needs or ideas around that they can add them to this issue.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 811505638, "label": "Runtime support for ATTACHing multiple databases"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1236#issuecomment-783674038", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1236", "id": 783674038, "node_id": "MDEyOklzc3VlQ29tbWVudDc4MzY3NDAzOA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-22T21:05:21Z", "updated_at": "2021-02-22T21:05:21Z", "author_association": "OWNER", "body": "It's good on mobile - iOS at least. Going to close this open new issues if anyone reports bugs.", "reactions": "{\"total_count\": 1, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 1, \"eyes\": 0}", "issue": {"value": 812228314, "label": "Ability to increase size of the SQL editor window"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/782#issuecomment-782789598", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/782", "id": 782789598, "node_id": "MDEyOklzc3VlQ29tbWVudDc4Mjc4OTU5OA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-21T03:30:02Z", "updated_at": "2021-02-21T03:30:02Z", "author_association": "OWNER", "body": "Another benefit to default:object - I could include a key that shows a list of available extras. 
I could then use that to power an interactive API explorer.", "reactions": "{\"total_count\": 1, \"+1\": 1, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 627794879, "label": "Redesign default .json format"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/782#issuecomment-782765665", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/782", "id": 782765665, "node_id": "MDEyOklzc3VlQ29tbWVudDc4Mjc2NTY2NQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-20T23:34:41Z", "updated_at": "2021-02-20T23:34:41Z", "author_association": "OWNER", "body": "OK, I'm back to the \"top level object as the default\" side of things now - it's pretty much unanimous at this point, and it's certainly true that it's not a decision you'll ever regret.", "reactions": "{\"total_count\": 2, \"+1\": 2, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 627794879, "label": "Redesign default .json format"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/782#issuecomment-782748501", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/782", "id": 782748501, "node_id": "MDEyOklzc3VlQ29tbWVudDc4Mjc0ODUwMQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-20T20:58:18Z", "updated_at": "2021-02-20T20:58:18Z", "author_association": "OWNER", "body": "Yet another option: support a `?_path=x` option which returns a nested path from the result. So you could do this:\r\n\r\n`/github/commits.json?_path=rows` - to get back a top-level array pulled from the `\"rows\"` key.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 627794879, "label": "Redesign default .json format"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/782#issuecomment-782748093", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/782", "id": 782748093, "node_id": "MDEyOklzc3VlQ29tbWVudDc4Mjc0ODA5Mw==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-20T20:54:52Z", "updated_at": "2021-02-20T20:54:52Z", "author_association": "OWNER", "body": "> Have you given any thought as to whether to pretty print (format with spaces) the output or not? Can be useful for debugging/exploring in a browser or other basic tools which don\u2019t parse the JSON. Could be default (can\u2019t be much bigger with gzip?) or opt-in.\r\n\r\nAdding a `?_pretty=1` option that does that is a great idea, I'm filing a ticket for it: #1237", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 627794879, "label": "Redesign default .json format"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/782#issuecomment-782747878", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/782", "id": 782747878, "node_id": "MDEyOklzc3VlQ29tbWVudDc4Mjc0Nzg3OA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-20T20:53:11Z", "updated_at": "2021-02-20T20:53:11Z", "author_association": "OWNER", "body": "... 
though thinking about this further, I could re-implement the `select * from commits` (but only return a max of 10 results) feature using a nested `select * from (select * from commits) limit 10` query.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 627794879, "label": "Redesign default .json format"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/782#issuecomment-782747743", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/782", "id": 782747743, "node_id": "MDEyOklzc3VlQ29tbWVudDc4Mjc0Nzc0Mw==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-20T20:52:10Z", "updated_at": "2021-02-20T20:52:10Z", "author_association": "OWNER", "body": "> Minor suggestion: rename `size` query param to `limit`, to better reflect that it\u2019s a maximum number of rows returned rather than a guarantee of getting that number, and also for consistency with the SQL keyword?\r\n\r\nThe problem there is that `?_size=x` isn't actually doing the same thing as the SQL `limit` keyword. Consider this query:\r\n\r\nhttps://latest-with-plugins.datasette.io/github?sql=select+*+from+commits - `select * from commits`\r\n\r\nDatasette returns 1,000 results, and shows a \"Custom SQL query returning more than 1,000 rows\" message at the top. That's the `size` kicking in - I only fetch the first 1,000 results from the cursor to avoid exhausting resources. In the JSON version of that at https://latest-with-plugins.datasette.io/github.json?sql=select+*+from+commits there's a `\"truncated\": true` key to let you know what happened.\r\n\r\nI find myself using `?_size=2` against Datasette occasionally if I know the rows being returned are really big and I don't want to load 10+MB of HTML.\r\n\r\nThis is only really a concern for arbitrary SQL queries though - for table pages such as https://latest-with-plugins.datasette.io/github/commits?_size=10 adding `?_size=10` actually puts a `limit 10` on the underlying SQL query.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 627794879, "label": "Redesign default .json format"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/782#issuecomment-782747164", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/782", "id": 782747164, "node_id": "MDEyOklzc3VlQ29tbWVudDc4Mjc0NzE2NA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-20T20:47:16Z", "updated_at": "2021-02-20T20:47:16Z", "author_association": "OWNER", "body": "(I started a thread on Twitter about this: https://twitter.com/simonw/status/1363220355318358016)", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 627794879, "label": "Redesign default .json format"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/782#issuecomment-782746633", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/782", "id": 782746633, "node_id": "MDEyOklzc3VlQ29tbWVudDc4Mjc0NjYzMw==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-20T20:43:07Z", "updated_at": "2021-02-20T20:43:07Z", "author_association": "OWNER", "body": "Another option: `.json` always returns an object with a 
list of keys that gets increased through adding `?_extra=` parameters.\r\n\r\n`.jsona` always returns a JSON array of objects\r\n\r\nI had something similar to this in Datasette a few years ago - a `.jsono` extension, which still redirects to the `shape=array` version.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 627794879, "label": "Redesign default .json format"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/782#issuecomment-782742233", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/782", "id": 782742233, "node_id": "MDEyOklzc3VlQ29tbWVudDc4Mjc0MjIzMw==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-20T20:09:16Z", "updated_at": "2021-02-20T20:09:16Z", "author_association": "OWNER", "body": "I just noticed that https://latest-with-plugins.datasette.io/github/commits.json-preview?_extra=total&_size=0&_trace=1 executes 35 SQL queries at the moment! A great reminder that a big improvement from this change will be a reduction in queries through not calculating things like suggested facets unless they are explicitly requested.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 627794879, "label": "Redesign default .json format"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/782#issuecomment-782741719", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/782", "id": 782741719, "node_id": "MDEyOklzc3VlQ29tbWVudDc4Mjc0MTcxOQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-20T20:05:04Z", "updated_at": "2021-02-20T20:05:04Z", "author_association": "OWNER", "body": "> The only advantage of headers is that you don\u2019t need to do .rows, but that\u2019s actually good as a data validation step anyway\u2014if .rows is missing assume there\u2019s an error and do your error handling path instead of parsing the rest.\r\n\r\nThis is something I've not thought very hard about. If there's an error, I need to return a top-level object, not a top-level array, so I can provide details of the error.\r\n\r\nBut this means that client code will have to handle this difference - it will have to know that the returned data can be array-shaped if nothing went wrong, and object-shaped if there's an error.\r\n\r\nThe HTTP status code helps here - calling client code can know that a 200 status code means there will be an array, but an error status code means an object.\r\n\r\nIf developers really hate that the shape could be different, they can always use `?_extra=next` to ensure that the top level item is an object whether or not an error occurred. 
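A sketch of the client-side handling described in the comment above, assuming the proposed default format where a 200 response carries a top-level JSON array and an error status carries an object with details. The URL and error key are hypothetical:

```python
import httpx

def fetch_rows(url):
    response = httpx.get(url)
    if response.status_code == 200:
        return response.json()  # success: a top-level array of rows
    details = response.json()   # error: a top-level object describing it
    raise RuntimeError(details.get("error", "unknown error"))
```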
So I think this is OK.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 627794879, "label": "Redesign default .json format"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/782#issuecomment-782741107", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/782", "id": 782741107, "node_id": "MDEyOklzc3VlQ29tbWVudDc4Mjc0MTEwNw==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-20T20:00:22Z", "updated_at": "2021-02-20T20:00:22Z", "author_association": "OWNER", "body": "A really exciting opportunity this opens up is for parallel execution - the `facets()` and `suggested_facets()` and `total()` async functions could be called in parallel, which could speed things up if I'm confident the SQLite thread pool can execute on multiple CPU cores (it should be able to because the Python `sqlite3` module releases the GIL while it's executing C code).", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 627794879, "label": "Redesign default .json format"}, "performed_via_github_app": null}
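A minimal sketch of the parallel-execution idea in the comment above: fan the three awaitables out with `asyncio.gather` so their SQL can run concurrently. The function names come from the comment; the bodies here are stand-in values, not Datasette internals:

```python
import asyncio

async def facets():
    return {"country_long": 30}

async def suggested_facets():
    return ["owner"]

async def total():
    return 33643

async def extras():
    # All three coroutines are scheduled at once; with the GIL released
    # inside sqlite3's C code, the queries can use multiple CPU cores
    facet_results, suggestions, count = await asyncio.gather(
        facets(), suggested_facets(), total()
    )
    return {"facets": facet_results, "suggested_facets": suggestions, "count": count}

print(asyncio.run(extras()))
```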