html_url,issue_url,id,node_id,user,user_label,created_at,updated_at,author_association,body,reactions,issue,issue_label,performed_via_github_app https://github.com/simonw/datasette/pull/1835#issuecomment-1270586897,https://api.github.com/repos/simonw/datasette/issues/1835,1270586897,IC_kwDOBm6k_c5Lu54R,9599,simonw,2022-10-06T19:34:00Z,2022-10-06T19:34:00Z,OWNER,"Wow, great catch! The whole point of inspect data was to avoid this kind of expensive operation on startup so this makes total sense - I had no idea Datasette was still trying to hash a giant file every time the server started.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1400121355,use inspect data for hash and file size, https://github.com/simonw/datasette/issues/1480#issuecomment-1269275153,https://api.github.com/repos/simonw/datasette/issues/1480,1269275153,IC_kwDOBm6k_c5Lp5oR,9599,simonw,2022-10-06T03:54:33Z,2022-10-06T03:54:33Z,OWNER,"I've been having success using Fly recently for a project which I thought would be too large for Cloud Run. I wrote about that here: - https://simonwillison.net/2022/Sep/5/laion-aesthetics-weeknotes/","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1015646369,Exceeding Cloud Run memory limits when deploying a 4.8G database, https://github.com/simonw/datasette/issues/1832#issuecomment-1267925830,https://api.github.com/repos/simonw/datasette/issues/1832,1267925830,IC_kwDOBm6k_c5LkwNG,9599,simonw,2022-10-05T04:31:57Z,2022-10-05T04:31:57Z,OWNER,"Turns out this already works - `__bool__` falls back on `__len__`: https://docs.python.org/3/reference/datamodel.html#object.__bool__ > When this method is not defined, [`__len__()`](https://docs.python.org/3/reference/datamodel.html#object.__len__ ""object.__len__"") is called, if it is defined, and the object is considered true if its result is nonzero. 
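
A minimal sketch of that fallback, using a hypothetical stand-in class rather than Datasette's actual `Results` implementation. It assumes only that the class defines `__len__` and no `__bool__`:

```python
class Results:
    # Hypothetical stand-in for illustration, not Datasette's real class
    def __init__(self, rows):
        self.rows = rows

    def __len__(self):
        # No __bool__ is defined, so truth testing falls back to this
        return len(self.rows)


empty_is_truthy = bool(Results([]))
non_empty_is_truthy = bool(Results([(1,)]))
```

With this in place `if results:` is false for an empty result set and true otherwise, without any extra method.
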
I'll add a test to demonstrate this.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1397193691,__bool__ method on Results, https://github.com/simonw/datasette/issues/1832#issuecomment-1267918117,https://api.github.com/repos/simonw/datasette/issues/1832,1267918117,IC_kwDOBm6k_c5LkuUl,9599,simonw,2022-10-05T04:19:52Z,2022-10-05T04:19:52Z,OWNER,"Code can go here: https://github.com/simonw/datasette/blob/b6ba117b7978b58b40e3c3c2b723b92c3010ed53/datasette/database.py#L511-L515 ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1397193691,__bool__ method on Results, https://github.com/simonw/datasette/issues/1829#issuecomment-1267709546,https://api.github.com/repos/simonw/datasette/issues/1829,1267709546,IC_kwDOBm6k_c5Lj7Zq,9599,simonw,2022-10-04T23:19:24Z,2022-10-04T23:21:07Z,OWNER,"There's also a `check_visibility()` helper which I'm not using in these particular cases but which may be relevant. It's called like this: https://github.com/simonw/datasette/blob/4218c9cd742b79b1e3cb80878e42b7e39d16ded2/datasette/views/database.py#L65-L77 And is defined here: https://github.com/simonw/datasette/blob/4218c9cd742b79b1e3cb80878e42b7e39d16ded2/datasette/app.py#L694-L710 It's actually documented as a public method here: https://docs.datasette.io/en/stable/internals.html#await-check-visibility-actor-action-resource-none > This convenience method can be used to answer the question ""should this item be considered private, in that it is visible to me but it is not visible to anonymous users?"" > > It returns a tuple of two booleans, `(visible, private)`. `visible` indicates if the actor can see this resource. `private` will be `True` if an anonymous user would not be able to view the resource. 
Note that this documented method cannot actually do the right thing - because it's not being given the multiple permissions that need to be checked in order to completely answer the question. So I probably need to redesign that method a bit.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1396948693,Table/database that is private due to inherited permissions does not show padlock, https://github.com/simonw/datasette/issues/1829#issuecomment-1267708232,https://api.github.com/repos/simonw/datasette/issues/1829,1267708232,IC_kwDOBm6k_c5Lj7FI,9599,simonw,2022-10-04T23:17:36Z,2022-10-04T23:17:36Z,OWNER,"Here's the relevant code from the table page: https://github.com/simonw/datasette/blob/4218c9cd742b79b1e3cb80878e42b7e39d16ded2/datasette/views/table.py#L215-L227 Note how `ensure_permissions()` there takes the table, database and instance into account... but the `private` assignment (used to decide if the padlock should display or not) only considers the `view-table` check. Here's the same code for the database page: https://github.com/simonw/datasette/blob/4218c9cd742b79b1e3cb80878e42b7e39d16ded2/datasette/views/database.py#L139-L141 And for canned query pages: https://github.com/simonw/datasette/blob/4218c9cd742b79b1e3cb80878e42b7e39d16ded2/datasette/views/database.py#L228-L240 ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1396948693,Table/database that is private due to inherited permissions does not show padlock, https://github.com/simonw/datasette/issues/485#issuecomment-1264769569,https://api.github.com/repos/simonw/datasette/issues/485,1264769569,IC_kwDOBm6k_c5LYtoh,9599,simonw,2022-10-03T00:04:42Z,2022-10-03T00:04:42Z,OWNER,"I love these tips - tools that can compile a simple machine learning model to a SQL query! 
Would be pretty cool if I could bundle a model in Datasette itself as a big in-memory SQLite SQL query: - https://github.com/Chryzanthemum/xgb2sql - https://github.com/konstantint/SKompiler","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",447469253,Improvements to table label detection , https://github.com/simonw/datasette/issues/1805#issuecomment-1264753894,https://api.github.com/repos/simonw/datasette/issues/1805,1264753894,IC_kwDOBm6k_c5LYpzm,9599,simonw,2022-10-02T23:02:54Z,2022-10-02T23:02:54Z,OWNER,I'm tempted to add `word-wrap: anywhere` only to links that are known to be longer than a certain threshold.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1363552780,truncate_cells_html does not work for links?, https://github.com/simonw/datasette/issues/1805#issuecomment-1264753725,https://api.github.com/repos/simonw/datasette/issues/1805,1264753725,IC_kwDOBm6k_c5LYpw9,9599,simonw,2022-10-02T23:02:17Z,2022-10-02T23:02:17Z,OWNER,"After reverting `word-wrap: anywhere` 
https://latest.datasette.io/_memory?sql=select+%27https%3A%2F%2Fexample.com%2Faaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa.jpg%27+as+truncated now looks like this, which isn't as good: ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1363552780,truncate_cells_html does not work for links?, https://github.com/simonw/datasette/issues/1828#issuecomment-1264753439,https://api.github.com/repos/simonw/datasette/issues/1828,1264753439,IC_kwDOBm6k_c5LYpsf,9599,simonw,2022-10-02T23:01:17Z,2022-10-02T23:01:17Z,OWNER,"That change deployed and https://github-to-sqlite.dogsheep.net/github/commits now looks like this: ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1393903845,word-wrap: anywhere resulting in weird display, https://github.com/simonw/datasette/issues/1828#issuecomment-1264738081,https://api.github.com/repos/simonw/datasette/issues/1828,1264738081,IC_kwDOBm6k_c5LYl8h,9599,simonw,2022-10-02T21:34:37Z,2022-10-02T21:34:37Z,OWNER,I'm running a build of that demo instance here (takes ~30m) https://github.com/dogsheep/github-to-sqlite/actions/runs/3170164705,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1393903845,word-wrap: anywhere resulting in weird display, https://github.com/simonw/datasette/issues/485#issuecomment-1264737290,https://api.github.com/repos/simonw/datasette/issues/485,1264737290,IC_kwDOBm6k_c5LYlwK,9599,simonw,2022-10-02T21:29:59Z,2022-10-02T21:29:59Z,OWNER,"To clarify: the feature this issue is talking about relates to the way Datasette automatically displays foreign key relationships, for example on this page: https://github-to-sqlite.dogsheep.net/github/commits Each of those columns is a foreign key to another table. 
The link text that is displayed there comes from the ""label column"" that has either been configured or automatically detected for that other table. I wonder if this could be handled with a tiny machine learning model that's trained to help pick the best label column? Inputs to that model could include: - The names of the columns - The number of unique values in each column - The type of each column (or maybe only `TEXT` columns should be considered) - How many `null` values there are - Is the column marked as unique? - What's the average (or median or some other statistic) string length of values in each column? Output would be the most likely label column, or some indicator that no likely candidates had been found. My hunch is that this would be better solved using a few extra heuristics rather than by training a model, but it does feel like an interesting opportunity to experiment with a tiny ML model. Asked for tips about this on Twitter: https://twitter.com/simonw/status/1576680930680262658 ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",447469253,Improvements to table label detection , https://github.com/simonw/datasette/issues/1805#issuecomment-1264736537,https://api.github.com/repos/simonw/datasette/issues/1805,1264736537,IC_kwDOBm6k_c5LYlkZ,9599,simonw,2022-10-02T21:25:37Z,2022-10-02T21:25:37Z,OWNER,"`word-wrap: anywhere` had some nasty side-effects, removing that: - #1828","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1363552780,truncate_cells_html does not work for links?, https://github.com/simonw/sqlite-utils/issues/297#issuecomment-1262920929,https://api.github.com/repos/simonw/sqlite-utils/issues/297,1262920929,IC_kwDOCGYnMM5LRqTh,9599,simonw,2022-09-29T23:06:44Z,2022-09-29T23:06:44Z,OWNER,"Currently the only other use of `-t` is for this: ``` -t, --table Output as a formatted 
table ``` So I think it's OK to use it to mean something slightly different for this command, since `sqlite-utils insert` doesn't do any output of data in any format.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",944846776,Option for importing CSV data using the SQLite .import mechanism, https://github.com/simonw/sqlite-utils/issues/297#issuecomment-1262918833,https://api.github.com/repos/simonw/sqlite-utils/issues/297,1262918833,IC_kwDOCGYnMM5LRpyx,9599,simonw,2022-09-29T23:02:52Z,2022-09-29T23:02:52Z,OWNER,"The other nice thing about having this as a separate command is that I can implement a tiny subset of the overall `sqlite-utils insert` features at first, and then add additional features in subsequent releases.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",944846776,Option for importing CSV data using the SQLite .import mechanism, https://github.com/simonw/sqlite-utils/issues/297#issuecomment-1262917059,https://api.github.com/repos/simonw/sqlite-utils/issues/297,1262917059,IC_kwDOCGYnMM5LRpXD,9599,simonw,2022-09-29T22:59:28Z,2022-09-29T22:59:28Z,OWNER,"I quite like `sqlite-utils fast-csv` - I think it's clear enough what it does, and running `--help` can clarify if needed.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",944846776,Option for importing CSV data using the SQLite .import mechanism, https://github.com/simonw/sqlite-utils/issues/297#issuecomment-1262915322,https://api.github.com/repos/simonw/sqlite-utils/issues/297,1262915322,IC_kwDOCGYnMM5LRo76,9599,simonw,2022-09-29T22:57:31Z,2022-09-29T22:57:42Z,OWNER,Maybe `sqlite-utils fast-csv` is right? 
Not entirely clear that's an insert though as opposed to a faster version of in-memory querying in the style of `sqlite-utils memory`.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",944846776,Option for importing CSV data using the SQLite .import mechanism, https://github.com/simonw/sqlite-utils/issues/297#issuecomment-1262914416,https://api.github.com/repos/simonw/sqlite-utils/issues/297,1262914416,IC_kwDOCGYnMM5LRotw,9599,simonw,2022-09-29T22:56:53Z,2022-09-29T22:56:53Z,OWNER,"Potential names/designs: - `sqlite-utils fast data.db rows rows.csv` - `sqlite-utils insert-fast data.db rows rows.csv` - `sqlite-utils fast-csv data.db rows rows.csv` Or more interestingly... what if it could accept multiple CSV files to create multiple tables? - `sqlite-utils fast data.db rows.csv other.csv` Would still need to support creating tables with different names though. Maybe like this: - `sqlite-utils fast data.db -t mytable rows.csv -t othertable other.csv` I seem to be leaning towards `fast` as the command name, but as a standalone command name it's a bit meaningless - how do we know that's about CSV import and not about fast querying or similar?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",944846776,Option for importing CSV data using the SQLite .import mechanism, https://github.com/simonw/sqlite-utils/issues/297#issuecomment-1262913145,https://api.github.com/repos/simonw/sqlite-utils/issues/297,1262913145,IC_kwDOCGYnMM5LRoZ5,9599,simonw,2022-09-29T22:54:13Z,2022-09-29T22:54:13Z,OWNER,"After reviewing `sqlite-utils insert --help` I'm confident that MOST of these options wouldn't make sense for a ""fast"" mode that just supports CSV and works by piping directly to the `sqlite3` binary: https://github.com/simonw/sqlite-utils/blob/d792dad1cf5f16525da81b1e162fb71d469995f3/docs/cli-reference.rst#L251-L279 
I'm going to implement a separate command instead.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",944846776,Option for importing CSV data using the SQLite .import mechanism, https://github.com/simonw/datasette/pull/1825#issuecomment-1260368537,https://api.github.com/repos/simonw/datasette/issues/1825,1260368537,IC_kwDOBm6k_c5LH7KZ,9599,simonw,2022-09-28T04:21:18Z,2022-09-28T04:21:18Z,OWNER,"This is great, thank you very much! https://datasette--1825.org.readthedocs.build/en/1825/deploying.html#running-datasette-using-openrc","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1388227245,Add documentation for serving via OpenRC, https://github.com/simonw/datasette/issues/1826#issuecomment-1260357878,https://api.github.com/repos/simonw/datasette/issues/1826,1260357878,IC_kwDOBm6k_c5LH4j2,9599,simonw,2022-09-28T04:05:45Z,2022-09-28T04:05:45Z,OWNER,Though now I notice that the copy right there needs to be updated to reflect the new `row` parameter to `render_cell`!,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1388631785,render_cell documentation example doesn't match the method signature, https://github.com/simonw/datasette/issues/1826#issuecomment-1260357583,https://api.github.com/repos/simonw/datasette/issues/1826,1260357583,IC_kwDOBm6k_c5LH4fP,9599,simonw,2022-09-28T04:05:16Z,2022-09-28T04:05:16Z,OWNER,"This is deliberate. The Datasette plugin system allows you to specify only a subset of the parameters for a hook - in this example, only the `value` is needed so the others can be omitted. 
There's a note about this at the very top of that documentation page: https://docs.datasette.io/en/stable/plugin_hooks.html#plugin-hooks > When you implement a plugin hook you can accept any or all of the parameters that are documented as being passed to that hook. > > For example, you can implement the `render_cell` plugin hook like this even though the full documented hook signature is `render_cell(value, column, table, database, datasette)`: > ```python > @hookimpl > def render_cell(value, column): > if column == ""stars"": > return ""*"" * int(value) > ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1388631785,render_cell documentation example doesn't match the method signature, https://github.com/simonw/datasette/issues/526#issuecomment-1260355224,https://api.github.com/repos/simonw/datasette/issues/526,1260355224,IC_kwDOBm6k_c5LH36Y,9599,simonw,2022-09-28T04:01:25Z,2022-09-28T04:01:25Z,OWNER,"The ultimate protection against those memory bombs is to support more streaming output formats. Related issues: - #1177 - #1062","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",459882902,Stream all results for arbitrary SQL and canned queries, https://github.com/simonw/datasette/issues/526#issuecomment-1259693536,https://api.github.com/repos/simonw/datasette/issues/526,1259693536,IC_kwDOBm6k_c5LFWXg,9599,simonw,2022-09-27T15:42:55Z,2022-09-27T15:42:55Z,OWNER,"It's interesting to note WHY the time limit works against this so well. The time limit as-implemented looks like this: https://github.com/simonw/datasette/blob/5f9f567acbc58c9fcd88af440e68034510fb5d2b/datasette/utils/__init__.py#L181-L201 The key here is `conn.set_progress_handler(handler, n)` - which specifies that the handler function should be called every `n` SQLite operations. 
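
A rough sketch of that progress-handler pattern, not Datasette's actual implementation. The helper name `run_with_time_limit`, its parameters and the 50ms limit are assumptions made up for illustration; only `set_progress_handler()` itself comes from the standard `sqlite3` module:

```python
import sqlite3
import time


def run_with_time_limit(conn, sql, limit_ms, n=1000):
    # Hypothetical helper: abort the query once limit_ms has elapsed
    deadline = time.perf_counter() + limit_ms / 1000

    def handler():
        # SQLite calls this every n virtual machine operations;
        # a non-zero return value interrupts the running query
        return 1 if time.perf_counter() >= deadline else 0

    conn.set_progress_handler(handler, n)
    try:
        return conn.execute(sql).fetchall()
    finally:
        # Remove the handler so later queries are unaffected
        conn.set_progress_handler(None, n)


conn = sqlite3.connect(':memory:')
infinite_sql = '''
with recursive counter(x) as (
  select 0 union select x + 1 from counter
)
select count(*) from counter
'''
try:
    run_with_time_limit(conn, infinite_sql, limit_ms=50)
    interrupted = False
except sqlite3.OperationalError:
    # Raised when the handler cancels the query
    interrupted = True
```

Because the handler fires on operation count rather than on rows returned, it also stops queries that never yield a single row.
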
The handler function then checks to see if too much time has transpired and conditionally cancels the query. This also doubles up as a ""maximum number of operations"" guard, which is what's happening when you attempt to fetch an infinite number of rows from an infinite table. That limit code could even be extended to say ""exit the query after either 5s or 50,000,000 operations"". I don't think that's necessary though. To be honest I'm having trouble with the idea of dropping `max_returned_rows` mainly because what Datasette does (allow arbitrary untrusted SQL queries) is dangerous, so I've designed in multiple redundant defence-in-depth mechanisms right from the start.","{""total_count"": 1, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 1, ""rocket"": 0, ""eyes"": 0}",459882902,Stream all results for arbitrary SQL and canned queries, https://github.com/simonw/datasette/issues/526#issuecomment-1258906440,https://api.github.com/repos/simonw/datasette/issues/526,1258906440,IC_kwDOBm6k_c5LCWNI,9599,simonw,2022-09-27T03:04:37Z,2022-09-27T03:04:37Z,OWNER,"It would be really neat if we could explore this idea in a plugin, but I don't think Datasette has plugin hooks in the right place for that at the moment.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",459882902,Stream all results for arbitrary SQL and canned queries, https://github.com/simonw/datasette/issues/526#issuecomment-1258905781,https://api.github.com/repos/simonw/datasette/issues/526,1258905781,IC_kwDOBm6k_c5LCWC1,9599,simonw,2022-09-27T03:03:35Z,2022-09-27T03:03:47Z,OWNER,"Yes good point, the time limit does already protect against that. I've been contemplating a permissioned-users-only relaxation of that time limit too, and I got that idea mixed up with this one in my head. On that basis maybe this feature would be safe after all? 
Would need to do some testing, but it may be that the existing time limit provides enough protection here already.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",459882902,Stream all results for arbitrary SQL and canned queries, https://github.com/simonw/datasette/issues/526#issuecomment-1258864140,https://api.github.com/repos/simonw/datasette/issues/526,1258864140,IC_kwDOBm6k_c5LCL4M,9599,simonw,2022-09-27T01:55:32Z,2022-09-27T01:55:32Z,OWNER,"That recursive query is a great example of the kind of thing having a maximum row limit protects against. Imagine if Datasette CSVs did allow unlimited retrievals. Someone could hit the CSV endpoint for that recursive query and tie up Datasette's SQL connection effectively forever. Even if this feature becomes a permission-guarded thing we still need to take that case into account. At the very least it would be good if the query could be cancelled if the client disconnects - so if someone accidentally starts an infinite query they can cancel the request and free up the server resources. It might be a good idea to implement a page that shows ""currently running"" queries and allows users with the right permission to terminate them from that page. 
Another option: a ""limit of last resort"" - either a very high row limit (10,000,000 perhaps) or even a time limit, saying that all queries will be cancelled if they take longer than thirty minutes or similar.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",459882902,Stream all results for arbitrary SQL and canned queries, https://github.com/simonw/datasette/issues/526#issuecomment-1258860845,https://api.github.com/repos/simonw/datasette/issues/526,1258860845,IC_kwDOBm6k_c5LCLEt,9599,simonw,2022-09-27T01:48:31Z,2022-09-27T01:50:01Z,OWNER,"The protection is supposed to be from this line: ```python rows = cursor.fetchmany(max_returned_rows + 1) ``` By capping the call to `.fetchmany()` at `max_returned_rows + 1` (the `+ 1` is to allow detection of whether or not there is a next page) I'm ensuring that Datasette never attempts to iterate over a huge result set. SQLite and the `sqlite3` library seem to handle this correctly. Here's an example: ```pycon >>> import sqlite3 >>> conn = sqlite3.connect("":memory:"") >>> cursor = conn.execute("""""" ... with recursive counter(x) as ( ... select 0 ... union ... select x + 1 from counter ... ) ... select * from counter"""""") >>> cursor.fetchmany(10) [(0,), (1,), (2,), (3,), (4,), (5,), (6,), (7,), (8,), (9,)] ``` `counter` there is an infinitely long table ([see TIL](https://til.simonwillison.net/sqlite/simple-recursive-cte)) - but we can retrieve the first 10 results without going into an infinite loop. 
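
The `+ 1` trick can be sketched like this. The helper name `fetch_page` and its return shape are assumptions for illustration and are not taken from Datasette's code:

```python
import sqlite3


def fetch_page(conn, sql, max_returned_rows=1000):
    # Hypothetical helper: ask for one row more than the page size
    cursor = conn.execute(sql)
    rows = cursor.fetchmany(max_returned_rows + 1)
    # An extra row means there is more data beyond this page
    truncated = len(rows) > max_returned_rows
    return rows[:max_returned_rows], truncated


conn = sqlite3.connect(':memory:')
counter_sql = '''
with recursive counter(x) as (
  select 0 union select x + 1 from counter
)
select * from counter
'''
rows, truncated = fetch_page(conn, counter_sql, max_returned_rows=10)
```

Here `rows` holds ten rows and `truncated` is true, even though the underlying table never ends.
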
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",459882902,Stream all results for arbitrary SQL and canned queries, https://github.com/simonw/datasette/issues/526#issuecomment-1258846992,https://api.github.com/repos/simonw/datasette/issues/526,1258846992,IC_kwDOBm6k_c5LCHsQ,9599,simonw,2022-09-27T01:21:41Z,2022-09-27T01:21:41Z,OWNER,"My main concern here is that public Datasette instances could easily have all of their available database connections consumed by long-running queries - either accidentally or deliberately. I do totally understand the need for this feature though. I think it can absolutely make sense provided it's protected by authentication and permissions. Maybe even limit the number of concurrent downloads at once such that there's always at least one database connection free for other requests.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",459882902,Stream all results for arbitrary SQL and canned queries, https://github.com/simonw/datasette/pull/1823#issuecomment-1258828705,https://api.github.com/repos/simonw/datasette/issues/1823,1258828705,IC_kwDOBm6k_c5LCDOh,9599,simonw,2022-09-27T00:45:46Z,2022-09-27T00:45:46Z,OWNER,Also need to do a bit more of an audit to see if there is anywhere else that this style should be applied.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1386917344,Keyword-only arguments for a bunch of internal methods, https://github.com/simonw/datasette/pull/1823#issuecomment-1258828509,https://api.github.com/repos/simonw/datasette/issues/1823,1258828509,IC_kwDOBm6k_c5LCDLd,9599,simonw,2022-09-27T00:45:26Z,2022-09-27T00:45:26Z,OWNER,I should update the documentation to reflect this change.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 
0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1386917344,Keyword-only arguments for a bunch of internal methods, https://github.com/simonw/datasette/issues/1822#issuecomment-1258827688,https://api.github.com/repos/simonw/datasette/issues/1822,1258827688,IC_kwDOBm6k_c5LCC-o,9599,simonw,2022-09-27T00:44:04Z,2022-09-27T00:44:04Z,OWNER,I'll do this in a PR.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1386854246,Switch to keyword-only arguments for a bunch of internal methods, https://github.com/simonw/datasette/issues/1817#issuecomment-1258818028,https://api.github.com/repos/simonw/datasette/issues/1817,1258818028,IC_kwDOBm6k_c5LCAns,9599,simonw,2022-09-27T00:27:53Z,2022-09-27T00:27:53Z,OWNER,"Made a start on this: ```diff diff --git a/datasette/hookspecs.py b/datasette/hookspecs.py index 34e19664..fe0971e5 100644 --- a/datasette/hookspecs.py +++ b/datasette/hookspecs.py @@ -31,25 +31,29 @@ def prepare_jinja2_environment(env, datasette): @hookspec -def extra_css_urls(template, database, table, columns, view_name, request, datasette): +def extra_css_urls( + template, database, table, columns, sql, params, view_name, request, datasette +): """"""Extra CSS URLs added by this plugin"""""" @hookspec -def extra_js_urls(template, database, table, columns, view_name, request, datasette): +def extra_js_urls( + template, database, table, columns, sql, params, view_name, request, datasette +): """"""Extra JavaScript URLs added by this plugin"""""" @hookspec def extra_body_script( - template, database, table, columns, view_name, request, datasette + template, database, table, columns, sql, params, view_name, request, datasette ): """"""Extra JavaScript code to be included in ' > dist/index.html # Run a server for that dist/ folder cd dist python3 -m http.server 8529 & cd .. 
shot-scraper javascript http://localhost:8529/ "" async () => { let pyodide = await loadPyodide(); await pyodide.loadPackage(['micropip', 'ssl', 'setuptools']); let output = await pyodide.runPythonAsync(\` import micropip await micropip.install('h11==0.12.0') await micropip.install('http://localhost:8529/$wheel') import ssl import setuptools from datasette.app import Datasette ds = Datasette(memory=True, settings={'num_sql_threads': 0}) (await ds.client.get('/_memory.json?sql=select+55+as+itworks&_shape=array')).text \`); if (JSON.parse(output)[0].itworks != 55) { throw 'Got ' + output + ', expected itworks: 55'; } return 'Test passed!'; } "" # Shut down the server pkill -f 'http.server 8529' ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1223459734,Automated test for Pyodide compatibility, https://github.com/simonw/datasette/issues/1733#issuecomment-1115404729,https://api.github.com/repos/simonw/datasette/issues/1733,1115404729,IC_kwDOBm6k_c5Ce7m5,9599,simonw,2022-05-02T21:49:01Z,2022-05-02T21:49:38Z,OWNER,"That alpha release works! https://pyodide.org/en/stable/console.html ```pycon Welcome to the Pyodide terminal emulator 🐍 Python 3.10.2 (main, Apr 9 2022 20:52:01) on WebAssembly VM Type ""help"", ""copyright"", ""credits"" or ""license"" for more information. 
>>> import micropip >>> await micropip.install(""datasette==0.62a0"") >>> import ssl >>> import setuptools >>> from datasette.app import Datasette >>> ds = Datasette(memory=True, settings={""num_sql_threads"": 0}) >>> await ds.client.get(""/.json"") >>> (await ds.client.get(""/.json"")).json() {'_memory': {'name': '_memory', 'hash': None, 'color': 'a6c7b9', 'path': '/_memory', 'tables_and_views_truncated': [], 'tab les_and_views_more': False, 'tables_count': 0, 'table_rows_sum': 0, 'show_table_row_counts': False, 'hidden_table_rows_sum' : 0, 'hidden_tables_count': 0, 'views_count': 0, 'private': False}} >>> ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1223234932,Get Datasette compatible with Pyodide, https://github.com/simonw/datasette/issues/1733#issuecomment-1115318417,https://api.github.com/repos/simonw/datasette/issues/1733,1115318417,IC_kwDOBm6k_c5CemiR,9599,simonw,2022-05-02T20:13:43Z,2022-05-02T20:13:43Z,OWNER,This is good enough to push an alpha.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1223234932,Get Datasette compatible with Pyodide, https://github.com/simonw/datasette/issues/1733#issuecomment-1115318303,https://api.github.com/repos/simonw/datasette/issues/1733,1115318303,IC_kwDOBm6k_c5Cemgf,9599,simonw,2022-05-02T20:13:36Z,2022-05-02T20:13:36Z,OWNER,"I got a build from the `pyodide` branch to work! ``` Welcome to the Pyodide terminal emulator 🐍 Python 3.10.2 (main, Apr 9 2022 20:52:01) on WebAssembly VM Type ""help"", ""copyright"", ""credits"" or ""license"" for more information. 
>>> import micropip >>> await micropip.install(""https://s3.amazonaws.com/simonwillison-cors-allowed-public/datasette-0.62a0-py3-none-any.whl"") Traceback (most recent call last): File """", line 1, in File ""/lib/python3.10/asyncio/futures.py"", line 284, in __await__ yield self # This tells Task to wait for completion. File ""/lib/python3.10/asyncio/tasks.py"", line 304, in __wakeup future.result() File ""/lib/python3.10/asyncio/futures.py"", line 201, in result raise self._exception File ""/lib/python3.10/asyncio/tasks.py"", line 234, in __step result = coro.throw(exc) File ""/lib/python3.10/site-packages/micropip/_micropip.py"", line 183, in install transaction = await self.gather_requirements(requirements, ctx, keep_going) File ""/lib/python3.10/site-packages/micropip/_micropip.py"", line 173, in gather_requirements await gather(*requirement_promises) File ""/lib/python3.10/asyncio/futures.py"", line 284, in __await__ yield self # This tells Task to wait for completion. File ""/lib/python3.10/asyncio/tasks.py"", line 304, in __wakeup future.result() File ""/lib/python3.10/asyncio/futures.py"", line 201, in result raise self._exception File ""/lib/python3.10/asyncio/tasks.py"", line 232, in __step result = coro.send(None) File ""/lib/python3.10/site-packages/micropip/_micropip.py"", line 245, in add_requirement await self.add_wheel(name, wheel, version, (), ctx, transaction) File ""/lib/python3.10/site-packages/micropip/_micropip.py"", line 316, in add_wheel await self.add_requirement(recurs_req, ctx, transaction) File ""/lib/python3.10/site-packages/micropip/_micropip.py"", line 291, in add_requirement await self.add_wheel( File ""/lib/python3.10/site-packages/micropip/_micropip.py"", line 316, in add_wheel await self.add_requirement(recurs_req, ctx, transaction) File ""/lib/python3.10/site-packages/micropip/_micropip.py"", line 291, in add_requirement await self.add_wheel( File ""/lib/python3.10/site-packages/micropip/_micropip.py"", line 316, in add_wheel 
await self.add_requirement(recurs_req, ctx, transaction) File ""/lib/python3.10/site-packages/micropip/_micropip.py"", line 276, in add_requirement raise ValueError( ValueError: Requested 'h11<0.13,>=0.11', but h11==0.13.0 is already installed >>> await micropip.install(""https://s3.amazonaws.com/simonwillison-cors-allowed-public/datasette-0.62a0-py3-none-any.whl"") Traceback (most recent call last): File """", line 1, in File ""/lib/python3.10/asyncio/futures.py"", line 284, in __await__ yield self # This tells Task to wait for completion. File ""/lib/python3.10/asyncio/tasks.py"", line 304, in __wakeup future.result() File ""/lib/python3.10/asyncio/futures.py"", line 201, in result raise self._exception File ""/lib/python3.10/asyncio/tasks.py"", line 234, in __step result = coro.throw(exc) File ""/lib/python3.10/site-packages/micropip/_micropip.py"", line 183, in install transaction = await self.gather_requirements(requirements, ctx, keep_going) File ""/lib/python3.10/site-packages/micropip/_micropip.py"", line 173, in gather_requirements await gather(*requirement_promises) File ""/lib/python3.10/asyncio/futures.py"", line 284, in __await__ yield self # This tells Task to wait for completion. 
File ""/lib/python3.10/asyncio/tasks.py"", line 304, in __wakeup future.result() File ""/lib/python3.10/asyncio/futures.py"", line 201, in result raise self._exception File ""/lib/python3.10/asyncio/tasks.py"", line 232, in __step result = coro.send(None) File ""/lib/python3.10/site-packages/micropip/_micropip.py"", line 245, in add_requirement await self.add_wheel(name, wheel, version, (), ctx, transaction) File ""/lib/python3.10/site-packages/micropip/_micropip.py"", line 316, in add_wheel await self.add_requirement(recurs_req, ctx, transaction) File ""/lib/python3.10/site-packages/micropip/_micropip.py"", line 291, in add_requirement await self.add_wheel( File ""/lib/python3.10/site-packages/micropip/_micropip.py"", line 316, in add_wheel await self.add_requirement(recurs_req, ctx, transaction) File ""/lib/python3.10/site-packages/micropip/_micropip.py"", line 291, in add_requirement await self.add_wheel( File ""/lib/python3.10/site-packages/micropip/_micropip.py"", line 316, in add_wheel await self.add_requirement(recurs_req, ctx, transaction) File ""/lib/python3.10/site-packages/micropip/_micropip.py"", line 276, in add_requirement raise ValueError( ValueError: Requested 'h11<0.13,>=0.11', but h11==0.13.0 is already installed >>> await micropip.install(""h11==0.12"") >>> await micropip.install(""https://s3.amazonaws.com/simonwillison-cors-allowed-public/datasette-0.62a0-py3-none-any.whl"") >>> import datasette >>> from datasette.app import Datasette Traceback (most recent call last): File """", line 1, in File ""/lib/python3.10/site-packages/datasette/app.py"", line 9, in import httpx File ""/lib/python3.10/site-packages/httpx/__init__.py"", line 2, in from ._api import delete, get, head, options, patch, post, put, request, stream File ""/lib/python3.10/site-packages/httpx/_api.py"", line 4, in from ._client import Client File ""/lib/python3.10/site-packages/httpx/_client.py"", line 9, in from ._auth import Auth, BasicAuth, FunctionAuth File 
""/lib/python3.10/site-packages/httpx/_auth.py"", line 10, in from ._models import Request, Response File ""/lib/python3.10/site-packages/httpx/_models.py"", line 16, in from ._content import ByteStream, UnattachedStream, encode_request, encode_response File ""/lib/python3.10/site-packages/httpx/_content.py"", line 17, in from ._multipart import MultipartStream File ""/lib/python3.10/site-packages/httpx/_multipart.py"", line 7, in from ._types import ( File ""/lib/python3.10/site-packages/httpx/_types.py"", line 5, in import ssl File ""/lib/python3.10/ssl.py"", line 98, in import _ssl # if we can't import it, let the error propagate ModuleNotFoundError: No module named '_ssl' >>> import ssl >>> from datasette.app import Datasette Traceback (most recent call last): File """", line 1, in File ""/lib/python3.10/site-packages/datasette/app.py"", line 14, in import pkg_resources ModuleNotFoundError: No module named 'pkg_resources' >>> import setuptools >>> from datasette.app import Datasette >>> ds = Datasette(memory=True) >>> ds >>> await ds.client.get(""/"") Traceback (most recent call last): File ""/lib/python3.10/site-packages/datasette/app.py"", line 1268, in route_path response = await view(request, send) File ""/lib/python3.10/site-packages/datasette/views/base.py"", line 134, in view return await self.dispatch_request(request) File ""/lib/python3.10/site-packages/datasette/views/base.py"", line 89, in dispatch_request await self.ds.refresh_schemas() File ""/lib/python3.10/site-packages/datasette/app.py"", line 353, in refresh_schemas await self._refresh_schemas() File ""/lib/python3.10/site-packages/datasette/app.py"", line 358, in _refresh_schemas await init_internal_db(internal_db) File ""/lib/python3.10/site-packages/datasette/utils/internal_db.py"", line 65, in init_internal_db await db.execute_write_script(create_tables_sql) File ""/lib/python3.10/site-packages/datasette/database.py"", line 116, in execute_write_script results = await 
self.execute_write_fn(_inner, block=block) File ""/lib/python3.10/site-packages/datasette/database.py"", line 155, in execute_write_fn self._write_thread.start() File ""/lib/python3.10/threading.py"", line 928, in start _start_new_thread(self._bootstrap, ()) RuntimeError: can't start new thread >>> ds = Datasette(memory=True, settings={""num_sql_threads"": 0}) >>> await ds.client.get(""/"") >>> (await ds.client.get(""/"")).text '\n\n\n Datasette: _memory\n \n \n\n\n\n
\n
\n\n\n\n \n\n\n\n
\n\n

Datasette

\n\n\n\n\n\n

r detailsClickedWithin = null;\n while (target && target.tagName != \'DETAILS\') {\n target = target.parentNode;\ n }\n if (target && target.tagName == \'DETAILS\') {\n detailsClickedWithin = target;\n }\n Array.from(d ocument.getElementsByTagName(\'details\')).filter(\n (details) => details.open && details != detailsClickedWithin\n ).forEach(details => details.open = false);\n});\n\n\n\n\n\n\n ' >>> ``` That `ValueError: Requested 'h11<0.13,>=0.11', but h11==0.13.0 is already installed` error is annoying. I assume it's a `uvicorn` dependency clash of some sort, because I wasn't getting that when I removed `uvicorn` as a dependency. I can avoid it by running this first though: await micropip.install(""h11==0.12"")","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1223234932,Get Datasette compatible with Pyodide, https://github.com/simonw/datasette/issues/1735#issuecomment-1115301733,https://api.github.com/repos/simonw/datasette/issues/1735,1115301733,IC_kwDOBm6k_c5Ceidl,9599,simonw,2022-05-02T19:57:19Z,2022-05-02T19:59:03Z,OWNER,"This code breaks if that setting is 0: https://github.com/simonw/datasette/blob/a29c1277896b6a7905ef5441c42a37bc15f67599/datasette/app.py#L291-L293 It's used here: https://github.com/simonw/datasette/blob/a29c1277896b6a7905ef5441c42a37bc15f67599/datasette/database.py#L188-L190","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1223263540,Datasette setting to disable threading (for Pyodide), https://github.com/simonw/datasette/issues/1733#issuecomment-1115288284,https://api.github.com/repos/simonw/datasette/issues/1733,1115288284,IC_kwDOBm6k_c5CefLc,9599,simonw,2022-05-02T19:40:33Z,2022-05-02T19:40:33Z,OWNER,"I'll release this as a `0.62a0` as soon as it's ready, so I can start testing it out in Pyodide for real.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, 
""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1223234932,Get Datasette compatible with Pyodide, https://github.com/simonw/datasette/issues/1734#issuecomment-1115283922,https://api.github.com/repos/simonw/datasette/issues/1734,1115283922,IC_kwDOBm6k_c5CeeHS,9599,simonw,2022-05-02T19:35:32Z,2022-05-02T19:35:32Z,OWNER,I'll use my original from 2009: https://www.djangosnippets.org/snippets/1431/,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1223241647,Remove python-baseconv dependency, https://github.com/simonw/datasette/issues/1734#issuecomment-1115282773,https://api.github.com/repos/simonw/datasette/issues/1734,1115282773,IC_kwDOBm6k_c5Ced1V,9599,simonw,2022-05-02T19:34:15Z,2022-05-02T19:34:15Z,OWNER,I'm going to vendor it and update the documentation.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1223241647,Remove python-baseconv dependency, https://github.com/simonw/datasette/issues/1733#issuecomment-1115278325,https://api.github.com/repos/simonw/datasette/issues/1733,1115278325,IC_kwDOBm6k_c5Cecv1,9599,simonw,2022-05-02T19:29:05Z,2022-05-02T19:29:05Z,OWNER,"I'm going to add a Datasette setting to disable threading entirely, designed for usage in this particular case. 
I thought about adding a new setting, then I noticed this: datasette mydatabase.db --setting num_sql_threads 10 I'm going to let users set that to `0` to disable threaded execution of SQL queries.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1223234932,Get Datasette compatible with Pyodide, https://github.com/simonw/datasette/issues/1733#issuecomment-1115268245,https://api.github.com/repos/simonw/datasette/issues/1733,1115268245,IC_kwDOBm6k_c5CeaSV,9599,simonw,2022-05-02T19:18:11Z,2022-05-02T19:18:11Z,OWNER,"Maybe I can leave `uvicorn` as a dependency? Installing it works OK, it only generates errors when you try to import it: ```pycon Welcome to the Pyodide terminal emulator 🐍 Python 3.10.2 (main, Apr 9 2022 20:52:01) on WebAssembly VM Type ""help"", ""copyright"", ""credits"" or ""license"" for more information. >>> import micropip >>> await micropip.install(""uvicorn"") >>> import uvicorn Traceback (most recent call last): File """", line 1, in File ""/lib/python3.10/site-packages/uvicorn/__init__.py"", line 1, in from uvicorn.config import Config File ""/lib/python3.10/site-packages/uvicorn/config.py"", line 8, in import ssl File ""/lib/python3.10/ssl.py"", line 98, in import _ssl # if we can't import it, let the error propagate ModuleNotFoundError: No module named '_ssl' >>> import ssl >>> import uvicorn Traceback (most recent call last): File """", line 1, in File ""/lib/python3.10/site-packages/uvicorn/__init__.py"", line 2, in from uvicorn.main import Server, main, run File ""/lib/python3.10/site-packages/uvicorn/main.py"", line 24, in from uvicorn.supervisors import ChangeReload, Multiprocess File ""/lib/python3.10/site-packages/uvicorn/supervisors/__init__.py"", line 3, in from uvicorn.supervisors.basereload import BaseReload File ""/lib/python3.10/site-packages/uvicorn/supervisors/basereload.py"", line 12, in from uvicorn.subprocess import get_subprocess File 
""/lib/python3.10/site-packages/uvicorn/subprocess.py"", line 14, in multiprocessing.allow_connection_pickling() File ""/lib/python3.10/multiprocessing/context.py"", line 170, in allow_connection_pickling from . import connection File ""/lib/python3.10/multiprocessing/connection.py"", line 21, in import _multiprocessing ModuleNotFoundError: No module named '_multiprocessing' >>> import multiprocessing >>> import uvicorn Traceback (most recent call last): File """", line 1, in File ""/lib/python3.10/site-packages/uvicorn/__init__.py"", line 2, in from uvicorn.main import Server, main, run File ""/lib/python3.10/site-packages/uvicorn/main.py"", line 24, in from uvicorn.supervisors import ChangeReload, Multiprocess File ""/lib/python3.10/site-packages/uvicorn/supervisors/__init__.py"", line 3, in from uvicorn.supervisors.basereload import BaseReload File ""/lib/python3.10/site-packages/uvicorn/supervisors/basereload.py"", line 12, in from uvicorn.subprocess import get_subprocess File ""/lib/python3.10/site-packages/uvicorn/subprocess.py"", line 14, in multiprocessing.allow_connection_pickling() File ""/lib/python3.10/multiprocessing/context.py"", line 170, in allow_connection_pickling from . import connection File ""/lib/python3.10/multiprocessing/connection.py"", line 21, in import _multiprocessing ModuleNotFoundError: No module named '_multiprocessing' >>> ``` Since the `import ssl` trick fixed the `_ssl` error I was hopeful that `import multiprocessing` could fix the `_multiprocessing` one, but sadly it did not. 
But it looks like I can address this issue just by making `import uvicorn` in `app.py` an optional import.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1223234932,Get Datasette compatible with Pyodide, https://github.com/simonw/datasette/issues/1733#issuecomment-1115262218,https://api.github.com/repos/simonw/datasette/issues/1733,1115262218,IC_kwDOBm6k_c5CeY0K,9599,simonw,2022-05-02T19:11:51Z,2022-05-02T19:14:01Z,OWNER,"Here's the full diff I applied to Datasette to get it fully working in Pyodide: https://github.com/simonw/datasette/compare/94a3171b01fde5c52697aeeff052e3ad4bab5391...8af32bc5b03c30b1f7a4a8cc4bd80eb7e2ee7b81 And as a visible diff: ```diff diff --git a/datasette/app.py b/datasette/app.py index d269372..6c0c5fc 100644 --- a/datasette/app.py +++ b/datasette/app.py @@ -15,7 +15,6 @@ import pkg_resources import re import secrets import sys -import threading import traceback import urllib.parse from concurrent import futures @@ -26,7 +25,6 @@ from itsdangerous import URLSafeSerializer from jinja2 import ChoiceLoader, Environment, FileSystemLoader, PrefixLoader from jinja2.environment import Template from jinja2.exceptions import TemplateNotFound -import uvicorn from .views.base import DatasetteError, ureg from .views.database import DatabaseDownload, DatabaseView @@ -813,7 +811,6 @@ class Datasette: }, ""datasette"": datasette_version, ""asgi"": ""3.0"", - ""uvicorn"": uvicorn.__version__, ""sqlite"": { ""version"": sqlite_version, ""fts_versions"": fts_versions, @@ -854,23 +851,7 @@ class Datasette: ] def _threads(self): - threads = list(threading.enumerate()) - d = { - ""num_threads"": len(threads), - ""threads"": [ - {""name"": t.name, ""ident"": t.ident, ""daemon"": t.daemon} for t in threads - ], - } - # Only available in Python 3.7+ - if hasattr(asyncio, ""all_tasks""): - tasks = asyncio.all_tasks() - d.update( - { - ""num_tasks"": len(tasks), - ""tasks"": 
[_cleaner_task_str(t) for t in tasks], - } - ) - return d + return {""num_threads"": 0, ""threads"": []} def _actor(self, request): return {""actor"": request.actor} diff --git a/datasette/database.py b/datasette/database.py index ba594a8..b50142d 100644 --- a/datasette/database.py +++ b/datasette/database.py @@ -4,7 +4,6 @@ from pathlib import Path import janus import queue import sys -import threading import uuid from .tracer import trace @@ -21,8 +20,6 @@ from .utils import ( ) from .inspect import inspect_hash -connections = threading.local() - AttachedDatabase = namedtuple(""AttachedDatabase"", (""seq"", ""name"", ""file"")) @@ -43,12 +40,12 @@ class Database: self.hash = None self.cached_size = None self._cached_table_counts = None - self._write_thread = None - self._write_queue = None if not self.is_mutable and not self.is_memory: p = Path(path) self.hash = inspect_hash(p) self.cached_size = p.stat().st_size + self._read_connection = None + self._write_connection = None @property def cached_table_counts(self): @@ -134,60 +131,17 @@ class Database: return results async def execute_write_fn(self, fn, block=True): - task_id = uuid.uuid5(uuid.NAMESPACE_DNS, ""datasette.io"") - if self._write_queue is None: - self._write_queue = queue.Queue() - if self._write_thread is None: - self._write_thread = threading.Thread( - target=self._execute_writes, daemon=True - ) - self._write_thread.start() - reply_queue = janus.Queue() - self._write_queue.put(WriteTask(fn, task_id, reply_queue)) - if block: - result = await reply_queue.async_q.get() - if isinstance(result, Exception): - raise result - else: - return result - else: - return task_id - - def _execute_writes(self): - # Infinite looping thread that protects the single write connection - # to this database - conn_exception = None - conn = None - try: - conn = self.connect(write=True) - self.ds._prepare_connection(conn, self.name) - except Exception as e: - conn_exception = e - while True: - task = 
self._write_queue.get() - if conn_exception is not None: - result = conn_exception - else: - try: - result = task.fn(conn) - except Exception as e: - sys.stderr.write(""{}\n"".format(e)) - sys.stderr.flush() - result = e - task.reply_queue.sync_q.put(result) + # We always treat it as if block=True now + if self._write_connection is None: + self._write_connection = self.connect(write=True) + self.ds._prepare_connection(self._write_connection, self.name) + return fn(self._write_connection) async def execute_fn(self, fn): - def in_thread(): - conn = getattr(connections, self.name, None) - if not conn: - conn = self.connect() - self.ds._prepare_connection(conn, self.name) - setattr(connections, self.name, conn) - return fn(conn) - - return await asyncio.get_event_loop().run_in_executor( - self.ds.executor, in_thread - ) + if self._read_connection is None: + self._read_connection = self.connect() + self.ds._prepare_connection(self._read_connection, self.name) + return fn(self._read_connection) async def execute( self, diff --git a/setup.py b/setup.py index 7f0562f..c41669c 100644 --- a/setup.py +++ b/setup.py @@ -44,20 +44,20 @@ setup( install_requires=[ ""asgiref>=3.2.10,<3.6.0"", ""click>=7.1.1,<8.2.0"", - ""click-default-group~=1.2.2"", + # ""click-default-group~=1.2.2"", ""Jinja2>=2.10.3,<3.1.0"", ""hupper~=1.9"", ""httpx>=0.20"", ""pint~=0.9"", ""pluggy>=1.0,<1.1"", - ""uvicorn~=0.11"", + # ""uvicorn~=0.11"", ""aiofiles>=0.4,<0.9"", ""janus>=0.6.2,<1.1"", ""asgi-csrf>=0.9"", ""PyYAML>=5.3,<7.0"", ""mergedeep>=1.1.1,<1.4.0"", ""itsdangerous>=1.1,<3.0"", - ""python-baseconv==1.2.2"", + # ""python-baseconv==1.2.2"", ], entry_points="""""" [console_scripts] ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1223234932,Get Datasette compatible with Pyodide, 
https://github.com/simonw/datasette/issues/1734#issuecomment-1115260999,https://api.github.com/repos/simonw/datasette/issues/1734,1115260999,IC_kwDOBm6k_c5CeYhH,9599,simonw,2022-05-02T19:10:34Z,2022-05-02T19:10:34Z,OWNER,"This is actually mostly a documentation thing, described here: https://docs.datasette.io/en/0.61.1/authentication.html#including-an-expiry-time In the code it's only used in these two places: https://github.com/simonw/datasette/blob/0a7621f96f8ad14da17e7172e8a7bce24ef78966/datasette/actor_auth_cookie.py#L16-L20 https://github.com/simonw/datasette/blob/0a7621f96f8ad14da17e7172e8a7bce24ef78966/tests/test_auth.py#L56-L60","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1223241647,Remove python-baseconv dependency, https://github.com/simonw/datasette/issues/1733#issuecomment-1115258737,https://api.github.com/repos/simonw/datasette/issues/1733,1115258737,IC_kwDOBm6k_c5CeX9x,9599,simonw,2022-05-02T19:08:17Z,2022-05-02T19:08:17Z,OWNER,"I was going to vendor `baseconv.py`, but then I reconsidered - what if there are plugins out there that expect `import baseconv` to work because they have depended on Datasette? I used https://cs.github.com/ and as far as I can tell there aren't any! So I'm going to remove that dependency and work out a smarter way to do this - probably by providing a utility function within Datasette itself.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1223234932,Get Datasette compatible with Pyodide, https://github.com/simonw/datasette/issues/1733#issuecomment-1115256318,https://api.github.com/repos/simonw/datasette/issues/1733,1115256318,IC_kwDOBm6k_c5CeXX-,9599,simonw,2022-05-02T19:05:55Z,2022-05-02T19:05:55Z,OWNER,"I released a `click-default-group-wheel` package to solve that dependency issue. 
I've already upgraded `sqlite-utils` to that, so now you can use that in Pyodide: - https://github.com/simonw/sqlite-utils/pull/429 `python-baseconv` is only used for actor cookie expiration times: https://github.com/simonw/datasette/blob/0a7621f96f8ad14da17e7172e8a7bce24ef78966/datasette/actor_auth_cookie.py#L16-L20 Datasette never actually sets that cookie itself - it instead encourages plugins to set it in the authentication documentation here: https://docs.datasette.io/en/0.61.1/authentication.html#including-an-expiry-time","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1223234932,Get Datasette compatible with Pyodide, https://github.com/simonw/sqlite-utils/pull/429#issuecomment-1115196863,https://api.github.com/repos/simonw/sqlite-utils/issues/429,1115196863,IC_kwDOCGYnMM5CeI2_,9599,simonw,2022-05-02T18:03:47Z,2022-05-02T18:52:42Z,OWNER,"I made a build of this branch and tested it like this: https://pyodide.org/en/stable/console.html ```pycon >>> import micropip >>> await micropip.install(""https://s3.amazonaws.com/simonwillison-cors-allowed-public/sqlite_utils-3.26-py3-none-any.whl"") >>> import sqlite_utils >>> db = sqlite_utils.Database(memory=True) >>> list(db.query(""select 32443 + 55"")) [{'32443 + 55': 32498}] ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1223177069,Depend on click-default-group-wheel, https://github.com/simonw/sqlite-utils/pull/429#issuecomment-1115197644,https://api.github.com/repos/simonw/sqlite-utils/issues/429,1115197644,IC_kwDOCGYnMM5CeJDM,9599,simonw,2022-05-02T18:04:28Z,2022-05-02T18:04:28Z,OWNER,I'm going to ship this straight away as `3.26.1`.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1223177069,Depend on click-default-group-wheel, 
https://github.com/simonw/datasette/issues/1727#issuecomment-1114058210,https://api.github.com/repos/simonw/datasette/issues/1727,1114058210,IC_kwDOBm6k_c5CZy3i,9599,simonw,2022-04-30T21:39:34Z,2022-04-30T21:39:34Z,OWNER,"Something to consider if I look into subprocesses for parallel query execution: https://sqlite.org/howtocorrupt.html#_carrying_an_open_database_connection_across_a_fork_ > Do not open an SQLite database connection, then fork(), then try to use that database connection in the child process. All kinds of locking problems will result and you can easily end up with a corrupt database. SQLite is not designed to support that kind of behavior. Any database connection that is used in a child process must be opened in the child process, not inherited from the parent. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117,Research: demonstrate if parallel SQL queries are worthwhile, https://github.com/simonw/datasette/issues/1729#issuecomment-1114038259,https://api.github.com/repos/simonw/datasette/issues/1729,1114038259,IC_kwDOBm6k_c5CZt_z,9599,simonw,2022-04-30T19:06:03Z,2022-04-30T19:06:03Z,OWNER,"> but actually the facet results would be better if they were a list rather than a dictionary I think `facet_results` in the JSON should match this (used by the HTML) instead: https://github.com/simonw/datasette/blob/942411ef946e9a34a2094944d3423cddad27efd3/datasette/views/table.py#L737-L741 ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1219385669,Implement ?_extra and new API design for TableView, https://github.com/simonw/datasette/issues/1729#issuecomment-1114036946,https://api.github.com/repos/simonw/datasette/issues/1729,1114036946,IC_kwDOBm6k_c5CZtrS,9599,simonw,2022-04-30T18:56:25Z,2022-04-30T19:04:03Z,OWNER,"Related: - #1558 Which talks about how there was confusion in this 
example: https://latest.datasette.io/fixtures/facetable.json?_facet=created&_facet_date=created&_facet=tags&_facet_array=tags&_nosuggest=1&_size=0 Which I fixed in #625 by introducing `tags` and `tags_2` keys, but actually the facet results would be better if they were a list rather than a dictionary.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1219385669,Implement ?_extra and new API design for TableView, https://github.com/simonw/datasette/issues/1729#issuecomment-1114037521,https://api.github.com/repos/simonw/datasette/issues/1729,1114037521,IC_kwDOBm6k_c5CZt0R,9599,simonw,2022-04-30T19:01:07Z,2022-04-30T19:01:07Z,OWNER,"I had to look up what `hideable` means - it means that you can't hide the current facet because it was defined in metadata, not as a `?_facet=` parameter: https://github.com/simonw/datasette/blob/4e47a2d894b96854348343374c8e97c9d7055cf6/datasette/facets.py#L228 That's a bit of a weird thing to expose in the API. Maybe change that to `source` so it can be `metadata` or `request`? 
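
A minimal sketch of that renaming, assuming the `hideable` flag is `False` exactly when the facet was defined in metadata. The `facets_as_list` helper is hypothetical and not part of Datasette; it also returns a list rather than a dictionary, as floated elsewhere in this issue:

```python
def facets_as_list(facet_results):
    # facet_results is assumed to be a dictionary keyed by facet name
    output = []
    for facet in facet_results.values():
        # Copy everything except the UI-oriented hideable flag
        item = {k: v for k, v in facet.items() if k != 'hideable'}
        # Assumption: hideable is False only for facets from metadata
        item['source'] = 'request' if facet.get('hideable') else 'metadata'
        output.append(item)
    return output


example = {'state': {'name': 'state', 'type': 'column', 'hideable': True}}
print(facets_as_list(example))
```
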
That's very slightly less coupled to how the UI works.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1219385669,Implement ?_extra and new API design for TableView, https://github.com/simonw/datasette/issues/1729#issuecomment-1114013757,https://api.github.com/repos/simonw/datasette/issues/1729,1114013757,IC_kwDOBm6k_c5CZoA9,9599,simonw,2022-04-30T16:15:51Z,2022-04-30T18:54:39Z,OWNER,"Deployed a preview of this here: https://latest-1-0-alpha.datasette.io/ Examples: - https://latest-1-0-alpha.datasette.io/fixtures/facetable.json - https://latest-1-0-alpha.datasette.io/fixtures/facetable.json?_facet=state&_size=0&_extra=facet_results&_extra=count Second example produces: ```json { ""rows"": [], ""next"": null, ""next_url"": null, ""count"": 15, ""facet_results"": { ""state"": { ""name"": ""state"", ""type"": ""column"", ""hideable"": true, ""toggle_url"": ""/fixtures/facetable.json?_size=0&_extra=facet_results&_extra=count"", ""results"": [ { ""value"": ""CA"", ""label"": ""CA"", ""count"": 10, ""toggle_url"": ""https://latest-1-0-alpha.datasette.io/fixtures/facetable.json?_facet=state&_size=0&_extra=facet_results&_extra=count&state=CA"", ""selected"": false }, { ""value"": ""MI"", ""label"": ""MI"", ""count"": 4, ""toggle_url"": ""https://latest-1-0-alpha.datasette.io/fixtures/facetable.json?_facet=state&_size=0&_extra=facet_results&_extra=count&state=MI"", ""selected"": false }, { ""value"": ""MC"", ""label"": ""MC"", ""count"": 1, ""toggle_url"": ""https://latest-1-0-alpha.datasette.io/fixtures/facetable.json?_facet=state&_size=0&_extra=facet_results&_extra=count&state=MC"", ""selected"": false } ], ""truncated"": false } } } ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1219385669,Implement ?_extra and new API design for TableView, 
https://github.com/simonw/datasette/issues/1727#issuecomment-1112889800,https://api.github.com/repos/simonw/datasette/issues/1727,1112889800,IC_kwDOBm6k_c5CVVnI,9599,simonw,2022-04-29T05:29:38Z,2022-04-29T05:29:38Z,OWNER,"OK, I just got the most incredible result with that! I started up a container running `bash` like this, from my `datasette` checkout. I'm mapping port 8005 on my laptop to port 8001 inside the container because laptop port 8001 was already doing something else: ``` docker run -it --rm --name my-running-script -p 8005:8001 -v ""$PWD"":/usr/src/myapp \ -w /usr/src/myapp nogil/python bash ``` Then in `bash` I ran the following commands to install Datasette and its dependencies: ``` pip install -e '.[test]' pip install datasette-pretty-traces # For debug tracing ``` Then I started Datasette against my `github.db` database (from github-to-sqlite.dogsheep.net/github.db) like this: ``` datasette github.db -h 0.0.0.0 --setting trace_debug 1 ``` I hit the following two URLs to compare the parallel vs. non-parallel implementations: - `http://127.0.0.1:8005/github/issues?_facet=milestone&_facet=repo&_trace=1&_size=10` - `http://127.0.0.1:8005/github/issues?_facet=milestone&_facet=repo&_trace=1&_size=10&_noparallel=1` And... the parallel one beat the non-parallel one decisively, on multiple page refreshes! Not parallel: 77ms Parallel: 47ms So yeah, I'm very confident this is a problem with the GIL. 
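
A rough sketch of this kind of serial vs. thread-pool comparison, using only the standard library. This is not how Datasette implements `_noparallel`; the table, the column names and the helper functions are made up for illustration:

```python
import os
import sqlite3
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor


def make_db(path, rows=1000):
    # Build a small throwaway table to run facet-style counts against
    conn = sqlite3.connect(path)
    conn.execute(
        'create table issues (id integer primary key, repo text, milestone text)'
    )
    conn.executemany(
        'insert into issues (repo, milestone) values (?, ?)',
        [('repo%d' % (i % 5), 'm%d' % (i % 3)) for i in range(rows)],
    )
    conn.commit()
    conn.close()


def facet(path, column):
    # Each call opens its own connection, so it can run in any thread.
    # The column name is interpolated directly: trusted names only.
    conn = sqlite3.connect(path)
    try:
        sql = 'select {col}, count(*) from issues group by {col} order by {col}'
        return conn.execute(sql.format(col=column)).fetchall()
    finally:
        conn.close()


def run_serial(path, columns):
    return [facet(path, c) for c in columns]


def run_parallel(path, columns):
    with ThreadPoolExecutor(max_workers=len(columns)) as pool:
        return list(pool.map(lambda c: facet(path, c), columns))


path = os.path.join(tempfile.mkdtemp(), 'demo.db')
make_db(path)
for runner in (run_serial, run_parallel):
    start = time.perf_counter()
    runner(path, ['milestone', 'repo'])
    elapsed_ms = (time.perf_counter() - start) * 1000
    print(runner.__name__, round(elapsed_ms, 2), 'ms')
```

Both functions return the same rows; only the timing differs, and on a standard CPython build the thread-pool version will not necessarily come out ahead.
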
And I am absolutely **stunned** that @colesbury's fork ran Datasette (which has some reasonably tricky threading and async stuff going on) out of the box!","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117,Research: demonstrate if parallel SQL queries are worthwhile, https://github.com/simonw/datasette/issues/1727#issuecomment-1112879463,https://api.github.com/repos/simonw/datasette/issues/1727,1112879463,IC_kwDOBm6k_c5CVTFn,9599,simonw,2022-04-29T05:03:58Z,2022-04-29T05:03:58Z,OWNER,"It would be _really_ fun to try running this with the in-development `nogil` Python from https://github.com/colesbury/nogil There's a Docker container for it: https://hub.docker.com/r/nogil/python It suggests you can run something like this: docker run -it --rm --name my-running-script -v ""$PWD"":/usr/src/myapp \ -w /usr/src/myapp nogil/python python your-daemon-or-script.py","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117,Research: demonstrate if parallel SQL queries are worthwhile, https://github.com/simonw/datasette/issues/1727#issuecomment-1112878955,https://api.github.com/repos/simonw/datasette/issues/1727,1112878955,IC_kwDOBm6k_c5CVS9r,9599,simonw,2022-04-29T05:02:40Z,2022-04-29T05:02:40Z,OWNER,"Here's a very useful (recent) article about how the GIL works and how to think about it: https://pythonspeed.com/articles/python-gil/ - via https://lobste.rs/s/9hj80j/when_python_can_t_thread_deep_dive_into_gil From that article: > For example, let's consider an extension module written in C or Rust that lets you talk to a PostgreSQL database server. > > Conceptually, handling a SQL query with this library will go through three steps: > > 1. Deserialize from Python to the internal library representation. Since this will be reading Python objects, it needs to hold the GIL. > 2. 
Send the query to the database server, and wait for a response. This doesn't need the GIL. > 3. Convert the response into Python objects. This needs the GIL again. > > As you can see, how much parallelism you can get depends on how much time is spent in each step. If the bulk of time is spent in step 2, you'll get parallelism there. But if, for example, you run a `SELECT` and get a large number of rows back, the library will need to create many Python objects, and step 3 will have to hold GIL for a while. That explains what I'm seeing here. I'm pretty convinced now that the reason I'm not getting a performance boost from parallel queries is that there's more time spent in Python code assembling the results than in SQLite C code executing the query.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117,Research: demonstrate if parallel SQL queries are worthwhile, https://github.com/simonw/datasette/issues/1729#issuecomment-1112734577,https://api.github.com/repos/simonw/datasette/issues/1729,1112734577,IC_kwDOBm6k_c5CUvtx,9599,simonw,2022-04-28T23:08:42Z,2022-04-28T23:08:42Z,OWNER,"That prototype is a very small amount of code so far: ```diff diff --git a/datasette/renderer.py b/datasette/renderer.py index 4508949..b600e1b 100644 --- a/datasette/renderer.py +++ b/datasette/renderer.py @@ -28,6 +28,10 @@ def convert_specific_columns_to_json(rows, columns, json_cols): def json_renderer(args, data, view_name): """"""Render a response as JSON"""""" + from pprint import pprint + + pprint(data) + status_code = 200 # Handle the _json= parameter which may modify data[""rows""] @@ -43,6 +47,41 @@ def json_renderer(args, data, view_name): if ""rows"" in data and not value_as_boolean(args.get(""_json_infinity"", ""0"")): data[""rows""] = [remove_infinites(row) for row in data[""rows""]] + # Start building the default JSON here + columns = data[""columns""] + next_url = 
data.get(""next_url"") + output = { + ""rows"": [dict(zip(columns, row)) for row in data[""rows""]], + ""next"": data[""next""], + ""next_url"": next_url, + } + + extras = set(args.getlist(""_extra"")) + + extras_map = { + # _extra= : data[field] + ""count"": ""filtered_table_rows_count"", + ""facet_results"": ""facet_results"", + ""suggested_facets"": ""suggested_facets"", + ""columns"": ""columns"", + ""primary_keys"": ""primary_keys"", + ""query_ms"": ""query_ms"", + ""query"": ""query"", + } + for extra_key, data_key in extras_map.items(): + if extra_key in extras: + output[extra_key] = data[data_key] + + body = json.dumps(output, cls=CustomJSONEncoder) + content_type = ""application/json; charset=utf-8"" + headers = {} + if next_url: + headers[""link""] = f'<{next_url}>; rel=""next""' + return Response( + body, status=status_code, headers=headers, content_type=content_type + ) + + # Deal with the _shape option shape = args.get(""_shape"", ""arrays"") # if there's an error, ignore the shape entirely ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1219385669,Implement ?_extra and new API design for TableView, https://github.com/simonw/datasette/issues/1729#issuecomment-1112732563,https://api.github.com/repos/simonw/datasette/issues/1729,1112732563,IC_kwDOBm6k_c5CUvOT,9599,simonw,2022-04-28T23:05:03Z,2022-04-28T23:05:03Z,OWNER,"OK, the prototype of this is looking really good - it's very pleasant to use. `http://127.0.0.1:8001/github_memory/issue_comments.json?_search=simon&_sort=id&_size=5&_extra=query_ms&_extra=count&_col=body` returns this: ```json { ""rows"": [ { ""id"": 338854988, ""body"": "" /database-name/table-name?name__contains=simon&sort=id+desc\r\n\r\nNote that if there's a column called \""sort\"" you can still do sort__exact=blah\r\n\r\n"" }, { ""id"": 346427794, ""body"": ""Thanks. 
There is a way to use pip to grab apsw, which also let's you configure it (flags to build extensions, use an internal sqlite, etc). Don't know how that works as a dependency for another package, though.\n\nOn November 22, 2017 11:38:06 AM EST, Simon Willison wrote:\n>I have a solution for FTS already, but I'm interested in apsw as a\n>mechanism for allowing custom virtual tables to be written in Python\n>(pysqlite only lets you write custom functions)\n>\n>Not having PyPI support is pretty tough though. I'm planning a\n>plugin/extension system which would be ideal for things like an\n>optional apsw mode, but that's a lot harder if apsw isn't in PyPI.\n>\n>-- \n>You are receiving this because you authored the thread.\n>Reply to this email directly or view it on GitHub:\n>https://github.com/simonw/datasette/issues/144#issuecomment-346405660\n"" }, { ""id"": 348252037, ""body"": ""WOW!\n\n--\nPaul Ford // (646) 369-7128 // @ftrain\n\nOn Thu, Nov 30, 2017 at 11:47 AM, Simon Willison \nwrote:\n\n> Remaining work on this now lives in a milestone:\n> https://github.com/simonw/datasette/milestone/6\n>\n> —\n> You are receiving this because you were mentioned.\n> Reply to this email directly, view it on GitHub\n> ,\n> or mute the thread\n> \n> .\n>\n"" }, { ""id"": 391141391, ""body"": ""I'm going to clean this up for consistency tomorrow morning so hold off\nmerging until then please\n\nOn Tue, May 22, 2018 at 6:34 PM, Simon Willison \nwrote:\n\n> Yeah let's try this without pysqlite3 and see if we still get the correct\n> version.\n>\n> —\n> You are receiving this because you authored the thread.\n> Reply to this email directly, view it on GitHub\n> , or mute\n> the thread\n> \n> .\n>\n"" }, { ""id"": 391355030, ""body"": ""No objections;\r\nIt's good to go @simonw\r\n\r\nOn Wed, 23 May 2018, 14:51 Simon Willison, wrote:\r\n\r\n> @r4vi any objections to me merging this?\r\n>\r\n> —\r\n> You are receiving this because you were mentioned.\r\n> Reply to this email directly, 
view it on GitHub\r\n> , or mute\r\n> the thread\r\n> \r\n> .\r\n>\r\n"" } ], ""next"": ""391355030,391355030"", ""next_url"": ""http://127.0.0.1:8001/github_memory/issue_comments.json?_search=simon&_size=5&_extra=query_ms&_extra=count&_col=body&_next=391355030%2C391355030&_sort=id"", ""count"": 57, ""query_ms"": 21.780223003588617 } ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1219385669,Implement ?_extra and new API design for TableView, https://github.com/simonw/datasette/issues/1729#issuecomment-1112730416,https://api.github.com/repos/simonw/datasette/issues/1729,1112730416,IC_kwDOBm6k_c5CUusw,9599,simonw,2022-04-28T23:01:21Z,2022-04-28T23:01:21Z,OWNER,"I'm not sure what to do about the `""truncated"": true/false` key. It's not really relevant to table results, since they are paginated whether or not you ask for them to be. It plays a role in query results, where you might run `select * from table` and get back 1000 results because Datasette truncates at that point rather than returning everything. Adding it to every table result and always setting it to `""truncated"": false` feels confusing. 
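
As a rough sketch of what truncation means for query results (a standalone illustration using the standard library `sqlite3` module, not Datasette's actual code; `run_truncated` and its `max_returned_rows` default are names made up here): fetch one row more than the limit, and report `truncated` only if that extra row came back.

```python
import sqlite3


def run_truncated(conn, sql, max_returned_rows=1000):
    # Hypothetical helper: fetch one extra row so we can tell whether
    # the query had more results than we are willing to return.
    cursor = conn.execute(sql)
    rows = cursor.fetchmany(max_returned_rows + 1)
    truncated = len(rows) > max_returned_rows
    return rows[:max_returned_rows], truncated


conn = sqlite3.connect(':memory:')
conn.execute('create table t (id integer primary key)')
conn.executemany('insert into t (id) values (?)', [(i,) for i in range(25)])

# 25 rows against a limit of 10: the result is cut off.
rows, truncated = run_truncated(conn, 'select * from t', max_returned_rows=10)
# 25 rows against a limit of 100: everything fits.
all_rows, all_truncated = run_truncated(conn, 'select * from t', max_returned_rows=100)
```

A paginated table page never hits this case in the same way, which is why the `truncated` key only carries information for arbitrary SQL queries.
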
I think I'm going to keep it exclusively in the default representation for the `/db?sql=...` query endpoint, and not return it at all for tables.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1219385669,Implement ?_extra and new API design for TableView, https://github.com/simonw/datasette/issues/1729#issuecomment-1112721321,https://api.github.com/repos/simonw/datasette/issues/1729,1112721321,IC_kwDOBm6k_c5CUsep,9599,simonw,2022-04-28T22:44:05Z,2022-04-28T22:44:14Z,OWNER,I may be able to implement this mostly in the `json_renderer()` function: https://github.com/simonw/datasette/blob/94a3171b01fde5c52697aeeff052e3ad4bab5391/datasette/renderer.py#L29-L34,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1219385669,Implement ?_extra and new API design for TableView, https://github.com/simonw/datasette/issues/1729#issuecomment-1112717745,https://api.github.com/repos/simonw/datasette/issues/1729,1112717745,IC_kwDOBm6k_c5CUrmx,9599,simonw,2022-04-28T22:38:39Z,2022-04-28T22:39:05Z,OWNER,"(I remain keen on the idea of shipping a plugin that restores the old default API shape to people who have written pre-Datasette-1.0 code against it, but I'll tackle that much later. I really like how jQuery has a culture of doing this.)","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1219385669,Implement ?_extra and new API design for TableView, https://github.com/simonw/datasette/issues/1729#issuecomment-1112717210,https://api.github.com/repos/simonw/datasette/issues/1729,1112717210,IC_kwDOBm6k_c5CUrea,9599,simonw,2022-04-28T22:37:37Z,2022-04-28T22:37:37Z,OWNER,"This means `filtered_table_rows_count` is going to become `count`. 
I had originally picked that terrible name to avoid confusion between the count of all rows in the table and the count of rows that were filtered. I'll add `?_extra=table_count` for getting back the full table count instead. I think `count` is clear enough!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1219385669,Implement ?_extra and new API design for TableView, https://github.com/simonw/datasette/issues/1729#issuecomment-1112716611,https://api.github.com/repos/simonw/datasette/issues/1729,1112716611,IC_kwDOBm6k_c5CUrVD,9599,simonw,2022-04-28T22:36:24Z,2022-04-28T22:36:24Z,OWNER,"Then I'm going to implement the following `?_extra=` options: - `?_extra=facet_results` - to see facet results - `?_extra=suggested_facets` - for suggested facets - `?_extra=count` - for the count of total rows - `?_extra=columns` - for a list of column names - `?_extra=primary_keys` - for a list of primary keys - `?_extra=query` - a `{""sql"": ""select ..."", ""params"": {}}` object I thought about having `?_extra=facet_results` returned automatically if the user specifies at least one `?_facet` - but that doesn't work for default facets configured in `metadata.json` - how can the user opt out of those being returned? So I'm going to say you don't see facets at all if you don't include `?_extra=facet_results`. 
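
For illustration, a client could build one of these URLs with repeated `?_extra=` parameters roughly like this. It's only a sketch using the standard library: `table_url` is a made-up helper, and the base URL is the local example instance used elsewhere in this thread.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def table_url(base, extras, **params):
    # Hypothetical helper: each extra becomes its own _extra= pair,
    # matching the repeated-parameter style of the options listed above.
    pairs = list(params.items()) + [('_extra', extra) for extra in extras]
    return base + '?' + urlencode(pairs)


url = table_url(
    'http://127.0.0.1:8001/github_memory/issue_comments.json',
    ['count', 'facet_results'],
    _size=5,
)
# Parse the query string back to confirm both extras survived.
requested = parse_qs(urlsplit(url).query)['_extra']
```

Leaving `facet_results` out of the `extras` list is then the opt-out: no facets come back unless they are asked for.
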
I'm tempted to add `?_extra=_all` to return everything, but I can decide if that's a good idea later.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1219385669,Implement ?_extra and new API design for TableView, https://github.com/simonw/datasette/issues/1729#issuecomment-1112713581,https://api.github.com/repos/simonw/datasette/issues/1729,1112713581,IC_kwDOBm6k_c5CUqlt,9599,simonw,2022-04-28T22:31:11Z,2022-04-28T22:31:11Z,OWNER,"I'm going to change the default API response to look like this: ```json { ""rows"": [ { ""pk"": 1, ""created"": ""2019-01-14 08:00:00"", ""planet_int"": 1, ""on_earth"": 1, ""state"": ""CA"", ""_city_id"": 1, ""_neighborhood"": ""Mission"", ""tags"": ""[\""tag1\"", \""tag2\""]"", ""complex_array"": ""[{\""foo\"": \""bar\""}]"", ""distinct_some_null"": ""one"", ""n"": ""n1"" }, { ""pk"": 2, ""created"": ""2019-01-14 08:00:00"", ""planet_int"": 1, ""on_earth"": 1, ""state"": ""CA"", ""_city_id"": 1, ""_neighborhood"": ""Dogpatch"", ""tags"": ""[\""tag1\"", \""tag3\""]"", ""complex_array"": ""[]"", ""distinct_some_null"": ""two"", ""n"": ""n2"" } ], ""next"": null, ""next_url"": null } ``` Basically https://latest.datasette.io/fixtures/facetable.json?_shape=objects but with just the `rows`, `next` and `next_url` fields returned by default.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1219385669,Implement ?_extra and new API design for TableView, https://github.com/simonw/datasette/issues/1715#issuecomment-1112711115,https://api.github.com/repos/simonw/datasette/issues/1715,1112711115,IC_kwDOBm6k_c5CUp_L,9599,simonw,2022-04-28T22:26:56Z,2022-04-28T22:26:56Z,OWNER,"I'm not going to use `asyncinject` in this refactor - at least not until I really need it. 
My research in these issues has put me off the idea (in favour of `asyncio.gather()` or even not trying for parallel execution at all): - #1727","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1212823665,Refactor TableView to use asyncinject, https://github.com/simonw/datasette/issues/1727#issuecomment-1112668411,https://api.github.com/repos/simonw/datasette/issues/1727,1112668411,IC_kwDOBm6k_c5CUfj7,9599,simonw,2022-04-28T21:25:34Z,2022-04-28T21:25:44Z,OWNER,"The two most promising theories at the moment, from here and Twitter and the SQLite forum, are: - SQLite is I/O bound - it generally only goes as fast as it can load data from disk. Multiple connections all competing for the same file on disk are going to end up blocked at the file system layer. But maybe this means in-memory databases will perform better? - It's the GIL. The sqlite3 C code may release the GIL, but the bits that do things like assembling `Row` objects to return still happen in Python, and that Python can only run on a single core. A couple of ways to research the in-memory theory: - Use a RAM disk on macOS (or Linux). https://stackoverflow.com/a/2033417/6083 has instructions - short version: hdiutil attach -nomount ram://$((2 * 1024 * 100)) diskutil eraseVolume HFS+ RAMDisk name-returned-by-previous-command (was `/dev/disk2` when I tried it) cd /Volumes/RAMDisk cp ~/fixtures.db . - Copy Datasette databases into an in-memory database on startup. I built a new plugin to do that here: https://github.com/simonw/datasette-copy-to-memory I need to do some more, better benchmarks using these different approaches. https://twitter.com/laurencerowe/status/1519780174560169987 also suggests: > Maybe try: > 1. Copy the sqlite file to /dev/shm and rerun (all in ram.) > 2. 
Create a CTE which calculates Fibonacci or similar so you can test something completely cpu bound (only return max value or something to avoid crossing between sqlite/Python.) I like that second idea a lot - I could use the mandelbrot example from https://www.sqlite.org/lang_with.html#outlandish_recursive_query_examples","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117,Research: demonstrate if parallel SQL queries are worthwhile, https://github.com/simonw/datasette/issues/1727#issuecomment-1111726586,https://api.github.com/repos/simonw/datasette/issues/1727,1111726586,IC_kwDOBm6k_c5CQ5n6,9599,simonw,2022-04-28T04:17:16Z,2022-04-28T04:19:31Z,OWNER,"I could experiment with the `await loop.run_in_executor(processpool_executor, fn)` mechanism described in https://stackoverflow.com/a/29147750 Code examples: https://cs.github.com/?scopeName=All+repos&scope=&q=run_in_executor+ProcessPoolExecutor","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117,Research: demonstrate if parallel SQL queries are worthwhile, https://github.com/simonw/datasette/issues/1727#issuecomment-1111725638,https://api.github.com/repos/simonw/datasette/issues/1727,1111725638,IC_kwDOBm6k_c5CQ5ZG,9599,simonw,2022-04-28T04:15:15Z,2022-04-28T04:15:15Z,OWNER,"Useful theory from Keith Medcalf https://sqlite.org/forum/forumpost/e363c69d3441172e > This is true, but the concurrency is limited to the execution which occurs with the GIL released (that is, in the native C sqlite3 library itself). Each row (for example) can be retrieved in parallel but ""constructing the python return objects for each row"" will be serialized (by the GIL). 
> > That is to say that if your have two python threads each with their own connection, and each one is performing a select that returns 1,000,000 rows (lets say that is 25% of the candidates for each select) then the difference in execution time between executing two python threads in parallel vs a single serial thead will not be much different (if even detectable at all). In fact it is possible that the multiple-threaded version takes longer to run both queries to completion because of the increased contention over a shared resource (the GIL). So maybe this is a GIL thing. I should test with some expensive SQL queries (maybe big aggregations against large tables) and see if I can spot an improvement there.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117,Research: demonstrate if parallel SQL queries are worthwhile, https://github.com/simonw/datasette/issues/1728#issuecomment-1111714665,https://api.github.com/repos/simonw/datasette/issues/1728,1111714665,IC_kwDOBm6k_c5CQ2tp,9599,simonw,2022-04-28T03:52:47Z,2022-04-28T03:52:58Z,OWNER,"Nice custom template/theme! Yeah, for that I'd recommend hosting elsewhere - on a regular VPS (I use `systemd` like this: https://docs.datasette.io/en/stable/deploying.html#running-datasette-using-systemd ) or using Fly if you want to run containers without managing a full server.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1218133366,Writable canned queries fail with useless non-error against immutable databases, https://github.com/simonw/datasette/issues/1728#issuecomment-1111708206,https://api.github.com/repos/simonw/datasette/issues/1728,1111708206,IC_kwDOBm6k_c5CQ1Iu,9599,simonw,2022-04-28T03:38:56Z,2022-04-28T03:38:56Z,OWNER,"In terms of this bug, there are a few potential fixes: 1. 
Detect the write to an immutable database and show the user a proper, meaningful error message in the red error box at the top of the page 2. Don't allow the user to even submit the form - show a message saying that this canned query is unavailable because the database cannot be written to 3. Don't even allow Datasette to start running at all - if there's a canned query configured in `metadata.yml` and the database it refers to is in `-i` immutable mode throw an error on startup I'm not keen on that last one because it would be frustrating if you couldn't launch Datasette just because you had an old canned query lying around in your metadata file. So I'm leaning towards option 2.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1218133366,Writable canned queries fail with useless non-error against immutable databases, https://github.com/simonw/datasette/issues/1728#issuecomment-1111707384,https://api.github.com/repos/simonw/datasette/issues/1728,1111707384,IC_kwDOBm6k_c5CQ074,9599,simonw,2022-04-28T03:36:46Z,2022-04-28T03:36:56Z,OWNER,"A more realistic solution (which I've been using on several of my own projects) is to keep the data itself in GitHub and encourage users to edit it there - using the GitHub web interface to edit YAML files or similar. Needs your users to be comfortable hand-editing YAML though! You can at least guard against critical errors by having CI run tests against their YAML before deploying. I have a dream of building a more friendly web forms interface which edits the YAML back on GitHub for the user, but that's just a concept at the moment. 
Even more fun would be if a user-friendly form could submit PRs for review without the user having to know what a PR is!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1218133366,Writable canned queries fail with useless non-error against immutable databases, https://github.com/simonw/datasette/issues/1728#issuecomment-1111706519,https://api.github.com/repos/simonw/datasette/issues/1728,1111706519,IC_kwDOBm6k_c5CQ0uX,9599,simonw,2022-04-28T03:34:49Z,2022-04-28T03:34:49Z,OWNER,"I've wanted to do stuff like that on Cloud Run too. So far I've assumed that it's not feasible, but recently I've been wondering how hard it would be to have a small (like less than 100KB or so) Datasette instance which persists data to a backing GitHub repository such that when it starts up it can pull the latest copy and any time someone edits it can push their changes. I'm still not sure it would work well on Cloud Run due to the uncertainty at what would happen if Cloud Run decided to boot up a second instance - but it's still an interesting thought exercise.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1218133366,Writable canned queries fail with useless non-error against immutable databases, https://github.com/simonw/datasette/issues/1728#issuecomment-1111705069,https://api.github.com/repos/simonw/datasette/issues/1728,1111705069,IC_kwDOBm6k_c5CQ0Xt,9599,simonw,2022-04-28T03:31:33Z,2022-04-28T03:31:33Z,OWNER,"Confirmed - this is a bug where immutable databases fail to show a useful error if you write to them with a canned query. 
Steps to reproduce: ``` echo ' databases: writable: queries: add_name: sql: insert into names(name) values (:name) write: true ' > write-metadata.yml echo '{""name"": ""Simon""}' | sqlite-utils insert writable.db names - datasette writable.db -m write-metadata.yml ``` Then visit http://127.0.0.1:8001/writable/add_name - adding names works. Now do this instead: ``` datasette -i writable.db -m write-metadata.yml ``` And I'm getting a broken error: ![error](https://user-images.githubusercontent.com/9599/165670823-6604dd69-9905-475c-8098-5da22ab026a1.gif) ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1218133366,Writable canned queries fail with useless non-error against immutable databases, https://github.com/simonw/datasette/issues/1727#issuecomment-1111699175,https://api.github.com/repos/simonw/datasette/issues/1727,1111699175,IC_kwDOBm6k_c5CQy7n,9599,simonw,2022-04-28T03:19:48Z,2022-04-28T03:20:08Z,OWNER,"I ran `py-spy` and then hammered refresh a bunch of times on the `http://127.0.0.1:8856/github/commits?_facet=repo&_facet=committer&_trace=1&_noparallel=` page - it generated this SVG profile for me. 
The area on the right is the threads running the DB queries: ![profile](https://user-images.githubusercontent.com/9599/165669677-5461ede5-3dc4-4b49-8319-bfe5fd8a723d.svg) Interactive version here: https://static.simonwillison.net/static/2022/datasette-parallel-profile.svg","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117,Research: demonstrate if parallel SQL queries are worthwhile, https://github.com/simonw/datasette/issues/1728#issuecomment-1111698307,https://api.github.com/repos/simonw/datasette/issues/1728,1111698307,IC_kwDOBm6k_c5CQyuD,9599,simonw,2022-04-28T03:18:02Z,2022-04-28T03:18:02Z,OWNER,If the behaviour you are seeing is because the database is running in immutable mode then that's a bug - you should get a useful error message instead!,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1218133366,Writable canned queries fail with useless non-error against immutable databases, https://github.com/simonw/datasette/issues/1728#issuecomment-1111697985,https://api.github.com/repos/simonw/datasette/issues/1728,1111697985,IC_kwDOBm6k_c5CQypB,9599,simonw,2022-04-28T03:17:20Z,2022-04-28T03:17:20Z,OWNER,"How did you deploy to Cloud Run? `datasette publish cloudrun` defaults to running databases there in `-i` immutable mode, because if you managed to change a file on disk on Cloud Run those changes would be lost the next time your container restarted there. 
That's why I upgraded `datasette-publish-fly` to provide a way of working with their volumes support - they're the best option I know of right now for running Datasette in a container with a persistent volume that can accept writes: https://simonwillison.net/2022/Feb/15/fly-volumes/","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1218133366,Writable canned queries fail with useless non-error against immutable databases, https://github.com/simonw/datasette/issues/1727#issuecomment-1111683539,https://api.github.com/repos/simonw/datasette/issues/1727,1111683539,IC_kwDOBm6k_c5CQvHT,9599,simonw,2022-04-28T02:47:57Z,2022-04-28T02:47:57Z,OWNER,"Maybe this is the Python GIL after all? I've been hoping that the GIL won't be an issue because the `sqlite3` module releases the GIL for the duration of the execution of a SQL query - see https://github.com/python/cpython/blob/f348154c8f8a9c254503306c59d6779d4d09b3a9/Modules/_sqlite/cursor.c#L749-L759 So I've been hoping this means that SQLite code itself can run concurrently on multiple cores even when Python threads cannot. 
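
A rough way to test that assumption (a sketch only, not something from the Datasette codebase; the recursive CTE, `run_query`, `timed` and the choice of four threads are all arbitrary illustration choices) is to run a CPU-bound query that returns a single value, once serially and once across threads with one connection each, and compare the timings:

```python
import sqlite3
import time
from concurrent.futures import ThreadPoolExecutor

# CPU-bound query: a recursive CTE that counts up inside SQLite and
# returns a single value, so almost no Python objects get created.
SQL = '''
with recursive counter(n) as (
    select 1
    union all
    select n + 1 from counter where n < 300000
)
select max(n) from counter
'''


def run_query(_):
    # Each call opens its own connection, so threads never share one.
    conn = sqlite3.connect(':memory:')
    try:
        return conn.execute(SQL).fetchone()[0]
    finally:
        conn.close()


def timed(fn):
    # Return the function result plus elapsed wall-clock seconds.
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


serial, serial_s = timed(lambda: [run_query(i) for i in range(4)])
with ThreadPoolExecutor(max_workers=4) as pool:
    threaded, threaded_s = timed(lambda: list(pool.map(run_query, range(4))))

print(f'serial: {serial_s:.3f}s  threaded: {threaded_s:.3f}s')
```

If the threaded time comes out well below the serial time, the SQLite C code really is running concurrently; if the two are about the same, something else is serializing the work.
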
But maybe I'm misunderstanding how that works?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117,Research: demonstrate if parallel SQL queries are worthwhile, https://github.com/simonw/datasette/issues/1727#issuecomment-1111681513,https://api.github.com/repos/simonw/datasette/issues/1727,1111681513,IC_kwDOBm6k_c5CQunp,9599,simonw,2022-04-28T02:44:26Z,2022-04-28T02:44:26Z,OWNER,"I could try `py-spy top`, which I previously used here: - https://github.com/simonw/datasette/issues/1673","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117,Research: demonstrate if parallel SQL queries are worthwhile, https://github.com/simonw/datasette/issues/1727#issuecomment-1111661331,https://api.github.com/repos/simonw/datasette/issues/1727,1111661331,IC_kwDOBm6k_c5CQpsT,9599,simonw,2022-04-28T02:07:31Z,2022-04-28T02:07:31Z,OWNER,Asked on the SQLite forum about this here: https://sqlite.org/forum/forumpost/ffbfa9f38e,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117,Research: demonstrate if parallel SQL queries are worthwhile, https://github.com/simonw/datasette/issues/1727#issuecomment-1111602802,https://api.github.com/repos/simonw/datasette/issues/1727,1111602802,IC_kwDOBm6k_c5CQbZy,9599,simonw,2022-04-28T00:21:35Z,2022-04-28T00:21:35Z,OWNER,"Tried this but I'm getting back an empty JSON array of traces at the bottom of the page most of the time (intermittently it works correctly): ```diff diff --git a/datasette/database.py b/datasette/database.py index ba594a8..d7f9172 100644 --- a/datasette/database.py +++ b/datasette/database.py @@ -7,7 +7,7 @@ import sys import threading import uuid -from .tracer import trace +from .tracer import trace, trace_child_tasks from .utils import ( detect_fts, 
detect_primary_keys, @@ -207,30 +207,31 @@ class Database: time_limit_ms = custom_time_limit with sqlite_timelimit(conn, time_limit_ms): - try: - cursor = conn.cursor() - cursor.execute(sql, params if params is not None else {}) - max_returned_rows = self.ds.max_returned_rows - if max_returned_rows == page_size: - max_returned_rows += 1 - if max_returned_rows and truncate: - rows = cursor.fetchmany(max_returned_rows + 1) - truncated = len(rows) > max_returned_rows - rows = rows[:max_returned_rows] - else: - rows = cursor.fetchall() - truncated = False - except (sqlite3.OperationalError, sqlite3.DatabaseError) as e: - if e.args == (""interrupted"",): - raise QueryInterrupted(e, sql, params) - if log_sql_errors: - sys.stderr.write( - ""ERROR: conn={}, sql = {}, params = {}: {}\n"".format( - conn, repr(sql), params, e + with trace(""sql"", database=self.name, sql=sql.strip(), params=params): + try: + cursor = conn.cursor() + cursor.execute(sql, params if params is not None else {}) + max_returned_rows = self.ds.max_returned_rows + if max_returned_rows == page_size: + max_returned_rows += 1 + if max_returned_rows and truncate: + rows = cursor.fetchmany(max_returned_rows + 1) + truncated = len(rows) > max_returned_rows + rows = rows[:max_returned_rows] + else: + rows = cursor.fetchall() + truncated = False + except (sqlite3.OperationalError, sqlite3.DatabaseError) as e: + if e.args == (""interrupted"",): + raise QueryInterrupted(e, sql, params) + if log_sql_errors: + sys.stderr.write( + ""ERROR: conn={}, sql = {}, params = {}: {}\n"".format( + conn, repr(sql), params, e + ) ) - ) - sys.stderr.flush() - raise + sys.stderr.flush() + raise if truncate: return Results(rows, truncated, cursor.description) @@ -238,9 +239,8 @@ class Database: else: return Results(rows, False, cursor.description) - with trace(""sql"", database=self.name, sql=sql.strip(), params=params): - results = await self.execute_fn(sql_operation_in_thread) - return results + with trace_child_tasks(): + 
return await self.execute_fn(sql_operation_in_thread) @property def size(self): ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117,Research: demonstrate if parallel SQL queries are worthwhile, https://github.com/simonw/datasette/issues/1727#issuecomment-1111597176,https://api.github.com/repos/simonw/datasette/issues/1727,1111597176,IC_kwDOBm6k_c5CQaB4,9599,simonw,2022-04-28T00:11:44Z,2022-04-28T00:11:44Z,OWNER,"Though it would be interesting to also have the trace reveal how much time is spent in the functions that wrap that core SQL - the stuff that is being measured at the moment. I have a hunch that this could help solve the over-arching performance mystery.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117,Research: demonstrate if parallel SQL queries are worthwhile, https://github.com/simonw/datasette/issues/1727#issuecomment-1111595319,https://api.github.com/repos/simonw/datasette/issues/1727,1111595319,IC_kwDOBm6k_c5CQZk3,9599,simonw,2022-04-28T00:09:45Z,2022-04-28T00:11:01Z,OWNER,"Here's where read queries are instrumented: https://github.com/simonw/datasette/blob/7a6654a253dee243518dc542ce4c06dbb0d0801d/datasette/database.py#L241-L242 So the instrumentation is actually capturing quite a bit of Python activity before it gets to SQLite: https://github.com/simonw/datasette/blob/7a6654a253dee243518dc542ce4c06dbb0d0801d/datasette/database.py#L179-L190 And then: https://github.com/simonw/datasette/blob/7a6654a253dee243518dc542ce4c06dbb0d0801d/datasette/database.py#L204-L233 Ideally I'd like that `trace()` block to wrap just the `cursor.execute()` and `cursor.fetchmany(...)` or `cursor.fetchall()` calls.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117,Research: demonstrate if 
parallel SQL queries are worthwhile, https://github.com/simonw/datasette/issues/1727#issuecomment-1111558204,https://api.github.com/repos/simonw/datasette/issues/1727,1111558204,IC_kwDOBm6k_c5CQQg8,9599,simonw,2022-04-27T22:58:39Z,2022-04-27T22:58:39Z,OWNER,"I should check my timing mechanism. Am I capturing the time taken just in SQLite or does it include time spent in Python crossing between async and threaded world and waiting for a thread pool worker to become available? That could explain the longer query times.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117,Research: demonstrate if parallel SQL queries are worthwhile, https://github.com/simonw/datasette/issues/1727#issuecomment-1111553029,https://api.github.com/repos/simonw/datasette/issues/1727,1111553029,IC_kwDOBm6k_c5CQPQF,9599,simonw,2022-04-27T22:48:21Z,2022-04-27T22:48:21Z,OWNER,I wonder if it would be worth exploring multiprocessing here.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117,Research: demonstrate if parallel SQL queries are worthwhile, https://github.com/simonw/datasette/issues/1727#issuecomment-1111551076,https://api.github.com/repos/simonw/datasette/issues/1727,1111551076,IC_kwDOBm6k_c5CQOxk,9599,simonw,2022-04-27T22:44:51Z,2022-04-27T22:45:04Z,OWNER,Really wild idea: what if I created three copies of the SQLite database file - as three separate file names - and then balanced the parallel queries across all these? 
Any chance that could avoid any mysterious locking issues?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117,Research: demonstrate if parallel SQL queries are worthwhile, https://github.com/simonw/datasette/issues/1727#issuecomment-1111535818,https://api.github.com/repos/simonw/datasette/issues/1727,1111535818,IC_kwDOBm6k_c5CQLDK,9599,simonw,2022-04-27T22:18:45Z,2022-04-27T22:18:45Z,OWNER,"Another avenue: https://twitter.com/weargoggles/status/1519426289920270337 > SQLite has its own mutexes to provide thread safety, which as another poster noted are out of play in multi process setups. Perhaps downgrading from the “serializable” to “multi-threaded” safety would be okay for Datasette? https://sqlite.org/c3ref/c_config_covering_index_scan.html#sqliteconfigmultithread Doesn't look like there's an obvious way to access that from Python via the `sqlite3` module though.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117,Research: demonstrate if parallel SQL queries are worthwhile, https://github.com/simonw/datasette/issues/1727#issuecomment-1111485722,https://api.github.com/repos/simonw/datasette/issues/1727,1111485722,IC_kwDOBm6k_c5CP-0a,9599,simonw,2022-04-27T21:08:20Z,2022-04-27T21:08:20Z,OWNER,"Tried that and it didn't seem to make a difference either. 
I really need a much deeper view of what's going on here.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117,Research: demonstrate if parallel SQL queries are worthwhile, https://github.com/simonw/datasette/issues/1727#issuecomment-1111462442,https://api.github.com/repos/simonw/datasette/issues/1727,1111462442,IC_kwDOBm6k_c5CP5Iq,9599,simonw,2022-04-27T20:40:59Z,2022-04-27T20:42:49Z,OWNER,"This looks VERY relevant: [SQLite Shared-Cache Mode](https://www.sqlite.org/sharedcache.html): > SQLite includes a special ""shared-cache"" mode (disabled by default) intended for use in embedded servers. If shared-cache mode is enabled and a thread establishes multiple connections to the same database, the connections share a single data and schema cache. This can significantly reduce the quantity of memory and IO required by the system. Enabled as part of the URI filename: ATTACH 'file:aux.db?cache=shared' AS aux; Turns out I'm already using this for in-memory databases that have `.memory_name` set, but not (yet) for regular file-backed databases: https://github.com/simonw/datasette/blob/7a6654a253dee243518dc542ce4c06dbb0d0801d/datasette/database.py#L73-L75 ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117,Research: demonstrate if parallel SQL queries are worthwhile, https://github.com/simonw/datasette/issues/1727#issuecomment-1111460068,https://api.github.com/repos/simonw/datasette/issues/1727,1111460068,IC_kwDOBm6k_c5CP4jk,9599,simonw,2022-04-27T20:38:32Z,2022-04-27T20:38:32Z,OWNER,WAL mode didn't seem to make a difference. 
I thought there was a chance it might help multiple read connections operate at the same time but it looks like it really does only matter for when writes are going on.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117,Research: demonstrate if parallel SQL queries are worthwhile, https://github.com/simonw/datasette/issues/1727#issuecomment-1111456500,https://api.github.com/repos/simonw/datasette/issues/1727,1111456500,IC_kwDOBm6k_c5CP3r0,9599,simonw,2022-04-27T20:36:01Z,2022-04-27T20:36:01Z,OWNER,"Yeah all of this is pretty much assuming read-only connections. Datasette has a separate mechanism for ensuring that writes are executed one at a time against a dedicated connection from an in-memory queue: - https://github.com/simonw/datasette/issues/682","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117,Research: demonstrate if parallel SQL queries are worthwhile, https://github.com/simonw/datasette/issues/1727#issuecomment-1111442012,https://api.github.com/repos/simonw/datasette/issues/1727,1111442012,IC_kwDOBm6k_c5CP0Jc,9599,simonw,2022-04-27T20:19:00Z,2022-04-27T20:19:00Z,OWNER,"Something worth digging into: are these parallel queries running against the same SQLite connection or are they each running against a separate SQLite connection? Just realized I know the answer: they're running against separate SQLite connections, because that's how the time limit mechanism works: it installs a progress handler for each connection which terminates it after a set time. This means that if SQLite benefits from multiple threads using the same connection (due to shared caches or similar) then Datasette will not be seeing those benefits. 
It also means that if there's some mechanism within SQLite that penalizes you for having multiple parallel connections to a single file (just guessing here, maybe there's some kind of locking going on?) then Datasette will suffer those penalties. I should try seeing what happens with WAL mode enabled.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117,Research: demonstrate if parallel SQL queries are worthwhile, https://github.com/simonw/datasette/issues/1727#issuecomment-1111432375,https://api.github.com/repos/simonw/datasette/issues/1727,1111432375,IC_kwDOBm6k_c5CPxy3,9599,simonw,2022-04-27T20:07:57Z,2022-04-27T20:07:57Z,OWNER,Also useful: https://avi.im/blag/2021/fast-sqlite-inserts/ - from a tip on Twitter: https://twitter.com/ricardoanderegg/status/1519402047556235264,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117,Research: demonstrate if parallel SQL queries are worthwhile, https://github.com/simonw/datasette/issues/1727#issuecomment-1111431785,https://api.github.com/repos/simonw/datasette/issues/1727,1111431785,IC_kwDOBm6k_c5CPxpp,9599,simonw,2022-04-27T20:07:16Z,2022-04-27T20:07:16Z,OWNER,"I think I need some much more in-depth tracing tricks for this. 
https://www.maartenbreddels.com/perf/jupyter/python/tracing/gil/2021/01/14/Tracing-the-Python-GIL.html looks relevant - uses the `perf` tool on Linux.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117,Research: demonstrate if parallel SQL queries are worthwhile, https://github.com/simonw/datasette/issues/1727#issuecomment-1111408273,https://api.github.com/repos/simonw/datasette/issues/1727,1111408273,IC_kwDOBm6k_c5CPr6R,9599,simonw,2022-04-27T19:40:51Z,2022-04-27T19:42:17Z,OWNER,"Relevant: here's the code that sets up a Datasette SQLite connection: https://github.com/simonw/datasette/blob/7a6654a253dee243518dc542ce4c06dbb0d0801d/datasette/database.py#L73-L96 It's using `check_same_thread=False` - here's [the Python docs on that](https://docs.python.org/3/library/sqlite3.html#sqlite3.connect): > By default, *check_same_thread* is [`True`](https://docs.python.org/3/library/constants.html#True ""True"") and only the creating thread may use the connection. If set [`False`](https://docs.python.org/3/library/constants.html#False ""False""), the returned connection may be shared across multiple threads. When using multiple threads with the same connection writing operations should be serialized by the user to avoid data corruption. 
This is why Datasette reserves a single connection for write queries and queues them up in memory, [as described here](https://simonwillison.net/2020/Feb/26/weeknotes-datasette-writes/).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117,Research: demonstrate if parallel SQL queries are worthwhile, https://github.com/simonw/datasette/issues/1727#issuecomment-1111390433,https://api.github.com/repos/simonw/datasette/issues/1727,1111390433,IC_kwDOBm6k_c5CPnjh,9599,simonw,2022-04-27T19:21:02Z,2022-04-27T19:21:02Z,OWNER,"One weird thing: I noticed that in the parallel trace above the SQL query bars are wider. Mouseover shows duration in ms, and I got 13ms for this query: select message as value, count(*) as n from ( But in the `?_noparallel=1` version that same query took 2.97ms. Given those numbers though I would expect the overall page time to be MUCH worse for the parallel version - but the page load times are instead very close to each other, with parallel often winning. This is super-weird.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117,Research: demonstrate if parallel SQL queries are worthwhile, https://github.com/simonw/datasette/issues/1727#issuecomment-1111385875,https://api.github.com/repos/simonw/datasette/issues/1727,1111385875,IC_kwDOBm6k_c5CPmcT,9599,simonw,2022-04-27T19:16:57Z,2022-04-27T19:16:57Z,OWNER,"I just remembered the `--setting num_sql_threads` option... which defaults to 3! 
https://github.com/simonw/datasette/blob/942411ef946e9a34a2094944d3423cddad27efd3/datasette/app.py#L109-L113 Would explain why the first trace never seems to show more than three SQL queries executing at once.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117,Research: demonstrate if parallel SQL queries are worthwhile, https://github.com/simonw/datasette/issues/1727#issuecomment-1111380282,https://api.github.com/repos/simonw/datasette/issues/1727,1111380282,IC_kwDOBm6k_c5CPlE6,9599,simonw,2022-04-27T19:10:27Z,2022-04-27T19:10:27Z,OWNER,"Wrote more about that here: https://simonwillison.net/2022/Apr/27/parallel-queries/ Compare https://latest-with-plugins.datasette.io/github/commits?_facet=repo&_facet=committer&_trace=1 ![image](https://user-images.githubusercontent.com/9599/165601503-2083c5d2-d740-405c-b34d-85570744ca82.png) With the same thing but with parallel execution disabled: https://latest-with-plugins.datasette.io/github/commits?_facet=repo&_facet=committer&_trace=1&_noparallel=1 ![image](https://user-images.githubusercontent.com/9599/165601525-98abbfb1-5631-4040-b6bd-700948d1db6e.png) Those total page load time numbers are very similar. Is this parallel optimization worthwhile? Maybe it's only worth it on larger databases? 
Or maybe larger databases perform worse with this?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117,Research: demonstrate if parallel SQL queries are worthwhile, https://github.com/simonw/datasette/issues/1724#issuecomment-1110585475,https://api.github.com/repos/simonw/datasette/issues/1724,1110585475,IC_kwDOBm6k_c5CMjCD,9599,simonw,2022-04-27T06:15:14Z,2022-04-27T06:15:14Z,OWNER,"Yeah, that page is 438K (but only 20K gzipped).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1216619276,?_trace=1 doesn't work on Global Power Plants demo, https://github.com/simonw/datasette/issues/1724#issuecomment-1110370095,https://api.github.com/repos/simonw/datasette/issues/1724,1110370095,IC_kwDOBm6k_c5CLucv,9599,simonw,2022-04-27T00:18:30Z,2022-04-27T00:18:30Z,OWNER,"So this isn't a bug here, it's working as intended.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1216619276,?_trace=1 doesn't work on Global Power Plants demo, https://github.com/simonw/datasette/issues/1724#issuecomment-1110369004,https://api.github.com/repos/simonw/datasette/issues/1724,1110369004,IC_kwDOBm6k_c5CLuLs,9599,simonw,2022-04-27T00:16:35Z,2022-04-27T00:17:04Z,OWNER,"I bet this is because it's exceeding the size limit: https://github.com/simonw/datasette/blob/da53e0360da4771ffb56a8e3eb3f7476f3168299/datasette/tracer.py#L80-L88 https://github.com/simonw/datasette/blob/da53e0360da4771ffb56a8e3eb3f7476f3168299/datasette/tracer.py#L102-L113","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1216619276,?_trace=1 doesn't work on Global Power Plants demo, 
https://github.com/simonw/datasette/issues/1723#issuecomment-1110330554,https://api.github.com/repos/simonw/datasette/issues/1723,1110330554,IC_kwDOBm6k_c5CLky6,9599,simonw,2022-04-26T23:06:20Z,2022-04-26T23:06:20Z,OWNER,Deployed here: https://latest-with-plugins.datasette.io/github/commits?_facet=repo&_trace=1&_facet=committer,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1216508080,Research running SQL in table view in parallel using `asyncio.gather()`, https://github.com/simonw/datasette/issues/1723#issuecomment-1110305790,https://api.github.com/repos/simonw/datasette/issues/1723,1110305790,IC_kwDOBm6k_c5CLev-,9599,simonw,2022-04-26T22:19:04Z,2022-04-26T22:19:04Z,OWNER,"I realized that seeing the total time in queries wasn't enough to understand this, because if the queries were executed in serial or parallel it should still sum up to the same amount of SQL time (roughly). Instead I need to know how long the page took to render. But that's hard to display on the page since you can't measure it until rendering has finished! 
So I built an ASGI plugin to handle that measurement: https://github.com/simonw/datasette-total-page-time And with that plugin installed, `http://127.0.0.1:8001/global-power-plants/global-power-plants?_facet=primary_fuel&_facet=other_fuel2&_facet=other_fuel1&_parallel=1` (the parallel version) takes 377ms: While `http://127.0.0.1:8001/global-power-plants/global-power-plants?_facet=primary_fuel&_facet=other_fuel2&_facet=other_fuel1` (the serial version) takes 762ms: ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1216508080,Research running SQL in table view in parallel using `asyncio.gather()`, https://github.com/simonw/datasette/issues/1723#issuecomment-1110279869,https://api.github.com/repos/simonw/datasette/issues/1723,1110279869,IC_kwDOBm6k_c5CLYa9,9599,simonw,2022-04-26T21:45:39Z,2022-04-26T21:45:39Z,OWNER,"Getting some nice traces out of this: ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1216508080,Research running SQL in table view in parallel using `asyncio.gather()`, https://github.com/simonw/datasette/issues/1723#issuecomment-1110278577,https://api.github.com/repos/simonw/datasette/issues/1723,1110278577,IC_kwDOBm6k_c5CLYGx,9599,simonw,2022-04-26T21:44:04Z,2022-04-26T21:44:04Z,OWNER,"And some simple benchmarks with `ab` - using the `?_parallel=1` hack to try it with and without a parallel `asyncio.gather()`: ``` ~ % ab -n 100 'http://127.0.0.1:8001/global-power-plants/global-power-plants?_facet=primary_fuel&_facet=other_fuel1&_facet=other_fuel3&_facet=other_fuel2' This is ApacheBench, Version 2.3 <$Revision: 1879490 $> Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/ Licensed to The Apache Software Foundation, http://www.apache.org/ Benchmarking 127.0.0.1 (be patient).....done Server Software: uvicorn Server Hostname: 127.0.0.1 Server Port: 8001 Document 
Path: /global-power-plants/global-power-plants?_facet=primary_fuel&_facet=other_fuel1&_facet=other_fuel3&_facet=other_fuel2 Document Length: 314187 bytes Concurrency Level: 1 Time taken for tests: 68.279 seconds Complete requests: 100 Failed requests: 13 (Connect: 0, Receive: 0, Length: 13, Exceptions: 0) Total transferred: 31454937 bytes HTML transferred: 31418437 bytes Requests per second: 1.46 [#/sec] (mean) Time per request: 682.787 [ms] (mean) Time per request: 682.787 [ms] (mean, across all concurrent requests) Transfer rate: 449.89 [Kbytes/sec] received Connection Times (ms) min mean[+/-sd] median max Connect: 0 0 0.0 0 0 Processing: 621 683 68.0 658 993 Waiting: 620 682 68.0 657 992 Total: 621 683 68.0 658 993 Percentage of the requests served within a certain time (ms) 50% 658 66% 678 75% 687 80% 711 90% 763 95% 879 98% 926 99% 993 100% 993 (longest request) ---- In parallel: ~ % ab -n 100 'http://127.0.0.1:8001/global-power-plants/global-power-plants?_facet=primary_fuel&_facet=other_fuel1&_facet=other_fuel3&_facet=other_fuel2&_parallel=1' This is ApacheBench, Version 2.3 <$Revision: 1879490 $> Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/ Licensed to The Apache Software Foundation, http://www.apache.org/ Benchmarking 127.0.0.1 (be patient).....done Server Software: uvicorn Server Hostname: 127.0.0.1 Server Port: 8001 Document Path: /global-power-plants/global-power-plants?_facet=primary_fuel&_facet=other_fuel1&_facet=other_fuel3&_facet=other_fuel2&_parallel=1 Document Length: 315703 bytes Concurrency Level: 1 Time taken for tests: 34.763 seconds Complete requests: 100 Failed requests: 11 (Connect: 0, Receive: 0, Length: 11, Exceptions: 0) Total transferred: 31607988 bytes HTML transferred: 31570288 bytes Requests per second: 2.88 [#/sec] (mean) Time per request: 347.632 [ms] (mean) Time per request: 347.632 [ms] (mean, across all concurrent requests) Transfer rate: 887.93 [Kbytes/sec] received Connection Times (ms) min 
mean[+/-sd] median max Connect: 0 0 0.0 0 0 Processing: 311 347 28.0 338 450 Waiting: 311 347 28.0 338 450 Total: 312 348 28.0 338 451 Percentage of the requests served within a certain time (ms) 50% 338 66% 348 75% 361 80% 367 90% 396 95% 408 98% 436 99% 451 100% 451 (longest request) ---- With concurrency 10, not parallel: ~ % ab -c 10 -n 100 'http://127.0.0.1:8001/global-power-plants/global-power-plants?_facet=primary_fuel&_facet=other_fuel1&_facet=other_fuel3&_facet=other_fuel2&_parallel=' This is ApacheBench, Version 2.3 <$Revision: 1879490 $> Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/ Licensed to The Apache Software Foundation, http://www.apache.org/ Benchmarking 127.0.0.1 (be patient).....done Server Software: uvicorn Server Hostname: 127.0.0.1 Server Port: 8001 Document Path: /global-power-plants/global-power-plants?_facet=primary_fuel&_facet=other_fuel1&_facet=other_fuel3&_facet=other_fuel2&_parallel= Document Length: 314346 bytes Concurrency Level: 10 Time taken for tests: 38.408 seconds Complete requests: 100 Failed requests: 93 (Connect: 0, Receive: 0, Length: 93, Exceptions: 0) Total transferred: 31471333 bytes HTML transferred: 31433733 bytes Requests per second: 2.60 [#/sec] (mean) Time per request: 3840.829 [ms] (mean) Time per request: 384.083 [ms] (mean, across all concurrent requests) Transfer rate: 800.18 [Kbytes/sec] received Connection Times (ms) min mean[+/-sd] median max Connect: 0 0 0.1 0 1 Processing: 685 3719 354.0 3774 4096 Waiting: 684 3707 353.7 3750 4095 Total: 685 3719 354.0 3774 4096 Percentage of the requests served within a certain time (ms) 50% 3774 66% 3832 75% 3855 80% 3878 90% 3944 95% 4006 98% 4057 99% 4096 100% 4096 (longest request) ---- Concurrency 10 parallel: ~ % ab -c 10 -n 100 'http://127.0.0.1:8001/global-power-plants/global-power-plants?_facet=primary_fuel&_facet=other_fuel1&_facet=other_fuel3&_facet=other_fuel2&_parallel=1' This is ApacheBench, Version 2.3 <$Revision: 1879490 $> 
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/ Licensed to The Apache Software Foundation, http://www.apache.org/ Benchmarking 127.0.0.1 (be patient).....done Server Software: uvicorn Server Hostname: 127.0.0.1 Server Port: 8001 Document Path: /global-power-plants/global-power-plants?_facet=primary_fuel&_facet=other_fuel1&_facet=other_fuel3&_facet=other_fuel2&_parallel=1 Document Length: 315703 bytes Concurrency Level: 10 Time taken for tests: 36.762 seconds Complete requests: 100 Failed requests: 89 (Connect: 0, Receive: 0, Length: 89, Exceptions: 0) Total transferred: 31606516 bytes HTML transferred: 31568816 bytes Requests per second: 2.72 [#/sec] (mean) Time per request: 3676.182 [ms] (mean) Time per request: 367.618 [ms] (mean, across all concurrent requests) Transfer rate: 839.61 [Kbytes/sec] received Connection Times (ms) min mean[+/-sd] median max Connect: 0 0 0.1 0 0 Processing: 381 3602 419.6 3609 4458 Waiting: 381 3586 418.7 3607 4457 Total: 381 3603 419.6 3609 4458 Percentage of the requests served within a certain time (ms) 50% 3609 66% 3741 75% 3791 80% 3821 90% 3972 95% 4074 98% 4386 99% 4458 100% 4458 (longest request) Trying -c 3 instead. 
Non parallel: ~ % ab -c 3 -n 100 'http://127.0.0.1:8001/global-power-plants/global-power-plants?_facet=primary_fuel&_facet=other_fuel1&_facet=other_fuel3&_facet=other_fuel2&_parallel=' This is ApacheBench, Version 2.3 <$Revision: 1879490 $> Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/ Licensed to The Apache Software Foundation, http://www.apache.org/ Benchmarking 127.0.0.1 (be patient).....done Server Software: uvicorn Server Hostname: 127.0.0.1 Server Port: 8001 Document Path: /global-power-plants/global-power-plants?_facet=primary_fuel&_facet=other_fuel1&_facet=other_fuel3&_facet=other_fuel2&_parallel= Document Length: 314346 bytes Concurrency Level: 3 Time taken for tests: 39.365 seconds Complete requests: 100 Failed requests: 83 (Connect: 0, Receive: 0, Length: 83, Exceptions: 0) Total transferred: 31470808 bytes HTML transferred: 31433208 bytes Requests per second: 2.54 [#/sec] (mean) Time per request: 1180.955 [ms] (mean) Time per request: 393.652 [ms] (mean, across all concurrent requests) Transfer rate: 780.72 [Kbytes/sec] received Connection Times (ms) min mean[+/-sd] median max Connect: 0 0 0.0 0 0 Processing: 731 1153 126.2 1189 1359 Waiting: 730 1151 125.9 1188 1358 Total: 731 1153 126.2 1189 1359 Percentage of the requests served within a certain time (ms) 50% 1189 66% 1221 75% 1234 80% 1247 90% 1296 95% 1309 98% 1343 99% 1359 100% 1359 (longest request) ---- Parallel: ~ % ab -c 3 -n 100 'http://127.0.0.1:8001/global-power-plants/global-power-plants?_facet=primary_fuel&_facet=other_fuel1&_facet=other_fuel3&_facet=other_fuel2&_parallel=1' This is ApacheBench, Version 2.3 <$Revision: 1879490 $> Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/ Licensed to The Apache Software Foundation, http://www.apache.org/ Benchmarking 127.0.0.1 (be patient).....done Server Software: uvicorn Server Hostname: 127.0.0.1 Server Port: 8001 Document Path: 
/global-power-plants/global-power-plants?_facet=primary_fuel&_facet=other_fuel1&_facet=other_fuel3&_facet=other_fuel2&_parallel=1 Document Length: 315703 bytes Concurrency Level: 3 Time taken for tests: 34.530 seconds Complete requests: 100 Failed requests: 18 (Connect: 0, Receive: 0, Length: 18, Exceptions: 0) Total transferred: 31606179 bytes HTML transferred: 31568479 bytes Requests per second: 2.90 [#/sec] (mean) Time per request: 1035.902 [ms] (mean) Time per request: 345.301 [ms] (mean, across all concurrent requests) Transfer rate: 893.87 [Kbytes/sec] received Connection Times (ms) min mean[+/-sd] median max Connect: 0 0 0.0 0 0 Processing: 412 1020 104.4 1018 1280 Waiting: 411 1018 104.1 1014 1275 Total: 412 1021 104.4 1018 1280 Percentage of the requests served within a certain time (ms) 50% 1018 66% 1041 75% 1061 80% 1079 90% 1136 95% 1176 98% 1251 99% 1280 100% 1280 (longest request) ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1216508080,Research running SQL in table view in parallel using `asyncio.gather()`, https://github.com/simonw/datasette/issues/1723#issuecomment-1110278182,https://api.github.com/repos/simonw/datasette/issues/1723,1110278182,IC_kwDOBm6k_c5CLYAm,9599,simonw,2022-04-26T21:43:34Z,2022-04-26T21:43:34Z,OWNER,"Here's the diff I'm using: ```diff diff --git a/datasette/views/table.py b/datasette/views/table.py index d66adb8..f15ef1e 100644 --- a/datasette/views/table.py +++ b/datasette/views/table.py @@ -1,3 +1,4 @@ +import asyncio import itertools import json @@ -5,6 +6,7 @@ import markupsafe from datasette.plugins import pm from datasette.database import QueryInterrupted +from datasette import tracer from datasette.utils import ( await_me_maybe, CustomRow, @@ -150,6 +152,16 @@ class TableView(DataView): default_labels=False, _next=None, _size=None, + ): + with tracer.trace_child_tasks(): + return await self._data_traced(request, default_labels, 
_next, _size) + + async def _data_traced( + self, + request, + default_labels=False, + _next=None, + _size=None, ): database_route = tilde_decode(request.url_vars[""database""]) table_name = tilde_decode(request.url_vars[""table""]) @@ -159,6 +171,20 @@ class TableView(DataView): raise NotFound(""Database not found: {}"".format(database_route)) database_name = db.name + # For performance profiling purposes, ?_parallel=1 turns on asyncio.gather + async def _gather_parallel(*args): + return await asyncio.gather(*args) + + async def _gather_sequential(*args): + results = [] + for fn in args: + results.append(await fn) + return results + + gather = ( + _gather_parallel if request.args.get(""_parallel"") else _gather_sequential + ) + # If this is a canned query, not a table, then dispatch to QueryView instead canned_query = await self.ds.get_canned_query( database_name, table_name, request.actor @@ -174,8 +200,12 @@ class TableView(DataView): write=bool(canned_query.get(""write"")), ) - is_view = bool(await db.get_view_definition(table_name)) - table_exists = bool(await db.table_exists(table_name)) + is_view, table_exists = map( + bool, + await gather( + db.get_view_definition(table_name), db.table_exists(table_name) + ), + ) # If table or view not found, return 404 if not is_view and not table_exists: @@ -497,33 +527,44 @@ class TableView(DataView): ) ) - if not nofacet: - for facet in facet_instances: - ( + async def execute_facets(): + if not nofacet: + # Run them in parallel + facet_awaitables = [facet.facet_results() for facet in facet_instances] + facet_awaitable_results = await gather(*facet_awaitables) + for ( instance_facet_results, instance_facets_timed_out, - ) = await facet.facet_results() - for facet_info in instance_facet_results: - base_key = facet_info[""name""] - key = base_key - i = 1 - while key in facet_results: - i += 1 - key = f""{base_key}_{i}"" - facet_results[key] = facet_info - facets_timed_out.extend(instance_facets_timed_out) - - # Calculate 
suggested facets + ) in facet_awaitable_results: + for facet_info in instance_facet_results: + base_key = facet_info[""name""] + key = base_key + i = 1 + while key in facet_results: + i += 1 + key = f""{base_key}_{i}"" + facet_results[key] = facet_info + facets_timed_out.extend(instance_facets_timed_out) + suggested_facets = [] - if ( - self.ds.setting(""suggest_facets"") - and self.ds.setting(""allow_facet"") - and not _next - and not nofacet - and not nosuggest - ): - for facet in facet_instances: - suggested_facets.extend(await facet.suggest()) + + async def execute_suggested_facets(): + # Calculate suggested facets + if ( + self.ds.setting(""suggest_facets"") + and self.ds.setting(""allow_facet"") + and not _next + and not nofacet + and not nosuggest + ): + # Run them in parallel + facet_suggest_awaitables = [ + facet.suggest() for facet in facet_instances + ] + for suggest_result in await gather(*facet_suggest_awaitables): + suggested_facets.extend(suggest_result) + + await gather(execute_facets(), execute_suggested_facets()) # Figure out columns and rows for the query columns = [r[0] for r in results.description] ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1216508080,Research running SQL in table view in parallel using `asyncio.gather()`, https://github.com/simonw/datasette/issues/1715#issuecomment-1110265087,https://api.github.com/repos/simonw/datasette/issues/1715,1110265087,IC_kwDOBm6k_c5CLUz_,9599,simonw,2022-04-26T21:26:17Z,2022-04-26T21:26:17Z,OWNER,"Running facets and facet suggestions in parallel using `asyncio.gather()` turns out to be a lot less hassle than I had thought - maybe I don't need `asyncinject` for this at all? 
```diff if not nofacet: - for facet in facet_instances: - ( - instance_facet_results, - instance_facets_timed_out, - ) = await facet.facet_results() + # Run them in parallel + facet_awaitables = [facet.facet_results() for facet in facet_instances] + facet_awaitable_results = await asyncio.gather(*facet_awaitables) + for ( + instance_facet_results, + instance_facets_timed_out, + ) in facet_awaitable_results: for facet_info in instance_facet_results: base_key = facet_info[""name""] key = base_key @@ -522,8 +540,10 @@ class TableView(DataView): and not nofacet and not nosuggest ): - for facet in facet_instances: - suggested_facets.extend(await facet.suggest()) + # Run them in parallel + facet_suggest_awaitables = [facet.suggest() for facet in facet_instances] + for suggest_result in await asyncio.gather(*facet_suggest_awaitables): + suggested_facets.extend(suggest_result) ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1212823665,Refactor TableView to use asyncinject, https://github.com/simonw/datasette/issues/1715#issuecomment-1110246593,https://api.github.com/repos/simonw/datasette/issues/1715,1110246593,IC_kwDOBm6k_c5CLQTB,9599,simonw,2022-04-26T21:03:56Z,2022-04-26T21:03:56Z,OWNER,"Well this is fun... 
I applied this change: ```diff diff --git a/datasette/views/table.py b/datasette/views/table.py index d66adb8..85f9e44 100644 --- a/datasette/views/table.py +++ b/datasette/views/table.py @@ -1,3 +1,4 @@ +import asyncio import itertools import json @@ -5,6 +6,7 @@ import markupsafe from datasette.plugins import pm from datasette.database import QueryInterrupted +from datasette import tracer from datasette.utils import ( await_me_maybe, CustomRow, @@ -174,8 +176,11 @@ class TableView(DataView): write=bool(canned_query.get(""write"")), ) - is_view = bool(await db.get_view_definition(table_name)) - table_exists = bool(await db.table_exists(table_name)) + with tracer.trace_child_tasks(): + is_view, table_exists = map(bool, await asyncio.gather( + db.get_view_definition(table_name), + db.table_exists(table_name) + )) # If table or view not found, return 404 if not is_view and not table_exists: ``` And now using https://datasette.io/plugins/datasette-pretty-traces I get this: ![CleanShot 2022-04-26 at 14 03 33@2x](https://user-images.githubusercontent.com/9599/165392009-84c4399d-3e94-46d4-ba7b-a64a116cac5c.png) ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1212823665,Refactor TableView to use asyncinject, https://github.com/simonw/datasette/issues/1715#issuecomment-1110219185,https://api.github.com/repos/simonw/datasette/issues/1715,1110219185,IC_kwDOBm6k_c5CLJmx,9599,simonw,2022-04-26T20:28:40Z,2022-04-26T20:56:48Z,OWNER,"The refactor I did in #1719 pretty much clashes with all of the changes in https://github.com/simonw/datasette/commit/5053f1ea83194ecb0a5693ad5dada5b25bf0f7e6 so I'll probably need to start my `api-extras` branch again from scratch. 
Using a new `tableview-asyncinject` branch.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1212823665,Refactor TableView to use asyncinject, https://github.com/simonw/datasette/issues/1715#issuecomment-1110239536,https://api.github.com/repos/simonw/datasette/issues/1715,1110239536,IC_kwDOBm6k_c5CLOkw,9599,simonw,2022-04-26T20:54:53Z,2022-04-26T20:54:53Z,OWNER,`pytest tests/test_table_*` runs the tests quickly.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1212823665,Refactor TableView to use asyncinject, https://github.com/simonw/datasette/issues/1715#issuecomment-1110238896,https://api.github.com/repos/simonw/datasette/issues/1715,1110238896,IC_kwDOBm6k_c5CLOaw,9599,simonw,2022-04-26T20:53:59Z,2022-04-26T20:53:59Z,OWNER,I'm going to rename `database` to `database_name` and `table` to `table_name` to avoid confusion with the `Database` object as opposed to the string name for the database.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1212823665,Refactor TableView to use asyncinject, https://github.com/simonw/datasette/issues/1715#issuecomment-1110229319,https://api.github.com/repos/simonw/datasette/issues/1715,1110229319,IC_kwDOBm6k_c5CLMFH,9599,simonw,2022-04-26T20:41:32Z,2022-04-26T20:44:38Z,OWNER,"This time I'm not going to bother with the `filter_args` thing - I'm going to just try to use `asyncinject` to execute some big high level things in parallel - facets, suggested facets, counts, the query - and then combine it with the `extras` mechanism I'm trying to introduce too. 
Most importantly: I want that `extra_template()` function that adds more template context for the HTML to be executed as part of an `asyncinject` flow!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1212823665,Refactor TableView to use asyncinject, https://github.com/simonw/datasette/issues/1720#issuecomment-1110212021,https://api.github.com/repos/simonw/datasette/issues/1720,1110212021,IC_kwDOBm6k_c5CLH21,9599,simonw,2022-04-26T20:20:27Z,2022-04-26T20:20:27Z,OWNER,Closing this because I have a good enough idea of the design for now - the details of the parameters can be figured out when I implement this.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1215174094,Design plugin hook for extras, https://github.com/simonw/datasette/issues/1720#issuecomment-1109309683,https://api.github.com/repos/simonw/datasette/issues/1720,1109309683,IC_kwDOBm6k_c5CHrjz,9599,simonw,2022-04-26T04:12:39Z,2022-04-26T04:12:39Z,OWNER,"I think the rough shape of the three plugin hooks is right. 
The detailed decisions that are needed concern what the parameters should be, which I think will mainly happen as part of: - #1715","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1215174094,Design plugin hook for extras, https://github.com/simonw/datasette/issues/1720#issuecomment-1109306070,https://api.github.com/repos/simonw/datasette/issues/1720,1109306070,IC_kwDOBm6k_c5CHqrW,9599,simonw,2022-04-26T04:05:20Z,2022-04-26T04:05:20Z,OWNER,"The proposed plugin for annotations - allowing users to attach comments to database tables, columns and rows - would be a great application for all three of those `?_extra=` plugin hooks.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1215174094,Design plugin hook for extras, https://github.com/simonw/datasette/issues/1720#issuecomment-1109305184,https://api.github.com/repos/simonw/datasette/issues/1720,1109305184,IC_kwDOBm6k_c5CHqdg,9599,simonw,2022-04-26T04:03:35Z,2022-04-26T04:03:35Z,OWNER,I bet there's all kinds of interesting potential extras that could be calculated by loading the results of the query into a Pandas DataFrame.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1215174094,Design plugin hook for extras, https://github.com/simonw/datasette/issues/1720#issuecomment-1109200774,https://api.github.com/repos/simonw/datasette/issues/1720,1109200774,IC_kwDOBm6k_c5CHQ-G,9599,simonw,2022-04-26T01:25:43Z,2022-04-26T01:26:15Z,OWNER,"Had a thought: if a custom HTML template is going to make use of stuff generated using these extras, it will need a way to tell Datasette to execute those extras even in the absence of the `?_extra=...` URL parameters. Is that necessary? Or should those kinds of plugins use the existing `extra_template_vars` hook instead? 
Or maybe the `extra_template_vars` hook gets redesigned so it can depend on other `extras` in some way?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1215174094,Design plugin hook for extras, https://github.com/simonw/datasette/issues/1720#issuecomment-1109200335,https://api.github.com/repos/simonw/datasette/issues/1720,1109200335,IC_kwDOBm6k_c5CHQ3P,9599,simonw,2022-04-26T01:24:47Z,2022-04-26T01:24:47Z,OWNER,"Sketching out a `?_extra=statistics` table plugin: ```python from datasette import hookimpl @hookimpl def register_table_extras(datasette): return [statistics] async def statistics(datasette, query, columns, sql): # ... need to figure out which columns are integer/floats # then build and execute a SQL query that calculates sum/avg/etc for each column ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1215174094,Design plugin hook for extras, https://github.com/simonw/sqlite-utils/issues/428#issuecomment-1109190401,https://api.github.com/repos/simonw/sqlite-utils/issues/428,1109190401,IC_kwDOCGYnMM5CHOcB,9599,simonw,2022-04-26T01:05:29Z,2022-04-26T01:05:29Z,OWNER,Django makes extensive use of savepoints for nested transactions: https://docs.djangoproject.com/en/4.0/topics/db/transactions/#savepoints,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1215216249,Research adding support for savepoints, https://github.com/simonw/datasette/issues/1720#issuecomment-1109174715,https://api.github.com/repos/simonw/datasette/issues/1720,1109174715,IC_kwDOBm6k_c5CHKm7,9599,simonw,2022-04-26T00:40:13Z,2022-04-26T00:43:33Z,OWNER,"Some of the things I'd like to use `?_extra=` for, that may or may not make sense as plugins: - Performance breakdown information, maybe including explain output for a query/table - Information 
about the tables that were consulted in a query - imagine pulling in additional table metadata - Statistical aggregates against the full set of results. This may well be a Datasette core feature at some point in the future, but being able to provide it early as a plugin would be really cool. - For tables, what are the other tables they can join against? - Suggested facets - Facet results themselves - New custom facets I haven't thought of - though the `register_facet_classes` hook covers that already - Table schema - Table metadata - Analytics - how many times has this table been queried? Would be a plugin thing - For geospatial data, how about a GeoJSON polygon that represents the bounding box for all returned results? Effectively this is an extra aggregation. Looking at https://github-to-sqlite.dogsheep.net/github/commits.json?_labels=on&_shape=objects for inspiration. I think there's a separate potential mechanism in the future that lets you add custom columns to a table. This would affect `.csv` and the HTML presentation too, which makes it a different concept from the `?_extra=` hook that affects the JSON export (and the context that is fed to the HTML templates).","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1215174094,Design plugin hook for extras, https://github.com/simonw/datasette/issues/1720#issuecomment-1109171871,https://api.github.com/repos/simonw/datasette/issues/1720,1109171871,IC_kwDOBm6k_c5CHJ6f,9599,simonw,2022-04-26T00:34:48Z,2022-04-26T00:34:48Z,OWNER,"Let's try sketching out a `register_table_extras` plugin for something new. The first idea I came up with suggests adding new fields to the individual row records that come back - my mental model for extras so far has been that they add new keys to the root object. 
So if a table result looked like this: ```json { ""rows"": [ {""id"": 1, ""name"": ""Cleo""}, {""id"": 2, ""name"": ""Suna""} ], ""next_url"": null } ``` I was initially thinking that `?_extra=facets` would add a `""facets"": {...}` key to that root object. Here's a plugin idea I came up with that would probably justify adding to the individual row objects instead: - `?_extra=check404s` - does an async `HEAD` request against every column value that looks like a URL and checks if it returns a 404 This could also work by adding a `""check404s"": {""url-here"": 200}` key to the root object though. I think I need some better plugin concepts before committing to this new hook. There's overlap between this and how I want the enrichments mechanism ([see here](https://simonwillison.net/2021/Jan/17/weeknotes-still-pretty-distracted/)) to work.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1215174094,Design plugin hook for extras, https://github.com/simonw/datasette/issues/1720#issuecomment-1109165411,https://api.github.com/repos/simonw/datasette/issues/1720,1109165411,IC_kwDOBm6k_c5CHIVj,9599,simonw,2022-04-26T00:22:42Z,2022-04-26T00:22:42Z,OWNER,Passing `pk_values` to the plugin hook feels odd. 
I think I'd pass a `row` object instead and let the code look up the primary key values on that row (by introspecting the primary keys for the table).,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1215174094,Design plugin hook for extras, https://github.com/simonw/datasette/issues/1720#issuecomment-1109164803,https://api.github.com/repos/simonw/datasette/issues/1720,1109164803,IC_kwDOBm6k_c5CHIMD,9599,simonw,2022-04-26T00:21:40Z,2022-04-26T00:21:40Z,OWNER,"What would the existing https://latest.datasette.io/fixtures/simple_primary_key/1.json?_extras=foreign_key_tables feature look like if it was re-imagined as a `register_row_extras()` plugin? Rough sketch, copying most of the code from https://github.com/simonw/datasette/blob/579f59dcec43a91dd7d404e00b87a00afd8515f2/datasette/views/row.py#L98 ```python from datasette import hookimpl @hookimpl def register_row_extras(datasette): return [foreign_key_tables] async def foreign_key_tables(datasette, database, table, pk_values): if len(pk_values) != 1: return [] db = datasette.get_database(database) all_foreign_keys = await db.get_all_foreign_keys() foreign_keys = all_foreign_keys[table][""incoming""] if len(foreign_keys) == 0: return [] sql = ""select "" + "", "".join( [ ""(select count(*) from {table} where {column}=:id)"".format( table=escape_sqlite(fk[""other_table""]), column=escape_sqlite(fk[""other_column""]), ) for fk in foreign_keys ] ) try: rows = list(await db.execute(sql, {""id"": pk_values[0]})) except QueryInterrupted: # Almost certainly hit the timeout return [] foreign_table_counts = dict( zip( [(fk[""other_table""], fk[""other_column""]) for fk in foreign_keys], list(rows[0]), ) ) foreign_key_tables = [] for fk in foreign_keys: count = ( foreign_table_counts.get((fk[""other_table""], fk[""other_column""])) or 0 ) key = fk[""other_column""] if key.startswith(""_""): key += ""__exact"" link = ""{}?{}={}"".format( 
datasette.urls.table(database, fk[""other_table""]), key, "","".join(pk_values), ) foreign_key_tables.append({**fk, **{""count"": count, ""link"": link}}) return foreign_key_tables ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1215174094,Design plugin hook for extras, https://github.com/simonw/datasette/issues/1720#issuecomment-1109162123,https://api.github.com/repos/simonw/datasette/issues/1720,1109162123,IC_kwDOBm6k_c5CHHiL,9599,simonw,2022-04-26T00:16:42Z,2022-04-26T00:16:51Z,OWNER,"Actually I'm going to imitate the existing `register_*` hooks: - `def register_output_renderer(datasette)` - `def register_facet_classes()` - `def register_routes(datasette)` - `def register_commands(cli)` - `def register_magic_parameters(datasette)` So I'm going to call the new hooks: - `register_table_extras(datasette)` - `register_row_extras(datasette)` - `register_query_extras(datasette)` They'll return a list of `async def` functions. 
The names of those functions will become the names of the extras.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1215174094,Design plugin hook for extras, https://github.com/simonw/datasette/issues/1720#issuecomment-1109160226,https://api.github.com/repos/simonw/datasette/issues/1720,1109160226,IC_kwDOBm6k_c5CHHEi,9599,simonw,2022-04-26T00:14:11Z,2022-04-26T00:14:11Z,OWNER,"There are four existing plugin hooks that include the word ""extra"" but use it to mean something else - to mean additional CSS/JS/variables to be injected into the page: - `def extra_css_urls(...)` - `def extra_js_urls(...)` - `def extra_body_script(...)` - `def extra_template_vars(...)` I think `extra_*` and `*_extras` are different enough that they won't be confused with each other.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1215174094,Design plugin hook for extras, https://github.com/simonw/datasette/issues/1720#issuecomment-1109159307,https://api.github.com/repos/simonw/datasette/issues/1720,1109159307,IC_kwDOBm6k_c5CHG2L,9599,simonw,2022-04-26T00:12:28Z,2022-04-26T00:12:28Z,OWNER,"I'm going to keep table and row separate. So I think I need to add three new plugin hooks: - `table_extras()` - `row_extras()` - `query_extras()`","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1215174094,Design plugin hook for extras, https://github.com/simonw/datasette/issues/1720#issuecomment-1109158903,https://api.github.com/repos/simonw/datasette/issues/1720,1109158903,IC_kwDOBm6k_c5CHGv3,9599,simonw,2022-04-26T00:11:42Z,2022-04-26T00:11:42Z,OWNER,"Places this plugin hook (or hooks?) 
should be able to affect: - JSON for a table/view - JSON for a row - JSON for a canned query - JSON for a custom arbitrary query I'm going to combine those last two, which means there are three places. But maybe I can combine the table one and the row one as well?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1215174094,Design plugin hook for extras, https://github.com/simonw/datasette/issues/1719#issuecomment-1108907238,https://api.github.com/repos/simonw/datasette/issues/1719,1108907238,IC_kwDOBm6k_c5CGJTm,9599,simonw,2022-04-25T18:34:21Z,2022-04-25T18:34:21Z,OWNER,Well this refactor turned out to be pretty quick and really does greatly simplify both the `RowView` and `TableView` classes. Very happy with this.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1214859703,Refactor `RowView` and remove `RowTableShared`, https://github.com/simonw/datasette/issues/262#issuecomment-1108890170,https://api.github.com/repos/simonw/datasette/issues/262,1108890170,IC_kwDOBm6k_c5CGFI6,9599,simonw,2022-04-25T18:17:09Z,2022-04-25T18:18:39Z,OWNER,"I spotted in https://github.com/simonw/datasette/issues/1719#issuecomment-1108888494 that there's actually already an undocumented implementation of `?_extras=foreign_key_tables` - https://latest.datasette.io/fixtures/simple_primary_key/1.json?_extras=foreign_key_tables I added that feature all the way back in November 2017! 
https://github.com/simonw/datasette/commit/a30c5b220c15360d575e94b0e67f3255e120b916","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",323658641,Add ?_extra= mechanism for requesting extra properties in JSON, https://github.com/simonw/datasette/issues/1719#issuecomment-1108888494,https://api.github.com/repos/simonw/datasette/issues/1719,1108888494,IC_kwDOBm6k_c5CGEuu,9599,simonw,2022-04-25T18:15:42Z,2022-04-25T18:15:42Z,OWNER,"Here's an undocumented feature I forgot existed: https://latest.datasette.io/fixtures/simple_primary_key/1.json?_extras=foreign_key_tables `?_extras=foreign_key_tables` https://github.com/simonw/datasette/blob/0bc5186b7bb4fc82392df08f99a9132f84dcb331/datasette/views/table.py#L1021-L1024 It's even covered by the tests: https://github.com/simonw/datasette/blob/b9c2b1cfc8692b9700416db98721fa3ec982f6be/tests/test_api.py#L691-L703","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1214859703,Refactor `RowView` and remove `RowTableShared`, https://github.com/simonw/datasette/issues/1719#issuecomment-1108884171,https://api.github.com/repos/simonw/datasette/issues/1719,1108884171,IC_kwDOBm6k_c5CGDrL,9599,simonw,2022-04-25T18:10:46Z,2022-04-25T18:12:45Z,OWNER,"It looks like the only class method from that shared class needed by `RowView` is `self.display_columns_and_rows()`. 
Which I've been wanting to refactor to provide to `QueryView` too: - #715","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1214859703,Refactor `RowView` and remove `RowTableShared`, https://github.com/simonw/datasette/issues/1715#issuecomment-1108875068,https://api.github.com/repos/simonw/datasette/issues/1715,1108875068,IC_kwDOBm6k_c5CGBc8,9599,simonw,2022-04-25T18:03:13Z,2022-04-25T18:06:33Z,OWNER,"The `RowTableShared` class is making this a whole lot more complicated. I'm going to split the `RowView` view out into an entirely separate `views/row.py` module.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1212823665,Refactor TableView to use asyncinject, https://github.com/simonw/datasette/issues/1715#issuecomment-1108877454,https://api.github.com/repos/simonw/datasette/issues/1715,1108877454,IC_kwDOBm6k_c5CGCCO,9599,simonw,2022-04-25T18:04:27Z,2022-04-25T18:04:27Z,OWNER,Pushed my WIP on this to the `api-extras` branch: 5053f1ea83194ecb0a5693ad5dada5b25bf0f7e6,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1212823665,Refactor TableView to use asyncinject, https://github.com/simonw/datasette/issues/1718#issuecomment-1107873311,https://api.github.com/repos/simonw/datasette/issues/1718,1107873311,IC_kwDOBm6k_c5CCM4f,9599,simonw,2022-04-24T16:24:14Z,2022-04-24T16:24:14Z,OWNER,Wrote up what I learned in a TIL: https://til.simonwillison.net/sphinx/blacken-docs,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1213683988,Code examples in the documentation should be formatted with Black, 
https://github.com/simonw/datasette/issues/1718#issuecomment-1107873271,https://api.github.com/repos/simonw/datasette/issues/1718,1107873271,IC_kwDOBm6k_c5CCM33,9599,simonw,2022-04-24T16:23:57Z,2022-04-24T16:23:57Z,OWNER,"Turns out I didn't need that `git diff-index` trick after all - the `blacken-docs` command returns a non-zero exit code if it changes any files. Submitted a documentation PR to that project instead: - https://github.com/asottile/blacken-docs/pull/162","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1213683988,Code examples in the documentation should be formatted with Black, https://github.com/simonw/datasette/issues/1718#issuecomment-1107870788,https://api.github.com/repos/simonw/datasette/issues/1718,1107870788,IC_kwDOBm6k_c5CCMRE,9599,simonw,2022-04-24T16:09:23Z,2022-04-24T16:09:23Z,OWNER,One more attempt at testing the `git diff-index` trick.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1213683988,Code examples in the documentation should be formatted with Black, https://github.com/simonw/datasette/issues/1718#issuecomment-1107869884,https://api.github.com/repos/simonw/datasette/issues/1718,1107869884,IC_kwDOBm6k_c5CCMC8,9599,simonw,2022-04-24T16:04:03Z,2022-04-24T16:04:03Z,OWNER,"OK, I'm expecting this one to fail at the `git diff-index --quiet HEAD --` check.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1213683988,Code examples in the documentation should be formatted with Black, https://github.com/simonw/datasette/issues/1718#issuecomment-1107869556,https://api.github.com/repos/simonw/datasette/issues/1718,1107869556,IC_kwDOBm6k_c5CCL90,9599,simonw,2022-04-24T16:02:27Z,2022-04-24T16:02:27Z,OWNER,"Looking at that first error it appears to be a place where I had deliberately omitted 
the body of the function: https://github.com/simonw/datasette/blob/36573638b0948174ae237d62e6369b7d55220d7f/docs/internals.rst#L196-L211 I can use `...` as the function body here to get it to pass. Fixing those warnings actually helped me spot a couple of bugs, so I'm glad this happened.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1213683988,Code examples in the documentation should be formatted with Black, https://github.com/simonw/datasette/issues/1718#issuecomment-1107868585,https://api.github.com/repos/simonw/datasette/issues/1718,1107868585,IC_kwDOBm6k_c5CCLup,9599,simonw,2022-04-24T15:57:10Z,2022-04-24T15:57:19Z,OWNER,"The tests failed there because of what I thought were warnings but turn out to be treated as errors: ``` % blacken-docs -l 60 docs/*.rst docs/internals.rst:196: code block parse error Cannot parse: 14:0: docs/json_api.rst:449: code block parse error Cannot parse: 1:0: docs/testing_plugins.rst:135: code block parse error Cannot parse: 5:0: % echo $? 
1 ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1213683988,Code examples in the documentation should be formatted with Black, https://github.com/simonw/datasette/issues/1718#issuecomment-1107867281,https://api.github.com/repos/simonw/datasette/issues/1718,1107867281,IC_kwDOBm6k_c5CCLaR,9599,simonw,2022-04-24T15:49:23Z,2022-04-24T15:49:23Z,OWNER,I'm going to push the first commit with a deliberate missing formatting to check that the tests fail.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1213683988,Code examples in the documentation should be formatted with Black, https://github.com/simonw/datasette/issues/1718#issuecomment-1107866013,https://api.github.com/repos/simonw/datasette/issues/1718,1107866013,IC_kwDOBm6k_c5CCLGd,9599,simonw,2022-04-24T15:42:07Z,2022-04-24T15:42:07Z,OWNER,"In the absence of `--check` I can use this to detect if changes are applied: ```zsh % git diff-index --quiet HEAD -- % echo $? 0 % blacken-docs -l 60 docs/*.rst docs/authentication.rst: Rewriting... ... % git diff-index --quiet HEAD -- % echo $? 
1 ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1213683988,Code examples in the documentation should be formatted with Black, https://github.com/simonw/datasette/issues/1718#issuecomment-1107865493,https://api.github.com/repos/simonw/datasette/issues/1718,1107865493,IC_kwDOBm6k_c5CCK-V,9599,simonw,2022-04-24T15:39:02Z,2022-04-24T15:39:02Z,OWNER,"There's no `blacken-docs --check` option so I filed a feature request: - https://github.com/asottile/blacken-docs/issues/161","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1213683988,Code examples in the documentation should be formatted with Black, https://github.com/simonw/datasette/issues/1718#issuecomment-1107863924,https://api.github.com/repos/simonw/datasette/issues/1718,1107863924,IC_kwDOBm6k_c5CCKl0,9599,simonw,2022-04-24T15:30:03Z,2022-04-24T15:30:03Z,OWNER,"On the one hand, I'm not crazy about some of the indentation decisions Black made here - in particular this one, which I had indented deliberately for readability: ```diff diff --git a/docs/authentication.rst b/docs/authentication.rst index 0d98cf8..8008023 100644 --- a/docs/authentication.rst +++ b/docs/authentication.rst @@ -381,11 +381,7 @@ Authentication plugins can set signed ``ds_actor`` cookies themselves like so: .. code-block:: python response = Response.redirect(""/"") - response.set_cookie(""ds_actor"", datasette.sign({ - ""a"": { - ""id"": ""cleopaws"" - } - }, ""actor"")) + response.set_cookie(""ds_actor"", datasette.sign({""a"": {""id"": ""cleopaws""}}, ""actor"")) ``` But... consistency is a virtue. Maybe I'm OK with just this one disagreement? Also: I've been mentally trying to keep the line lengths a bit shorter to help them be more readable on mobile devices. I'll try a different line length using `blacken-docs -l 60 docs/*.rst` instead. 
I like this more - here's the result for that example: ```diff diff --git a/docs/authentication.rst b/docs/authentication.rst index 0d98cf8..2496073 100644 --- a/docs/authentication.rst +++ b/docs/authentication.rst @@ -381,11 +381,10 @@ Authentication plugins can set signed ``ds_actor`` cookies themselves like so: .. code-block:: python response = Response.redirect(""/"") - response.set_cookie(""ds_actor"", datasette.sign({ - ""a"": { - ""id"": ""cleopaws"" - } - }, ""actor"")) + response.set_cookie( + ""ds_actor"", + datasette.sign({""a"": {""id"": ""cleopaws""}}, ""actor""), + ) ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1213683988,Code examples in the documentation should be formatted with Black, https://github.com/simonw/datasette/issues/1718#issuecomment-1107863365,https://api.github.com/repos/simonw/datasette/issues/1718,1107863365,IC_kwDOBm6k_c5CCKdF,9599,simonw,2022-04-24T15:26:41Z,2022-04-24T15:26:41Z,OWNER,"Tried this: ``` pip install blacken-docs blacken-docs docs/*.rst git diff | pbcopy ``` Got this: ```diff diff --git a/docs/authentication.rst b/docs/authentication.rst index 0d98cf8..8008023 100644 --- a/docs/authentication.rst +++ b/docs/authentication.rst @@ -381,11 +381,7 @@ Authentication plugins can set signed ``ds_actor`` cookies themselves like so: .. code-block:: python response = Response.redirect(""/"") - response.set_cookie(""ds_actor"", datasette.sign({ - ""a"": { - ""id"": ""cleopaws"" - } - }, ""actor"")) + response.set_cookie(""ds_actor"", datasette.sign({""a"": {""id"": ""cleopaws""}}, ""actor"")) Note that you need to pass ``""actor""`` as the namespace to :ref:`datasette_sign`. 
@@ -412,12 +408,16 @@ To include an expiry, add a ``""e""`` key to the cookie value containing a `base62 expires_at = int(time.time()) + (24 * 60 * 60) response = Response.redirect(""/"") - response.set_cookie(""ds_actor"", datasette.sign({ - ""a"": { - ""id"": ""cleopaws"" - }, - ""e"": baseconv.base62.encode(expires_at), - }, ""actor"")) + response.set_cookie( + ""ds_actor"", + datasette.sign( + { + ""a"": {""id"": ""cleopaws""}, + ""e"": baseconv.base62.encode(expires_at), + }, + ""actor"", + ), + ) The resulting cookie will encode data that looks something like this: diff --git a/docs/spatialite.rst b/docs/spatialite.rst index d1b300b..556bad8 100644 --- a/docs/spatialite.rst +++ b/docs/spatialite.rst @@ -58,19 +58,22 @@ Here's a recipe for taking a table with existing latitude and longitude columns, .. code-block:: python import sqlite3 - conn = sqlite3.connect('museums.db') + + conn = sqlite3.connect(""museums.db"") # Lead the spatialite extension: conn.enable_load_extension(True) - conn.load_extension('/usr/local/lib/mod_spatialite.dylib') + conn.load_extension(""/usr/local/lib/mod_spatialite.dylib"") # Initialize spatial metadata for this database: - conn.execute('select InitSpatialMetadata(1)') + conn.execute(""select InitSpatialMetadata(1)"") # Add a geometry column called point_geom to our museums table: conn.execute(""SELECT AddGeometryColumn('museums', 'point_geom', 4326, 'POINT', 2);"") # Now update that geometry column with the lat/lon points - conn.execute(''' + conn.execute( + """""" UPDATE museums SET point_geom = GeomFromText('POINT('||""longitude""||' '||""latitude""||')',4326); - ''') + """""" + ) # Now add a spatial index to that column conn.execute('select CreateSpatialIndex(""museums"", ""point_geom"");') # If you don't commit your changes will not be persisted: @@ -186,13 +189,14 @@ Here's Python code to create a SQLite database, enable SpatiaLite, create a plac .. 
code-block:: python import sqlite3 - conn = sqlite3.connect('places.db') + + conn = sqlite3.connect(""places.db"") # Enable SpatialLite extension conn.enable_load_extension(True) - conn.load_extension('/usr/local/lib/mod_spatialite.dylib') + conn.load_extension(""/usr/local/lib/mod_spatialite.dylib"") # Create the masic countries table - conn.execute('select InitSpatialMetadata(1)') - conn.execute('create table places (id integer primary key, name text);') + conn.execute(""select InitSpatialMetadata(1)"") + conn.execute(""create table places (id integer primary key, name text);"") # Add a MULTIPOLYGON Geometry column conn.execute(""SELECT AddGeometryColumn('places', 'geom', 4326, 'MULTIPOLYGON', 2);"") # Add a spatial index against the new column @@ -201,13 +205,17 @@ Here's Python code to create a SQLite database, enable SpatiaLite, create a plac from shapely.geometry.multipolygon import MultiPolygon from shapely.geometry import shape import requests - geojson = requests.get('https://data.whosonfirst.org/404/227/475/404227475.geojson').json() + + geojson = requests.get( + ""https://data.whosonfirst.org/404/227/475/404227475.geojson"" + ).json() # Convert to ""Well Known Text"" format - wkt = shape(geojson['geometry']).wkt + wkt = shape(geojson[""geometry""]).wkt # Insert and commit the record - conn.execute(""INSERT INTO places (id, name, geom) VALUES(null, ?, GeomFromText(?, 4326))"", ( - ""Wales"", wkt - )) + conn.execute( + ""INSERT INTO places (id, name, geom) VALUES(null, ?, GeomFromText(?, 4326))"", + (""Wales"", wkt), + ) conn.commit() Querying polygons using within() diff --git a/docs/writing_plugins.rst b/docs/writing_plugins.rst index bd60a4b..5af01f6 100644 --- a/docs/writing_plugins.rst +++ b/docs/writing_plugins.rst @@ -18,9 +18,10 @@ The quickest way to start writing a plugin is to create a ``my_plugin.py`` file from datasette import hookimpl + @hookimpl def prepare_connection(conn): - conn.create_function('hello_world', 0, lambda: 'Hello world!') + 
conn.create_function(""hello_world"", 0, lambda: ""Hello world!"") If you save this in ``plugins/my_plugin.py`` you can then start Datasette like this:: @@ -60,22 +61,18 @@ The example consists of two files: a ``setup.py`` file that defines the plugin: from setuptools import setup - VERSION = '0.1' + VERSION = ""0.1"" setup( - name='datasette-plugin-demos', - description='Examples of plugins for Datasette', - author='Simon Willison', - url='https://github.com/simonw/datasette-plugin-demos', - license='Apache License, Version 2.0', + name=""datasette-plugin-demos"", + description=""Examples of plugins for Datasette"", + author=""Simon Willison"", + url=""https://github.com/simonw/datasette-plugin-demos"", + license=""Apache License, Version 2.0"", version=VERSION, - py_modules=['datasette_plugin_demos'], - entry_points={ - 'datasette': [ - 'plugin_demos = datasette_plugin_demos' - ] - }, - install_requires=['datasette'] + py_modules=[""datasette_plugin_demos""], + entry_points={""datasette"": [""plugin_demos = datasette_plugin_demos""]}, + install_requires=[""datasette""], ) And a Python module file, ``datasette_plugin_demos.py``, that implements the plugin: @@ -88,12 +85,12 @@ And a Python module file, ``datasette_plugin_demos.py``, that implements the plu @hookimpl def prepare_jinja2_environment(env): - env.filters['uppercase'] = lambda u: u.upper() + env.filters[""uppercase""] = lambda u: u.upper() @hookimpl def prepare_connection(conn): - conn.create_function('random_integer', 2, random.randint) + conn.create_function(""random_integer"", 2, random.randint) Having built a plugin in this way you can turn it into an installable package using the following command:: @@ -123,11 +120,13 @@ To bundle the static assets for a plugin in the package that you publish to PyPI .. 
code-block:: python - package_data={ - 'datasette_plugin_name': [ - 'static/plugin.js', - ], - }, + package_data = ( + { + ""datasette_plugin_name"": [ + ""static/plugin.js"", + ], + }, + ) Where ``datasette_plugin_name`` is the name of the plugin package (note that it uses underscores, not hyphens) and ``static/plugin.js`` is the path within that package to the static file. @@ -152,11 +151,13 @@ Templates should be bundled for distribution using the same ``package_data`` mec .. code-block:: python - package_data={ - 'datasette_plugin_name': [ - 'templates/my_template.html', - ], - }, + package_data = ( + { + ""datasette_plugin_name"": [ + ""templates/my_template.html"", + ], + }, + ) You can also use wildcards here such as ``templates/*.html``. See `datasette-edit-schema `__ for an example of this pattern. ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1213683988,Code examples in the documentation should be formatted with Black, https://github.com/simonw/datasette/issues/1718#issuecomment-1107862882,https://api.github.com/repos/simonw/datasette/issues/1718,1107862882,IC_kwDOBm6k_c5CCKVi,9599,simonw,2022-04-24T15:23:56Z,2022-04-24T15:23:56Z,OWNER,"Found https://github.com/asottile/blacken-docs via - https://github.com/psf/black/issues/294","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1213683988,Code examples in the documentation should be formatted with Black, https://github.com/simonw/datasette/pull/1717#issuecomment-1107848097,https://api.github.com/repos/simonw/datasette/issues/1717,1107848097,IC_kwDOBm6k_c5CCGuh,9599,simonw,2022-04-24T14:02:37Z,2022-04-24T14:02:37Z,OWNER,"This is a neat feature, thanks!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1213281044,Add timeout option to Cloudrun build, 
https://github.com/simonw/datasette/issues/1715#issuecomment-1106989581,https://api.github.com/repos/simonw/datasette/issues/1715,1106989581,IC_kwDOBm6k_c5B-1IN,9599,simonw,2022-04-22T23:03:29Z,2022-04-22T23:03:29Z,OWNER,I'm having second thoughts about injecting `request` - might be better to have the view function pull the relevant pieces out of the request before triggering the rest of the resolution.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1212823665,Refactor TableView to use asyncinject, https://github.com/simonw/datasette/issues/1715#issuecomment-1106947168,https://api.github.com/repos/simonw/datasette/issues/1715,1106947168,IC_kwDOBm6k_c5B-qxg,9599,simonw,2022-04-22T22:25:57Z,2022-04-22T22:26:06Z,OWNER,"```python async def database(request: Request, datasette: Datasette) -> Database: database_route = tilde_decode(request.url_vars[""database""]) try: return datasette.get_database(route=database_route) except KeyError: raise NotFound(""Database not found: {}"".format(database_route)) async def table_name(request: Request) -> str: return tilde_decode(request.url_vars[""table""]) ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1212823665,Refactor TableView to use asyncinject, https://github.com/simonw/datasette/issues/1715#issuecomment-1106945876,https://api.github.com/repos/simonw/datasette/issues/1715,1106945876,IC_kwDOBm6k_c5B-qdU,9599,simonw,2022-04-22T22:24:29Z,2022-04-22T22:24:29Z,OWNER,"Looking at the start of `TableView.data()`: https://github.com/simonw/datasette/blob/d57c347f35bcd8cff15f913da851b4b8eb030867/datasette/views/table.py#L333-L346 I'm going to resolve `table_name` and `database` from the URL - `table_name` will be a string, `database` will be the DB object returned by `datasette.get_database()`. 
Then those can be passed in separately too.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1212823665,Refactor TableView to use asyncinject, https://github.com/simonw/datasette/issues/1716#issuecomment-1106923258,https://api.github.com/repos/simonw/datasette/issues/1716,1106923258,IC_kwDOBm6k_c5B-k76,9599,simonw,2022-04-22T22:02:07Z,2022-04-22T22:02:07Z,OWNER,"https://github.com/simonw/datasette/blame/main/datasette/views/base.py ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1212838949,Configure git blame to ignore Black commit, https://github.com/simonw/datasette/issues/1715#issuecomment-1106908642,https://api.github.com/repos/simonw/datasette/issues/1715,1106908642,IC_kwDOBm6k_c5B-hXi,9599,simonw,2022-04-22T21:47:55Z,2022-04-22T21:47:55Z,OWNER,"I need an `asyncinject.Registry` with functions registered to perform the role of the table view. Something like this perhaps: ```python def table_html_context(facet_results, query, datasette, rows): return {...} ``` That then gets called like this: ```python async def view(request): registry = Registry(facet_results, query, datasette, rows) context = await registry.resolve(table_html, request=request, datasette=datasette) return Response.html(await datasette.render(""table.html"", context)) ``` It's also interesting to start thinking about this from a Python client library point of view. If I'm writing code outside of the HTTP request cycle, what would it look like? 
One thing I could do is break out the code that turns a request into a list of pairs extracted from the request - this code here: https://github.com/simonw/datasette/blob/8338c66a57502ef27c3d7afb2527fbc0663b2570/datasette/views/table.py#L442-L449 I could turn that into a typed dependency injection function like this: ```python def filter_args(request: Request) -> List[Tuple[str, str]]: # Arguments that start with _ and don't contain a __ are # special - things like ?_search= - and should not be # treated as filters. filter_args = [] for key in request.args: if not (key.startswith(""_"") and ""__"" not in key): for v in request.args.getlist(key): filter_args.append((key, v)) return filter_args ``` Then I can either pass a `request` into a `.resolve()` call, or I can instead skip that function by passing: ```python output = registry.resolve(table_context, filter_args=[(""foo"", ""bar"")]) ``` I do need to think about where plugins get executed in all of this.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1212823665,Refactor TableView to use asyncinject, https://github.com/simonw/datasette/issues/1101#issuecomment-1105615625,https://api.github.com/repos/simonw/datasette/issues/1101,1105615625,IC_kwDOBm6k_c5B5lsJ,9599,simonw,2022-04-21T18:31:41Z,2022-04-21T18:32:22Z,OWNER,"The `datasette-geojson` plugin is actually an interesting case here, because of the way it converts SpatiaLite geometries into GeoJSON: https://github.com/eyeseast/datasette-geojson/blob/602c4477dc7ddadb1c0a156cbcd2ef6688a5921d/datasette_geojson/__init__.py#L61-L66 ```python if isinstance(geometry, bytes): results = await db.execute( ""SELECT AsGeoJSON(:geometry)"", {""geometry"": geometry} ) return geojson.loads(results.single_value()) ``` That actually seems to work really well as-is, but it does worry me a bit that it ends up having to execute an extra `SELECT` query for every single returned row - especially 
in streaming mode where it might be asked to return 1m rows at once. My PostgreSQL/MySQL engineering brain says that this would be better handled by doing a chunk of these (maybe 100) at once, to avoid the per-query-overhead - but with SQLite that might not be necessary. At any rate, this is one of the reasons I'm interested in ""iterate over this sequence of chunks of 100 rows at a time"" as a potential option here. Of course, a better solution would be for `datasette-geojson` to have a way to influence the SQL query before it is executed, adding a `AsGeoJSON(geometry)` clause to it - so that's something I'm open to as well.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",749283032,register_output_renderer() should support streaming data, https://github.com/simonw/datasette/issues/1101#issuecomment-1105608964,https://api.github.com/repos/simonw/datasette/issues/1101,1105608964,IC_kwDOBm6k_c5B5kEE,9599,simonw,2022-04-21T18:26:29Z,2022-04-21T18:26:29Z,OWNER,"I'm questioning if the mechanisms should be separate at all now - a single response rendering is really just a case of a streaming response that only pulls the first N records from the iterator. It probably needs to be an `async for` iterator, which I've not worked with much before. Good opportunity to learn. This actually gets a fair bit more complicated due to the work I'm doing right now to improve the default JSON API: - #1709 I want to do things like make faceting results optionally available to custom renderers - which is a separate concern from streaming rows. 
I'm going to poke around with a bunch of prototypes and see what sticks.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",749283032,register_output_renderer() should support streaming data, https://github.com/simonw/datasette/issues/1101#issuecomment-1105571003,https://api.github.com/repos/simonw/datasette/issues/1101,1105571003,IC_kwDOBm6k_c5B5ay7,9599,simonw,2022-04-21T18:10:38Z,2022-04-21T18:10:46Z,OWNER,"Maybe the simplest design for this is to add an optional `can_stream` to the contract: ```python @hookimpl def register_output_renderer(datasette): return { ""extension"": ""tsv"", ""render"": render_tsv, ""can_render"": lambda: True, ""can_stream"": lambda: True } ``` When streaming, a new parameter could be passed to the render function - maybe `chunks` - which is an iterator/generator over a sequence of chunks of rows. Or it could use the existing `rows` parameter but treat that as an iterator?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",749283032,register_output_renderer() should support streaming data, https://github.com/simonw/sqlite-utils/issues/425#issuecomment-1101594549,https://api.github.com/repos/simonw/sqlite-utils/issues/425,1101594549,IC_kwDOCGYnMM5BqP-1,9599,simonw,2022-04-18T17:36:14Z,2022-04-18T17:36:14Z,OWNER,"Related: - #408","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1203842656,`sqlite3.NotSupportedError`: deterministic=True requires SQLite 3.8.3 or higher, https://github.com/simonw/datasette/issues/1713#issuecomment-1098628334,https://api.github.com/repos/simonw/datasette/issues/1713,1098628334,IC_kwDOBm6k_c5Be7zu,9599,simonw,2022-04-14T01:43:00Z,2022-04-14T01:43:13Z,OWNER,"Current workaround for fast publishing to S3: datasette fixtures.db --get /fixtures/facetable.json | \ 
s3-credentials put-object my-bucket facetable.json -","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1203943272,Datasette feature for publishing snapshots of query results, https://github.com/simonw/sqlite-utils/issues/421#issuecomment-1098548931,https://api.github.com/repos/simonw/sqlite-utils/issues/421,1098548931,IC_kwDOCGYnMM5BeobD,9599,simonw,2022-04-13T22:41:59Z,2022-04-13T22:41:59Z,OWNER,"I'm going to close this ticket since it looks like this is a bug in the way the Dockerfile builds Python, but I'm going to ship a fix for that issue I found so the `LD_PRELOAD` workaround above should work OK with the next release of `sqlite-utils`. Thanks for the detailed bug report!","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1180427792,"""Error: near ""("": syntax error"" when using sqlite-utils indexes CLI", https://github.com/simonw/sqlite-utils/issues/424#issuecomment-1098548090,https://api.github.com/repos/simonw/sqlite-utils/issues/424,1098548090,IC_kwDOCGYnMM5BeoN6,9599,simonw,2022-04-13T22:40:15Z,2022-04-13T22:40:15Z,OWNER,"New error: ```pycon >>> from sqlite_utils import Database >>> db = Database(memory=True) >>> db[""foo""].create({}) Traceback (most recent call last): File """", line 1, in File ""/Users/simon/Dropbox/Development/sqlite-utils/sqlite_utils/db.py"", line 1465, in create self.db.create_table( File ""/Users/simon/Dropbox/Development/sqlite-utils/sqlite_utils/db.py"", line 885, in create_table sql = self.create_table_sql( File ""/Users/simon/Dropbox/Development/sqlite-utils/sqlite_utils/db.py"", line 771, in create_table_sql assert columns, ""Tables must have at least one column"" AssertionError: Tables must have at least one column ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 
0}",1200866134,Better error message if you try to create a table with no columns, https://github.com/simonw/sqlite-utils/issues/425#issuecomment-1098545390,https://api.github.com/repos/simonw/sqlite-utils/issues/425,1098545390,IC_kwDOCGYnMM5Benju,9599,simonw,2022-04-13T22:34:52Z,2022-04-13T22:34:52Z,OWNER,"That broke Python 3.7 because it doesn't support `deterministic=True` even being passed: > function takes at most 3 arguments (4 given)","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1203842656,`sqlite3.NotSupportedError`: deterministic=True requires SQLite 3.8.3 or higher, https://github.com/simonw/sqlite-utils/issues/425#issuecomment-1098537000,https://api.github.com/repos/simonw/sqlite-utils/issues/425,1098537000,IC_kwDOCGYnMM5Belgo,9599,simonw,2022-04-13T22:18:22Z,2022-04-13T22:18:22Z,OWNER,"I figured out a workaround in https://github.com/simonw/sqlite-utils/issues/421#issuecomment-1098535531 The current `register(fn)` method looks like this: https://github.com/simonw/sqlite-utils/blob/95522ad919f96eb6cc8cd3cd30389b534680c717/sqlite_utils/db.py#L389-L403 This alternative implementation worked in the environment where that failed: ```python def register(fn): name = fn.__name__ arity = len(inspect.signature(fn).parameters) if not replace and (name, arity) in self._registered_functions: return fn kwargs = {} done = False if deterministic: # Try this, but fall back if sqlite3.NotSupportedError try: self.conn.create_function(name, arity, fn, **dict(kwargs, deterministic=True)) done = True except sqlite3.NotSupportedError: pass if not done: self.conn.create_function(name, arity, fn, **kwargs) self._registered_functions.add((name, arity)) return fn ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1203842656,`sqlite3.NotSupportedError`: deterministic=True requires SQLite 3.8.3 or higher, 
https://github.com/simonw/sqlite-utils/issues/421#issuecomment-1098535531,https://api.github.com/repos/simonw/sqlite-utils/issues/421,1098535531,IC_kwDOCGYnMM5BelJr,9599,simonw,2022-04-13T22:15:48Z,2022-04-13T22:15:48Z,OWNER,"Trying this alternative implementation of the `register()` method: ```python def register(fn): name = fn.__name__ arity = len(inspect.signature(fn).parameters) if not replace and (name, arity) in self._registered_functions: return fn kwargs = {} done = False if deterministic: # Try this, but fall back if sqlite3.NotSupportedError try: self.conn.create_function(name, arity, fn, **dict(kwargs, deterministic=True)) done = True except sqlite3.NotSupportedError: pass if not done: self.conn.create_function(name, arity, fn, **kwargs) self._registered_functions.add((name, arity)) return fn ``` With that fix, the following worked! ``` LD_PRELOAD=./build/sqlite-autoconf-3360000/.libs/libsqlite3.so sqlite-utils indexes /tmp/global.db --table table index_name seqno cid name desc coll key --------- -------------------------- ------- ----- ------- ------ ------ ----- countries idx_countries_country_name 0 1 country 0 BINARY 1 countries idx_countries_country_name 1 2 name 0 BINARY 1 ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1180427792,"""Error: near ""("": syntax error"" when using sqlite-utils indexes CLI", https://github.com/simonw/sqlite-utils/issues/421#issuecomment-1098532220,https://api.github.com/repos/simonw/sqlite-utils/issues/421,1098532220,IC_kwDOCGYnMM5BekV8,9599,simonw,2022-04-13T22:09:52Z,2022-04-13T22:09:52Z,OWNER,That error is weird - it's not supposed to happen according to this code here: https://github.com/simonw/sqlite-utils/blob/95522ad919f96eb6cc8cd3cd30389b534680c717/sqlite_utils/db.py#L389-L400,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 
0}",1180427792,"""Error: near ""("": syntax error"" when using sqlite-utils indexes CLI", https://github.com/simonw/sqlite-utils/issues/421#issuecomment-1098531354,https://api.github.com/repos/simonw/sqlite-utils/issues/421,1098531354,IC_kwDOCGYnMM5BekIa,9599,simonw,2022-04-13T22:08:20Z,2022-04-13T22:08:20Z,OWNER,"OK I figured out what's going on here. First I added an extra `print(sql)` statement to the `indexes` command to see what SQL it was running: ``` (app-root) sqlite-utils indexes global.db --table select sqlite_master.name as ""table"", indexes.name as index_name, xinfo.* from sqlite_master join pragma_index_list(sqlite_master.name) indexes join pragma_index_xinfo(index_name) xinfo where sqlite_master.type = 'table' and xinfo.key = 1 Error: near ""("": syntax error ``` This made me suspicious that the SQLite version being used here didn't support joining against the `pragma_index_list(...)` table-valued functions in that way. So I checked the version: ``` (app-root) sqlite3 SQLite version 3.36.0 2021-06-18 18:36:39 ``` That version should be fine - it's the one you compiled in the Dockerfile. Then I checked the version that `sqlite-utils` itself was using: ``` (app-root) sqlite-utils memory 'select sqlite_version()' [{""sqlite_version()"": ""3.7.17""}] ``` It's running SQLite 3.7.17! So the problem here is that the Python in that Docker image is running a very old version of SQLite. 
I tried using the trick in https://til.simonwillison.net/sqlite/ld-preload as a workaround, and it almost worked: ``` (app-root) python3 -c 'import sqlite3; print(sqlite3.connect("":memory"").execute(""select sqlite_version()"").fetchone())' ('3.7.17',) (app-root) LD_PRELOAD=./build/sqlite-autoconf-3360000/.libs/libsqlite3.so python3 -c 'import sqlite3; print(sqlite3.connect("":memory"").execute(""select sqlite_version()"").fetchone())' ('3.36.0',) ``` But when I try to run `sqlite-utils` like that I get an error: ``` (app-root) LD_PRELOAD=./build/sqlite-autoconf-3360000/.libs/libsqlite3.so sqlite-utils indexes /tmp/global.db ... File ""/opt/app-root/lib64/python3.8/site-packages/sqlite_utils/cli.py"", line 1624, in query db.register_fts4_bm25() File ""/opt/app-root/lib64/python3.8/site-packages/sqlite_utils/db.py"", line 412, in register_fts4_bm25 self.register_function(rank_bm25, deterministic=True) File ""/opt/app-root/lib64/python3.8/site-packages/sqlite_utils/db.py"", line 408, in register_function register(fn) File ""/opt/app-root/lib64/python3.8/site-packages/sqlite_utils/db.py"", line 401, in register self.conn.create_function(name, arity, fn, **kwargs) sqlite3.NotSupportedError: deterministic=True requires SQLite 3.8.3 or higher ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1180427792,"""Error: near ""("": syntax error"" when using sqlite-utils indexes CLI", https://github.com/simonw/sqlite-utils/issues/421#issuecomment-1098295517,https://api.github.com/repos/simonw/sqlite-utils/issues/421,1098295517,IC_kwDOCGYnMM5Bdqjd,9599,simonw,2022-04-13T17:16:20Z,2022-04-13T17:16:20Z,OWNER,"Aha! I was able to replicate the bug using your `Dockerfile` - thanks very much for providing that. ``` (app-root) sqlite-utils indexes global.db --table Error: near ""("": syntax error ``` (That was before I even ran the `extract` command.) 
To build your `Dockerfile` I copied it into an empty folder and ran the following: ``` wget https://www.sqlite.org/2021/sqlite-autoconf-3360000.tar.gz docker build . -t centos-sqlite-utils docker run -it centos-sqlite-utils /bin/bash ``` This gave me a shell in which I could replicate the bug.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1180427792,"""Error: near ""("": syntax error"" when using sqlite-utils indexes CLI", https://github.com/simonw/sqlite-utils/issues/421#issuecomment-1098288158,https://api.github.com/repos/simonw/sqlite-utils/issues/421,1098288158,IC_kwDOCGYnMM5Bdowe,9599,simonw,2022-04-13T17:07:53Z,2022-04-13T17:07:53Z,OWNER,"I can't replicate the bug I'm afraid: ``` % wget ""https://github.com/wri/global-power-plant-database/blob/232a6666/output_database/global_power_plant_database.csv?raw=true"" ... 2022-04-13 10:06:29 (8.97 MB/s) - ‘global_power_plant_database.csv?raw=true’ saved [8856038/8856038] % sqlite-utils insert global.db power_plants \ 'global_power_plant_database.csv?raw=true' --csv [------------------------------------] 0% [###################################-] 99% 00:00:00% % sqlite-utils indexes global.db --table table index_name seqno cid name desc coll key ------- ------------ ------- ----- ------ ------ ------ ----- % sqlite-utils extract global.db power_plants country country_long \ --table countries \ --fk-column country_id \ --rename country_long name % sqlite-utils indexes global.db --table table index_name seqno cid name desc coll key --------- -------------------------- ------- ----- ------- ------ ------ ----- countries idx_countries_country_name 0 1 country 0 BINARY 1 countries idx_countries_country_name 1 2 name 0 BINARY 1 ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1180427792,"""Error: near ""("": syntax error"" when using sqlite-utils 
indexes CLI", https://github.com/simonw/datasette/issues/1712#issuecomment-1097115034,https://api.github.com/repos/simonw/datasette/issues/1712,1097115034,IC_kwDOBm6k_c5BZKWa,9599,simonw,2022-04-12T19:12:21Z,2022-04-12T19:12:21Z,OWNER,Got a TIL out of this too: https://til.simonwillison.net/spatialite/gunion-to-combine-geometries,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1202227104,"Make """" easier to read", https://github.com/simonw/datasette/issues/1712#issuecomment-1097076622,https://api.github.com/repos/simonw/datasette/issues/1712,1097076622,IC_kwDOBm6k_c5BZA-O,9599,simonw,2022-04-12T18:42:04Z,2022-04-12T18:42:04Z,OWNER,I'm not going to show the tooltip if the formatted number is in bytes.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1202227104,"Make """" easier to read", https://github.com/simonw/datasette/issues/1712#issuecomment-1097068474,https://api.github.com/repos/simonw/datasette/issues/1712,1097068474,IC_kwDOBm6k_c5BY--6,9599,simonw,2022-04-12T18:38:18Z,2022-04-12T18:38:18Z,OWNER," ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1202227104,"Make """" easier to read", https://github.com/simonw/datasette/issues/1708#issuecomment-1095687566,https://api.github.com/repos/simonw/datasette/issues/1708,1095687566,IC_kwDOBm6k_c5BTt2O,9599,simonw,2022-04-11T23:24:30Z,2022-04-11T23:24:30Z,OWNER,"## Redesigned template context **Warning:** if you use any custom templates with your Datasette instance they are likely to break when you upgrade to 1.0. The template context has been redesigned to be based on the documented JSON API. 
This means that the template context can be considered stable going forward, so any custom templates you implement should continue to work when you upgrade Datasette in the future.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1200649124,Datasette 1.0 alpha upcoming release notes, https://github.com/simonw/datasette/issues/1705#issuecomment-1095673947,https://api.github.com/repos/simonw/datasette/issues/1705,1095673947,IC_kwDOBm6k_c5BTqhb,9599,simonw,2022-04-11T23:03:49Z,2022-04-11T23:03:49Z,OWNER,I'll also encourage testing against both Datasette 0.x and Datasette 1.0 using a GitHub Actions matrix.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1197926598,How to upgrade your plugin for 1.0 documentation, https://github.com/simonw/datasette/issues/1710#issuecomment-1095673670,https://api.github.com/repos/simonw/datasette/issues/1710,1095673670,IC_kwDOBm6k_c5BTqdG,9599,simonw,2022-04-11T23:03:25Z,2022-04-11T23:03:25Z,OWNER,"Dupe of: - #1705","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1200649889,Guide for plugin authors to upgrade their plugins for 1.0, https://github.com/simonw/datasette/issues/1709#issuecomment-1095671940,https://api.github.com/repos/simonw/datasette/issues/1709,1095671940,IC_kwDOBm6k_c5BTqCE,9599,simonw,2022-04-11T23:00:39Z,2022-04-11T23:01:41Z,OWNER,"- #262 - #782 - #1509","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1200649502,Redesigned JSON API with ?_extra= parameters, https://github.com/simonw/datasette/issues/1711#issuecomment-1095672127,https://api.github.com/repos/simonw/datasette/issues/1711,1095672127,IC_kwDOBm6k_c5BTqE_,9599,simonw,2022-04-11T23:00:58Z,2022-04-11T23:00:58Z,OWNER,- 
#1510,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1200650491,Template context powered entirely by the JSON API format, https://github.com/simonw/datasette/issues/1707#issuecomment-1095277937,https://api.github.com/repos/simonw/datasette/issues/1707,1095277937,IC_kwDOBm6k_c5BSJ1x,9599,simonw,2022-04-11T16:32:31Z,2022-04-11T16:33:00Z,OWNER,"That's a really interesting idea! That page is one of the least developed at the moment. There's plenty of room for it to grow new useful features. I like this suggestion because it feels like a good opportunity to introduce some unobtrusive JavaScript. Could use a details/summary element that uses `fetch()` to load in the extra data for example. Could even do something with the `` Web Component here... https://github.com/simonw/datasette-table","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1200224939,[feature] expanded detail page, https://github.com/simonw/datasette/issues/1706#issuecomment-1094152642,https://api.github.com/repos/simonw/datasette/issues/1706,1094152642,IC_kwDOBm6k_c5BN3HC,9599,simonw,2022-04-10T01:11:54Z,2022-04-10T01:11:54Z,OWNER,"This relates to this much larger vision: - #417 ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1198822563,"[feature] immutable mode for a directory, not just individual sqlite file", https://github.com/simonw/datasette/issues/1706#issuecomment-1094152173,https://api.github.com/repos/simonw/datasette/issues/1706,1094152173,IC_kwDOBm6k_c5BN2_t,9599,simonw,2022-04-10T01:08:50Z,2022-04-10T01:08:50Z,OWNER,This is a good idea - it matches the way `datasette .` works for mutable database files already.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 
0}",1198822563,"[feature] immutable mode for a directory, not just individual sqlite file", https://github.com/simonw/datasette/pull/1693#issuecomment-1093454899,https://api.github.com/repos/simonw/datasette/issues/1693,1093454899,IC_kwDOBm6k_c5BLMwz,9599,simonw,2022-04-08T23:07:04Z,2022-04-08T23:07:04Z,OWNER,"Tests failed here due to this issue: - https://github.com/psf/black/pull/2987 A future Black release should fix that.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1184850337,Bump black from 22.1.0 to 22.3.0, https://github.com/simonw/datasette/issues/1699#issuecomment-1092361727,https://api.github.com/repos/simonw/datasette/issues/1699,1092361727,IC_kwDOBm6k_c5BHB3_,9599,simonw,2022-04-08T01:47:43Z,2022-04-08T01:47:43Z,OWNER,"A render mode for that plugin hook that writes to a stream is exactly what I have in mind: - #1062 ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1193090967,Proposal: datasette query, https://github.com/simonw/datasette/issues/1699#issuecomment-1092321966,https://api.github.com/repos/simonw/datasette/issues/1699,1092321966,IC_kwDOBm6k_c5BG4Ku,9599,simonw,2022-04-08T00:20:32Z,2022-04-08T00:20:56Z,OWNER,"If we do this I'm keen to have it be more than just an alternative to the existing `sqlite-utils` command - especially since if I add `sqlite-utils` as a dependency of Datasette in the future that command will be installed as part of `pip install datasette` anyway. My best thought on how to differentiate them so far is plugins: if Datasette plugins that provide alternative outputs - like `.geojson` and `.yml` and suchlike - also work for the `datasette query` command that would make a lot of sense to me. 
One way that could work: a `--fmt geojson` option to this command which uses the plugin that was registered for the specified extension.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1193090967,Proposal: datasette query, https://github.com/simonw/datasette/issues/1698#issuecomment-1086784547,https://api.github.com/repos/simonw/datasette/issues/1698,1086784547,IC_kwDOBm6k_c5AxwQj,9599,simonw,2022-04-03T06:10:24Z,2022-04-03T06:10:24Z,OWNER,Warning added here: https://docs.datasette.io/en/latest/publish.html#publishing-to-google-cloud-run,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1190828163,Add a warning about bots and Cloud Run, https://github.com/simonw/datasette/issues/1697#issuecomment-1085323192,https://api.github.com/repos/simonw/datasette/issues/1697,1085323192,IC_kwDOBm6k_c5AsLe4,9599,simonw,2022-04-01T02:01:51Z,2022-04-01T02:01:51Z,OWNER,"Huh, turns out `Request.fake()` wasn't yet documented.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1189113609,"`Request.fake(..., url_vars={})`", https://github.com/simonw/datasette/issues/1696#issuecomment-1083351437,https://api.github.com/repos/simonw/datasette/issues/1696,1083351437,IC_kwDOBm6k_c5AkqGN,9599,simonw,2022-03-30T16:20:49Z,2022-03-30T16:21:02Z,OWNER,"Maybe like this: ```html

283 rows where dcode = 3 (Human Related: Other)

```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1186696202,Show foreign key label when filtering, https://github.com/simonw/datasette/issues/1692#issuecomment-1082663746,https://api.github.com/repos/simonw/datasette/issues/1692,1082663746,IC_kwDOBm6k_c5AiCNC,9599,simonw,2022-03-30T06:14:39Z,2022-03-30T06:14:51Z,OWNER,"I like your design, though I think it should be `""nomodule"": True` for consistency with the other options. I think `""async"": True` is worth supporting too.","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1182227211,[plugins][feature request]: Support additional script tag attributes when loading custom JS, https://github.com/simonw/datasette/issues/1692#issuecomment-1082661795,https://api.github.com/repos/simonw/datasette/issues/1692,1082661795,IC_kwDOBm6k_c5AiBuj,9599,simonw,2022-03-30T06:11:41Z,2022-03-30T06:11:41Z,OWNER,This is a good idea.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1182227211,[plugins][feature request]: Support additional script tag attributes when loading custom JS, https://github.com/simonw/datasette/issues/1695#issuecomment-1082617386,https://api.github.com/repos/simonw/datasette/issues/1695,1082617386,IC_kwDOBm6k_c5Ah24q,9599,simonw,2022-03-30T04:46:18Z,2022-03-30T04:46:18Z,OWNER,"` selected = (column_qs, str(row[""value""])) in qs_pairs` is wrong.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1185868354,Option to un-filter facet not shown for `?col__exact=value`, 
https://github.com/simonw/datasette/issues/1695#issuecomment-1082617241,https://api.github.com/repos/simonw/datasette/issues/1695,1082617241,IC_kwDOBm6k_c5Ah22Z,9599,simonw,2022-03-30T04:45:55Z,2022-03-30T04:45:55Z,OWNER,"Relevant template: https://github.com/simonw/datasette/blob/e73fa72917ca28c152208d62d07a490c81cadf52/datasette/templates/table.html#L168-L172 Populated from here: https://github.com/simonw/datasette/blob/c496f2b663ff0cef908ffaaa68b8cb63111fb5f2/datasette/facets.py#L246-L253","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1185868354,Option to un-filter facet not shown for `?col__exact=value`, https://github.com/simonw/sqlite-utils/issues/420#issuecomment-1081047053,https://api.github.com/repos/simonw/sqlite-utils/issues/420,1081047053,IC_kwDOCGYnMM5Ab3gN,9599,simonw,2022-03-28T19:22:37Z,2022-03-28T19:22:37Z,OWNER,Wrote about this in my weeknotes: https://simonwillison.net/2022/Mar/28/datasette-auth0/#new-features-as-documentation,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1178546862,Document how to use a `--convert` function that runs initialization code first, https://github.com/simonw/sqlite-utils/issues/420#issuecomment-1080141111,https://api.github.com/repos/simonw/sqlite-utils/issues/420,1080141111,IC_kwDOCGYnMM5AYaU3,9599,simonw,2022-03-28T03:25:57Z,2022-03-28T03:54:37Z,OWNER,"So now this should solve your problem: ``` echo '[{""name"": ""notaword""}, {""name"": ""word""}] ' | python3 -m sqlite_utils insert listings.db listings - --convert ' import enchant d = enchant.Dict(""en_US"") def convert(row): global d row[""is_dictionary_word""] = d.check(row[""name""]) ' ```","{""total_count"": 1, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 1, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1178546862,Document how to use a `--convert` function that runs 
initialization code first, https://github.com/simonw/datasette/issues/1690#issuecomment-1079788375,https://api.github.com/repos/simonw/datasette/issues/1690,1079788375,IC_kwDOBm6k_c5AXENX,9599,simonw,2022-03-26T22:43:00Z,2022-03-26T22:43:00Z,OWNER,Then I can update this section of the documentation which currently recommends the above pattern: https://docs.datasette.io/en/stable/authentication.html#the-ds-actor-cookie,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1182141761,"Idea: `datasette.set_actor_cookie(response, actor)`", https://github.com/simonw/datasette/issues/1690#issuecomment-1079788346,https://api.github.com/repos/simonw/datasette/issues/1690,1079788346,IC_kwDOBm6k_c5AXEM6,9599,simonw,2022-03-26T22:42:40Z,2022-03-26T22:42:40Z,OWNER,"I don't want to do a `response.set_actor_cookie()` method because I like `Response` not to carry too many Datasette-specific features. So `datasette.set_actor_cookie(response, actor, expire_after=None)` would be a better place for this I think.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1182141761,"Idea: `datasette.set_actor_cookie(response, actor)`", https://github.com/simonw/datasette/issues/1689#issuecomment-1079779040,https://api.github.com/repos/simonw/datasette/issues/1689,1079779040,IC_kwDOBm6k_c5AXB7g,9599,simonw,2022-03-26T21:35:57Z,2022-03-26T21:35:57Z,OWNER,Fixed: https://docs.datasette.io/en/latest/internals.html#add-message-request-message-type-datasette-info,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1182065616,datasette.add_message() documentation is incorrect, 
https://github.com/simonw/datasette/issues/1688#issuecomment-1079582485,https://api.github.com/repos/simonw/datasette/issues/1688,1079582485,IC_kwDOBm6k_c5AWR8V,9599,simonw,2022-03-26T03:15:34Z,2022-03-26T03:15:34Z,OWNER,"Yup, you're right in what you figured out here: stand-alone plugins can't currently package static assets other than using the static folder. The `datasette-plugin` cookiecutter template should make creating a Python package pretty easy though: https://github.com/simonw/datasette-plugin You can run that yourself, or you can run it using this GitHub template repository: https://github.com/simonw/datasette-plugin-template-repository ","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1181432624,[plugins][documentation] Is it possible to serve per-plugin static folders when writing one-off (single file) plugins?, https://github.com/simonw/sqlite-utils/issues/417#issuecomment-1079441621,https://api.github.com/repos/simonw/sqlite-utils/issues/417,1079441621,IC_kwDOCGYnMM5AVvjV,9599,simonw,2022-03-25T21:18:37Z,2022-03-25T21:18:37Z,OWNER,Updated documentation: https://sqlite-utils.datasette.io/en/latest/cli.html#inserting-newline-delimited-json,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1175744654,insert fails on JSONL with whitespace, https://github.com/simonw/sqlite-utils/issues/421#issuecomment-1079407962,https://api.github.com/repos/simonw/sqlite-utils/issues/421,1079407962,IC_kwDOCGYnMM5AVnVa,9599,simonw,2022-03-25T20:25:10Z,2022-03-25T20:25:18Z,OWNER,"Can you share either your whole `global.db` table or a shrunk down example that illustrates the bug? 
My hunch is that you may have a table or column with a name that triggers the error.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1180427792,"""Error: near ""("": syntax error"" when using sqlite-utils indexes CLI", https://github.com/simonw/sqlite-utils/issues/422#issuecomment-1079406708,https://api.github.com/repos/simonw/sqlite-utils/issues/422,1079406708,IC_kwDOCGYnMM5AVnB0,9599,simonw,2022-03-25T20:23:21Z,2022-03-25T20:23:21Z,OWNER,"Fixing this would require a bump to 4.0 because it would break existing code. The alternative would be to introduce a new `ignore_nulls=True` parameter which users can change to `ignore_nulls=False`. Or come up with better wording for that.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1181236173,Reconsider not running convert functions against null values, https://github.com/simonw/sqlite-utils/issues/420#issuecomment-1079404281,https://api.github.com/repos/simonw/sqlite-utils/issues/420,1079404281,IC_kwDOCGYnMM5AVmb5,9599,simonw,2022-03-25T20:19:50Z,2022-03-25T20:19:50Z,OWNER,Now documented here: https://sqlite-utils.datasette.io/en/latest/cli.html#using-a-convert-function-to-execute-initialization,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1178546862,Document how to use a `--convert` function that runs initialization code first, https://github.com/simonw/sqlite-utils/issues/420#issuecomment-1079384771,https://api.github.com/repos/simonw/sqlite-utils/issues/420,1079384771,IC_kwDOCGYnMM5AVhrD,9599,simonw,2022-03-25T19:51:34Z,2022-03-25T19:53:01Z,OWNER,"This works: ``` % sqlite-utils insert dogs.db dogs dogs.json --convert ' import random print(""seeding"") random.seed(10) print(random.random()) def convert(row): global random print(row) row[""random_score""] = 
random.random() ' seeding 0.5714025946899135 {'id': 1, 'name': 'Cleo'} {'id': 2, 'name': 'Pancakes'} {'id': 3, 'name': 'New dog'} (sqlite-utils) sqlite-utils % sqlite-utils rows dogs.db dogs [{""id"": 1, ""name"": ""Cleo"", ""random_score"": 0.4288890546751146}, {""id"": 2, ""name"": ""Pancakes"", ""random_score"": 0.5780913011344704}, {""id"": 3, ""name"": ""New dog"", ""random_score"": 0.20609823213950174}] ``` Having to use `global random` inside the function is frustrating but apparently necessary. https://stackoverflow.com/a/56552138/6083","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1178546862,Document how to use a `--convert` function that runs initialization code first, https://github.com/simonw/sqlite-utils/issues/420#issuecomment-1079376283,https://api.github.com/repos/simonw/sqlite-utils/issues/420,1079376283,IC_kwDOCGYnMM5AVfmb,9599,simonw,2022-03-25T19:39:30Z,2022-03-25T19:43:35Z,OWNER,"Actually this doesn't work as I thought. This demo shows that the initialization code is run once per item, not a single time at the start of the run: ``` % sqlite-utils insert dogs.db dogs dogs.json --convert ' import random print(""seeding"") random.seed(10) print(random.random()) def convert(row): print(row) row[""random_score""] = random.random() ' seeding 0.5714025946899135 seeding 0.5714025946899135 seeding 0.5714025946899135 seeding 0.5714025946899135 ``` Also that `print(row)` line is not being printed anywhere that gets to the console for some reason. ... 
my mistake, that happened because I changed this line in order to try to get local imports to work: ```python try: exec(code, globals, locals) return globals[""convert""] except (AttributeError, SyntaxError, NameError, KeyError, TypeError): ``` It should be `locals[""convert""]`","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1178546862,Document how to use a `--convert` function that runs initialization code first, https://github.com/simonw/sqlite-utils/issues/420#issuecomment-1079243535,https://api.github.com/repos/simonw/sqlite-utils/issues/420,1079243535,IC_kwDOCGYnMM5AU_MP,9599,simonw,2022-03-25T17:25:12Z,2022-03-25T17:25:12Z,OWNER,"That documentation is split across a few places. This is the only bit that talks about `def convert()` pattern right now: - https://sqlite-utils.datasette.io/en/stable/cli.html#converting-data-in-columns But that's for `sqlite-utils convert` - the documentation for `sqlite-utils insert --convert` at https://sqlite-utils.datasette.io/en/stable/cli.html#applying-conversions-while-inserting-data doesn't mention it. Since both `sqlite-utils convert` and `sqlite-utils insert --convert` apply the same rules to the code, they should link to a shared explanation in the documentation.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1178546862,Document how to use a `--convert` function that runs initialization code first, https://github.com/simonw/sqlite-utils/issues/420#issuecomment-1078343231,https://api.github.com/repos/simonw/sqlite-utils/issues/420,1078343231,IC_kwDOCGYnMM5ARjY_,9599,simonw,2022-03-24T21:16:10Z,2022-03-24T21:17:20Z,OWNER,"Aha! 
This may be possible already: https://github.com/simonw/sqlite-utils/blob/396f80fcc60da8dd844577114f7920830a2e5403/sqlite_utils/utils.py#L311-L316 And yes, this does indeed work - you can do something like this: ``` echo '{""name"": ""harry""}' | sqlite-utils insert db.db people - --convert ' import time # Simulate something expensive time.sleep(1) def convert(row): row[""upper""] = row[""name""].upper() ' ``` And after running that: ``` sqlite-utils dump db.db BEGIN TRANSACTION; CREATE TABLE [people] ( [name] TEXT, [upper] TEXT ); INSERT INTO ""people"" VALUES('harry','HARRY'); COMMIT; ``` So this is a documentation issue - there's a trick for it but I didn't know what the trick was!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1178546862,Document how to use a `--convert` function that runs initialization code first, https://github.com/simonw/sqlite-utils/issues/420#issuecomment-1078328774,https://api.github.com/repos/simonw/sqlite-utils/issues/420,1078328774,IC_kwDOCGYnMM5ARf3G,9599,simonw,2022-03-24T21:12:33Z,2022-03-24T21:12:33Z,OWNER,"Here's how the `_compile_code()` mechanism works at the moment: https://github.com/simonw/sqlite-utils/blob/396f80fcc60da8dd844577114f7920830a2e5403/sqlite_utils/utils.py#L308-L342 At the end it does this: ```python return locals[""fn""] ``` So it's already building and then returning a function. 
The question is if there's a sensible way to allow people to further customize that function by executing some code first, in a way that's easy to explain.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1178546862,Document how to use a `--convert` function that runs initialization code first, https://github.com/simonw/sqlite-utils/issues/420#issuecomment-1078322301,https://api.github.com/repos/simonw/sqlite-utils/issues/420,1078322301,IC_kwDOCGYnMM5AReR9,9599,simonw,2022-03-24T21:10:52Z,2022-03-24T21:10:52Z,OWNER,"I can think of three ways forward: - Figure out a pattern that gets that local file import workaround to work - Add another option such as `--convert-init` that lets you pass code that will be executed once at the start - Come up with a pattern where the `--convert` code can run some initialization code and then return a function which will be called against each value I quite like the idea of that third option - I'm going to prototype it and see if I can work something out.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1178546862,Document how to use a `--convert` function that runs initialization code first, https://github.com/simonw/sqlite-utils/issues/420#issuecomment-1078315922,https://api.github.com/repos/simonw/sqlite-utils/issues/420,1078315922,IC_kwDOCGYnMM5ARcuS,9599,simonw,2022-03-24T21:09:27Z,2022-03-24T21:09:27Z,OWNER,"Yeah, this is WAY harder than it should be. There's a clumsy workaround you could use which looks something like this: create a file `my_enchant.py` containing: ```python import enchant d = enchant.Dict(""en_US"") def check(word): return d.check(word) ``` Then run `sqlite-utils` like this: ``` PYTHONPATH=. 
cat items.json | jq '.data' | sqlite-utils insert listings.db listings - --convert 'my_enchant.check(value)' --import my_enchant ``` Except I tried that and it doesn't work! I don't know the right pattern for getting `--import` to work with modules in the same directory. So yeah, this is definitely a big feature gap.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1178546862,Document how to use a `--convert` function that runs initialization code first, https://github.com/simonw/datasette/issues/1682#issuecomment-1076696791,https://api.github.com/repos/simonw/datasette/issues/1682,1076696791,IC_kwDOBm6k_c5ALRbX,9599,simonw,2022-03-23T18:45:49Z,2022-03-23T18:45:49Z,OWNER,"The problem is here in `QueryView`: https://github.com/simonw/datasette/blob/d7c793d7998388d915f8d270079c68a77a785051/datasette/views/database.py#L206-L238 It should be resolving `database` based on the route path, as seen in other methods like this one: https://github.com/simonw/datasette/blob/d7c793d7998388d915f8d270079c68a77a785051/datasette/views/table.py#L270-L279 ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1178521513,SQL queries against databases with different routes are broken, https://github.com/simonw/datasette/issues/1670#issuecomment-1076683297,https://api.github.com/repos/simonw/datasette/issues/1670,1076683297,IC_kwDOBm6k_c5ALOIh,9599,simonw,2022-03-23T18:32:32Z,2022-03-23T18:32:32Z,OWNER,Added this to news on https://datasette.io/ https://github.com/simonw/datasette.io/commit/fd3ec57cdd5b935f75cbf52a86b3aabf2c97d217,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1174423568,Ship Datasette 0.61, 
https://github.com/simonw/datasette/issues/1670#issuecomment-1076666293,https://api.github.com/repos/simonw/datasette/issues/1670,1076666293,IC_kwDOBm6k_c5ALJ-1,9599,simonw,2022-03-23T18:16:29Z,2022-03-23T18:16:29Z,OWNER,https://docs.datasette.io/en/stable/changelog.html#v0-61,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1174423568,Ship Datasette 0.61, https://github.com/simonw/datasette/issues/1670#issuecomment-1076665837,https://api.github.com/repos/simonw/datasette/issues/1670,1076665837,IC_kwDOBm6k_c5ALJ3t,9599,simonw,2022-03-23T18:16:01Z,2022-03-23T18:16:01Z,OWNER,"https://github.com/simonw/datasette/releases/tag/0.61 ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1174423568,Ship Datasette 0.61, https://github.com/simonw/datasette/issues/1670#issuecomment-1076652046,https://api.github.com/repos/simonw/datasette/issues/1670,1076652046,IC_kwDOBm6k_c5ALGgO,9599,simonw,2022-03-23T18:02:30Z,2022-03-23T18:02:30Z,OWNER,"Two new things to add to the release notes from https://github.com/simonw/datasette/compare/0.61a0...main - https://github.com/simonw/datasette/issues/1678 - https://github.com/simonw/datasette/issues/1675 (now also a documented API)","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1174423568,Ship Datasette 0.61, https://github.com/simonw/datasette/issues/1670#issuecomment-1076647495,https://api.github.com/repos/simonw/datasette/issues/1670,1076647495,IC_kwDOBm6k_c5ALFZH,9599,simonw,2022-03-23T17:58:16Z,2022-03-23T17:58:16Z,OWNER,"I think the release notes are fine, but they need an opening paragraph highlighting the changes that are most likely to break backwards compatibility.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 
0, ""eyes"": 0}",1174423568,Ship Datasette 0.61, https://github.com/simonw/datasette/pull/1574#issuecomment-1076645636,https://api.github.com/repos/simonw/datasette/issues/1574,1076645636,IC_kwDOBm6k_c5ALE8E,9599,simonw,2022-03-23T17:56:35Z,2022-03-23T17:56:35Z,OWNER,I'd actually like to switch to slim as the default - I think Datasette should ship the smallest possible container that can still support extra packages being installed using `apt-get install`.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1084193403,introduce new option for datasette package to use a slim base image, https://github.com/simonw/datasette/pull/1665#issuecomment-1076644362,https://api.github.com/repos/simonw/datasette/issues/1665,1076644362,IC_kwDOBm6k_c5ALEoK,9599,simonw,2022-03-23T17:55:39Z,2022-03-23T17:55:39Z,OWNER,Thanks for the PR - I spotted an error about this and went through and fixed this in all of my repos the other day: https://github.com/search?o=desc&q=user%3Asimonw+google-github-actions%2Fsetup-gcloud%40v0&s=committer-date&type=Commits,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1173828092,Pin setup-gcloud to v0 instead of master, https://github.com/simonw/datasette/issues/1670#issuecomment-1076638278,https://api.github.com/repos/simonw/datasette/issues/1670,1076638278,IC_kwDOBm6k_c5ALDJG,9599,simonw,2022-03-23T17:50:55Z,2022-03-23T17:50:55Z,OWNER,"Release notes are mostly written for the alpha, just need to clean them up a bit https://github.com/simonw/datasette/blob/c4c9dbd0386e46d2bf199f0ed34e4895c98cb78c/docs/changelog.rst#061a0-2022-03-19","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1174423568,Ship Datasette 0.61, 
https://github.com/simonw/datasette/issues/1681#issuecomment-1075438684,https://api.github.com/repos/simonw/datasette/issues/1681,1075438684,IC_kwDOBm6k_c5AGeRc,9599,simonw,2022-03-22T17:45:50Z,2022-03-22T17:49:09Z,OWNER,"I would expect this to break against SQL views that include calculated columns though - something like this: ```sql create view this_will_break as select pk + 1 as pk_plus_one, 0.5 as score from searchable; ``` Confirmed: the filter interface for that view plain doesn't work for any comparison against that table - except for `score > 0` since `0` is converted to an integer. `0.1` breaks though because it doesn't get converted as it doesn't match `.isdigit()`.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1177101697,Potential bug in numeric handling where_clause for filters, https://github.com/simonw/datasette/issues/1681#issuecomment-1075437598,https://api.github.com/repos/simonw/datasette/issues/1681,1075437598,IC_kwDOBm6k_c5AGeAe,9599,simonw,2022-03-22T17:44:42Z,2022-03-22T17:45:04Z,OWNER,"My hunch is that this mechanism doesn't actually do anything useful at all, because of the type conversion that automatically happens for data from tables based on the column type affinities, see: - #1671 So either remove the `self.numeric` type conversion bit entirely, or prove that it is necessary and upgrade it to be able to handle floating point values too.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1177101697,Potential bug in numeric handling where_clause for filters, https://github.com/simonw/datasette/issues/1671#issuecomment-1075432283,https://api.github.com/repos/simonw/datasette/issues/1671,1075432283,IC_kwDOBm6k_c5AGctb,9599,simonw,2022-03-22T17:39:04Z,2022-03-22T17:43:12Z,OWNER,"Note that Datasette does already have special logic to convert parameters to integers for 
numeric comparisons like `>`: https://github.com/simonw/datasette/blob/c4c9dbd0386e46d2bf199f0ed34e4895c98cb78c/datasette/filters.py#L203-L212 Though... it looks like there's a bug in that? It doesn't account for `float` values - `""3.5"".isdigit()` returns `False` - probably for the best, because `int(3.5)` would break that value anyway.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1174655187,Filters fail to work correctly against calculated numeric columns returned by SQL views because type affinity rules do not apply, https://github.com/simonw/datasette/issues/1671#issuecomment-1075435185,https://api.github.com/repos/simonw/datasette/issues/1671,1075435185,IC_kwDOBm6k_c5AGdax,9599,simonw,2022-03-22T17:42:09Z,2022-03-22T17:42:09Z,OWNER,"Also made me realize that this query: ```sql select * from sortable where sortable > :p0 ``` Only works here thanks to the column affinity thing kicking in too: https://latest.datasette.io/fixtures?sql=select+*+from+sortable+where+sortable+%3E+%3Ap0&p0=70","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1174655187,Filters fail to work correctly against calculated numeric columns returned by SQL views because type affinity rules do not apply, https://github.com/simonw/datasette/issues/1671#issuecomment-1075428030,https://api.github.com/repos/simonw/datasette/issues/1671,1075428030,IC_kwDOBm6k_c5AGbq-,9599,simonw,2022-03-22T17:34:30Z,2022-03-22T17:34:30Z,OWNER,"No, I think I need to use `cast` - I can't think of any way to ask SQLite ""for this query, what types are the columns that will come back from it?"" Even the details from the `explain` trick explored in #1293 don't seem to come back with column type information: 
https://latest.datasette.io/fixtures?sql=explain+select+pk%2C+text1%2C+text2%2C+[name+with+.+and+spaces]+from+searchable_view+where+%22pk%22+%3D+%3Ap0&p0=1","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1174655187,Filters fail to work correctly against calculated numeric columns returned by SQL views because type affinity rules do not apply, https://github.com/simonw/datasette/issues/1671#issuecomment-1075425513,https://api.github.com/repos/simonw/datasette/issues/1671,1075425513,IC_kwDOBm6k_c5AGbDp,9599,simonw,2022-03-22T17:31:53Z,2022-03-22T17:31:53Z,OWNER,"The alternative to using `cast` here would be for Datasette to convert the `""1""` to a `1` in Python code before passing it as a param. This feels a bit neater to me, but I still then need to solve the problem of how to identify the ""type"" of a column that I want to use in a query.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1174655187,Filters fail to work correctly against calculated numeric columns returned by SQL views because type affinity rules do not apply, https://github.com/simonw/datasette/issues/339#issuecomment-1074479932,https://api.github.com/repos/simonw/datasette/issues/339,1074479932,IC_kwDOBm6k_c5AC0M8,9599,simonw,2022-03-21T22:22:34Z,2022-03-21T22:22:34Z,OWNER,Closing this as obsolete since Datasette no longer uses Sanic.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",340396247,Expose SANIC_RESPONSE_TIMEOUT config option in a sensible way, https://github.com/simonw/datasette/issues/276#issuecomment-1074479768,https://api.github.com/repos/simonw/datasette/issues/276,1074479768,IC_kwDOBm6k_c5AC0KY,9599,simonw,2022-03-21T22:22:20Z,2022-03-21T22:22:20Z,OWNER,"I'm closing this issue because this is now solved by a number of neat 
plugins: - https://datasette.io/plugins/datasette-geojson-map shows the geometry from SpatiaLite columns on a map - https://datasette.io/plugins/datasette-leaflet-geojson can be used to display inline maps next to each column","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",324835838,Handle spatialite geometry columns better, https://github.com/simonw/datasette/issues/1671#issuecomment-1074478299,https://api.github.com/repos/simonw/datasette/issues/1671,1074478299,IC_kwDOBm6k_c5ACzzb,9599,simonw,2022-03-21T22:20:26Z,2022-03-21T22:20:26Z,OWNER,"Thinking about options for fixing this... The following query works fine: ```sql select * from test_view where cast(has_expired as text) = '1' ``` I don't want to start using this for every query, because one of the goals of Datasette is to help people who are learning SQL: - #1613 If someone clicks on ""View and edit SQL"" from a filtered table page I don't want them to have to wonder why that `cast` is there. But... for querying views, the `cast` turns out to be necessary. So one fix would be to get the SQL generating logic to use casts like this any time it is operating against a view. An even better fix would be to detect which columns in a view come from a table and which ones might not, and only use casts for the columns that aren't definitely from a table. 
The trick I was exploring here might be able to help with that: - #1293 ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1174655187,Filters fail to work correctly against calculated numeric columns returned by SQL views because type affinity rules do not apply, https://github.com/simonw/datasette/issues/1671#issuecomment-1074470568,https://api.github.com/repos/simonw/datasette/issues/1671,1074470568,IC_kwDOBm6k_c5ACx6o,9599,simonw,2022-03-21T22:11:14Z,2022-03-21T22:12:49Z,OWNER,"I wonder if this will be a problem with generated columns, or with SQLite strict tables? My hunch is that strict tables will continue to work without any changes, because https://www.sqlite.org/stricttables.html says nothing about their impact on comparison operations. I should test this to make absolutely sure though. Generated columns have a type, so my hunch is they will continue to work fine too.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1174655187,Filters fail to work correctly against calculated numeric columns returned by SQL views because type affinity rules do not apply, https://github.com/simonw/datasette/issues/1671#issuecomment-1074468450,https://api.github.com/repos/simonw/datasette/issues/1671,1074468450,IC_kwDOBm6k_c5ACxZi,9599,simonw,2022-03-21T22:08:35Z,2022-03-21T22:10:00Z,OWNER,"Relevant section of the SQLite documentation: [3.2. Affinity Of Expressions](https://www.sqlite.org/datatype3.html#affinity_of_expressions): > When an expression is a simple reference to a column of a real table (not a [VIEW](https://www.sqlite.org/lang_createview.html) or subquery) then the expression has the same affinity as the table column. In your example, `has_expired` is no longer a simple reference to a column of a real table, hence the bug. Then [4.2. 
Type Conversions Prior To Comparison](https://www.sqlite.org/datatype3.html#type_conversions_prior_to_comparison) fills in the rest: > SQLite may attempt to convert values between the storage classes INTEGER, REAL, and/or TEXT before performing a comparison. Whether or not any conversions are attempted before the comparison takes place depends on the type affinity of the operands. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1174655187,Filters fail to work correctly against calculated numeric columns returned by SQL views because type affinity rules do not apply, https://github.com/simonw/datasette/issues/1671#issuecomment-1074465536,https://api.github.com/repos/simonw/datasette/issues/1671,1074465536,IC_kwDOBm6k_c5ACwsA,9599,simonw,2022-03-21T22:04:31Z,2022-03-21T22:04:31Z,OWNER,"Oh this is fascinating! I replicated the bug (thanks for the steps to reproduce) and it looks like this is down to the following: Against views, `where has_expired = 1` returns different results from `where has_expired = '1'` This doesn't happen against tables because of SQLite's [type affinity](https://www.sqlite.org/datatype3.html#type_affinity) mechanism, which handles the type conversion automatically.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1174655187,Filters fail to work correctly against calculated numeric columns returned by SQL views because type affinity rules do not apply, https://github.com/simonw/datasette/issues/1679#issuecomment-1074459746,https://api.github.com/repos/simonw/datasette/issues/1679,1074459746,IC_kwDOBm6k_c5ACvRi,9599,simonw,2022-03-21T21:55:45Z,2022-03-21T21:55:45Z,OWNER,I'm going to change the original logic to set n=1 for times that are `<= 20ms` - and update the comments to make it more obvious what is happening.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, 
""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1175854982,Research: how much overhead does the n=1 time limit have?, https://github.com/simonw/datasette/issues/1679#issuecomment-1074458506,https://api.github.com/repos/simonw/datasette/issues/1679,1074458506,IC_kwDOBm6k_c5ACu-K,9599,simonw,2022-03-21T21:53:47Z,2022-03-21T21:53:47Z,OWNER,"Oh interesting, it turns out there is ONE place in the code that sets the `ms` to less than 20 - this test fixture: https://github.com/simonw/datasette/blob/4e47a2d894b96854348343374c8e97c9d7055cf6/tests/fixtures.py#L224-L226","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1175854982,Research: how much overhead does the n=1 time limit have?, https://github.com/simonw/datasette/issues/1679#issuecomment-1074454687,https://api.github.com/repos/simonw/datasette/issues/1679,1074454687,IC_kwDOBm6k_c5ACuCf,9599,simonw,2022-03-21T21:48:02Z,2022-03-21T21:48:02Z,OWNER,"Here's another microbenchmark that measures how many nanoseconds it takes to run 1,000 vmops: ```python import sqlite3 import time db = sqlite3.connect("":memory:"") i = 0 out = [] def count(): global i i += 1000 out.append(((i, time.perf_counter_ns()))) db.set_progress_handler(count, 1000) print(""Start:"", time.perf_counter_ns()) all = db.execute("""""" with recursive counter(x) as ( select 0 union select x + 1 from counter ) select * from counter limit 10000; """""").fetchall() print(""End:"", time.perf_counter_ns()) print() print(""So how long does it take to execute 1000 ops?"") prev_time_ns = None for i, time_ns in out: if prev_time_ns is not None: print(time_ns - prev_time_ns, ""ns"") prev_time_ns = time_ns ``` Running it: ``` % python nanobench.py Start: 330877620374821 End: 330877632515822 So how long does it take to execute 1000 ops? 
47290 ns 49573 ns 48226 ns 45674 ns 53238 ns 47313 ns 52346 ns 48689 ns 47092 ns 87596 ns 69999 ns 52522 ns 52809 ns 53259 ns 52478 ns 53478 ns 65812 ns ``` 87596ns is 0.087596ms - so even a measure rate of every 1000 ops is easily finely grained enough to capture differences of less than 0.1ms. If anything I could bump that default 1000 up - and I can definitely eliminate the `if ms < 50` branch entirely.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1175854982,Research: how much overhead does the n=1 time limit have?, https://github.com/simonw/datasette/issues/1679#issuecomment-1074446576,https://api.github.com/repos/simonw/datasette/issues/1679,1074446576,IC_kwDOBm6k_c5ACsDw,9599,simonw,2022-03-21T21:38:27Z,2022-03-21T21:38:27Z,OWNER,"OK here's a microbenchmark script: ```python import sqlite3 import timeit db = sqlite3.connect("":memory:"") db_with_progress_handler_1 = sqlite3.connect("":memory:"") db_with_progress_handler_1000 = sqlite3.connect("":memory:"") db_with_progress_handler_1.set_progress_handler(lambda: None, 1) db_with_progress_handler_1000.set_progress_handler(lambda: None, 1000) def execute_query(db): cursor = db.execute("""""" with recursive counter(x) as ( select 0 union select x + 1 from counter ) select * from counter limit 10000; """""") list(cursor.fetchall()) print(""Without progress_handler"") print(timeit.timeit(lambda: execute_query(db), number=100)) print(""progress_handler every 1000 ops"") print(timeit.timeit(lambda: execute_query(db_with_progress_handler_1000), number=100)) print(""progress_handler every 1 op"") print(timeit.timeit(lambda: execute_query(db_with_progress_handler_1), number=100)) ``` Results: ``` % python3 bench.py Without progress_handler 0.8789225700311363 progress_handler every 1000 ops 0.8829826560104266 progress_handler every 1 op 2.8892734259716235 ``` So running every 1000 ops makes almost no difference at all, but running 
every single op is a 3.2x performance degradation.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1175854982,Research: how much overhead does the n=1 time limit have?, https://github.com/simonw/datasette/issues/1679#issuecomment-1074439309,https://api.github.com/repos/simonw/datasette/issues/1679,1074439309,IC_kwDOBm6k_c5ACqSN,9599,simonw,2022-03-21T21:28:58Z,2022-03-21T21:28:58Z,OWNER,"David Raymond solved it there: https://sqlite.org/forum/forumpost/330c8532d8a88bcd > Don't forget to step through the results. All .execute() has done is prepared it. > > db.execute(query).fetchall() Sure enough, adding that gets the VM steps number up to 190,007 which is close enough that I'm happy.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1175854982,Research: how much overhead does the n=1 time limit have?, https://github.com/simonw/datasette/issues/1676#issuecomment-1074378472,https://api.github.com/repos/simonw/datasette/issues/1676,1074378472,IC_kwDOBm6k_c5ACbbo,9599,simonw,2022-03-21T20:18:10Z,2022-03-21T20:18:10Z,OWNER,Maybe there is a better name for this method that helps emphasize its cascading nature.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1175690070,"Reconsider ensure_permissions() logic, can it be less confusing?", https://github.com/simonw/datasette/issues/1679#issuecomment-1074347023,https://api.github.com/repos/simonw/datasette/issues/1679,1074347023,IC_kwDOBm6k_c5ACTwP,9599,simonw,2022-03-21T19:48:59Z,2022-03-21T19:48:59Z,OWNER,Posed a question about that here: https://sqlite.org/forum/forumpost/de9ff10fa7,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1175854982,Research: how much overhead does the n=1 time 
limit have?, https://github.com/simonw/datasette/issues/1679#issuecomment-1074341924,https://api.github.com/repos/simonw/datasette/issues/1679,1074341924,IC_kwDOBm6k_c5ACSgk,9599,simonw,2022-03-21T19:42:08Z,2022-03-21T19:42:08Z,OWNER,"Here's the Python-C implementation of `set_progress_handler`: https://github.com/python/cpython/blob/4674fd4e938eb4a29ccd5b12c15455bd2a41c335/Modules/_sqlite/connection.c#L1177-L1201 It calls `sqlite3_progress_handler(self->db, n, progress_callback, ctx);` https://www.sqlite.org/c3ref/progress_handler.html says: > The parameter N is the approximate number of [virtual machine instructions](https://www.sqlite.org/opcode.html) that are evaluated between successive invocations of the callback X So maybe VM-steps and virtual machine instructions are different things?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1175854982,Research: how much overhead does the n=1 time limit have?, https://github.com/simonw/datasette/issues/1679#issuecomment-1074337997,https://api.github.com/repos/simonw/datasette/issues/1679,1074337997,IC_kwDOBm6k_c5ACRjN,9599,simonw,2022-03-21T19:37:08Z,2022-03-21T19:37:08Z,OWNER,"This is weird: ```python import sqlite3 db = sqlite3.connect("":memory:"") i = 0 def count(): global i i += 1 db.set_progress_handler(count, 1) db.execute("""""" with recursive counter(x) as ( select 0 union select x + 1 from counter ) select * from counter limit 10000; """""") print(i) ``` Outputs `24`. But if you try the same thing in the SQLite console: ``` sqlite> .stats vmstep sqlite> with recursive counter(x) as ( ...> select 0 ...> union ...> select x + 1 from counter ...> ) ...> select * from counter limit 10000; ... 
VM-steps: 200007 ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1175854982,Research: how much overhead does the n=1 time limit have?, https://github.com/simonw/datasette/issues/1679#issuecomment-1074332718,https://api.github.com/repos/simonw/datasette/issues/1679,1074332718,IC_kwDOBm6k_c5ACQQu,9599,simonw,2022-03-21T19:31:10Z,2022-03-21T19:31:10Z,OWNER,How long does it take for SQLite to execute 1000 opcodes anyway?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1175854982,Research: how much overhead does the n=1 time limit have?, https://github.com/simonw/datasette/issues/1679#issuecomment-1074332325,https://api.github.com/repos/simonw/datasette/issues/1679,1074332325,IC_kwDOBm6k_c5ACQKl,9599,simonw,2022-03-21T19:30:44Z,2022-03-21T19:30:44Z,OWNER,So it looks like even for facet suggestion `n=1000` always - it's never reduced to `n=1`.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1175854982,Research: how much overhead does the n=1 time limit have?, https://github.com/simonw/datasette/issues/1679#issuecomment-1074331743,https://api.github.com/repos/simonw/datasette/issues/1679,1074331743,IC_kwDOBm6k_c5ACQBf,9599,simonw,2022-03-21T19:30:05Z,2022-03-21T19:30:05Z,OWNER,"https://github.com/simonw/datasette/blob/1a7750eb29fd15dd2eea3b9f6e33028ce441b143/datasette/app.py#L118-L122 sets it to 50ms for facet suggestion but that's not going to pass `ms < 50`: ```python Setting( ""facet_suggest_time_limit_ms"", 50, ""Time limit for calculating a suggested facet"", ), ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1175854982,Research: how much overhead does the n=1 time limit have?, 
https://github.com/simonw/datasette/issues/1660#issuecomment-1074321862,https://api.github.com/repos/simonw/datasette/issues/1660,1074321862,IC_kwDOBm6k_c5ACNnG,9599,simonw,2022-03-21T19:19:01Z,2022-03-21T19:19:01Z,OWNER,I've simplified this a ton now. I'm going to keep working on this in the long-term but I think this issue can be closed.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1170144879,Refactor and simplify Datasette routing and views, https://github.com/simonw/datasette/issues/1678#issuecomment-1074302559,https://api.github.com/repos/simonw/datasette/issues/1678,1074302559,IC_kwDOBm6k_c5ACI5f,9599,simonw,2022-03-21T19:04:03Z,2022-03-21T19:04:03Z,OWNER,Documentation: https://docs.datasette.io/en/latest/internals.html#await-check-visibility-actor-action-resource-none,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1175715988,Make `check_visibility()` a documented API, https://github.com/simonw/datasette/issues/1660#issuecomment-1074287177,https://api.github.com/repos/simonw/datasette/issues/1660,1074287177,IC_kwDOBm6k_c5ACFJJ,9599,simonw,2022-03-21T18:51:42Z,2022-03-21T18:51:42Z,OWNER,`BaseView` is looking a LOT slimmer now that I've moved all of the permissions stuff out of it.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1170144879,Refactor and simplify Datasette routing and views, https://github.com/simonw/sqlite-utils/issues/417#issuecomment-1074243540,https://api.github.com/repos/simonw/sqlite-utils/issues/417,1074243540,IC_kwDOCGYnMM5AB6fU,9599,simonw,2022-03-21T18:08:03Z,2022-03-21T18:08:03Z,OWNER,"I've not really thought about standards as much here as I should. It looks like there are two competing specs for newline-delimited JSON! 
http://ndjson.org/ is the one I've been using in `sqlite-utils` - and https://github.com/ndjson/ndjson-spec#31-serialization says: > The JSON texts MUST NOT contain newlines or carriage returns. https://jsonlines.org/ is the other one. It is slightly less clear, but it does say this: > 2. Each Line is a Valid JSON Value > > The most common values will be objects or arrays, but any JSON value is permitted. My interpretation of both of these is that newlines in the middle of a JSON object shouldn't be allowed. So what's `jq` doing here? It looks to me like that `jq` format is its own thing - it's not actually compatible with either of those two loose specs described above. The `jq` docs seem to call this ""whitespace-separated JSON"": https://stedolan.github.io/jq/manual/v1.6/#Invokingjq The thing I like about newline-delimited JSON is that it's really trivial to parse - loop through each line, run it through `json.loads()` and that's it. No need to try and unwrap JSON objects that might span multiple lines. Unless someone has written a robust Python implementation of a `jq`-compatible whitespace-separated JSON parser, I'm inclined to leave this as is. I'd be fine adding some documentation that helps point people towards `jq -c` though.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1175744654,insert fails on JSONL with whitespace, https://github.com/simonw/datasette/issues/1677#issuecomment-1074184240,https://api.github.com/repos/simonw/datasette/issues/1677,1074184240,IC_kwDOBm6k_c5ABsAw,9599,simonw,2022-03-21T17:20:17Z,2022-03-21T17:20:17Z,OWNER,"https://github.com/simonw/datasette/blob/e627510b760198ccedba9e5af47a771e847785c9/datasette/views/base.py#L69-L77 This is weirdly different from how `check_permissions()` used to work, in that it doesn't differentiate between `None` and `False`. 
https://github.com/simonw/datasette/blob/4a4164b81191dec35e423486a208b05a9edc65e4/datasette/views/base.py#L79-L103","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1175694248,Remove `check_permission()` from `BaseView`, https://github.com/simonw/datasette/issues/1676#issuecomment-1074180312,https://api.github.com/repos/simonw/datasette/issues/1676,1074180312,IC_kwDOBm6k_c5ABrDY,9599,simonw,2022-03-21T17:16:45Z,2022-03-21T17:16:45Z,OWNER,"When looking at this code earlier I assumed that the following would check each permission in turn and fail if any of them failed: ```python await self.ds.ensure_permissions( request.actor, [ (""view-table"", (database, table)), (""view-database"", database), ""view-instance"", ] ) ``` But it's not quite that simple: if any of them fail, it fails... but if an earlier one returns `True` the whole stack passes even if there would have been a failure later on! If that is indeed the right abstraction, I need to work to make the documentation as clear as possible.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1175690070,"Reconsider ensure_permissions() logic, can it be less confusing?", https://github.com/simonw/datasette/issues/1676#issuecomment-1074178865,https://api.github.com/repos/simonw/datasette/issues/1676,1074178865,IC_kwDOBm6k_c5ABqsx,9599,simonw,2022-03-21T17:15:27Z,2022-03-21T17:15:27Z,OWNER,This method here: https://github.com/simonw/datasette/blob/e627510b760198ccedba9e5af47a771e847785c9/datasette/app.py#L632-L664,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1175690070,"Reconsider ensure_permissions() logic, can it be less confusing?", 
https://github.com/simonw/datasette/issues/1675#issuecomment-1074177827,https://api.github.com/repos/simonw/datasette/issues/1675,1074177827,IC_kwDOBm6k_c5ABqcj,9599,simonw,2022-03-21T17:14:31Z,2022-03-21T17:14:31Z,OWNER,"Updated documentation: https://github.com/simonw/datasette/blob/e627510b760198ccedba9e5af47a771e847785c9/docs/internals.rst#await-ensure_permissionsactor-permissions > This method allows multiple permissions to be checked at once. It raises a `datasette.Forbidden` exception if any of the checks are denied before one of them is explicitly granted. > > This is useful when you need to check multiple permissions at once. For example, an actor should be able to view a table if either one of the following checks returns `True` or not a single one of them returns `False`: That's pretty hard to understand! I'm going to open a separate issue to reconsider if this is a useful enough abstraction given how confusing it is.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1175648453,Extract out `check_permissions()` from `BaseView, https://github.com/simonw/datasette/issues/1675#issuecomment-1074161523,https://api.github.com/repos/simonw/datasette/issues/1675,1074161523,IC_kwDOBm6k_c5ABmdz,9599,simonw,2022-03-21T16:59:55Z,2022-03-21T17:00:03Z,OWNER,Also calling that function `permissions_allowed()` is confusing because there is a plugin hook with a similar name already: https://docs.datasette.io/en/stable/plugin_hooks.html#permission-allowed-datasette-actor-action-resource,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1175648453,Extract out `check_permissions()` from `BaseView, 
https://github.com/simonw/datasette/issues/1675#issuecomment-1074158890,https://api.github.com/repos/simonw/datasette/issues/1675,1074158890,IC_kwDOBm6k_c5ABl0q,9599,simonw,2022-03-21T16:57:15Z,2022-03-21T16:57:15Z,OWNER,"Idea: `ds.permission_allowed()` continues to just return `True` or `False`. A new `ds.ensure_permissions(...)` method is added which raises a `Forbidden` exception if a check fails (hence the different name).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1175648453,Extract out `check_permissions()` from `BaseView, https://github.com/simonw/datasette/issues/1675#issuecomment-1074156779,https://api.github.com/repos/simonw/datasette/issues/1675,1074156779,IC_kwDOBm6k_c5ABlTr,9599,simonw,2022-03-21T16:55:08Z,2022-03-21T16:56:02Z,OWNER,"One benefit of the current design of `check_permissions` that raises an exception is that the exception includes information on WHICH of the permission checks failed. Returning just `True` or `False` loses that information. I could return an object which evaluates to `False` but also carries extra information? 
Bit weird, I've never seen anything like that in other Python code.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1175648453,Extract out `check_permissions()` from `BaseView, https://github.com/simonw/datasette/issues/1675#issuecomment-1074143209,https://api.github.com/repos/simonw/datasette/issues/1675,1074143209,IC_kwDOBm6k_c5ABh_p,9599,simonw,2022-03-21T16:46:05Z,2022-03-21T16:46:05Z,OWNER,"The other difference though is that `ds.permission_allowed(...)` works against an actor, while `check_permission()` works against a request (though just to access `request.actor`).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1175648453,Extract out `check_permissions()` from `BaseView, https://github.com/simonw/datasette/issues/1675#issuecomment-1074142617,https://api.github.com/repos/simonw/datasette/issues/1675,1074142617,IC_kwDOBm6k_c5ABh2Z,9599,simonw,2022-03-21T16:45:27Z,2022-03-21T16:45:27Z,OWNER,"Though at that point `check_permission` is such a light wrapper around `self.ds.permission_allowed()` that there's little point in it existing at all. So maybe `check_permissions()` becomes `ds.permissions_allowed()`. `permission_allowed()` vs. `permissions_allowed()` is a bit of a subtle naming difference, but I think it works.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1175648453,Extract out `check_permissions()` from `BaseView, https://github.com/simonw/datasette/issues/1675#issuecomment-1074141457,https://api.github.com/repos/simonw/datasette/issues/1675,1074141457,IC_kwDOBm6k_c5ABhkR,9599,simonw,2022-03-21T16:44:09Z,2022-03-21T16:44:09Z,OWNER,"A slightly odd thing about these methods is that they either fail silently or they raise a `Forbidden` exception. 
Maybe they should instead return `True` or `False` and the calling code could decide if it wants to raise the exception? That would make them more usable and a little less surprising.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1175648453,Extract out `check_permissions()` from `BaseView, https://github.com/simonw/datasette/issues/1660#issuecomment-1074136176,https://api.github.com/repos/simonw/datasette/issues/1660,1074136176,IC_kwDOBm6k_c5ABgRw,9599,simonw,2022-03-21T16:38:46Z,2022-03-21T16:38:46Z,OWNER,"I'm going to refactor this stuff out and document it so it can be easily used by plugins: https://github.com/simonw/datasette/blob/4a4164b81191dec35e423486a208b05a9edc65e4/datasette/views/base.py#L69-L103","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1170144879,Refactor and simplify Datasette routing and views, https://github.com/simonw/datasette/issues/526#issuecomment-1074019047,https://api.github.com/repos/simonw/datasette/issues/526,1074019047,IC_kwDOBm6k_c5ABDrn,9599,simonw,2022-03-21T15:09:56Z,2022-03-21T15:09:56Z,OWNER,I should research how much overhead creating a new connection costs - it may be that an easy way to solve this is to create a dedicated connection for the query and then close that connection at the end.,"{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",459882902,Stream all results for arbitrary SQL and canned queries, https://github.com/simonw/datasette/issues/1177#issuecomment-1074017633,https://api.github.com/repos/simonw/datasette/issues/1177,1074017633,IC_kwDOBm6k_c5ABDVh,9599,simonw,2022-03-21T15:08:51Z,2022-03-21T15:08:51Z,OWNER,"Related: - #1062 ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 
0}",780153562,Ability to stream all rows as newline-delimited JSON, https://github.com/simonw/sqlite-utils/issues/415#issuecomment-1073468996,https://api.github.com/repos/simonw/sqlite-utils/issues/415,1073468996,IC_kwDOCGYnMM4_-9ZE,9599,simonw,2022-03-21T04:14:42Z,2022-03-21T04:14:42Z,OWNER,"I can fix this like so: ``` % sqlite-utils convert demo.db demo foo '{""foo"": ""bar""}' --multi --dry-run abc --- becomes: {""foo"": ""bar""} Would affect 1 row ``` Diff is this: ```diff diff --git a/sqlite_utils/cli.py b/sqlite_utils/cli.py index 0cf0468..b2a0440 100644 --- a/sqlite_utils/cli.py +++ b/sqlite_utils/cli.py @@ -2676,7 +2676,10 @@ def convert( raise click.ClickException(str(e)) if dry_run: # Pull first 20 values for first column and preview them - db.conn.create_function(""preview_transform"", 1, lambda v: fn(v) if v else v) + preview = lambda v: fn(v) if v else v + if multi: + preview = lambda v: json.dumps(fn(v), default=repr) if v else v + db.conn.create_function(""preview_transform"", 1, preview) sql = """""" select [{column}] as value, ```","{""total_count"": 1, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 1, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1171599874,Convert with `--multi` and `--dry-run` flag does not work, https://github.com/simonw/sqlite-utils/issues/415#issuecomment-1073463375,https://api.github.com/repos/simonw/sqlite-utils/issues/415,1073463375,IC_kwDOCGYnMM4_-8BP,9599,simonw,2022-03-21T04:02:36Z,2022-03-21T04:02:36Z,OWNER,Thanks for the really clear steps to reproduce!,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1171599874,Convert with `--multi` and `--dry-run` flag does not work, https://github.com/simonw/sqlite-utils/issues/416#issuecomment-1073456222,https://api.github.com/repos/simonw/sqlite-utils/issues/416,1073456222,IC_kwDOCGYnMM4_-6Re,9599,simonw,2022-03-21T03:45:52Z,2022-03-21T03:45:52Z,OWNER,Needs tests and 
documentation.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1173023272,Options for how `r.parsedate()` should handle invalid dates, https://github.com/simonw/sqlite-utils/issues/416#issuecomment-1073456155,https://api.github.com/repos/simonw/sqlite-utils/issues/416,1073456155,IC_kwDOCGYnMM4_-6Qb,9599,simonw,2022-03-21T03:45:37Z,2022-03-21T03:45:37Z,OWNER,"Prototype: ```diff diff --git a/sqlite_utils/cli.py b/sqlite_utils/cli.py index 8255b56..0a3693e 100644 --- a/sqlite_utils/cli.py +++ b/sqlite_utils/cli.py @@ -2583,7 +2583,11 @@ def _generate_convert_help(): """""" ).strip() recipe_names = [ - n for n in dir(recipes) if not n.startswith(""_"") and n not in (""json"", ""parser"") + n + for n in dir(recipes) + if not n.startswith(""_"") + and n not in (""json"", ""parser"") + and callable(getattr(recipes, n)) ] for name in recipe_names: fn = getattr(recipes, name) diff --git a/sqlite_utils/recipes.py b/sqlite_utils/recipes.py index 6918661..569c30d 100644 --- a/sqlite_utils/recipes.py +++ b/sqlite_utils/recipes.py @@ -1,17 +1,38 @@ from dateutil import parser import json +IGNORE = object() +SET_NULL = object() -def parsedate(value, dayfirst=False, yearfirst=False): + +def parsedate(value, dayfirst=False, yearfirst=False, errors=None): ""Parse a date and convert it to ISO date format: yyyy-mm-dd"" - return ( - parser.parse(value, dayfirst=dayfirst, yearfirst=yearfirst).date().isoformat() - ) + try: + return ( + parser.parse(value, dayfirst=dayfirst, yearfirst=yearfirst) + .date() + .isoformat() + ) + except parser.ParserError: + if errors is IGNORE: + return value + elif errors is SET_NULL: + return None + else: + raise -def parsedatetime(value, dayfirst=False, yearfirst=False): +def parsedatetime(value, dayfirst=False, yearfirst=False, errors=None): ""Parse a datetime and convert it to ISO datetime format: yyyy-mm-ddTHH:MM:SS"" - return parser.parse(value, dayfirst=dayfirst, 
yearfirst=yearfirst).isoformat() + try: + return parser.parse(value, dayfirst=dayfirst, yearfirst=yearfirst).isoformat() + except parser.ParserError: + if errors is IGNORE: + return value + elif errors is SET_NULL: + return None + else: + raise def jsonsplit(value, delimiter="","", type=str): ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1173023272,Options for how `r.parsedate()` should handle invalid dates, https://github.com/simonw/sqlite-utils/issues/416#issuecomment-1073455905,https://api.github.com/repos/simonw/sqlite-utils/issues/416,1073455905,IC_kwDOCGYnMM4_-6Mh,9599,simonw,2022-03-21T03:44:47Z,2022-03-21T03:45:00Z,OWNER,"This is quite nice: ``` % sqlite-utils convert test-dates.db dates date ""r.parsedate(value, errors=r.IGNORE)"" [####################################] 100% % sqlite-utils rows test-dates.db dates [{""id"": 1, ""date"": ""2016-03-15""}, {""id"": 2, ""date"": ""2016-03-16""}, {""id"": 3, ""date"": ""2016-03-17""}, {""id"": 4, ""date"": ""2016-03-18""}, {""id"": 5, ""date"": ""2016-03-19""}, {""id"": 6, ""date"": ""2016-03-20""}, {""id"": 7, ""date"": ""2016-03-21""}, {""id"": 8, ""date"": ""2016-03-22""}, {""id"": 9, ""date"": ""2016-03-23""}, {""id"": 10, ""date"": ""//""}, {""id"": 11, ""date"": ""2016-03-25""}, {""id"": 12, ""date"": ""2016-03-26""}, {""id"": 13, ""date"": ""2016-03-27""}, {""id"": 14, ""date"": ""2016-03-28""}, {""id"": 15, ""date"": ""2016-03-29""}, {""id"": 16, ""date"": ""2016-03-30""}, {""id"": 17, ""date"": ""2016-03-31""}, {""id"": 18, ""date"": ""2016-04-01""}] % sqlite-utils convert test-dates.db dates date ""r.parsedate(value, errors=r.SET_NULL)"" [####################################] 100% % sqlite-utils rows test-dates.db dates [{""id"": 1, ""date"": ""2016-03-15""}, {""id"": 2, ""date"": ""2016-03-16""}, {""id"": 3, ""date"": ""2016-03-17""}, {""id"": 4, ""date"": ""2016-03-18""}, {""id"": 5, ""date"": ""2016-03-19""}, 
{""id"": 6, ""date"": ""2016-03-20""}, {""id"": 7, ""date"": ""2016-03-21""}, {""id"": 8, ""date"": ""2016-03-22""}, {""id"": 9, ""date"": ""2016-03-23""}, {""id"": 10, ""date"": null}, {""id"": 11, ""date"": ""2016-03-25""}, {""id"": 12, ""date"": ""2016-03-26""}, {""id"": 13, ""date"": ""2016-03-27""}, {""id"": 14, ""date"": ""2016-03-28""}, {""id"": 15, ""date"": ""2016-03-29""}, {""id"": 16, ""date"": ""2016-03-30""}, {""id"": 17, ""date"": ""2016-03-31""}, {""id"": 18, ""date"": ""2016-04-01""}] ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1173023272,Options for how `r.parsedate()` should handle invalid dates, https://github.com/simonw/sqlite-utils/issues/416#issuecomment-1073453370,https://api.github.com/repos/simonw/sqlite-utils/issues/416,1073453370,IC_kwDOCGYnMM4_-5k6,9599,simonw,2022-03-21T03:41:06Z,2022-03-21T03:41:06Z,OWNER,I'm going to try the `errors=r.IGNORE` option and see what that looks like once implemented.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1173023272,Options for how `r.parsedate()` should handle invalid dates, https://github.com/simonw/sqlite-utils/issues/416#issuecomment-1073453230,https://api.github.com/repos/simonw/sqlite-utils/issues/416,1073453230,IC_kwDOCGYnMM4_-5iu,9599,simonw,2022-03-21T03:40:37Z,2022-03-21T03:40:37Z,OWNER,"I think the options here should be: - On error, raise an exception and revert the transaction (the current default) - On error, leave the value as-is - On error, set the value to `None` These need to be indicated by parameters to the `r.parsedate()` function. Some design options: - `ignore=True` to ignore errors - but how does it know if it should leave the value or set it to `None`? This is similar to other `ignore=True` parameters elsewhere in the Python API. 
- `errors=""ignore""`, `errors=""set-null""` - I don't like magic string values very much, but this is similar to Python's `str.encode(errors=)` mechanism - `errors=r.IGNORE` - using constants, which at least avoids magic strings. The other one could be `errors=r.SET_NULL` - `error=lambda v: None` or `error=lambda v: v` - this is a bit confusing though, introducing another callback that gets to have a go at converting the error if the first callback failed? And what happens if that lambda itself raises an error?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1173023272,Options for how `r.parsedate()` should handle invalid dates, https://github.com/simonw/sqlite-utils/issues/416#issuecomment-1073451659,https://api.github.com/repos/simonw/sqlite-utils/issues/416,1073451659,IC_kwDOCGYnMM4_-5KL,9599,simonw,2022-03-21T03:35:01Z,2022-03-21T03:35:01Z,OWNER,"I confirmed that if it fails for any value ALL values are left alone, since it runs in a transaction. 
Here's the code that does that: https://github.com/simonw/sqlite-utils/blob/433813612ff9b4b501739fd7543bef0040dd51fe/sqlite_utils/db.py#L2523-L2526","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1173023272,Options for how `r.parsedate()` should handle invalid dates, https://github.com/simonw/sqlite-utils/issues/416#issuecomment-1073450588,https://api.github.com/repos/simonw/sqlite-utils/issues/416,1073450588,IC_kwDOCGYnMM4_-45c,9599,simonw,2022-03-21T03:32:58Z,2022-03-21T03:32:58Z,OWNER,"Then I ran this to convert `2016-03-27` etc to `2016/03/27` so I could see which ones were later converted: sqlite-utils convert test-dates.db dates date 'value.replace(""-"", ""/"")' ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1173023272,Options for how `r.parsedate()` should handle invalid dates, https://github.com/simonw/sqlite-utils/issues/416#issuecomment-1073448904,https://api.github.com/repos/simonw/sqlite-utils/issues/416,1073448904,IC_kwDOCGYnMM4_-4fI,9599,simonw,2022-03-21T03:28:12Z,2022-03-21T03:30:37Z,OWNER,"Generating a test database using a pattern from https://www.geekytidbits.com/date-range-table-sqlite/ ``` sqlite-utils create-database test-dates.db sqlite-utils create-table test-dates.db dates id integer date text --pk id sqlite-utils test-dates.db ""WITH RECURSIVE cnt(x) AS ( SELECT 0 UNION ALL SELECT x+1 FROM cnt LIMIT (SELECT ((julianday('2016-04-01') - julianday('2016-03-15'))) + 1) ) insert into dates (date) select date(julianday('2016-03-15'), '+' || x || ' days') as date FROM cnt;"" ``` After running that: ``` % sqlite-utils rows test-dates.db dates [{""id"": 1, ""date"": ""2016-03-15""}, {""id"": 2, ""date"": ""2016-03-16""}, {""id"": 3, ""date"": ""2016-03-17""}, {""id"": 4, ""date"": ""2016-03-18""}, {""id"": 5, ""date"": ""2016-03-19""}, {""id"": 6, ""date"": ""2016-03-20""}, 
{""id"": 7, ""date"": ""2016-03-21""}, {""id"": 8, ""date"": ""2016-03-22""}, {""id"": 9, ""date"": ""2016-03-23""}, {""id"": 10, ""date"": ""2016-03-24""}, {""id"": 11, ""date"": ""2016-03-25""}, {""id"": 12, ""date"": ""2016-03-26""}, {""id"": 13, ""date"": ""2016-03-27""}, {""id"": 14, ""date"": ""2016-03-28""}, {""id"": 15, ""date"": ""2016-03-29""}, {""id"": 16, ""date"": ""2016-03-30""}, {""id"": 17, ""date"": ""2016-03-31""}, {""id"": 18, ""date"": ""2016-04-01""}] ``` Then to make one of them invalid: sqlite-utils test-dates.db ""update dates set date = '//' where id = 10""","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1173023272,Options for how `r.parsedate()` should handle invalid dates, https://github.com/simonw/datasette/issues/1510#issuecomment-1073366630,https://api.github.com/repos/simonw/datasette/issues/1510,1073366630,IC_kwDOBm6k_c4_-kZm,9599,simonw,2022-03-20T22:59:33Z,2022-03-20T22:59:33Z,OWNER,"I really like the idea of making this effectively the same thing as the fully documented, stable JSON API that comes as part of 1.0. 
If you want to know what will be available to your templates, consult the API documentation.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1054244712,Datasette 1.0 documented template context (maybe via API docs), https://github.com/simonw/datasette/issues/1674#issuecomment-1073366436,https://api.github.com/repos/simonw/datasette/issues/1674,1073366436,IC_kwDOBm6k_c4_-kWk,9599,simonw,2022-03-20T22:58:40Z,2022-03-20T22:58:40Z,OWNER,"This will probably happen as part of turning this into an officially documented API that serves the template context for the homepage: - #1510","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1174717287,Tweak design of /.json, https://github.com/simonw/datasette/issues/1355#issuecomment-1073362979,https://api.github.com/repos/simonw/datasette/issues/1355,1073362979,IC_kwDOBm6k_c4_-jgj,9599,simonw,2022-03-20T22:38:53Z,2022-03-20T22:38:53Z,OWNER,"Built a research prototype: ```diff diff --git a/datasette/app.py b/datasette/app.py index 5c8101a..5cd3e63 100644 --- a/datasette/app.py +++ b/datasette/app.py @@ -1,6 +1,7 @@ import asyncio import asgi_csrf import collections +import contextlib import datetime import functools import glob @@ -1490,3 +1491,11 @@ class DatasetteClient: return await client.request( method, self._fix(path, avoid_path_rewrites), **kwargs ) + + @contextlib.asynccontextmanager + async def stream(self, method, path, **kwargs): + async with httpx.AsyncClient(app=self.app) as client: + print(""async with as client"") + async with client.stream(method, self._fix(path), **kwargs) as response: + print(""async with client.stream about to yield response"") + yield response diff --git a/datasette/cli.py b/datasette/cli.py index 3c6e1b2..3025ead 100644 --- a/datasette/cli.py +++ b/datasette/cli.py @@ -585,11 +585,19 @@ def serve( 
asyncio.get_event_loop().run_until_complete(check_databases(ds)) if get: - client = TestClient(ds) - response = client.get(get) - click.echo(response.text) - exit_code = 0 if response.status == 200 else 1 - sys.exit(exit_code) + + async def _run_get(): + print(""_run_get"") + async with ds.client.stream(""GET"", get) as response: + print(""Got response:"", response) + async for chunk in response.aiter_bytes(chunk_size=1024): + print("" chunk"") + sys.stdout.buffer.write(chunk) + sys.stdout.buffer.flush() + exit_code = 0 if response.status_code == 200 else 1 + sys.exit(exit_code) + + asyncio.get_event_loop().run_until_complete(_run_get()) return # Start the server ``` But for some reason it didn't appear to stream out the response - it would print this out: ``` % datasette covid.db --get '/covid/ny_times_us_counties.csv?_size=10&_stream=on' _run_get async with as client ``` And then hang. I would expect it to start printing out chunks of CSV data here, but instead it looks like it waited for everything to be generated before returning anything to the console. No idea why. I dropped this for the moment.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",910088936,datasette --get should efficiently handle streaming CSV, https://github.com/simonw/datasette/issues/1673#issuecomment-1073361986,https://api.github.com/repos/simonw/datasette/issues/1673,1073361986,IC_kwDOBm6k_c4_-jRC,9599,simonw,2022-03-20T22:31:41Z,2022-03-20T22:34:06Z,OWNER,"Maybe it's because `supports_table_xinfo()` creates a brand new in-memory SQLite connection every time you call it? 
https://github.com/simonw/datasette/blob/798f075ef9b98819fdb564f9f79c78975a0f71e8/datasette/utils/sqlite.py#L22-L35 Actually no, I'm caching that already: https://github.com/simonw/datasette/blob/798f075ef9b98819fdb564f9f79c78975a0f71e8/datasette/utils/sqlite.py#L12-L19","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1174708375,Streaming CSV spends a lot of time in `table_column_details`, https://github.com/simonw/datasette/issues/1672#issuecomment-1073355818,https://api.github.com/repos/simonw/datasette/issues/1672,1073355818,IC_kwDOBm6k_c4_-hwq,9599,simonw,2022-03-20T21:52:38Z,2022-03-20T21:52:38Z,OWNER,"That means taking on these issues: - https://github.com/simonw/datasette/issues/1101 - https://github.com/simonw/datasette/issues/1096 - https://github.com/simonw/datasette/issues/1062","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1174697144,Refactor CSV handling code out of DataView, https://github.com/simonw/datasette/issues/1660#issuecomment-1073355032,https://api.github.com/repos/simonw/datasette/issues/1660,1073355032,IC_kwDOBm6k_c4_-hkY,9599,simonw,2022-03-20T21:46:43Z,2022-03-20T21:46:43Z,OWNER,I think the way to get rid of most of the remaining complexity in `DataView` is to refactor how CSV stuff works - pulling it in line with other export factors and extracting the streaming mechanism. 
Opening a fresh issue for that.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1170144879,Refactor and simplify Datasette routing and views, https://github.com/simonw/sqlite-utils/issues/140#issuecomment-1073330388,https://api.github.com/repos/simonw/sqlite-utils/issues/140,1073330388,IC_kwDOCGYnMM4_-bjU,9599,simonw,2022-03-20T19:44:39Z,2022-03-20T19:45:45Z,OWNER,"Alternative idea for specifying types: accept a Python expression, then use Python type literal syntax. For example: ``` sqlite-utils insert-files gifs.db images *.gif \ -c path -c md5 -c last_modified:mtime \ -a file_type '""gif""' ``` Where `-a` indicates an additional column.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",688351054,Idea: insert-files mechanism for adding extra columns with fixed values, https://github.com/simonw/datasette/issues/1669#issuecomment-1073143413,https://api.github.com/repos/simonw/datasette/issues/1669,1073143413,IC_kwDOBm6k_c4_9t51,9599,simonw,2022-03-20T01:24:36Z,2022-03-20T01:24:36Z,OWNER,https://github.com/simonw/datasette/releases/tag/0.61a0,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1174404647,Release 0.61 alpha, https://github.com/simonw/datasette/issues/1669#issuecomment-1073137170,https://api.github.com/repos/simonw/datasette/issues/1669,1073137170,IC_kwDOBm6k_c4_9sYS,9599,simonw,2022-03-20T00:35:52Z,2022-03-20T00:35:52Z,OWNER,https://github.com/simonw/datasette/compare/0.60.2...main,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1174404647,Release 0.61 alpha, 
https://github.com/simonw/datasette/issues/1668#issuecomment-1073136896,https://api.github.com/repos/simonw/datasette/issues/1668,1073136896,IC_kwDOBm6k_c4_9sUA,9599,simonw,2022-03-20T00:33:23Z,2022-03-20T00:33:23Z,OWNER,I'm going to release this as a 0.61 alpha so I can more easily depend on it from `datasette-hashed-urls`.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1174306154,"Introduce concept of a database `route`, separate from its name", https://github.com/simonw/datasette/issues/1668#issuecomment-1073136686,https://api.github.com/repos/simonw/datasette/issues/1668,1073136686,IC_kwDOBm6k_c4_9sQu,9599,simonw,2022-03-20T00:31:13Z,2022-03-20T00:31:13Z,OWNER,"That demo is now live: - https://latest.datasette.io/alternative-route - https://latest.datasette.io/alternative-route/attraction_characteristic","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1174306154,"Introduce concept of a database `route`, separate from its name", https://github.com/simonw/datasette/issues/1668#issuecomment-1073135433,https://api.github.com/repos/simonw/datasette/issues/1668,1073135433,IC_kwDOBm6k_c4_9r9J,9599,simonw,2022-03-20T00:20:36Z,2022-03-20T00:20:36Z,OWNER,"Building this plugin instantly revealed that all of the links - on the homepage and the database page and so on - are incorrect: ```python from datasette import hookimpl @hookimpl def startup(datasette): db = datasette.get_database(""fixtures2"") db.route = ""alternative-route"" ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1174306154,"Introduce concept of a database `route`, separate from its name", 
https://github.com/simonw/datasette/issues/1668#issuecomment-1073134816,https://api.github.com/repos/simonw/datasette/issues/1668,1073134816,IC_kwDOBm6k_c4_9rzg,9599,simonw,2022-03-20T00:16:22Z,2022-03-20T00:16:22Z,OWNER,I'm going to add a `fixtures2.db` database which has that as the name but `alternative-route` as the route. I'll set that up using a custom plugin in the `plugins/` folder that gets deployed by https://github.com/simonw/datasette/blob/main/.github/workflows/deploy-latest.yml,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1174306154,"Introduce concept of a database `route`, separate from its name", https://github.com/simonw/datasette/issues/1668#issuecomment-1073134206,https://api.github.com/repos/simonw/datasette/issues/1668,1073134206,IC_kwDOBm6k_c4_9rp-,9599,simonw,2022-03-20T00:12:03Z,2022-03-20T00:12:03Z,OWNER,I'd like to have a live demo of this up on `latest.datasette.io` too.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1174306154,"Introduce concept of a database `route`, separate from its name", https://github.com/simonw/datasette/issues/1668#issuecomment-1073126264,https://api.github.com/repos/simonw/datasette/issues/1668,1073126264,IC_kwDOBm6k_c4_9pt4,9599,simonw,2022-03-19T22:59:30Z,2022-03-19T22:59:30Z,OWNER,"Also need to update the `datasette.urls` methods that construct the URL to a database/table/row - they take the database name but they need to know to look for the route. 
Need to add tests that check the links in the HTML and can confirm this is working correctly.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1174306154,"Introduce concept of a database `route`, separate from its name", https://github.com/simonw/datasette/issues/1668#issuecomment-1073125334,https://api.github.com/repos/simonw/datasette/issues/1668,1073125334,IC_kwDOBm6k_c4_9pfW,9599,simonw,2022-03-19T22:53:55Z,2022-03-19T22:53:55Z,OWNER,"Need to update documentation in a few places - e.g. https://docs.datasette.io/en/stable/internals.html#remove-database-name > This removes a database that has been previously added. `name=` is the unique name of that database, used in its URL path.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1174306154,"Introduce concept of a database `route`, separate from its name", https://github.com/simonw/datasette/issues/1668#issuecomment-1073112104,https://api.github.com/repos/simonw/datasette/issues/1668,1073112104,IC_kwDOBm6k_c4_9mQo,9599,simonw,2022-03-19T21:08:21Z,2022-03-19T21:08:21Z,OWNER,"I think I've got this working but I need to write a test for it that covers the rare case when the route is not the same thing as the database name. I'll do that with a new test.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1174306154,"Introduce concept of a database `route`, separate from its name", https://github.com/simonw/datasette/issues/1668#issuecomment-1073097394,https://api.github.com/repos/simonw/datasette/issues/1668,1073097394,IC_kwDOBm6k_c4_9iqy,9599,simonw,2022-03-19T20:56:35Z,2022-03-19T20:56:35Z,OWNER,"I'm trying to think if there's any reason not to use `route` for this. Would I possibly want to use that noun for something else in the future? 
I like it more than `route_path` because it has no underscore. Decision made: I'm going with `route`.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1174306154,"Introduce concept of a database `route`, separate from its name", https://github.com/simonw/datasette/issues/1667#issuecomment-1073076624,https://api.github.com/repos/simonw/datasette/issues/1667,1073076624,IC_kwDOBm6k_c4_9dmQ,9599,simonw,2022-03-19T20:31:44Z,2022-03-19T20:31:44Z,OWNER,I can now read `format` from `request.url_vars` and delete this code entirely: https://github.com/simonw/datasette/blob/b9c2b1cfc8692b9700416db98721fa3ec982f6be/datasette/views/base.py#L375-L381,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1174302994,Make route matched pattern groups more consistent, https://github.com/simonw/datasette/issues/1668#issuecomment-1073076187,https://api.github.com/repos/simonw/datasette/issues/1668,1073076187,IC_kwDOBm6k_c4_9dfb,9599,simonw,2022-03-19T20:28:20Z,2022-03-19T20:28:20Z,OWNER,I'm going to keep `path` as the path to the file on disk. 
I'll pick a new name for what is currently `path` in that undocumented JSON API.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1174306154,"Introduce concept of a database `route`, separate from its name", https://github.com/simonw/datasette/issues/1668#issuecomment-1073076136,https://api.github.com/repos/simonw/datasette/issues/1668,1073076136,IC_kwDOBm6k_c4_9deo,9599,simonw,2022-03-19T20:27:44Z,2022-03-19T20:27:44Z,OWNER,"Pretty sure changing it will break some existing plugins though, including likely Datasette Desktop.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1174306154,"Introduce concept of a database `route`, separate from its name", https://github.com/simonw/datasette/issues/1668#issuecomment-1073076110,https://api.github.com/repos/simonw/datasette/issues/1668,1073076110,IC_kwDOBm6k_c4_9deO,9599,simonw,2022-03-19T20:27:22Z,2022-03-19T20:27:22Z,OWNER,"The docs do currently describe `path` as the filesystem path here: https://docs.datasette.io/en/stable/internals.html#database-class Good thing I'm not at 1.0 yet so I can change that!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1174306154,"Introduce concept of a database `route`, separate from its name", https://github.com/simonw/datasette/issues/1668#issuecomment-1073076015,https://api.github.com/repos/simonw/datasette/issues/1668,1073076015,IC_kwDOBm6k_c4_9dcv,9599,simonw,2022-03-19T20:26:32Z,2022-03-19T20:26:32Z,OWNER,I'm inclined to redefine `ds.path` to `ds.file_path` to fix this. 
Or `ds.filepath`.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1174306154,"Introduce concept of a database `route`, separate from its name", https://github.com/simonw/datasette/issues/1668#issuecomment-1073075913,https://api.github.com/repos/simonw/datasette/issues/1668,1073075913,IC_kwDOBm6k_c4_9dbJ,9599,simonw,2022-03-19T20:25:46Z,2022-03-19T20:26:08Z,OWNER,"The output of `/.json` DOES use `path` to mean the URL path, not the path to the file on disk: ``` { ""fixtures.dot"": { ""name"": ""fixtures.dot"", ""hash"": null, ""color"": ""631f11"", ""path"": ""/fixtures~2Edot"", ``` So that's a problem already: having `db.path` refer to something different from that JSON is inconsistent.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1174306154,"Introduce concept of a database `route`, separate from its name", https://github.com/simonw/datasette/issues/1668#issuecomment-1073075697,https://api.github.com/repos/simonw/datasette/issues/1668,1073075697,IC_kwDOBm6k_c4_9dXx,9599,simonw,2022-03-19T20:24:06Z,2022-03-19T20:24:06Z,OWNER,"Right now if a database has a `.` in its name e.g. 
`fixtures.dot` the URL to that database is: /fixtures~2Edot But the output on `/-/databases` doesn't reflect that, it still shows the name with the dot.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1174306154,"Introduce concept of a database `route`, separate from its name", https://github.com/simonw/datasette/issues/1660#issuecomment-1073073599,https://api.github.com/repos/simonw/datasette/issues/1660,1073073599,IC_kwDOBm6k_c4_9c2_,9599,simonw,2022-03-19T20:06:40Z,2022-03-19T20:06:40Z,OWNER,"This blocks: - #1668","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1170144879,Refactor and simplify Datasette routing and views, https://github.com/simonw/datasette/issues/1668#issuecomment-1073073579,https://api.github.com/repos/simonw/datasette/issues/1668,1073073579,IC_kwDOBm6k_c4_9c2r,9599,simonw,2022-03-19T20:06:27Z,2022-03-19T20:06:27Z,OWNER,Marking this as blocked until #1660 is done.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1174306154,"Introduce concept of a database `route`, separate from its name", https://github.com/simonw/datasette/issues/1668#issuecomment-1073073547,https://api.github.com/repos/simonw/datasette/issues/1668,1073073547,IC_kwDOBm6k_c4_9c2L,9599,simonw,2022-03-19T20:06:07Z,2022-03-19T20:06:07Z,OWNER,"Implementing this is a little tricky because there's a whole lot of code that expects the `database` captured by the URL routing to be the name used to look up the database in `datasette.databases` - or via `.get_database()`. The `DataView.get()` method is a good example of the trickiness here. It even has code that dispatches out to plugin hooks that take `database` as a parameter. 
https://github.com/simonw/datasette/blob/61419388c134001118aaf7dfb913562d467d7913/datasette/views/base.py#L383-L555 All the more reason to get rid of that `BaseView -> DataView -> TableView` hierarchy entirely: - #1660","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1174306154,"Introduce concept of a database `route`, separate from its name", https://github.com/simonw/datasette/issues/1668#issuecomment-1073043433,https://api.github.com/repos/simonw/datasette/issues/1668,1073043433,IC_kwDOBm6k_c4_9Vfp,9599,simonw,2022-03-19T16:54:55Z,2022-03-19T20:01:19Z,OWNER,"Options: - `route_path` - `url_path` - `route` I like `route_path`, or maybe `route`.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1174306154,"Introduce concept of a database `route`, separate from its name", https://github.com/simonw/datasette/issues/1668#issuecomment-1073043713,https://api.github.com/repos/simonw/datasette/issues/1668,1073043713,IC_kwDOBm6k_c4_9VkB,9599,simonw,2022-03-19T16:56:19Z,2022-03-19T16:56:19Z,OWNER,"Worth noting that the `name` right now is picked automatically to avoid conflicts: https://github.com/simonw/datasette/blob/61419388c134001118aaf7dfb913562d467d7913/datasette/app.py#L397-L413","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1174306154,"Introduce concept of a database `route`, separate from its name", https://github.com/simonw/datasette/issues/1668#issuecomment-1073043350,https://api.github.com/repos/simonw/datasette/issues/1668,1073043350,IC_kwDOBm6k_c4_9VeW,9599,simonw,2022-03-19T16:54:26Z,2022-03-19T16:54:26Z,OWNER,"The `Database` class already has a `path` property but it means something else - it's the path to the `.db` file on disk: 
https://github.com/simonw/datasette/blob/61419388c134001118aaf7dfb913562d467d7913/datasette/database.py#L29-L50 So need a different name for the path-that-is-used-in-the-URL.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1174306154,"Introduce concept of a database `route`, separate from its name", https://github.com/simonw/datasette/issues/1667#issuecomment-1073042554,https://api.github.com/repos/simonw/datasette/issues/1667,1073042554,IC_kwDOBm6k_c4_9VR6,9599,simonw,2022-03-19T16:50:01Z,2022-03-19T16:52:35Z,OWNER,"OK, I've made this more consistent - I still need to address the fact that `format` can be `.json` or `json` or not used at all before I close this issue. https://github.com/simonw/datasette/blob/61419388c134001118aaf7dfb913562d467d7913/tests/test_routes.py#L15-L35","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1174302994,Make route matched pattern groups more consistent, https://github.com/simonw/datasette/issues/1667#issuecomment-1073040072,https://api.github.com/repos/simonw/datasette/issues/1667,1073040072,IC_kwDOBm6k_c4_9UrI,9599,simonw,2022-03-19T16:34:02Z,2022-03-19T16:34:02Z,OWNER,"I called it `as_format` to avoid clashing with the Python built-in `format()` function when these things were turned into keyword arguments, but now that they're not I can use `format` instead. 
I think I'm going to go with `database`, `table`, `format` and `pks`.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1174302994,Make route matched pattern groups more consistent, https://github.com/simonw/datasette/issues/1666#issuecomment-1073039670,https://api.github.com/repos/simonw/datasette/issues/1666,1073039670,IC_kwDOBm6k_c4_9Uk2,9599,simonw,2022-03-19T16:31:08Z,2022-03-19T16:31:57Z,OWNER,"This does make it more interesting - it also highlights how inconsistent the way the capturing works is. Especially `as_format` which can be `None` or `""""` or `.json` or `json` or not used at all in the case of `TableView`. https://github.com/simonw/datasette/blob/764738dfcb16cd98b0987d443f59d5baa9d3c332/tests/test_routes.py#L12-L36","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1174162781,Refactor URL routing to enable testing, https://github.com/simonw/datasette/issues/1666#issuecomment-1073039241,https://api.github.com/repos/simonw/datasette/issues/1666,1073039241,IC_kwDOBm6k_c4_9UeJ,9599,simonw,2022-03-19T16:28:15Z,2022-03-19T16:28:15Z,OWNER,This is more interesting if it also asserts against the captured matches from the pattern.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1174162781,Refactor URL routing to enable testing, https://github.com/simonw/datasette/issues/878#issuecomment-1073037939,https://api.github.com/repos/simonw/datasette/issues/878,1073037939,IC_kwDOBm6k_c4_9UJz,9599,simonw,2022-03-19T16:19:30Z,2022-03-19T16:19:30Z,OWNER,"On revisiting https://gist.github.com/simonw/281eac9c73b062c3469607ad86470eb2 a few months later I'm having second thoughts about using `@inject` on the `main()` method. 
But I still like the pattern as a way to resolve more complex cases like ""to generate GeoJSON of the expanded view with labels, the label expansion code needs to run once at some point before the GeoJSON formatting code does"". So I'm going to stick with it a tiny bit longer, but maybe try to make it a lot more explicit when it's going to happen rather than having the main view methods themselves also use async DI.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",648435885,"New pattern for views that return either JSON or HTML, available for plugins", https://github.com/simonw/datasette/issues/1561#issuecomment-1072939780,https://api.github.com/repos/simonw/datasette/issues/1561,1072939780,IC_kwDOBm6k_c4_88ME,9599,simonw,2022-03-19T04:45:40Z,2022-03-19T04:45:40Z,OWNER,"I ended up moving hashed URL mode out to a plugin in: - #647 If you're still interested in using it with `_memory` please open an issue in that repo here: https://github.com/simonw/datasette-hashed-urls","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1082765654,"add hash id to ""_memory"" url if hashed url mode is turned on and crossdb is also turned on", https://github.com/simonw/datasette/issues/1666#issuecomment-1072933875,https://api.github.com/repos/simonw/datasette/issues/1666,1072933875,IC_kwDOBm6k_c4_86vz,9599,simonw,2022-03-19T04:03:42Z,2022-03-19T04:03:42Z,OWNER,Tests so far: https://github.com/simonw/datasette/blob/711767bcd3c1e76a0861fe7f24069ff1c8efc97a/tests/test_routes.py#L12-L34,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1174162781,Refactor URL routing to enable testing, 
https://github.com/simonw/datasette/issues/1228#issuecomment-1072915936,https://api.github.com/repos/simonw/datasette/issues/1228,1072915936,IC_kwDOBm6k_c4_82Xg,9599,simonw,2022-03-19T01:50:27Z,2022-03-19T01:50:27Z,OWNER,Demo: https://latest.datasette.io/fixtures/facetable - which now has a column called `n`.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",810397025,500 error caused by faceting if a column called `n` exists, https://github.com/simonw/datasette/issues/1228#issuecomment-1072908029,https://api.github.com/repos/simonw/datasette/issues/1228,1072908029,IC_kwDOBm6k_c4_80b9,9599,simonw,2022-03-19T00:57:54Z,2022-03-19T00:57:54Z,OWNER,"Yes! That's the problem. I was able to replicate it like so: ``` echo '[{ ""n"": ""one"", ""abc"": 1 }, { ""n"": ""one"", ""abc"": 2 }, { ""n"": ""two"", ""abc"": 3 }]' | sqlite-utils insert column-called-n.db t - ``` ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",810397025,500 error caused by faceting if a column called `n` exists, https://github.com/simonw/datasette/issues/1228#issuecomment-1072907680,https://api.github.com/repos/simonw/datasette/issues/1228,1072907680,IC_kwDOBm6k_c4_80Wg,9599,simonw,2022-03-19T00:55:48Z,2022-03-19T00:55:48Z,OWNER,... 
unless your data had a column called `n`?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",810397025,500 error caused by faceting if a column called `n` exists, https://github.com/simonw/datasette/issues/1228#issuecomment-1072907610,https://api.github.com/repos/simonw/datasette/issues/1228,1072907610,IC_kwDOBm6k_c4_80Va,9599,simonw,2022-03-19T00:55:29Z,2022-03-19T00:55:29Z,OWNER,"It looks to me like something is causing the faceting query here to return a string when it was expected to return a number: https://github.com/simonw/datasette/blob/32963018e7edfab1233de7c7076c428d0e5c7813/datasette/facets.py#L153-L170 I can't think of any way that a `count(*) as n` would turn into a string though!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",810397025,500 error caused by faceting if a column called `n` exists, https://github.com/simonw/datasette/issues/1605#issuecomment-1072907200,https://api.github.com/repos/simonw/datasette/issues/1605,1072907200,IC_kwDOBm6k_c4_80PA,9599,simonw,2022-03-19T00:52:54Z,2022-03-19T00:53:45Z,OWNER,"Had a thought about the implementation of this: it could make a really neat plugin. 
Something like `datasette-export` which adds an `export` command using https://docs.datasette.io/en/stable/plugin_hooks.html#register-commands-cli - then you could run: datasette export my-export-dir mydatabase.db -m metadata.json --template-dir templates/ And the command would then: - Create a `Datasette()` instance with those databases/metadata/etc - Execute `await datasette.client.get(""/"")` to get the homepage HTML - Parse the HTML using BeautifulSoup to find all `a[href]`, `link[href]`, `script[src]`, `img[src]` elements that reference a relative path as opposed to one that starts with `http://` - Write out the homepage to `my-export-dir/index.html` - Recursively fetch and dump all of the other pages and assets that it found too All of that HTML parsing may be over-complicating things. It could alternatively accept options for which pages you want to export: ``` datasette export my-export-dir \ mydatabase.db -m metadata.json --template-dir templates/ \ --path / \ --path /mydatabase ... 
``` Or a really wild option: it could allow you to define the paths you want to export using a SQL query: ``` datasette export my-export-dir \ mydatabase.db -m metadata.json --template-dir templates/ \ --sql "" select '/' as path, 'index.html' as filename union all select '/mydatabase/articles/' || id as path, 'article-' || id || '.html' as filename from articles union all select '/mydatabase/tags/' || tag as path, 'tag-' || tag || '.html' as filename from tags "" ``` Which would save these files: - `index.html` as the content of `/` - `article-1.html` (and more) as the content of `/mydatabase/articles/1` - `tag-python.html` (and more) as the content of `/mydatabase/tags/python`","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1108671952,Scripted exports, https://github.com/simonw/datasette/issues/1662#issuecomment-1072905467,https://api.github.com/repos/simonw/datasette/issues/1662,1072905467,IC_kwDOBm6k_c4_8zz7,9599,simonw,2022-03-19T00:42:23Z,2022-03-19T00:42:23Z,OWNER,"Those client-side SQLite tricks are _really_ neat. `datasette publish` defaults to configuring it so the raw SQLite database can be downloaded from `/fixtures.db` - and this issue updated it to be served with a CORS header that would allow client-side scripts to load the file: - #1057 If you're not going to run any server-side code at all you don't need Datasette for this - you can upload the SQLite database file to any static hosting with CORS headers and load it into the client that way. In terms of static publishing, I do think there's something interesting about using Datasette to generate static sites. 
There's an issue discussing options for that over here: - #1605","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1170497629,[feature request] Publish to fully static website, https://github.com/simonw/datasette/issues/1661#issuecomment-1072904703,https://api.github.com/repos/simonw/datasette/issues/1661,1072904703,IC_kwDOBm6k_c4_8zn_,9599,simonw,2022-03-19T00:37:36Z,2022-03-19T00:37:36Z,OWNER,Updated docs: https://docs.datasette.io/en/latest/performance.html#datasette-hashed-urls,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1170355774,Remove Hashed URL mode, https://github.com/simonw/datasette/issues/1661#issuecomment-1072901159,https://api.github.com/repos/simonw/datasette/issues/1661,1072901159,IC_kwDOBm6k_c4_8ywn,9599,simonw,2022-03-19T00:20:27Z,2022-03-19T00:20:27Z,OWNER,I can remove the `default_cache_ttl_hashed` setting too.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1170355774,Remove Hashed URL mode, https://github.com/simonw/datasette/pull/1664#issuecomment-1072898923,https://api.github.com/repos/simonw/datasette/issues/1664,1072898923,IC_kwDOBm6k_c4_8yNr,9599,simonw,2022-03-19T00:11:33Z,2022-03-19T00:11:33Z,OWNER,I'm going to land this and handle those in separate commits.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1173017980,Remove hashed URL mode, https://github.com/simonw/datasette/pull/1664#issuecomment-1072898797,https://api.github.com/repos/simonw/datasette/issues/1664,1072898797,IC_kwDOBm6k_c4_8yLt,9599,simonw,2022-03-19T00:11:09Z,2022-03-19T00:11:09Z,OWNER,Still need to remove it from the documentation and do something about that `hash_urls` setting.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, 
""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1173017980,Remove hashed URL mode, https://github.com/simonw/datasette/pull/1664#issuecomment-1072890524,https://api.github.com/repos/simonw/datasette/issues/1664,1072890524,IC_kwDOBm6k_c4_8wKc,9599,simonw,2022-03-18T23:44:33Z,2022-03-19T00:06:51Z,OWNER,Looks like that was set here: https://github.com/simonw/datasette/blob/77a904fea14f743560af9cc668146339bdbbd0a9/datasette/views/base.py#L490-L492,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1173017980,Remove hashed URL mode, https://github.com/simonw/datasette/pull/1664#issuecomment-1072890205,https://api.github.com/repos/simonw/datasette/issues/1664,1072890205,IC_kwDOBm6k_c4_8wFd,9599,simonw,2022-03-18T23:43:15Z,2022-03-18T23:43:15Z,OWNER,"Now almost everything is working except for foreign key expansion: ![CleanShot 2022-03-18 at 16 41 39@2x](https://user-images.githubusercontent.com/9599/159097349-6f41dfdf-5bab-449b-a148-5cda3df6534c.png) Using the debugger I tracked it down to this code: https://github.com/simonw/datasette/blob/30e5f0e67c38054a8087a2a4eae3fc4d1779af90/datasette/views/table.py#L708-L715 Turns out `default_labels` there is `None` - and it's a parameter to that `data()` method: https://github.com/simonw/datasette/blob/30e5f0e67c38054a8087a2a4eae3fc4d1779af90/datasette/views/table.py#L325-L334 ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1173017980,Remove hashed URL mode, https://github.com/simonw/sqlite-utils/issues/416#issuecomment-1072834273,https://api.github.com/repos/simonw/sqlite-utils/issues/416,1072834273,IC_kwDOCGYnMM4_8ibh,9599,simonw,2022-03-18T21:36:05Z,2022-03-18T21:36:05Z,OWNER,"Python's `str.encode()` method has a `errors=` parameter that does something along these lines: 
https://docs.python.org/3/library/stdtypes.html#str.encode > *errors* may be given to set a different error handling scheme. The default for *errors* is `'strict'`, meaning that encoding errors raise a [`UnicodeError`](https://docs.python.org/3/library/exceptions.html#UnicodeError ""UnicodeError""). Other possible values are `'ignore'`, `'replace'`, `'xmlcharrefreplace'`, `'backslashreplace'` and any other name registered via [`codecs.register_error()`](https://docs.python.org/3/library/codecs.html#codecs.register_error ""codecs.register_error""), Imitating this might be the way to go.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1173023272,Options for how `r.parsedate()` should handle invalid dates, https://github.com/simonw/sqlite-utils/issues/416#issuecomment-1072833174,https://api.github.com/repos/simonw/sqlite-utils/issues/416,1072833174,IC_kwDOCGYnMM4_8iKW,9599,simonw,2022-03-18T21:34:06Z,2022-03-18T21:34:06Z,OWNER,"Good call-out: right now the `parsedate()` and `parsedatetime()` functions both terminate with an exception if they hit something invalid: https://sqlite-utils.datasette.io/en/stable/cli.html#sqlite-utils-convert-recipes It would be better if this was configurable by the user (and properly documented) - options could include ""set null if date is invalid"" and ""leave the value as it is if invalid"" in addition to throwing an error.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1173023272,Options for how `r.parsedate()` should handle invalid dates, https://github.com/simonw/datasette/pull/1664#issuecomment-1071813296,https://api.github.com/repos/simonw/datasette/issues/1664,1071813296,IC_kwDOBm6k_c4_4pKw,9599,simonw,2022-03-17T23:26:22Z,2022-03-17T23:26:22Z,OWNER,Probably caused by the convoluted code in `get_format()`: 
https://github.com/simonw/datasette/blob/30e5f0e67c38054a8087a2a4eae3fc4d1779af90/datasette/views/base.py#L466-L481,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1173017980,Remove hashed URL mode, https://github.com/simonw/datasette/pull/1664#issuecomment-1071809988,https://api.github.com/repos/simonw/datasette/issues/1664,1071809988,IC_kwDOBm6k_c4_4oXE,9599,simonw,2022-03-17T23:24:57Z,2022-03-17T23:24:57Z,OWNER,"My hunch is that this is broken because of this: https://github.com/simonw/datasette/blob/30e5f0e67c38054a8087a2a4eae3fc4d1779af90/datasette/app.py#L1098-L1107 Note how the table uses `table_and_format` but the row uses just `table` - I think there's code that's getting confused by this.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1173017980,Remove hashed URL mode, https://github.com/simonw/datasette/pull/1664#issuecomment-1071803114,https://api.github.com/repos/simonw/datasette/issues/1664,1071803114,IC_kwDOBm6k_c4_4mrq,9599,simonw,2022-03-17T23:22:00Z,2022-03-17T23:22:00Z,OWNER,"Surprisingly I managed to break https://latest.datasette.io/fixtures/custom_foreign_key_label while working on this change: ![CleanShot 2022-03-17 at 16 16 54@2x](https://user-images.githubusercontent.com/9599/158909271-717b65e8-cfcc-44c4-b1cc-f34478b0f803.png) ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1173017980,Remove hashed URL mode, https://github.com/simonw/datasette/issues/1661#issuecomment-1071797707,https://api.github.com/repos/simonw/datasette/issues/1661,1071797707,IC_kwDOBm6k_c4_4lXL,9599,simonw,2022-03-17T23:19:24Z,2022-03-17T23:19:24Z,OWNER,"Moving this to PR so I can comment on individual lines: - #1664","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, 
""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1170355774,Remove Hashed URL mode, https://github.com/simonw/datasette/issues/1661#issuecomment-1071793307,https://api.github.com/repos/simonw/datasette/issues/1661,1071793307,IC_kwDOBm6k_c4_4kSb,9599,simonw,2022-03-17T23:17:32Z,2022-03-17T23:17:32Z,OWNER,"Surprisingly I managed to break https://latest.datasette.io/fixtures/custom_foreign_key_label while working on this change: ![CleanShot 2022-03-17 at 16 16 54@2x](https://user-images.githubusercontent.com/9599/158909271-717b65e8-cfcc-44c4-b1cc-f34478b0f803.png) ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1170355774,Remove Hashed URL mode, https://github.com/simonw/datasette/issues/1661#issuecomment-1071706993,https://api.github.com/repos/simonw/datasette/issues/1661,1071706993,IC_kwDOBm6k_c4_4PNx,9599,simonw,2022-03-17T22:42:21Z,2022-03-17T22:42:21Z,OWNER,"As part of this I'm going to get rid of this mechanism: https://github.com/simonw/datasette/blob/30e5f0e67c38054a8087a2a4eae3fc4d1779af90/datasette/views/base.py#L170-L173 Unwrapping `request.scope[""url_route""][""kwargs""]` into keyword argument to view functions just made the code harder to follow.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1170355774,Remove Hashed URL mode, https://github.com/simonw/datasette/issues/1663#issuecomment-1071519407,https://api.github.com/repos/simonw/datasette/issues/1663,1071519407,IC_kwDOBm6k_c4_3hav,9599,simonw,2022-03-17T21:32:35Z,2022-03-17T21:32:35Z,OWNER,"Updated docs: - https://docs.datasette.io/en/latest/internals.html#datasette-class - https://docs.datasette.io/en/latest/internals.html#db-hash","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1170554975,Document the internals that were used in 
datasette-hashed-urls, https://github.com/simonw/datasette/issues/1532#issuecomment-1069570893,https://api.github.com/repos/simonw/datasette/issues/1532,1069570893,IC_kwDOBm6k_c4_wFtN,9599,simonw,2022-03-16T20:11:41Z,2022-03-16T20:13:34Z,OWNER,"Could also build a CLI Rich/Textual app to exercise the API - which could embed Datasette as a dependency and work using `datasette.client.get(...)` calls. Could be a plugin that adds a `datasette tui` command.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1065429936,Use datasette-table Web Component to guide the design of the JSON API for 1.0, https://github.com/simonw/datasette/issues/1663#issuecomment-1068742624,https://api.github.com/repos/simonw/datasette/issues/1663,1068742624,IC_kwDOBm6k_c4_s7fg,9599,simonw,2022-03-16T05:17:45Z,2022-03-16T05:17:45Z,OWNER,Should be documented here: https://docs.datasette.io/en/stable/internals.html,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1170554975,Document the internals that were used in datasette-hashed-urls, https://github.com/simonw/datasette/issues/1661#issuecomment-1068728484,https://api.github.com/repos/simonw/datasette/issues/1661,1068728484,IC_kwDOBm6k_c4_s4Ck,9599,simonw,2022-03-16T04:47:39Z,2022-03-16T04:47:39Z,OWNER,https://datasette.io/plugins/datasette-hashed-urls is released now.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1170355774,Remove Hashed URL mode, https://github.com/simonw/datasette/issues/1661#issuecomment-1068630353,https://api.github.com/repos/simonw/datasette/issues/1661,1068630353,IC_kwDOBm6k_c4_sgFR,9599,simonw,2022-03-16T01:24:56Z,2022-03-16T01:25:49Z,OWNER,"Here's the only bit of code that references that `_hash` mechanism: 
https://github.com/simonw/datasette/blob/77a904fea14f743560af9cc668146339bdbbd0a9/datasette/views/base.py#L259-L265 And here's the test: https://github.com/simonw/datasette/blob/77a904fea14f743560af9cc668146339bdbbd0a9/tests/test_api.py#L828-L854 Related issue: - #471","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1170355774,Remove Hashed URL mode, https://github.com/simonw/datasette/issues/1661#issuecomment-1068628839,https://api.github.com/repos/simonw/datasette/issues/1661,1068628839,IC_kwDOBm6k_c4_sftn,9599,simonw,2022-03-16T01:21:36Z,2022-03-16T01:21:48Z,OWNER,"From https://docs.datasette.io/en/0.60.2/performance.html#hashed-url-mode > You can enable these hashed URLs in two ways: using the [hash_urls](https://docs.datasette.io/en/0.60.2/settings.html#setting-hash-urls) configuration setting (which affects all requests to Datasette) or via the `?_hash=1` query string parameter (which only applies to the current request). I'm going to drop `?_hash=1` entirely. I'd actually forgotten that feature existed!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1170355774,Remove Hashed URL mode, https://github.com/simonw/datasette/issues/1661#issuecomment-1068554827,https://api.github.com/repos/simonw/datasette/issues/1661,1068554827,IC_kwDOBm6k_c4_sNpL,9599,simonw,2022-03-15T23:16:58Z,2022-03-15T23:18:58Z,OWNER,"If you attempt to use the [old setting](https://docs.datasette.io/en/stable/settings.html#hash-urls): datasette mydatabase.db --setting hash_urls 1 It should error with a message saying that the feature has been moved to a plugin. I'll do this with a `deprecated_settings` mechanism so the error can be detected even though `datasette --help-settings` will no longer return the setting. 
https://github.com/simonw/datasette/blob/77a904fea14f743560af9cc668146339bdbbd0a9/datasette/cli.py#L479-L489","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1170355774,Remove Hashed URL mode, https://github.com/simonw/datasette/issues/1661#issuecomment-1068553454,https://api.github.com/repos/simonw/datasette/issues/1661,1068553454,IC_kwDOBm6k_c4_sNTu,9599,simonw,2022-03-15T23:14:37Z,2022-03-15T23:14:37Z,OWNER,"This is going to simplify the code in the various view classes substantially: - #1660","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1170355774,Remove Hashed URL mode, https://github.com/simonw/datasette/issues/647#issuecomment-1068552696,https://api.github.com/repos/simonw/datasette/issues/647,1068552696,IC_kwDOBm6k_c4_sNH4,9599,simonw,2022-03-15T23:13:06Z,2022-03-15T23:13:06Z,OWNER,"The plugin works. I'm going to implement one last feature for it: - https://github.com/simonw/datasette-hashed-urls/issues/3 Then I can remove hashed URL mode in a separate issue.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",531755959,Move hashed URL mode out to a plugin, https://github.com/simonw/datasette/issues/647#issuecomment-1068539404,https://api.github.com/repos/simonw/datasette/issues/647,1068539404,IC_kwDOBm6k_c4_sJ4M,9599,simonw,2022-03-15T22:49:01Z,2022-03-15T22:49:01Z,OWNER,"I shipped the first version of this: https://github.com/simonw/datasette-hashed-urls Next step: test it with a live demo: - https://github.com/simonw/datasette-hashed-urls/issues/2","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",531755959,Move hashed URL mode out to a plugin, 
https://github.com/simonw/datasette/issues/1439#issuecomment-1068461449,https://api.github.com/repos/simonw/datasette/issues/1439,1068461449,IC_kwDOBm6k_c4_r22J,9599,simonw,2022-03-15T20:51:26Z,2022-03-15T20:51:26Z,OWNER,I'm happy with this now that I've landed Tilde encoding in #1657.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",973139047,Rethink how .ext formats (v.s. ?_format=) works before 1.0, https://github.com/simonw/datasette/issues/857#issuecomment-1068450483,https://api.github.com/repos/simonw/datasette/issues/857,1068450483,IC_kwDOBm6k_c4_r0Kz,9599,simonw,2022-03-15T20:43:55Z,2022-03-15T20:43:55Z,OWNER,Dupe of #1510.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",642297505,Comprehensive documentation for variables made available to templates, https://github.com/simonw/datasette/issues/1509#issuecomment-1068445412,https://api.github.com/repos/simonw/datasette/issues/1509,1068445412,IC_kwDOBm6k_c4_ry7k,9599,simonw,2022-03-15T20:37:50Z,2022-03-15T20:38:56Z,OWNER,"... maybe Datasette itself should include interactive API documentation, in addition to documenting it in the manual? 
`/dbname/table/-/apidocs` could return documentation about the specific table, taking into account columns and types.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1054243511,Datasette 1.0 JSON API (and documentation), https://github.com/simonw/datasette/issues/1509#issuecomment-1068444767,https://api.github.com/repos/simonw/datasette/issues/1509,1068444767,IC_kwDOBm6k_c4_ryxf,9599,simonw,2022-03-15T20:37:03Z,2022-03-15T20:37:03Z,OWNER,"Idea: I could add Pydantic https://pydantic-docs.helpmanual.io/usage/schema/ as an optional test dependency and use it to generate JSON schemas and run validation against examples in the API documentation. Maybe generate API documentation from it too?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1054243511,Datasette 1.0 JSON API (and documentation), https://github.com/simonw/datasette/issues/1510#issuecomment-1068443509,https://api.github.com/repos/simonw/datasette/issues/1510,1068443509,IC_kwDOBm6k_c4_ryd1,9599,simonw,2022-03-15T20:35:29Z,2022-03-15T20:35:29Z,OWNER,If I set a rule that everything available in the template context MUST also be available via the JSON API (maybe through an extras mechanism) I can combine this with API documentation and solve both at once.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1054244712,Datasette 1.0 documented template context (maybe via API docs), https://github.com/simonw/datasette/issues/870#issuecomment-650696054,https://api.github.com/repos/simonw/datasette/issues/870,650696054,MDEyOklzc3VlQ29tbWVudDY1MDY5NjA1NA==,9599,simonw,2020-06-28T04:52:41Z,2022-03-15T20:07:17Z,OWNER,"This would be a lot easier if I had extracted out the hash logic to a plugin, see: - #647","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, 
""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",646737558,Refactor default views to use register_routes, https://github.com/simonw/datasette/issues/1660#issuecomment-1068418619,https://api.github.com/repos/simonw/datasette/issues/1660,1068418619,IC_kwDOBm6k_c4_rsY7,9599,simonw,2022-03-15T20:06:19Z,2022-03-15T20:06:19Z,OWNER,"Also related: - #878 - #1512 - #1518 - #870 ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1170144879,Refactor and simplify Datasette routing and views, https://github.com/simonw/datasette/issues/1660#issuecomment-1068417357,https://api.github.com/repos/simonw/datasette/issues/1660,1068417357,IC_kwDOBm6k_c4_rsFN,9599,simonw,2022-03-15T20:05:08Z,2022-03-15T20:05:08Z,OWNER,"`DataView` is used as the base class for: - `DatabaseView` - `DatabaseDownload` (just so the permissions checks can be called) - `QueryView` - which isn't routed to directly, it's called from `DatabaseView` if `?sql=` is available and `TableView` for canned queries - `RowTableShared` which is the base class for `TableView` and `RowView`","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1170144879,Refactor and simplify Datasette routing and views, https://github.com/simonw/datasette/issues/1660#issuecomment-1068415072,https://api.github.com/repos/simonw/datasette/issues/1660,1068415072,IC_kwDOBm6k_c4_rrhg,9599,simonw,2022-03-15T20:02:36Z,2022-03-15T20:02:36Z,OWNER,"This is one of the worst bits - the `get_format()` method on the `DataView` base class actually modifies `args`, including removing keys! Really confusing: https://github.com/simonw/datasette/blob/77a904fea14f743560af9cc668146339bdbbd0a9/datasette/views/base.py#L454-L482 Then `BaseView` has some surprising responsibilities. 
It has a utility helper for checking multiple permissions at once: https://github.com/simonw/datasette/blob/77a904fea14f743560af9cc668146339bdbbd0a9/datasette/views/base.py#L81-L105 And its own render method that adds extra stuff to the template context and handles the rel: alternate header: https://github.com/simonw/datasette/blob/77a904fea14f743560af9cc668146339bdbbd0a9/datasette/views/base.py#L131-L157 Then `DataView` does all sorts of weird stuff - from handling database hashes (which I want to remove, see #647): https://github.com/simonw/datasette/blob/77a904fea14f743560af9cc668146339bdbbd0a9/datasette/views/base.py#L206-L219 To streaming CSV responses: https://github.com/simonw/datasette/blob/77a904fea14f743560af9cc668146339bdbbd0a9/datasette/views/base.py#L286-L308 To handling SQLite exceptions: https://github.com/simonw/datasette/blob/77a904fea14f743560af9cc668146339bdbbd0a9/datasette/views/base.py#L514-L526 And a ton more. It's a big mess.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1170144879,Refactor and simplify Datasette routing and views, https://github.com/simonw/datasette/issues/1062#issuecomment-1068327874,https://api.github.com/repos/simonw/datasette/issues/1062,1068327874,IC_kwDOBm6k_c4_rWPC,9599,simonw,2022-03-15T18:33:49Z,2022-03-15T18:33:49Z,OWNER,"I can get regular `.json` to stream too, using the pattern described in this TIL: https://til.simonwillison.net/python/output-json-array-streaming","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",732674148,Refactor .csv to be an output renderer - and teach register_output_renderer to stream all rows, https://github.com/simonw/datasette/issues/1651#issuecomment-1068319530,https://api.github.com/repos/simonw/datasette/issues/1651,1068319530,IC_kwDOBm6k_c4_rUMq,9599,simonw,2022-03-15T18:25:42Z,2022-03-15T18:25:42Z,OWNER,"Done: - 
https://latest.datasette.io/fixtures/table~2Fwith~2Fslashes~2Ecsv - https://latest.datasette.io/fixtures/table~2Fwith~2Fslashes~2Ecsv.csv - https://latest.datasette.io/fixtures/table~2Fwith~2Fslashes~2Ecsv.json","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1161584460,Get rid of the no-longer necessary ?_format=json hack for tables called x.json, https://github.com/simonw/datasette/issues/1657#issuecomment-1068318454,https://api.github.com/repos/simonw/datasette/issues/1657,1068318454,IC_kwDOBm6k_c4_rT72,9599,simonw,2022-03-15T18:25:11Z,2022-03-15T18:25:11Z,OWNER,"Demo: - https://latest.datasette.io/fixtures/table~2Fwith~2Fslashes~2Ecsv - https://latest.datasette.io/fixtures/table~2Fwith~2Fslashes~2Ecsv.csv - https://latest.datasette.io/fixtures/table~2Fwith~2Fslashes~2Ecsv.json","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1168995756,Tilde encoding: use ~ instead of - for dash-encoding, https://github.com/simonw/datasette/issues/1657#issuecomment-1068306916,https://api.github.com/repos/simonw/datasette/issues/1657,1068306916,IC_kwDOBm6k_c4_rRHk,9599,simonw,2022-03-15T18:15:11Z,2022-03-15T18:15:11Z,OWNER,Now live here: https://fivethirtyeight.datasettes.com/fivethirtyeight/august-senate-polls~2Faugust_senate_polls,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1168995756,Tilde encoding: use ~ instead of - for dash-encoding, https://github.com/simonw/datasette/issues/1657#issuecomment-1068296042,https://api.github.com/repos/simonw/datasette/issues/1657,1068296042,IC_kwDOBm6k_c4_rOdq,9599,simonw,2022-03-15T18:05:54Z,2022-03-15T18:05:54Z,OWNER,Documentation: https://docs.datasette.io/en/latest/internals.html#tilde-encoding,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, 
""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1168995756,Tilde encoding: use ~ instead of - for dash-encoding, https://github.com/simonw/datasette/issues/1657#issuecomment-1068181623,https://api.github.com/repos/simonw/datasette/issues/1657,1068181623,IC_kwDOBm6k_c4_qyh3,9599,simonw,2022-03-15T16:18:23Z,2022-03-15T16:18:23Z,OWNER,Moving this to a PR.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1168995756,Tilde encoding: use ~ instead of - for dash-encoding, https://github.com/simonw/datasette/issues/1657#issuecomment-1068148013,https://api.github.com/repos/simonw/datasette/issues/1657,1068148013,IC_kwDOBm6k_c4_qqUt,9599,simonw,2022-03-15T15:50:15Z,2022-03-15T15:50:15Z,OWNER,"The thing that broke everything was this change: I'm going to bring back the horrible `get_format()` method for the moment, with its weird mutations of the `args` object, then try and get rid of it again later.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1168995756,Tilde encoding: use ~ instead of - for dash-encoding, https://github.com/simonw/datasette/issues/1658#issuecomment-1068138578,https://api.github.com/repos/simonw/datasette/issues/1658,1068138578,IC_kwDOBm6k_c4_qoBS,9599,simonw,2022-03-15T15:42:49Z,2022-03-15T15:42:49Z,OWNER,"Easiest way to do this was with three reverts, then cherry-pick back the code of conduct.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1169840669,Revert main to version that passes tests, https://github.com/simonw/datasette/issues/1657#issuecomment-1068126821,https://api.github.com/repos/simonw/datasette/issues/1657,1068126821,IC_kwDOBm6k_c4_qlJl,9599,simonw,2022-03-15T15:31:54Z,2022-03-15T15:31:54Z,OWNER,The state I had got to prior to that revert is in 
https://github.com/simonw/datasette/tree/issue-1657-wip,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1168995756,Tilde encoding: use ~ instead of - for dash-encoding, https://github.com/simonw/datasette/issues/1657#issuecomment-1068125636,https://api.github.com/repos/simonw/datasette/issues/1657,1068125636,IC_kwDOBm6k_c4_qk3E,9599,simonw,2022-03-15T15:30:54Z,2022-03-15T15:30:54Z,OWNER,I've made a real mess of this. I'm going to revert Datasette `main` back to the last commit that passed the tests and try this again in a branch.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1168995756,Tilde encoding: use ~ instead of - for dash-encoding, https://github.com/simonw/datasette/issues/1657#issuecomment-1067423720,https://api.github.com/repos/simonw/datasette/issues/1657,1067423720,IC_kwDOBm6k_c4_n5fo,9599,simonw,2022-03-14T23:59:56Z,2022-03-14T23:59:56Z,OWNER,"Updated test: ```python @pytest.mark.parametrize( ""original,expected"", ( (""abc"", ""abc""), (""/foo/bar"", ""~2Ffoo~2Fbar""), (""/-/bar"", ""~2F-~2Fbar""), (""-/db-/table.csv"", ""-~2Fdb-~2Ftable~2Ecsv""), (r""%~-/"", ""~25~7E-~2F""), (""~25~7E~2D~2F"", ""~7E25~7E7E~7E2D~7E2F""), ), ) def test_tilde_encoding(original, expected): actual = utils.tilde_encode(original) assert actual == expected # And test round-trip assert original == utils.tilde_decode(actual) ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1168995756,Tilde encoding: use ~ instead of - for dash-encoding, https://github.com/simonw/datasette/issues/1657#issuecomment-1067414156,https://api.github.com/repos/simonw/datasette/issues/1657,1067414156,IC_kwDOBm6k_c4_n3KM,9599,simonw,2022-03-14T23:38:41Z,2022-03-14T23:38:41Z,OWNER,"And in https://datatracker.ietf.org/doc/html/rfc3986#section-2.3 
""Unreserved Characters"": unreserved = ALPHA / DIGIT / ""-"" / ""."" / ""_"" / ""~""","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1168995756,Tilde encoding: use ~ instead of - for dash-encoding, https://github.com/simonw/datasette/issues/1657#issuecomment-1067413691,https://api.github.com/repos/simonw/datasette/issues/1657,1067413691,IC_kwDOBm6k_c4_n3C7,9599,simonw,2022-03-14T23:37:42Z,2022-03-14T23:37:42Z,OWNER,"Relevant: https://datatracker.ietf.org/doc/html/rfc3986#section-2.1 ``` reserved = gen-delims / sub-delims gen-delims = "":"" / ""/"" / ""?"" / ""#"" / ""["" / ""]"" / ""@"" sub-delims = ""!"" / ""$"" / ""&"" / ""'"" / ""("" / "")"" / ""*"" / ""+"" / "","" / "";"" / ""="" ``` Notably `~` is not in either of those lists.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1168995756,Tilde encoding: use ~ instead of - for dash-encoding, https://github.com/simonw/datasette/issues/1651#issuecomment-1067382442,https://api.github.com/repos/simonw/datasette/issues/1651,1067382442,IC_kwDOBm6k_c4_nvaq,9599,simonw,2022-03-14T22:59:10Z,2022-03-14T22:59:10Z,OWNER,"This work is now blocked on: - https://github.com/simonw/datasette/issues/1657","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1161584460,Get rid of the no-longer necessary ?_format=json hack for tables called x.json, https://github.com/simonw/datasette/issues/1657#issuecomment-1067382232,https://api.github.com/repos/simonw/datasette/issues/1657,1067382232,IC_kwDOBm6k_c4_nvXY,9599,simonw,2022-03-14T22:58:47Z,2022-03-14T22:58:47Z,OWNER,"Asked about this [on Twitter](https://twitter.com/simonw/status/1503499169775849473): > Anyone ever seen a proxy or other URL handling system do anything surprising with the tilde ""~"" character? 
> > I'm considering it as an escaping character, in place of ""-"" as described in Replies so far seem like it should be OK - Apache has supported this for home directories for a couple of decades now without any problems.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1168995756,Tilde encoding: use ~ instead of - for dash-encoding, https://github.com/simonw/datasette/issues/1657#issuecomment-1067381556,https://api.github.com/repos/simonw/datasette/issues/1657,1067381556,IC_kwDOBm6k_c4_nvM0,9599,simonw,2022-03-14T22:57:27Z,2022-03-14T22:57:45Z,OWNER,"The problem with the [dash encoding mechanism](https://simonwillison.net/2022/Mar/5/dash-encoding/) is that it turns out dashes are used in a LOT of existing Datasette instances - much of https://fivethirtyeight.datasettes.com/fivethirtyeight for example, and even https://datasette.io/ itself: https://datasette.io/dogsheep-index It's pretty ugly to force all of those to change to their dash-encoded equivalent - and in fact it broke https://datasette.io/ in a subtle way: - https://github.com/simonw/datasette.io/issues/94 I'm going to try using `~` instead and see if that works as well and causes less breakage to existing sites.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1168995756,Tilde encoding: use ~ instead of - for dash-encoding, https://github.com/simonw/datasette/issues/1439#issuecomment-1065988403,https://api.github.com/repos/simonw/datasette/issues/1439,1065988403,IC_kwDOBm6k_c4_ibEz,9599,simonw,2022-03-13T00:06:38Z,2022-03-13T00:07:19Z,OWNER,"If I want to reserve `-` as a character that CAN be used in URLs, the only remaining character that might make sense for escape sequences is `~` - based on this last line of characters that are escape from percentage encoding: ```python _ALWAYS_SAFE = frozenset(b'ABCDEFGHIJKLMNOPQRSTUVWXYZ' 
b'abcdefghijklmnopqrstuvwxyz' b'0123456789' b'_.-~') ``` So I'd add both `-` and `_` back to the safe list, but use `~` to escape `.` and `/` and suchlike.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",973139047,Rethink how .ext formats (v.s. ?_format=) works before 1.0, https://github.com/simonw/datasette/issues/1439#issuecomment-1065987808,https://api.github.com/repos/simonw/datasette/issues/1439,1065987808,IC_kwDOBm6k_c4_ia7g,9599,simonw,2022-03-13T00:02:32Z,2022-03-13T00:02:32Z,OWNER,"OK, this has broken a lot more than I expected it would. Turns out `-` is a very common character in existing Datasette database names! https://datasette.io/-/databases for example has two: ```json [ { ""name"": ""docs-index"", ""path"": ""docs-index.db"", ""size"": 1007616, ""is_mutable"": false, ""is_memory"": false, ""hash"": ""0ac6c3de2762fcd174fd249fed8a8fa6046ea345173d22c2766186bf336462b2"" }, { ""name"": ""dogsheep-index"", ""path"": ""dogsheep-index.db"", ""size"": 5496832, ""is_mutable"": false, ""is_memory"": false, ""hash"": ""d1ea238d204e5b9ae783c86e4af5bcdf21267c1f391de3e468d9665494ee012a"" } ] ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",973139047,Rethink how .ext formats (v.s. 
?_format=) works before 1.0, https://github.com/simonw/sqlite-utils/issues/411#issuecomment-1065597709,https://api.github.com/repos/simonw/sqlite-utils/issues/411,1065597709,IC_kwDOCGYnMM4_g7sN,9599,simonw,2022-03-11T22:32:43Z,2022-03-11T22:32:43Z,OWNER,"Trying to figure out what that extra field in `table_xinfo` compared to `table_info` is: ``` >>> list(db.query(""PRAGMA table_xinfo('t')"")) [{'cid': 0, 'name': 'body', 'type': 'TEXT', 'notnull': 0, 'dflt_value': None, 'pk': 0, 'hidden': 0}, {'cid': 1, 'name': 'd', 'type': 'INT', 'notnull': 0, 'dflt_value': None, 'pk': 0, 'hidden': 2}] ``` Presumably `hidden` 0 v.s. 2 v.s. other values has meaning.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1160034488,Support for generated columns, https://github.com/simonw/sqlite-utils/issues/411#issuecomment-1065596417,https://api.github.com/repos/simonw/sqlite-utils/issues/411,1065596417,IC_kwDOCGYnMM4_g7YB,9599,simonw,2022-03-11T22:30:15Z,2022-03-11T22:30:15Z,OWNER,"I tried it out in Jupyter and it works as advertised: Introspection is a bit weird: there doesn't seem to be a way to introspect generated columns outside of parsing the stored SQL schema for the columns at the moment! 
And the `.columns` method doesn't return them at all: https://github.com/simonw/sqlite-utils/blob/433813612ff9b4b501739fd7543bef0040dd51fe/sqlite_utils/db.py#L1207-L1213 Here's why: ``` >>> db.execute(""PRAGMA table_info('t')"").fetchall() [(0, 'body', 'TEXT', 0, None, 0)] >>> db.execute(""PRAGMA table_xinfo('t')"").fetchall() [(0, 'body', 'TEXT', 0, None, 0, 0), (1, 'd', 'INT', 0, None, 0, 2)] ``` So `table_xinfo()` is needed to get back columns including generated columns: https://www.sqlite.org/pragma.html#pragma_table_xinfo > **PRAGMA** *schema.***table_xinfo(***table-name***);** > > This pragma returns one row for each column in the named table, including [hidden columns](https://www.sqlite.org/vtab.html#hiddencol) in virtual tables. The output is the same as for [PRAGMA table_info](https://www.sqlite.org/pragma.html#pragma_table_info) except that hidden columns are shown rather than being omitted.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1160034488,Support for generated columns, https://github.com/simonw/sqlite-utils/issues/411#issuecomment-1065402557,https://api.github.com/repos/simonw/sqlite-utils/issues/411,1065402557,IC_kwDOCGYnMM4_gMC9,9599,simonw,2022-03-11T19:01:08Z,2022-03-11T21:42:25Z,OWNER,"Just spotted this in https://www.sqlite.org/gencol.html > The only functional difference is that one cannot add new STORED columns using the [ALTER TABLE ADD COLUMN](https://www.sqlite.org/lang_altertable.html#altertabaddcol) command. Only VIRTUAL columns can be added using ALTER TABLE. So to add stored columns to an existing table we would need to use the `.transform()` trick. Which implies that this should actually be a capability of the various `.create()` methods, since transform works by creating a new table with those and then copying across the old data. 
Here's where `.transform()` calls `.create_table_sql()` under the hood: https://github.com/simonw/sqlite-utils/blob/9388edf57aa15719095e3cf0952c1653cd070c9b/sqlite_utils/db.py#L1627-L1637","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1160034488,Support for generated columns, https://github.com/simonw/sqlite-utils/issues/411#issuecomment-1065389386,https://api.github.com/repos/simonw/sqlite-utils/issues/411,1065389386,IC_kwDOCGYnMM4_gI1K,9599,simonw,2022-03-11T18:42:53Z,2022-03-11T21:40:51Z,OWNER,"The Python API could be: ```python db[table_name].add_generated_column(""field"", str, ""json_extract(data, '$.field')"", stored=True) ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1160034488,Support for generated columns, https://github.com/simonw/sqlite-utils/issues/411#issuecomment-1065458729,https://api.github.com/repos/simonw/sqlite-utils/issues/411,1065458729,IC_kwDOCGYnMM4_gZwp,9599,simonw,2022-03-11T19:58:50Z,2022-03-11T20:00:25Z,OWNER,"I'm coming round to your suggestion to have this as extra arguments to `sqlite-utils add-column` now, especially since you also need to pass a column type. I'd like to come up with syntax for `sqlite-utils create-table` as well. 
https://sqlite-utils.datasette.io/en/stable/cli-reference.html#create-table Maybe extra `--generated-stored colname expression` (and `--generated`) options would work there.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1160034488,Support for generated columns, https://github.com/simonw/sqlite-utils/issues/411#issuecomment-1065440445,https://api.github.com/repos/simonw/sqlite-utils/issues/411,1065440445,IC_kwDOCGYnMM4_gVS9,9599,simonw,2022-03-11T19:52:15Z,2022-03-11T19:52:15Z,OWNER,"Two new parameters to `.create_table()` and friends: - `generated={...}` - generated column definitions - `generated_stored={...}` generated stored column definitions These columns will be added at the end of the table, but you can use the `column_order=` parameter to apply a different order.","{""total_count"": 1, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 1, ""rocket"": 0, ""eyes"": 0}",1160034488,Support for generated columns, https://github.com/simonw/sqlite-utils/issues/411#issuecomment-1065386352,https://api.github.com/repos/simonw/sqlite-utils/issues/411,1065386352,IC_kwDOCGYnMM4_gIFw,9599,simonw,2022-03-11T18:41:37Z,2022-03-11T18:41:37Z,OWNER,"I like `add-generated-column` - feels very clear to me, and is a nice place for adding logic that checks if the DB version supports it or not and shows a useful error.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1160034488,Support for generated columns, https://github.com/simonw/sqlite-utils/issues/414#issuecomment-1065384183,https://api.github.com/repos/simonw/sqlite-utils/issues/414,1065384183,IC_kwDOCGYnMM4_gHj3,9599,simonw,2022-03-11T18:40:39Z,2022-03-11T18:40:39Z,OWNER,"That fixed it: ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 
0}",1166731361,I forgot to include the changelog in the 3.25.1 release, https://github.com/simonw/sqlite-utils/issues/414#issuecomment-1065382145,https://api.github.com/repos/simonw/sqlite-utils/issues/414,1065382145,IC_kwDOCGYnMM4_gHEB,9599,simonw,2022-03-11T18:39:05Z,2022-03-11T18:39:05Z,OWNER,"https://sqlite-utils.datasette.io/en/3.25.1/changelog.html is currently wrong: ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1166731361,I forgot to include the changelog in the 3.25.1 release, https://github.com/simonw/sqlite-utils/issues/414#issuecomment-1065381047,https://api.github.com/repos/simonw/sqlite-utils/issues/414,1065381047,IC_kwDOCGYnMM4_gGy3,9599,simonw,2022-03-11T18:38:27Z,2022-03-11T18:38:27Z,OWNER,"OK that fixed it here: https://sqlite-utils.datasette.io/en/stable/changelog.html I'm going to trigger a rebuild of `3.25.1` too: ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1166731361,I forgot to include the changelog in the 3.25.1 release, https://github.com/simonw/sqlite-utils/issues/414#issuecomment-1065380286,https://api.github.com/repos/simonw/sqlite-utils/issues/414,1065380286,IC_kwDOCGYnMM4_gGm-,9599,simonw,2022-03-11T18:37:23Z,2022-03-11T18:37:23Z,OWNER,"On ReadTheDocs that triggered a new `stable` build but it didn't seem to trigger a new build of `3.25.1`: https://readthedocs.org/projects/sqlite-utils/builds/ ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1166731361,I forgot to include the changelog in the 3.25.1 release, https://github.com/simonw/sqlite-utils/issues/414#issuecomment-1065379528,https://api.github.com/repos/simonw/sqlite-utils/issues/414,1065379528,IC_kwDOCGYnMM4_gGbI,9599,simonw,2022-03-11T18:36:17Z,2022-03-11T18:36:17Z,OWNER,"I created a new tag and release: 
https://github.com/simonw/sqlite-utils/releases/tag/3.25.1 And I cancelled the publish workflow: https://github.com/simonw/sqlite-utils/actions/runs/1970415399","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1166731361,I forgot to include the changelog in the 3.25.1 release, https://github.com/simonw/sqlite-utils/issues/414#issuecomment-1065378902,https://api.github.com/repos/simonw/sqlite-utils/issues/414,1065378902,IC_kwDOCGYnMM4_gGRW,9599,simonw,2022-03-11T18:35:26Z,2022-03-11T18:35:26Z,OWNER,I deleted both the release and the tag from GitHub using the web interface.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1166731361,I forgot to include the changelog in the 3.25.1 release, https://github.com/simonw/sqlite-utils/issues/414#issuecomment-1065377926,https://api.github.com/repos/simonw/sqlite-utils/issues/414,1065377926,IC_kwDOCGYnMM4_gGCG,9599,simonw,2022-03-11T18:34:05Z,2022-03-11T18:34:05Z,OWNER,"Two options: - Delete and recreate the release on GitHub, triggering it to be fixed on Read The Docs (as the `stable` version) - but cancel the push to PyPI, since that platform doesn't allow package versions to be over-written and in this case since the changelog file isn't included in the PyPI package there should be no change at all - Push a 3.25.2 release. 
I'm going to try and do that first option.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1166731361,I forgot to include the changelog in the 3.25.1 release, https://github.com/simonw/sqlite-utils/issues/413#issuecomment-1065357081,https://api.github.com/repos/simonw/sqlite-utils/issues/413,1065357081,IC_kwDOCGYnMM4_gA8Z,9599,simonw,2022-03-11T18:07:10Z,2022-03-11T18:07:10Z,OWNER,I'm really happy with this improvement.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1166587040,Display autodoc type information more legibly, https://github.com/simonw/sqlite-utils/issues/413#issuecomment-1065345515,https://api.github.com/repos/simonw/sqlite-utils/issues/413,1065345515,IC_kwDOCGYnMM4_f-Hr,9599,simonw,2022-03-11T17:52:22Z,2022-03-11T17:52:22Z,OWNER,"Well this is a huge improvement! https://sqlite-utils.datasette.io/en/latest/reference.html#sqlite_utils.db.Table.insert I'm not crazy about the `extracts=` thing though - I wonder if there's a neat way to customize that to be less verbose?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1166587040,Display autodoc type information more legibly, https://github.com/simonw/sqlite-utils/issues/413#issuecomment-1065249883,https://api.github.com/repos/simonw/sqlite-utils/issues/413,1065249883,IC_kwDOCGYnMM4_fmxb,9599,simonw,2022-03-11T16:03:35Z,2022-03-11T16:03:35Z,OWNER,"Applying this change fixes that: ```diff diff --git a/sqlite_utils/db.py b/sqlite_utils/db.py index 3bc528f..2a79711 100644 --- a/sqlite_utils/db.py +++ b/sqlite_utils/db.py @@ -2293,18 +2293,18 @@ class Table(Queryable): """""" Apply conversion function ``fn`` to every value in the specified columns. - - ``columns`` - a single column or list of string column names to convert. 
- - ``fn`` - a callable that takes a single argument, ``value``, and returns it converted. - - ``output`` - optional string column name to write the results to (defaults to the input column). - - ``output_type`` - if the output column needs to be created, this is the type that will be used + :param columns: a single column or list of string column names to convert. + :param fn: a callable that takes a single argument, ``value``, and returns it converted. + :param output: optional string column name to write the results to (defaults to the input column). + :param output_type: if the output column needs to be created, this is the type that will be used for the new column. - - ``drop`` - boolean, should the original column be dropped once the conversion is complete? - - ``multi`` - boolean, if ``True`` the return value of ``fn(value)`` will be expected to be a + :param drop: boolean, should the original column be dropped once the conversion is complete? + :param multi: boolean, if ``True`` the return value of ``fn(value)`` will be expected to be a dictionary, and new columns will be created for each key of that dictionary. - - ``where`` - a SQL fragment to use as a ``WHERE`` clause to limit the rows to which the conversion + :param where: a SQL fragment to use as a ``WHERE`` clause to limit the rows to which the conversion is applied, for example ``age > ?`` or ``age > :age``. - - ``where_args`` - a list of arguments (if using ``?``) or a dictionary (if using ``:age``). - - ``show_progress`` - boolean, should a progress bar be displayed? + :param where_args: a list of arguments (if using ``?``) or a dictionary (if using ``:age``). + :param show_progress: boolean, should a progress bar be displayed? See :ref:`python_api_convert`. 
"""""" ``` ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1166587040,Display autodoc type information more legibly, https://github.com/simonw/sqlite-utils/issues/413#issuecomment-1065247619,https://api.github.com/repos/simonw/sqlite-utils/issues/413,1065247619,IC_kwDOCGYnMM4_fmOD,9599,simonw,2022-03-11T16:01:20Z,2022-03-11T16:01:20Z,OWNER,"Definitely an improvement! It does highlight that I'm not currently using the `:param XXX: description` syntax though, which should move my descriptions of each parameter into that generated list.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1166587040,Display autodoc type information more legibly, https://github.com/simonw/sqlite-utils/issues/413#issuecomment-1065245831,https://api.github.com/repos/simonw/sqlite-utils/issues/413,1065245831,IC_kwDOCGYnMM4_flyH,9599,simonw,2022-03-11T15:59:14Z,2022-03-11T15:59:14Z,OWNER,"Hint from https://twitter.com/AdamChainz/status/1502311047612575745 > Try: > > `autodoc_typehints = 'description'` > > For a list-of-arguments format > > https://sphinx-doc.org/en/master/usage/extensions/autodoc.html#confval-autodoc_typehints","{""total_count"": 1, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 1, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1166587040,Display autodoc type information more legibly, https://github.com/simonw/datasette/issues/1655#issuecomment-1062445113,https://api.github.com/repos/simonw/datasette/issues/1655,1062445113,IC_kwDOBm6k_c4_U6A5,9599,simonw,2022-03-09T01:01:24Z,2022-03-09T01:01:24Z,OWNER,"https://labordata.bunkum.us/-/settings shows `max_returned_rows` had been increased to 5,000 for that instance - the default of 1,000 would help a bit here. 
Any thoughts on how Datasette could handle this kind of thing better?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1163369515,query result page is using 400mb of browser memory 40x size of html page and 400x size of csv data, https://github.com/simonw/datasette/issues/932#issuecomment-1061891851,https://api.github.com/repos/simonw/datasette/issues/932,1061891851,IC_kwDOBm6k_c4_Sy8L,9599,simonw,2022-03-08T15:20:48Z,2022-03-08T15:20:48Z,OWNER,Made a start on this here: https://datasette.io/tutorials ,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",678760988,End-user documentation, https://github.com/simonw/datasette/issues/1651#issuecomment-1061359915,https://api.github.com/repos/simonw/datasette/issues/1651,1061359915,IC_kwDOBm6k_c4_QxEr,9599,simonw,2022-03-08T03:08:14Z,2022-03-08T03:09:24Z,OWNER,"A lot of the code complexity here is caused by `DataView` ([here](https://github.com/simonw/datasette/blob/c5791156d92615f25696ba93dae5bb2dcc192c98/datasette/views/base.py#L182-L669)), which has the logic for CSV streaming and plugin formats such that it can be shared between tables and custom queries. 
It would be good to get rid of that subclassed shared code, figure out how to do it via a utility function instead.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1161584460,Get rid of the no-longer necessary ?_format=json hack for tables called x.json, https://github.com/simonw/datasette/issues/1645#issuecomment-1061355871,https://api.github.com/repos/simonw/datasette/issues/1645,1061355871,IC_kwDOBm6k_c4_QwFf,9599,simonw,2022-03-08T02:59:28Z,2022-03-08T02:59:28Z,OWNER,"Hah, found a TODO about this: https://github.com/simonw/datasette/blob/c5791156d92615f25696ba93dae5bb2dcc192c98/datasette/app.py#L997-L999","{""total_count"": 1, ""+1"": 0, ""-1"": 0, ""laugh"": 1, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1154399841,"Sensible `cache-control` headers for static assets, including those served by plugins", https://github.com/simonw/datasette/issues/647#issuecomment-1061282743,https://api.github.com/repos/simonw/datasette/issues/647,1061282743,IC_kwDOBm6k_c4_QeO3,9599,simonw,2022-03-08T00:32:34Z,2022-03-08T00:32:47Z,OWNER,It would be neat if the plugin could spot old-style hyphen hash URLs (maybe on 404) and redirect those too.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",531755959,Move hashed URL mode out to a plugin, https://github.com/simonw/datasette/issues/647#issuecomment-1061276646,https://api.github.com/repos/simonw/datasette/issues/647,1061276646,IC_kwDOBm6k_c4_Qcvm,9599,simonw,2022-03-08T00:22:11Z,2022-03-08T00:22:11Z,OWNER,I'm now convinced this is feasible enough that it's worth doing in time for Datasette 1.0.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",531755959,Move hashed URL mode out to a plugin, 
https://github.com/simonw/datasette/issues/647#issuecomment-1061276399,https://api.github.com/repos/simonw/datasette/issues/647,1061276399,IC_kwDOBm6k_c4_Qcrv,9599,simonw,2022-03-08T00:21:47Z,2022-03-08T00:21:47Z,OWNER,"This seems to do the job: ```python @hookimpl def startup(datasette): for name, database in datasette.databases.items(): if database.hash: new_name = ""{}_{}"".format(name, database.hash[:7]) del datasette.databases[name] datasette.databases[new_name] = database ``` Would have to teach the rest of the plugin to split on `_` and to only redirect if the user seems to be hitting the URL for an old hash after which Datasette has been restarted with an updated database.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",531755959,Move hashed URL mode out to a plugin, https://github.com/simonw/datasette/issues/647#issuecomment-1061272544,https://api.github.com/repos/simonw/datasette/issues/647,1061272544,IC_kwDOBm6k_c4_Qbvg,9599,simonw,2022-03-08T00:14:42Z,2022-03-08T00:14:42Z,OWNER,Maybe the plugin should interfere with `datasette.databases` on startup and change the registered name for each one?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",531755959,Move hashed URL mode out to a plugin, https://github.com/simonw/datasette/issues/647#issuecomment-1061267615,https://api.github.com/repos/simonw/datasette/issues/647,1061267615,IC_kwDOBm6k_c4_Qaif,9599,simonw,2022-03-08T00:05:43Z,2022-03-08T00:05:43Z,OWNER,"Built a prototype of that plugin: ```python from datasette import hookimpl from functools import wraps @hookimpl def asgi_wrapper(datasette): def wrap_with_hashed_urls(app): @wraps(app) async def hashed_urls(scope, receive, send): # Only triggers on pages with a path not starting in /-/ # and where the first page component matches a database name if scope.get(""type"") != ""http"": await 
app(scope, receive, send) return path = scope[""path""].lstrip(""/"") if not path or path.startswith(""-/""): await app(scope, receive, send) return potential_database = path.split(""/"")[0] # It may or may not be already dbname~hash if ""~"" in potential_database: db_name, hash = potential_database.split(""~"", 1) else: db_name = potential_database hash = """" # Is db_name a database we have a hash for? try: db = datasette.get_database(db_name) except KeyError: await app(scope, receive, send) return if db.hash is not None: # TODO: make sure db.hash is documented if db.hash[:7] != hash: # Send a redirect to the dbname~hash form that the check above parses path_bits = path.split(""/"") new_path = ""/"" + ""/"".join([""{}~{}"".format(db_name, db.hash[:7])] + path_bits[1:]) if scope.get(""query_string""): new_path += ""?"" + scope[""query_string""].decode(""latin-1"") await send({ ""type"": ""http.response.start"", ""status"": 302, ""headers"": [ [b""location"", new_path.encode(""latin1"")] ], }) await send({""type"": ""http.response.body"", ""body"": b""""}) return else: # Add a far-future cache header async def wrapped_send(event): if event[""type""] == ""http.response.start"": original_headers = event.get(""headers"") or [] event = { ""type"": event[""type""], ""status"": event[""status""], ""headers"": original_headers + [ [b""Cache-Control"", b""max-age=31536000""] ], } await send(event) await app(scope, receive, wrapped_send) return await app(scope, receive, send) return hashed_urls return wrap_with_hashed_urls ``` One catch: it doesn't affect the way URLs are generated - so every internal link within Datasette links to the non-hash version and then triggers a 302 redirect to the hashed version.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",531755959,Move hashed URL mode out to a plugin, 
https://github.com/simonw/datasette/issues/647#issuecomment-1061226942,https://api.github.com/repos/simonw/datasette/issues/647,1061226942,IC_kwDOBm6k_c4_QQm-,9599,simonw,2022-03-07T23:00:06Z,2022-03-07T23:00:06Z,OWNER,"This needs to take into account the changes made here: - #1439 In the new encoding scheme, `-` has a special meaning in a table name: https://docs.datasette.io/en/latest/internals.html#dash-encoding I think `~` is the right character to use to separate a database name from its hash. `~` should be a URL safe character according to Python's implementation of percent-encoding, see comment here: https://github.com/simonw/datasette/blob/c5791156d92615f25696ba93dae5bb2dcc192c98/datasette/utils/__init__.py#L1146-L1152 So the plugin could check for `dbname~hash` and react based on that.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",531755959,Move hashed URL mode out to a plugin, https://github.com/simonw/datasette/issues/1651#issuecomment-1061223822,https://api.github.com/repos/simonw/datasette/issues/1651,1061223822,IC_kwDOBm6k_c4_QP2O,9599,simonw,2022-03-07T22:54:54Z,2022-03-07T22:54:54Z,OWNER,"I'm going to do a review of how URL routing works at the moment for the various views. 
I edited down [the full list](https://github.com/simonw/datasette/blob/c5791156d92615f25696ba93dae5bb2dcc192c98/datasette/app.py#L997-L1107) a bit - these are the most relevant: ```python add_route(IndexView.as_view(self), r""/(?P<as_format>(\.jsono?)?$)"") add_route( DatabaseView.as_view(self), r""/(?P<db_name>[^/]+?)(?P<as_format>"" + renderer_regex + r""|.jsono|\.csv)?$"", ) add_route( TableView.as_view(self), r""/(?P<db_name>[^/]+)/(?P<table_and_format>[^/]+?$)"", ) add_route( RowView.as_view(self), r""/(?P<db_name>[^/]+)/(?P<table>[^/]+?)/(?P<pk_path>[^/]+?)(?P<as_format>"" + renderer_regex + r"")?$"", ) ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1161584460,Get rid of the no-longer necessary ?_format=json hack for tables called x.json, https://github.com/simonw/datasette/issues/1654#issuecomment-1061197133,https://api.github.com/repos/simonw/datasette/issues/1654,1061197133,IC_kwDOBm6k_c4_QJVN,9599,simonw,2022-03-07T22:19:35Z,2022-03-07T22:19:35Z,OWNER,"Also now live on https://datasette.io ![CleanShot 2022-03-07 at 14 18 30@2x](https://user-images.githubusercontent.com/9599/157127424-805b3166-f0a8-4fac-be87-c055740af580.png) ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1161969891,Adopt a code of conduct, https://github.com/simonw/datasette/issues/1654#issuecomment-1061184206,https://api.github.com/repos/simonw/datasette/issues/1654,1061184206,IC_kwDOBm6k_c4_QGLO,9599,simonw,2022-03-07T22:04:51Z,2022-03-07T22:04:51Z,OWNER,I'm going to add this to the main Datasette repo (done) and the `datasette.io` website too.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1161969891,Adopt a code of conduct, 
https://github.com/simonw/datasette/issues/1654#issuecomment-1061182132,https://api.github.com/repos/simonw/datasette/issues/1654,1061182132,IC_kwDOBm6k_c4_QFq0,9599,simonw,2022-03-07T22:02:43Z,2022-03-07T22:02:43Z,OWNER,"Neat, GitHub have a template for this https://github.com/simonw/datasette/community/code-of-conduct/new?template=contributor-covenant","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1161969891,Adopt a code of conduct, https://github.com/simonw/datasette/issues/1654#issuecomment-1061181530,https://api.github.com/repos/simonw/datasette/issues/1654,1061181530,IC_kwDOBm6k_c4_QFha,9599,simonw,2022-03-07T22:02:06Z,2022-03-07T22:02:06Z,OWNER,https://docs.github.com/en/communities/setting-up-your-project-for-healthy-contributions/adding-a-code-of-conduct-to-your-project says this should be called `CODE_OF_CONDUCT.md` in order for GitHub to pick it up.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1161969891,Adopt a code of conduct, https://github.com/simonw/datasette/issues/1654#issuecomment-1061181089,https://api.github.com/repos/simonw/datasette/issues/1654,1061181089,IC_kwDOBm6k_c4_QFah,9599,simonw,2022-03-07T22:01:38Z,2022-03-07T22:01:38Z,OWNER,"I'm going to use the [widely adopted](https://www.contributor-covenant.org/adopters/) Contributor Covenant: https://www.contributor-covenant.org/version/1/4/code-of-conduct/","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1161969891,Adopt a code of conduct, https://github.com/simonw/datasette/issues/1651#issuecomment-1061170897,https://api.github.com/repos/simonw/datasette/issues/1651,1061170897,IC_kwDOBm6k_c4_QC7R,9599,simonw,2022-03-07T21:48:35Z,2022-03-07T21:48:35Z,OWNER,"My attempts to simplify `get_format()` keep resulting in errors like this one: ``` 
File ""/Users/simon/Dropbox/Development/datasette/datasette/views/base.py"", line 474, in view_get response_or_template_contexts = await self.data( TypeError: TableView.data() missing 1 required positional argument: 'table' ``` I really need to clean this up.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1161584460,Get rid of the no-longer necessary ?_format=json hack for tables called x.json, https://github.com/simonw/datasette/issues/1651#issuecomment-1061169528,https://api.github.com/repos/simonw/datasette/issues/1651,1061169528,IC_kwDOBm6k_c4_QCl4,9599,simonw,2022-03-07T21:47:01Z,2022-03-07T21:47:01Z,OWNER,"Wow, this code is difficult to follow! Look at this bit inside the `get_format()` method: https://github.com/simonw/datasette/blob/bb499942c15c4e2cfa4b6afab8f8debe5948c009/datasette/views/base.py#L469-L478 That's modifying the arguments that were extracted from the path by the routing regular expressions to have `table` as a dash-decoded value! So calling `.get_format()` has the side effect of decoding the table names for you. Nasty.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1161584460,Get rid of the no-longer necessary ?_format=json hack for tables called x.json, https://github.com/simonw/datasette/issues/1653#issuecomment-1061150672,https://api.github.com/repos/simonw/datasette/issues/1653,1061150672,IC_kwDOBm6k_c4_P9_Q,9599,simonw,2022-03-07T21:23:39Z,2022-03-07T21:23:39Z,OWNER,"There may be a short-term fix for this: table view could start accepting a `?_sort_sql=SQLfragment` parameter, similar to the `?_where=` parameter described here: https://docs.datasette.io/en/stable/json_api.html#special-table-arguments That fragment could then be pre-populated in `metadata`. 
Makes me think maybe that `?_where=` should be optionally settable in metadata too?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1161937073,Mechanism to default a table to sorting by multiple columns, https://github.com/simonw/datasette/issues/1653#issuecomment-1061148807,https://api.github.com/repos/simonw/datasette/issues/1653,1061148807,IC_kwDOBm6k_c4_P9iH,9599,simonw,2022-03-07T21:21:23Z,2022-03-07T21:21:23Z,OWNER,"This is currently blocked on the fact that Datasette doesn't have a mechanism for sorting by more than one column: - #197","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1161937073,Mechanism to default a table to sorting by multiple columns, https://github.com/simonw/datasette/issues/1651#issuecomment-1061053094,https://api.github.com/repos/simonw/datasette/issues/1651,1061053094,IC_kwDOBm6k_c4_PmKm,9599,simonw,2022-03-07T19:29:01Z,2022-03-07T19:29:01Z,OWNER,"I found an obscure bug in #1650 which I can fix with this too. 
The following test should pass: ```python @pytest.mark.parametrize( ""path,expected"", ( ( ""/fivethirtyeight/twitter-ratio%2Fsenators"", ""/fivethirtyeight/twitter-2Dratio-2Fsenators"", ), ( ""/fixtures/table%2Fwith%2Fslashes.csv"", ""/fixtures/table-2Fwith-2Fslashes-2Ecsv"", ), # query string should be preserved (""/foo/bar%2Fbaz?id=5"", ""/foo/bar-2Fbaz?id=5""), ), ) def test_redirect_percent_encoding_to_dash_encoding(app_client, path, expected): response = app_client.get(path) assert response.status == 302 assert response.headers[""location""] == expected ``` It currently fails like this: ``` > assert response.headers[""location""] == expected E AssertionError: assert '/fixtures/table-2Fwith-2Fslashes.csv?_nofacet=1&_nocount=1' == '/fixtures/table-2Fwith-2Fslashes-2Ecsv' E - /fixtures/table-2Fwith-2Fslashes-2Ecsv E + /fixtures/table-2Fwith-2Fslashes.csv?_nofacet=1&_nocount=1 ``` Because the logic in that `get_format()` function notices that the table exists, and then weird things happen here: https://github.com/simonw/datasette/blob/1baa030eca375f839f3471237547ab403523e643/datasette/views/base.py#L288-L303 ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1161584460,Get rid of the no-longer necessary ?_format=json hack for tables called x.json, https://github.com/simonw/datasette/issues/1650#issuecomment-1061041034,https://api.github.com/repos/simonw/datasette/issues/1650,1061041034,IC_kwDOBm6k_c4_PjOK,9599,simonw,2022-03-07T19:16:51Z,2022-03-07T19:16:51Z,OWNER,"Here's the problem: https://github.com/simonw/datasette/blob/020effe47bf89f35182960a9645f2383a42ebd54/datasette/utils/__init__.py#L1173-L1175 Which is called here: https://github.com/simonw/datasette/blob/1baa030eca375f839f3471237547ab403523e643/datasette/views/base.py#L469-L473 So `table%2Fwith%2Fslashes` ends up decoded as if it was using dash encoding.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, 
""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1160750713,Implement redirects from old % encoding to new dash encoding, https://github.com/simonw/datasette/issues/1650#issuecomment-1061038414,https://api.github.com/repos/simonw/datasette/issues/1650,1061038414,IC_kwDOBm6k_c4_PilO,9599,simonw,2022-03-07T19:14:04Z,2022-03-07T19:14:04Z,OWNER,"The problem seems to be that `http://127.0.0.1:8002/fixtures/table%2Fwith%2Fslashes.csv` doesn't result in a 404 at all. If it did, it would be redirected.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1160750713,Implement redirects from old % encoding to new dash encoding, https://github.com/simonw/datasette/issues/1439#issuecomment-1060870237,https://api.github.com/repos/simonw/datasette/issues/1439,1060870237,IC_kwDOBm6k_c4_O5hd,9599,simonw,2022-03-07T16:19:22Z,2022-03-07T16:19:22Z,OWNER,"I didn't need to do any of the fancy regular expression routing stuff after all, since the new dash encoding format avoids using `/` so a simple `[^/]+` can capture the correct segments from the URL.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",973139047,Rethink how .ext formats (v.s. 
?_format=) works before 1.0, https://github.com/simonw/datasette/issues/1650#issuecomment-1060864823,https://api.github.com/repos/simonw/datasette/issues/1650,1060864823,IC_kwDOBm6k_c4_O4M3,9599,simonw,2022-03-07T16:14:33Z,2022-03-07T16:14:33Z,OWNER,Same problem here: https://fivethirtyeight.datasettes.com/fivethirtyeight/ahca-2Dpolls%2Fahca_polls should redirect but doesn't.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1160750713,Implement redirects from old % encoding to new dash encoding, https://github.com/simonw/datasette/issues/1650#issuecomment-1060863311,https://api.github.com/repos/simonw/datasette/issues/1650,1060863311,IC_kwDOBm6k_c4_O31P,9599,simonw,2022-03-07T16:13:17Z,2022-03-07T16:13:17Z,OWNER,"This doesn't seem to work. https://latest.datasette.io/fixtures/table%2Fwith%2Fslashes.csv should be redirecting now that this is deployed - which it is, because https://latest.datasette.io/-/versions shows 644d25d1de78a36b105cca479e7b3e4375a6eadc - but I'm not getting that redirect.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1160750713,Implement redirects from old % encoding to new dash encoding, https://github.com/simonw/datasette/issues/1651#issuecomment-1060853226,https://api.github.com/repos/simonw/datasette/issues/1651,1060853226,IC_kwDOBm6k_c4_O1Xq,9599,simonw,2022-03-07T16:04:26Z,2022-03-07T16:04:26Z,OWNER,"Here's the relevant code: https://github.com/simonw/datasette/blob/1baa030eca375f839f3471237547ab403523e643/datasette/utils/__init__.py#L753-L772 https://github.com/simonw/datasette/blob/1baa030eca375f839f3471237547ab403523e643/datasette/views/base.py#L451-L479","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1161584460,Get rid of the no-longer necessary ?_format=json hack for tables called 
x.json, https://github.com/simonw/datasette/issues/1650#issuecomment-1060836262,https://api.github.com/repos/simonw/datasette/issues/1650,1060836262,IC_kwDOBm6k_c4_OxOm,9599,simonw,2022-03-07T15:52:09Z,2022-03-07T15:52:09Z,OWNER,"This is a bit tricky. I tried this, sending a redirect only if a 404 happens: ```diff diff --git a/datasette/app.py b/datasette/app.py index 8c5480c..420664c 100644 --- a/datasette/app.py +++ b/datasette/app.py @@ -1211,6 +1211,10 @@ class DatasetteRouter: return await self.handle_404(request, send) async def handle_404(self, request, send, exception=None): + # If path contains % encoding, redirect to dash encoding + if '%' in request.scope[""path""]: + await asgi_send_redirect(send, request.scope[""path""].replace(""%"", ""-"")) + return # If URL has a trailing slash, redirect to URL without it path = request.scope.get( ""raw_path"", request.scope[""path""].encode(""utf8"") ``` But this URL didn't work: - http://127.0.0.1:8001/fivethirtyeight/twitter-ratio%2Fsenators I was expecting that to redirect to this page: - http://127.0.0.1:8001/fivethirtyeight/twitter-2Dratio-2Fsenators But instead it took me to another 404: - http://127.0.0.1:8001/fivethirtyeight/twitter-ratio%2Fsenators This is because that URL contains both a %-escaped `/` AND a plain `-` - which was not escaped in the old system but is escaped in the new system.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1160750713,Implement redirects from old % encoding to new dash encoding, https://github.com/simonw/datasette/pull/1648#issuecomment-1060067031,https://api.github.com/repos/simonw/datasette/issues/1648,1060067031,IC_kwDOBm6k_c4_L1bX,9599,simonw,2022-03-06T23:50:40Z,2022-03-06T23:58:31Z,OWNER,"I may have to do extra work here ```python def database(self, database, format=None): db = self.ds.databases[database] if self.ds.setting(""hash_urls"") and db.hash: path = self.path( 
f""{dash_encode(database)}-{db.hash[:HASH_LENGTH]}"", format=format ) else: path = self.path(dash_encode(database), format=format) return path ``` The URLs that incorporate a hash have a `dbname-hash` format - will that `-` in the middle there mess up the dash decoding mechanism? I think it will. Might be able to solve that like so: ```python async def resolve_db_name(self, request, db_name, **kwargs): hash = None name = None decoded_name = dash_decode(db_name) if decoded_name not in self.ds.databases and ""-"" in db_name: # No matching DB found, maybe it's a name-hash? name_bit, hash_bit = db_name.rsplit(""-"", 1) if dash_decode(name_bit) not in self.ds.databases: raise NotFound(f""Database not found: {name}"") else: name = dash_decode(name_bit) hash = hash_bit else: name = decoded_name ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1160432941,Use dash encoding for table names and row primary keys in URLs, https://github.com/simonw/datasette/pull/1648#issuecomment-1060065736,https://api.github.com/repos/simonw/datasette/issues/1648,1060065736,IC_kwDOBm6k_c4_L1HI,9599,simonw,2022-03-06T23:43:00Z,2022-03-06T23:43:11Z,OWNER,"> * Maybe use dash encoding for database name too? Yes, I'm going to do this. 
At the moment if a DB file is called `fixx%tures.db` when you run it in Datasette the path is `/fix%2525tures` - which is liable to break.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1160432941,Use dash encoding for table names and row primary keys in URLs, https://github.com/simonw/datasette/pull/1648#issuecomment-1060056510,https://api.github.com/repos/simonw/datasette/issues/1648,1060056510,IC_kwDOBm6k_c4_Ly2-,9599,simonw,2022-03-06T23:02:05Z,2022-03-06T23:04:24Z,OWNER,"Just spotted this: https://github.com/simonw/datasette/blob/de810f49cc57a4f88e4a1553d26c579253ce4531/datasette/views/base.py#L203-L216 Maybe the db name should use dash encoding too? If so, relevant code includes this bit: https://github.com/simonw/datasette/blob/de810f49cc57a4f88e4a1553d26c579253ce4531/datasette/url_builder.py#L30-L38","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1160432941,Use dash encoding for table names and row primary keys in URLs, https://github.com/simonw/datasette/pull/1648#issuecomment-1060044592,https://api.github.com/repos/simonw/datasette/issues/1648,1060044592,IC_kwDOBm6k_c4_Lv8w,9599,simonw,2022-03-06T21:42:35Z,2022-03-06T21:42:35Z,OWNER,"For consistency, I'm going to change how `?_next=` tokens work too. Right now they work like this: https://github.com/simonw/datasette/blob/de810f49cc57a4f88e4a1553d26c579253ce4531/datasette/views/table.py#L501-L507 https://github.com/simonw/datasette/blob/de810f49cc57a4f88e4a1553d26c579253ce4531/datasette/utils/__init__.py#L114-L116 I'm going to change those to use dash-encoding instead. 
I considered looking for `%` in those values and replacing that as `-` too, but since Datasette isn't 1.0 yet I'm going to risk breaking any pagination tokens that people might have saved away somewhere!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1160432941,Use dash encoding for table names and row primary keys in URLs, https://github.com/simonw/datasette/issues/1439#issuecomment-1060044007,https://api.github.com/repos/simonw/datasette/issues/1439,1060044007,IC_kwDOBm6k_c4_Lvzn,9599,simonw,2022-03-06T21:38:15Z,2022-03-06T21:38:15Z,OWNER,"Test: https://github.com/simonw/datasette/blob/d2e3fe3facf0ed0abf8b00cd54463af90dd6904d/tests/test_utils.py#L651-L666 One big advantage to this scheme is that redirecting old links to `%2F` pages (e.g. https://fivethirtyeight.datasettes.com/fivethirtyeight/twitter-ratio%2Fsenators) is easy - if you see a `%` in the `raw_path`, redirect to that page with the `%` replaced by `-`.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",973139047,Rethink how .ext formats (v.s. 
?_format=) works before 1.0, https://github.com/simonw/datasette/pull/1648#issuecomment-1060034562,https://api.github.com/repos/simonw/datasette/issues/1648,1060034562,IC_kwDOBm6k_c4_LtgC,9599,simonw,2022-03-06T20:36:12Z,2022-03-06T20:36:12Z,OWNER,"Updated documentation: ![image](https://user-images.githubusercontent.com/9599/156941171-89778c12-41bc-4951-97f2-ecc805025a53.png) ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1160432941,Use dash encoding for table names and row primary keys in URLs, https://github.com/simonw/datasette/pull/1648#issuecomment-1060016221,https://api.github.com/repos/simonw/datasette/issues/1648,1060016221,IC_kwDOBm6k_c4_LpBd,9599,simonw,2022-03-06T18:37:59Z,2022-03-06T18:37:59Z,OWNER,"Change of plan: based on extensive conversations on Twitter - see https://github.com/simonw/datasette/issues/1439#issuecomment-1059851259 - I'm going to try a variant of this which is basically percent-encoding but with a hyphen instead of a percent symbol. Reason being that the current scheme doesn't handle the case of `%` being part of the table name, which could cause weird breakage due to some proxies decoding percent encoding before it gets to Datasette.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1160432941,Use dash encoding for table names and row primary keys in URLs, https://github.com/simonw/datasette/issues/1439#issuecomment-1059903309,https://api.github.com/repos/simonw/datasette/issues/1439,1059903309,IC_kwDOBm6k_c4_LNdN,9599,simonw,2022-03-06T06:17:51Z,2022-03-06T06:17:51Z,OWNER,"Suggestion from a conversation with Seth Michael Larson: it would be neat if plugins could easily integrate with whatever scheme this ends up using, maybe with the `/db/table/-/plugin-name` standardized pattern or similar. 
Making it easy for plugins to do the right, consistent thing is a good idea.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",973139047,Rethink how .ext formats (v.s. ?_format=) works before 1.0, https://github.com/simonw/datasette/pull/1589#issuecomment-1059875687,https://api.github.com/repos/simonw/datasette/issues/1589,1059875687,IC_kwDOBm6k_c4_LGtn,9599,simonw,2022-03-06T01:58:25Z,2022-03-06T01:58:25Z,OWNER,Thanks for catching this.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1098275181,Typo in docs about default redirect status code, https://github.com/simonw/datasette/issues/1439#issuecomment-1059864154,https://api.github.com/repos/simonw/datasette/issues/1439,1059864154,IC_kwDOBm6k_c4_LD5a,9599,simonw,2022-03-06T00:59:04Z,2022-03-06T00:59:04Z,OWNER,"Needs more testing, but this seems to work for decoding the percent-escaped-with-dashes format: `urllib.parse.unquote(s.replace('-', '%'))`","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",973139047,Rethink how .ext formats (v.s. ?_format=) works before 1.0, https://github.com/simonw/datasette/issues/1439#issuecomment-1059855418,https://api.github.com/repos/simonw/datasette/issues/1439,1059855418,IC_kwDOBm6k_c4_LBw6,9599,simonw,2022-03-06T00:00:53Z,2022-03-06T00:04:18Z,OWNER,"```python _ESCAPE_SAFE = frozenset( b'ABCDEFGHIJKLMNOPQRSTUVWXYZ' b'abcdefghijklmnopqrstuvwxyz' b'0123456789_' ) # I removed b'.-~') class Quoter(dict): # Keeps a cache internally, via __missing__ def __missing__(self, b): # Handle a cache miss. Store quoted string in cache and return. 
res = chr(b) if b in _ESCAPE_SAFE else '-{:02X}'.format(b) self[b] = res return res quoter = Quoter().__getitem__ ''.join([quoter(char) for char in b'foo/bar.csv']) # 'foo-2Fbar-2Ecsv' ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",973139047,Rethink how .ext formats (v.s. ?_format=) works before 1.0, https://github.com/simonw/datasette/issues/1439#issuecomment-1059854864,https://api.github.com/repos/simonw/datasette/issues/1439,1059854864,IC_kwDOBm6k_c4_LBoQ,9599,simonw,2022-03-05T23:59:05Z,2022-03-05T23:59:05Z,OWNER,"OK, for that percentage thing: the Python core implementation of URL percentage escaping deliberately ignores two of the characters we want to escape: `.` and `-`: https://github.com/python/cpython/blob/6927632492cbad86a250aa006c1847e03b03e70b/Lib/urllib/parse.py#L780-L783 ```python _ALWAYS_SAFE = frozenset(b'ABCDEFGHIJKLMNOPQRSTUVWXYZ' b'abcdefghijklmnopqrstuvwxyz' b'0123456789' b'_.-~') ``` It also defaults to skipping `/` (passed as a `safe=` parameter to various things). I'm going to try borrowing and modifying the core of the Python implementation: https://github.com/python/cpython/blob/6927632492cbad86a250aa006c1847e03b03e70b/Lib/urllib/parse.py#L795-L814 ```python class _Quoter(dict): """"""A mapping from bytes numbers (in range(0,256)) to strings. String values are percent-encoded byte values, unless the key < 128, and in either of the specified safe set, or the always safe set. """""" # Keeps a cache internally, via __missing__, for efficiency (lookups # of cached keys don't call Python code at all). def __init__(self, safe): """"""safe: bytes object."""""" self.safe = _ALWAYS_SAFE.union(safe) def __repr__(self): return f"""" def __missing__(self, b): # Handle a cache miss. Store quoted string in cache and return. 
res = chr(b) if b in self.safe else '%{:02X}'.format(b) self[b] = res return res ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",973139047,Rethink how .ext formats (v.s. ?_format=) works before 1.0, https://github.com/simonw/datasette/issues/1439#issuecomment-1059853526,https://api.github.com/repos/simonw/datasette/issues/1439,1059853526,IC_kwDOBm6k_c4_LBTW,9599,simonw,2022-03-05T23:49:59Z,2022-03-05T23:49:59Z,OWNER,"I want to try regular percentage encoding, except that it also encodes both the `-` and the `.` characters, AND it uses `-` instead of `%` as the encoding character. Should check what it does with emoji too.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",973139047,Rethink how .ext formats (v.s. ?_format=) works before 1.0, https://github.com/simonw/datasette/issues/1439#issuecomment-1059851259,https://api.github.com/repos/simonw/datasette/issues/1439,1059851259,IC_kwDOBm6k_c4_LAv7,9599,simonw,2022-03-05T23:35:47Z,2022-03-05T23:35:59Z,OWNER,"This [comment from glyph](https://twitter.com/glyph/status/1500244937312329730) got me thinking: > Have you considered replacing % with some other character and then using percent-encoding? What happens if a table name includes a `%` character and that ends up getting mangled by a misbehaving proxy? I should consider `%` in the escaping system too. And maybe go with that suggestion of using percent-encoding directly but with a different character.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",973139047,Rethink how .ext formats (v.s. 
?_format=) works before 1.0, https://github.com/simonw/datasette/issues/1439#issuecomment-1059850369,https://api.github.com/repos/simonw/datasette/issues/1439,1059850369,IC_kwDOBm6k_c4_LAiB,9599,simonw,2022-03-05T23:28:56Z,2022-03-05T23:28:56Z,OWNER,"Lots of great conversations about the dash encoding implementation on Twitter: https://twitter.com/simonw/status/1500228316309061633 @dracos helped me figure out a simpler regex: https://twitter.com/dracos/status/1500236433809973248 `^/(?P[^/]+)/(?P
[^\/\-\.]*|\-/|\-\.|\-\-)*(?P\.\w+)?$` ![image](https://user-images.githubusercontent.com/9599/156903088-c01933ae-4713-4e91-8d71-affebf70b945.png) ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",973139047,Rethink how .ext formats (v.s. ?_format=) works before 1.0, https://github.com/simonw/datasette/issues/1439#issuecomment-1059836599,https://api.github.com/repos/simonw/datasette/issues/1439,1059836599,IC_kwDOBm6k_c4_K9K3,9599,simonw,2022-03-05T21:52:10Z,2022-03-05T21:52:10Z,OWNER,Blogged about this here: https://simonwillison.net/2022/Mar/5/dash-encoding/,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",973139047,Rethink how .ext formats (v.s. ?_format=) works before 1.0, https://github.com/simonw/datasette/issues/1439#issuecomment-1045069481,https://api.github.com/repos/simonw/datasette/issues/1439,1045069481,IC_kwDOBm6k_c4-Sn6p,9599,simonw,2022-02-18T19:34:41Z,2022-03-05T21:32:22Z,OWNER,"I think I got format extraction working! https://regex101.com/r/A0bW1D/1 ^/(?P[^/]+)/(?P
(?:[^\/\-\.]*|(?:\-/)*|(?:\-\.)*|(?:\-\-)*)*?)(?:(?\w+))?$ I had to make that crazy inner one even more complicated to stop it from capturing `.` that was not part of `-.`. (?:[^\/\-\.]*|(?:\-/)*|(?:\-\.)*|(?:\-\-)*)* Visualized: So now I have a regex which can extract out the dot-encoded table name AND spot if there is an optional `.format` at the end: If I end up using this in Datasette it's going to need VERY comprehensive unit tests and inline documentation.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",973139047,Rethink how .ext formats (v.s. ?_format=) works before 1.0, https://github.com/simonw/datasette/issues/1647#issuecomment-1059823119,https://api.github.com/repos/simonw/datasette/issues/1647,1059823119,IC_kwDOBm6k_c4_K54P,9599,simonw,2022-03-05T19:56:27Z,2022-03-05T19:56:27Z,OWNER,Updated this TIL with extra patterns I figured out: https://til.simonwillison.net/sqlite/ld-preload,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1160407071,Test failures with SQLite 3.37.0+ due to column affinity case, https://github.com/simonw/datasette/issues/1439#issuecomment-1059822391,https://api.github.com/repos/simonw/datasette/issues/1439,1059822391,IC_kwDOBm6k_c4_K5s3,9599,simonw,2022-03-05T19:50:12Z,2022-03-05T19:50:12Z,OWNER,I'm going to move this work to a PR.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",973139047,Rethink how .ext formats (v.s. 
?_format=) works before 1.0, https://github.com/simonw/datasette/issues/1439#issuecomment-1059822151,https://api.github.com/repos/simonw/datasette/issues/1439,1059822151,IC_kwDOBm6k_c4_K5pH,9599,simonw,2022-03-05T19:48:35Z,2022-03-05T19:48:35Z,OWNER,Those new docs: https://github.com/simonw/datasette/blob/d1cb73180b4b5a07538380db76298618a5fc46b6/docs/internals.rst#dash-encoding,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",973139047,Rethink how .ext formats (v.s. ?_format=) works before 1.0, https://github.com/simonw/datasette/issues/1647#issuecomment-1059821674,https://api.github.com/repos/simonw/datasette/issues/1647,1059821674,IC_kwDOBm6k_c4_K5hq,9599,simonw,2022-03-05T19:44:32Z,2022-03-05T19:44:32Z,OWNER,"I thought I'd need to introduce https://dirty-equals.helpmanual.io/types/string/ to help write tests for this, but I think I've found a good alternative that doesn't need a new dependency.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1160407071,Test failures with SQLite 3.37.0+ due to column affinity case, https://github.com/simonw/datasette/issues/1647#issuecomment-1059819628,https://api.github.com/repos/simonw/datasette/issues/1647,1059819628,IC_kwDOBm6k_c4_K5Bs,9599,simonw,2022-03-05T19:28:54Z,2022-03-05T19:28:54Z,OWNER,"OK, using that trick worked for testing this: docker run -it -p 8001:8001 ubuntu Then inside that container: apt-get install -y python3 build-essential tcl wget python3-pip git python3.8-venv For each version of SQLite I wanted to test I needed to figure out the tarball URL - for example, for `3.38.0` I navigated to https://www.sqlite.org/src/timeline?t=version-3.38.0 and clicked the ""checkin"" link and copied the tarball link: https://www.sqlite.org/src/tarball/40fa792d/SQLite-40fa792d.tar.gz Then to build it (the `CPPFLAGS` took some trial and error): ``` cd /tmp 
wget https://www.sqlite.org/src/tarball/40fa792d/SQLite-40fa792d.tar.gz tar -xzvf SQLite-40fa792d.tar.gz cd SQLite-40fa792d CPPFLAGS=""-DSQLITE_ENABLE_FTS3 -DSQLITE_ENABLE_FTS3_PARENTHESIS -DSQLITE_ENABLE_RTREE=1"" ./configure make ``` Then to test with Datasette: ``` cd /tmp git clone https://github.com/simonw/datasette cd datasette python3 -m venv venv source venv/bin/activate pip install wheel # So bdist_wheel works in next step pip install -e '.[test]' LD_PRELOAD=/tmp/SQLite-40fa792d/.libs/libsqlite3.so pytest ``` After some trial and error I proved that those tests passed with 3.36.0: ``` cd /tmp wget https://www.sqlite.org/src/tarball/5c9a6c06/SQLite-5c9a6c06.tar.gz tar -xzvf SQLite-5c9a6c06.tar.gz cd SQLite-5c9a6c06 CPPFLAGS=""-DSQLITE_ENABLE_FTS3 -DSQLITE_ENABLE_FTS3_PARENTHESIS -DSQLITE_ENABLE_RTREE=1"" ./configure make cd /tmp/datasette LD_PRELOAD=/tmp/SQLite-5c9a6c06/.libs/libsqlite3.so pytest tests/test_internals_database.py ``` BUT failed with 3.37.0: ``` # 3.37.0 cd /tmp wget https://www.sqlite.org/src/tarball/bd41822c/SQLite-bd41822c.tar.gz tar -xzvf SQLite-bd41822c.tar.gz cd SQLite-bd41822c CPPFLAGS=""-DSQLITE_ENABLE_FTS3 -DSQLITE_ENABLE_FTS3_PARENTHESIS -DSQLITE_ENABLE_RTREE=1"" ./configure make cd /tmp/datasette LD_PRELOAD=/tmp/SQLite-bd41822c/.libs/libsqlite3.so pytest tests/test_internals_database.py ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1160407071,Test failures with SQLite 3.37.0+ due to column affinity case, https://github.com/simonw/datasette/issues/1647#issuecomment-1059807598,https://api.github.com/repos/simonw/datasette/issues/1647,1059807598,IC_kwDOBm6k_c4_K2Fu,9599,simonw,2022-03-05T18:06:56Z,2022-03-05T18:08:00Z,OWNER,"Had a look through the commits in https://github.com/sqlite/sqlite/compare/version-3.37.2...version-3.38.0 but couldn't see anything obvious that might have caused this. 
Really wish I had a good mechanism for running the test suite against different SQLite versions! May have to revisit this old trick: https://til.simonwillison.net/sqlite/ld-preload","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1160407071,Test failures with SQLite 3.37.0+ due to column affinity case, https://github.com/simonw/datasette/issues/1647#issuecomment-1059804577,https://api.github.com/repos/simonw/datasette/issues/1647,1059804577,IC_kwDOBm6k_c4_K1Wh,9599,simonw,2022-03-05T17:49:46Z,2022-03-05T17:49:46Z,OWNER,My best guess is that this is an undocumented change in SQLite 3.38 - I get that test failure with that SQLite version.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1160407071,Test failures with SQLite 3.37.0+ due to column affinity case, https://github.com/simonw/datasette/issues/1439#issuecomment-1059802318,https://api.github.com/repos/simonw/datasette/issues/1439,1059802318,IC_kwDOBm6k_c4_K0zO,9599,simonw,2022-03-05T17:34:33Z,2022-03-05T17:34:33Z,OWNER,"Wrote documentation: ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",973139047,Rethink how .ext formats (v.s. ?_format=) works before 1.0, https://github.com/simonw/sqlite-utils/issues/412#issuecomment-1059652538,https://api.github.com/repos/simonw/sqlite-utils/issues/412,1059652538,IC_kwDOCGYnMM4_KQO6,9599,simonw,2022-03-05T02:13:17Z,2022-03-05T02:13:17Z,OWNER,"> It looks like the existing `pd.read_sql_query()` method has an optional dependency on SQLAlchemy: > > ``` > ... > import pandas as pd > pd.read_sql_query(db.conn, ""select * from articles"") > # ImportError: Using URI string without sqlalchemy installed. 
> ``` Hah, no I was wrong about this: SQLAlchemy is not needed for SQLite to work, I just had the arguments the wrong way round: ```python pd.read_sql_query(""select * from articles"", db.conn) # Shows a DataFrame ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1160182768,Optional Pandas integration, https://github.com/simonw/sqlite-utils/issues/412#issuecomment-1059651306,https://api.github.com/repos/simonw/sqlite-utils/issues/412,1059651306,IC_kwDOCGYnMM4_KP7q,9599,simonw,2022-03-05T02:10:49Z,2022-03-05T02:10:49Z,OWNER,"I could teach `.insert_all()` and `.upsert_all()` to optionally accept a DataFrame. A challenge there is `mypy` - if Pandas is an optional dependency, is it possible to declare types that accept a Union that includes DataFrame?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1160182768,Optional Pandas integration, https://github.com/simonw/sqlite-utils/issues/412#issuecomment-1059651056,https://api.github.com/repos/simonw/sqlite-utils/issues/412,1059651056,IC_kwDOCGYnMM4_KP3w,9599,simonw,2022-03-05T02:09:38Z,2022-03-05T02:09:38Z,OWNER,"OK, so reading results from existing `sqlite-utils` into a Pandas DataFrame turns out to be trivial. How about writing a DataFrame to a database table? 
That feels like it could be a lot more useful.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1160182768,Optional Pandas integration, https://github.com/simonw/sqlite-utils/issues/412#issuecomment-1059650190,https://api.github.com/repos/simonw/sqlite-utils/issues/412,1059650190,IC_kwDOCGYnMM4_KPqO,9599,simonw,2022-03-05T02:04:43Z,2022-03-05T02:04:54Z,OWNER,"To be honest, I'm having second thoughts about this now mainly because the idiom for turning a generator of dicts into a DataFrame is SO simple: ```python df = pd.DataFrame(db.query(""select * from articles"")) ``` Given it's that simple, I'm questioning if there's any value to adding this to `sqlite-utils` at all. This likely becomes a documentation thing instead!","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1160182768,Optional Pandas integration, https://github.com/simonw/sqlite-utils/issues/412#issuecomment-1059649803,https://api.github.com/repos/simonw/sqlite-utils/issues/412,1059649803,IC_kwDOCGYnMM4_KPkL,9599,simonw,2022-03-05T02:02:41Z,2022-03-05T02:02:41Z,OWNER,"It looks like the existing `pd.read_sql_query()` method has an optional dependency on SQLAlchemy: ``` ... import pandas as pd pd.read_sql_query(db.conn, ""select * from articles"") # ImportError: Using URI string without sqlalchemy installed. 
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1160182768,Optional Pandas integration, https://github.com/simonw/sqlite-utils/issues/412#issuecomment-1059649213,https://api.github.com/repos/simonw/sqlite-utils/issues/412,1059649213,IC_kwDOCGYnMM4_KPa9,9599,simonw,2022-03-05T02:00:10Z,2022-03-05T02:00:10Z,OWNER,Requested feedback on Twitter here :https://twitter.com/simonw/status/1499927075930578948,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1160182768,Optional Pandas integration, https://github.com/simonw/sqlite-utils/issues/412#issuecomment-1059649193,https://api.github.com/repos/simonw/sqlite-utils/issues/412,1059649193,IC_kwDOCGYnMM4_KPap,9599,simonw,2022-03-05T02:00:02Z,2022-03-05T02:00:02Z,OWNER,"Yeah, I imagine there are plenty of ways to do this with Pandas already - I'm opportunistically looking for a way to provide better integration with the rest of the Pandas situation from the work I've done in `sqlite-utils` already. Might be that this isn't worth doing at all.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1160182768,Optional Pandas integration, https://github.com/simonw/sqlite-utils/issues/412#issuecomment-1059646645,https://api.github.com/repos/simonw/sqlite-utils/issues/412,1059646645,IC_kwDOCGYnMM4_KOy1,9599,simonw,2022-03-05T01:53:10Z,2022-03-05T01:53:10Z,OWNER,I'm not an experienced enough Pandas user to know if this design is right or not. 
I'm going to leave this open for a while and solicit some feedback.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1160182768,Optional Pandas integration, https://github.com/simonw/sqlite-utils/issues/412#issuecomment-1059646543,https://api.github.com/repos/simonw/sqlite-utils/issues/412,1059646543,IC_kwDOCGYnMM4_KOxP,9599,simonw,2022-03-05T01:52:47Z,2022-03-05T01:52:47Z,OWNER,"I built a prototype of that second option and it looks pretty good: Here's the `pandas.py` prototype: ```python from .db import Database as _Database, Table as _Table, View as _View import pandas as pd from typing import ( Iterable, Union, Optional, ) class Database(_Database): def query( self, sql: str, params: Optional[Union[Iterable, dict]] = None ) -> pd.DataFrame: return pd.DataFrame(super().query(sql, params)) def table(self, table_name: str, **kwargs) -> Union[""Table"", ""View""]: ""Return a table object, optionally configured with default options."" klass = View if table_name in self.view_names() else Table return klass(self, table_name, **kwargs) class PandasQueryable: def rows_where( self, where: str = None, where_args: Optional[Union[Iterable, dict]] = None, order_by: str = None, select: str = ""*"", limit: int = None, offset: int = None, ) -> pd.DataFrame: return pd.DataFrame( super().rows_where( where, where_args, order_by=order_by, select=select, limit=limit, offset=offset, ) ) class Table(PandasQueryable, _Table): pass class View(PandasQueryable, _View): pass ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1160182768,Optional Pandas integration, https://github.com/simonw/sqlite-utils/issues/412#issuecomment-1059646247,https://api.github.com/repos/simonw/sqlite-utils/issues/412,1059646247,IC_kwDOCGYnMM4_KOsn,9599,simonw,2022-03-05T01:51:03Z,2022-03-05T01:51:03Z,OWNER,"I considered two ways of doing this. 
First, have methods such as `db.query_df()` and `table.rows_df` which do the same as `.query()` and `table.rows` but return a DataFrame instead of a generator of dictionaries. Second, have a compatibility class that is imported separately such as: ```python from sqlite_utils.pandas import Database ``` Then have the `.query()` and `.rows` and other similar methods return dataframes.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1160182768,Optional Pandas integration, https://github.com/simonw/datasette/issues/1640#issuecomment-1059638778,https://api.github.com/repos/simonw/datasette/issues/1640,1059638778,IC_kwDOBm6k_c4_KM36,9599,simonw,2022-03-05T01:19:00Z,2022-03-05T01:19:00Z,OWNER,"The reason I implemented it like this was to support things like the `curl` progress bar if users decide to serve up large files using the `--static` mechanism. Here's the code that hooks it up to the URL resolver: https://github.com/simonw/datasette/blob/458f03ad3a454d271f47a643f4530bd8b60ddb76/datasette/app.py#L1001-L1005 Which uses this function: https://github.com/simonw/datasette/blob/a6ff123de5464806441f6a6f95145c9a83b7f20b/datasette/utils/asgi.py#L285-L310 One option here would be to support a workaround that looks something like this: http://localhost:8001/my-static/log.txt?_unknown_size=1 The URL routing code could then look out for that `?_unknown_size=1` option and, if it's present, omit the `content-length` header entirely. It's a bit of a kludge, but it would be pretty straightforward to implement. Would that work for you @broccolihighkicks?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1148725876,"Support static assets where file length may change, e.g. 
logs", https://github.com/simonw/datasette/issues/1640#issuecomment-1059636420,https://api.github.com/repos/simonw/datasette/issues/1640,1059636420,IC_kwDOBm6k_c4_KMTE,9599,simonw,2022-03-05T01:13:26Z,2022-03-05T01:13:26Z,OWNER,"Hah, this is certainly unexpected. It looks like this is the code in question: https://github.com/simonw/datasette/blob/a6ff123de5464806441f6a6f95145c9a83b7f20b/datasette/utils/asgi.py#L259-L266 You're right: it assumes that the file it is serving won't change length while it is serving it.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1148725876,"Support static assets where file length may change, e.g. logs", https://github.com/simonw/datasette/issues/1642#issuecomment-1059635969,https://api.github.com/repos/simonw/datasette/issues/1642,1059635969,IC_kwDOBm6k_c4_KMMB,9599,simonw,2022-03-05T01:11:17Z,2022-03-05T01:11:17Z,OWNER,"`pip install datasette` in a fresh virtual environment doesn't show any warnings. Neither does `pip install -e '.'` in a fresh checkout. Or `pip install -e '.[test]'`. 
Closing this as can't reproduce.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1152072027,Dependency issue with asgiref and uvicorn, https://github.com/simonw/datasette/issues/1645#issuecomment-1059634688,https://api.github.com/repos/simonw/datasette/issues/1645,1059634688,IC_kwDOBm6k_c4_KL4A,9599,simonw,2022-03-05T01:06:08Z,2022-03-05T01:06:08Z,OWNER,"It sounds like you can work around this with Varnish configuration for the moment, but I'm going to bump this up the list of things to fix - it's particularly relevant now as I'd like to get a solution in place before Datasette 1.0, since it's likely to be beneficial to plugins and hence should be part of the stable, documented plugin interface.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1154399841,"Sensible `cache-control` headers for static assets, including those served by plugins", https://github.com/simonw/datasette/issues/1645#issuecomment-1059634412,https://api.github.com/repos/simonw/datasette/issues/1645,1059634412,IC_kwDOBm6k_c4_KLzs,9599,simonw,2022-03-05T01:04:53Z,2022-03-05T01:04:53Z,OWNER,"The existing `app_css_hash` already isn't good enough, because I built that before `table.js` existed, and that file should obviously be smartly cached too.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1154399841,"Sensible `cache-control` headers for static assets, including those served by plugins", https://github.com/simonw/datasette/issues/1645#issuecomment-1059633902,https://api.github.com/repos/simonw/datasette/issues/1645,1059633902,IC_kwDOBm6k_c4_KLru,9599,simonw,2022-03-05T01:03:06Z,2022-03-05T01:03:06Z,OWNER,"I agree: this is bad. 
Ideally, content served from `/static/` would apply best practices for static content serving - which to my mind means the following: - Where possible, serve with a far-future cache expiry header and use an asset URL that changes when the file itself changes - For assets without that, support conditional GET to avoid transferring the whole asset if it hasn't changed - Some kind of sensible mechanism for setting cache TTLs on assets that don't have a unique-file-per-version - in particular assets that might be served from plugins. Datasette half-implemented the first of these: if you view source on https://latest.datasette.io/ you'll see it links to `/-/static/app.css?cead5a` - which in the template looks like this: https://github.com/simonw/datasette/blob/dd94157f8958bdfe9f45575add934ccf1aba6d63/datasette/templates/base.html#L5 I had forgotten I had implemented this! Here is how it is calculated: https://github.com/simonw/datasette/blob/458f03ad3a454d271f47a643f4530bd8b60ddb76/datasette/app.py#L510-L516 So `app.css` right now could be safely served with a far-future cache header... only it isn't: ``` ~ % curl -i 'https://latest.datasette.io/-/static/app.css?cead5a' HTTP/2 200 content-type: text/css x-databases: _memory, _internal, fixtures, extra_database x-cloud-trace-context: 9ddc825620eb53d30fc127d1c750f342 date: Sat, 05 Mar 2022 01:01:53 GMT server: Google Frontend content-length: 16178 ``` The larger question though is what to do about other assets. 
I'm particularly interested in plugin assets, since visualization plugins like `datasette-vega` and `datasette-cluster-map` ship with large amounts of JavaScript and I'd really like that to be sensibly cached by default.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1154399841,"Sensible `cache-control` headers for static assets, including those served by plugins", https://github.com/simonw/sqlite-utils/issues/408#issuecomment-1056001414,https://api.github.com/repos/simonw/sqlite-utils/issues/408,1056001414,IC_kwDOCGYnMM4-8U2G,9599,simonw,2022-03-02T00:20:26Z,2022-03-02T00:20:26Z,OWNER,I need a `db.sqlite_version` property to implement this check.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1145882578,`deterministic=True` fails on versions of SQLite prior to 3.8.3, https://github.com/simonw/sqlite-utils/issues/408#issuecomment-1055996626,https://api.github.com/repos/simonw/sqlite-utils/issues/408,1055996626,IC_kwDOCGYnMM4-8TrS,9599,simonw,2022-03-02T00:12:21Z,2022-03-02T00:12:21Z,OWNER,Here's the SQLite changelog mentioning that it was added in 3.8.3: https://www.sqlite.org/changes.html#version_3_8_3,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1145882578,`deterministic=True` fails on versions of SQLite prior to 3.8.3, https://github.com/simonw/sqlite-utils/issues/408#issuecomment-1055995100,https://api.github.com/repos/simonw/sqlite-utils/issues/408,1055995100,IC_kwDOCGYnMM4-8TTc,9599,simonw,2022-03-02T00:10:41Z,2022-03-02T00:10:41Z,OWNER,"Here's the code in question: https://github.com/simonw/sqlite-utils/blob/521921b849003ed3742338f76f9d47ff3d95eaf3/sqlite_utils/db.py#L384-L394 It's checking for Python 3.8, because that's the version of Python that added the `deterministic=True` option: 
https://docs.python.org/3/library/sqlite3.html#sqlite3.Connection.create_function But from your error message it looks like it should be checking the SQLite version too.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1145882578,`deterministic=True` fails on versions of SQLite prior to 3.8.3, https://github.com/simonw/sqlite-utils/issues/408#issuecomment-1055993700,https://api.github.com/repos/simonw/sqlite-utils/issues/408,1055993700,IC_kwDOCGYnMM4-8S9k,9599,simonw,2022-03-02T00:08:10Z,2022-03-02T00:08:10Z,OWNER,"I thought I'd made it so `deterministic=True` would be silently ignored in environments that don't support it, but clearly I missed a case here!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1145882578,`deterministic=True` fails on versions of SQLite prior to 3.8.3, https://github.com/simonw/sqlite-utils/issues/343#issuecomment-1055992544,https://api.github.com/repos/simonw/sqlite-utils/issues/343,1055992544,IC_kwDOCGYnMM4-8Srg,9599,simonw,2022-03-02T00:06:10Z,2022-03-02T00:06:10Z,OWNER,"Updated documentation: https://sqlite-utils.datasette.io/en/latest/python-api.html#setting-an-id-based-on-the-hash-of-the-row-contents Documentation for the renamed `utils.hash_record()` function: https://sqlite-utils.datasette.io/en/latest/reference.html#sqlite-utils-utils-hash-record","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1063388037,Provide function to generate hash_id from specified columns, https://github.com/simonw/sqlite-utils/issues/343#issuecomment-1055991226,https://api.github.com/repos/simonw/sqlite-utils/issues/343,1055991226,IC_kwDOCGYnMM4-8SW6,9599,simonw,2022-03-02T00:03:47Z,2022-03-02T00:03:47Z,OWNER,"Oops, broke mypy: ``` sqlite_utils/db.py:2600: error: Incompatible default for argument 
""hash_id_columns"" (default has type ""Default"", argument has type ""Optional[Iterable[str]]"") Found 1 error in 1 file (checked 49 source files) ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1063388037,Provide function to generate hash_id from specified columns, https://github.com/simonw/sqlite-utils/issues/343#issuecomment-1055855845,https://api.github.com/repos/simonw/sqlite-utils/issues/343,1055855845,IC_kwDOCGYnMM4-7xTl,9599,simonw,2022-03-01T21:04:45Z,2022-03-01T22:43:38Z,OWNER,"I'm going to make that `_hash()` utility function a documented, non-underscore-prefixed function too - called `hash_record()`.","{""total_count"": 1, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 1, ""eyes"": 0}",1063388037,Provide function to generate hash_id from specified columns, https://github.com/simonw/sqlite-utils/issues/409#issuecomment-1055930639,https://api.github.com/repos/simonw/sqlite-utils/issues/409,1055930639,IC_kwDOCGYnMM4-8DkP,9599,simonw,2022-03-01T22:40:15Z,2022-03-01T22:40:15Z,OWNER,"This test fails and I don't understand why: ```python from sqlite_utils import Database def test_transaction(): db1 = Database(memory_name=""transaction_test"", tracer=print) db2 = Database(memory_name=""transaction_test"", tracer=print) with db1.conn: db1[""t""].insert({""foo"": 1}) assert list(db2[""t""].rows) == [] assert list(db2[""t""].rows) == [{""foo"": 1}] ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1149661489,`with db:` for transactions, https://github.com/simonw/sqlite-utils/pull/410#issuecomment-1055856441,https://api.github.com/repos/simonw/sqlite-utils/issues/410,1055856441,IC_kwDOCGYnMM4-7xc5,9599,simonw,2022-03-01T21:05:21Z,2022-03-01T21:05:21Z,OWNER,Thanks!,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, 
""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1149729902,Correct spelling mistakes (found with codespell), https://github.com/simonw/sqlite-utils/issues/343#issuecomment-1055854884,https://api.github.com/repos/simonw/sqlite-utils/issues/343,1055854884,IC_kwDOCGYnMM4-7xEk,9599,simonw,2022-03-01T21:03:45Z,2022-03-01T21:03:45Z,OWNER,"Just found myself needing this capability myself! Relevant code: https://github.com/simonw/sqlite-utils/blob/8f386a0d300d1b1c76132bb75972b755049fb742/sqlite_utils/db.py#L2297-L2307 https://github.com/simonw/sqlite-utils/blob/8f386a0d300d1b1c76132bb75972b755049fb742/sqlite_utils/db.py#L2996-L3001 So various functions could grow a `hash_id_columns=(""title"", ""date"")` argument which causes just those columns to be included in the hash. Bonus: if you use `hash_id_columns=...` without setting `hash_id=""id""` it could assume that you want the column to be called `id`.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1063388037,Provide function to generate hash_id from specified columns, https://github.com/simonw/datasette/issues/1439#issuecomment-1053973425,https://api.github.com/repos/simonw/datasette/issues/1439,1053973425,IC_kwDOBm6k_c4-0lux,9599,simonw,2022-02-28T07:40:12Z,2022-02-28T07:40:12Z,OWNER,"If I make this change it will break existing links to one of the oldest Datasette demos: http://fivethirtyeight.datasettes.com/fivethirtyeight/avengers%2Favengers A plugin that fixes those by redirecting them on 404 would be neat.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",973139047,Rethink how .ext formats (v.s. 
?_format=) works before 1.0, https://github.com/simonw/datasette/issues/1439#issuecomment-1049126151,https://api.github.com/repos/simonw/datasette/issues/1439,1049126151,IC_kwDOBm6k_c4-iGUH,9599,simonw,2022-02-23T19:17:01Z,2022-02-23T19:17:01Z,OWNER,Actually the relevant code looks to be: https://github.com/simonw/datasette/blob/7d24fd405f3c60e4c852c5d746c91aa2ba23cf5b/datasette/views/base.py#L481-L498,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",973139047,Rethink how .ext formats (v.s. ?_format=) works before 1.0, https://github.com/simonw/datasette/issues/1439#issuecomment-1049124390,https://api.github.com/repos/simonw/datasette/issues/1439,1049124390,IC_kwDOBm6k_c4-iF4m,9599,simonw,2022-02-23T19:15:00Z,2022-02-23T19:15:00Z,OWNER,"I'll start by modifying this function: https://github.com/simonw/datasette/blob/458f03ad3a454d271f47a643f4530bd8b60ddb76/datasette/utils/__init__.py#L732-L749 Later I want to move this to the routing layer to split out `format` automatically, as seen in the regexes here: https://github.com/simonw/datasette/issues/1439#issuecomment-1045069481","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",973139047,Rethink how .ext formats (v.s. ?_format=) works before 1.0, https://github.com/simonw/datasette/issues/1439#issuecomment-1049114724,https://api.github.com/repos/simonw/datasette/issues/1439,1049114724,IC_kwDOBm6k_c4-iDhk,9599,simonw,2022-02-23T19:04:40Z,2022-02-23T19:04:40Z,OWNER,I'm going to try dash encoding for table names (and row IDs) in a branch and see how I like it.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",973139047,Rethink how .ext formats (v.s. 
?_format=) works before 1.0, https://github.com/simonw/datasette/issues/928#issuecomment-672379897,https://api.github.com/repos/simonw/datasette/issues/928,672379897,MDEyOklzc3VlQ29tbWVudDY3MjM3OTg5Nw==,9599,simonw,2020-08-12T00:07:49Z,2022-02-23T16:19:47Z,OWNER,Made this into a TIL: https://til.simonwillison.net/python/call-pip-programatically,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",677272618,Test failures caused by failed attempts to mock pip, https://github.com/simonw/datasette/issues/1439#issuecomment-1045269544,https://api.github.com/repos/simonw/datasette/issues/1439,1045269544,IC_kwDOBm6k_c4-TYwo,9599,simonw,2022-02-18T22:19:29Z,2022-02-18T22:19:29Z,OWNER,"Note that I've ruled out using `Accept: application/json` to return JSON because it turns out Cloudflare and potentially other CDNs ignore the `Vary: Accept` header entirely: - https://github.com/simonw/datasette/issues/1534","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",973139047,Rethink how .ext formats (v.s. ?_format=) works before 1.0, https://github.com/simonw/datasette/issues/1439#issuecomment-1045134050,https://api.github.com/repos/simonw/datasette/issues/1439,1045134050,IC_kwDOBm6k_c4-S3ri,9599,simonw,2022-02-18T20:25:04Z,2022-02-18T20:25:04Z,OWNER,Here's a useful modern spec for how existing URL percentage encoding is supposed to work: https://url.spec.whatwg.org/#percent-encoded-bytes,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",973139047,Rethink how .ext formats (v.s. 
?_format=) works before 1.0, https://github.com/simonw/datasette/issues/1439#issuecomment-1045131086,https://api.github.com/repos/simonw/datasette/issues/1439,1045131086,IC_kwDOBm6k_c4-S29O,9599,simonw,2022-02-18T20:22:13Z,2022-02-18T20:22:47Z,OWNER,"Should it encode `%` symbols too, since they have a special meaning in URLs and we can't guarantee that every single web server / proxy out there will round-trip them safely using percentage encoding? If so, would need to pick a different encoding character for them. Maybe `%` becomes `-p` - and in that case `/` could become `-s` too. Is it worth expanding dash-encoding outside of just `/` and `-` and `.` though? Not sure.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",973139047,Rethink how .ext formats (v.s. ?_format=) works before 1.0, https://github.com/simonw/datasette/issues/1439#issuecomment-1045117304,https://api.github.com/repos/simonw/datasette/issues/1439,1045117304,IC_kwDOBm6k_c4-Szl4,9599,simonw,2022-02-18T20:09:22Z,2022-02-18T20:09:22Z,OWNER,Adopting this could result in supporting database files with surprising characters in their filename too.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",973139047,Rethink how .ext formats (v.s. 
?_format=) works before 1.0, https://github.com/simonw/datasette/issues/1439#issuecomment-1045108611,https://api.github.com/repos/simonw/datasette/issues/1439,1045108611,IC_kwDOBm6k_c4-SxeD,9599,simonw,2022-02-18T20:02:19Z,2022-02-18T20:08:34Z,OWNER,"One other potential variant: ```python def dash_encode(s): return s.replace(""-"", ""-dash-"").replace(""."", ""-dot-"").replace(""/"", ""-slash-"") def dash_decode(s): return s.replace(""-slash-"", ""/"").replace(""-dot-"", ""."").replace(""-dash-"", ""-"") ``` Except this has bugs - it doesn't round-trip safely, because it can get confused about things like `-dash-slash-` in terms of is that a `-dash-` or a `-slash-`? ```pycon >>> dash_encode(""/db/table-.csv.csv"") '-slash-db-slash-table-dash--dot-csv-dot-csv' >>> dash_decode('-slash-db-slash-table-dash--dot-csv-dot-csv') '/db/table-.csv.csv' >>> dash_encode('-slash-db-slash-table-dash--dot-csv-dot-csv') '-dash-slash-dash-db-dash-slash-dash-table-dash-dash-dash--dash-dot-dash-csv-dash-dot-dash-csv' >>> dash_decode('-dash-slash-dash-db-dash-slash-dash-table-dash-dash-dash--dash-dot-dash-csv-dash-dot-dash-csv') '-dash/dash-db-dash/dash-table-dash--dash.dash-csv-dash.dash-csv' ``` ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",973139047,Rethink how .ext formats (v.s. 
?_format=) works before 1.0, https://github.com/simonw/datasette/issues/1439#issuecomment-1045111309,https://api.github.com/repos/simonw/datasette/issues/1439,1045111309,IC_kwDOBm6k_c4-SyIN,9599,simonw,2022-02-18T20:04:24Z,2022-02-18T20:05:40Z,OWNER,"This made me worry that my current `dash_decode()` implementation had unknown round-trip bugs, but thankfully this works OK: ```pycon >>> dash_encode(""/db/table-.csv.csv"") '-/db-/table---.csv-.csv' >>> dash_encode('-/db-/table---.csv-.csv') '---/db---/table-------.csv---.csv' >>> dash_decode('---/db---/table-------.csv---.csv') '-/db-/table---.csv-.csv' >>> dash_decode('-/db-/table---.csv-.csv') '/db/table-.csv.csv' ``` The regex still works against that double-encoded example too: ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",973139047,Rethink how .ext formats (v.s. ?_format=) works before 1.0, https://github.com/simonw/datasette/issues/1439#issuecomment-1045099290,https://api.github.com/repos/simonw/datasette/issues/1439,1045099290,IC_kwDOBm6k_c4-SvMa,9599,simonw,2022-02-18T19:56:18Z,2022-02-18T19:56:30Z,OWNER,"> ```python > def dash_encode(s): > return s.replace(""-"", ""--"").replace(""."", ""-."").replace(""/"", ""-/"") > > def dash_decode(s): > return s.replace(""-/"", ""/"").replace(""-."", ""."").replace(""--"", ""-"") > ``` I think **dash-encoding** (new name for this) is the right way forward here.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",973139047,Rethink how .ext formats (v.s. 
?_format=) works before 1.0, https://github.com/simonw/datasette/issues/1439#issuecomment-1045024276,https://api.github.com/repos/simonw/datasette/issues/1439,1045024276,IC_kwDOBm6k_c4-Sc4U,9599,simonw,2022-02-18T19:01:42Z,2022-02-18T19:55:24Z,OWNER,"> Maybe I should use `-/` to encode forward slashes too, to defend against any ASGI servers that might not implement `raw_path` correctly. ```python def dash_encode(s): return s.replace(""-"", ""--"").replace(""."", ""-."").replace(""/"", ""-/"") def dash_decode(s): return s.replace(""-/"", ""/"").replace(""-."", ""."").replace(""--"", ""-"") ``` ```pycon >>> dash_encode(""foo/bar/baz.csv"") 'foo-/bar-/baz-.csv' >>> dash_decode('foo-/bar-/baz-.csv') 'foo/bar/baz.csv' ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",973139047,Rethink how .ext formats (v.s. ?_format=) works before 1.0, https://github.com/simonw/datasette/issues/1439#issuecomment-1045095348,https://api.github.com/repos/simonw/datasette/issues/1439,1045095348,IC_kwDOBm6k_c4-SuO0,9599,simonw,2022-02-18T19:53:48Z,2022-02-18T19:53:48Z,OWNER,"> Ugh, one disadvantage I just spotted with this: Datasette already has a `/-/versions.json` convention where ""system"" URLs are namespaced under `/-/` - but that could be confused under this new scheme with the `-/` escaping sequence. > > And I've thought about adding `/db/-/special` and `/db/table/-/special` URLs in the past too. I don't think this matters. The new regex does indeed capture that kind of page: But Datasette goes through configured route regular expressions in order - so I can have the regex that captures `/db/-/special` routes listed before the one that captures tables and formats.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",973139047,Rethink how .ext formats (v.s. 
?_format=) works before 1.0, https://github.com/simonw/datasette/issues/1439#issuecomment-1045081042,https://api.github.com/repos/simonw/datasette/issues/1439,1045081042,IC_kwDOBm6k_c4-SqvS,9599,simonw,2022-02-18T19:44:12Z,2022-02-18T19:51:34Z,OWNER,"```python def dot_encode(s): return s.replace(""."", "".."").replace(""/"", ""./"") def dot_decode(s): return s.replace(""./"", ""/"").replace("".."", ""."") ``` No need for hyphen encoding in this variant at all, which simplifies things a bit. (Update: this is flawed, see https://github.com/simonw/datasette/issues/1439#issuecomment-1045086033)","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",973139047,Rethink how .ext formats (v.s. ?_format=) works before 1.0, https://github.com/simonw/datasette/issues/1439#issuecomment-1045086033,https://api.github.com/repos/simonw/datasette/issues/1439,1045086033,IC_kwDOBm6k_c4-Sr9R,9599,simonw,2022-02-18T19:47:43Z,2022-02-18T19:51:11Z,OWNER,"- https://datasette.io/-/asgi-scope/db/./db./table-..csv..csv - https://til.simonwillison.net/-/asgi-scope/db/./db./table-..csv..csv Do both of those survive the round-trip to populate `raw_path` correctly? No! In both cases the `/./` bit goes missing. It looks like this might even be a client issue - `curl` shows me this: ``` ~ % curl -vv -i 'https://datasette.io/-/asgi-scope/db/./db./table-..csv..csv' * Trying 216.239.32.21:443... 
* Connected to datasette.io (216.239.32.21) port 443 (#0) * ALPN, offering http/1.1 * TLS 1.2 connection using TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 * Server certificate: datasette.io * Server certificate: R3 * Server certificate: ISRG Root X1 > GET /-/asgi-scope/db/db./table-..csv..csv HTTP/1.1 ``` So `curl` decided to turn `/-/asgi-scope/db/./db./table` into `/-/asgi-scope/db/db./table` before even sending the request.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",973139047,Rethink how .ext formats (v.s. ?_format=) works before 1.0, https://github.com/simonw/datasette/issues/1439#issuecomment-1045082891,https://api.github.com/repos/simonw/datasette/issues/1439,1045082891,IC_kwDOBm6k_c4-SrML,9599,simonw,2022-02-18T19:45:32Z,2022-02-18T19:45:32Z,OWNER,"```pycon >>> dot_encode(""/db/table-.csv.csv"") './db./table-..csv..csv' >>> dot_decode('./db./table-..csv..csv') '/db/table-.csv.csv' ``` I worry that web servers might treat `./` in a special way though.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",973139047,Rethink how .ext formats (v.s. ?_format=) works before 1.0, https://github.com/simonw/datasette/issues/1439#issuecomment-1045077590,https://api.github.com/repos/simonw/datasette/issues/1439,1045077590,IC_kwDOBm6k_c4-Sp5W,9599,simonw,2022-02-18T19:41:37Z,2022-02-18T19:42:41Z,OWNER,"Ugh, one disadvantage I just spotted with this: Datasette already has a `/-/versions.json` convention where ""system"" URLs are namespaced under `/-/` - but that could be confused under this new scheme with the `-/` escaping sequence. And I've thought about adding `/db/-/special` and `/db/table/-/special` URLs in the past too. 
Maybe change this system to use `.` as the escaping character instead of `-`?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",973139047,Rethink how .ext formats (v.s. ?_format=) works before 1.0, https://github.com/simonw/datasette/issues/1439#issuecomment-1045075207,https://api.github.com/repos/simonw/datasette/issues/1439,1045075207,IC_kwDOBm6k_c4-SpUH,9599,simonw,2022-02-18T19:39:35Z,2022-02-18T19:40:13Z,OWNER,"> And if for some horrific reason you had a table with the name `/db/table-.csv.csv` (so `/db/` was the first part of the actual table name in SQLite) the URLs would look like this: > > * `/db/%2Fdb%2Ftable---.csv-.csv` - the HTML version > * `/db/%2Fdb%2Ftable---.csv-.csv.csv` - the CSV version > * `/db/%2Fdb%2Ftable---.csv-.csv.json` - the JSON version Here's what those look like with the updated version of `dot_dash_encode()` that also encodes `/` as `-/`: - `/db/-/db-/table---.csv-.csv` - HTML - `/db/-/db-/table---.csv-.csv.csv` - CSV - `/db/-/db-/table---.csv-.csv.json` - JSON ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",973139047,Rethink how .ext formats (v.s. ?_format=) works before 1.0, https://github.com/simonw/datasette/issues/1439#issuecomment-1045059427,https://api.github.com/repos/simonw/datasette/issues/1439,1045059427,IC_kwDOBm6k_c4-Sldj,9599,simonw,2022-02-18T19:26:25Z,2022-02-18T19:26:25Z,OWNER,"With this new pattern I could probably extract out the optional `.json` format string as part of the initial route capturing regex too, rather than the current `table_and_format` hack.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",973139047,Rethink how .ext formats (v.s. 
?_format=) works before 1.0, https://github.com/simonw/datasette/issues/1439#issuecomment-1045055772,https://api.github.com/repos/simonw/datasette/issues/1439,1045055772,IC_kwDOBm6k_c4-Skkc,9599,simonw,2022-02-18T19:23:33Z,2022-02-18T19:25:42Z,OWNER,"I want a match for this URL: /db/table-/with-/slashes-.csv Maybe this: ^/(?P[^/]+)/(?P([^/]*|(\-/)*|(\-\.)*|(\.\.)*)*$) Here we are matching a sequence of: ([^/]*|(\-/)*|(\-\.)*|(\-\-)*)* So a combination of not-slashes OR -/ or -. Or -- sequences ^/(?P[^/]+)/(?P([^/]*|(\-/)*|(\-\.)*|(\-\-)*)*$) Try that with non-capturing bits: ^/(?P[^/]+)/(?P(?:[^/]*|(?:\-/)*|(?:\-\.)*|(?:\-\-)*)*$) `(?:[^/]*|(?:\-/)*|(?:\-\.)*|(?:\-\-)*)*` visualized is: Here's the explanation on regex101.com https://regex101.com/r/CPnsIO/1 ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",973139047,Rethink how .ext formats (v.s. ?_format=) works before 1.0, https://github.com/simonw/datasette/issues/1439#issuecomment-1045032377,https://api.github.com/repos/simonw/datasette/issues/1439,1045032377,IC_kwDOBm6k_c4-Se25,9599,simonw,2022-02-18T19:06:50Z,2022-02-18T19:06:50Z,OWNER,"How does URL routing for https://latest.datasette.io/fixtures/table%2Fwith%2Fslashes.csv work? Right now it's https://github.com/simonw/datasette/blob/7d24fd405f3c60e4c852c5d746c91aa2ba23cf5b/datasette/app.py#L1098-L1101 That's not going to capture the dot-dash encoding version of that table name: ```pycon >>> dot_dash_encode(""table/with/slashes.csv"") 'table-/with-/slashes-.csv' ``` Probably needs a fancy regex trick like a negative lookbehind assertion or similar.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",973139047,Rethink how .ext formats (v.s. 
?_format=) works before 1.0, https://github.com/simonw/datasette/issues/1439#issuecomment-1045027067,https://api.github.com/repos/simonw/datasette/issues/1439,1045027067,IC_kwDOBm6k_c4-Sdj7,9599,simonw,2022-02-18T19:03:26Z,2022-02-18T19:03:26Z,OWNER,"(If I make this change it may break some existing Datasette installations when they upgrade - I could try and build a plugin for them which triggers on 404s and checks to see if the old format would return a 200 response, then returns that.)","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",973139047,Rethink how .ext formats (v.s. ?_format=) works before 1.0, https://github.com/simonw/sqlite-utils/issues/406#issuecomment-1040978032,https://api.github.com/repos/simonw/sqlite-utils/issues/406,1040978032,IC_kwDOCGYnMM4-DBBw,9599,simonw,2022-02-16T01:10:31Z,2022-02-16T01:10:31Z,OWNER,"Allowing custom strings in the `create()` method, as you suggest in your example, feels like a reasonable way to support this. ```python db[""dummy""].create({ ""title"": str, ""vector"": ""array"", }) ``` I'm slightly nervous about that just because people might accidentally use this without realizing what they are doing - passing `""column-name"": ""string""` for example when they should have used `""column-name"": str` in order to get a `TEXT` column. Alternatively, this could work: ```python db[""dummy""].create({ ""title"": str, ""vector"": CustomColumnType(""array"") }) ``` This would play better with `mypy` too I think.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1128466114,Creating tables with custom datatypes, https://github.com/simonw/sqlite-utils/issues/406#issuecomment-1040974519,https://api.github.com/repos/simonw/sqlite-utils/issues/406,1040974519,IC_kwDOCGYnMM4-DAK3,9599,simonw,2022-02-16T01:08:17Z,2022-02-16T01:08:17Z,OWNER,"I had no idea this was possible! 
I guess SQLite will allow any text string as the column type, defaulting to `TEXT` as the underlying default representation if it doesn't recognize the type.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1128466114,Creating tables with custom datatypes, https://github.com/simonw/sqlite-utils/issues/398#issuecomment-1040965672,https://api.github.com/repos/simonw/sqlite-utils/issues/398,1040965672,IC_kwDOCGYnMM4-C-Ao,9599,simonw,2022-02-16T01:02:29Z,2022-02-16T01:02:29Z,OWNER,"Documentation: - https://sqlite-utils.datasette.io/en/latest/cli-reference.html#create-database - https://sqlite-utils.datasette.io/en/latest/cli-reference.html#add-geometry-column - https://sqlite-utils.datasette.io/en/latest/cli-reference.html#create-spatial-index - https://sqlite-utils.datasette.io/en/latest/cli.html#spatialite-helpers","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1124237013,Add SpatiaLite helpers to CLI, https://github.com/simonw/sqlite-utils/pull/407#issuecomment-1040959312,https://api.github.com/repos/simonw/sqlite-utils/issues/407,1040959312,IC_kwDOCGYnMM4-C8dQ,9599,simonw,2022-02-16T00:58:32Z,2022-02-16T00:58:32Z,OWNER,This is honestly one of the most complete PRs I've ever seen for a feature of this size. 
Thanks so much for this!,"{""total_count"": 1, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 1, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1138948786,Add SpatiaLite helpers to CLI, https://github.com/simonw/sqlite-utils/pull/407#issuecomment-1040598665,https://api.github.com/repos/simonw/sqlite-utils/issues/407,1040598665,IC_kwDOCGYnMM4-BkaJ,9599,simonw,2022-02-15T17:58:11Z,2022-02-15T17:58:11Z,OWNER,"Wow, just found out I can edit files in this PR branch by hitting `.` on my keyboard while looking at the PR, then making changes in the VS Code for web on `github.dev`!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1138948786,Add SpatiaLite helpers to CLI, https://github.com/simonw/sqlite-utils/pull/407#issuecomment-1040596969,https://api.github.com/repos/simonw/sqlite-utils/issues/407,1040596969,IC_kwDOCGYnMM4-Bj_p,9599,simonw,2022-02-15T17:56:22Z,2022-02-15T17:56:35Z,OWNER,"We should add SpatiaLite to the action that calculates code coverage - that way we can calculate coverage across the new GIS tests as well: https://github.com/simonw/sqlite-utils/blob/main/.github/workflows/test-coverage.yml Should just be a case of adding this to that workflow - we can do this in the same PR. 
``` - name: Install SpatiaLite run: sudo apt-get install libsqlite3-mod-spatialite ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1138948786,Add SpatiaLite helpers to CLI, https://github.com/simonw/sqlite-utils/pull/407#issuecomment-1040595572,https://api.github.com/repos/simonw/sqlite-utils/issues/407,1040595572,IC_kwDOCGYnMM4-Bjp0,9599,simonw,2022-02-15T17:54:58Z,2022-02-15T17:54:58Z,OWNER,This PR looks fantastic.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1138948786,Add SpatiaLite helpers to CLI, https://github.com/simonw/datasette/issues/1143#issuecomment-1038289584,https://api.github.com/repos/simonw/datasette/issues/1143,1038289584,IC_kwDOBm6k_c494wqw,9599,simonw,2022-02-13T17:40:50Z,2022-02-13T17:41:17Z,OWNER,"The way Drupal does this is interesting; https://www.drupal.org/node/2715637 - it supports the following YAML: ```yaml # Configure Cross-Site HTTP requests (CORS). # Read https://developer.mozilla.org/en-US/docs/Web/HTTP/Access_control_CORS # for more information about the topic in general. # Note: By default the configuration is disabled. cors.config: enabled: false # Specify allowed headers, like 'x-allowed-header'. allowedHeaders: [] # Specify allowed request methods, specify ['*'] to allow all possible ones. allowedMethods: [] # Configure requests allowed from specific origins. allowedOrigins: ['*'] # Sets the Access-Control-Expose-Headers header. exposedHeaders: false # Sets the Access-Control-Max-Age header. maxAge: false # Sets the Access-Control-Allow-Credentials header. 
supportsCredentials: false ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",764059235,"More flexible CORS support in core, to encourage good security practices", https://github.com/simonw/datasette/issues/1634#issuecomment-1035667060,https://api.github.com/repos/simonw/datasette/issues/1634,1035667060,IC_kwDOBm6k_c49uwZ0,9599,simonw,2022-02-11T00:13:22Z,2022-02-11T00:13:22Z,OWNER,Looks like `3.10.2` is the latest: https://hub.docker.com/_/python?tab=tags&page=1&name=3.10.2-slim-bu,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1131295060,Update Dockerfile generated by `datasette publish`, https://github.com/simonw/datasette/issues/1634#issuecomment-1035664928,https://api.github.com/repos/simonw/datasette/issues/1634,1035664928,IC_kwDOBm6k_c49uv4g,9599,simonw,2022-02-11T00:10:07Z,2022-02-11T00:10:23Z,OWNER,Could also bump this up to Python 3.10: https://github.com/simonw/datasette/blob/5619069968ab39fd44c44a1888965e361c6f7fb9/Dockerfile#L1,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1131295060,Update Dockerfile generated by `datasette publish`, https://github.com/simonw/datasette/issues/1634#issuecomment-1035664412,https://api.github.com/repos/simonw/datasette/issues/1634,1035664412,IC_kwDOBm6k_c49uvwc,9599,simonw,2022-02-11T00:09:18Z,2022-02-11T00:09:18Z,OWNER,Starting it with `FROM datasetteproject/datasette` might be a good idea. 
,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1131295060,Update Dockerfile generated by `datasette publish`, https://github.com/simonw/sqlite-utils/issues/402#issuecomment-1033366312,https://api.github.com/repos/simonw/sqlite-utils/issues/402,1033366312,IC_kwDOCGYnMM49l-so,9599,simonw,2022-02-09T05:28:11Z,2022-02-09T07:28:48Z,OWNER,"My hunch is that the case where you want to consider input from more than one column will actually be pretty rare - the only case I can think of where I would want to do that is for latitude/longitude columns - everything else that I'd want to use it for (which admittedly is still mostly SpatiaLite stuff) works against a single value. The reason I'm leaning towards using the constructor for the values is that I really like the look of this variant for common conversions: ```python db[""places""].insert( { ""name"": ""London"", ""boundary"": GeometryFromGeoJSON({...}) } ) ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1125297737,Advanced class-based `conversions=` mechanism, https://github.com/simonw/sqlite-utils/issues/402#issuecomment-1033428967,https://api.github.com/repos/simonw/sqlite-utils/issues/402,1033428967,IC_kwDOCGYnMM49mN_n,9599,simonw,2022-02-09T07:25:44Z,2022-02-09T07:28:11Z,OWNER,"The CLI version of this could perhaps look like this: sqlite-utils insert a.db places places.json \ --conversion boundary GeometryGeoJSON This will treat the boundary key as GeoJSON. It's equivalent to passing `conversions={""boundary"": geometryGeoJSON}` The combined latitude/longitude case here can be handled by combining this with the existing `--convert` mechanism. 
Any `Conversion` subclass will be available to the CLI in this way.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1125297737,Advanced class-based `conversions=` mechanism, https://github.com/simonw/sqlite-utils/issues/405#issuecomment-1033425512,https://api.github.com/repos/simonw/sqlite-utils/issues/405,1033425512,IC_kwDOCGYnMM49mNJo,9599,simonw,2022-02-09T07:20:11Z,2022-02-09T07:20:11Z,OWNER,"Datasette's implementation: https://github.com/simonw/datasette/blob/458f03ad3a454d271f47a643f4530bd8b60ddb76/datasette/database.py#L73-L79 ```python if self.memory_name: uri = ""file:{}?mode=memory&cache=shared"".format(self.memory_name) conn = sqlite3.connect( uri, uri=True, check_same_thread=False, ) ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1128139375,"`Database(memory_name=""name"")` constructor argument", https://github.com/simonw/sqlite-utils/issues/405#issuecomment-1033424454,https://api.github.com/repos/simonw/sqlite-utils/issues/405,1033424454,IC_kwDOCGYnMM49mM5G,9599,simonw,2022-02-09T07:18:25Z,2022-02-09T07:18:25Z,OWNER,Writing tests against this is always a tiny bit fiddly since the created databases persist across the lifetime of the test run. 
Using randomly generated names helps.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1128139375,"`Database(memory_name=""name"")` constructor argument", https://github.com/simonw/sqlite-utils/issues/404#issuecomment-1033410970,https://api.github.com/repos/simonw/sqlite-utils/issues/404,1033410970,IC_kwDOCGYnMM49mJma,9599,simonw,2022-02-09T06:56:35Z,2022-02-09T06:56:35Z,OWNER,https://sqlite-utils.datasette.io/en/latest/cli-reference.html#insert,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1128120451,Add example of `--convert` to the help for `sqlite-utils insert`, https://github.com/simonw/sqlite-utils/issues/404#issuecomment-1033407778,https://api.github.com/repos/simonw/sqlite-utils/issues/404,1033407778,IC_kwDOCGYnMM49mI0i,9599,simonw,2022-02-09T06:50:26Z,2022-02-09T06:50:26Z,OWNER,"I'll use this: ``` sqlite-utils insert plants.db plants plants.csv --csv --convert ' return { ""name"": row[""name""].upper(), ""latitude"": float(row[""latitude""]), ""longitude"": float(row[""longitude""]), }' ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1128120451,Add example of `--convert` to the help for `sqlite-utils insert`, https://github.com/simonw/datasette/issues/1607#issuecomment-1033403664,https://api.github.com/repos/simonw/datasette/issues/1607,1033403664,IC_kwDOBm6k_c49mH0Q,9599,simonw,2022-02-09T06:42:02Z,2022-02-09T06:42:02Z,OWNER,"Deployed a new build of https://github.com/simonw/calands-datasette/actions/workflows/build-and-deploy.yml for a live demo: https://calands.datasettes.com/-/versions","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1109783030,More detailed information about installed SpatiaLite version, 
https://github.com/simonw/sqlite-utils/issues/403#issuecomment-1032987901,https://api.github.com/repos/simonw/sqlite-utils/issues/403,1032987901,IC_kwDOCGYnMM49kiT9,9599,simonw,2022-02-08T19:36:06Z,2022-02-08T19:36:06Z,OWNER,New documentation: https://sqlite-utils.datasette.io/en/latest/cli.html#adding-a-primary-key-to-a-rowid-table,"{""total_count"": 3, ""+1"": 3, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1126692066,Document how to add a primary key to a rowid table using `sqlite-utils transform --pk`, https://github.com/simonw/sqlite-utils/issues/403#issuecomment-1032976720,https://api.github.com/repos/simonw/sqlite-utils/issues/403,1032976720,IC_kwDOCGYnMM49kflQ,9599,simonw,2022-02-08T19:23:05Z,2022-02-08T19:23:27Z,OWNER,"This is already possible using `sqlite-utils transform` like so: ``` % echo '[{""name"": ""Barry""}, {""name"": ""Sandra""}]' | sqlite-utils insert rowid.db records - % sqlite-utils schema rowid.db CREATE TABLE [records] ( [name] TEXT ); % sqlite-utils rows rowid.db records [{""name"": ""Barry""}, {""name"": ""Sandra""}] % sqlite-utils transform rowid.db records --pk id % sqlite-utils rows rowid.db records [{""id"": 1, ""name"": ""Barry""}, {""id"": 2, ""name"": ""Sandra""}] % sqlite-utils schema rowid.db CREATE TABLE ""records"" ( [id] INTEGER PRIMARY KEY, [name] TEXT ); % echo '[{""name"": ""Barry 2""}, {""name"": ""Sandra 2""}]' | sqlite-utils insert rowid.db records - % sqlite-utils rows rowid.db records [{""id"": 1, ""name"": ""Barry""}, {""id"": 2, ""name"": ""Sandra""}, {""id"": 3, ""name"": ""Barry 2""}, {""id"": 4, ""name"": ""Sandra 2""}] ``` It's not covered in the documentation though: https://sqlite-utils.datasette.io/en/3.23/cli.html#transforming-tables","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1126692066,Document how to add a primary key to a rowid table using `sqlite-utils 
transform --pk`, https://github.com/simonw/sqlite-utils/issues/402#issuecomment-1030904948,https://api.github.com/repos/simonw/sqlite-utils/issues/402,1030904948,IC_kwDOCGYnMM49clx0,9599,simonw,2022-02-06T20:09:42Z,2022-02-08T07:40:44Z,OWNER,"I think this is the code that needs to become aware of this system: https://github.com/simonw/sqlite-utils/blob/fea8c9bcc509bcae75e99ae8870f520103b9aa58/sqlite_utils/db.py#L2453-L2469 There's an earlier branch that runs for upserts which needs to be modified too: https://github.com/simonw/sqlite-utils/blob/fea8c9bcc509bcae75e99ae8870f520103b9aa58/sqlite_utils/db.py#L2417-L2440","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1125297737,Advanced class-based `conversions=` mechanism, https://github.com/simonw/sqlite-utils/issues/402#issuecomment-1030902102,https://api.github.com/repos/simonw/sqlite-utils/issues/402,1030902102,IC_kwDOCGYnMM49clFW,9599,simonw,2022-02-06T19:53:34Z,2022-02-08T07:40:34Z,OWNER,"I like the idea that the contract for `Conversion` (or rather for its subclasses) is that it can wrap a Python value and then return both the SQL fragment - e.g. 
`GeomFromText(?, 4326)` - and the values that should be used as the SQL parameters.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1125297737,Advanced class-based `conversions=` mechanism, https://github.com/simonw/sqlite-utils/issues/402#issuecomment-1032296717,https://api.github.com/repos/simonw/sqlite-utils/issues/402,1032296717,IC_kwDOCGYnMM49h5kN,9599,simonw,2022-02-08T07:35:46Z,2022-02-08T07:35:46Z,OWNER,"I'm going to write the documentation for this first, before the implementation, so I can see if it explains cleanly enough that the design appears to be sound.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1125297737,Advanced class-based `conversions=` mechanism, https://github.com/simonw/sqlite-utils/issues/402#issuecomment-1032294365,https://api.github.com/repos/simonw/sqlite-utils/issues/402,1032294365,IC_kwDOCGYnMM49h4_d,9599,simonw,2022-02-08T07:32:09Z,2022-02-08T07:34:41Z,OWNER,"I have an idea for how that third option could work - the one that creates a new column using values from the existing ones: ```python db[""places""].insert( { ""name"": ""London"", ""lng"": -0.118092, ""lat"": 51.509865, }, conversions={""point"": LongitudeLatitude(""lng"", ""lat"")}, ) ``` How about specifying that the values in that `conversions=` dictionary can be: - A SQL string fragment (as currently implemented) - A subclass of `Conversion` as described above - Or... 
a callable function that takes the row as an argument and returns either a `Conversion` subclass instance or a literal value to be inserted into the database (a string, int or float) Then you could do this: ```python db[""places""].insert( { ""name"": ""London"", ""lng"": -0.118092, ""lat"": 51.509865, }, conversions={ ""point"": lambda row: LongitudeLatitude( row[""lng""], row[""lat""] ) } ) ``` Something I really like about this is that it expands the abilities of `conversions=` beyond the slightly obscure need to customize the SQL fragment into something that can solve other data insertion cleanup problems too.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1125297737,Advanced class-based `conversions=` mechanism, https://github.com/simonw/datasette/issues/1632#issuecomment-1032057472,https://api.github.com/repos/simonw/datasette/issues/1632,1032057472,IC_kwDOBm6k_c49g_KA,9599,simonw,2022-02-07T23:50:01Z,2022-02-07T23:50:01Z,OWNER,Released in https://github.com/simonw/datasette/releases/tag/0.60.2,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1126604194,"datasette one.db one.db opens database twice, as one and one_2", https://github.com/simonw/datasette/issues/1632#issuecomment-1032050489,https://api.github.com/repos/simonw/datasette/issues/1632,1032050489,IC_kwDOBm6k_c49g9c5,9599,simonw,2022-02-07T23:39:11Z,2022-02-07T23:42:08Z,OWNER,"That implementation broke on Python 3.6 - which is still a supported Python version for the 0.60.x branch - `test_homepage` failed. ``` > assert ( ""2 rows in 1 table, 5 rows in 4 hidden tables, 1 view"" == counts_p.text.strip() ) E AssertionError: assert '2 rows in 1 ...ables, 1 view' == '1 table, 4 h...ables, 1 view' E - 1 table, 4 hidden tables, 1 view E + 2 rows in 1 table, 5 rows in 4 hidden tables, 1 view E ? 
++++++++++ ++++++++++ ``` That's because this idiom isn't guaranteed to preserve order in versions earlier than Python 3.7: https://github.com/simonw/datasette/blob/fa5fc327adbbf70656ac533912f3fc0526a3873d/datasette/cli.py#L552-L553 I could say that `0.60.2` is the first version to require Python 3.7 - but that feels a little surprising. I'm going to use a different idiom for order-preserving de-duplication from [this StackOverflow](https://stackoverflow.com/questions/480214/how-do-you-remove-duplicates-from-a-list-whilst-preserving-order) instead.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1126604194,"datasette one.db one.db opens database twice, as one and one_2", https://github.com/simonw/datasette/issues/1632#issuecomment-1032037391,https://api.github.com/repos/simonw/datasette/issues/1632,1032037391,IC_kwDOBm6k_c49g6QP,9599,simonw,2022-02-07T23:21:07Z,2022-02-07T23:21:07Z,OWNER,"For the record, here's the code that picks the `one_2` name if that stem is already used as a database name: https://github.com/simonw/datasette/blob/03305ea183b1534bc4cef3a721fe5f3700273b84/datasette/app.py#L401-L417","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1126604194,"datasette one.db one.db opens database twice, as one and one_2", https://github.com/simonw/datasette/issues/1632#issuecomment-1032036525,https://api.github.com/repos/simonw/datasette/issues/1632,1032036525,IC_kwDOBm6k_c49g6Ct,9599,simonw,2022-02-07T23:19:59Z,2022-02-07T23:19:59Z,OWNER,"I'm going to fix this in the CLI code itself, rather than fixing it in the `Datasette` constructor. That way if someone has a truly weird reason to want this behaviour they can construct Datasette directly. 
https://github.com/simonw/datasette/blob/03305ea183b1534bc4cef3a721fe5f3700273b84/datasette/cli.py#L535-L550","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1126604194,"datasette one.db one.db opens database twice, as one and one_2", https://github.com/simonw/datasette/issues/1632#issuecomment-1032034015,https://api.github.com/repos/simonw/datasette/issues/1632,1032034015,IC_kwDOBm6k_c49g5bf,9599,simonw,2022-02-07T23:17:57Z,2022-02-07T23:17:57Z,OWNER,I'm going to fix this in a 0.60.2 bug fix release.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1126604194,"datasette one.db one.db opens database twice, as one and one_2", https://github.com/simonw/datasette/issues/1632#issuecomment-1032032686,https://api.github.com/repos/simonw/datasette/issues/1632,1032032686,IC_kwDOBm6k_c49g5Gu,9599,simonw,2022-02-07T23:16:10Z,2022-02-07T23:16:10Z,OWNER,"I found this bug while trying to get the following to work: datasette /data/one.db /data/two.db /data/*.db --create I want this to create any missing database files on startup out of that literal list of `one.db` and `two.db` and to also open any other `*.db` files in that folder - needed for `datasette-publish-fly` in https://github.com/simonw/datasette-publish-fly/pull/12#issuecomment-1032029874","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1126604194,"datasette one.db one.db opens database twice, as one and one_2", https://github.com/simonw/sqlite-utils/issues/402#issuecomment-1031787865,https://api.github.com/repos/simonw/sqlite-utils/issues/402,1031787865,IC_kwDOCGYnMM49f9VZ,9599,simonw,2022-02-07T18:33:27Z,2022-02-07T18:33:27Z,OWNER,"Hah, that's interesting - I've never used that mechanism before so it wasn't something that came to mind. 
They seem to be using a pretty surprising trick there that takes advantage of SQLite allowing you to define a column ""type"" using a made-up type name, which you can then introspect later.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1125297737,Advanced class-based `conversions=` mechanism, https://github.com/simonw/datasette/issues/1439#issuecomment-1031141849,https://api.github.com/repos/simonw/datasette/issues/1439,1031141849,IC_kwDOBm6k_c49dfnZ,9599,simonw,2022-02-07T07:11:11Z,2022-02-07T07:11:11Z,OWNER,"I added a Link header to solve this problem for the JSON version in: - #1533 ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",973139047,Rethink how .ext formats (v.s. ?_format=) works before 1.0, https://github.com/simonw/datasette/issues/1176#issuecomment-1031126801,https://api.github.com/repos/simonw/datasette/issues/1176,1031126801,IC_kwDOBm6k_c49db8R,9599,simonw,2022-02-07T06:43:31Z,2022-02-07T06:43:31Z,OWNER,Here's the new test: https://github.com/simonw/datasette/blob/03305ea183b1534bc4cef3a721fe5f3700273b84/tests/test_docs.py#L91-L104,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",779691739,"Policy on documenting ""public"" datasette.utils functions", https://github.com/simonw/datasette/issues/1176#issuecomment-1031126547,https://api.github.com/repos/simonw/datasette/issues/1176,1031126547,IC_kwDOBm6k_c49db4T,9599,simonw,2022-02-07T06:42:58Z,2022-02-07T06:42:58Z,OWNER,"That fixed it: https://docs.datasette.io/en/latest/internals.html#parse-metadata-content ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",779691739,"Policy on documenting ""public"" datasette.utils functions", 
https://github.com/simonw/datasette/issues/1176#issuecomment-1031125347,https://api.github.com/repos/simonw/datasette/issues/1176,1031125347,IC_kwDOBm6k_c49dblj,9599,simonw,2022-02-07T06:40:16Z,2022-02-07T06:40:16Z,OWNER,"Read The Docs error: > Problem in your project's configuration. Invalid ""python.version"": .readthedocs.yaml: Invalid configuration option: python.version. Make sure the key name is correct.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",779691739,"Policy on documenting ""public"" datasette.utils functions", https://github.com/simonw/datasette/issues/1176#issuecomment-1031123719,https://api.github.com/repos/simonw/datasette/issues/1176,1031123719,IC_kwDOBm6k_c49dbMH,9599,simonw,2022-02-07T06:36:32Z,2022-02-07T06:36:32Z,OWNER,"https://github.com/simonw/sqlite-utils/blob/main/.readthedocs.yaml looks like this (it works correctly): ```yaml version: 2 sphinx: configuration: docs/conf.py python: version: ""3.8"" install: - method: pip path: . extra_requirements: - docs ``` Compare to the current Datasette one here: https://github.com/simonw/datasette/blob/d9b508ffaa91f9f1840b366f5d282712d445f16b/.readthedocs.yaml#L1-L13 Looks like I need this bit: ```yaml python: version: ""3.8"" install: - method: pip path: . 
extra_requirements: - docs ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",779691739,"Policy on documenting ""public"" datasette.utils functions", https://github.com/simonw/datasette/issues/1176#issuecomment-1031122800,https://api.github.com/repos/simonw/datasette/issues/1176,1031122800,IC_kwDOBm6k_c49da9w,9599,simonw,2022-02-07T06:34:21Z,2022-02-07T06:34:21Z,OWNER,"New section is here: https://docs.datasette.io/en/latest/internals.html#the-datasette-utils-module But it's not correctly displaying the new autodoc stuff: ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",779691739,"Policy on documenting ""public"" datasette.utils functions", https://github.com/simonw/datasette/issues/1176#issuecomment-1031108559,https://api.github.com/repos/simonw/datasette/issues/1176,1031108559,IC_kwDOBm6k_c49dXfP,9599,simonw,2022-02-07T06:11:27Z,2022-02-07T06:11:27Z,OWNER,I'm going with `@documented` as the decorator for functions that should be documented.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",779691739,"Policy on documenting ""public"" datasette.utils functions", https://github.com/simonw/datasette/issues/932#issuecomment-1030940407,https://api.github.com/repos/simonw/datasette/issues/932,1030940407,IC_kwDOBm6k_c49cub3,9599,simonw,2022-02-06T23:31:22Z,2022-02-06T23:31:22Z,OWNER,"Great argument for doing this from a conversation on Twitter about documentation-driven development: > Long ago, when the majority of commercial programs were desktop apps, I've read a very wise advice: The user manual should be written first, before even a single line if code. 
https://twitter.com/b11c/status/1490466703175823362","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",678760988,End-user documentation, https://github.com/simonw/sqlite-utils/issues/399#issuecomment-1030902158,https://api.github.com/repos/simonw/sqlite-utils/issues/399,1030902158,IC_kwDOCGYnMM49clGO,9599,simonw,2022-02-06T19:53:54Z,2022-02-06T19:53:54Z,OWNER,"Moving the design of this new `Conversion` subclass mechanism to: - https://github.com/simonw/sqlite-utils/issues/402","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1124731464,"Make it easier to insert geometries, with documentation and maybe code", https://github.com/simonw/sqlite-utils/issues/402#issuecomment-1030901853,https://api.github.com/repos/simonw/sqlite-utils/issues/402,1030901853,IC_kwDOCGYnMM49clBd,9599,simonw,2022-02-06T19:52:10Z,2022-02-06T19:52:10Z,OWNER,"So the key idea here is to introduce a new abstract base class, `Conversion`, which has the following abilities: - Can wrap one or more Python values (if called using the constructor) such that the `.insert_all()` method knows how to transform those into a format that can be included in an insert - something like `GeomFromText(?, 4326)` with input `POINT(-0.118092 51.509865)` - Can be passed to `conversions={""point"": LongitudeLatitude}` in a way that then knows to apply that conversion to every value in the `""point""` key of the data being inserted. - Maybe also extend `conversions=` to allow the definition of additional keys that use as input other rows? 
That's the `conversions={""point"": LongitudeLatitude(""lng"", ""lat"")}` example above - it may not be possible to get this working with the rest of the design though.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1125297737,Advanced class-based `conversions=` mechanism, https://github.com/simonw/sqlite-utils/issues/402#issuecomment-1030901189,https://api.github.com/repos/simonw/sqlite-utils/issues/402,1030901189,IC_kwDOCGYnMM49ck3F,9599,simonw,2022-02-06T19:48:36Z,2022-02-06T19:48:52Z,OWNER,"From [that thread](https://github.com/simonw/sqlite-utils/issues/399#issuecomment-1030739566), two extra ideas which it may be possible to support in a single implementation: ```python from sqlite_utils.conversions import LongitudeLatitude db[""places""].insert( { ""name"": ""London"", ""lng"": -0.118092, ""lat"": 51.509865, }, conversions={""point"": LongitudeLatitude(""lng"", ""lat"")}, ) ``` And ```python db[""places""].insert( { ""name"": ""London"", ""point"": LongitudeLatitude(-0.118092, 51.509865) } ) ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1125297737,Advanced class-based `conversions=` mechanism, https://github.com/simonw/sqlite-utils/issues/399#issuecomment-1030871591,https://api.github.com/repos/simonw/sqlite-utils/issues/399,1030871591,IC_kwDOCGYnMM49cdon,9599,simonw,2022-02-06T16:57:22Z,2022-02-06T16:57:22Z,OWNER,"I wonder if I could implement the above such that this *also* works: ```python db[""places""].insert( { ""name"": ""London"", ""point"": LongitudeLatitude(-0.118092, 51.509865) } ) ``` This feels like a very natural way to work with single inserts. The challenge is writing the code inside `.insert_all()` such that it can handle these special objects in the input column values in addition to them being passed in `conversions=`. 
I'm feeling very good about this direction in general though, it feels like it takes the existing but not particularly elegant `conversions=` mechanism and upgrades it to be far more useful, while maintaining backwards compatibility.","{""total_count"": 1, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 1}",1124731464,"Make it easier to insert geometries, with documentation and maybe code", https://github.com/simonw/datasette/issues/1176#issuecomment-1030762279,https://api.github.com/repos/simonw/datasette/issues/1176,1030762279,IC_kwDOBm6k_c49cC8n,9599,simonw,2022-02-06T06:38:08Z,2022-02-06T06:41:37Z,OWNER,"Might do this using Sphinx auto-generated function and class documentation hooks, as seen here in `sqlite-utils`: https://sqlite-utils.datasette.io/en/stable/python-api.html#spatialite-helpers This would encourage me to add really good docstrings. ``` .. _python_api_gis_find_spatialite: Finding SpatiaLite ------------------ .. 
autofunction:: sqlite_utils.utils.find_spatialite ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",779691739,"Policy on documenting ""public"" datasette.utils functions", https://github.com/simonw/datasette/issues/957#issuecomment-1030762140,https://api.github.com/repos/simonw/datasette/issues/957,1030762140,IC_kwDOBm6k_c49cC6c,9599,simonw,2022-02-06T06:36:41Z,2022-02-06T06:36:41Z,OWNER,Documented here: https://docs.datasette.io/en/latest/internals.html#import-shortcuts,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",688622148,Simplify imports of common classes, https://github.com/simonw/datasette/issues/957#issuecomment-1030761625,https://api.github.com/repos/simonw/datasette/issues/957,1030761625,IC_kwDOBm6k_c49cCyZ,9599,simonw,2022-02-06T06:30:32Z,2022-02-06T06:31:44Z,OWNER,"I'm just going with: ```python from datasette import Response from datasette import Forbidden from datasette import NotFound from datasette import hookimpl from datasette import actor_matches_allow ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",688622148,Simplify imports of common classes, https://github.com/simonw/sqlite-utils/issues/399#issuecomment-1030740963,https://api.github.com/repos/simonw/sqlite-utils/issues/399,1030740963,IC_kwDOCGYnMM49b9vj,9599,simonw,2022-02-06T03:00:33Z,2022-02-06T03:00:33Z,OWNER,"Yeah, having this be a general purpose mechanism which has a few canned examples for handling geospatial stuff is a lot neater than having a mechanism for this that's exclusive to SpatiaLite.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1124731464,"Make it easier to insert geometries, with documentation and maybe code", 
https://github.com/simonw/sqlite-utils/issues/399#issuecomment-1030740846,https://api.github.com/repos/simonw/sqlite-utils/issues/399,1030740846,IC_kwDOCGYnMM49b9tu,9599,simonw,2022-02-06T02:59:21Z,2022-02-06T02:59:21Z,OWNER,I wonder if there are any interesting non-geospatial canned conversions that it would be worth including?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1124731464,"Make it easier to insert geometries, with documentation and maybe code", https://github.com/simonw/sqlite-utils/issues/399#issuecomment-1030740771,https://api.github.com/repos/simonw/sqlite-utils/issues/399,1030740771,IC_kwDOCGYnMM49b9sj,9599,simonw,2022-02-06T02:58:29Z,2022-02-06T02:58:29Z,OWNER,That example you have there is really neat - I like the idea that they can also be used to populate completely new columns that are derived from the other column inputs.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1124731464,"Make it easier to insert geometries, with documentation and maybe code", https://github.com/simonw/sqlite-utils/issues/399#issuecomment-1030740570,https://api.github.com/repos/simonw/sqlite-utils/issues/399,1030740570,IC_kwDOCGYnMM49b9pa,9599,simonw,2022-02-06T02:56:17Z,2022-02-06T02:57:00Z,OWNER,"Thinking about types. 
The type of the `conversions` parameter right now is a bit lazy: ```python conversions: Optional[dict] = None, ``` That becomes: ```python Optional[Dict[str, Union[str, Conversion]]] ``` Where `Conversion` is an abstract base class which expects implementations to have a `.sql() -> str` and a `.convert(value) -> str` method.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1124731464,"Make it easier to insert geometries, with documentation and maybe code", https://github.com/simonw/sqlite-utils/issues/399#issuecomment-1030739566,https://api.github.com/repos/simonw/sqlite-utils/issues/399,1030739566,IC_kwDOCGYnMM49b9Zu,9599,simonw,2022-02-06T02:45:25Z,2022-02-06T02:50:27Z,OWNER,"Another idea - my favourite option so far: ```python from sqlite_utils.utils import LongitudeLatitude db[""places""].insert( { ""name"": ""London"", ""point"": (-0.118092, 51.509865) }, conversions={""point"": LongitudeLatitude}, ) ``` Here `LongitudeLatitude` is a magical value which does TWO things: it sets up the `GeomFromText(?, 4326)` SQL function, and it handles converting the `(-0.118092, 51.509865)` tuple into a `POINT({} {})` string. This would involve a change to the `conversions=` contract - where it usually expects a SQL string fragment, but it can also take an object which combines that SQL string fragment with a Python conversion function. Best of all... this resolves the `lat, lon` v.s.
`lon, lat` dilemma because you can use `from sqlite_utils.utils import LongitudeLatitude` OR `from sqlite_utils.utils import LatitudeLongitude` depending on which you prefer!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1124731464,"Make it easier to insert geometries, with documentation and maybe code", https://github.com/simonw/sqlite-utils/issues/399#issuecomment-1030738023,https://api.github.com/repos/simonw/sqlite-utils/issues/399,1030738023,IC_kwDOCGYnMM49b9Bn,9599,simonw,2022-02-06T02:28:05Z,2022-02-06T02:29:24Z,OWNER,"Here's the definitive guide to `latitude, longitude` v.s. `longitude, latitude`: https://macwright.com/lonlat/ > Which is right? > > Neither. This is an opinion with no right answer. Geographical tradition favors lat, lon. Math and software prefer lon, lat. I asked on Twitter here: https://twitter.com/simonw/status/1490148001569906688","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1124731464,"Make it easier to insert geometries, with documentation and maybe code", https://github.com/simonw/sqlite-utils/issues/399#issuecomment-1030736848,https://api.github.com/repos/simonw/sqlite-utils/issues/399,1030736848,IC_kwDOCGYnMM49b8vQ,9599,simonw,2022-02-06T02:17:35Z,2022-02-06T02:17:35Z,OWNER,"Note that GeoJSON itself uses `(longitude, latitude)` so I should probably stick to that order here too. 
https://datatracker.ietf.org/doc/html/rfc7946#section-3.1.1","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1124731464,"Make it easier to insert geometries, with documentation and maybe code", https://github.com/simonw/sqlite-utils/issues/399#issuecomment-1030736589,https://api.github.com/repos/simonw/sqlite-utils/issues/399,1030736589,IC_kwDOCGYnMM49b8rN,9599,simonw,2022-02-06T02:14:52Z,2022-02-06T02:14:52Z,OWNER,"Another idea: introduce a helper function transform pattern, something a bit like this: ```python transformer = make_transformer({ ""point"": lambda pair: ""POINT({} {})"".format(pair[1], pair[0]) }) db[""places""].insert_all( transformer([{""name"": ""London"", ""point"": (51.509865, -0.118092)}]), conversions={""point"": ""GeomFromText(?, 4326)""}, ) ``` The `make_transformer(...)` function builds an object that can work as a wrapping iterator, applying those transform functions to everything in the sequence that it wraps. So the above code would handle converting `(lat, lon)` to `POINT(lon lat)` - then the `conversions=` applies `GeomFromText`. Naming is a challenge here: `.transform()` and `.convert()` and `conversions=` all have existing meanings within the `sqlite-utils` Python library. It's also a bit of a messy way of solving this.
It's not exactly a smooth API for inserting a bunch of lat/lon coordinate pairs!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1124731464,"Make it easier to insert geometries, with documentation and maybe code", https://github.com/simonw/sqlite-utils/issues/399#issuecomment-1030736047,https://api.github.com/repos/simonw/sqlite-utils/issues/399,1030736047,IC_kwDOCGYnMM49b8iv,9599,simonw,2022-02-06T02:10:18Z,2022-02-06T02:10:18Z,OWNER,"So maybe back to that earlier idea where the code introspects the table, figures out that `""point""` is a geometry table of type POINT, then applies the necessary conversions to the raw Python data? That feels overly-complicated to me, especially since nothing else in the `.insert()` method currently relies on table introspection.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1124731464,"Make it easier to insert geometries, with documentation and maybe code", https://github.com/simonw/sqlite-utils/issues/399#issuecomment-1030735774,https://api.github.com/repos/simonw/sqlite-utils/issues/399,1030735774,IC_kwDOCGYnMM49b8ee,9599,simonw,2022-02-06T02:08:19Z,2022-02-06T02:08:59Z,OWNER,"Maybe I should leave this entirely up to documented patterns in the `conversions={}` dictionary? But even that's not ideal for the co-ordinate case. Consider the following: ```python db[""places""].insert( {""name"": ""London"", ""point"": (51.509865, -0.118092)}, conversions={""point"": ""GeomFromText(?, 4326)""}, ) ``` The challenge here is that the SpatiaLite function `GeomFromText()` expects a WKT string, which looks like this: POINT(-0.118092 51.509865) The existing `conversions=` mechanism doesn't support applying Python code to convert the `(lat, lon)` tuple to that value. 
It doesn't even support passing a Python tuple as a `?` parameter - so I don't think I could come up with a SQL string that would do the right thing here either.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1124731464,"Make it easier to insert geometries, with documentation and maybe code", https://github.com/simonw/sqlite-utils/issues/401#issuecomment-1030735372,https://api.github.com/repos/simonw/sqlite-utils/issues/401,1030735372,IC_kwDOCGYnMM49b8YM,9599,simonw,2022-02-06T02:05:03Z,2022-02-06T02:05:03Z,OWNER,Improved version: https://sqlite-utils.datasette.io/en/latest/python-api.html#converting-column-values-using-sql-functions,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1125081640,Update SpatiaLite example in the documentation, https://github.com/simonw/sqlite-utils/issues/401#issuecomment-1030734937,https://api.github.com/repos/simonw/sqlite-utils/issues/401,1030734937,IC_kwDOCGYnMM49b8RZ,9599,simonw,2022-02-06T02:02:24Z,2022-02-06T02:02:24Z,OWNER,The example also doesn't work right now - the code that fetches from Who's On First gets a 403 forbidden error.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1125081640,Update SpatiaLite example in the documentation, https://github.com/simonw/sqlite-utils/issues/399#issuecomment-1030732909,https://api.github.com/repos/simonw/sqlite-utils/issues/399,1030732909,IC_kwDOCGYnMM49b7xt,9599,simonw,2022-02-06T01:47:06Z,2022-02-06T01:47:06Z,OWNER,"Here's an idea for an API design: ```python geojson_geometry = {} # ... GeoJSON goes here db[""places""].insert( {""name"": ""Wales"", ""geometry"": geojson_geometry}, geojson=""geometry"" ) ``` That `geojson=` parameter takes either a single column name or an iterable of column names. 
Any column in that list is expected to be a compatible `geometry` and the correct conversion functions will be applied. That solves for GeoJSON, but it's a bit ugly. Should I add `wkt=` and maybe even `kml=` and `gml=` and so-on too? Definitely not, that's way too many ugly and inscrutable new parameters. More importantly: if I want to support the following how would I do it? ```python db[""places""].insert( {""name"": ""London"", ""point"": (51.509865, -0.118092)} ) ``` Here I want to provide a `(latitude, longitude)` pair and have it inserted correctly into a `point` column. Could do this, but again it's messy: ```python db[""places""].insert( {""name"": ""London"", ""point"": (51.509865, -0.118092)}, point=""point"" ) ``` And again, what about those `(longitude, latitude)` people?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1124731464,"Make it easier to insert geometries, with documentation and maybe code", https://github.com/simonw/sqlite-utils/issues/398#issuecomment-1030732222,https://api.github.com/repos/simonw/sqlite-utils/issues/398,1030732222,IC_kwDOCGYnMM49b7m-,9599,simonw,2022-02-06T01:42:19Z,2022-02-06T01:42:28Z,OWNER,"Adding some thoughts to: - #399 ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1124237013,Add SpatiaLite helpers to CLI, https://github.com/simonw/sqlite-utils/issues/398#issuecomment-1030732093,https://api.github.com/repos/simonw/sqlite-utils/issues/398,1030732093,IC_kwDOCGYnMM49b7k9,9599,simonw,2022-02-06T01:41:37Z,2022-02-06T01:41:37Z,OWNER,Yeah I'd like to avoid adding any geo-dependencies to `sqlite-utils` if I can avoid it. 
I'm fine using stuff that's going to be available in SpatiaLite itself (provided it's available as a SQLite module) since then I don't need to add any extra Python dependencies.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1124237013,Add SpatiaLite helpers to CLI, https://github.com/simonw/sqlite-utils/issues/400#issuecomment-1030730748,https://api.github.com/repos/simonw/sqlite-utils/issues/400,1030730748,IC_kwDOCGYnMM49b7P8,9599,simonw,2022-02-06T01:34:46Z,2022-02-06T01:34:46Z,OWNER,"Actually this is not needed - there is already an option that does this, it's just called `--ignore` rather than `--if-not-exists`. The lack of consistency here is a little annoying, but not annoying enough to justify making a backwards incompatible change. ``` % sqlite-utils create-table --help Usage: sqlite-utils create-table [OPTIONS] PATH TABLE COLUMNS... Add a table with the specified columns. Columns should be specified using name, type pairs, for example: sqlite-utils create-table my.db people \ id integer \ name text \ height float \ photo blob --pk id Options: --pk TEXT Column to use as primary key --not-null TEXT Columns that should be created as NOT NULL --default ... Default value that should be set for a column --fk ... Column, other table, other column to set as a foreign key --ignore If table already exists, do nothing --replace If table already exists, replace it --load-extension TEXT SQLite extensions to load -h, --help Show this message and exit. ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1125077063,`sqlite-utils create-table` ... 
`--if-not-exists`, https://github.com/simonw/sqlite-utils/issues/397#issuecomment-1030730108,https://api.github.com/repos/simonw/sqlite-utils/issues/397,1030730108,IC_kwDOCGYnMM49b7F8,9599,simonw,2022-02-06T01:30:46Z,2022-02-06T01:30:46Z,OWNER,Updated documentation is here: https://sqlite-utils.datasette.io/en/latest/python-api.html#explicitly-creating-a-table,"{""total_count"": 1, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 1, ""rocket"": 0, ""eyes"": 0}",1123903919,Support IF NOT EXISTS for table creation, https://github.com/simonw/sqlite-utils/issues/397#issuecomment-1030727979,https://api.github.com/repos/simonw/sqlite-utils/issues/397,1030727979,IC_kwDOCGYnMM49b6kr,9599,simonw,2022-02-06T01:19:21Z,2022-02-06T01:19:21Z,OWNER,"Just noticed there's no explicit test coverage for the `db[""table""].create(...)` method.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1123903919,Support IF NOT EXISTS for table creation, https://github.com/simonw/sqlite-utils/issues/397#issuecomment-1030726991,https://api.github.com/repos/simonw/sqlite-utils/issues/397,1030726991,IC_kwDOCGYnMM49b6VP,9599,simonw,2022-02-06T01:13:58Z,2022-02-06T01:13:58Z,OWNER,This is a good idea. 
We already have that parameter for the `table.create_index()` method: https://sqlite-utils.datasette.io/en/stable/reference.html#sqlite_utils.db.Table.create_index,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1123903919,Support IF NOT EXISTS for table creation, https://github.com/simonw/sqlite-utils/issues/399#issuecomment-1030712129,https://api.github.com/repos/simonw/sqlite-utils/issues/399,1030712129,IC_kwDOCGYnMM49b2tB,9599,simonw,2022-02-05T23:08:45Z,2022-02-05T23:08:45Z,OWNER,"Useful thoughts on Twitter regarding making coordinate pairs easy and more complex shapes possible: https://twitter.com/dbreunig/status/1490099303888547843 > That is exactly where I was going: two modes. > > 1. Heuristics and assumptions to get coordinates as a pair (in tuple) or as columns (look for lat, lon, latitude, longitude, etc). > 2. GIS mode with projections, polys, etc > > Make it easy for people with csvs of coordinates. If you're using Geojson or shp files, you have to specify.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1124731464,"Make it easier to insert geometries, with documentation and maybe code", https://github.com/simonw/sqlite-utils/issues/399#issuecomment-1030468418,https://api.github.com/repos/simonw/sqlite-utils/issues/399,1030468418,IC_kwDOCGYnMM49a7NC,9599,simonw,2022-02-05T00:49:08Z,2022-02-05T22:59:06Z,OWNER,"I'm trying to think of ways to make this nicer from the perspective of someone calling the `.insert()` or `.insert_all()` methods against a table that has geometry columns. One option would be for the code to introspect the table (if it exists) before running the insert, looking for any geometry columns. This introspection isn't easy! 
The table schema just gives you `""name_of_column"" point` or similar - to figure out the SRID and suchlike you need to consult the `geometry_columns` table, I think - which throws a 500 error on https://calands.datasettes.com/calands/geometry_columns for some reason. Also does the shape of that table change between SpatiaLite versions? Assuming we can introspect the table, what would we do with that information? We could add code that detects if the user attempted to pass GeoJSON objects and automatically inserts a `GeomFromGeoJSON()` function call - but detecting GeoJSON is a bit weird, and GeoJSON also isn't necessarily the nicest format for populating e.g. latitude/longitude points. Maybe we just support the simplest possible case: a tuple of floats, which we assume is `latitude, longitude` (or should we expect `longitude, latitude`, the eternal debate?) - if those are used against a geometry table (especially a point table) we assume they are coordinates that need to be converted using `GeomFromText('POINT(...`. Not crazy about either of these ideas. 
Is there something better?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1124731464,"Make it easier to insert geometries, with documentation and maybe code", https://github.com/simonw/sqlite-utils/issues/398#issuecomment-1030534868,https://api.github.com/repos/simonw/sqlite-utils/issues/398,1030534868,IC_kwDOCGYnMM49bLbU,9599,simonw,2022-02-05T06:03:38Z,2022-02-05T06:03:38Z,OWNER,@eyeseast how do you usually insert geometries at the moment?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1124237013,Add SpatiaLite helpers to CLI, https://github.com/simonw/datasette/issues/1576#issuecomment-1030530071,https://api.github.com/repos/simonw/datasette/issues/1576,1030530071,IC_kwDOBm6k_c49bKQX,9599,simonw,2022-02-05T05:21:35Z,2022-02-05T05:21:35Z,OWNER,New documentation section: https://docs.datasette.io/en/latest/internals.html#datasette-tracer,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1087181951,Traces should include SQL executed by subtasks created with `asyncio.gather`, https://github.com/simonw/datasette/issues/1576#issuecomment-1030528532,https://api.github.com/repos/simonw/datasette/issues/1576,1030528532,IC_kwDOBm6k_c49bJ4U,9599,simonw,2022-02-05T05:09:57Z,2022-02-05T05:09:57Z,OWNER,Needs documentation. 
I'll document `from datasette.tracer import trace` too.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1087181951,Traces should include SQL executed by subtasks created with `asyncio.gather`, https://github.com/simonw/datasette/issues/1576#issuecomment-1030525218,https://api.github.com/repos/simonw/datasette/issues/1576,1030525218,IC_kwDOBm6k_c49bJEi,9599,simonw,2022-02-05T04:45:11Z,2022-02-05T04:45:11Z,OWNER,"Got a prototype working with `contextvars` - it identified two parallel executing queries using the patch from above: ![CleanShot 2022-02-04 at 20 41 50@2x](https://user-images.githubusercontent.com/9599/152628949-cf766b13-13cf-4831-b48d-2f23cadb6a05.png) ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1087181951,Traces should include SQL executed by subtasks created with `asyncio.gather`, https://github.com/simonw/datasette/issues/1576#issuecomment-1017112543,https://api.github.com/repos/simonw/datasette/issues/1576,1017112543,IC_kwDOBm6k_c48n-ff,9599,simonw,2022-01-20T04:35:00Z,2022-02-05T04:33:46Z,OWNER,I dropped support for Python 3.6 in fae3983c51f4a3aca8335f3e01ff85ef27076fbf so now free to use `contextvars` for this.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1087181951,Traces should include SQL executed by subtasks created with `asyncio.gather`, https://github.com/simonw/sqlite-utils/issues/398#issuecomment-1030521533,https://api.github.com/repos/simonw/sqlite-utils/issues/398,1030521533,IC_kwDOCGYnMM49bIK9,9599,simonw,2022-02-05T04:25:49Z,2022-02-05T04:25:49Z,OWNER,For ingesting geometry data from the command-line maybe GeoJSON would be the best route?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 
0}",1124237013,Add SpatiaLite helpers to CLI, https://github.com/simonw/sqlite-utils/issues/399#issuecomment-1030466255,https://api.github.com/repos/simonw/sqlite-utils/issues/399,1030466255,IC_kwDOCGYnMM49a6rP,9599,simonw,2022-02-05T00:41:35Z,2022-02-05T00:42:23Z,OWNER,"Wow, it was the newlines that broke it! This works fine: ```sql select AsWKT(SetSRID(GeomFromGeoJSON('{""type"": ""Point"",""coordinates"": [-94.921875,45.460130637921004]}'), 4326)) ``` https://calands.datasettes.com/calands?sql=select+AsWKT%28SetSRID%28GeomFromGeoJSON%28%27%7B%22type%22%3A+%22Point%22%2C%22coordinates%22%3A+%5B-94.921875%2C45.460130637921004%5D%7D%27%29%2C+4326%29%29 And removing `SetSRID()` returns exactly the same result: https://calands.datasettes.com/calands?sql=select+AsWKT%28GeomFromGeoJSON%28%27%7B%22type%22%3A+%22Point%22%2C%22coordinates%22%3A+%5B-94.921875%2C45.460130637921004%5D%7D%27%29%29","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1124731464,"Make it easier to insert geometries, with documentation and maybe code", https://github.com/simonw/sqlite-utils/issues/399#issuecomment-1030465557,https://api.github.com/repos/simonw/sqlite-utils/issues/399,1030465557,IC_kwDOCGYnMM49a6gV,9599,simonw,2022-02-05T00:39:09Z,2022-02-05T00:39:09Z,OWNER,"I can't seem to get `GeomFromGeoJSON()` to work - example: https://calands.datasettes.com/calands?sql=select+IsValid%28SetSRID%28GeomFromGeoJSON%28%27%7B%0D%0A++++++++%22type%22%3A+%22Point%22%2C%0D%0A++++++++%22coordinates%22%3A+%5B%0D%0A++++++++++-94.921875%2C%0D%0A++++++++++45.460130637921004%0D%0A++++++++%5D%0D%0A++++++%7D%27%29%2C+4326%29%29 ```sql select IsValid(SetSRID(GeomFromGeoJSON('{ ""type"": ""Point"", ""coordinates"": [ -94.921875, 45.460130637921004 ] }'), 4326)) ``` Returns `-1` suggesting the geometry is not valid. 
Just doing this (with or without that `SetSRID()` function) returns null: ```sql select SetSRID(GeomFromGeoJSON('{ ""type"": ""Point"", ""coordinates"": [ -94.921875, 45.460130637921004 ] }'), 4326) ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1124731464,"Make it easier to insert geometries, with documentation and maybe code", https://github.com/simonw/sqlite-utils/issues/399#issuecomment-1030461163,https://api.github.com/repos/simonw/sqlite-utils/issues/399,1030461163,IC_kwDOCGYnMM49a5br,9599,simonw,2022-02-05T00:30:18Z,2022-02-05T00:30:18Z,OWNER,"I wonder what the most developer-friendly way to insert geometry data into SpatiaLite is? From https://www.gaia-gis.it/gaia-sins/spatialite-sql-latest.html it looks like these are the main options: - `GeomFromText( wkt String [ , SRID Integer] )` - `GeomFromWKB( wkbGeometry Binary [ , SRID Integer] )` - `GeomFromKml( KmlGeometry String )` - `GeomFromGML( gmlGeometry String )` - `GeomFromGeoJSON( geoJSONGeometry String )` - `GeomFromEWKB( ewkbGeometry String )` - `GeomFromEWKT( ewktGeometry String )` - `GeomFromFGF( fgfGeometry Binary [ , SRID Integer] )` - `GeomFromTWKB( twkbGeometry BLOB [ , SRID Integer] )` - `GeomFromGPB( geom GPKG Blob Geometry )` - GeoPackage format - `GeomFromExifGpsBlob( image BLOB )` Interesting that some accept an SRID and others do not - presumably `GeomFromGeoJSON()` always uses SRID=4326?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1124731464,"Make it easier to insert geometries, with documentation and maybe code", https://github.com/simonw/sqlite-utils/issues/398#issuecomment-1030456717,https://api.github.com/repos/simonw/sqlite-utils/issues/398,1030456717,IC_kwDOCGYnMM49a4WN,9599,simonw,2022-02-05T00:16:42Z,2022-02-05T00:16:42Z,OWNER,"> The one thing worth highlighting in docs is that geometry columns can only 
be added to existing tables. Trying to add a geometry column to a table that doesn't exist yet might mean you have a schema like `{""rowid"": int, ""geometry"": bytes}`. Might be worth nudging people to explicitly create a table first, then add geometry columns. That's a good call. I'm happy for `sqlite-utils add-geometry-column` to throw an error if the table doesn't exist yet.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1124237013,Add SpatiaLite helpers to CLI, https://github.com/simonw/sqlite-utils/issues/399#issuecomment-1030455715,https://api.github.com/repos/simonw/sqlite-utils/issues/399,1030455715,IC_kwDOCGYnMM49a4Gj,9599,simonw,2022-02-05T00:15:28Z,2022-02-05T00:15:28Z,OWNER,"The `conversions=` argument to `.insert()` and friends is designed to handle this case, but I don't think it's very elegant: https://sqlite-utils.datasette.io/en/stable/python-api.html#converting-column-values-using-sql-functions ```python db[""places""].insert( {""name"": ""Wales"", ""geometry"": wkt}, conversions={""geometry"": ""GeomFromText(?, 4326)""}, ) ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1124731464,"Make it easier to insert geometries, with documentation and maybe code", https://github.com/simonw/sqlite-utils/issues/398#issuecomment-1030454114,https://api.github.com/repos/simonw/sqlite-utils/issues/398,1030454114,IC_kwDOCGYnMM49a3ti,9599,simonw,2022-02-05T00:14:47Z,2022-02-05T00:14:47Z,OWNER,"I like these designs a lot. I would suggest `sqlite-utils create database.db --init-spatialite` there for consistency with the `sqlite-utils init-spatialite database.db` command. The other part of this story is how we support actually inserting spatial data from the command-line. 
I opened an issue about the challenges in doing that for the Python API here - #399 - but we need a good answer for the CLI too. I don't yet have any good ideas here. The `conversions=` option in the Python library was designed to cover these kinds of cases but it's pretty clunky and I don't think it's very widely used: https://sqlite-utils.datasette.io/en/stable/python-api.html#converting-column-values-using-sql-functions","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1124237013,Add SpatiaLite helpers to CLI, https://github.com/simonw/sqlite-utils/issues/79#issuecomment-1029703503,https://api.github.com/repos/simonw/sqlite-utils/issues/79,1029703503,IC_kwDOCGYnMM49YAdP,9599,simonw,2022-02-04T06:46:32Z,2022-02-04T06:46:32Z,OWNER,Shipped in 3.23: https://sqlite-utils.datasette.io/en/stable/changelog.html#v3-23,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",557842245,Helper methods for working with SpatiaLite, https://github.com/simonw/sqlite-utils/pull/385#issuecomment-1029703216,https://api.github.com/repos/simonw/sqlite-utils/issues/385,1029703216,IC_kwDOCGYnMM49YAYw,9599,simonw,2022-02-04T06:45:43Z,2022-02-04T06:45:43Z,OWNER,Shipped this as `sqlite-utils` 3.23: https://sqlite-utils.datasette.io/en/stable/changelog.html#v3-23,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1102899312,Add new spatialite helper methods, https://github.com/simonw/datasette/issues/1080#issuecomment-1029695083,https://api.github.com/repos/simonw/datasette/issues/1080,1029695083,IC_kwDOBm6k_c49X-Zr,9599,simonw,2022-02-04T06:24:40Z,2022-02-04T06:25:18Z,OWNER,"An initial prototype of that in my local `group-count` branch quickly started running into problems: ```diff diff --git a/datasette/views/table.py b/datasette/views/table.py 
index be9e9c3..d30efe1 100644 --- a/datasette/views/table.py +++ b/datasette/views/table.py @@ -105,8 +105,12 @@ class RowTableShared(DataView): type_ = ""integer"" notnull = 0 else: - type_ = column_details[r[0]].type - notnull = column_details[r[0]].notnull + try: + type_ = column_details[r[0]].type + notnull = column_details[r[0]].notnull + except KeyError: # Probably count(*) + type_ = ""integer"" + notnull = False columns.append( { ""name"": r[0], @@ -613,6 +617,15 @@ class TableView(RowTableShared): offset=offset, ) + # If ?_group_count we convert the SQL query here + group_count = request.args.getlist(""_group_count"") + if group_count: + wrapped_sql = ""select {cols}, count(*) from ({sql}) group by {cols}"".format( + cols="", "".join(group_count), + sql=sql, + ) + sql = wrapped_sql + if request.args.get(""_timelimit""): extra_args[""custom_time_limit""] = int(request.args.get(""_timelimit"")) ``` Resulted in errors like this one: ``` pk_path = path_from_row_pks(row, pks, not pks, False) File ""/Users/simon/Dropbox/Development/datasette/datasette/utils/__init__.py"", line 82, in path_from_row_pks bits = [ File ""/Users/simon/Dropbox/Development/datasette/datasette/utils/__init__.py"", line 83, in row[pk][""value""] if isinstance(row[pk], dict) else row[pk] for pk in pks IndexError: No item with that key ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",734777631,"""View all"" option for facets, to provide a (paginated) list of ALL of the facet counts plus a link to view them", https://github.com/simonw/datasette/issues/1080#issuecomment-1029691693,https://api.github.com/repos/simonw/datasette/issues/1080,1029691693,IC_kwDOBm6k_c49X9kt,9599,simonw,2022-02-04T06:16:45Z,2022-02-04T06:16:45Z,OWNER,"Had a new, different idea for how this could work: support a `?_group_count=colname` parameter to the table view, which turns the page into a `select colname, count(*) ... 
group by colname` query - but keeps things like the filter interface, facet selection, search box and so on.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",734777631,"""View all"" option for facets, to provide a (paginated) list of ALL of the facet counts plus a link to view them", https://github.com/simonw/sqlite-utils/issues/395#issuecomment-1029686150,https://api.github.com/repos/simonw/sqlite-utils/issues/395,1029686150,IC_kwDOCGYnMM49X8OG,9599,simonw,2022-02-04T06:03:51Z,2022-02-04T06:03:51Z,OWNER,I'm just going to run the SpatiaLite tests on Ubuntu.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1123849278,"""apt-get: command not found"" error on macOS", https://github.com/simonw/sqlite-utils/issues/79#issuecomment-1029683977,https://api.github.com/repos/simonw/sqlite-utils/issues/79,1029683977,IC_kwDOCGYnMM49X7sJ,9599,simonw,2022-02-04T05:58:15Z,2022-02-04T05:58:15Z,OWNER,Documentation: https://sqlite-utils.datasette.io/en/latest/python-api.html#spatialite-helpers,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",557842245,Helper methods for working with SpatiaLite, https://github.com/simonw/sqlite-utils/pull/385#issuecomment-1029682294,https://api.github.com/repos/simonw/sqlite-utils/issues/385,1029682294,IC_kwDOCGYnMM49X7R2,9599,simonw,2022-02-04T05:53:26Z,2022-02-04T05:53:26Z,OWNER,"This looks fantastic, thanks for all of the work you put into this!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1102899312,Add new spatialite helper methods, 
https://github.com/simonw/sqlite-utils/issues/352#issuecomment-1029479388,https://api.github.com/repos/simonw/sqlite-utils/issues/352,1029479388,IC_kwDOCGYnMM49XJvc,9599,simonw,2022-02-03T22:59:35Z,2022-02-03T22:59:35Z,OWNER,"Ran into this bug again while writing tests for this: - #186","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1072792507,`sqlite-utils insert --extract colname`, https://github.com/simonw/sqlite-utils/issues/363#issuecomment-1029475387,https://api.github.com/repos/simonw/sqlite-utils/issues/363,1029475387,IC_kwDOCGYnMM49XIw7,9599,simonw,2022-02-03T22:52:30Z,2022-02-03T22:52:30Z,OWNER,"Demos: ``` % sqlite-utils insert /tmp/all.db blah /tmp/log.log --convert '[1]' --text Error: Rows must all be dictionaries, got: 1 % sqlite-utils insert /tmp/all.db blah /tmp/log.log --convert '1' --text Error: --convert must return dict or iterator ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1094981339,Better error message if `--convert` code fails to return a dict, https://github.com/simonw/sqlite-utils/issues/363#issuecomment-1029469630,https://api.github.com/repos/simonw/sqlite-utils/issues/363,1029469630,IC_kwDOCGYnMM49XHW-,9599,simonw,2022-02-03T22:42:36Z,2022-02-03T22:42:36Z,OWNER,"> This check should run inside the `.insert_all()` method. It should raise a custom exception which the CLI code can then catch and turn into a click error. 
Actually no that doesn't work, because this line causes an error before we even get to `.insert_all()`: https://github.com/simonw/sqlite-utils/blob/7d928f83085fb285f294dbdaeb93fd94a44d5d44/sqlite_utils/cli.py#L1012-L1013","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1094981339,Better error message if `--convert` code fails to return a dict, https://github.com/simonw/sqlite-utils/issues/393#issuecomment-1029450617,https://api.github.com/repos/simonw/sqlite-utils/issues/393,1029450617,IC_kwDOCGYnMM49XCt5,9599,simonw,2022-02-03T22:13:24Z,2022-02-03T22:13:24Z,OWNER,Much better: https://sqlite-utils.datasette.io/en/latest/python-api.html#insert-replacing-data,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1118585417,Better documentation for insert-replace, https://github.com/simonw/sqlite-utils/issues/369#issuecomment-1029402837,https://api.github.com/repos/simonw/sqlite-utils/issues/369,1029402837,IC_kwDOCGYnMM49W3DV,9599,simonw,2022-02-03T21:07:35Z,2022-02-03T21:07:35Z,OWNER,"Closing this - it was something I was curious about, but evidently not curious enough to actually do the work!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1097091527,Research how much of a difference analyze / sqlite_stat1 makes, https://github.com/simonw/sqlite-utils/issues/394#issuecomment-1029402029,https://api.github.com/repos/simonw/sqlite-utils/issues/394,1029402029,IC_kwDOCGYnMM49W22t,9599,simonw,2022-02-03T21:06:35Z,2022-02-03T21:06:35Z,OWNER,"This broke on Windows: https://github.com/simonw/sqlite-utils/runs/5056912641 ``` if recreate and os.path.exists(filename_or_conn): > os.remove(filename_or_conn) E PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 
'C:\\Users\\runneradmin\\AppData\\Local\\Temp\\pytest-of-runneradmin\\pytest-0\\test_recreate_False_True_0\\data.db' ``` I'm going to revert it from `main` for the moment.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1122446693,Test against Python 3.11-dev,