html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,issue,performed_via_github_app
https://github.com/simonw/datasette/issues/1727#issuecomment-1111602802,https://api.github.com/repos/simonw/datasette/issues/1727,1111602802,IC_kwDOBm6k_c5CQbZy,9599,2022-04-28T00:21:35Z,2022-04-28T00:21:35Z,OWNER,"Tried this but I'm getting back an empty JSON array of traces at the bottom of the page most of the time (intermittently it works correctly):

```diff
diff --git a/datasette/database.py b/datasette/database.py
index ba594a8..d7f9172 100644
--- a/datasette/database.py
+++ b/datasette/database.py
@@ -7,7 +7,7 @@ import sys
 import threading
 import uuid
 
-from .tracer import trace
+from .tracer import trace, trace_child_tasks
 from .utils import (
     detect_fts,
     detect_primary_keys,
@@ -207,30 +207,31 @@ class Database:
                 time_limit_ms = custom_time_limit
 
             with sqlite_timelimit(conn, time_limit_ms):
-                try:
-                    cursor = conn.cursor()
-                    cursor.execute(sql, params if params is not None else {})
-                    max_returned_rows = self.ds.max_returned_rows
-                    if max_returned_rows == page_size:
-                        max_returned_rows += 1
-                    if max_returned_rows and truncate:
-                        rows = cursor.fetchmany(max_returned_rows + 1)
-                        truncated = len(rows) > max_returned_rows
-                        rows = rows[:max_returned_rows]
-                    else:
-                        rows = cursor.fetchall()
-                        truncated = False
-                except (sqlite3.OperationalError, sqlite3.DatabaseError) as e:
-                    if e.args == (""interrupted"",):
-                        raise QueryInterrupted(e, sql, params)
-                    if log_sql_errors:
-                        sys.stderr.write(
-                            ""ERROR: conn={}, sql = {}, params = {}: {}\n"".format(
-                                conn, repr(sql), params, e
+                with trace(""sql"", database=self.name, sql=sql.strip(), params=params):
+                    try:
+                        cursor = conn.cursor()
+                        cursor.execute(sql, params if params is not None else {})
+                        max_returned_rows = self.ds.max_returned_rows
+                        if max_returned_rows == page_size:
+                            max_returned_rows += 1
+                        if max_returned_rows and truncate:
+                            rows = cursor.fetchmany(max_returned_rows + 1)
+                            truncated = len(rows) > max_returned_rows
+                            rows = rows[:max_returned_rows]
+                        else:
+                            rows = cursor.fetchall()
+                            truncated = False
+                    except (sqlite3.OperationalError, sqlite3.DatabaseError) as e:
+                        if e.args == (""interrupted"",):
+                            raise QueryInterrupted(e, sql, params)
+                        if log_sql_errors:
+                            sys.stderr.write(
+                                ""ERROR: conn={}, sql = {}, params = {}: {}\n"".format(
+                                    conn, repr(sql), params, e
+                                )
                             )
-                        )
-                        sys.stderr.flush()
-                    raise
+                            sys.stderr.flush()
+                        raise
 
             if truncate:
                 return Results(rows, truncated, cursor.description)
@@ -238,9 +239,8 @@ class Database:
             else:
                 return Results(rows, False, cursor.description)
 
-        with trace(""sql"", database=self.name, sql=sql.strip(), params=params):
-            results = await self.execute_fn(sql_operation_in_thread)
-        return results
+        with trace_child_tasks():
+            return await self.execute_fn(sql_operation_in_thread)
 
     @property
     def size(self):
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117,
https://github.com/simonw/datasette/issues/1727#issuecomment-1111597176,https://api.github.com/repos/simonw/datasette/issues/1727,1111597176,IC_kwDOBm6k_c5CQaB4,9599,2022-04-28T00:11:44Z,2022-04-28T00:11:44Z,OWNER,"Though it would be interesting to also have the trace reveal how much time is spent in the functions that wrap that core SQL - the stuff that is being measured at the moment.

I have a hunch that this could help solve the over-arching performance mystery.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117,
https://github.com/simonw/datasette/issues/1727#issuecomment-1111595319,https://api.github.com/repos/simonw/datasette/issues/1727,1111595319,IC_kwDOBm6k_c5CQZk3,9599,2022-04-28T00:09:45Z,2022-04-28T00:11:01Z,OWNER,"Here's where read queries are instrumented: https://github.com/simonw/datasette/blob/7a6654a253dee243518dc542ce4c06dbb0d0801d/datasette/database.py#L241-L242

So the instrumentation is actually capturing quite a bit of Python activity before it gets to SQLite:

https://github.com/simonw/datasette/blob/7a6654a253dee243518dc542ce4c06dbb0d0801d/datasette/database.py#L179-L190

And then:

https://github.com/simonw/datasette/blob/7a6654a253dee243518dc542ce4c06dbb0d0801d/datasette/database.py#L204-L233

Ideally I'd like that `trace()` block to wrap just the `cursor.execute()` and `cursor.fetchmany(...)` or `cursor.fetchall()` calls.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117,
https://github.com/simonw/datasette/issues/1727#issuecomment-1111558204,https://api.github.com/repos/simonw/datasette/issues/1727,1111558204,IC_kwDOBm6k_c5CQQg8,9599,2022-04-27T22:58:39Z,2022-04-27T22:58:39Z,OWNER,"I should check my timing mechanism. Am I capturing the time taken just in SQLite or does it include time spent in Python crossing between async and threaded world and waiting for a thread pool worker to become available?

That could explain the longer query times.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117,
https://github.com/simonw/datasette/issues/1727#issuecomment-1111553029,https://api.github.com/repos/simonw/datasette/issues/1727,1111553029,IC_kwDOBm6k_c5CQPQF,9599,2022-04-27T22:48:21Z,2022-04-27T22:48:21Z,OWNER,I wonder if it would be worth exploring multiprocessing here.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117,
https://github.com/simonw/datasette/issues/1727#issuecomment-1111551076,https://api.github.com/repos/simonw/datasette/issues/1727,1111551076,IC_kwDOBm6k_c5CQOxk,9599,2022-04-27T22:44:51Z,2022-04-27T22:45:04Z,OWNER,Really wild idea: what if I created three copies of the SQLite database file - as three separate file names - and then balanced the parallel queries across all these? Any chance that could avoid any mysterious locking issues?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117,
https://github.com/simonw/datasette/issues/1727#issuecomment-1111535818,https://api.github.com/repos/simonw/datasette/issues/1727,1111535818,IC_kwDOBm6k_c5CQLDK,9599,2022-04-27T22:18:45Z,2022-04-27T22:18:45Z,OWNER,"Another avenue: https://twitter.com/weargoggles/status/1519426289920270337

> SQLite has its own mutexes to provide thread safety, which as another poster noted are out of play in multi process setups. Perhaps downgrading from the “serializable” to “multi-threaded” safety would be okay for Datasette? https://sqlite.org/c3ref/c_config_covering_index_scan.html#sqliteconfigmultithread

Doesn't look like there's an obvious way to access that from Python via the `sqlite3` module though.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117,
https://github.com/simonw/sqlite-utils/issues/159#issuecomment-1111506339,https://api.github.com/repos/simonw/sqlite-utils/issues/159,1111506339,IC_kwDOCGYnMM5CQD2j,154364,2022-04-27T21:35:13Z,2022-04-27T21:35:13Z,NONE,"Just stumbled across this, wondering why none of my deletes were working.","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",702386948,
https://github.com/simonw/datasette/issues/1727#issuecomment-1111485722,https://api.github.com/repos/simonw/datasette/issues/1727,1111485722,IC_kwDOBm6k_c5CP-0a,9599,2022-04-27T21:08:20Z,2022-04-27T21:08:20Z,OWNER,"Tried that and it didn't seem to make a difference either.

I really need a much deeper view of what's going on here.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117,
https://github.com/simonw/datasette/issues/1727#issuecomment-1111462442,https://api.github.com/repos/simonw/datasette/issues/1727,1111462442,IC_kwDOBm6k_c5CP5Iq,9599,2022-04-27T20:40:59Z,2022-04-27T20:42:49Z,OWNER,"This looks VERY relevant: [SQLite Shared-Cache Mode](https://www.sqlite.org/sharedcache.html):

> SQLite includes a special ""shared-cache"" mode (disabled by default) intended for use in embedded servers. If shared-cache mode is enabled and a thread establishes multiple connections to the same database, the connections share a single data and schema cache. This can significantly reduce the quantity of memory and IO required by the system.

Enabled as part of the URI filename:

    ATTACH 'file:aux.db?cache=shared' AS aux;

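From Python, the same URI syntax can be passed to `sqlite3.connect()` with `uri=True`. A minimal sketch, assuming a named in-memory database (`shared_demo` is a made-up name used only for illustration), in which two connections in one process share a single cache:

```python
import sqlite3

# Hypothetical database name; cache=shared asks SQLite to share one
# data and schema cache between connections in this process.
URI = 'file:shared_demo?mode=memory&cache=shared'


def open_shared():
    # uri=True makes sqlite3 interpret the filename as a URI
    return sqlite3.connect(URI, uri=True, check_same_thread=False)


first = open_shared()
second = open_shared()

first.execute('create table t (x integer)')
first.execute('insert into t values (1)')
first.commit()  # release the table lock so the other connection can read

rows = second.execute('select x from t').fetchall()
print(rows)
```
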
Turns out I'm already using this for in-memory databases that have `.memory_name` set, but not (yet) for regular file-backed databases:

https://github.com/simonw/datasette/blob/7a6654a253dee243518dc542ce4c06dbb0d0801d/datasette/database.py#L73-L75
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117,
https://github.com/simonw/datasette/issues/1727#issuecomment-1111460068,https://api.github.com/repos/simonw/datasette/issues/1727,1111460068,IC_kwDOBm6k_c5CP4jk,9599,2022-04-27T20:38:32Z,2022-04-27T20:38:32Z,OWNER,WAL mode didn't seem to make a difference. I thought there was a chance it might help multiple read connections operate at the same time but it looks like it really does only matter for when writes are going on.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117,
https://github.com/simonw/datasette/issues/1727#issuecomment-1111456500,https://api.github.com/repos/simonw/datasette/issues/1727,1111456500,IC_kwDOBm6k_c5CP3r0,9599,2022-04-27T20:36:01Z,2022-04-27T20:36:01Z,OWNER,"Yeah all of this is pretty much assuming read-only connections. Datasette has a separate mechanism for ensuring that writes are executed one at a time against a dedicated connection from an in-memory queue:
- https://github.com/simonw/datasette/issues/682","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117,
https://github.com/simonw/datasette/issues/1727#issuecomment-1111451790,https://api.github.com/repos/simonw/datasette/issues/1727,1111451790,IC_kwDOBm6k_c5CP2iO,716529,2022-04-27T20:30:33Z,2022-04-27T20:30:33Z,NONE,"> I should try seeing what happens with WAL mode enabled.

I've only skimmed above but it looks like you're doing mainly read-only queries?  WAL mode is about better interactions between writers & readers, primarily.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117,
https://github.com/simonw/datasette/issues/1727#issuecomment-1111448928,https://api.github.com/repos/simonw/datasette/issues/1727,1111448928,IC_kwDOBm6k_c5CP11g,716529,2022-04-27T20:27:05Z,2022-04-27T20:27:05Z,NONE,"You don't want to re-use an SQLite connection from multiple threads anyway: https://www.sqlite.org/threadsafe.html

Multiple connections can operate on the file in parallel, but a single connection can't:

> Multi-thread. In this mode, SQLite can be safely used by multiple threads **provided that no single database connection is used simultaneously in two or more threads**.

(emphasis mine)","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117,
https://github.com/simonw/datasette/issues/1727#issuecomment-1111442012,https://api.github.com/repos/simonw/datasette/issues/1727,1111442012,IC_kwDOBm6k_c5CP0Jc,9599,2022-04-27T20:19:00Z,2022-04-27T20:19:00Z,OWNER,"Something worth digging into: are these parallel queries running against the same SQLite connection or are they each running against a separate SQLite connection?

Just realized I know the answer: they're running against separate SQLite connections, because that's how the time limit mechanism works: it installs a progress handler for each connection which terminates it after a set time.

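A minimal sketch of that kind of per-connection time limit, using `set_progress_handler()` from the standard `sqlite3` module. The `time_limit` helper and the handler interval of 1000 are illustrative assumptions, not Datasette's actual `sqlite_timelimit` implementation:

```python
import sqlite3
import time
from contextlib import contextmanager


@contextmanager
def time_limit(conn, ms):
    # Hypothetical helper: abort any query on conn that runs longer than ms
    deadline = time.perf_counter() + ms / 1000

    def handler():
        # A non-zero return value tells SQLite to interrupt the query
        return 1 if time.perf_counter() >= deadline else 0

    # Call handler roughly every 1000 SQLite virtual machine instructions
    conn.set_progress_handler(handler, 1000)
    try:
        yield
    finally:
        # Passing None removes the handler again
        conn.set_progress_handler(None, 1000)


conn = sqlite3.connect(':memory:')
slow_sql = (
    'with recursive c(x) as '
    '(select 1 union all select x + 1 from c where x < 1000000000) '
    'select count(*) from c'
)
try:
    with time_limit(conn, 20):
        conn.execute(slow_sql).fetchall()
    interrupted = False
except sqlite3.OperationalError:
    # SQLite reports the aborted query as an 'interrupted' error
    interrupted = True
print(interrupted)
```
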
This means that if SQLite benefits from multiple threads using the same connection (due to shared caches or similar) then Datasette will not be seeing those benefits.

It also means that if there's some mechanism within SQLite that penalizes you for having multiple parallel connections to a single file (just guessing here, maybe there's some kind of locking going on?) then Datasette will suffer those penalties.

I should try seeing what happens with WAL mode enabled.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117,
https://github.com/simonw/datasette/issues/1727#issuecomment-1111432375,https://api.github.com/repos/simonw/datasette/issues/1727,1111432375,IC_kwDOBm6k_c5CPxy3,9599,2022-04-27T20:07:57Z,2022-04-27T20:07:57Z,OWNER,Also useful: https://avi.im/blag/2021/fast-sqlite-inserts/ - from a tip on Twitter: https://twitter.com/ricardoanderegg/status/1519402047556235264,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117,
https://github.com/simonw/datasette/issues/1727#issuecomment-1111431785,https://api.github.com/repos/simonw/datasette/issues/1727,1111431785,IC_kwDOBm6k_c5CPxpp,9599,2022-04-27T20:07:16Z,2022-04-27T20:07:16Z,OWNER,"I think I need some much more in-depth tracing tricks for this.

https://www.maartenbreddels.com/perf/jupyter/python/tracing/gil/2021/01/14/Tracing-the-Python-GIL.html looks relevant - uses the `perf` tool on Linux.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117,
https://github.com/simonw/datasette/issues/1727#issuecomment-1111408273,https://api.github.com/repos/simonw/datasette/issues/1727,1111408273,IC_kwDOBm6k_c5CPr6R,9599,2022-04-27T19:40:51Z,2022-04-27T19:42:17Z,OWNER,"Relevant: here's the code that sets up a Datasette SQLite connection: https://github.com/simonw/datasette/blob/7a6654a253dee243518dc542ce4c06dbb0d0801d/datasette/database.py#L73-L96

It's using `check_same_thread=False` - here's [the Python docs on that](https://docs.python.org/3/library/sqlite3.html#sqlite3.connect):

> By default, *check_same_thread* is [`True`](https://docs.python.org/3/library/constants.html#True ""True"") and only the creating thread may use the connection. If set [`False`](https://docs.python.org/3/library/constants.html#False ""False""), the returned connection may be shared across multiple threads. When using multiple threads with the same connection writing operations should be serialized by the user to avoid data corruption.

This is why Datasette reserves a single connection for write queries and queues them up in memory, [as described here](https://simonwillison.net/2020/Feb/26/weeknotes-datasette-writes/).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117,
https://github.com/simonw/datasette/issues/1727#issuecomment-1111390433,https://api.github.com/repos/simonw/datasette/issues/1727,1111390433,IC_kwDOBm6k_c5CPnjh,9599,2022-04-27T19:21:02Z,2022-04-27T19:21:02Z,OWNER,"One weird thing: I noticed that in the parallel trace above the SQL query bars are wider. Mouseover shows duration in ms, and I got 13ms for this query:

    select message as value, count(*) as n from (

But in the `?_noparallel=1` version that same query took 2.97ms.

Given those numbers though I would expect the overall page time to be MUCH worse for the parallel version - but the page load times are instead very close to each other, with parallel often winning.

This is super-weird.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117,
https://github.com/simonw/datasette/issues/1727#issuecomment-1111385875,https://api.github.com/repos/simonw/datasette/issues/1727,1111385875,IC_kwDOBm6k_c5CPmcT,9599,2022-04-27T19:16:57Z,2022-04-27T19:16:57Z,OWNER,"I just remembered the `--setting num_sql_threads` option... which defaults to 3! https://github.com/simonw/datasette/blob/942411ef946e9a34a2094944d3423cddad27efd3/datasette/app.py#L109-L113

Would explain why the first trace never seems to show more than three SQL queries executing at once.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117,
https://github.com/simonw/datasette/issues/1727#issuecomment-1111380282,https://api.github.com/repos/simonw/datasette/issues/1727,1111380282,IC_kwDOBm6k_c5CPlE6,9599,2022-04-27T19:10:27Z,2022-04-27T19:10:27Z,OWNER,"Wrote more about that here: https://simonwillison.net/2022/Apr/27/parallel-queries/

Compare https://latest-with-plugins.datasette.io/github/commits?_facet=repo&_facet=committer&_trace=1

![image](https://user-images.githubusercontent.com/9599/165601503-2083c5d2-d740-405c-b34d-85570744ca82.png)

With the same thing but with parallel execution disabled:

https://latest-with-plugins.datasette.io/github/commits?_facet=repo&_facet=committer&_trace=1&_noparallel=1

![image](https://user-images.githubusercontent.com/9599/165601525-98abbfb1-5631-4040-b6bd-700948d1db6e.png)

Those total page load time numbers are very similar. Is this parallel optimization worthwhile?

Maybe it's only worth it on larger databases? Or maybe larger databases perform worse with this?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117,
https://github.com/simonw/datasette/issues/1724#issuecomment-1110585475,https://api.github.com/repos/simonw/datasette/issues/1724,1110585475,IC_kwDOBm6k_c5CMjCD,9599,2022-04-27T06:15:14Z,2022-04-27T06:15:14Z,OWNER,"Yeah, that page is 438K (but only 20K gzipped).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1216619276,
https://github.com/simonw/datasette/issues/1724#issuecomment-1110370095,https://api.github.com/repos/simonw/datasette/issues/1724,1110370095,IC_kwDOBm6k_c5CLucv,9599,2022-04-27T00:18:30Z,2022-04-27T00:18:30Z,OWNER,"So this isn't a bug here, it's working as intended.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1216619276,
https://github.com/simonw/datasette/issues/1724#issuecomment-1110369004,https://api.github.com/repos/simonw/datasette/issues/1724,1110369004,IC_kwDOBm6k_c5CLuLs,9599,2022-04-27T00:16:35Z,2022-04-27T00:17:04Z,OWNER,"I bet this is because it's exceeding the size limit: https://github.com/simonw/datasette/blob/da53e0360da4771ffb56a8e3eb3f7476f3168299/datasette/tracer.py#L80-L88

https://github.com/simonw/datasette/blob/da53e0360da4771ffb56a8e3eb3f7476f3168299/datasette/tracer.py#L102-L113","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1216619276,
https://github.com/simonw/datasette/issues/1723#issuecomment-1110330554,https://api.github.com/repos/simonw/datasette/issues/1723,1110330554,IC_kwDOBm6k_c5CLky6,9599,2022-04-26T23:06:20Z,2022-04-26T23:06:20Z,OWNER,Deployed here: https://latest-with-plugins.datasette.io/github/commits?_facet=repo&_trace=1&_facet=committer,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1216508080,
https://github.com/simonw/datasette/issues/1723#issuecomment-1110305790,https://api.github.com/repos/simonw/datasette/issues/1723,1110305790,IC_kwDOBm6k_c5CLev-,9599,2022-04-26T22:19:04Z,2022-04-26T22:19:04Z,OWNER,"I realized that seeing the total time in queries wasn't enough to understand this, because if the queries were executed in serial or parallel it should still sum up to the same amount of SQL time (roughly).

Instead I need to know how long the page took to render. But that's hard to display on the page since you can't measure it until rendering has finished!

So I built an ASGI plugin to handle that measurement: https://github.com/simonw/datasette-total-page-time

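The core idea can be sketched as an ASGI wrapper that records a start time and reports the elapsed time once the final body chunk has been sent. This is an illustration only, not the plugin's code; `timed`, `demo_app` and the `report` callback are made-up names:

```python
import asyncio
import time


def timed(app, report):
    # Wrap an ASGI app; call report(ms) after the last body chunk is sent
    async def wrapper(scope, receive, send):
        if scope['type'] != 'http':
            await app(scope, receive, send)
            return
        start = time.perf_counter()

        async def timed_send(message):
            await send(message)
            is_body = message['type'] == 'http.response.body'
            if is_body and not message.get('more_body', False):
                report((time.perf_counter() - start) * 1000)

        await app(scope, receive, timed_send)

    return wrapper


async def demo_app(scope, receive, send):
    # Stand-in for a slow page render
    await asyncio.sleep(0.05)
    await send({'type': 'http.response.start', 'status': 200, 'headers': []})
    await send({'type': 'http.response.body', 'body': b'ok'})


async def main():
    durations, sent = [], []

    async def receive():
        return {'type': 'http.request', 'body': b'', 'more_body': False}

    async def send(message):
        sent.append(message)

    await timed(demo_app, durations.append)({'type': 'http'}, receive, send)
    return durations, sent


durations, sent = asyncio.run(main())
print(round(durations[0]), 'ms')
```
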
And with that plugin installed, `http://127.0.0.1:8001/global-power-plants/global-power-plants?_facet=primary_fuel&_facet=other_fuel2&_facet=other_fuel1&_parallel=1` (the parallel version) takes 377ms:

<img width=""543"" alt=""CleanShot 2022-04-26 at 15 17 38@2x"" src=""https://user-images.githubusercontent.com/9599/165401856-d592ed7a-0240-4514-b9d8-fb9e7d8c9629.png"">

While `http://127.0.0.1:8001/global-power-plants/global-power-plants?_facet=primary_fuel&_facet=other_fuel2&_facet=other_fuel1` (the serial version) takes 762ms:

<img width=""543"" alt=""image"" src=""https://user-images.githubusercontent.com/9599/165401933-6d647014-4cab-4fbd-b9aa-958fc24ff435.png"">
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1216508080,
https://github.com/simonw/datasette/issues/1723#issuecomment-1110279869,https://api.github.com/repos/simonw/datasette/issues/1723,1110279869,IC_kwDOBm6k_c5CLYa9,9599,2022-04-26T21:45:39Z,2022-04-26T21:45:39Z,OWNER,"Getting some nice traces out of this:

<img width=""1384"" alt=""CleanShot 2022-04-26 at 14 45 21@2x"" src=""https://user-images.githubusercontent.com/9599/165397745-e8bfbe0a-306f-45bd-81f1-f5f6fc6422b9.png"">

","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1216508080,
https://github.com/simonw/datasette/issues/1723#issuecomment-1110278577,https://api.github.com/repos/simonw/datasette/issues/1723,1110278577,IC_kwDOBm6k_c5CLYGx,9599,2022-04-26T21:44:04Z,2022-04-26T21:44:04Z,OWNER,"And some simple benchmarks with `ab` - using the `?_parallel=1` hack to try it with and without a parallel `asyncio.gather()`:

```
~ % ab -n 100 'http://127.0.0.1:8001/global-power-plants/global-power-plants?_facet=primary_fuel&_facet=other_fuel1&_facet=other_fuel3&_facet=other_fuel2'                    
This is ApacheBench, Version 2.3 <$Revision: 1879490 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 127.0.0.1 (be patient).....done


Server Software:        uvicorn
Server Hostname:        127.0.0.1
Server Port:            8001

Document Path:          /global-power-plants/global-power-plants?_facet=primary_fuel&_facet=other_fuel1&_facet=other_fuel3&_facet=other_fuel2
Document Length:        314187 bytes

Concurrency Level:      1
Time taken for tests:   68.279 seconds
Complete requests:      100
Failed requests:        13
   (Connect: 0, Receive: 0, Length: 13, Exceptions: 0)
Total transferred:      31454937 bytes
HTML transferred:       31418437 bytes
Requests per second:    1.46 [#/sec] (mean)
Time per request:       682.787 [ms] (mean)
Time per request:       682.787 [ms] (mean, across all concurrent requests)
Transfer rate:          449.89 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       0
Processing:   621  683  68.0    658     993
Waiting:      620  682  68.0    657     992
Total:        621  683  68.0    658     993

Percentage of the requests served within a certain time (ms)
  50%    658
  66%    678
  75%    687
  80%    711
  90%    763
  95%    879
  98%    926
  99%    993
 100%    993 (longest request)


----

In parallel:

~ % ab -n 100 'http://127.0.0.1:8001/global-power-plants/global-power-plants?_facet=primary_fuel&_facet=other_fuel1&_facet=other_fuel3&_facet=other_fuel2&_parallel=1'
This is ApacheBench, Version 2.3 <$Revision: 1879490 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 127.0.0.1 (be patient).....done


Server Software:        uvicorn
Server Hostname:        127.0.0.1
Server Port:            8001

Document Path:          /global-power-plants/global-power-plants?_facet=primary_fuel&_facet=other_fuel1&_facet=other_fuel3&_facet=other_fuel2&_parallel=1
Document Length:        315703 bytes

Concurrency Level:      1
Time taken for tests:   34.763 seconds
Complete requests:      100
Failed requests:        11
   (Connect: 0, Receive: 0, Length: 11, Exceptions: 0)
Total transferred:      31607988 bytes
HTML transferred:       31570288 bytes
Requests per second:    2.88 [#/sec] (mean)
Time per request:       347.632 [ms] (mean)
Time per request:       347.632 [ms] (mean, across all concurrent requests)
Transfer rate:          887.93 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       0
Processing:   311  347  28.0    338     450
Waiting:      311  347  28.0    338     450
Total:        312  348  28.0    338     451

Percentage of the requests served within a certain time (ms)
  50%    338
  66%    348
  75%    361
  80%    367
  90%    396
  95%    408
  98%    436
  99%    451
 100%    451 (longest request)

----

With concurrency 10, not parallel:

~ % ab -c 10 -n 100 'http://127.0.0.1:8001/global-power-plants/global-power-plants?_facet=primary_fuel&_facet=other_fuel1&_facet=other_fuel3&_facet=other_fuel2&_parallel=' 
This is ApacheBench, Version 2.3 <$Revision: 1879490 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 127.0.0.1 (be patient).....done


Server Software:        uvicorn
Server Hostname:        127.0.0.1
Server Port:            8001

Document Path:          /global-power-plants/global-power-plants?_facet=primary_fuel&_facet=other_fuel1&_facet=other_fuel3&_facet=other_fuel2&_parallel=
Document Length:        314346 bytes

Concurrency Level:      10
Time taken for tests:   38.408 seconds
Complete requests:      100
Failed requests:        93
   (Connect: 0, Receive: 0, Length: 93, Exceptions: 0)
Total transferred:      31471333 bytes
HTML transferred:       31433733 bytes
Requests per second:    2.60 [#/sec] (mean)
Time per request:       3840.829 [ms] (mean)
Time per request:       384.083 [ms] (mean, across all concurrent requests)
Transfer rate:          800.18 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.1      0       1
Processing:   685 3719 354.0   3774    4096
Waiting:      684 3707 353.7   3750    4095
Total:        685 3719 354.0   3774    4096

Percentage of the requests served within a certain time (ms)
  50%   3774
  66%   3832
  75%   3855
  80%   3878
  90%   3944
  95%   4006
  98%   4057
  99%   4096
 100%   4096 (longest request)


----

Concurrency 10 parallel:

~ % ab -c 10 -n 100 'http://127.0.0.1:8001/global-power-plants/global-power-plants?_facet=primary_fuel&_facet=other_fuel1&_facet=other_fuel3&_facet=other_fuel2&_parallel=1'
This is ApacheBench, Version 2.3 <$Revision: 1879490 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 127.0.0.1 (be patient).....done


Server Software:        uvicorn
Server Hostname:        127.0.0.1
Server Port:            8001

Document Path:          /global-power-plants/global-power-plants?_facet=primary_fuel&_facet=other_fuel1&_facet=other_fuel3&_facet=other_fuel2&_parallel=1
Document Length:        315703 bytes

Concurrency Level:      10
Time taken for tests:   36.762 seconds
Complete requests:      100
Failed requests:        89
   (Connect: 0, Receive: 0, Length: 89, Exceptions: 0)
Total transferred:      31606516 bytes
HTML transferred:       31568816 bytes
Requests per second:    2.72 [#/sec] (mean)
Time per request:       3676.182 [ms] (mean)
Time per request:       367.618 [ms] (mean, across all concurrent requests)
Transfer rate:          839.61 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.1      0       0
Processing:   381 3602 419.6   3609    4458
Waiting:      381 3586 418.7   3607    4457
Total:        381 3603 419.6   3609    4458

Percentage of the requests served within a certain time (ms)
  50%   3609
  66%   3741
  75%   3791
  80%   3821
  90%   3972
  95%   4074
  98%   4386
  99%   4458
 100%   4458 (longest request)


Trying -c 3 instead. Non parallel:

~ % ab -c 3 -n 100 'http://127.0.0.1:8001/global-power-plants/global-power-plants?_facet=primary_fuel&_facet=other_fuel1&_facet=other_fuel3&_facet=other_fuel2&_parallel='
This is ApacheBench, Version 2.3 <$Revision: 1879490 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 127.0.0.1 (be patient).....done


Server Software:        uvicorn
Server Hostname:        127.0.0.1
Server Port:            8001

Document Path:          /global-power-plants/global-power-plants?_facet=primary_fuel&_facet=other_fuel1&_facet=other_fuel3&_facet=other_fuel2&_parallel=
Document Length:        314346 bytes

Concurrency Level:      3
Time taken for tests:   39.365 seconds
Complete requests:      100
Failed requests:        83
   (Connect: 0, Receive: 0, Length: 83, Exceptions: 0)
Total transferred:      31470808 bytes
HTML transferred:       31433208 bytes
Requests per second:    2.54 [#/sec] (mean)
Time per request:       1180.955 [ms] (mean)
Time per request:       393.652 [ms] (mean, across all concurrent requests)
Transfer rate:          780.72 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       0
Processing:   731 1153 126.2   1189    1359
Waiting:      730 1151 125.9   1188    1358
Total:        731 1153 126.2   1189    1359

Percentage of the requests served within a certain time (ms)
  50%   1189
  66%   1221
  75%   1234
  80%   1247
  90%   1296
  95%   1309
  98%   1343
  99%   1359
 100%   1359 (longest request)

----

Parallel:

~ % ab -c 3 -n 100 'http://127.0.0.1:8001/global-power-plants/global-power-plants?_facet=primary_fuel&_facet=other_fuel1&_facet=other_fuel3&_facet=other_fuel2&_parallel=1'
This is ApacheBench, Version 2.3 <$Revision: 1879490 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 127.0.0.1 (be patient).....done


Server Software:        uvicorn
Server Hostname:        127.0.0.1
Server Port:            8001

Document Path:          /global-power-plants/global-power-plants?_facet=primary_fuel&_facet=other_fuel1&_facet=other_fuel3&_facet=other_fuel2&_parallel=1
Document Length:        315703 bytes

Concurrency Level:      3
Time taken for tests:   34.530 seconds
Complete requests:      100
Failed requests:        18
   (Connect: 0, Receive: 0, Length: 18, Exceptions: 0)
Total transferred:      31606179 bytes
HTML transferred:       31568479 bytes
Requests per second:    2.90 [#/sec] (mean)
Time per request:       1035.902 [ms] (mean)
Time per request:       345.301 [ms] (mean, across all concurrent requests)
Transfer rate:          893.87 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       0
Processing:   412 1020 104.4   1018    1280
Waiting:      411 1018 104.1   1014    1275
Total:        412 1021 104.4   1018    1280

Percentage of the requests served within a certain time (ms)
  50%   1018
  66%   1041
  75%   1061
  80%   1079
  90%   1136
  95%   1176
  98%   1251
  99%   1280
 100%   1280 (longest request)
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1216508080,
https://github.com/simonw/datasette/issues/1723#issuecomment-1110278182,https://api.github.com/repos/simonw/datasette/issues/1723,1110278182,IC_kwDOBm6k_c5CLYAm,9599,2022-04-26T21:43:34Z,2022-04-26T21:43:34Z,OWNER,"Here's the diff I'm using:
```diff
diff --git a/datasette/views/table.py b/datasette/views/table.py
index d66adb8..f15ef1e 100644
--- a/datasette/views/table.py
+++ b/datasette/views/table.py
@@ -1,3 +1,4 @@
+import asyncio
 import itertools
 import json
 
@@ -5,6 +6,7 @@ import markupsafe
 
 from datasette.plugins import pm
 from datasette.database import QueryInterrupted
+from datasette import tracer
 from datasette.utils import (
     await_me_maybe,
     CustomRow,
@@ -150,6 +152,16 @@ class TableView(DataView):
         default_labels=False,
         _next=None,
         _size=None,
+    ):
+        with tracer.trace_child_tasks():
+            return await self._data_traced(request, default_labels, _next, _size)
+
+    async def _data_traced(
+        self,
+        request,
+        default_labels=False,
+        _next=None,
+        _size=None,
     ):
         database_route = tilde_decode(request.url_vars[""database""])
         table_name = tilde_decode(request.url_vars[""table""])
@@ -159,6 +171,20 @@ class TableView(DataView):
             raise NotFound(""Database not found: {}"".format(database_route))
         database_name = db.name
 
+        # For performance profiling purposes, ?_parallel=1 turns on asyncio.gather
+        async def _gather_parallel(*args):
+            return await asyncio.gather(*args)
+
+        async def _gather_sequential(*args):
+            results = []
+            for fn in args:
+                results.append(await fn)
+            return results
+
+        gather = (
+            _gather_parallel if request.args.get(""_parallel"") else _gather_sequential
+        )
+
         # If this is a canned query, not a table, then dispatch to QueryView instead
         canned_query = await self.ds.get_canned_query(
             database_name, table_name, request.actor
@@ -174,8 +200,12 @@ class TableView(DataView):
                 write=bool(canned_query.get(""write"")),
             )
 
-        is_view = bool(await db.get_view_definition(table_name))
-        table_exists = bool(await db.table_exists(table_name))
+        is_view, table_exists = map(
+            bool,
+            await gather(
+                db.get_view_definition(table_name), db.table_exists(table_name)
+            ),
+        )
 
         # If table or view not found, return 404
         if not is_view and not table_exists:
@@ -497,33 +527,44 @@ class TableView(DataView):
                 )
             )
 
-        if not nofacet:
-            for facet in facet_instances:
-                (
+        async def execute_facets():
+            if not nofacet:
+                # Run them in parallel
+                facet_awaitables = [facet.facet_results() for facet in facet_instances]
+                facet_awaitable_results = await gather(*facet_awaitables)
+                for (
                     instance_facet_results,
                     instance_facets_timed_out,
-                ) = await facet.facet_results()
-                for facet_info in instance_facet_results:
-                    base_key = facet_info[""name""]
-                    key = base_key
-                    i = 1
-                    while key in facet_results:
-                        i += 1
-                        key = f""{base_key}_{i}""
-                    facet_results[key] = facet_info
-                facets_timed_out.extend(instance_facets_timed_out)
-
-        # Calculate suggested facets
+                ) in facet_awaitable_results:
+                    for facet_info in instance_facet_results:
+                        base_key = facet_info[""name""]
+                        key = base_key
+                        i = 1
+                        while key in facet_results:
+                            i += 1
+                            key = f""{base_key}_{i}""
+                        facet_results[key] = facet_info
+                    facets_timed_out.extend(instance_facets_timed_out)
+
         suggested_facets = []
-        if (
-            self.ds.setting(""suggest_facets"")
-            and self.ds.setting(""allow_facet"")
-            and not _next
-            and not nofacet
-            and not nosuggest
-        ):
-            for facet in facet_instances:
-                suggested_facets.extend(await facet.suggest())
+
+        async def execute_suggested_facets():
+            # Calculate suggested facets
+            if (
+                self.ds.setting(""suggest_facets"")
+                and self.ds.setting(""allow_facet"")
+                and not _next
+                and not nofacet
+                and not nosuggest
+            ):
+                # Run them in parallel
+                facet_suggest_awaitables = [
+                    facet.suggest() for facet in facet_instances
+                ]
+                for suggest_result in await gather(*facet_suggest_awaitables):
+                    suggested_facets.extend(suggest_result)
+
+        await gather(execute_facets(), execute_suggested_facets())
 
         # Figure out columns and rows for the query
         columns = [r[0] for r in results.description]
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1216508080,
https://github.com/simonw/datasette/issues/1715#issuecomment-1110265087,https://api.github.com/repos/simonw/datasette/issues/1715,1110265087,IC_kwDOBm6k_c5CLUz_,9599,2022-04-26T21:26:17Z,2022-04-26T21:26:17Z,OWNER,"Running facets and facet suggestions in parallel using `asyncio.gather()` turns out to be a lot less hassle than I had thought - maybe I don't need `asyncinject` for this at all?
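
To show the pattern in isolation, here's a minimal standalone sketch. The `FakeFacet` class and its return values are made up for illustration; they only mimic the shape of `facet_results()` and `suggest()` used in the diff that follows.

```python
import asyncio


class FakeFacet:
    # Hypothetical stand-in for a Datasette facet class, for illustration only
    def __init__(self, name):
        self.name = name

    async def facet_results(self):
        await asyncio.sleep(0.01)  # simulate waiting on a SQL query
        # Same shape as in the diff: (results, timed_out)
        return [{'name': self.name}], []

    async def suggest(self):
        await asyncio.sleep(0.01)
        return [{'name': self.name + '_suggested'}]


async def run_facets(facet_instances):
    facet_results = {}
    facets_timed_out = []
    suggested_facets = []

    # Start every awaitable at once; gather() returns results in input order
    results = await asyncio.gather(
        *[facet.facet_results() for facet in facet_instances]
    )
    for instance_facet_results, instance_facets_timed_out in results:
        for facet_info in instance_facet_results:
            facet_results[facet_info['name']] = facet_info
        facets_timed_out.extend(instance_facets_timed_out)

    for suggest_result in await asyncio.gather(
        *[facet.suggest() for facet in facet_instances]
    ):
        suggested_facets.extend(suggest_result)

    return facet_results, facets_timed_out, suggested_facets


facets = [FakeFacet('primary_fuel'), FakeFacet('other_fuel1')]
facet_results, timed_out, suggested = asyncio.run(run_facets(facets))
```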

```diff
         if not nofacet:
-            for facet in facet_instances:
-                (
-                    instance_facet_results,
-                    instance_facets_timed_out,
-                ) = await facet.facet_results()
+            # Run them in parallel
+            facet_awaitables = [facet.facet_results() for facet in facet_instances]
+            facet_awaitable_results = await asyncio.gather(*facet_awaitables)
+            for (
+                instance_facet_results,
+                instance_facets_timed_out,
+            ) in facet_awaitable_results:
                 for facet_info in instance_facet_results:
                     base_key = facet_info[""name""]
                     key = base_key
@@ -522,8 +540,10 @@ class TableView(DataView):
             and not nofacet
             and not nosuggest
         ):
-            for facet in facet_instances:
-                suggested_facets.extend(await facet.suggest())
+            # Run them in parallel
+            facet_suggest_awaitables = [facet.suggest() for facet in facet_instances]
+            for suggest_result in await asyncio.gather(*facet_suggest_awaitables):
+                suggested_facets.extend(suggest_result)
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1212823665,
https://github.com/simonw/datasette/issues/1715#issuecomment-1110246593,https://api.github.com/repos/simonw/datasette/issues/1715,1110246593,IC_kwDOBm6k_c5CLQTB,9599,2022-04-26T21:03:56Z,2022-04-26T21:03:56Z,OWNER,"Well this is fun... I applied this change:

```diff
diff --git a/datasette/views/table.py b/datasette/views/table.py
index d66adb8..85f9e44 100644
--- a/datasette/views/table.py
+++ b/datasette/views/table.py
@@ -1,3 +1,4 @@
+import asyncio
 import itertools
 import json
 
@@ -5,6 +6,7 @@ import markupsafe
 
 from datasette.plugins import pm
 from datasette.database import QueryInterrupted
+from datasette import tracer
 from datasette.utils import (
     await_me_maybe,
     CustomRow,
@@ -174,8 +176,11 @@ class TableView(DataView):
                 write=bool(canned_query.get(""write"")),
             )
 
-        is_view = bool(await db.get_view_definition(table_name))
-        table_exists = bool(await db.table_exists(table_name))
+        with tracer.trace_child_tasks():
+            is_view, table_exists = map(bool, await asyncio.gather(
+                db.get_view_definition(table_name),
+                db.table_exists(table_name)
+            ))
 
         # If table or view not found, return 404
         if not is_view and not table_exists:
```
And now using https://datasette.io/plugins/datasette-pretty-traces I get this:

![CleanShot 2022-04-26 at 14 03 33@2x](https://user-images.githubusercontent.com/9599/165392009-84c4399d-3e94-46d4-ba7b-a64a116cac5c.png)

","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1212823665,
https://github.com/simonw/datasette/issues/1715#issuecomment-1110219185,https://api.github.com/repos/simonw/datasette/issues/1715,1110219185,IC_kwDOBm6k_c5CLJmx,9599,2022-04-26T20:28:40Z,2022-04-26T20:56:48Z,OWNER,"The refactor I did in #1719 pretty much clashes with all of the changes in https://github.com/simonw/datasette/commit/5053f1ea83194ecb0a5693ad5dada5b25bf0f7e6 so I'll probably need to start my `api-extras` branch again from scratch.

Using a new `tableview-asyncinject` branch.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1212823665,
https://github.com/simonw/datasette/issues/1715#issuecomment-1110239536,https://api.github.com/repos/simonw/datasette/issues/1715,1110239536,IC_kwDOBm6k_c5CLOkw,9599,2022-04-26T20:54:53Z,2022-04-26T20:54:53Z,OWNER,`pytest tests/test_table_*` runs the tests quickly.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1212823665,
https://github.com/simonw/datasette/issues/1715#issuecomment-1110238896,https://api.github.com/repos/simonw/datasette/issues/1715,1110238896,IC_kwDOBm6k_c5CLOaw,9599,2022-04-26T20:53:59Z,2022-04-26T20:53:59Z,OWNER,I'm going to rename `database` to `database_name` and `table` to `table_name` to avoid confusion with the `Database` object as opposed to the string name for the database.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1212823665,
https://github.com/simonw/datasette/issues/1715#issuecomment-1110229319,https://api.github.com/repos/simonw/datasette/issues/1715,1110229319,IC_kwDOBm6k_c5CLMFH,9599,2022-04-26T20:41:32Z,2022-04-26T20:44:38Z,OWNER,"This time I'm not going to bother with the `filter_args` thing - I'm going to just try to use `asyncinject` to execute some big high level things in parallel - facets, suggested facets, counts, the query - and then combine it with the `extras` mechanism I'm trying to introduce too.

Most importantly: I want that `extra_template()` function that adds more template context for the HTML to be executed as part of an `asyncinject` flow!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1212823665,
https://github.com/simonw/datasette/issues/1720#issuecomment-1110212021,https://api.github.com/repos/simonw/datasette/issues/1720,1110212021,IC_kwDOBm6k_c5CLH21,9599,2022-04-26T20:20:27Z,2022-04-26T20:20:27Z,OWNER,Closing this because I have a good enough idea of the design for now - the details of the parameters can be figured out when I implement this.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1215174094,
https://github.com/simonw/datasette/issues/1720#issuecomment-1109309683,https://api.github.com/repos/simonw/datasette/issues/1720,1109309683,IC_kwDOBm6k_c5CHrjz,9599,2022-04-26T04:12:39Z,2022-04-26T04:12:39Z,OWNER,"I think the rough shape of the three plugin hooks is right. The detailed decisions that are needed concern what the parameters should be, which I think will mainly happen as part of:

- #1715","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1215174094,
https://github.com/simonw/datasette/issues/1720#issuecomment-1109306070,https://api.github.com/repos/simonw/datasette/issues/1720,1109306070,IC_kwDOBm6k_c5CHqrW,9599,2022-04-26T04:05:20Z,2022-04-26T04:05:20Z,OWNER,"The proposed plugin for annotations - allowing users to attach comments to database tables, columns and rows - would be a great application for all three of those `?_extra=` plugin hooks.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1215174094,
https://github.com/simonw/datasette/issues/1720#issuecomment-1109305184,https://api.github.com/repos/simonw/datasette/issues/1720,1109305184,IC_kwDOBm6k_c5CHqdg,9599,2022-04-26T04:03:35Z,2022-04-26T04:03:35Z,OWNER,I bet there's all kinds of interesting potential extras that could be calculated by loading the results of the query into a Pandas DataFrame.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1215174094,
https://github.com/simonw/datasette/issues/1720#issuecomment-1109200774,https://api.github.com/repos/simonw/datasette/issues/1720,1109200774,IC_kwDOBm6k_c5CHQ-G,9599,2022-04-26T01:25:43Z,2022-04-26T01:26:15Z,OWNER,"Had a thought: if a custom HTML template is going to make use of stuff generated using these extras, it will need a way to tell Datasette to execute those extras even in the absence of the `?_extra=...` URL parameters.

Is that necessary? Or should those kinds of plugins use the existing `extra_template_vars` hook instead?

Or maybe the `extra_template_vars` hook gets redesigned so it can depend on other `extras` in some way?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1215174094,
https://github.com/simonw/datasette/issues/1720#issuecomment-1109200335,https://api.github.com/repos/simonw/datasette/issues/1720,1109200335,IC_kwDOBm6k_c5CHQ3P,9599,2022-04-26T01:24:47Z,2022-04-26T01:24:47Z,OWNER,"Sketching out a `?_extra=statistics` table plugin:

```python
from datasette import hookimpl

@hookimpl
def register_table_extras(datasette):
    return [statistics]

async def statistics(datasette, query, columns, sql):
    # ... need to figure out which columns are integer/floats
    # then build and execute a SQL query that calculates sum/avg/etc for each column
    ...
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1215174094,
https://github.com/simonw/sqlite-utils/issues/428#issuecomment-1109190401,https://api.github.com/repos/simonw/sqlite-utils/issues/428,1109190401,IC_kwDOCGYnMM5CHOcB,9599,2022-04-26T01:05:29Z,2022-04-26T01:05:29Z,OWNER,Django makes extensive use of savepoints for nested transactions: https://docs.djangoproject.com/en/4.0/topics/db/transactions/#savepoints,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1215216249,
https://github.com/simonw/datasette/issues/1720#issuecomment-1109174715,https://api.github.com/repos/simonw/datasette/issues/1720,1109174715,IC_kwDOBm6k_c5CHKm7,9599,2022-04-26T00:40:13Z,2022-04-26T00:43:33Z,OWNER,"Some of the things I'd like to use `?_extra=` for, that may or may not make sense as plugins:

- Performance breakdown information, maybe including explain output for a query/table
- Information about the tables that were consulted in a query - imagine pulling in additional table metadata
- Statistical aggregates against the full set of results. This may well be a Datasette core feature at some point in the future, but being able to provide it early as a plugin would be really cool.
- For tables, what are the other tables they can join against?
- Suggested facets
- Facet results themselves
- New custom facets I haven't thought of - though the `register_facet_classes` hook covers that already
- Table schema
- Table metadata
- Analytics - how many times has this table been queried? Would be a plugin thing
- For geospatial data, how about a GeoJSON polygon that represents the bounding box for all returned results? Effectively this is an extra aggregation.

Looking at https://github-to-sqlite.dogsheep.net/github/commits.json?_labels=on&_shape=objects for inspiration.

I think there's a separate potential mechanism in the future that lets you add custom columns to a table. This would affect `.csv` and the HTML presentation too, which makes it a different concept from the `?_extra=` hook that affects the JSON export (and the context that is fed to the HTML templates).","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1215174094,
https://github.com/simonw/datasette/issues/1720#issuecomment-1109171871,https://api.github.com/repos/simonw/datasette/issues/1720,1109171871,IC_kwDOBm6k_c5CHJ6f,9599,2022-04-26T00:34:48Z,2022-04-26T00:34:48Z,OWNER,"Let's try sketching out a `register_table_extras` plugin for something new.

The first idea I came up with suggests adding new fields to the individual row records that come back - my mental model for extras so far has been that they add new keys to the root object.

So if a table result looked like this:

```json
{
  ""rows"": [
    {""id"": 1, ""name"": ""Cleo""},
    {""id"": 2, ""name"": ""Suna""}
  ],
  ""next_url"": null
}
```
I was initially thinking that `?_extra=facets` would add a `""facets"": {...}` key to that root object.

Here's a plugin idea I came up with that would probably justify adding to the individual row objects instead:

- `?_extra=check404s` - does an async `HEAD` request against every column value that looks like a URL and checks if it returns a 404

This could also work by adding a `""check404s"": {""url-here"": 200}` key to the root object though.
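
To make the two shapes concrete, here's a small sketch in plain Python. The `status_codes` values and the `example.com` URLs are invented, and using `check404s` as the per-row key name is only an assumption about how such a plugin might label its output.

```python
import copy

table_result = {
    'rows': [
        {'id': 1, 'name': 'Cleo', 'url': 'https://example.com/cleo'},
        {'id': 2, 'name': 'Suna', 'url': 'https://example.com/suna'},
    ],
    'next_url': None,
}

# Pretend results of the HEAD requests (made-up values)
status_codes = {
    'https://example.com/cleo': 200,
    'https://example.com/suna': 404,
}


def add_to_root(result, codes):
    # Option 1: the extra adds one new key to the root object
    out = copy.deepcopy(result)
    out['check404s'] = dict(codes)
    return out


def add_to_rows(result, codes):
    # Option 2: the extra annotates each individual row
    out = copy.deepcopy(result)
    for row in out['rows']:
        row['check404s'] = {
            value: codes[value] for value in row.values() if value in codes
        }
    return out


root_shape = add_to_root(table_result, status_codes)
row_shape = add_to_rows(table_result, status_codes)
```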

I think I need some better plugin concepts before committing to this new hook. There's overlap between this and how I want the enrichments mechanism ([see here](https://simonwillison.net/2021/Jan/17/weeknotes-still-pretty-distracted/)) to work.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1215174094,
https://github.com/simonw/datasette/issues/1720#issuecomment-1109165411,https://api.github.com/repos/simonw/datasette/issues/1720,1109165411,IC_kwDOBm6k_c5CHIVj,9599,2022-04-26T00:22:42Z,2022-04-26T00:22:42Z,OWNER,Passing `pk_values` to the plugin hook feels odd. I think I'd pass a `row` object instead and let the code look up the primary key values on that row (by introspecting the primary keys for the table).,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1215174094,
https://github.com/simonw/datasette/issues/1720#issuecomment-1109164803,https://api.github.com/repos/simonw/datasette/issues/1720,1109164803,IC_kwDOBm6k_c5CHIMD,9599,2022-04-26T00:21:40Z,2022-04-26T00:21:40Z,OWNER,"What would the existing https://latest.datasette.io/fixtures/simple_primary_key/1.json?_extras=foreign_key_tables feature look like if it was re-imagined as a `register_row_extras()` plugin?

Rough sketch, copying most of the code from https://github.com/simonw/datasette/blob/579f59dcec43a91dd7d404e00b87a00afd8515f2/datasette/views/row.py#L98

```python
from datasette import hookimpl
from datasette.database import QueryInterrupted
from datasette.utils import escape_sqlite

@hookimpl
def register_row_extras(datasette):
    return [foreign_key_tables]

async def foreign_key_tables(datasette, database, table, pk_values):
    if len(pk_values) != 1:
        return []
    db = datasette.get_database(database)
    all_foreign_keys = await db.get_all_foreign_keys()
    foreign_keys = all_foreign_keys[table][""incoming""]
    if len(foreign_keys) == 0:
        return []

    sql = ""select "" + "", "".join(
        [
            ""(select count(*) from {table} where {column}=:id)"".format(
                table=escape_sqlite(fk[""other_table""]),
                column=escape_sqlite(fk[""other_column""]),
            )
            for fk in foreign_keys
        ]
    )
    try:
        rows = list(await db.execute(sql, {""id"": pk_values[0]}))
    except QueryInterrupted:
        # Almost certainly hit the timeout
        return []

    foreign_table_counts = dict(
        zip(
            [(fk[""other_table""], fk[""other_column""]) for fk in foreign_keys],
            list(rows[0]),
        )
    )
    foreign_key_tables = []
    for fk in foreign_keys:
        count = (
            foreign_table_counts.get((fk[""other_table""], fk[""other_column""])) or 0
        )
        key = fk[""other_column""]
        if key.startswith(""_""):
            key += ""__exact""
        link = ""{}?{}={}"".format(
            datasette.urls.table(database, fk[""other_table""]),
            key,
            "","".join(pk_values),
        )
        foreign_key_tables.append({**fk, **{""count"": count, ""link"": link}})
    return foreign_key_tables
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1215174094,
https://github.com/simonw/datasette/issues/1720#issuecomment-1109162123,https://api.github.com/repos/simonw/datasette/issues/1720,1109162123,IC_kwDOBm6k_c5CHHiL,9599,2022-04-26T00:16:42Z,2022-04-26T00:16:51Z,OWNER,"Actually I'm going to imitate the existing `register_*` hooks:

- `def register_output_renderer(datasette)`
- `def register_facet_classes()`
- `def register_routes(datasette)`
- `def register_commands(cli)`
- `def register_magic_parameters(datasette)`

So I'm going to call the new hooks:

- `register_table_extras(datasette)`
- `register_row_extras(datasette)`
- `register_query_extras(datasette)`

They'll return a list of `async def` functions. The names of those functions will become the names of the extras.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1215174094,
https://github.com/simonw/datasette/issues/1720#issuecomment-1109160226,https://api.github.com/repos/simonw/datasette/issues/1720,1109160226,IC_kwDOBm6k_c5CHHEi,9599,2022-04-26T00:14:11Z,2022-04-26T00:14:11Z,OWNER,"There are four existing plugin hooks that include the word ""extra"" but use it to mean something else - to mean additional CSS/JS/variables to be injected into the page:

- `def extra_css_urls(...)`
- `def extra_js_urls(...)`
- `def extra_body_script(...)`
- `def extra_template_vars(...)`

I think `extra_*` and `*_extras` are different enough that they won't be confused with each other.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1215174094,
https://github.com/simonw/datasette/issues/1720#issuecomment-1109159307,https://api.github.com/repos/simonw/datasette/issues/1720,1109159307,IC_kwDOBm6k_c5CHG2L,9599,2022-04-26T00:12:28Z,2022-04-26T00:12:28Z,OWNER,"I'm going to keep table and row separate. So I think I need to add three new plugin hooks:

- `table_extras()`
- `row_extras()`
- `query_extras()`","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1215174094,
https://github.com/simonw/datasette/issues/1720#issuecomment-1109158903,https://api.github.com/repos/simonw/datasette/issues/1720,1109158903,IC_kwDOBm6k_c5CHGv3,9599,2022-04-26T00:11:42Z,2022-04-26T00:11:42Z,OWNER,"Places this plugin hook (or hooks?) should be able to affect:

- JSON for a table/view
- JSON for a row
- JSON for a canned query
- JSON for a custom arbitrary query

I'm going to combine those last two, which means there are three places. But maybe I can combine the table one and the row one as well?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1215174094,
https://github.com/simonw/datasette/issues/1719#issuecomment-1108907238,https://api.github.com/repos/simonw/datasette/issues/1719,1108907238,IC_kwDOBm6k_c5CGJTm,9599,2022-04-25T18:34:21Z,2022-04-25T18:34:21Z,OWNER,Well this refactor turned out to be pretty quick and really does greatly simplify both the `RowView` and `TableView` classes. Very happy with this.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1214859703,
https://github.com/simonw/datasette/issues/262#issuecomment-1108890170,https://api.github.com/repos/simonw/datasette/issues/262,1108890170,IC_kwDOBm6k_c5CGFI6,9599,2022-04-25T18:17:09Z,2022-04-25T18:18:39Z,OWNER,"I spotted in https://github.com/simonw/datasette/issues/1719#issuecomment-1108888494 that there's actually already an undocumented implementation of `?_extras=foreign_key_tables` - https://latest.datasette.io/fixtures/simple_primary_key/1.json?_extras=foreign_key_tables

I added that feature all the way back in November 2017! https://github.com/simonw/datasette/commit/a30c5b220c15360d575e94b0e67f3255e120b916","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",323658641,
https://github.com/simonw/datasette/issues/1719#issuecomment-1108888494,https://api.github.com/repos/simonw/datasette/issues/1719,1108888494,IC_kwDOBm6k_c5CGEuu,9599,2022-04-25T18:15:42Z,2022-04-25T18:15:42Z,OWNER,"Here's an undocumented feature I forgot existed: https://latest.datasette.io/fixtures/simple_primary_key/1.json?_extras=foreign_key_tables

`?_extras=foreign_key_tables`

https://github.com/simonw/datasette/blob/0bc5186b7bb4fc82392df08f99a9132f84dcb331/datasette/views/table.py#L1021-L1024

It's even covered by the tests:

https://github.com/simonw/datasette/blob/b9c2b1cfc8692b9700416db98721fa3ec982f6be/tests/test_api.py#L691-L703","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1214859703,
https://github.com/simonw/datasette/issues/1719#issuecomment-1108884171,https://api.github.com/repos/simonw/datasette/issues/1719,1108884171,IC_kwDOBm6k_c5CGDrL,9599,2022-04-25T18:10:46Z,2022-04-25T18:12:45Z,OWNER,"It looks like the only class method from that shared class needed by `RowView` is `self.display_columns_and_rows()`.

Which I've been wanting to refactor to provide to `QueryView` too:

- #715","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1214859703,
https://github.com/simonw/datasette/issues/1715#issuecomment-1108875068,https://api.github.com/repos/simonw/datasette/issues/1715,1108875068,IC_kwDOBm6k_c5CGBc8,9599,2022-04-25T18:03:13Z,2022-04-25T18:06:33Z,OWNER,"The `RowTableShared` class is making this a whole lot more complicated.

I'm going to split the `RowView` view out into an entirely separate `views/row.py` module.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1212823665,
https://github.com/simonw/datasette/issues/1715#issuecomment-1108877454,https://api.github.com/repos/simonw/datasette/issues/1715,1108877454,IC_kwDOBm6k_c5CGCCO,9599,2022-04-25T18:04:27Z,2022-04-25T18:04:27Z,OWNER,Pushed my WIP on this to the `api-extras` branch: 5053f1ea83194ecb0a5693ad5dada5b25bf0f7e6,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1212823665,
https://github.com/simonw/datasette/issues/1718#issuecomment-1107873311,https://api.github.com/repos/simonw/datasette/issues/1718,1107873311,IC_kwDOBm6k_c5CCM4f,9599,2022-04-24T16:24:14Z,2022-04-24T16:24:14Z,OWNER,Wrote up what I learned in a TIL: https://til.simonwillison.net/sphinx/blacken-docs,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1213683988,
https://github.com/simonw/datasette/issues/1718#issuecomment-1107873271,https://api.github.com/repos/simonw/datasette/issues/1718,1107873271,IC_kwDOBm6k_c5CCM33,9599,2022-04-24T16:23:57Z,2022-04-24T16:23:57Z,OWNER,"Turns out I didn't need that `git diff-index` trick after all - the `blacken-docs` command returns a non-zero exit code if it changes any files.
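
In Python terms the check boils down to looking at the exit status. A minimal sketch; the two commands below are placeholders that just exit with a chosen status, not `blacken-docs` itself:

```python
import subprocess
import sys


def check_passes(command):
    # A CI step passes only when the command exits with status 0
    completed = subprocess.run(command)
    return completed.returncode == 0


# Placeholder commands that exit with a chosen status (not blacken-docs itself)
no_changes = [sys.executable, '-c', 'raise SystemExit(0)']
files_changed = [sys.executable, '-c', 'raise SystemExit(1)']

passed_clean = check_passes(no_changes)
passed_dirty = check_passes(files_changed)
```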

Submitted a documentation PR to that project instead:

- https://github.com/asottile/blacken-docs/pull/162","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1213683988,
https://github.com/simonw/datasette/issues/1718#issuecomment-1107870788,https://api.github.com/repos/simonw/datasette/issues/1718,1107870788,IC_kwDOBm6k_c5CCMRE,9599,2022-04-24T16:09:23Z,2022-04-24T16:09:23Z,OWNER,One more attempt at testing the `git diff-index` trick.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1213683988,
https://github.com/simonw/datasette/issues/1718#issuecomment-1107869884,https://api.github.com/repos/simonw/datasette/issues/1718,1107869884,IC_kwDOBm6k_c5CCMC8,9599,2022-04-24T16:04:03Z,2022-04-24T16:04:03Z,OWNER,"OK, I'm expecting this one to fail at the `git diff-index --quiet HEAD --` check.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1213683988,
https://github.com/simonw/datasette/issues/1718#issuecomment-1107869556,https://api.github.com/repos/simonw/datasette/issues/1718,1107869556,IC_kwDOBm6k_c5CCL90,9599,2022-04-24T16:02:27Z,2022-04-24T16:02:27Z,OWNER,"Looking at that first error it appears to be a place where I had deliberately omitted the body of the function:

https://github.com/simonw/datasette/blob/36573638b0948174ae237d62e6369b7d55220d7f/docs/internals.rst#L196-L211

I can use `...` as the function body here to get it to pass.

Fixing those warnings actually helped me spot a couple of bugs, so I'm glad this happened.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1213683988,
https://github.com/simonw/datasette/issues/1718#issuecomment-1107868585,https://api.github.com/repos/simonw/datasette/issues/1718,1107868585,IC_kwDOBm6k_c5CCLup,9599,2022-04-24T15:57:10Z,2022-04-24T15:57:19Z,OWNER,"The tests failed there because of what I thought were warnings but turn out to be treated as errors:
```
% blacken-docs -l 60 docs/*.rst                                        
docs/internals.rst:196: code block parse error Cannot parse: 14:0: <line number missing in source>
docs/json_api.rst:449: code block parse error Cannot parse: 1:0: <link rel=""alternate""
docs/plugin_hooks.rst:250: code block parse error Cannot parse: 6:4:     ]
docs/plugin_hooks.rst:311: code block parse error Cannot parse: 38:0: <line number missing in source>
docs/testing_plugins.rst:135: code block parse error Cannot parse: 5:0: <line number missing in source>
% echo $?
1
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1213683988,
https://github.com/simonw/datasette/issues/1718#issuecomment-1107867281,https://api.github.com/repos/simonw/datasette/issues/1718,1107867281,IC_kwDOBm6k_c5CCLaR,9599,2022-04-24T15:49:23Z,2022-04-24T15:49:23Z,OWNER,I'm going to push the first commit with deliberately missing formatting to check that the tests fail.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1213683988,
https://github.com/simonw/datasette/issues/1718#issuecomment-1107866013,https://api.github.com/repos/simonw/datasette/issues/1718,1107866013,IC_kwDOBm6k_c5CCLGd,9599,2022-04-24T15:42:07Z,2022-04-24T15:42:07Z,OWNER,"In the absence of `--check` I can use this to detect if changes are applied:
```zsh
% git diff-index --quiet HEAD --
% echo $?                       
0
% blacken-docs -l 60 docs/*.rst
docs/authentication.rst: Rewriting...
...
% git diff-index --quiet HEAD --
% echo $?                       
1
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1213683988,
https://github.com/simonw/datasette/issues/1718#issuecomment-1107865493,https://api.github.com/repos/simonw/datasette/issues/1718,1107865493,IC_kwDOBm6k_c5CCK-V,9599,2022-04-24T15:39:02Z,2022-04-24T15:39:02Z,OWNER,"There's no `blacken-docs --check` option so I filed a feature request:
- https://github.com/asottile/blacken-docs/issues/161","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1213683988,
https://github.com/simonw/datasette/issues/1718#issuecomment-1107863924,https://api.github.com/repos/simonw/datasette/issues/1718,1107863924,IC_kwDOBm6k_c5CCKl0,9599,2022-04-24T15:30:03Z,2022-04-24T15:30:03Z,OWNER,"On the one hand, I'm not crazy about some of the indentation decisions Black made here - in particular this one, which I had indented deliberately for readability:
```diff
 diff --git a/docs/authentication.rst b/docs/authentication.rst
index 0d98cf8..8008023 100644
--- a/docs/authentication.rst
+++ b/docs/authentication.rst
@@ -381,11 +381,7 @@ Authentication plugins can set signed ``ds_actor`` cookies themselves like so:
 .. code-block:: python
 
     response = Response.redirect(""/"")
-    response.set_cookie(""ds_actor"", datasette.sign({
-        ""a"": {
-            ""id"": ""cleopaws""
-        }
-    }, ""actor""))
+    response.set_cookie(""ds_actor"", datasette.sign({""a"": {""id"": ""cleopaws""}}, ""actor""))
```
But... consistency is a virtue. Maybe I'm OK with just this one disagreement?

Also: I've been mentally trying to keep the line lengths a bit shorter to help them be more readable on mobile devices.

I'll try a different line length using `blacken-docs -l 60 docs/*.rst` instead.

I like this more - here's the result for that example:
```diff
diff --git a/docs/authentication.rst b/docs/authentication.rst
index 0d98cf8..2496073 100644
--- a/docs/authentication.rst
+++ b/docs/authentication.rst
@@ -381,11 +381,10 @@ Authentication plugins can set signed ``ds_actor`` cookies themselves like so:
 .. code-block:: python
 
     response = Response.redirect(""/"")
-    response.set_cookie(""ds_actor"", datasette.sign({
-        ""a"": {
-            ""id"": ""cleopaws""
-        }
-    }, ""actor""))
+    response.set_cookie(
+        ""ds_actor"",
+        datasette.sign({""a"": {""id"": ""cleopaws""}}, ""actor""),
+    )
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1213683988,
https://github.com/simonw/datasette/issues/1718#issuecomment-1107863365,https://api.github.com/repos/simonw/datasette/issues/1718,1107863365,IC_kwDOBm6k_c5CCKdF,9599,2022-04-24T15:26:41Z,2022-04-24T15:26:41Z,OWNER,"Tried this:
```
pip install blacken-docs
blacken-docs docs/*.rst
git diff | pbcopy
```
Got this:
```diff
 diff --git a/docs/authentication.rst b/docs/authentication.rst
index 0d98cf8..8008023 100644
--- a/docs/authentication.rst
+++ b/docs/authentication.rst
@@ -381,11 +381,7 @@ Authentication plugins can set signed ``ds_actor`` cookies themselves like so:
 .. code-block:: python
 
     response = Response.redirect(""/"")
-    response.set_cookie(""ds_actor"", datasette.sign({
-        ""a"": {
-            ""id"": ""cleopaws""
-        }
-    }, ""actor""))
+    response.set_cookie(""ds_actor"", datasette.sign({""a"": {""id"": ""cleopaws""}}, ""actor""))
 
 Note that you need to pass ``""actor""`` as the namespace to :ref:`datasette_sign`.
 
@@ -412,12 +408,16 @@ To include an expiry, add a ``""e""`` key to the cookie value containing a `base62
     expires_at = int(time.time()) + (24 * 60 * 60)
 
     response = Response.redirect(""/"")
-    response.set_cookie(""ds_actor"", datasette.sign({
-        ""a"": {
-            ""id"": ""cleopaws""
-        },
-        ""e"": baseconv.base62.encode(expires_at),
-    }, ""actor""))
+    response.set_cookie(
+        ""ds_actor"",
+        datasette.sign(
+            {
+                ""a"": {""id"": ""cleopaws""},
+                ""e"": baseconv.base62.encode(expires_at),
+            },
+            ""actor"",
+        ),
+    )
 
 The resulting cookie will encode data that looks something like this:
 
diff --git a/docs/spatialite.rst b/docs/spatialite.rst
index d1b300b..556bad8 100644
--- a/docs/spatialite.rst
+++ b/docs/spatialite.rst
@@ -58,19 +58,22 @@ Here's a recipe for taking a table with existing latitude and longitude columns,
 .. code-block:: python
 
     import sqlite3
-    conn = sqlite3.connect('museums.db')
+
+    conn = sqlite3.connect(""museums.db"")
     # Lead the spatialite extension:
     conn.enable_load_extension(True)
-    conn.load_extension('/usr/local/lib/mod_spatialite.dylib')
+    conn.load_extension(""/usr/local/lib/mod_spatialite.dylib"")
     # Initialize spatial metadata for this database:
-    conn.execute('select InitSpatialMetadata(1)')
+    conn.execute(""select InitSpatialMetadata(1)"")
     # Add a geometry column called point_geom to our museums table:
     conn.execute(""SELECT AddGeometryColumn('museums', 'point_geom', 4326, 'POINT', 2);"")
     # Now update that geometry column with the lat/lon points
-    conn.execute('''
+    conn.execute(
+        """"""
         UPDATE museums SET
         point_geom = GeomFromText('POINT('||""longitude""||' '||""latitude""||')',4326);
-    ''')
+    """"""
+    )
     # Now add a spatial index to that column
     conn.execute('select CreateSpatialIndex(""museums"", ""point_geom"");')
     # If you don't commit your changes will not be persisted:
@@ -186,13 +189,14 @@ Here's Python code to create a SQLite database, enable SpatiaLite, create a plac
 .. code-block:: python
 
     import sqlite3
-    conn = sqlite3.connect('places.db')
+
+    conn = sqlite3.connect(""places.db"")
     # Enable SpatialLite extension
     conn.enable_load_extension(True)
-    conn.load_extension('/usr/local/lib/mod_spatialite.dylib')
+    conn.load_extension(""/usr/local/lib/mod_spatialite.dylib"")
     # Create the masic countries table
-    conn.execute('select InitSpatialMetadata(1)')
-    conn.execute('create table places (id integer primary key, name text);')
+    conn.execute(""select InitSpatialMetadata(1)"")
+    conn.execute(""create table places (id integer primary key, name text);"")
     # Add a MULTIPOLYGON Geometry column
     conn.execute(""SELECT AddGeometryColumn('places', 'geom', 4326, 'MULTIPOLYGON', 2);"")
     # Add a spatial index against the new column
@@ -201,13 +205,17 @@ Here's Python code to create a SQLite database, enable SpatiaLite, create a plac
     from shapely.geometry.multipolygon import MultiPolygon
     from shapely.geometry import shape
     import requests
-    geojson = requests.get('https://data.whosonfirst.org/404/227/475/404227475.geojson').json()
+
+    geojson = requests.get(
+        ""https://data.whosonfirst.org/404/227/475/404227475.geojson""
+    ).json()
     # Convert to ""Well Known Text"" format
-    wkt = shape(geojson['geometry']).wkt
+    wkt = shape(geojson[""geometry""]).wkt
     # Insert and commit the record
-    conn.execute(""INSERT INTO places (id, name, geom) VALUES(null, ?, GeomFromText(?, 4326))"", (
-       ""Wales"", wkt
-    ))
+    conn.execute(
+        ""INSERT INTO places (id, name, geom) VALUES(null, ?, GeomFromText(?, 4326))"",
+        (""Wales"", wkt),
+    )
     conn.commit()
 
 Querying polygons using within()
diff --git a/docs/writing_plugins.rst b/docs/writing_plugins.rst
index bd60a4b..5af01f6 100644
--- a/docs/writing_plugins.rst
+++ b/docs/writing_plugins.rst
@@ -18,9 +18,10 @@ The quickest way to start writing a plugin is to create a ``my_plugin.py`` file
 
     from datasette import hookimpl
 
+
     @hookimpl
     def prepare_connection(conn):
-        conn.create_function('hello_world', 0, lambda: 'Hello world!')
+        conn.create_function(""hello_world"", 0, lambda: ""Hello world!"")
 
 If you save this in ``plugins/my_plugin.py`` you can then start Datasette like this::
 
@@ -60,22 +61,18 @@ The example consists of two files: a ``setup.py`` file that defines the plugin:
 
     from setuptools import setup
 
-    VERSION = '0.1'
+    VERSION = ""0.1""
 
     setup(
-        name='datasette-plugin-demos',
-        description='Examples of plugins for Datasette',
-        author='Simon Willison',
-        url='https://github.com/simonw/datasette-plugin-demos',
-        license='Apache License, Version 2.0',
+        name=""datasette-plugin-demos"",
+        description=""Examples of plugins for Datasette"",
+        author=""Simon Willison"",
+        url=""https://github.com/simonw/datasette-plugin-demos"",
+        license=""Apache License, Version 2.0"",
         version=VERSION,
-        py_modules=['datasette_plugin_demos'],
-        entry_points={
-            'datasette': [
-                'plugin_demos = datasette_plugin_demos'
-            ]
-        },
-        install_requires=['datasette']
+        py_modules=[""datasette_plugin_demos""],
+        entry_points={""datasette"": [""plugin_demos = datasette_plugin_demos""]},
+        install_requires=[""datasette""],
     )
 
 And a Python module file, ``datasette_plugin_demos.py``, that implements the plugin:
@@ -88,12 +85,12 @@ And a Python module file, ``datasette_plugin_demos.py``, that implements the plu
 
     @hookimpl
     def prepare_jinja2_environment(env):
-        env.filters['uppercase'] = lambda u: u.upper()
+        env.filters[""uppercase""] = lambda u: u.upper()
 
 
     @hookimpl
     def prepare_connection(conn):
-        conn.create_function('random_integer', 2, random.randint)
+        conn.create_function(""random_integer"", 2, random.randint)
 
 
 Having built a plugin in this way you can turn it into an installable package using the following command::
@@ -123,11 +120,13 @@ To bundle the static assets for a plugin in the package that you publish to PyPI
 
 .. code-block:: python
 
-        package_data={
-            'datasette_plugin_name': [
-                'static/plugin.js',
-            ],
-        },
+        package_data = (
+            {
+                ""datasette_plugin_name"": [
+                    ""static/plugin.js"",
+                ],
+            },
+        )
 
 Where ``datasette_plugin_name`` is the name of the plugin package (note that it uses underscores, not hyphens) and ``static/plugin.js`` is the path within that package to the static file.
 
@@ -152,11 +151,13 @@ Templates should be bundled for distribution using the same ``package_data`` mec
 
 .. code-block:: python
 
-        package_data={
-            'datasette_plugin_name': [
-                'templates/my_template.html',
-            ],
-        },
+        package_data = (
+            {
+                ""datasette_plugin_name"": [
+                    ""templates/my_template.html"",
+                ],
+            },
+        )
 
 You can also use wildcards here such as ``templates/*.html``. See `datasette-edit-schema <https://github.com/simonw/datasette-edit-schema>`__ for an example of this pattern.
 ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1213683988,
https://github.com/simonw/datasette/issues/1718#issuecomment-1107862882,https://api.github.com/repos/simonw/datasette/issues/1718,1107862882,IC_kwDOBm6k_c5CCKVi,9599,2022-04-24T15:23:56Z,2022-04-24T15:23:56Z,OWNER,"Found https://github.com/asottile/blacken-docs via
- https://github.com/psf/black/issues/294","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1213683988,
https://github.com/simonw/datasette/pull/1717#issuecomment-1107848097,https://api.github.com/repos/simonw/datasette/issues/1717,1107848097,IC_kwDOBm6k_c5CCGuh,9599,2022-04-24T14:02:37Z,2022-04-24T14:02:37Z,OWNER,"This is a neat feature, thanks!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1213281044,
https://github.com/simonw/datasette/pull/1717#issuecomment-1107459446,https://api.github.com/repos/simonw/datasette/issues/1717,1107459446,IC_kwDOBm6k_c5CAn12,22429695,2022-04-23T11:56:36Z,2022-04-23T11:56:36Z,NONE,"# [Codecov](https://codecov.io/gh/simonw/datasette/pull/1717?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison) Report
> Merging [#1717](https://codecov.io/gh/simonw/datasette/pull/1717?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison) (9b9a314) into [main](https://codecov.io/gh/simonw/datasette/commit/d57c347f35bcd8cff15f913da851b4b8eb030867?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison) (d57c347) will **increase** coverage by `0.00%`.
> The diff coverage is `100.00%`.

```diff
@@           Coverage Diff           @@
##             main    #1717   +/-   ##
=======================================
  Coverage   91.75%   91.75%           
=======================================
  Files          34       34           
  Lines        4574     4575    +1     
=======================================
+ Hits         4197     4198    +1     
  Misses        377      377           
```


| [Impacted Files](https://codecov.io/gh/simonw/datasette/pull/1717?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison) | Coverage Δ | |
|---|---|---|
| [datasette/publish/cloudrun.py](https://codecov.io/gh/simonw/datasette/pull/1717/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison#diff-ZGF0YXNldHRlL3B1Ymxpc2gvY2xvdWRydW4ucHk=) | `97.05% <100.00%> (+0.04%)` | :arrow_up: |

------

[Continue to review full report at Codecov](https://codecov.io/gh/simonw/datasette/pull/1717?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/simonw/datasette/pull/1717?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison). Last update [d57c347...9b9a314](https://codecov.io/gh/simonw/datasette/pull/1717?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison).
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1213281044,
https://github.com/simonw/datasette/issues/1715#issuecomment-1106989581,https://api.github.com/repos/simonw/datasette/issues/1715,1106989581,IC_kwDOBm6k_c5B-1IN,9599,2022-04-22T23:03:29Z,2022-04-22T23:03:29Z,OWNER,I'm having second thoughts about injecting `request` - might be better to have the view function pull the relevant pieces out of the request before triggering the rest of the resolution.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1212823665,
https://github.com/simonw/datasette/issues/1715#issuecomment-1106947168,https://api.github.com/repos/simonw/datasette/issues/1715,1106947168,IC_kwDOBm6k_c5B-qxg,9599,2022-04-22T22:25:57Z,2022-04-22T22:26:06Z,OWNER,"```python
async def database(request: Request, datasette: Datasette) -> Database:
    database_route = tilde_decode(request.url_vars[""database""])
    try:
        return datasette.get_database(route=database_route)
    except KeyError:
        raise NotFound(""Database not found: {}"".format(database_route))

async def table_name(request: Request) -> str:
    return tilde_decode(request.url_vars[""table""])
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1212823665,
https://github.com/simonw/datasette/issues/1715#issuecomment-1106945876,https://api.github.com/repos/simonw/datasette/issues/1715,1106945876,IC_kwDOBm6k_c5B-qdU,9599,2022-04-22T22:24:29Z,2022-04-22T22:24:29Z,OWNER,"Looking at the start of `TableView.data()`:

https://github.com/simonw/datasette/blob/d57c347f35bcd8cff15f913da851b4b8eb030867/datasette/views/table.py#L333-L346

I'm going to resolve `table_name` and `database` from the URL - `table_name` will be a string, `database` will be the DB object returned by `datasette.get_database()`. Then those can be passed in separately too.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1212823665,
https://github.com/simonw/datasette/issues/1716#issuecomment-1106923258,https://api.github.com/repos/simonw/datasette/issues/1716,1106923258,IC_kwDOBm6k_c5B-k76,9599,2022-04-22T22:02:07Z,2022-04-22T22:02:07Z,OWNER,"https://github.com/simonw/datasette/blame/main/datasette/views/base.py

<img width=""1373"" alt=""image"" src=""https://user-images.githubusercontent.com/9599/164801564-d8a11ce9-7d9b-4e85-8947-a547d2986ef3.png"">
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1212838949,
https://github.com/simonw/datasette/issues/1715#issuecomment-1106908642,https://api.github.com/repos/simonw/datasette/issues/1715,1106908642,IC_kwDOBm6k_c5B-hXi,9599,2022-04-22T21:47:55Z,2022-04-22T21:47:55Z,OWNER,"I need an `asyncinject.Registry` with functions registered to perform the role of the table view.

Something like this perhaps:
```python
def table_html_context(facet_results, query, datasette, rows):
    return {...}
```
That then gets called like this:
```python
async def view(request):
    registry = Registry(facet_results, query, datasette, rows)
    context = await registry.resolve(table_html, request=request, datasette=datasette)
    return Response.html(await datasette.render(""table.html"", context))
```
It's also interesting to start thinking about this from a Python client library point of view. If I'm writing code outside of the HTTP request cycle, what would it look like?

One thing I could do is break out the code that turns a request into a list of pairs - this code here: https://github.com/simonw/datasette/blob/8338c66a57502ef27c3d7afb2527fbc0663b2570/datasette/views/table.py#L442-L449

I could turn that into a typed dependency injection function like this:

```python
def filter_args(request: Request) -> List[Tuple[str, str]]:
    # Arguments that start with _ and don't contain a __ are
    # special - things like ?_search= - and should not be
    # treated as filters.
    filter_args = []
    for key in request.args:
        if not (key.startswith(""_"") and ""__"" not in key):
            for v in request.args.getlist(key):
                filter_args.append((key, v))
    return filter_args
```
Then I can either pass a `request` into a `.resolve()` call, or I can instead skip that function by passing:

```python
output = await registry.resolve(table_context, filter_args=[(""foo"", ""bar"")])
```
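
To make the override behaviour concrete, here is a toy resolver. It is only a sketch under assumptions: `ToyRegistry` is a hypothetical stand-in written for this comment, not the real `Registry` class; it resolves dependencies one at a time rather than in parallel; and the `request` is a plain dict with a simplified filter rule.

```python
import asyncio
import inspect


class ToyRegistry:
    # Hypothetical stand-in for a dependency-injection registry:
    # functions are looked up by name and fed by parameter name.
    def __init__(self, *fns):
        self.fns = {fn.__name__: fn for fn in fns}

    async def resolve(self, fn, **overrides):
        kwargs = {}
        for name in inspect.signature(fn).parameters:
            if name in overrides:
                # A value passed in directly wins, so the function
                # registered under that name is never called.
                kwargs[name] = overrides[name]
            else:
                kwargs[name] = await self.resolve(self.fns[name], **overrides)
        result = fn(**kwargs)
        if inspect.isawaitable(result):
            result = await result
        return result


def filter_args(request):
    # Simplified rule (assumption): drop every key that starts with _
    return [(k, v) for k, v in request.items() if not k.startswith('_')]


def table_context(filter_args):
    return {'filters': filter_args}


registry = ToyRegistry(filter_args, table_context)
```

Calling `asyncio.run(registry.resolve(table_context, request={'foo': 'bar', '_search': 'x'}))` runs `filter_args` first, while passing `filter_args=[('foo', 'bar')]` skips it entirely.
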
I do need to think about where plugins get executed in all of this.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1212823665,
https://github.com/simonw/datasette/issues/1101#issuecomment-1105642187,https://api.github.com/repos/simonw/datasette/issues/1101,1105642187,IC_kwDOBm6k_c5B5sLL,25778,2022-04-21T18:59:08Z,2022-04-21T18:59:08Z,CONTRIBUTOR,"Ha! That was your idea (and a good one).

But it's probably worth measuring to see what overhead it adds. It did require both passing in the database and making the whole thing `async`. 

Just timing the queries themselves:

1. [Using `AsGeoJSON(geometry) as geometry`](https://alltheplaces-datasette.fly.dev/alltheplaces?sql=select%0D%0A++id%2C%0D%0A++properties%2C%0D%0A++AsGeoJSON%28geometry%29+as+geometry%2C%0D%0A++spider%0D%0Afrom%0D%0A++places%0D%0Aorder+by%0D%0A++id%0D%0Alimit%0D%0A++1000) takes 10.235 ms
2. [Leaving as binary](https://alltheplaces-datasette.fly.dev/alltheplaces?sql=select%0D%0A++id%2C%0D%0A++properties%2C%0D%0A++geometry%2C%0D%0A++spider%0D%0Afrom%0D%0A++places%0D%0Aorder+by%0D%0A++id%0D%0Alimit%0D%0A++1000) takes 8.63 ms

Looking at the network panel:

1. Takes about 200 ms for the `fetch` request
2. Takes about 300 ms

I'm not sure how best to time the GeoJSON generation, but it would be interesting to check. Maybe I'll write a plugin to add query times to response headers.

The other thing to consider with async streaming is that it might be well-suited for a slower response. When I have to get the whole result and send a response in a fixed amount of time, I need the most efficient query possible. If I can hang onto a connection and get things one chunk at a time, maybe it's ok if there's some overhead.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",749283032,
https://github.com/simonw/datasette/issues/1101#issuecomment-1105615625,https://api.github.com/repos/simonw/datasette/issues/1101,1105615625,IC_kwDOBm6k_c5B5lsJ,9599,2022-04-21T18:31:41Z,2022-04-21T18:32:22Z,OWNER,"The `datasette-geojson` plugin is actually an interesting case here, because of the way it converts SpatiaLite geometries into GeoJSON: https://github.com/eyeseast/datasette-geojson/blob/602c4477dc7ddadb1c0a156cbcd2ef6688a5921d/datasette_geojson/__init__.py#L61-L66

```python

    if isinstance(geometry, bytes):
        results = await db.execute(
            ""SELECT AsGeoJSON(:geometry)"", {""geometry"": geometry}
        )
        return geojson.loads(results.single_value())
```
That actually seems to work really well as-is, but it does worry me a bit that it ends up having to execute an extra `SELECT` query for every single returned row - especially in streaming mode where it might be asked to return 1m rows at once.

My PostgreSQL/MySQL engineering brain says that this would be better handled by doing a chunk of these (maybe 100) at once, to avoid the per-query-overhead - but with SQLite that might not be necessary.

At any rate, this is one of the reasons I'm interested in ""iterate over this sequence of chunks of 100 rows at a time"" as a potential option here.
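
A minimal sketch of that chunking idea, assuming the rows arrive as any plain Python iterable. The helper name `chunked` and the default size are illustrative only, not an existing Datasette API.

```python
from itertools import islice


def chunked(rows, size=100):
    # Yield successive lists of up to `size` rows without
    # loading the whole sequence into memory first.
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk
```

A plugin could then run one conversion query per chunk instead of one per row.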

Of course, a better solution would be for `datasette-geojson` to have a way to influence the SQL query before it is executed, adding an `AsGeoJSON(geometry)` clause to it - so that's something I'm open to as well.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",749283032,
https://github.com/simonw/datasette/issues/1101#issuecomment-1105608964,https://api.github.com/repos/simonw/datasette/issues/1101,1105608964,IC_kwDOBm6k_c5B5kEE,9599,2022-04-21T18:26:29Z,2022-04-21T18:26:29Z,OWNER,"I'm questioning if the mechanisms should be separate at all now - a single response rendering is really just a case of a streaming response that only pulls the first N records from the iterator.

It probably needs to be an `async for` iterator, which I've not worked with much before. Good opportunity to learn.
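
A rough sketch of that idea, under assumptions: `stream_rows` is a made-up row source standing in for a real query, and `first_n` is a hypothetical helper. The point is just that a single-page response can be the same async iterator, stopped early.

```python
import asyncio


async def stream_rows(total):
    # Hypothetical row source: an async generator that yields
    # one row at a time, giving up control between rows.
    for i in range(total):
        await asyncio.sleep(0)
        yield {'id': i}


async def first_n(rows, n):
    # Pull at most n rows from an async iterator, then stop.
    collected = []
    if n <= 0:
        return collected
    async for row in rows:
        collected.append(row)
        if len(collected) >= n:
            break
    return collected
```

A streaming renderer would keep iterating with `async for` instead of stopping after the first N rows.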

This actually gets a fair bit more complicated due to the work I'm doing right now to improve the default JSON API:

- #1709

I want to do things like make faceting results optionally available to custom renderers - which is a separate concern from streaming rows.

I'm going to poke around with a bunch of prototypes and see what sticks.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",749283032,
https://github.com/simonw/datasette/issues/1101#issuecomment-1105588651,https://api.github.com/repos/simonw/datasette/issues/1101,1105588651,IC_kwDOBm6k_c5B5fGr,25778,2022-04-21T18:15:39Z,2022-04-21T18:15:39Z,CONTRIBUTOR,"What if you split rendering and streaming into two things:

- `render` is a function that returns a response
- `stream` is a function that sends chunks, or yields chunks passed to an ASGI `send` callback

That way current plugins still work, and streaming is purely additive. A `stream` function could get a cursor or iterator of rows, instead of a list, so it could more efficiently handle large queries.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",749283032,
https://github.com/simonw/datasette/issues/1101#issuecomment-1105571003,https://api.github.com/repos/simonw/datasette/issues/1101,1105571003,IC_kwDOBm6k_c5B5ay7,9599,2022-04-21T18:10:38Z,2022-04-21T18:10:46Z,OWNER,"Maybe the simplest design for this is to add an optional `can_stream` to the contract:

```python
    @hookimpl
    def register_output_renderer(datasette):
        return {
            ""extension"": ""tsv"",
            ""render"": render_tsv,
            ""can_render"": lambda: True,
            ""can_stream"": lambda: True
        }
```
When streaming, a new parameter could be passed to the render function - maybe `chunks` - which is an iterator/generator over a sequence of chunks of rows.
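
As a sketch of what a streaming-aware render function might look like with that `chunks` parameter: the name `render_tsv_stream` and its exact signature are assumptions for illustration, not part of the current plugin contract.

```python
def render_tsv_stream(columns, chunks):
    # Yield the header line first, then one block of text per
    # chunk of rows, so nothing needs the full result in memory.
    yield '\t'.join(columns) + '\n'
    for chunk in chunks:
        yield ''.join(
            '\t'.join(str(value) for value in row) + '\n'
            for row in chunk
        )
```

Each yielded string could be sent to the client as soon as it is produced.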

Or it could use the existing `rows` parameter but treat that as an iterator?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",749283032,
https://github.com/dogsheep/github-to-sqlite/issues/72#issuecomment-1105474232,https://api.github.com/repos/dogsheep/github-to-sqlite/issues/72,1105474232,IC_kwDODFdgUs5B5DK4,9599,2022-04-21T17:02:15Z,2022-04-21T17:02:15Z,MEMBER,"That's interesting - yeah it looks like the number of pages can be derived from the `Link` header, which is enough information to show a progress bar, probably using Click just to avoid adding another dependency.
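
A minimal sketch of deriving the page count, assuming the header follows the usual comma-separated format with a rel=last entry carrying a `page` query parameter. The helper name `last_page_from_link_header` is hypothetical.

```python
import re
from urllib.parse import parse_qs, urlparse


def last_page_from_link_header(link_header):
    # Returns the page number of the rel=last link, or None
    # when the header has no such link (e.g. a single page).
    for part in link_header.split(','):
        match = re.match(r'\s*<([^>]+)>\s*;\s*rel=\W?(\w+)\W?', part)
        if match and match.group(2) == 'last':
            query = parse_qs(urlparse(match.group(1)).query)
            if 'page' in query:
                return int(query['page'][0])
    return None
```

The returned number could then be used as the `length` of a `click.progressbar`.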

https://docs.github.com/en/rest/guides/traversing-with-pagination","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1211283427,
https://github.com/simonw/datasette/pull/1574#issuecomment-1105464661,https://api.github.com/repos/simonw/datasette/issues/1574,1105464661,IC_kwDOBm6k_c5B5A1V,208018,2022-04-21T16:51:24Z,2022-04-21T16:51:24Z,NONE,"tfw you have more ephemeral storage than upstream bandwidth

```
FROM python:3.10-slim AS base

RUN apt update && apt -y install zstd

ENV DATASETTE_SECRET 'sosecret'
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -U datasette datasette-pretty-json datasette-graphql

ENV PORT 8080
EXPOSE 8080

FROM base AS pack

COPY . /app
WORKDIR /app

RUN datasette inspect --inspect-file inspect-data.json
RUN zstd --rm *.db

FROM base AS unpack

COPY --from=pack /app /app
WORKDIR /app

CMD [""/bin/bash"", ""-c"", ""shopt -s nullglob && zstd --rm -d *.db.zst && datasette serve --host 0.0.0.0 --cors --inspect-file inspect-data.json --metadata metadata.json --create --port $PORT *.db""]
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1084193403,
https://github.com/simonw/datasette/issues/1713#issuecomment-1103312860,https://api.github.com/repos/simonw/datasette/issues/1713,1103312860,IC_kwDOBm6k_c5Bwzfc,536941,2022-04-20T00:52:19Z,2022-04-20T00:52:19Z,CONTRIBUTOR,feels related to #1402 ,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1203943272,
https://github.com/simonw/sqlite-utils/issues/425#issuecomment-1101594549,https://api.github.com/repos/simonw/sqlite-utils/issues/425,1101594549,IC_kwDOCGYnMM5BqP-1,9599,2022-04-18T17:36:14Z,2022-04-18T17:36:14Z,OWNER,"Related:
- #408","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1203842656,
https://github.com/simonw/datasette/pull/1159#issuecomment-1100243987,https://api.github.com/repos/simonw/datasette/issues/1159,1100243987,IC_kwDOBm6k_c5BlGQT,552629,2022-04-15T17:24:43Z,2022-04-15T17:24:43Z,NONE,@simonw : do you think this could be merged ?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",774332247,
https://github.com/simonw/datasette/issues/1713#issuecomment-1099540225,https://api.github.com/repos/simonw/datasette/issues/1713,1099540225,IC_kwDOBm6k_c5BiacB,25778,2022-04-14T19:09:57Z,2022-04-14T19:09:57Z,CONTRIBUTOR,"I wonder if this overlaps with what I outlined in #1605. You could run something like this:

```sh
datasette freeze -d exports/
aws s3 cp --recursive exports/ ""s3://my-export-bucket/$(date +%Y-%m-%d)/""
```

And maybe that does what you need. Of course, that plugin isn't built yet. But that's the idea.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1203943272,
https://github.com/simonw/datasette/issues/1713#issuecomment-1099443468,https://api.github.com/repos/simonw/datasette/issues/1713,1099443468,IC_kwDOBm6k_c5BiC0M,9308268,2022-04-14T17:26:27Z,2022-04-14T17:26:27Z,NONE,"An awesome feature for a plugin would be the ability to save a query (and possibly even its results) to a GitHub gist. Being able to share results that way would be super fantastic. Possibly even in Jupyter Notebook format (since GitHub and GitHub gists render those nicely)!

I know there's the handy datasette-saved-queries plugin, but a button that could export stuff out and then possibly even import it back in would be great (I'm thinking of the way Google Colab lets you save to GitHub and then pull the notebook back in, which is a really great workflow
![image](https://user-images.githubusercontent.com/9308268/163441612-9ad2649f-c73e-4557-aaf2-e3d0fdc48fbf.png)
https://github.com/cincinnatilibrary/collection-analysis/blob/master/reports/colab_datasette_example.ipynb )","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1203943272,
https://github.com/simonw/datasette/issues/1713#issuecomment-1098628334,https://api.github.com/repos/simonw/datasette/issues/1713,1098628334,IC_kwDOBm6k_c5Be7zu,9599,2022-04-14T01:43:00Z,2022-04-14T01:43:13Z,OWNER,"Current workaround for fast publishing to S3:

    datasette fixtures.db --get /fixtures/facetable.json | \
      s3-credentials put-object my-bucket facetable.json -","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1203943272,
https://github.com/simonw/sqlite-utils/issues/421#issuecomment-1098548931,https://api.github.com/repos/simonw/sqlite-utils/issues/421,1098548931,IC_kwDOCGYnMM5BeobD,9599,2022-04-13T22:41:59Z,2022-04-13T22:41:59Z,OWNER,"I'm going to close this ticket since it looks like this is a bug in the way the Dockerfile builds Python, but I'm going to ship a fix for that issue I found so the `LD_PRELOAD` workaround above should work OK with the next release of `sqlite-utils`. Thanks for the detailed bug report!","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1180427792,
https://github.com/simonw/sqlite-utils/issues/424#issuecomment-1098548090,https://api.github.com/repos/simonw/sqlite-utils/issues/424,1098548090,IC_kwDOCGYnMM5BeoN6,9599,2022-04-13T22:40:15Z,2022-04-13T22:40:15Z,OWNER,"New error:
```pycon
>>> from sqlite_utils import Database
>>> db = Database(memory=True)
>>> db[""foo""].create({})
Traceback (most recent call last):
  File ""<stdin>"", line 1, in <module>
  File ""/Users/simon/Dropbox/Development/sqlite-utils/sqlite_utils/db.py"", line 1465, in create
    self.db.create_table(
  File ""/Users/simon/Dropbox/Development/sqlite-utils/sqlite_utils/db.py"", line 885, in create_table
    sql = self.create_table_sql(
  File ""/Users/simon/Dropbox/Development/sqlite-utils/sqlite_utils/db.py"", line 771, in create_table_sql
    assert columns, ""Tables must have at least one column""
AssertionError: Tables must have at least one column
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1200866134,
https://github.com/simonw/sqlite-utils/issues/425#issuecomment-1098545390,https://api.github.com/repos/simonw/sqlite-utils/issues/425,1098545390,IC_kwDOCGYnMM5Benju,9599,2022-04-13T22:34:52Z,2022-04-13T22:34:52Z,OWNER,"That broke Python 3.7 because it doesn't support `deterministic=True` even being passed:

> function takes at most 3 arguments (4 given)","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1203842656,
https://github.com/simonw/sqlite-utils/issues/425#issuecomment-1098537000,https://api.github.com/repos/simonw/sqlite-utils/issues/425,1098537000,IC_kwDOCGYnMM5Belgo,9599,2022-04-13T22:18:22Z,2022-04-13T22:18:22Z,OWNER,"I figured out a workaround in https://github.com/simonw/sqlite-utils/issues/421#issuecomment-1098535531

The current `register(fn)` method looks like this: https://github.com/simonw/sqlite-utils/blob/95522ad919f96eb6cc8cd3cd30389b534680c717/sqlite_utils/db.py#L389-L403

This alternative implementation worked in the environment where that failed:

```python
        def register(fn):
            name = fn.__name__
            arity = len(inspect.signature(fn).parameters)
            if not replace and (name, arity) in self._registered_functions:
                return fn
            kwargs = {}
            done = False
            if deterministic:
                # Try this, but fall back if sqlite3.NotSupportedError
                try:
                    self.conn.create_function(name, arity, fn, **dict(kwargs, deterministic=True))
                    done = True
                except sqlite3.NotSupportedError:
                    pass
            if not done:
                self.conn.create_function(name, arity, fn, **kwargs)
            self._registered_functions.add((name, arity))
            return fn
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1203842656,
https://github.com/simonw/sqlite-utils/issues/421#issuecomment-1098535531,https://api.github.com/repos/simonw/sqlite-utils/issues/421,1098535531,IC_kwDOCGYnMM5BelJr,9599,2022-04-13T22:15:48Z,2022-04-13T22:15:48Z,OWNER,"Trying this alternative implementation of the `register()` method:

```python
        def register(fn):
            name = fn.__name__
            arity = len(inspect.signature(fn).parameters)
            if not replace and (name, arity) in self._registered_functions:
                return fn
            kwargs = {}
            done = False
            if deterministic:
                # Try this, but fall back if sqlite3.NotSupportedError
                try:
                    self.conn.create_function(name, arity, fn, **dict(kwargs, deterministic=True))
                    done = True
                except sqlite3.NotSupportedError:
                    pass
            if not done:
                self.conn.create_function(name, arity, fn, **kwargs)
            self._registered_functions.add((name, arity))
            return fn
```
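A minimal standalone sketch of the same fallback, for trying the idea outside `sqlite-utils`. The helper name `register_with_fallback` is hypothetical and not part of the library. It also catches `TypeError`, on the assumption that this is what older Python versions raise when `create_function()` does not accept a `deterministic` argument at all:
```python
import inspect
import sqlite3


def register_with_fallback(conn, fn, deterministic=True):
    # Hypothetical helper for illustration, not the sqlite-utils implementation.
    name = fn.__name__
    arity = len(inspect.signature(fn).parameters)
    if deterministic:
        try:
            conn.create_function(name, arity, fn, deterministic=True)
            return True  # registered as deterministic
        except (sqlite3.NotSupportedError, TypeError):
            # NotSupportedError: the linked SQLite library is too old.
            # TypeError (assumption): this Python has no deterministic argument.
            pass
    # Fall back to a plain, non-deterministic registration.
    conn.create_function(name, arity, fn)
    return False


conn = sqlite3.connect(':memory:')


def double(x):
    return x * 2


register_with_fallback(conn, double)
print(conn.execute('select double(21)').fetchone()[0])
```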
With that fix, the following worked!
```
LD_PRELOAD=./build/sqlite-autoconf-3360000/.libs/libsqlite3.so sqlite-utils indexes /tmp/global.db --table
table      index_name                    seqno    cid  name       desc  coll      key
---------  --------------------------  -------  -----  -------  ------  ------  -----
countries  idx_countries_country_name        0      1  country       0  BINARY      1
countries  idx_countries_country_name        1      2  name          0  BINARY      1
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1180427792,
https://github.com/simonw/sqlite-utils/issues/421#issuecomment-1098532220,https://api.github.com/repos/simonw/sqlite-utils/issues/421,1098532220,IC_kwDOCGYnMM5BekV8,9599,2022-04-13T22:09:52Z,2022-04-13T22:09:52Z,OWNER,That error is weird - it's not supposed to happen according to this code here: https://github.com/simonw/sqlite-utils/blob/95522ad919f96eb6cc8cd3cd30389b534680c717/sqlite_utils/db.py#L389-L400,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1180427792,
https://github.com/simonw/sqlite-utils/issues/421#issuecomment-1098531354,https://api.github.com/repos/simonw/sqlite-utils/issues/421,1098531354,IC_kwDOCGYnMM5BekIa,9599,2022-04-13T22:08:20Z,2022-04-13T22:08:20Z,OWNER,"OK I figured out what's going on here. First I added an extra `print(sql)` statement to the `indexes` command to see what SQL it was running:
```
(app-root) sqlite-utils indexes global.db --table

    select
      sqlite_master.name as ""table"",
      indexes.name as index_name,
      xinfo.*
    from sqlite_master
      join pragma_index_list(sqlite_master.name) indexes
      join pragma_index_xinfo(index_name) xinfo
    where
      sqlite_master.type = 'table'
     and xinfo.key = 1
Error: near ""("": syntax error
```
This made me suspicious that the SQLite version being used here didn't support joining against the `pragma_index_list(...)` table-valued functions in that way. So I checked the version:
```
(app-root) sqlite3
SQLite version 3.36.0 2021-06-18 18:36:39
```
That version should be fine - it's the one you compiled in the Dockerfile.

Then I checked the version that `sqlite-utils` itself was using:
```
(app-root) sqlite-utils memory 'select sqlite_version()'
[{""sqlite_version()"": ""3.7.17""}]
```
It's running SQLite 3.7.17!

So the problem here is that the Python in that Docker image is running a very old version of SQLite.

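The same check can be done from Python, which reports the SQLite library that the `sqlite3` module is actually linked against rather than the one the `sqlite3` command-line tool uses. This is a rough sketch: the helper names are made up for illustration, and the `(3, 16, 0)` threshold is my understanding of when table-valued pragma functions such as `pragma_index_list()` first appeared, so treat it as an assumption:

```python
import sqlite3


def sqlite_version_info(conn):
    # Ask the connection itself which SQLite library it is running on.
    row = conn.execute('select sqlite_version()').fetchone()
    return tuple(int(part) for part in row[0].split('.'))


def supports_pragma_functions(conn):
    # Assumption: pragma_index_list() and friends need SQLite 3.16.0 or later.
    return sqlite_version_info(conn) >= (3, 16, 0)


conn = sqlite3.connect(':memory:')
print(sqlite_version_info(conn), supports_pragma_functions(conn))
```
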
I tried using the trick in https://til.simonwillison.net/sqlite/ld-preload as a workaround, and it almost worked:

```
(app-root) python3 -c 'import sqlite3; print(sqlite3.connect("":memory:"").execute(""select sqlite_version()"").fetchone())'
('3.7.17',)
(app-root) LD_PRELOAD=./build/sqlite-autoconf-3360000/.libs/libsqlite3.so python3 -c 'import sqlite3; print(sqlite3.connect("":memory:"").execute(""select sqlite_version()"").fetchone())'
('3.36.0',)
```
But when I try to run `sqlite-utils` like that I get an error:

```
(app-root) LD_PRELOAD=./build/sqlite-autoconf-3360000/.libs/libsqlite3.so sqlite-utils indexes /tmp/global.db 
...
  File ""/opt/app-root/lib64/python3.8/site-packages/sqlite_utils/cli.py"", line 1624, in query
    db.register_fts4_bm25()
  File ""/opt/app-root/lib64/python3.8/site-packages/sqlite_utils/db.py"", line 412, in register_fts4_bm25
    self.register_function(rank_bm25, deterministic=True)
  File ""/opt/app-root/lib64/python3.8/site-packages/sqlite_utils/db.py"", line 408, in register_function
    register(fn)
  File ""/opt/app-root/lib64/python3.8/site-packages/sqlite_utils/db.py"", line 401, in register
    self.conn.create_function(name, arity, fn, **kwargs)
sqlite3.NotSupportedError: deterministic=True requires SQLite 3.8.3 or higher
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1180427792,
https://github.com/simonw/sqlite-utils/issues/421#issuecomment-1098295517,https://api.github.com/repos/simonw/sqlite-utils/issues/421,1098295517,IC_kwDOCGYnMM5Bdqjd,9599,2022-04-13T17:16:20Z,2022-04-13T17:16:20Z,OWNER,"Aha! I was able to replicate the bug using your `Dockerfile` - thanks very much for providing that.
```
(app-root) sqlite-utils indexes global.db --table
Error: near ""("": syntax error
```
(That was before I even ran the `extract` command.)

To build your `Dockerfile` I copied it into an empty folder and ran the following:
```
wget https://www.sqlite.org/2021/sqlite-autoconf-3360000.tar.gz
docker build . -t centos-sqlite-utils
docker run -it centos-sqlite-utils /bin/bash
```
This gave me a shell in which I could replicate the bug.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1180427792,
https://github.com/simonw/sqlite-utils/issues/421#issuecomment-1098288158,https://api.github.com/repos/simonw/sqlite-utils/issues/421,1098288158,IC_kwDOCGYnMM5Bdowe,9599,2022-04-13T17:07:53Z,2022-04-13T17:07:53Z,OWNER,"I can't replicate the bug I'm afraid:
```
% wget ""https://github.com/wri/global-power-plant-database/blob/232a6666/output_database/global_power_plant_database.csv?raw=true""               
...
2022-04-13 10:06:29 (8.97 MB/s) - ‘global_power_plant_database.csv?raw=true’ saved [8856038/8856038]
% sqlite-utils insert global.db power_plants \                      
    'global_power_plant_database.csv?raw=true' --csv
  [------------------------------------]    0%
  [###################################-]   99%  00:00:00%
% sqlite-utils indexes global.db --table                            
table    index_name    seqno    cid    name    desc    coll    key
-------  ------------  -------  -----  ------  ------  ------  -----
% sqlite-utils extract global.db power_plants country country_long \
    --table countries \
    --fk-column country_id \
    --rename country_long name
% sqlite-utils indexes global.db --table                            
table      index_name                    seqno    cid  name       desc  coll      key
---------  --------------------------  -------  -----  -------  ------  ------  -----
countries  idx_countries_country_name        0      1  country       0  BINARY      1
countries  idx_countries_country_name        1      2  name          0  BINARY      1
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1180427792,
https://github.com/simonw/datasette/issues/1712#issuecomment-1097115034,https://api.github.com/repos/simonw/datasette/issues/1712,1097115034,IC_kwDOBm6k_c5BZKWa,9599,2022-04-12T19:12:21Z,2022-04-12T19:12:21Z,OWNER,Got a TIL out of this too: https://til.simonwillison.net/spatialite/gunion-to-combine-geometries,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1202227104,
https://github.com/simonw/datasette/issues/1712#issuecomment-1097076622,https://api.github.com/repos/simonw/datasette/issues/1712,1097076622,IC_kwDOBm6k_c5BZA-O,9599,2022-04-12T18:42:04Z,2022-04-12T18:42:04Z,OWNER,I'm not going to show the tooltip if the formatted number is in bytes.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1202227104,
https://github.com/simonw/datasette/issues/1712#issuecomment-1097068474,https://api.github.com/repos/simonw/datasette/issues/1712,1097068474,IC_kwDOBm6k_c5BY--6,9599,2022-04-12T18:38:18Z,2022-04-12T18:38:18Z,OWNER,"<img width=""633"" alt=""image"" src=""https://user-images.githubusercontent.com/9599/163030785-9dcc5a21-6a1b-42a7-97de-10e7d2874412.png"">
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1202227104,
https://github.com/simonw/datasette/issues/1708#issuecomment-1095687566,https://api.github.com/repos/simonw/datasette/issues/1708,1095687566,IC_kwDOBm6k_c5BTt2O,9599,2022-04-11T23:24:30Z,2022-04-11T23:24:30Z,OWNER,"## Redesigned template context

**Warning:** if you use any custom templates with your Datasette instance they are likely to break when you upgrade to 1.0.

The template context has been redesigned to be based on the documented JSON API. This means that the template context can be considered stable going forward, so any custom templates you implement should continue to work when you upgrade Datasette in the future.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1200649124,