html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,issue,performed_via_github_app https://github.com/simonw/datasette/issues/1746#issuecomment-1133254599,https://api.github.com/repos/simonw/datasette/issues/1746,1133254599,IC_kwDOBm6k_c5DjBfH,9599,2022-05-20T19:33:08Z,2022-05-20T19:33:08Z,OWNER,Actually maybe I don't? I just noticed that on other pages on https://docs.datasette.io/en/stable/installation.html the only way to get back to that useful table of context / index page at https://docs.datasette.io/en/stable/index.html is by clicking the tiny house icon. Can I do better or should I have the logo do that?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1243498298, https://github.com/simonw/datasette/issues/1746#issuecomment-1133252598,https://api.github.com/repos/simonw/datasette/issues/1746,1133252598,IC_kwDOBm6k_c5DjA_2,9599,2022-05-20T19:31:30Z,2022-05-20T19:31:30Z,OWNER,"I'd also like to bring back this stable / latest / version indicator: ![CleanShot 2022-05-20 at 12 30 49@2x](https://user-images.githubusercontent.com/9599/169598732-e2093ec1-7eaf-40dd-acfa-1a7c31091ff1.png) ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1243498298, https://github.com/simonw/datasette/issues/1746#issuecomment-1133250151,https://api.github.com/repos/simonw/datasette/issues/1746,1133250151,IC_kwDOBm6k_c5DjAZn,9599,2022-05-20T19:29:37Z,2022-05-20T19:29:37Z,OWNER,"I want the Datasette logo in the sidebar to link to https://datasette.io/ Looks like I can do that by dropping in my own `sidebar/brand.html` template based on this: https://github.com/pradyunsg/furo/blob/0c2acbbd23f8146dd0ae50a2ba57258c1f63ea9f/src/furo/theme/furo/sidebar/brand.html","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1243498298, https://github.com/simonw/datasette/issues/1746#issuecomment-1133246791,https://api.github.com/repos/simonw/datasette/issues/1746,1133246791,IC_kwDOBm6k_c5Di_lH,9599,2022-05-20T19:26:49Z,2022-05-20T19:26:49Z,OWNER,"Putting this in the `css/custom.css` file seems to work for fixing that logo problem: ```css body[data-theme=""dark""] .sidebar-logo-container { background-color: white; padding: 5px; opacity: 0.6; } ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1243498298, https://github.com/simonw/datasette/issues/1746#issuecomment-1133242063,https://api.github.com/repos/simonw/datasette/issues/1746,1133242063,IC_kwDOBm6k_c5Di-bP,9599,2022-05-20T19:22:49Z,2022-05-20T19:22:49Z,OWNER,"I have some custom CSS in this file: https://github.com/simonw/datasette/blob/1465fea4798599eccfe7e8f012bd8d9adfac3039/docs/_static/css/custom.css#L1-L7 I tested and the `overflow-wrap: anywhere` is still needed for this fix: - #828 The `.wy-side-nav-search` bit is no longer needed with the new theme.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1243498298, https://github.com/simonw/datasette/issues/1748#issuecomment-1133232301,https://api.github.com/repos/simonw/datasette/issues/1748,1133232301,IC_kwDOBm6k_c5Di8Ct,9599,2022-05-20T19:15:00Z,2022-05-20T19:15:00Z,OWNER,"Now live on https://docs.datasette.io/en/latest/testing_plugins.html 
![copy](https://user-images.githubusercontent.com/9599/169596586-396eb6c7-ef5a-405a-bb21-348499478d9b.gif) ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1243517592, https://github.com/simonw/datasette/issues/1747#issuecomment-1133229196,https://api.github.com/repos/simonw/datasette/issues/1747,1133229196,IC_kwDOBm6k_c5Di7SM,9599,2022-05-20T19:12:30Z,2022-05-20T19:12:30Z,OWNER,https://docs.datasette.io/en/latest/getting_started.html#follow-a-tutorial,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1243512344, https://github.com/simonw/datasette/issues/1748#issuecomment-1133225441,https://api.github.com/repos/simonw/datasette/issues/1748,1133225441,IC_kwDOBm6k_c5Di6Xh,9599,2022-05-20T19:09:13Z,2022-05-20T19:09:13Z,OWNER,I'm going to add this Sphinx plugin: https://github.com/executablebooks/sphinx-copybutton,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1243517592, https://github.com/simonw/datasette/issues/1153#issuecomment-1133222848,https://api.github.com/repos/simonw/datasette/issues/1153,1133222848,IC_kwDOBm6k_c5Di5vA,9599,2022-05-20T19:07:10Z,2022-05-20T19:07:10Z,OWNER,I could use https://github.com/pradyunsg/sphinx-inline-tabs for this - recommended by https://pradyunsg.me/furo/recommendations/,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",771202454, https://github.com/simonw/datasette/issues/1746#issuecomment-1133217219,https://api.github.com/repos/simonw/datasette/issues/1746,1133217219,IC_kwDOBm6k_c5Di4XD,9599,2022-05-20T18:58:54Z,2022-05-20T18:58:54Z,OWNER,"Need to address other customizations I've made in https://github.com/simonw/datasette/blob/0.62a0/docs/_templates/layout.html - such as Plausible analytics and some custom JavaScript. https://github.com/simonw/datasette/blob/943aa2e1f7341cb51e60332cde46bde650c64217/docs/_templates/layout.html#L1-L61","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1243498298, https://github.com/simonw/datasette/issues/1746#issuecomment-1133215684,https://api.github.com/repos/simonw/datasette/issues/1746,1133215684,IC_kwDOBm6k_c5Di3_E,9599,2022-05-20T18:56:29Z,2022-05-20T18:56:29Z,OWNER,"One other problem: in dark mode the Datasette logo looks bad: This helps a bit: ```css .sidebar-logo-container { background-color: white; padding: 5px; opacity: 0.6; } ``` ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1243498298, https://github.com/simonw/datasette/issues/1746#issuecomment-1133210942,https://api.github.com/repos/simonw/datasette/issues/1746,1133210942,IC_kwDOBm6k_c5Di20-,9599,2022-05-20T18:49:40Z,2022-05-20T18:49:40Z,OWNER,"And for those local table of contents, do this: ```rst .. 
contents::
   :local:
   :class: this-will-duplicate-information-and-it-is-still-useful-here
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1243498298, https://github.com/simonw/datasette/issues/1746#issuecomment-1133210651,https://api.github.com/repos/simonw/datasette/issues/1746,1133210651,IC_kwDOBm6k_c5Di2wb,9599,2022-05-20T18:49:11Z,2022-05-20T18:49:11Z,OWNER,"I found a workaround for the no-longer-nested left hand navigation: drop this into `_templates/sidebar/navigation.html`: ```html+jinja
{{ toctree(
    collapse=True,
    titles_only=False,
    maxdepth=3,
    includehidden=True,
) }}
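{# These are the standard Sphinx toctree() template-context options - my reading
   is that maxdepth=3 together with titles_only=False is what brings the nested
   page headings back into the sidebar, so treat that as an assumption to verify
   against the Sphinx docs #}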
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1243498298, https://github.com/simonw/datasette/issues/1746#issuecomment-1133210032,https://api.github.com/repos/simonw/datasette/issues/1746,1133210032,IC_kwDOBm6k_c5Di2mw,9599,2022-05-20T18:48:17Z,2022-05-20T18:48:17Z,OWNER,"A couple of changes I want to make. First, I don't really like the way Furo keeps the in-page titles in a separate menu on the right rather than expanding them on the left. I like this: ![CleanShot 2022-05-20 at 11 43 33@2x](https://user-images.githubusercontent.com/9599/169592611-ac0f9bd2-ff99-49b6-88d3-92dace9d85a6.png) Furo wants to do this instead: I also still want to include those inline tables of contents on the two pages that have them: - https://docs.datasette.io/en/stable/installation.html - https://docs.datasette.io/en/stable/plugin_hooks.html","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1243498298, https://github.com/simonw/sqlite-utils/issues/425#issuecomment-1129332959,https://api.github.com/repos/simonw/sqlite-utils/issues/425,1129332959,IC_kwDOCGYnMM5DUEDf,102771161,2022-05-17T21:27:02Z,2022-05-17T21:27:02Z,NONE,"Hi, I'm trying to deploy my site using elasticbeanstalk and I keep getting this same error : deterministic=True requires SQLite 3.8.3 or higher I saw your previous solution that involves editing sqlite-utils/sqlite_utils/db.py file, but I'm curious as to how that will work in production.","{""total_count"": 5, ""+1"": 5, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1203842656, https://github.com/simonw/datasette/issues/1744#issuecomment-1129251699,https://api.github.com/repos/simonw/datasette/issues/1744,1129251699,IC_kwDOBm6k_c5DTwNz,9599,2022-05-17T19:44:47Z,2022-05-17T19:46:38Z,OWNER,Updated docs: https://docs.datasette.io/en/latest/getting_started.html#using-datasette-on-your-own-computer and https://docs.datasette.io/en/latest/cli-reference.html#datasette-serve-help,"{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1239008850, https://github.com/simonw/datasette/issues/1745#issuecomment-1129252603,https://api.github.com/repos/simonw/datasette/issues/1745,1129252603,IC_kwDOBm6k_c5DTwb7,9599,2022-05-17T19:45:51Z,2022-05-17T19:45:51Z,OWNER,Now documented here: https://docs.datasette.io/en/latest/contributing.html#running-cog,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1239080102, https://github.com/simonw/datasette/issues/1744#issuecomment-1129243427,https://api.github.com/repos/simonw/datasette/issues/1744,1129243427,IC_kwDOBm6k_c5DTuMj,9599,2022-05-17T19:35:02Z,2022-05-17T19:35:02Z,OWNER,"One thing to note is that the `datasette-copy-to-memory` plugin broke with a locked file, because it does this: https://github.com/simonw/datasette-copy-to-memory/blob/d541c18a78ae6f707a8f9b1e7fc4c020a9f68f2e/datasette_copy_to_memory/__init__.py#L27 ```python tmp.execute(""ATTACH DATABASE ? 
AS _copy_from"", [db.path]) ``` That would need to use a URI filename too for it to work with locked files.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1239008850, https://github.com/simonw/datasette/issues/1744#issuecomment-1129241873,https://api.github.com/repos/simonw/datasette/issues/1744,1129241873,IC_kwDOBm6k_c5DTt0R,9599,2022-05-17T19:33:16Z,2022-05-17T19:33:16Z,OWNER,"I'm going to skip adding a test for this - the test logic would have to be pretty convoluted to exercise it properly, and it's a pretty minor and low-risk feature in the scheme of things.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1239008850, https://github.com/simonw/datasette/issues/1744#issuecomment-1129241283,https://api.github.com/repos/simonw/datasette/issues/1744,1129241283,IC_kwDOBm6k_c5DTtrD,9599,2022-05-17T19:32:35Z,2022-05-17T19:32:35Z,OWNER,"I tried writing a test like this: ```python @pytest.mark.parametrize(""locked"", (True, False)) def test_locked_sqlite_db(tmp_path_factory, locked): dir = tmp_path_factory.mktemp(""test_locked_sqlite_db"") test_db = str(dir / ""test.db"") sqlite3.connect(test_db).execute(""create table t (id integer primary key)"") if locked: fp = open(test_db, ""w"") fcntl.lockf(fp.fileno(), fcntl.LOCK_EX) runner = CliRunner() result = runner.invoke( cli, [ ""serve"", ""--memory"", ""--get"", ""/test"", ], catch_exceptions=False, ) ``` But it didn't work, because the test runs in the same process - so taking an exclusive lock on that file didn't cause an error when the test later tried to access it via Datasette!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1239008850, https://github.com/simonw/datasette/issues/1744#issuecomment-1129187486,https://api.github.com/repos/simonw/datasette/issues/1744,1129187486,IC_kwDOBm6k_c5DTgie,9599,2022-05-17T18:28:49Z,2022-05-17T18:28:49Z,OWNER,I think I do that with `fcntl.flock()`: https://docs.python.org/3/library/fcntl.html#fcntl.flock,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1239008850, https://github.com/simonw/datasette/issues/1744#issuecomment-1129185356,https://api.github.com/repos/simonw/datasette/issues/1744,1129185356,IC_kwDOBm6k_c5DTgBM,9599,2022-05-17T18:26:26Z,2022-05-17T18:26:26Z,OWNER,Not sure how to test this - I'd need to open my own lock against a database file somehow.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1239008850, https://github.com/simonw/datasette/issues/1744#issuecomment-1129184908,https://api.github.com/repos/simonw/datasette/issues/1744,1129184908,IC_kwDOBm6k_c5DTf6M,9599,2022-05-17T18:25:57Z,2022-05-17T18:25:57Z,OWNER,"I knocked out a quick prototype of this and it worked! 
datasette ~/Library/Application\ Support/Google/Chrome/Default/History --nolock Here's the prototype diff: ```diff diff --git a/datasette/app.py b/datasette/app.py index b7b8437..f43700d 100644 --- a/datasette/app.py +++ b/datasette/app.py @@ -213,6 +213,7 @@ class Datasette: config_dir=None, pdb=False, crossdb=False, + nolock=False, ): assert config_dir is None or isinstance( config_dir, Path @@ -238,6 +239,7 @@ class Datasette: self.databases = collections.OrderedDict() self._refresh_schemas_lock = asyncio.Lock() self.crossdb = crossdb + self.nolock = nolock if memory or crossdb or not self.files: self.add_database(Database(self, is_memory=True), name=""_memory"") # memory_name is a random string so that each Datasette instance gets its own diff --git a/datasette/cli.py b/datasette/cli.py index 3c6e1b2..7e44665 100644 --- a/datasette/cli.py +++ b/datasette/cli.py @@ -452,6 +452,11 @@ def uninstall(packages, yes): is_flag=True, help=""Enable cross-database joins using the /_memory database"", ) +@click.option( + ""--nolock"", + is_flag=True, + help=""Ignore locking and open locked files in read-only mode"", +) @click.option( ""--ssl-keyfile"", help=""SSL key file"", @@ -486,6 +491,7 @@ def serve( open_browser, create, crossdb, + nolock, ssl_keyfile, ssl_certfile, return_instance=False, @@ -545,6 +551,7 @@ def serve( version_note=version_note, pdb=pdb, crossdb=crossdb, + nolock=nolock, ) # if files is a single directory, use that as config_dir= diff --git a/datasette/database.py b/datasette/database.py index 44d3266..fa55804 100644 --- a/datasette/database.py +++ b/datasette/database.py @@ -89,6 +89,8 @@ class Database: # mode=ro or immutable=1? if self.is_mutable: qs = ""?mode=ro"" + if self.ds.nolock: + qs += ""&nolock=1"" else: qs = ""?immutable=1"" assert not (write and not self.is_mutable) ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1239008850, https://github.com/simonw/datasette/issues/1742#issuecomment-1128064864,https://api.github.com/repos/simonw/datasette/issues/1742,1128064864,IC_kwDOBm6k_c5DPOdg,25778,2022-05-16T19:42:13Z,2022-05-16T19:42:13Z,CONTRIBUTOR,"Just to add a wrinkle here, this loads fine: https://alltheplaces-datasette.fly.dev/alltheplaces/places.geojson?_trace=1 But also, this doesn't add any trace data: https://alltheplaces-datasette.fly.dev/alltheplaces/places.json?_trace=1 What am I missing?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1237586379, https://github.com/simonw/datasette/issues/1742#issuecomment-1128052948,https://api.github.com/repos/simonw/datasette/issues/1742,1128052948,IC_kwDOBm6k_c5DPLjU,9599,2022-05-16T19:28:31Z,2022-05-16T19:28:31Z,OWNER,"The trace mechanism is a bit gnarly - it's actually done by some ASGI middleware I wrote, so I'm pretty sure the bug is in there somewhere: https://github.com/simonw/datasette/blob/280ff372ab30df244f6c54f6f3002da57334b3d7/datasette/tracer.py#L73","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1237586379, https://github.com/simonw/datasette/issues/1742#issuecomment-1128049716,https://api.github.com/repos/simonw/datasette/issues/1742,1128049716,IC_kwDOBm6k_c5DPKw0,25778,2022-05-16T19:24:44Z,2022-05-16T19:24:44Z,CONTRIBUTOR,"Where is `_trace` getting injected? And is it something a plugin should be able to handle? 
(If it is, I guess I should handle it in this case.)","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1237586379, https://github.com/simonw/datasette/issues/1742#issuecomment-1128033018,https://api.github.com/repos/simonw/datasette/issues/1742,1128033018,IC_kwDOBm6k_c5DPGr6,9599,2022-05-16T19:06:38Z,2022-05-16T19:06:38Z,OWNER,The same URL with `.json` instead works fine: https://calands.datasettes.com/calands/CPAD_2020a_SuperUnits.json?_sort=id&id__exact=4&_labels=on&_trace=1,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1237586379, https://github.com/simonw/sqlite-utils/issues/431#issuecomment-1126295407,https://api.github.com/repos/simonw/sqlite-utils/issues/431,1126295407,IC_kwDOCGYnMM5DIedv,738408,2022-05-13T17:47:32Z,2022-05-13T17:47:32Z,NONE,"I'd be happy to write a PR for this, if you think it's worth having.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1227571375, https://github.com/simonw/datasette/issues/741#issuecomment-1125342229,https://api.github.com/repos/simonw/datasette/issues/741,1125342229,IC_kwDOBm6k_c5DE1wV,25778,2022-05-12T19:21:16Z,2022-05-12T19:21:16Z,CONTRIBUTOR,"Came here to check if this had been flagged already. Was helping a colleague get something on Cloud Run and had to dig to find `--extra-options=""--setting sql_time_limit_ms 2500""`. If I get some time next week, maybe I'll try to tackle it. Would definitely make things easier to be able to do something like this: ```sh datasette publish cloudrun something.db --setting sql_time_limit_ms 2500 ``` ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",607223136, https://github.com/simonw/datasette/issues/1298#issuecomment-1125083348,https://api.github.com/repos/simonw/datasette/issues/1298,1125083348,IC_kwDOBm6k_c5DD2jU,7150,2022-05-12T14:43:51Z,2022-05-12T14:43:51Z,NONE,"user report: I found this issue because the first time I tried to use datasette for real, I displayed a large table, and thought there was no horizontal scroll bar at all. I didn't even consider that I had to scroll all the way to the end of the page to find it. Just chipping in to say that this confused me, and I didn't even find the scroll bar until after I saw this issue. I don't know what the right answer is, but IMO the UI should suggest to the user that there is a way to view the data that's hidden to the right.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",855476501, https://github.com/simonw/datasette/issues/1739#issuecomment-1117662420,https://api.github.com/repos/simonw/datasette/issues/1739,1117662420,IC_kwDOBm6k_c5CnizU,9599,2022-05-04T18:21:18Z,2022-05-04T18:21:18Z,OWNER,That prototype is now public: https://github.com/simonw/datasette-lite,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1223699280, https://github.com/simonw/sqlite-utils/issues/416#issuecomment-1116684581,https://api.github.com/repos/simonw/sqlite-utils/issues/416,1116684581,IC_kwDOCGYnMM5Cj0El,638427,2022-05-03T21:36:49Z,2022-05-03T21:36:49Z,NONE,"Thanks for addressing this @simonw! 
However, I just reinstalled sqlite-utils 3.26.1 and get an `ParserError: Unknown string format: None`: ``` sqlite-utils --version sqlite-utils, version 3.26.1 ``` ``` sqlite-utils convert idfpr.db license ""Original Issue Date"" ""r.parsedate(value)"" Traceback (most recent call last): File ""/home/matt/.local/lib/python3.9/site-packages/sqlite_utils/db.py"", line 2514, in convert_value return fn(v) File """", line 2, in fn File ""/home/matt/.local/lib/python3.9/site-packages/sqlite_utils/recipes.py"", line 19, in parsedate parser.parse(value, dayfirst=dayfirst, yearfirst=yearfirst) File ""/usr/lib/python3/dist-packages/dateutil/parser/_parser.py"", line 1374, in parse return DEFAULTPARSER.parse(timestr, **kwargs) File ""/usr/lib/python3/dist-packages/dateutil/parser/_parser.py"", line 649, in parse raise ParserError(""Unknown string format: %s"", timestr) dateutil.parser._parser.ParserError: Unknown string format: None Traceback (most recent call last): File ""/home/matt/.local/bin/sqlite-utils"", line 8, in sys.exit(cli()) File ""/usr/lib/python3/dist-packages/click/core.py"", line 829, in __call__ return self.main(*args, **kwargs) File ""/usr/lib/python3/dist-packages/click/core.py"", line 782, in main rv = self.invoke(ctx) File ""/usr/lib/python3/dist-packages/click/core.py"", line 1259, in invoke return _process_result(sub_ctx.command.invoke(sub_ctx)) File ""/usr/lib/python3/dist-packages/click/core.py"", line 1066, in invoke return ctx.invoke(self.callback, **ctx.params) File ""/usr/lib/python3/dist-packages/click/core.py"", line 610, in invoke return callback(*args, **kwargs) File ""/home/matt/.local/lib/python3.9/site-packages/sqlite_utils/cli.py"", line 2707, in convert db[table].convert( File ""/home/matt/.local/lib/python3.9/site-packages/sqlite_utils/db.py"", line 2530, in convert self.db.execute(sql, where_args or []) File ""/home/matt/.local/lib/python3.9/site-packages/sqlite_utils/db.py"", line 463, in execute return self.conn.execute(sql, parameters) sqlite3.OperationalError: user-defined function raised exception ``` I definitely have some invalid data in the db. Happy to send a copy if it's helpful.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1173023272, https://github.com/simonw/sqlite-utils/issues/430#issuecomment-1116336340,https://api.github.com/repos/simonw/sqlite-utils/issues/430,1116336340,IC_kwDOCGYnMM5CifDU,9308268,2022-05-03T17:03:31Z,2022-05-03T17:03:31Z,NONE,"So, the good news is that it appears that setting one of those PRAGMA statements fixed the issue of `table.extract()` method call on this large database completing (that I described above.) The bad news is that I'm not sure which one! 
I wonder if it's something system / environment specific about SQLite, or maybe something else going on.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1224112817, https://github.com/simonw/datasette/issues/1739#issuecomment-1116215371,https://api.github.com/repos/simonw/datasette/issues/1739,1116215371,IC_kwDOBm6k_c5CiBhL,9599,2022-05-03T15:12:16Z,2022-05-03T15:12:16Z,OWNER,"That worked - both DBs are 304 for me now on a subsequent load of the page: ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1223699280, https://github.com/simonw/datasette/issues/1739#issuecomment-1116183369,https://api.github.com/repos/simonw/datasette/issues/1739,1116183369,IC_kwDOBm6k_c5Ch5tJ,9599,2022-05-03T14:43:14Z,2022-05-03T14:43:14Z,OWNER,Relevant tests start here: https://github.com/simonw/datasette/blob/d60f163528f466b1127b2935c3b6869c34fd6545/tests/test_html.py#L395,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1223699280, https://github.com/simonw/datasette/issues/1739#issuecomment-1116180599,https://api.github.com/repos/simonw/datasette/issues/1739,1116180599,IC_kwDOBm6k_c5Ch5B3,9599,2022-05-03T14:40:32Z,2022-05-03T14:40:32Z,OWNER,"Database downloads are served here: https://github.com/simonw/datasette/blob/d60f163528f466b1127b2935c3b6869c34fd6545/datasette/views/database.py#L186-L192 Here's `AsgiFileDownload`: https://github.com/simonw/datasette/blob/d60f163528f466b1127b2935c3b6869c34fd6545/datasette/utils/asgi.py#L410-L430 I can add an `etag=` parameter to that and populate it with `db.hash`, if it is populated (which it always should be for immutable databases that can be downloaded).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1223699280, https://github.com/simonw/datasette/issues/1739#issuecomment-1116178727,https://api.github.com/repos/simonw/datasette/issues/1739,1116178727,IC_kwDOBm6k_c5Ch4kn,9599,2022-05-03T14:38:46Z,2022-05-03T14:38:46Z,OWNER,"Reminded myself how this works by reviewing `conditional-get`: https://github.com/simonw/conditional-get/blob/db6dfec0a296080aaf68fcd80e55fb3f0714e738/conditional_get/cli.py#L33-L52 Simply add a `If-None-Match: last-known-etag` header to the request and check that the response is a status 304 with an empty body.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1223699280, https://github.com/simonw/datasette/issues/1739#issuecomment-1115760104,https://api.github.com/repos/simonw/datasette/issues/1739,1115760104,IC_kwDOBm6k_c5CgSXo,9599,2022-05-03T05:50:19Z,2022-05-03T05:50:19Z,OWNER,Here's how Starlette does it: https://github.com/encode/starlette/blob/830f3486537916bae6b46948ff922adc14a22b7c/starlette/staticfiles.py#L213,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1223699280, https://github.com/simonw/datasette/issues/1732#issuecomment-1115542067,https://api.github.com/repos/simonw/datasette/issues/1732,1115542067,IC_kwDOBm6k_c5CfdIz,52649,2022-05-03T01:50:44Z,2022-05-03T01:50:44Z,NONE,"I haven’t set one up unfortunately. My time is very limited because we just had a baby. 
On Mon, May 2, 2022, at 6:42 PM, Simon Willison wrote: > > > Thanks, this definitely sounds like a bug. Do you have simple steps to reproduce this? > > > β€” > Reply to this email directly, view it on GitHub , or unsubscribe . > You are receiving this because you authored the thread.Message ID: ***@***.***> > ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1221849746, https://github.com/simonw/datasette/issues/1732#issuecomment-1115533820,https://api.github.com/repos/simonw/datasette/issues/1732,1115533820,IC_kwDOBm6k_c5CfbH8,9599,2022-05-03T01:42:25Z,2022-05-03T01:42:25Z,OWNER,"Thanks, this definitely sounds like a bug. Do you have simple steps to reproduce this?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1221849746, https://github.com/simonw/datasette/issues/1737#issuecomment-1115470180,https://api.github.com/repos/simonw/datasette/issues/1737,1115470180,IC_kwDOBm6k_c5CfLlk,9599,2022-05-02T23:39:29Z,2022-05-02T23:39:29Z,OWNER,"Test ran in 38 seconds and passed! https://github.com/simonw/datasette/runs/6265954274?check_suite_focus=true I'm going to have it run on every commit and PR.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1223459734, https://github.com/simonw/datasette/issues/1737#issuecomment-1115468193,https://api.github.com/repos/simonw/datasette/issues/1737,1115468193,IC_kwDOBm6k_c5CfLGh,9599,2022-05-02T23:35:26Z,2022-05-02T23:35:26Z,OWNER,"https://github.com/simonw/datasette/runs/6265915080?check_suite_focus=true failed but looks like it passed because I forgot to use `set -e` at the start of the bash script. It failed because it didn't have `build` available.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1223459734, https://github.com/simonw/datasette/issues/1737#issuecomment-1115464097,https://api.github.com/repos/simonw/datasette/issues/1737,1115464097,IC_kwDOBm6k_c5CfKGh,9599,2022-05-02T23:27:40Z,2022-05-02T23:27:40Z,OWNER,"I'm going to start off by running this manually - I may run it on every commit once this is all a little bit more stable. I can base the workflow on https://github.com/simonw/scrape-hacker-news-by-domain/blob/main/.github/workflows/scrape.yml","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1223459734, https://github.com/simonw/datasette/issues/1737#issuecomment-1115462720,https://api.github.com/repos/simonw/datasette/issues/1737,1115462720,IC_kwDOBm6k_c5CfJxA,9599,2022-05-02T23:25:03Z,2022-05-02T23:25:03Z,OWNER,"Here's a script that seems to work. It builds the wheel, starts a Python web server that serves the wheel, runs a test with `shot-scraper` and then shuts down the server again. ```bash #!/bin/bash # Build the wheel python3 -m build # Find name of wheel wheel=$(basename $(ls dist/*.whl)) # strip off the dist/ # Create a blank index page echo ' ' > dist/index.html # Run a server for that dist/ folder cd dist python3 -m http.server 8529 & cd .. 
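# Load Pyodide in a headless browser via shot-scraper, install the wheel from
# the local server, and fail loudly if the test query returns the wrong result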
shot-scraper javascript http://localhost:8529/ "" async () => { let pyodide = await loadPyodide(); await pyodide.loadPackage(['micropip', 'ssl', 'setuptools']); let output = await pyodide.runPythonAsync(\` import micropip await micropip.install('h11==0.12.0') await micropip.install('http://localhost:8529/$wheel') import ssl import setuptools from datasette.app import Datasette ds = Datasette(memory=True, settings={'num_sql_threads': 0}) (await ds.client.get('/_memory.json?sql=select+55+as+itworks&_shape=array')).text \`); if (JSON.parse(output)[0].itworks != 55) { throw 'Got ' + output + ', expected itworks: 55'; } return 'Test passed!'; } "" # Shut down the server pkill -f 'http.server 8529' ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1223459734, https://github.com/simonw/datasette/issues/1733#issuecomment-1115404729,https://api.github.com/repos/simonw/datasette/issues/1733,1115404729,IC_kwDOBm6k_c5Ce7m5,9599,2022-05-02T21:49:01Z,2022-05-02T21:49:38Z,OWNER,"That alpha release works! https://pyodide.org/en/stable/console.html ```pycon Welcome to the Pyodide terminal emulator 🐍 Python 3.10.2 (main, Apr 9 2022 20:52:01) on WebAssembly VM Type ""help"", ""copyright"", ""credits"" or ""license"" for more information. >>> import micropip >>> await micropip.install(""datasette==0.62a0"") >>> import ssl >>> import setuptools >>> from datasette.app import Datasette >>> ds = Datasette(memory=True, settings={""num_sql_threads"": 0}) >>> await ds.client.get(""/.json"") >>> (await ds.client.get(""/.json"")).json() {'_memory': {'name': '_memory', 'hash': None, 'color': 'a6c7b9', 'path': '/_memory', 'tables_and_views_truncated': [], 'tab les_and_views_more': False, 'tables_count': 0, 'table_rows_sum': 0, 'show_table_row_counts': False, 'hidden_table_rows_sum' : 0, 'hidden_tables_count': 0, 'views_count': 0, 'private': False}} >>> ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1223234932, https://github.com/simonw/datasette/issues/1733#issuecomment-1115318417,https://api.github.com/repos/simonw/datasette/issues/1733,1115318417,IC_kwDOBm6k_c5CemiR,9599,2022-05-02T20:13:43Z,2022-05-02T20:13:43Z,OWNER,This is good enough to push an alpha.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1223234932, https://github.com/simonw/datasette/issues/1733#issuecomment-1115318303,https://api.github.com/repos/simonw/datasette/issues/1733,1115318303,IC_kwDOBm6k_c5Cemgf,9599,2022-05-02T20:13:36Z,2022-05-02T20:13:36Z,OWNER,"I got a build from the `pyodide` branch to work! ``` Welcome to the Pyodide terminal emulator 🐍 Python 3.10.2 (main, Apr 9 2022 20:52:01) on WebAssembly VM Type ""help"", ""copyright"", ""credits"" or ""license"" for more information. >>> import micropip >>> await micropip.install(""https://s3.amazonaws.com/simonwillison-cors-allowed-public/datasette-0.62a0-py3-none-any.whl"") Traceback (most recent call last): File """", line 1, in File ""/lib/python3.10/asyncio/futures.py"", line 284, in __await__ yield self # This tells Task to wait for completion. 
File ""/lib/python3.10/asyncio/tasks.py"", line 304, in __wakeup future.result() File ""/lib/python3.10/asyncio/futures.py"", line 201, in result raise self._exception File ""/lib/python3.10/asyncio/tasks.py"", line 234, in __step result = coro.throw(exc) File ""/lib/python3.10/site-packages/micropip/_micropip.py"", line 183, in install transaction = await self.gather_requirements(requirements, ctx, keep_going) File ""/lib/python3.10/site-packages/micropip/_micropip.py"", line 173, in gather_requirements await gather(*requirement_promises) File ""/lib/python3.10/asyncio/futures.py"", line 284, in __await__ yield self # This tells Task to wait for completion. File ""/lib/python3.10/asyncio/tasks.py"", line 304, in __wakeup future.result() File ""/lib/python3.10/asyncio/futures.py"", line 201, in result raise self._exception File ""/lib/python3.10/asyncio/tasks.py"", line 232, in __step result = coro.send(None) File ""/lib/python3.10/site-packages/micropip/_micropip.py"", line 245, in add_requirement await self.add_wheel(name, wheel, version, (), ctx, transaction) File ""/lib/python3.10/site-packages/micropip/_micropip.py"", line 316, in add_wheel await self.add_requirement(recurs_req, ctx, transaction) File ""/lib/python3.10/site-packages/micropip/_micropip.py"", line 291, in add_requirement await self.add_wheel( File ""/lib/python3.10/site-packages/micropip/_micropip.py"", line 316, in add_wheel await self.add_requirement(recurs_req, ctx, transaction) File ""/lib/python3.10/site-packages/micropip/_micropip.py"", line 291, in add_requirement await self.add_wheel( File ""/lib/python3.10/site-packages/micropip/_micropip.py"", line 316, in add_wheel await self.add_requirement(recurs_req, ctx, transaction) File ""/lib/python3.10/site-packages/micropip/_micropip.py"", line 276, in add_requirement raise ValueError( ValueError: Requested 'h11<0.13,>=0.11', but h11==0.13.0 is already installed >>> await micropip.install(""https://s3.amazonaws.com/simonwillison-cors-allowed-public/datasette-0.62a0-py3-none-any.whl"") Traceback (most recent call last): File """", line 1, in File ""/lib/python3.10/asyncio/futures.py"", line 284, in __await__ yield self # This tells Task to wait for completion. File ""/lib/python3.10/asyncio/tasks.py"", line 304, in __wakeup future.result() File ""/lib/python3.10/asyncio/futures.py"", line 201, in result raise self._exception File ""/lib/python3.10/asyncio/tasks.py"", line 234, in __step result = coro.throw(exc) File ""/lib/python3.10/site-packages/micropip/_micropip.py"", line 183, in install transaction = await self.gather_requirements(requirements, ctx, keep_going) File ""/lib/python3.10/site-packages/micropip/_micropip.py"", line 173, in gather_requirements await gather(*requirement_promises) File ""/lib/python3.10/asyncio/futures.py"", line 284, in __await__ yield self # This tells Task to wait for completion. 
File ""/lib/python3.10/asyncio/tasks.py"", line 304, in __wakeup future.result() File ""/lib/python3.10/asyncio/futures.py"", line 201, in result raise self._exception File ""/lib/python3.10/asyncio/tasks.py"", line 232, in __step result = coro.send(None) File ""/lib/python3.10/site-packages/micropip/_micropip.py"", line 245, in add_requirement await self.add_wheel(name, wheel, version, (), ctx, transaction) File ""/lib/python3.10/site-packages/micropip/_micropip.py"", line 316, in add_wheel await self.add_requirement(recurs_req, ctx, transaction) File ""/lib/python3.10/site-packages/micropip/_micropip.py"", line 291, in add_requirement await self.add_wheel( File ""/lib/python3.10/site-packages/micropip/_micropip.py"", line 316, in add_wheel await self.add_requirement(recurs_req, ctx, transaction) File ""/lib/python3.10/site-packages/micropip/_micropip.py"", line 291, in add_requirement await self.add_wheel( File ""/lib/python3.10/site-packages/micropip/_micropip.py"", line 316, in add_wheel await self.add_requirement(recurs_req, ctx, transaction) File ""/lib/python3.10/site-packages/micropip/_micropip.py"", line 276, in add_requirement raise ValueError( ValueError: Requested 'h11<0.13,>=0.11', but h11==0.13.0 is already installed >>> await micropip.install(""h11==0.12"") >>> await micropip.install(""https://s3.amazonaws.com/simonwillison-cors-allowed-public/datasette-0.62a0-py3-none-any.whl"") >>> import datasette >>> from datasette.app import Datasette Traceback (most recent call last): File """", line 1, in File ""/lib/python3.10/site-packages/datasette/app.py"", line 9, in import httpx File ""/lib/python3.10/site-packages/httpx/__init__.py"", line 2, in from ._api import delete, get, head, options, patch, post, put, request, stream File ""/lib/python3.10/site-packages/httpx/_api.py"", line 4, in from ._client import Client File ""/lib/python3.10/site-packages/httpx/_client.py"", line 9, in from ._auth import Auth, BasicAuth, FunctionAuth File ""/lib/python3.10/site-packages/httpx/_auth.py"", line 10, in from ._models import Request, Response File ""/lib/python3.10/site-packages/httpx/_models.py"", line 16, in from ._content import ByteStream, UnattachedStream, encode_request, encode_response File ""/lib/python3.10/site-packages/httpx/_content.py"", line 17, in from ._multipart import MultipartStream File ""/lib/python3.10/site-packages/httpx/_multipart.py"", line 7, in from ._types import ( File ""/lib/python3.10/site-packages/httpx/_types.py"", line 5, in import ssl File ""/lib/python3.10/ssl.py"", line 98, in import _ssl # if we can't import it, let the error propagate ModuleNotFoundError: No module named '_ssl' >>> import ssl >>> from datasette.app import Datasette Traceback (most recent call last): File """", line 1, in File ""/lib/python3.10/site-packages/datasette/app.py"", line 14, in import pkg_resources ModuleNotFoundError: No module named 'pkg_resources' >>> import setuptools >>> from datasette.app import Datasette >>> ds = Datasette(memory=True) >>> ds >>> await ds.client.get(""/"") Traceback (most recent call last): File ""/lib/python3.10/site-packages/datasette/app.py"", line 1268, in route_path response = await view(request, send) File ""/lib/python3.10/site-packages/datasette/views/base.py"", line 134, in view return await self.dispatch_request(request) File ""/lib/python3.10/site-packages/datasette/views/base.py"", line 89, in dispatch_request await self.ds.refresh_schemas() File ""/lib/python3.10/site-packages/datasette/app.py"", line 353, in refresh_schemas await 
self._refresh_schemas() File ""/lib/python3.10/site-packages/datasette/app.py"", line 358, in _refresh_schemas await init_internal_db(internal_db) File ""/lib/python3.10/site-packages/datasette/utils/internal_db.py"", line 65, in init_internal_db await db.execute_write_script(create_tables_sql) File ""/lib/python3.10/site-packages/datasette/database.py"", line 116, in execute_write_script results = await self.execute_write_fn(_inner, block=block) File ""/lib/python3.10/site-packages/datasette/database.py"", line 155, in execute_write_fn self._write_thread.start() File ""/lib/python3.10/threading.py"", line 928, in start _start_new_thread(self._bootstrap, ()) RuntimeError: can't start new thread >>> ds = Datasette(memory=True, settings={""num_sql_threads"": 0}) >>> await ds.client.get(""/"") >>> (await ds.client.get(""/"")).text '\n\n\n Datasette: _memory\n \n \n\n\n\n
r detailsClickedWithin = null;\n while (target && target.tagName != \'DETAILS\') {\n target = target.parentNode;\ n }\n if (target && target.tagName == \'DETAILS\') {\n detailsClickedWithin = target;\n }\n Array.from(d ocument.getElementsByTagName(\'details\')).filter(\n (details) => details.open && details != detailsClickedWithin\n ).forEach(details => details.open = false);\n});\n\n\n\n\n\n\n ' >>> ``` That `ValueError: Requested 'h11<0.13,>=0.11', but h11==0.13.0 is already installed` error is annoying. I assume it's a `uvicorn` dependency clash of some sort, because I wasn't getting that when I removed `uvicorn` as a dependency. I can avoid it by running this first though: await micropip.install(""h11==0.12"")","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1223234932, https://github.com/simonw/datasette/issues/1735#issuecomment-1115301733,https://api.github.com/repos/simonw/datasette/issues/1735,1115301733,IC_kwDOBm6k_c5Ceidl,9599,2022-05-02T19:57:19Z,2022-05-02T19:59:03Z,OWNER,"This code breaks if that setting is 0: https://github.com/simonw/datasette/blob/a29c1277896b6a7905ef5441c42a37bc15f67599/datasette/app.py#L291-L293 It's used here: https://github.com/simonw/datasette/blob/a29c1277896b6a7905ef5441c42a37bc15f67599/datasette/database.py#L188-L190","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1223263540, https://github.com/simonw/datasette/issues/1733#issuecomment-1115288284,https://api.github.com/repos/simonw/datasette/issues/1733,1115288284,IC_kwDOBm6k_c5CefLc,9599,2022-05-02T19:40:33Z,2022-05-02T19:40:33Z,OWNER,"I'll release this as a `0.62a0` as soon as it's ready, so I can start testing it out in Pyodide for real.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1223234932, https://github.com/simonw/datasette/issues/1734#issuecomment-1115283922,https://api.github.com/repos/simonw/datasette/issues/1734,1115283922,IC_kwDOBm6k_c5CeeHS,9599,2022-05-02T19:35:32Z,2022-05-02T19:35:32Z,OWNER,I'll use my original from 2009: https://www.djangosnippets.org/snippets/1431/,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1223241647, https://github.com/simonw/datasette/issues/1734#issuecomment-1115282773,https://api.github.com/repos/simonw/datasette/issues/1734,1115282773,IC_kwDOBm6k_c5Ced1V,9599,2022-05-02T19:34:15Z,2022-05-02T19:34:15Z,OWNER,I'm going to vendor it and update the documentation.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1223241647, https://github.com/simonw/datasette/issues/1733#issuecomment-1115278325,https://api.github.com/repos/simonw/datasette/issues/1733,1115278325,IC_kwDOBm6k_c5Cecv1,9599,2022-05-02T19:29:05Z,2022-05-02T19:29:05Z,OWNER,"I'm going to add a Datasette setting to disable threading entirely, designed for usage in this particular case. 
I thought about adding a new setting, then I noticed this: datasette mydatabase.db --setting num_sql_threads 10 I'm going to let users set that to `0` to disable threaded execution of SQL queries.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1223234932, https://github.com/simonw/datasette/issues/1733#issuecomment-1115268245,https://api.github.com/repos/simonw/datasette/issues/1733,1115268245,IC_kwDOBm6k_c5CeaSV,9599,2022-05-02T19:18:11Z,2022-05-02T19:18:11Z,OWNER,"Maybe I can leave `uvicorn` as a dependency? Installing it works OK, it only generates errors when you try to import it: ```pycon Welcome to the Pyodide terminal emulator 🐍 Python 3.10.2 (main, Apr 9 2022 20:52:01) on WebAssembly VM Type ""help"", ""copyright"", ""credits"" or ""license"" for more information. >>> import micropip >>> await micropip.install(""uvicorn"") >>> import uvicorn Traceback (most recent call last): File """", line 1, in File ""/lib/python3.10/site-packages/uvicorn/__init__.py"", line 1, in from uvicorn.config import Config File ""/lib/python3.10/site-packages/uvicorn/config.py"", line 8, in import ssl File ""/lib/python3.10/ssl.py"", line 98, in import _ssl # if we can't import it, let the error propagate ModuleNotFoundError: No module named '_ssl' >>> import ssl >>> import uvicorn Traceback (most recent call last): File """", line 1, in File ""/lib/python3.10/site-packages/uvicorn/__init__.py"", line 2, in from uvicorn.main import Server, main, run File ""/lib/python3.10/site-packages/uvicorn/main.py"", line 24, in from uvicorn.supervisors import ChangeReload, Multiprocess File ""/lib/python3.10/site-packages/uvicorn/supervisors/__init__.py"", line 3, in from uvicorn.supervisors.basereload import BaseReload File ""/lib/python3.10/site-packages/uvicorn/supervisors/basereload.py"", line 12, in from uvicorn.subprocess import get_subprocess File ""/lib/python3.10/site-packages/uvicorn/subprocess.py"", line 14, in multiprocessing.allow_connection_pickling() File ""/lib/python3.10/multiprocessing/context.py"", line 170, in allow_connection_pickling from . import connection File ""/lib/python3.10/multiprocessing/connection.py"", line 21, in import _multiprocessing ModuleNotFoundError: No module named '_multiprocessing' >>> import multiprocessing >>> import uvicorn Traceback (most recent call last): File """", line 1, in File ""/lib/python3.10/site-packages/uvicorn/__init__.py"", line 2, in from uvicorn.main import Server, main, run File ""/lib/python3.10/site-packages/uvicorn/main.py"", line 24, in from uvicorn.supervisors import ChangeReload, Multiprocess File ""/lib/python3.10/site-packages/uvicorn/supervisors/__init__.py"", line 3, in from uvicorn.supervisors.basereload import BaseReload File ""/lib/python3.10/site-packages/uvicorn/supervisors/basereload.py"", line 12, in from uvicorn.subprocess import get_subprocess File ""/lib/python3.10/site-packages/uvicorn/subprocess.py"", line 14, in multiprocessing.allow_connection_pickling() File ""/lib/python3.10/multiprocessing/context.py"", line 170, in allow_connection_pickling from . import connection File ""/lib/python3.10/multiprocessing/connection.py"", line 21, in import _multiprocessing ModuleNotFoundError: No module named '_multiprocessing' >>> ``` Since the `import ssl` trick fixed the `_ssl` error I was hopeful that `import multiprocessing` could fix the `_multiprocessing` one, but sadly it did not. 
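Since only the import itself fails, a guarded import should be enough. A minimal sketch of that shape - my assumption, not the patch I've committed:

```python
# Sketch only: treat uvicorn as an optional dependency, so that
# `from datasette.app import Datasette` still works in environments
# like Pyodide where uvicorn cannot be imported
try:
    import uvicorn
except ImportError:
    uvicorn = None  # no server needed when driving Datasette via ds.client
```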
But it looks like i can address this issue just by making `import uvicorn` in `app.py` an optional import.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1223234932, https://github.com/simonw/datasette/issues/1733#issuecomment-1115262218,https://api.github.com/repos/simonw/datasette/issues/1733,1115262218,IC_kwDOBm6k_c5CeY0K,9599,2022-05-02T19:11:51Z,2022-05-02T19:14:01Z,OWNER,"Here's the full diff I applied to Datasette to get it fully working in Pyodide: https://github.com/simonw/datasette/compare/94a3171b01fde5c52697aeeff052e3ad4bab5391...8af32bc5b03c30b1f7a4a8cc4bd80eb7e2ee7b81 And as a visible diff: ```diff diff --git a/datasette/app.py b/datasette/app.py index d269372..6c0c5fc 100644 --- a/datasette/app.py +++ b/datasette/app.py @@ -15,7 +15,6 @@ import pkg_resources import re import secrets import sys -import threading import traceback import urllib.parse from concurrent import futures @@ -26,7 +25,6 @@ from itsdangerous import URLSafeSerializer from jinja2 import ChoiceLoader, Environment, FileSystemLoader, PrefixLoader from jinja2.environment import Template from jinja2.exceptions import TemplateNotFound -import uvicorn from .views.base import DatasetteError, ureg from .views.database import DatabaseDownload, DatabaseView @@ -813,7 +811,6 @@ class Datasette: }, ""datasette"": datasette_version, ""asgi"": ""3.0"", - ""uvicorn"": uvicorn.__version__, ""sqlite"": { ""version"": sqlite_version, ""fts_versions"": fts_versions, @@ -854,23 +851,7 @@ class Datasette: ] def _threads(self): - threads = list(threading.enumerate()) - d = { - ""num_threads"": len(threads), - ""threads"": [ - {""name"": t.name, ""ident"": t.ident, ""daemon"": t.daemon} for t in threads - ], - } - # Only available in Python 3.7+ - if hasattr(asyncio, ""all_tasks""): - tasks = asyncio.all_tasks() - d.update( - { - ""num_tasks"": len(tasks), - ""tasks"": [_cleaner_task_str(t) for t in tasks], - } - ) - return d + return {""num_threads"": 0, ""threads"": []} def _actor(self, request): return {""actor"": request.actor} diff --git a/datasette/database.py b/datasette/database.py index ba594a8..b50142d 100644 --- a/datasette/database.py +++ b/datasette/database.py @@ -4,7 +4,6 @@ from pathlib import Path import janus import queue import sys -import threading import uuid from .tracer import trace @@ -21,8 +20,6 @@ from .utils import ( ) from .inspect import inspect_hash -connections = threading.local() - AttachedDatabase = namedtuple(""AttachedDatabase"", (""seq"", ""name"", ""file"")) @@ -43,12 +40,12 @@ class Database: self.hash = None self.cached_size = None self._cached_table_counts = None - self._write_thread = None - self._write_queue = None if not self.is_mutable and not self.is_memory: p = Path(path) self.hash = inspect_hash(p) self.cached_size = p.stat().st_size + self._read_connection = None + self._write_connection = None @property def cached_table_counts(self): @@ -134,60 +131,17 @@ class Database: return results async def execute_write_fn(self, fn, block=True): - task_id = uuid.uuid5(uuid.NAMESPACE_DNS, ""datasette.io"") - if self._write_queue is None: - self._write_queue = queue.Queue() - if self._write_thread is None: - self._write_thread = threading.Thread( - target=self._execute_writes, daemon=True - ) - self._write_thread.start() - reply_queue = janus.Queue() - self._write_queue.put(WriteTask(fn, task_id, reply_queue)) - if block: - result = await reply_queue.async_q.get() - if isinstance(result, Exception): - raise 
result - else: - return result - else: - return task_id - - def _execute_writes(self): - # Infinite looping thread that protects the single write connection - # to this database - conn_exception = None - conn = None - try: - conn = self.connect(write=True) - self.ds._prepare_connection(conn, self.name) - except Exception as e: - conn_exception = e - while True: - task = self._write_queue.get() - if conn_exception is not None: - result = conn_exception - else: - try: - result = task.fn(conn) - except Exception as e: - sys.stderr.write(""{}\n"".format(e)) - sys.stderr.flush() - result = e - task.reply_queue.sync_q.put(result) + # We always treat it as if block=True now + if self._write_connection is None: + self._write_connection = self.connect(write=True) + self.ds._prepare_connection(self._write_connection, self.name) + return fn(self._write_connection) async def execute_fn(self, fn): - def in_thread(): - conn = getattr(connections, self.name, None) - if not conn: - conn = self.connect() - self.ds._prepare_connection(conn, self.name) - setattr(connections, self.name, conn) - return fn(conn) - - return await asyncio.get_event_loop().run_in_executor( - self.ds.executor, in_thread - ) + if self._read_connection is None: + self._read_connection = self.connect() + self.ds._prepare_connection(self._read_connection, self.name) + return fn(self._read_connection) async def execute( self, diff --git a/setup.py b/setup.py index 7f0562f..c41669c 100644 --- a/setup.py +++ b/setup.py @@ -44,20 +44,20 @@ setup( install_requires=[ ""asgiref>=3.2.10,<3.6.0"", ""click>=7.1.1,<8.2.0"", - ""click-default-group~=1.2.2"", + # ""click-default-group~=1.2.2"", ""Jinja2>=2.10.3,<3.1.0"", ""hupper~=1.9"", ""httpx>=0.20"", ""pint~=0.9"", ""pluggy>=1.0,<1.1"", - ""uvicorn~=0.11"", + # ""uvicorn~=0.11"", ""aiofiles>=0.4,<0.9"", ""janus>=0.6.2,<1.1"", ""asgi-csrf>=0.9"", ""PyYAML>=5.3,<7.0"", ""mergedeep>=1.1.1,<1.4.0"", ""itsdangerous>=1.1,<3.0"", - ""python-baseconv==1.2.2"", + # ""python-baseconv==1.2.2"", ], entry_points="""""" [console_scripts] ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1223234932, https://github.com/simonw/datasette/issues/1734#issuecomment-1115260999,https://api.github.com/repos/simonw/datasette/issues/1734,1115260999,IC_kwDOBm6k_c5CeYhH,9599,2022-05-02T19:10:34Z,2022-05-02T19:10:34Z,OWNER,"This is actually mostly a documentation thing: here: https://docs.datasette.io/en/0.61.1/authentication.html#including-an-expiry-time In the code it's only used in these two places: https://github.com/simonw/datasette/blob/0a7621f96f8ad14da17e7172e8a7bce24ef78966/datasette/actor_auth_cookie.py#L16-L20 https://github.com/simonw/datasette/blob/0a7621f96f8ad14da17e7172e8a7bce24ef78966/tests/test_auth.py#L56-L60","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1223241647, https://github.com/simonw/datasette/issues/1733#issuecomment-1115258737,https://api.github.com/repos/simonw/datasette/issues/1733,1115258737,IC_kwDOBm6k_c5CeX9x,9599,2022-05-02T19:08:17Z,2022-05-02T19:08:17Z,OWNER,"I was going to vendor `baseconv.py`, but then I reconsidered - what if there are plugins out there that expect `import baseconv` to work because they have dependend on Datasette? I used https://cs.github.com/ and as far as I can tell there aren't any! 
So I'm going to remove that dependency and work out a smarter way to do this - probably by providing a utility function within Datasette itself.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1223234932, https://github.com/simonw/datasette/issues/1733#issuecomment-1115256318,https://api.github.com/repos/simonw/datasette/issues/1733,1115256318,IC_kwDOBm6k_c5CeXX-,9599,2022-05-02T19:05:55Z,2022-05-02T19:05:55Z,OWNER,"I released a `click-default-group-wheel` package to solve that dependency issue. I've already upgraded `sqlite-utils` to that, so now you can use that in Pyodide: - https://github.com/simonw/sqlite-utils/pull/429 `python-baseconv` is only used for actor cookie expiration times: https://github.com/simonw/datasette/blob/0a7621f96f8ad14da17e7172e8a7bce24ef78966/datasette/actor_auth_cookie.py#L16-L20 Datasette never actually sets that cookie itself - it instead encourages plugins to set it in the authentication documentation here: https://docs.datasette.io/en/0.61.1/authentication.html#including-an-expiry-time","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1223234932, https://github.com/simonw/sqlite-utils/pull/429#issuecomment-1115196863,https://api.github.com/repos/simonw/sqlite-utils/issues/429,1115196863,IC_kwDOCGYnMM5CeI2_,9599,2022-05-02T18:03:47Z,2022-05-02T18:52:42Z,OWNER,"I made a build of this branch and tested it like this: https://pyodide.org/en/stable/console.html ```pycon >>> import micropip >>> await micropip.install(""https://s3.amazonaws.com/simonwillison-cors-allowed-public/sqlite_utils-3.26-py3-none-any.whl"") >>> import sqlite_utils >>> db = sqlite_utils.Database(memory=True) >>> list(db.query(""select 32443 + 55"")) [{'32443 + 55': 32498}] ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1223177069, https://github.com/simonw/sqlite-utils/pull/429#issuecomment-1115197644,https://api.github.com/repos/simonw/sqlite-utils/issues/429,1115197644,IC_kwDOCGYnMM5CeJDM,9599,2022-05-02T18:04:28Z,2022-05-02T18:04:28Z,OWNER,I'm going to ship this straight away as `3.26.1`.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1223177069, https://github.com/simonw/datasette/issues/1479#issuecomment-1114601882,https://api.github.com/repos/simonw/datasette/issues/1479,1114601882,IC_kwDOBm6k_c5Cb3ma,32839123,2022-05-02T08:10:27Z,2022-05-02T11:54:49Z,NONE,"Also ran into this issue today using `datasette package`. The stack trace takes up my whole PowerShell history, though (recursionerror), but it also concerns the temporary directory. Our development machines have a very zealous scanner that appears to insert itself between every call to the filesystem. I suspected that was causing some racing, but this turned out not to be the case: inserting `time.sleep(3)` on line 451 of `datasette/datasette/utils/__init__.py` does not make the problem go away. Commenting out the `tmp.cleanup()` line does. 
The next error I get is docker-specific, so that probably does resolve the Datasette error here.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1010112818, https://github.com/simonw/datasette/issues/1727#issuecomment-1114058210,https://api.github.com/repos/simonw/datasette/issues/1727,1114058210,IC_kwDOBm6k_c5CZy3i,9599,2022-04-30T21:39:34Z,2022-04-30T21:39:34Z,OWNER,"Something to consider if I look into subprocesses for parallel query execution: https://sqlite.org/howtocorrupt.html#_carrying_an_open_database_connection_across_a_fork_ > Do not open an SQLite database connection, then fork(), then try to use that database connection in the child process. All kinds of locking problems will result and you can easily end up with a corrupt database. SQLite is not designed to support that kind of behavior. Any database connection that is used in a child process must be opened in the child process, not inherited from the parent. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117, https://github.com/simonw/datasette/issues/1729#issuecomment-1114038259,https://api.github.com/repos/simonw/datasette/issues/1729,1114038259,IC_kwDOBm6k_c5CZt_z,9599,2022-04-30T19:06:03Z,2022-04-30T19:06:03Z,OWNER,"> but actually the facet results would be better if they were a list rather than a dictionary I think `facet_results` in the JSON should match this (used by the HTML) instead: https://github.com/simonw/datasette/blob/942411ef946e9a34a2094944d3423cddad27efd3/datasette/views/table.py#L737-L741 ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1219385669, https://github.com/simonw/datasette/issues/1729#issuecomment-1114036946,https://api.github.com/repos/simonw/datasette/issues/1729,1114036946,IC_kwDOBm6k_c5CZtrS,9599,2022-04-30T18:56:25Z,2022-04-30T19:04:03Z,OWNER,"Related: - #1558 Which talks about how there was confusion in this example: https://latest.datasette.io/fixtures/facetable.json?_facet=created&_facet_date=created&_facet=tags&_facet_array=tags&_nosuggest=1&_size=0 Which I fixed in #625 by introducing `tags` and `tags_2` keys, but actually the facet results would be better if they were a list rather than a dictionary.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1219385669, https://github.com/simonw/datasette/issues/1729#issuecomment-1114037521,https://api.github.com/repos/simonw/datasette/issues/1729,1114037521,IC_kwDOBm6k_c5CZt0R,9599,2022-04-30T19:01:07Z,2022-04-30T19:01:07Z,OWNER,"I had to look up what `hideable` means - it means that you can't hide the current facet because it was defined in metadata, not as a `?_facet=` parameter: https://github.com/simonw/datasette/blob/4e47a2d894b96854348343374c8e97c9d7055cf6/datasette/facets.py#L228 That's a bit of a weird thing to expose in the API. Maybe change that to `source` so it can be `metadata` or `request`? 
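Something like this shape is what I have in mind (a sketch only, field values illustrative):

```python
# Hypothetical facet info dict with 'source' replacing 'hideable'
facet_info = {
    'name': 'state',
    'type': 'column',
    'source': 'request',  # would be 'metadata' for facets defined in metadata.json
    'results': [],
}
```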
That's very slightly less coupled to how the UI works.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1219385669, https://github.com/simonw/datasette/issues/1729#issuecomment-1114013757,https://api.github.com/repos/simonw/datasette/issues/1729,1114013757,IC_kwDOBm6k_c5CZoA9,9599,2022-04-30T16:15:51Z,2022-04-30T18:54:39Z,OWNER,"Deployed a preview of this here: https://latest-1-0-alpha.datasette.io/ Examples: - https://latest-1-0-alpha.datasette.io/fixtures/facetable.json - https://latest-1-0-alpha.datasette.io/fixtures/facetable.json?_facet=state&_size=0&_extra=facet_results&_extra=count Second example produces: ```json { ""rows"": [], ""next"": null, ""next_url"": null, ""count"": 15, ""facet_results"": { ""state"": { ""name"": ""state"", ""type"": ""column"", ""hideable"": true, ""toggle_url"": ""/fixtures/facetable.json?_size=0&_extra=facet_results&_extra=count"", ""results"": [ { ""value"": ""CA"", ""label"": ""CA"", ""count"": 10, ""toggle_url"": ""https://latest-1-0-alpha.datasette.io/fixtures/facetable.json?_facet=state&_size=0&_extra=facet_results&_extra=count&state=CA"", ""selected"": false }, { ""value"": ""MI"", ""label"": ""MI"", ""count"": 4, ""toggle_url"": ""https://latest-1-0-alpha.datasette.io/fixtures/facetable.json?_facet=state&_size=0&_extra=facet_results&_extra=count&state=MI"", ""selected"": false }, { ""value"": ""MC"", ""label"": ""MC"", ""count"": 1, ""toggle_url"": ""https://latest-1-0-alpha.datasette.io/fixtures/facetable.json?_facet=state&_size=0&_extra=facet_results&_extra=count&state=MC"", ""selected"": false } ], ""truncated"": false } } } ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1219385669, https://github.com/simonw/datasette/issues/1727#issuecomment-1112889800,https://api.github.com/repos/simonw/datasette/issues/1727,1112889800,IC_kwDOBm6k_c5CVVnI,9599,2022-04-29T05:29:38Z,2022-04-29T05:29:38Z,OWNER,"OK, I just got the most incredible result with that! I started up a container running `bash` like this, from my `datasette` checkout. I'm mapping port 8005 on my laptop to port 8001 inside the container because laptop port 8001 was already doing something else: ``` docker run -it --rm --name my-running-script -p 8005:8001 -v ""$PWD"":/usr/src/myapp \ -w /usr/src/myapp nogil/python bash ``` Then in `bash` I ran the following commands to install Datasette and its dependencies: ``` pip install -e '.[test]' pip install datasette-pretty-traces # For debug tracing ``` Then I started Datasette against my `github.db` database (from github-to-sqlite.dogsheep.net/github.db) like this: ``` datasette github.db -h 0.0.0.0 --setting trace_debug 1 ``` I hit the following two URLs to compare the parallel v.s. not parallel implementations: - `http://127.0.0.1:8005/github/issues?_facet=milestone&_facet=repo&_trace=1&_size=10` - `http://127.0.0.1:8005/github/issues?_facet=milestone&_facet=repo&_trace=1&_size=10&_noparallel=1` And... the parallel one beat the non-parallel one decisively, on multiple page refreshes! Not parallel: 77ms Parallel: 47ms So yeah, I'm very confident this is a problem with the GIL. 
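(For anyone who wants to reproduce this, a rough timing harness - a sketch, assuming `httpx` is installed and the server is running on port 8005 as above:)

```python
import time
import httpx

# Compare the parallel and serial implementations of the same page
base = 'http://127.0.0.1:8005/github/issues?_facet=milestone&_facet=repo&_size=10'
for label, url in (('parallel', base), ('serial', base + '&_noparallel=1')):
    timings = []
    for _ in range(10):
        start = time.perf_counter()
        httpx.get(url)
        timings.append(time.perf_counter() - start)
    print(label, round(min(timings) * 1000, 1), 'ms (best of 10)')
```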
And I am absolutely **stunned** that @colesbury's fork ran Datasette (which has some reasonably tricky threading and async stuff going on) out of the box!","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117, https://github.com/simonw/datasette/issues/1727#issuecomment-1112879463,https://api.github.com/repos/simonw/datasette/issues/1727,1112879463,IC_kwDOBm6k_c5CVTFn,9599,2022-04-29T05:03:58Z,2022-04-29T05:03:58Z,OWNER,"It would be _really_ fun to try running this with the in-development `nogil` Python from https://github.com/colesbury/nogil There's a Docker container for it: https://hub.docker.com/r/nogil/python It suggests you can run something like this: docker run -it --rm --name my-running-script -v ""$PWD"":/usr/src/myapp \ -w /usr/src/myapp nogil/python python your-daemon-or-script.py","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117, https://github.com/simonw/datasette/issues/1727#issuecomment-1112878955,https://api.github.com/repos/simonw/datasette/issues/1727,1112878955,IC_kwDOBm6k_c5CVS9r,9599,2022-04-29T05:02:40Z,2022-04-29T05:02:40Z,OWNER,"Here's a very useful (recent) article about how the GIL works and how to think about it: https://pythonspeed.com/articles/python-gil/ - via https://lobste.rs/s/9hj80j/when_python_can_t_thread_deep_dive_into_gil From that article: > For example, let's consider an extension module written in C or Rust that lets you talk to a PostgreSQL database server. > > Conceptually, handling a SQL query with this library will go through three steps: > > 1. Deserialize from Python to the internal library representation. Since this will be reading Python objects, it needs to hold the GIL. > 2. Send the query to the database server, and wait for a response. This doesn't need the GIL. > 3. Convert the response into Python objects. This needs the GIL again. > > As you can see, how much parallelism you can get depends on how much time is spent in each step. If the bulk of time is spent in step 2, you'll get parallelism there. But if, for example, you run a `SELECT` and get a large number of rows back, the library will need to create many Python objects, and step 3 will have to hold GIL for a while. That explains what I'm seeing here. 
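(A toy benchmark that separates those two cases - a sketch, not Datasette code, and the timings include table setup so treat the numbers as rough:)

```python
import sqlite3
import threading
import time

def query(sql):
    conn = sqlite3.connect(':memory:')
    conn.execute('create table t (n integer)')
    conn.executemany('insert into t values (?)', ((i,) for i in range(500_000)))
    conn.execute(sql).fetchall()

def timed(sql, n_threads):
    threads = [threading.Thread(target=query, args=(sql,)) for _ in range(n_threads)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return round(time.perf_counter() - start, 2)

# Step 2 dominated: the aggregate runs inside SQLite with the GIL released
print('sum(n)   1 thread:', timed('select sum(n) from t', 1), '2 threads:', timed('select sum(n) from t', 2))
# Step 3 dominated: fetching 500k rows builds 500k Python objects under the GIL
print('all rows 1 thread:', timed('select n from t', 1), '2 threads:', timed('select n from t', 2))
```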
I'm pretty convinced now that the reason I'm not getting a performance boost from parallel queries is that there's more time spent in Python code assembling the results than in SQLite C code executing the query.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117, https://github.com/simonw/datasette/issues/1729#issuecomment-1112734577,https://api.github.com/repos/simonw/datasette/issues/1729,1112734577,IC_kwDOBm6k_c5CUvtx,9599,2022-04-28T23:08:42Z,2022-04-28T23:08:42Z,OWNER,"That prototype is a very small amount of code so far: ```diff diff --git a/datasette/renderer.py b/datasette/renderer.py index 4508949..b600e1b 100644 --- a/datasette/renderer.py +++ b/datasette/renderer.py @@ -28,6 +28,10 @@ def convert_specific_columns_to_json(rows, columns, json_cols): def json_renderer(args, data, view_name): """"""Render a response as JSON"""""" + from pprint import pprint + + pprint(data) + status_code = 200 # Handle the _json= parameter which may modify data[""rows""] @@ -43,6 +47,41 @@ def json_renderer(args, data, view_name): if ""rows"" in data and not value_as_boolean(args.get(""_json_infinity"", ""0"")): data[""rows""] = [remove_infinites(row) for row in data[""rows""]] + # Start building the default JSON here + columns = data[""columns""] + next_url = data.get(""next_url"") + output = { + ""rows"": [dict(zip(columns, row)) for row in data[""rows""]], + ""next"": data[""next""], + ""next_url"": next_url, + } + + extras = set(args.getlist(""_extra"")) + + extras_map = { + # _extra= : data[field] + ""count"": ""filtered_table_rows_count"", + ""facet_results"": ""facet_results"", + ""suggested_facets"": ""suggested_facets"", + ""columns"": ""columns"", + ""primary_keys"": ""primary_keys"", + ""query_ms"": ""query_ms"", + ""query"": ""query"", + } + for extra_key, data_key in extras_map.items(): + if extra_key in extras: + output[extra_key] = data[data_key] + + body = json.dumps(output, cls=CustomJSONEncoder) + content_type = ""application/json; charset=utf-8"" + headers = {} + if next_url: + headers[""link""] = f'<{next_url}>; rel=""next""' + return Response( + body, status=status_code, headers=headers, content_type=content_type + ) + + # Deal with the _shape option shape = args.get(""_shape"", ""arrays"") # if there's an error, ignore the shape entirely ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1219385669, https://github.com/simonw/datasette/issues/1729#issuecomment-1112732563,https://api.github.com/repos/simonw/datasette/issues/1729,1112732563,IC_kwDOBm6k_c5CUvOT,9599,2022-04-28T23:05:03Z,2022-04-28T23:05:03Z,OWNER,"OK, the prototype of this is looking really good - it's very pleasant to use. `http://127.0.0.1:8001/github_memory/issue_comments.json?_search=simon&_sort=id&_size=5&_extra=query_ms&_extra=count&_col=body` returns this: ```json { ""rows"": [ { ""id"": 338854988, ""body"": "" /database-name/table-name?name__contains=simon&sort=id+desc\r\n\r\nNote that if there's a column called \""sort\"" you can still do sort__exact=blah\r\n\r\n"" }, { ""id"": 346427794, ""body"": ""Thanks. There is a way to use pip to grab apsw, which also let's you configure it (flags to build extensions, use an internal sqlite, etc). 
Don't know how that works as a dependency for another package, though.\n\nOn November 22, 2017 11:38:06 AM EST, Simon Willison wrote:\n>I have a solution for FTS already, but I'm interested in apsw as a\n>mechanism for allowing custom virtual tables to be written in Python\n>(pysqlite only lets you write custom functions)\n>\n>Not having PyPI support is pretty tough though. I'm planning a\n>plugin/extension system which would be ideal for things like an\n>optional apsw mode, but that's a lot harder if apsw isn't in PyPI.\n>\n>-- \n>You are receiving this because you authored the thread.\n>Reply to this email directly or view it on GitHub:\n>https://github.com/simonw/datasette/issues/144#issuecomment-346405660\n"" }, { ""id"": 348252037, ""body"": ""WOW!\n\n--\nPaul Ford // (646) 369-7128 // @ftrain\n\nOn Thu, Nov 30, 2017 at 11:47 AM, Simon Willison \nwrote:\n\n> Remaining work on this now lives in a milestone:\n> https://github.com/simonw/datasette/milestone/6\n>\n> β€”\n> You are receiving this because you were mentioned.\n> Reply to this email directly, view it on GitHub\n> ,\n> or mute the thread\n> \n> .\n>\n"" }, { ""id"": 391141391, ""body"": ""I'm going to clean this up for consistency tomorrow morning so hold off\nmerging until then please\n\nOn Tue, May 22, 2018 at 6:34 PM, Simon Willison \nwrote:\n\n> Yeah let's try this without pysqlite3 and see if we still get the correct\n> version.\n>\n> β€”\n> You are receiving this because you authored the thread.\n> Reply to this email directly, view it on GitHub\n> , or mute\n> the thread\n> \n> .\n>\n"" }, { ""id"": 391355030, ""body"": ""No objections;\r\nIt's good to go @simonw\r\n\r\nOn Wed, 23 May 2018, 14:51 Simon Willison, wrote:\r\n\r\n> @r4vi any objections to me merging this?\r\n>\r\n> β€”\r\n> You are receiving this because you were mentioned.\r\n> Reply to this email directly, view it on GitHub\r\n> , or mute\r\n> the thread\r\n> \r\n> .\r\n>\r\n"" } ], ""next"": ""391355030,391355030"", ""next_url"": ""http://127.0.0.1:8001/github_memory/issue_comments.json?_search=simon&_size=5&_extra=query_ms&_extra=count&_col=body&_next=391355030%2C391355030&_sort=id"", ""count"": 57, ""query_ms"": 21.780223003588617 } ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1219385669, https://github.com/simonw/datasette/issues/1729#issuecomment-1112730416,https://api.github.com/repos/simonw/datasette/issues/1729,1112730416,IC_kwDOBm6k_c5CUusw,9599,2022-04-28T23:01:21Z,2022-04-28T23:01:21Z,OWNER,"I'm not sure what to do about the `""truncated"": true/false` key. It's not really relevant to table results, since they are paginated whether or not you ask for them to be. It plays a role in query results, where you might run `select * from table` and get back 1000 results because Datasette truncates at that point rather than returning everything. Adding it to every table result and always setting it to `""truncated"": false` feels confusing. 
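(To illustrate, here's the one place where the key does carry information - a sketch of a query endpoint response, not the final design:)

```python
# Hypothetical response for /db?sql=select+*+from+big_table
response = {
    'rows': [],           # up to max_returned_rows items
    'truncated': True,    # the SELECT matched more rows than were returned
}
```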
I think I'm going to keep it exclusively in the default representation for the `/db?sql=...` query endpoint, and not return it at all for tables.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1219385669,
https://github.com/simonw/datasette/issues/1729#issuecomment-1112721321,https://api.github.com/repos/simonw/datasette/issues/1729,1112721321,IC_kwDOBm6k_c5CUsep,9599,2022-04-28T22:44:05Z,2022-04-28T22:44:14Z,OWNER,I may be able to implement this mostly in the `json_renderer()` function: https://github.com/simonw/datasette/blob/94a3171b01fde5c52697aeeff052e3ad4bab5391/datasette/renderer.py#L29-L34,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1219385669,
https://github.com/simonw/datasette/issues/1729#issuecomment-1112717745,https://api.github.com/repos/simonw/datasette/issues/1729,1112717745,IC_kwDOBm6k_c5CUrmx,9599,2022-04-28T22:38:39Z,2022-04-28T22:39:05Z,OWNER,"(I remain keen on the idea of shipping a plugin that restores the old default API shape to people who have written pre-Datasette-1.0 code against it, but I'll tackle that much later. I really like how jQuery has a culture of doing this.)","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1219385669,
https://github.com/simonw/datasette/issues/1729#issuecomment-1112717210,https://api.github.com/repos/simonw/datasette/issues/1729,1112717210,IC_kwDOBm6k_c5CUrea,9599,2022-04-28T22:37:37Z,2022-04-28T22:37:37Z,OWNER,"This means `filtered_table_rows_count` is going to become `count`. I had originally picked that terrible name to avoid confusion between the count of all rows in the table and the count of rows that were filtered. I'll add `?_extra=table_count` for getting back the full table count instead. I think `count` is clear enough!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1219385669,
https://github.com/simonw/datasette/issues/1729#issuecomment-1112716611,https://api.github.com/repos/simonw/datasette/issues/1729,1112716611,IC_kwDOBm6k_c5CUrVD,9599,2022-04-28T22:36:24Z,2022-04-28T22:36:24Z,OWNER,"Then I'm going to implement the following `?_extra=` options:

- `?_extra=facet_results` - to see facet results
- `?_extra=suggested_facets` - for suggested facets
- `?_extra=count` - for the count of total rows
- `?_extra=columns` - for a list of column names
- `?_extra=primary_keys` - for a list of primary keys
- `?_extra=query` - a `{""sql"": ""select ..."", ""params"": {}}` object

I thought about having `?_extra=facet_results` returned automatically if the user specifies at least one `?_facet` - but that doesn't work for default facets configured in `metadata.json` - how can the user opt out of those being returned?

So I'm going to say you don't see facets at all if you don't include `?_extra=facet_results`.
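(Usage would end up looking something like this - a sketch against the preview deployment above, assuming it stays up and `httpx` is installed:)

```python
import httpx

# Ask for just the extras you need alongside the rows
url = 'https://latest-1-0-alpha.datasette.io/fixtures/facetable.json'
params = [('_facet', 'state'), ('_extra', 'count'), ('_extra', 'facet_results')]
data = httpx.get(url, params=params).json()
print(data['count'], list(data['facet_results']))
```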
I'm tempted to add `?_extra=_all` to return everything, but I can decide if that's a good idea later.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1219385669, https://github.com/simonw/datasette/issues/1729#issuecomment-1112713581,https://api.github.com/repos/simonw/datasette/issues/1729,1112713581,IC_kwDOBm6k_c5CUqlt,9599,2022-04-28T22:31:11Z,2022-04-28T22:31:11Z,OWNER,"I'm going to change the default API response to look like this: ```json { ""rows"": [ { ""pk"": 1, ""created"": ""2019-01-14 08:00:00"", ""planet_int"": 1, ""on_earth"": 1, ""state"": ""CA"", ""_city_id"": 1, ""_neighborhood"": ""Mission"", ""tags"": ""[\""tag1\"", \""tag2\""]"", ""complex_array"": ""[{\""foo\"": \""bar\""}]"", ""distinct_some_null"": ""one"", ""n"": ""n1"" }, { ""pk"": 2, ""created"": ""2019-01-14 08:00:00"", ""planet_int"": 1, ""on_earth"": 1, ""state"": ""CA"", ""_city_id"": 1, ""_neighborhood"": ""Dogpatch"", ""tags"": ""[\""tag1\"", \""tag3\""]"", ""complex_array"": ""[]"", ""distinct_some_null"": ""two"", ""n"": ""n2"" } ], ""next"": null, ""next_url"": null } ``` Basically https://latest.datasette.io/fixtures/facetable.json?_shape=objects but with just the `rows`, `next` and `next_url` fields returned by default.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1219385669, https://github.com/simonw/datasette/issues/1715#issuecomment-1112711115,https://api.github.com/repos/simonw/datasette/issues/1715,1112711115,IC_kwDOBm6k_c5CUp_L,9599,2022-04-28T22:26:56Z,2022-04-28T22:26:56Z,OWNER,"I'm not going to use `asyncinject` in this refactor - at least not until I really need it. My research in these issues has put me off the idea ( in favour of `asyncio.gather()` or even not trying for parallel execution at all): - #1727","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1212823665, https://github.com/simonw/datasette/issues/1727#issuecomment-1112668411,https://api.github.com/repos/simonw/datasette/issues/1727,1112668411,IC_kwDOBm6k_c5CUfj7,9599,2022-04-28T21:25:34Z,2022-04-28T21:25:44Z,OWNER,"The two most promising theories at the moment, from here and Twitter and the SQLite forum, are: - SQLite is I/O bound - it generally only goes as fast as it can load data from disk. Multiple connections all competing for the same file on disk are going to end up blocked at the file system layer. But maybe this means in-memory databases will perform better? - It's the GIL. The sqlite3 C code may release the GIL, but the bits that do things like assembling `Row` objects to return still happen in Python, and that Python can only run on a single core. A couple of ways to research the in-memory theory: - Use a RAM disk on macOS (or Linux). https://stackoverflow.com/a/2033417/6083 has instructions - short version: hdiutil attach -nomount ram://$((2 * 1024 * 100)) diskutil eraseVolume HFS+ RAMDisk name-returned-by-previous-command (was `/dev/disk2` when I tried it) cd /Volumes/RAMDisk cp ~/fixtures.db . - Copy Datasette databases into an in-memory database on startup. I built a new plugin to do that here: https://github.com/simonw/datasette-copy-to-memory I need to do some more, better benchmarks using these different approaches. https://twitter.com/laurencerowe/status/1519780174560169987 also suggests: > Maybe try: > 1. 
Copy the sqlite file to /dev/shm and rerun (all in ram.) > 2. Create a CTE which calculates Fibonacci or similar so you can test something completely cpu bound (only return max value or something to avoid crossing between sqlite/Python.) I like that second idea a lot - I could use the mandelbrot example from https://www.sqlite.org/lang_with.html#outlandish_recursive_query_examples","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117, https://github.com/simonw/datasette/issues/1633#issuecomment-1111955628,https://api.github.com/repos/simonw/datasette/issues/1633,1111955628,IC_kwDOBm6k_c5CRxis,6613091,2022-04-28T09:12:56Z,2022-04-28T09:12:56Z,NONE,I have verified that the problem with base_url still exists in the latest version 0.61.1. I would need some guidance if my code change suggestion is correct or if base_url should be included in some other code?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1129052172, https://github.com/simonw/datasette/issues/1728#issuecomment-1111752676,https://api.github.com/repos/simonw/datasette/issues/1728,1111752676,IC_kwDOBm6k_c5CQ__k,127565,2022-04-28T05:11:54Z,2022-04-28T05:11:54Z,CONTRIBUTOR,"And in terms of the bug, yep I agree that option 2 would be the most useful and least frustrating.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1218133366, https://github.com/simonw/datasette/issues/1728#issuecomment-1111751734,https://api.github.com/repos/simonw/datasette/issues/1728,1111751734,IC_kwDOBm6k_c5CQ_w2,127565,2022-04-28T05:09:59Z,2022-04-28T05:09:59Z,CONTRIBUTOR,"Thanks, I'll give it a try!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1218133366, https://github.com/simonw/datasette/issues/1727#issuecomment-1111726586,https://api.github.com/repos/simonw/datasette/issues/1727,1111726586,IC_kwDOBm6k_c5CQ5n6,9599,2022-04-28T04:17:16Z,2022-04-28T04:19:31Z,OWNER,"I could experiment with the `await asyncio.run_in_executor(processpool_executor, fn)` mechanism described in https://stackoverflow.com/a/29147750 Code examples: https://cs.github.com/?scopeName=All+repos&scope=&q=run_in_executor+ProcessPoolExecutor","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117, https://github.com/simonw/datasette/issues/1727#issuecomment-1111725638,https://api.github.com/repos/simonw/datasette/issues/1727,1111725638,IC_kwDOBm6k_c5CQ5ZG,9599,2022-04-28T04:15:15Z,2022-04-28T04:15:15Z,OWNER,"Useful theory from Keith Medcalf https://sqlite.org/forum/forumpost/e363c69d3441172e > This is true, but the concurrency is limited to the execution which occurs with the GIL released (that is, in the native C sqlite3 library itself). Each row (for example) can be retrieved in parallel but ""constructing the python return objects for each row"" will be serialized (by the GIL). > > That is to say that if your have two python threads each with their own connection, and each one is performing a select that returns 1,000,000 rows (lets say that is 25% of the candidates for each select) then the difference in execution time between executing two python threads in parallel vs a single serial thead will not be much different (if even detectable at all). 
> In fact it is possible that the multiple-threaded version takes longer to run both queries to completion because of the increased contention over a shared resource (the GIL).

So maybe this is a GIL thing. I should test with some expensive SQL queries (maybe big aggregations against large tables) and see if I can spot an improvement there.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117,
https://github.com/simonw/datasette/issues/1728#issuecomment-1111714665,https://api.github.com/repos/simonw/datasette/issues/1728,1111714665,IC_kwDOBm6k_c5CQ2tp,9599,2022-04-28T03:52:47Z,2022-04-28T03:52:58Z,OWNER,"Nice custom template/theme!

Yeah, for that I'd recommend hosting elsewhere - on a regular VPS (I use `systemd` like this: https://docs.datasette.io/en/stable/deploying.html#running-datasette-using-systemd ) or using Fly if you want to run containers without managing a full server.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1218133366,
https://github.com/simonw/datasette/issues/1728#issuecomment-1111712953,https://api.github.com/repos/simonw/datasette/issues/1728,1111712953,IC_kwDOBm6k_c5CQ2S5,127565,2022-04-28T03:48:36Z,2022-04-28T03:48:36Z,CONTRIBUTOR,"I don't think that'd work for this project. The db is very big, and my aim was to have an environment where researchers could be making use of the data, but be easily able to add corrections to the HTR/OCR extracted data when they came across problems. It's in its immutable (!) form here: https://sydney-stock-exchange-xqtkxtd5za-ts.a.run.app/stock_exchange/stocks","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1218133366,
https://github.com/simonw/datasette/issues/1728#issuecomment-1111708206,https://api.github.com/repos/simonw/datasette/issues/1728,1111708206,IC_kwDOBm6k_c5CQ1Iu,9599,2022-04-28T03:38:56Z,2022-04-28T03:38:56Z,OWNER,"In terms of this bug, there are a few potential fixes:

1. Detect the write to an immutable database and show the user a proper, meaningful error message in the red error box at the top of the page
2. Don't allow the user to even submit the form - show a message saying that this canned query is unavailable because the database cannot be written to
3. Don't even allow Datasette to start running at all - if there's a canned query configured in `metadata.yml` and the database it refers to is in `-i` immutable mode, throw an error on startup

I'm not keen on that last one because it would be frustrating if you couldn't launch Datasette just because you had an old canned query lying around in your metadata file. So I'm leaning towards option 2.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1218133366,
https://github.com/simonw/datasette/issues/1728#issuecomment-1111707384,https://api.github.com/repos/simonw/datasette/issues/1728,1111707384,IC_kwDOBm6k_c5CQ074,9599,2022-04-28T03:36:46Z,2022-04-28T03:36:56Z,OWNER,"A more realistic solution (which I've been using on several of my own projects) is to keep the data itself in GitHub and encourage users to edit it there - using the GitHub web interface to edit YAML files or similar.

Needs your users to be comfortable hand-editing YAML though!
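(Even a one-assertion test suite catches the most common hand-editing mistake - broken YAML. A minimal sketch, assuming a `metadata.yml` at the repo root and PyYAML installed:)

```python
import yaml

def test_metadata_parses():
    # Catches invalid YAML before it ever reaches the deploy step
    with open('metadata.yml') as fp:
        metadata = yaml.safe_load(fp)
    assert isinstance(metadata, dict)
```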
You can at least guard against critical errors by having CI run tests against their YAML before deploying. I have a dream of building a more friendly web forms interface which edits the YAML back on GitHub for the user, but that's just a concept at the moment. Even more fun would be if a user-friendly form could submit PRs for review without the user having to know what a PR is!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1218133366,
https://github.com/simonw/datasette/issues/1728#issuecomment-1111706519,https://api.github.com/repos/simonw/datasette/issues/1728,1111706519,IC_kwDOBm6k_c5CQ0uX,9599,2022-04-28T03:34:49Z,2022-04-28T03:34:49Z,OWNER,"I've wanted to do stuff like that on Cloud Run too. So far I've assumed that it's not feasible, but recently I've been wondering how hard it would be to have a small (like less than 100KB or so) Datasette instance which persists data to a backing GitHub repository such that when it starts up it can pull the latest copy and any time someone edits it can push their changes.

I'm still not sure it would work well on Cloud Run due to the uncertainty at what would happen if Cloud Run decided to boot up a second instance - but it's still an interesting thought exercise.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1218133366,
https://github.com/simonw/datasette/issues/1728#issuecomment-1111705323,https://api.github.com/repos/simonw/datasette/issues/1728,1111705323,IC_kwDOBm6k_c5CQ0br,127565,2022-04-28T03:32:06Z,2022-04-28T03:32:06Z,CONTRIBUTOR,"Ah, that would be it! I have a core set of data which doesn't change, to which I want authorised users to be able to submit corrections. I was going to deal with the persistence issue by just grabbing the user corrections at regular intervals and saving to GitHub. I might need to rethink. Thanks!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1218133366,
https://github.com/simonw/datasette/issues/1728#issuecomment-1111705069,https://api.github.com/repos/simonw/datasette/issues/1728,1111705069,IC_kwDOBm6k_c5CQ0Xt,9599,2022-04-28T03:31:33Z,2022-04-28T03:31:33Z,OWNER,"Confirmed - this is a bug where immutable databases fail to show a useful error if you write to them with a canned query.

Steps to reproduce:

```
echo '
databases:
  writable:
    queries:
      add_name:
        sql: insert into names(name) values (:name)
        write: true
' > write-metadata.yml
echo '{""name"": ""Simon""}' | sqlite-utils insert writable.db names -
datasette writable.db -m write-metadata.yml
```

Then visit http://127.0.0.1:8001/writable/add_name - adding names works.
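You can confirm the row landed from Python as well (an optional sketch using `sqlite-utils`):

```python
import sqlite_utils

# Should print the rows inserted through the canned query
db = sqlite_utils.Database('writable.db')
print(list(db['names'].rows))
```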
Now do this instead:

```
datasette -i writable.db -m write-metadata.yml
```

And I'm getting a broken error:

![error](https://user-images.githubusercontent.com/9599/165670823-6604dd69-9905-475c-8098-5da22ab026a1.gif)
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1218133366,
https://github.com/simonw/datasette/issues/1727#issuecomment-1111699175,https://api.github.com/repos/simonw/datasette/issues/1727,1111699175,IC_kwDOBm6k_c5CQy7n,9599,2022-04-28T03:19:48Z,2022-04-28T03:20:08Z,OWNER,"I ran `py-spy` and then hammered refresh a bunch of times on the `http://127.0.0.1:8856/github/commits?_facet=repo&_facet=committer&_trace=1&_noparallel=` page - it generated this SVG profile for me. The area on the right is the threads running the DB queries:

![profile](https://user-images.githubusercontent.com/9599/165669677-5461ede5-3dc4-4b49-8319-bfe5fd8a723d.svg)

Interactive version here: https://static.simonwillison.net/static/2022/datasette-parallel-profile.svg","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117,
https://github.com/simonw/datasette/issues/1728#issuecomment-1111698307,https://api.github.com/repos/simonw/datasette/issues/1728,1111698307,IC_kwDOBm6k_c5CQyuD,9599,2022-04-28T03:18:02Z,2022-04-28T03:18:02Z,OWNER,If the behaviour you are seeing is because the database is running in immutable mode then that's a bug - you should get a useful error message instead!,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1218133366,
https://github.com/simonw/datasette/issues/1728#issuecomment-1111697985,https://api.github.com/repos/simonw/datasette/issues/1728,1111697985,IC_kwDOBm6k_c5CQypB,9599,2022-04-28T03:17:20Z,2022-04-28T03:17:20Z,OWNER,"How did you deploy to Cloud Run? `datasette publish cloudrun` defaults to running databases there in `-i` immutable mode, because if you managed to change a file on disk on Cloud Run those changes would be lost the next time your container restarted there.

That's why I upgraded `datasette-publish-fly` to provide a way of working with their volumes support - they're the best option I know of right now for running Datasette in a container with a persistent volume that can accept writes: https://simonwillison.net/2022/Feb/15/fly-volumes/","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1218133366,
https://github.com/simonw/datasette/issues/1727#issuecomment-1111683539,https://api.github.com/repos/simonw/datasette/issues/1727,1111683539,IC_kwDOBm6k_c5CQvHT,9599,2022-04-28T02:47:57Z,2022-04-28T02:47:57Z,OWNER,"Maybe this is the Python GIL after all?

I've been hoping that the GIL won't be an issue because the `sqlite3` module releases the GIL for the duration of the execution of a SQL query - see https://github.com/python/cpython/blob/f348154c8f8a9c254503306c59d6779d4d09b3a9/Modules/_sqlite/cursor.c#L749-L759

So my hope has been that SQLite code itself can run concurrently on multiple cores even when Python threads cannot.
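(A quick way to check: run a slow query in one thread and see whether pure-Python work keeps executing in another - a sketch, nothing Datasette-specific:)

```python
import sqlite3
import threading
import time

def slow_query():
    conn = sqlite3.connect(':memory:')
    # Recursive CTE keeps SQLite busy inside C code for a few seconds
    conn.execute(
        'with recursive c(x) as (select 1 union all select x + 1 from c where x < 5000000) '
        'select max(x) from c'
    ).fetchone()

thread = threading.Thread(target=slow_query)
thread.start()
iterations = 0
while thread.is_alive():
    iterations += 1  # pure-Python work that needs the GIL
thread.join()
# A large number here means the query really did release the GIL
print('Python iterations while the query ran:', iterations)
```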
But maybe I'm misunderstanding how that works?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117,
https://github.com/simonw/datasette/issues/1727#issuecomment-1111681513,https://api.github.com/repos/simonw/datasette/issues/1727,1111681513,IC_kwDOBm6k_c5CQunp,9599,2022-04-28T02:44:26Z,2022-04-28T02:44:26Z,OWNER,"I could try `py-spy top`, which I previously used here:

- https://github.com/simonw/datasette/issues/1673","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117,
https://github.com/simonw/datasette/issues/1727#issuecomment-1111661331,https://api.github.com/repos/simonw/datasette/issues/1727,1111661331,IC_kwDOBm6k_c5CQpsT,9599,2022-04-28T02:07:31Z,2022-04-28T02:07:31Z,OWNER,Asked on the SQLite forum about this here: https://sqlite.org/forum/forumpost/ffbfa9f38e,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117,
https://github.com/simonw/datasette/issues/1727#issuecomment-1111602802,https://api.github.com/repos/simonw/datasette/issues/1727,1111602802,IC_kwDOBm6k_c5CQbZy,9599,2022-04-28T00:21:35Z,2022-04-28T00:21:35Z,OWNER,"Tried this but I'm getting back an empty JSON array of traces at the bottom of the page most of the time (intermittently it works correctly):

```diff
diff --git a/datasette/database.py b/datasette/database.py
index ba594a8..d7f9172 100644
--- a/datasette/database.py
+++ b/datasette/database.py
@@ -7,7 +7,7 @@
 import sys
 import threading
 import uuid
-from .tracer import trace
+from .tracer import trace, trace_child_tasks
 from .utils import (
     detect_fts,
     detect_primary_keys,
@@ -207,30 +207,31 @@ class Database:
                 time_limit_ms = custom_time_limit

             with sqlite_timelimit(conn, time_limit_ms):
-                try:
-                    cursor = conn.cursor()
-                    cursor.execute(sql, params if params is not None else {})
-                    max_returned_rows = self.ds.max_returned_rows
-                    if max_returned_rows == page_size:
-                        max_returned_rows += 1
-                    if max_returned_rows and truncate:
-                        rows = cursor.fetchmany(max_returned_rows + 1)
-                        truncated = len(rows) > max_returned_rows
-                        rows = rows[:max_returned_rows]
-                    else:
-                        rows = cursor.fetchall()
-                        truncated = False
-                except (sqlite3.OperationalError, sqlite3.DatabaseError) as e:
-                    if e.args == (""interrupted"",):
-                        raise QueryInterrupted(e, sql, params)
-                    if log_sql_errors:
-                        sys.stderr.write(
-                            ""ERROR: conn={}, sql = {}, params = {}: {}\n"".format(
-                                conn, repr(sql), params, e
+                with trace(""sql"", database=self.name, sql=sql.strip(), params=params):
+                    try:
+                        cursor = conn.cursor()
+                        cursor.execute(sql, params if params is not None else {})
+                        max_returned_rows = self.ds.max_returned_rows
+                        if max_returned_rows == page_size:
+                            max_returned_rows += 1
+                        if max_returned_rows and truncate:
+                            rows = cursor.fetchmany(max_returned_rows + 1)
+                            truncated = len(rows) > max_returned_rows
+                            rows = rows[:max_returned_rows]
+                        else:
+                            rows = cursor.fetchall()
+                            truncated = False
+                    except (sqlite3.OperationalError, sqlite3.DatabaseError) as e:
+                        if e.args == (""interrupted"",):
+                            raise QueryInterrupted(e, sql, params)
+                        if log_sql_errors:
+                            sys.stderr.write(
+                                ""ERROR: conn={}, sql = {}, params = {}: {}\n"".format(
+                                    conn, repr(sql), params, e
+                                )
                             )
-                        )
-                    sys.stderr.flush()
-                    raise
+                        sys.stderr.flush()
+                        raise

             if truncate:
                 return Results(rows, truncated, cursor.description)
@@ -238,9 +239,8 @@ class Database:
             else:
                 return Results(rows, False, cursor.description)

-        with trace(""sql"", database=self.name, sql=sql.strip(), params=params):
-            results = await self.execute_fn(sql_operation_in_thread)
-            return results
+        with trace_child_tasks():
+            return await self.execute_fn(sql_operation_in_thread)

     @property
     def size(self):
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117,
https://github.com/simonw/datasette/issues/1727#issuecomment-1111597176,https://api.github.com/repos/simonw/datasette/issues/1727,1111597176,IC_kwDOBm6k_c5CQaB4,9599,2022-04-28T00:11:44Z,2022-04-28T00:11:44Z,OWNER,"Though it would be interesting to also have the trace reveal how much time is spent in the functions that wrap that core SQL - the stuff that is being measured at the moment.

I have a hunch that this could help solve the over-arching performance mystery.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117,
https://github.com/simonw/datasette/issues/1727#issuecomment-1111595319,https://api.github.com/repos/simonw/datasette/issues/1727,1111595319,IC_kwDOBm6k_c5CQZk3,9599,2022-04-28T00:09:45Z,2022-04-28T00:11:01Z,OWNER,"Here's where read queries are instrumented: https://github.com/simonw/datasette/blob/7a6654a253dee243518dc542ce4c06dbb0d0801d/datasette/database.py#L241-L242

So the instrumentation is actually capturing quite a bit of Python activity before it gets to SQLite: https://github.com/simonw/datasette/blob/7a6654a253dee243518dc542ce4c06dbb0d0801d/datasette/database.py#L179-L190

And then: https://github.com/simonw/datasette/blob/7a6654a253dee243518dc542ce4c06dbb0d0801d/datasette/database.py#L204-L233

Ideally I'd like that `trace()` block to wrap just the `cursor.execute()` and `cursor.fetchmany(...)` or `cursor.fetchall()` calls.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117,
https://github.com/simonw/datasette/issues/1727#issuecomment-1111558204,https://api.github.com/repos/simonw/datasette/issues/1727,1111558204,IC_kwDOBm6k_c5CQQg8,9599,2022-04-27T22:58:39Z,2022-04-27T22:58:39Z,OWNER,"I should check my timing mechanism. Am I capturing the time taken just in SQLite or does it include time spent in Python crossing between async and threaded world and waiting for a thread pool worker to become available?

That could explain the longer query times.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117,
https://github.com/simonw/datasette/issues/1727#issuecomment-1111553029,https://api.github.com/repos/simonw/datasette/issues/1727,1111553029,IC_kwDOBm6k_c5CQPQF,9599,2022-04-27T22:48:21Z,2022-04-27T22:48:21Z,OWNER,I wonder if it would be worth exploring multiprocessing here.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1217759117,
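Something like this is the shape that multiprocessing idea could take (a sketch only - real Datasette integration would be much more involved, and each worker must open its own connection rather than inheriting one across `fork()`, per the howtocorrupt warning above):

```python
import asyncio
import sqlite3
from concurrent.futures import ProcessPoolExecutor

def run_query(db_path, sql):
    # Opened inside the worker process, never inherited from the parent
    conn = sqlite3.connect('file:{}?mode=ro'.format(db_path), uri=True)
    return conn.execute(sql).fetchall()

async def main():
    loop = asyncio.get_running_loop()
    with ProcessPoolExecutor() as pool:
        # Two queries running in separate processes, each on its own core
        futures = [
            loop.run_in_executor(pool, run_query, 'github.db', sql)
            for sql in ('select count(*) from issues', 'select count(*) from issue_comments')
        ]
        print(await asyncio.gather(*futures))

if __name__ == '__main__':
    asyncio.run(main())
```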