{"html_url": "https://github.com/simonw/datasette/issues/1519#issuecomment-974562942", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1519", "id": 974562942, "node_id": "IC_kwDOBm6k_c46FqZ-", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-20T00:59:32Z", "updated_at": "2021-11-20T00:59:32Z", "author_association": "OWNER", "body": "Ouch a nasty bug crept through there - https://datasette-apache-proxy-demo-j7hipcg4aq-uc.a.run.app/prefix/fixtures/compound_three_primary_keys says\r\n\r\n> 500: name 'ds' is not defined", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058790545, "label": "base_url is omitted in JSON and CSV views"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1519#issuecomment-974561593", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1519", "id": 974561593, "node_id": "IC_kwDOBm6k_c46FqE5", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-20T00:53:19Z", "updated_at": "2021-11-20T00:53:19Z", "author_association": "OWNER", "body": "Adding that test found (I hope!) all of the remaining `base_url` bugs. There were a bunch! 
I think I finally get to close #838 too.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058790545, "label": "base_url is omitted in JSON and CSV views"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1519#issuecomment-974559176", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1519", "id": 974559176, "node_id": "IC_kwDOBm6k_c46FpfI", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-20T00:42:08Z", "updated_at": "2021-11-20T00:42:08Z", "author_association": "OWNER", "body": "> In the meantime I can catch these errors by changing the test to run each path twice, once with and once without the prefix. This should accurately simulate how Apache is working here.\r\n\r\nThis worked, I managed to get the tests to fail! Here's the change I made:\r\n\r\n```diff\r\ndiff --git a/tests/test_html.py b/tests/test_html.py\r\nindex f24165b..dbdfe59 100644\r\n--- a/tests/test_html.py\r\n+++ b/tests/test_html.py\r\n@@ -1614,12 +1614,19 @@ def test_metadata_sort_desc(app_client):\r\n \"/fixtures/compound_three_primary_keys/a,a,a\",\r\n \"/fixtures/paginated_view\",\r\n \"/fixtures/facetable\",\r\n+ \"/fixtures?sql=select+1\",\r\n ],\r\n )\r\n-def test_base_url_config(app_client_base_url_prefix, path):\r\n+@pytest.mark.parametrize(\"use_prefix\", (True, False))\r\n+def test_base_url_config(app_client_base_url_prefix, path, use_prefix):\r\n client = app_client_base_url_prefix\r\n- response = client.get(\"/prefix/\" + path.lstrip(\"/\"))\r\n+ path_to_get = path\r\n+ if use_prefix:\r\n+ path_to_get = \"/prefix/\" + path.lstrip(\"/\")\r\n+ response = client.get(path_to_get)\r\n soup = Soup(response.body, \"html.parser\")\r\n+ if path == \"/fixtures?sql=select+1\":\r\n+ assert False\r\n for el in soup.findAll([\"a\", \"link\", \"script\"]):\r\n if \"href\" in el.attrs:\r\n href = 
el[\"href\"]\r\n@@ -1642,11 +1649,12 @@ def test_base_url_config(app_client_base_url_prefix, path):\r\n # If this has been made absolute it may start http://localhost/\r\n if href.startswith(\"http://localhost/\"):\r\n href = href[len(\"http://localost/\") :]\r\n- assert href.startswith(\"/prefix/\"), {\r\n+ assert href.startswith(\"/prefix/\"), json.dumps({\r\n \"path\": path,\r\n+ \"path_to_get\": path_to_get,\r\n \"href_or_src\": href,\r\n \"element_parent\": str(el.parent),\r\n- }\r\n+ }, indent=4, default=repr)\r\n \r\n \r\n def test_base_url_affects_metadata_extra_css_urls(app_client_base_url_prefix):\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058790545, "label": "base_url is omitted in JSON and CSV views"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1519#issuecomment-974558267", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1519", "id": 974558267, "node_id": "IC_kwDOBm6k_c46FpQ7", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-20T00:37:57Z", "updated_at": "2021-11-20T00:37:57Z", "author_association": "OWNER", "body": "Thanks to #1522 I have a live demo that exhibits this bug now: https://apache-proxy-demo.datasette.io/prefix/fixtures/attraction_characteristic", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058790545, "label": "base_url is omitted in JSON and CSV views"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1522#issuecomment-974558076", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1522", "id": 974558076, "node_id": "IC_kwDOBm6k_c46FpN8", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-20T00:36:56Z", "updated_at": 
"2021-11-20T00:36:56Z", "author_association": "OWNER", "body": "That 503 error is _really_ frustrating: I have a deploy running at https://apache-proxy-demo.datasette.io/prefix/ and after a fresh deploy it serves 503 errors for quite a while - then eventually starts working.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058896236, "label": "Deploy a live instance of demos/apache-proxy"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1522#issuecomment-974557766", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1522", "id": 974557766, "node_id": "IC_kwDOBm6k_c46FpJG", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-20T00:35:25Z", "updated_at": "2021-11-20T00:35:25Z", "author_association": "OWNER", "body": "Wrote a TIL about `--build-arg` and Cloud Run: https://til.simonwillison.net/cloudrun/using-build-args-with-cloud-run", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058896236, "label": "Deploy a live instance of demos/apache-proxy"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1522#issuecomment-974542348", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1522", "id": 974542348, "node_id": "IC_kwDOBm6k_c46FlYM", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T23:41:47Z", "updated_at": "2021-11-19T23:44:07Z", "author_association": "OWNER", "body": "Do I have to use `cloudbuild.yml` to specify these? 
https://stackoverflow.com/a/58327340/6083 and https://stackoverflow.com/a/66232670/6083 suggest I do.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058896236, "label": "Deploy a live instance of demos/apache-proxy"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1522#issuecomment-974541971", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1522", "id": 974541971, "node_id": "IC_kwDOBm6k_c46FlST", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T23:40:32Z", "updated_at": "2021-11-19T23:40:32Z", "author_association": "OWNER", "body": "I want to be able to use build arguments to specify which commit version or branch of Datasette to deploy.\r\n\r\nThis is proving hard to work out. I have this in my Dockerfile now:\r\n\r\n```\r\nARG DATASETTE_REF\r\n\r\nRUN pip install https://github.com/simonw/datasette/archive/${DATASETTE_REF}.zip\r\n```\r\nWhich works locally:\r\n\r\n docker build -t datasette-apache-proxy-demo . 
\\\r\n --build-arg DATASETTE_REF=c617e1769ea27e045b0f2907ef49a9a1244e577d\r\n\r\nBut I can't figure out the right incantation to pass to `gcloud builds submit`.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058896236, "label": "Deploy a live instance of demos/apache-proxy"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1522#issuecomment-974523569", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1522", "id": 974523569, "node_id": "IC_kwDOBm6k_c46Fgyx", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T22:51:10Z", "updated_at": "2021-11-19T22:51:10Z", "author_association": "OWNER", "body": "I want a GitHub Action which I can manually activate to deploy a new version of that demo... and I want it to bake in the latest release of Datasette so I can use it to demonstrate bug fixes.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058896236, "label": "Deploy a live instance of demos/apache-proxy"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1522#issuecomment-974523297", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1522", "id": 974523297, "node_id": "IC_kwDOBm6k_c46Fguh", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T22:50:31Z", "updated_at": "2021-11-19T22:50:31Z", "author_association": "OWNER", "body": "Demo code is now at: https://github.com/simonw/datasette/tree/main/demos/apache-proxy", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058896236, "label": "Deploy a live instance of demos/apache-proxy"}, "performed_via_github_app": null} 
{"html_url": "https://github.com/simonw/datasette/issues/1522#issuecomment-974521687", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1522", "id": 974521687, "node_id": "IC_kwDOBm6k_c46FgVX", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T22:46:26Z", "updated_at": "2021-11-19T22:46:26Z", "author_association": "OWNER", "body": "Oh weird, it started working: https://datasette-apache-proxy-demo-j7hipcg4aq-uc.a.run.app/prefix/fixtures/sortable", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058896236, "label": "Deploy a live instance of demos/apache-proxy"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1522#issuecomment-974506401", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1522", "id": 974506401, "node_id": "IC_kwDOBm6k_c46Fcmh", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T22:11:51Z", "updated_at": "2021-11-19T22:11:51Z", "author_association": "OWNER", "body": "This is frustrating: I have the following Dockerfile:\r\n```dockerfile\r\nFROM python:3-alpine\r\n\r\nRUN apk add --no-cache \\\r\n\tapache2 \\\r\n\tapache2-proxy \\\r\n\tbash\r\n\r\nRUN pip install datasette\r\n\r\nENV TINI_VERSION v0.18.0\r\nADD https://github.com/krallin/tini/releases/download/${TINI_VERSION}/tini-static /tini\r\nRUN chmod +x /tini\r\n\r\n# Append this to the end of the default httpd.conf file\r\nRUN echo $'ServerName localhost\\n\\\r\n\\n\\\r\n<Proxy *>\\n\\\r\n Order deny,allow\\n\\\r\n Allow from all\\n\\\r\n</Proxy>\\n\\\r\n\\n\\\r\nProxyPass /prefix/ http://localhost:8001/\\n\\\r\nHeader add X-Proxied-By \"Apache2\"' >> /etc/apache2/httpd.conf\r\n\r\nRUN echo $'Datasette' > /var/www/localhost/htdocs/index.html\r\n\r\nWORKDIR /app\r\n\r\nADD https://latest.datasette.io/fixtures.db /app/fixtures.db\r\n\r\nRUN echo $'#!/usr/bin/env bash\\n\\\r\nset 
-e\\n\\\r\n\\n\\\r\nhttpd -D FOREGROUND &\\n\\\r\ndatasette fixtures.db --setting base_url \"/prefix/\" -h 0.0.0.0 -p 8001 &\\n\\\r\n\\n\\\r\nwait -n' > /app/start.sh\r\n\r\nRUN chmod +x /app/start.sh\r\n\r\nEXPOSE 80\r\nENTRYPOINT [\"/tini\", \"--\", \"/app/start.sh\"]\r\n```\r\nIt works fine when I run it locally:\r\n```\r\ndocker build -t datasette-apache-proxy-demo .\r\ndocker run -p 5000:80 datasette-apache-proxy-demo\r\n```\r\nBut when I deploy it to Cloud Run with the following script:\r\n```bash\r\n#!/bin/bash\r\n# https://til.simonwillison.net/cloudrun/ship-dockerfile-to-cloud-run\r\n\r\nNAME=\"datasette-apache-proxy-demo\"\r\nPROJECT=$(gcloud config get-value project)\r\nIMAGE=\"gcr.io/$PROJECT/$NAME\"\r\n\r\ngcloud builds submit --tag $IMAGE\r\ngcloud run deploy \\\r\n --allow-unauthenticated \\\r\n --platform=managed \\\r\n --image $IMAGE $NAME \\\r\n --port 80\r\n```\r\nIt serves the `/` page successfully, but hits to `/prefix/` return the following 503 error:\r\n\r\n> Service Unavailable\r\n>\r\n> The server is temporarily unable to service your request due to maintenance downtime or capacity problems. 
Please try again later.\r\n>\r\n> Apache/2.4.51 (Unix) Server at datasette-apache-proxy-demo-j7hipcg4aq-uc.a.run.app Port 80\r\n\r\nCloud Run logs:\r\n\r\n(screenshot of the Cloud Run logs)\r\n\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058896236, "label": "Deploy a live instance of demos/apache-proxy"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1519#issuecomment-974478126", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1519", "id": 974478126, "node_id": "IC_kwDOBm6k_c46FVsu", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T21:16:36Z", "updated_at": "2021-11-19T21:16:36Z", "author_association": "OWNER", "body": "In the meantime I can catch these errors by changing the test to run each path twice, once with and once without the prefix. This should accurately simulate how Apache is working here.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058790545, "label": "base_url is omitted in JSON and CSV views"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1519#issuecomment-974477465", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1519", "id": 974477465, "node_id": "IC_kwDOBm6k_c46FViZ", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T21:15:30Z", "updated_at": "2021-11-19T21:15:30Z", "author_association": "OWNER", "body": "I think what's happening here is Apache is actually making a request to `/fixtures` rather than making a request to `/prefix/fixtures` - and Datasette is replying to requests on both the prefixed and the non-prefixed paths.\r\n\r\nThis is pretty confusing! 
I think Datasette should ONLY reply to `/prefix/fixtures` instead and return a 404 for `/fixtures` - this would make things a whole lot easier to debug.\r\n\r\nBut shipping that change could break existing deployments. Maybe that should be a breaking change for 1.0.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058790545, "label": "base_url is omitted in JSON and CSV views"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1519#issuecomment-974450232", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1519", "id": 974450232, "node_id": "IC_kwDOBm6k_c46FO44", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T20:41:53Z", "updated_at": "2021-11-19T20:42:19Z", "author_association": "OWNER", "body": "https://docs.datasette.io/en/stable/deploying.html#apache-proxy-configuration says I should use `ProxyPreserveHost on`.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058790545, "label": "base_url is omitted in JSON and CSV views"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1519#issuecomment-974447950", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1519", "id": 974447950, "node_id": "IC_kwDOBm6k_c46FOVO", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T20:40:19Z", "updated_at": "2021-11-19T20:40:19Z", "author_association": "OWNER", "body": "Figured it out! 
The test is not an accurate recreation of what is happening, because it doesn't simulate a request with a path of `/fixtures` that has been redirected by the proxy to `/prefix/fixtures`.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058790545, "label": "base_url is omitted in JSON and CSV views"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1522#issuecomment-974435661", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1522", "id": 974435661, "node_id": "IC_kwDOBm6k_c46FLVN", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T20:33:42Z", "updated_at": "2021-11-19T20:33:42Z", "author_association": "OWNER", "body": "Should just be a case of deploying this `Dockerfile`:\r\n\r\n```Dockerfile\r\nFROM python:3-alpine\r\n\r\nRUN apk add --no-cache \\\r\n\tapache2 \\\r\n\tapache2-proxy \\\r\n\tbash\r\n\r\nRUN pip install datasette\r\n\r\nENV TINI_VERSION v0.18.0\r\nADD https://github.com/krallin/tini/releases/download/${TINI_VERSION}/tini-static /tini\r\nRUN chmod +x /tini\r\n\r\n# Append this to the end of the default httpd.conf file\r\nRUN echo $'ServerName localhost\\n\\\r\n\\n\\\r\n<Proxy *>\\n\\\r\n Order deny,allow\\n\\\r\n Allow from all\\n\\\r\n</Proxy>\\n\\\r\n\\n\\\r\nProxyPass /foo/bar/ http://localhost:9000/\\n\\\r\nHeader add X-Proxied-By \"Apache2\"' >> /etc/apache2/httpd.conf\r\n\r\nRUN echo $'Datasette' > /var/www/localhost/htdocs/index.html\r\n\r\nWORKDIR /app\r\n\r\nADD https://latest.datasette.io/fixtures.db /app/fixtures.db\r\n\r\nRUN echo $'#!/usr/bin/env bash\\n\\\r\nset -e\\n\\\r\n\\n\\\r\nhttpd -D FOREGROUND &\\n\\\r\ndatasette fixtures.db --setting base_url \"/foo/bar/\" -p 9000 &\\n\\\r\n\\n\\\r\nwait -n' > /app/start.sh\r\n\r\nRUN chmod +x /app/start.sh\r\n\r\nEXPOSE 80\r\nENTRYPOINT [\"/tini\", \"--\", \"/app/start.sh\"]\r\n```\r\nI can follow this TIL: 
https://til.simonwillison.net/cloudrun/ship-dockerfile-to-cloud-run", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058896236, "label": "Deploy a live instance of demos/apache-proxy"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1521#issuecomment-974433520", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1521", "id": 974433520, "node_id": "IC_kwDOBm6k_c46FKzw", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T20:32:29Z", "updated_at": "2021-11-19T20:32:29Z", "author_association": "OWNER", "body": "This configuration works great.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058815557, "label": "Docker configuration for exercising Datasette behind Apache mod_proxy"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1519#issuecomment-974433320", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1519", "id": 974433320, "node_id": "IC_kwDOBm6k_c46FKwo", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T20:32:04Z", "updated_at": "2021-11-19T20:32:04Z", "author_association": "OWNER", "body": "Still not clear why the tests pass but the live example fails.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058790545, "label": "base_url is omitted in JSON and CSV views"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1519#issuecomment-974433206", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1519", "id": 974433206, "node_id": "IC_kwDOBm6k_c46FKu2", "user": {"value": 9599, "label": 
"simonw"}, "created_at": "2021-11-19T20:31:52Z", "updated_at": "2021-11-19T20:31:52Z", "author_association": "OWNER", "body": "Modified my `Dockerfile` to do this:\r\n\r\n RUN pip install https://github.com/simonw/datasette/archive/ff0dd4da38d48c2fa9250ecf336002c9ed724e36.zip\r\n\r\nAnd now the `request` in that debug `?_context=1` looks like this:\r\n```\r\n \"request\": \"\"\r\n```\r\nThat explains the bug - that request doesn't maintain the original path prefix of `http://localhost:5000/foo/bar/fixtures?sql=` (also it's been rewritten to `localhost:9000` instead of `localhost:5000`).", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058790545, "label": "base_url is omitted in JSON and CSV views"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1519#issuecomment-974422829", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1519", "id": 974422829, "node_id": "IC_kwDOBm6k_c46FIMt", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T20:26:35Z", "updated_at": "2021-11-19T20:26:35Z", "author_association": "OWNER", "body": "In the `?_context=` debug view the request looks like this:\r\n```\r\n \"request\": \"\",\r\n```\r\nI'm going to add a `repr()` to it such that it's a bit more useful.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058790545, "label": "base_url is omitted in JSON and CSV views"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1519#issuecomment-974420619", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1519", "id": 974420619, "node_id": "IC_kwDOBm6k_c46FHqL", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T20:25:19Z", "updated_at": 
"2021-11-19T20:25:19Z", "author_association": "OWNER", "body": "The implementations of `path_with_removed_args` and `path_with_format`:\r\n\r\nhttps://github.com/simonw/datasette/blob/85849935292e500ab7a99f8fe0f9546e903baad3/datasette/utils/__init__.py#L228-L254\r\n\r\nhttps://github.com/simonw/datasette/blob/85849935292e500ab7a99f8fe0f9546e903baad3/datasette/utils/__init__.py#L710-L729", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058790545, "label": "base_url is omitted in JSON and CSV views"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1519#issuecomment-974418496", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1519", "id": 974418496, "node_id": "IC_kwDOBm6k_c46FHJA", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T20:24:16Z", "updated_at": "2021-11-19T20:24:16Z", "author_association": "OWNER", "body": "Here's the code that generates `edit_sql_url` correctly: https://github.com/simonw/datasette/blob/85849935292e500ab7a99f8fe0f9546e903baad3/datasette/views/database.py#L416-L420\r\n\r\nAnd here's the code for `show_hide_link`: https://github.com/simonw/datasette/blob/85849935292e500ab7a99f8fe0f9546e903baad3/datasette/views/database.py#L432-L433\r\n\r\nAnd for `url_csv`: https://github.com/simonw/datasette/blob/85849935292e500ab7a99f8fe0f9546e903baad3/datasette/views/base.py#L600-L602", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058790545, "label": "base_url is omitted in JSON and CSV views"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1519#issuecomment-974398399", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1519", "id": 974398399, "node_id": "IC_kwDOBm6k_c46FCO_", 
"user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T20:08:20Z", "updated_at": "2021-11-19T20:22:02Z", "author_association": "OWNER", "body": "The relevant test is this one: https://github.com/simonw/datasette/blob/30255055150d7bc0affc8156adc18295495020ff/tests/test_html.py#L1608-L1649\r\n\r\nI modified that test to add `\"/fixtures/facetable?sql=select+1\"` as one of the tested paths, and dropped in an `assert False` to pause it in the debugger:\r\n```\r\n @pytest.mark.parametrize(\r\n \"path\",\r\n [\r\n \"/\",\r\n \"/fixtures\",\r\n \"/fixtures/compound_three_primary_keys\",\r\n \"/fixtures/compound_three_primary_keys/a,a,a\",\r\n \"/fixtures/paginated_view\",\r\n \"/fixtures/facetable\",\r\n \"/fixtures?sql=select+1\",\r\n ],\r\n )\r\n def test_base_url_config(app_client_base_url_prefix, path):\r\n client = app_client_base_url_prefix\r\n response = client.get(\"/prefix/\" + path.lstrip(\"/\"))\r\n soup = Soup(response.body, \"html.parser\")\r\n if path == \"/fixtures?sql=select+1\":\r\n> assert False\r\nE assert False\r\n```\r\nBUT... in the debugger:\r\n```\r\n(Pdb) print(soup)\r\n...\r\n
This data as\r\n json,\r\n testall,\r\n testnone,\r\n testresponse,\r\n CSV
\r\n```\r\nThose all have the correct prefix! But that's not what I'm seeing in my `Dockerfile` reproduction of the issue.\r\n\r\nSomething very weird is going on here.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058790545, "label": "base_url is omitted in JSON and CSV views"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1519#issuecomment-974405016", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1519", "id": 974405016, "node_id": "IC_kwDOBm6k_c46FD2Y", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T20:14:19Z", "updated_at": "2021-11-19T20:15:05Z", "author_association": "OWNER", "body": "I added `template_debug` in the Dockerfile:\r\n```\r\ndatasette fixtures.db --setting template_debug 1 --setting base_url \"/foo/bar/\" -p 9000 &\\n\\\r\n```\r\nAnd then hit `http://localhost:5000/foo/bar/fixtures?sql=select+*+from+compound_three_primary_keys+limit+1&_context=1` to view the template context - and it showed the bug, output edited to just show relevant keys:\r\n\r\n```json\r\n{\r\n \"edit_sql_url\": \"/foo/bar/fixtures?sql=select+%2A+from+compound_three_primary_keys+limit+1\",\r\n \"settings\": {\r\n \"force_https_urls\": false,\r\n \"template_debug\": true,\r\n \"trace_debug\": false,\r\n \"base_url\": \"/foo/bar/\"\r\n },\r\n \"show_hide_link\": \"/fixtures?sql=select+%2A+from+compound_three_primary_keys+limit+1&_context=1&_hide_sql=1\",\r\n \"show_hide_text\": \"hide\",\r\n \"show_hide_hidden\": \"\",\r\n \"renderers\": {\r\n \"json\": \"/fixtures.json?sql=select+*+from+compound_three_primary_keys+limit+1&_context=1\"\r\n },\r\n \"url_csv\": \"/fixtures.csv?sql=select+*+from+compound_three_primary_keys+limit+1&_context=1&_size=max\",\r\n \"url_csv_path\": \"/fixtures.csv\",\r\n \"base_url\": \"/foo/bar/\"\r\n}\r\n```\r\nThis is so strange. 
`edit_sql_url` and `base_url` are correct, but `show_hide_link` and `url_csv` and `renderers.json` are not.\r\n\r\nAnd it's _really strange_ that the bug doesn't show up in the tests.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058790545, "label": "base_url is omitted in JSON and CSV views"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1519#issuecomment-974391204", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1519", "id": 974391204, "node_id": "IC_kwDOBm6k_c46FAek", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T20:02:41Z", "updated_at": "2021-11-19T20:02:41Z", "author_association": "OWNER", "body": "Bug confirmed:\r\n\r\n![proxy-bug](https://user-images.githubusercontent.com/9599/142684666-112136bf-9243-4b6e-8202-339fcfe91bcc.gif)\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058790545, "label": "base_url is omitted in JSON and CSV views"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1519#issuecomment-974389472", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1519", "id": 974389472, "node_id": "IC_kwDOBm6k_c46FADg", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T20:01:02Z", "updated_at": "2021-11-19T20:01:02Z", "author_association": "OWNER", "body": "I now have a `Dockerfile` in https://github.com/simonw/datasette/issues/1521#issuecomment-974388295 that I can use to run a local Apache 2 with `mod_proxy` to investigate this class of bugs!", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058790545, "label": "base_url is 
omitted in JSON and CSV views"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1521#issuecomment-974388295", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1521", "id": 974388295, "node_id": "IC_kwDOBm6k_c46E_xH", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T20:00:06Z", "updated_at": "2021-11-19T20:00:06Z", "author_association": "OWNER", "body": "And this is the version that proxies to a `base_url` of `/foo/bar/`:\r\n\r\n```Dockerfile\r\nFROM python:3-alpine\r\n\r\nRUN apk add --no-cache \\\r\n\tapache2 \\\r\n\tapache2-proxy \\\r\n\tbash\r\n\r\nRUN pip install datasette\r\n\r\nENV TINI_VERSION v0.18.0\r\nADD https://github.com/krallin/tini/releases/download/${TINI_VERSION}/tini-static /tini\r\nRUN chmod +x /tini\r\n\r\n# Append this to the end of the default httpd.conf file\r\nRUN echo $'ServerName localhost\\n\\\r\n\\n\\\r\n<Proxy *>\\n\\\r\n Order deny,allow\\n\\\r\n Allow from all\\n\\\r\n</Proxy>\\n\\\r\n\\n\\\r\nProxyPass /foo/bar/ http://localhost:9000/\\n\\\r\nHeader add X-Proxied-By \"Apache2\"' >> /etc/apache2/httpd.conf\r\n\r\nRUN echo $'Datasette' > /var/www/localhost/htdocs/index.html\r\n\r\nWORKDIR /app\r\n\r\nADD https://latest.datasette.io/fixtures.db /app/fixtures.db\r\n\r\nRUN echo $'#!/usr/bin/env bash\\n\\\r\nset -e\\n\\\r\n\\n\\\r\nhttpd -D FOREGROUND &\\n\\\r\ndatasette fixtures.db --setting base_url \"/foo/bar/\" -p 9000 &\\n\\\r\n\\n\\\r\nwait -n' > /app/start.sh\r\n\r\nRUN chmod +x /app/start.sh\r\n\r\nEXPOSE 80\r\nENTRYPOINT [\"/tini\", \"--\", \"/app/start.sh\"]\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058815557, "label": "Docker configuration for exercising Datasette behind Apache mod_proxy"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1521#issuecomment-974380798", "issue_url": 
"https://api.github.com/repos/simonw/datasette/issues/1521", "id": 974380798, "node_id": "IC_kwDOBm6k_c46E97-", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T19:54:26Z", "updated_at": "2021-11-19T19:54:26Z", "author_association": "OWNER", "body": "Got it working! Here's a `Dockerfile` which runs completely stand-alone (thanks to using the `echo $'` trick to write out the config files it needs) and successfully serves Datasette behind Apache and `mod_proxy`:\r\n\r\n```Dockerfile\r\nFROM python:3-alpine\r\n\r\nRUN apk add --no-cache \\\r\n\tapache2 \\\r\n\tapache2-proxy \\\r\n\tbash\r\n\r\nRUN pip install datasette\r\n\r\nENV TINI_VERSION v0.18.0\r\nADD https://github.com/krallin/tini/releases/download/${TINI_VERSION}/tini-static /tini\r\nRUN chmod +x /tini\r\n\r\n# Append this to the end of the default httpd.conf file\r\nRUN echo $'ServerName localhost\\n\\\r\n\\n\\\r\n<Proxy *>\\n\\\r\n Order deny,allow\\n\\\r\n Allow from all\\n\\\r\n</Proxy>\\n\\\r\n\\n\\\r\nProxyPass / http://localhost:9000/\\n\\\r\nProxyPassReverse / http://localhost:9000/\\n\\\r\nHeader add X-Proxied-By \"Apache2\"' >> /etc/apache2/httpd.conf\r\n\r\nWORKDIR /app\r\n\r\nRUN echo $'#!/usr/bin/env bash\\n\\\r\nset -e\\n\\\r\n\\n\\\r\nhttpd -D FOREGROUND &\\n\\\r\ndatasette -p 9000 &\\n\\\r\n\\n\\\r\nwait -n' > /app/start.sh\r\n\r\nRUN chmod +x /app/start.sh\r\n\r\nEXPOSE 80\r\nENTRYPOINT [\"/tini\", \"--\", \"/app/start.sh\"]\r\n```\r\n\r\nRun it like this:\r\n\r\n```\r\ndocker build -t datasette-apache2-proxy .
\r\ndocker run -p 5000:80 --rm datasette-apache2-proxy\r\n```\r\nThen run this to confirm:\r\n```\r\n~ % curl -i 'http://localhost:5000/-/versions.json'\r\nHTTP/1.1 200 OK\r\nDate: Fri, 19 Nov 2021 19:54:05 GMT\r\nServer: uvicorn\r\ncontent-type: application/json; charset=utf-8\r\nX-Proxied-By: Apache2\r\nTransfer-Encoding: chunked\r\n\r\n{\"python\": {\"version\": \"3.10.0\", \"full\": \"3.10.0 (default, Nov 13 2021, 03:23:03) [GCC 10.3.1 20210424]\"}, \"datasette\": {\"version\": \"0.59.2\"}, \"asgi\": \"3.0\", \"uvicorn\": \"0.15.0\", \"sqlite\": {\"version\": \"3.35.5\", \"fts_versions\": [\"FTS5\", \"FTS4\", \"FTS3\"], \"extensions\": {\"json1\": null}, \"compile_options\": [\"COMPILER=gcc-10.3.1 20210424\", \"ENABLE_COLUMN_METADATA\", \"ENABLE_DBSTAT_VTAB\", \"ENABLE_FTS3\", \"ENABLE_FTS3_PARENTHESIS\", \"ENABLE_FTS4\", \"ENABLE_FTS5\", \"ENABLE_GEOPOLY\", \"ENABLE_JSON1\", \"ENABLE_MATH_FUNCTIONS\", \"ENABLE_RTREE\", \"ENABLE_UNLOCK_NOTIFY\", \"MAX_VARIABLE_NUMBER=250000\", \"SECURE_DELETE\", \"THREADSAFE=1\", \"USE_URI\"]}}\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058815557, "label": "Docker configuration for exercising Datasette behind Apache mod_proxy"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1521#issuecomment-974371116", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1521", "id": 974371116, "node_id": "IC_kwDOBm6k_c46E7ks", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T19:45:47Z", "updated_at": "2021-11-19T19:45:47Z", "author_association": "OWNER", "body": "https://github.com/krallin/tini says:\r\n\r\n> *NOTE: If you are using Docker 1.13 or greater, Tini is included in Docker itself. This includes all versions of Docker CE. 
To enable Tini, just [pass the `--init` flag to `docker run`](https://docs.docker.com/engine/reference/commandline/run/).*", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058815557, "label": "Docker configuration for exercising Datasette behind Apache mod_proxy"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1521#issuecomment-974336020", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1521", "id": 974336020, "node_id": "IC_kwDOBm6k_c46EzAU", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T19:10:48Z", "updated_at": "2021-11-19T19:10:48Z", "author_association": "OWNER", "body": "There's a promising looking minimal Apache 2 proxy config here: https://stackoverflow.com/questions/26474476/minimal-configuration-for-apache-reverse-proxy-in-docker-container\r\n\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058815557, "label": "Docker configuration for exercising Datasette behind Apache mod_proxy"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1521#issuecomment-974334278", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1521", "id": 974334278, "node_id": "IC_kwDOBm6k_c46EylG", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T19:08:09Z", "updated_at": "2021-11-19T19:08:09Z", "author_association": "OWNER", "body": "Stripping comments using this StackOverflow recipe: https://unix.stackexchange.com/a/157619\r\n\r\n docker run -it --entrypoint sh alpine-apache2-sh \\\r\n -c \"cat /etc/apache2/httpd.conf\" | sed '/^[[:blank:]]*#/d;s/#.*//'\r\n\r\nResult is here: https://gist.github.com/simonw/0a05090df5fcff8e8b3334621fa17976", "reactions": "{\"total_count\": 0, 
\"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058815557, "label": "Docker configuration for exercising Datasette behind Apache mod_proxy"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1521#issuecomment-974332787", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1521", "id": 974332787, "node_id": "IC_kwDOBm6k_c46EyNz", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T19:05:52Z", "updated_at": "2021-11-19T19:05:52Z", "author_association": "OWNER", "body": "Made myself this Dockerfile to let me explore a bit:\r\n```Dockerfile\r\nFROM python:3-alpine\r\n\r\nRUN apk add --no-cache \\\r\n\tapache2\r\n\r\nCMD [\"sh\"]\r\n```\r\nThen:\r\n```\r\n% docker run alpine-apache2-sh\r\n% docker run -it alpine-apache2-sh\r\n/ # ls /etc/apache2/httpd.conf\r\n/etc/apache2/httpd.conf\r\n/ # cat /etc/apache2/httpd.conf\r\n#\r\n# This is the main Apache HTTP server configuration file. 
It contains the\r\n# configuration directives that give the server its instructions.\r\n...\r\n```\r\nCopying that into a GIST like so:\r\n```\r\ndocker run -it --entrypoint sh alpine-apache2-sh -c \"cat /etc/apache2/httpd.conf\" | pbcopy\r\n```\r\nGist here: https://gist.github.com/simonw/5ea0db6049192cb9f761fbd6beb3a84a", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058815557, "label": "Docker configuration for exercising Datasette behind Apache mod_proxy"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1521#issuecomment-974327812", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1521", "id": 974327812, "node_id": "IC_kwDOBm6k_c46ExAE", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T18:58:49Z", "updated_at": "2021-11-19T18:59:55Z", "author_association": "OWNER", "body": "From this example: https://github.com/tigelane/dockerfiles/blob/06cff2ac8cdc920ebd64f50965115eaa3d0afb84/Alpine-Apache2/Dockerfile#L25-L31 it looks like running `apk add apache2` installs a config file at `/etc/apache2/httpd.conf` - so one approach is to then modify that file.\r\n\r\n```\r\n# APACHE - Alpine\r\n#################\r\nRUN apk --update add apache2 php5-apache2 && \\\r\n #apk add openrc --no-cache && \\\r\n rm -rf /var/cache/apk/* && \\\r\n sed -i 's/#ServerName www.example.com:80/ServerName localhost/' /etc/apache2/httpd.conf && \\\r\n mkdir -p /run/apache2/\r\n\r\n# Upload our files from folder \"dist\".\r\nCOPY dist /var/www/localhost/htdocs\r\n\r\n# Manually set up the apache environment variables\r\nENV APACHE_RUN_USER www-data\r\nENV APACHE_RUN_GROUP www-data\r\nENV APACHE_LOG_DIR /var/log/apache2\r\nENV APACHE_LOCK_DIR /var/lock/apache2\r\nENV APACHE_PID_FILE /var/run/apache2.pid\r\n\r\n# Execute apache2 on run\r\n########################\r\nEXPOSE 80\r\nENTRYPOINT 
[\"httpd\"]\r\nCMD [\"-D\", \"FOREGROUND\"]\r\n```\r\n\r\nI think I'll create my own separate copy and modify that.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058815557, "label": "Docker configuration for exercising Datasette behind Apache mod_proxy"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1521#issuecomment-974321391", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1521", "id": 974321391, "node_id": "IC_kwDOBm6k_c46Evbv", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T18:49:15Z", "updated_at": "2021-11-19T18:57:18Z", "author_association": "OWNER", "body": "This pattern looks like it can help: https://ahmet.im/blog/cloud-run-multiple-processes-easy-way/ - see example in https://github.com/ahmetb/multi-process-container-lazy-solution\r\n\r\nI got that demo working locally like this:\r\n\r\n```bash\r\ncd /tmp\r\ngit clone https://github.com/ahmetb/multi-process-container-lazy-solution\r\ncd multi-process-container-lazy-solution\r\ndocker build -t multi-process-container-lazy-solution .\r\ndocker run -p 5000:8080 --rm multi-process-container-lazy-solution\r\n```\r\n\r\nI want to use `apache2` rather than `nginx` though. 
I found a few relevant examples of Apache in Alpine:\r\n\r\n- https://github.com/Hacking-Lab/alpine-apache2-reverse-proxy/blob/master/Dockerfile\r\n- https://www.sentiatechblog.com/running-apache-in-a-docker-container\r\n- https://github.com/search?l=Dockerfile&q=alpine+apache2&type=code\r\n\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058815557, "label": "Docker configuration for exercising Datasette behind Apache mod_proxy"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1521#issuecomment-974322178", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1521", "id": 974322178, "node_id": "IC_kwDOBm6k_c46EvoC", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T18:50:22Z", "updated_at": "2021-11-19T18:50:22Z", "author_association": "OWNER", "body": "I'll get this working on my laptop first, but then I want to get it up and running on Cloud Run - maybe with a GitHub Actions workflow in this repo that re-deploys it on manual execution.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058815557, "label": "Docker configuration for exercising Datasette behind Apache mod_proxy"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1519#issuecomment-974310208", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1519", "id": 974310208, "node_id": "IC_kwDOBm6k_c46EstA", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T18:32:31Z", "updated_at": "2021-11-19T18:32:31Z", "author_association": "OWNER", "body": "Having a live demo running on Cloud Run that proxies through Apache and uses `base_url` would be incredibly useful for replicating and debugging this kind of thing. 
I wonder how hard it is to run Apache and `mod_proxy` in the same Docker container as Datasette?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058790545, "label": "base_url is omitted in JSON and CSV views"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1519#issuecomment-974309591", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1519", "id": 974309591, "node_id": "IC_kwDOBm6k_c46EsjX", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T18:31:32Z", "updated_at": "2021-11-19T18:31:32Z", "author_association": "OWNER", "body": "`base_url` has been a source of so many bugs like this! I often find them quite hard to replicate, likely because I haven't made myself a good Apache `mod_proxy` testing environment yet.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058790545, "label": "base_url is omitted in JSON and CSV views"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1520#issuecomment-974308215", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1520", "id": 974308215, "node_id": "IC_kwDOBm6k_c46EsN3", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T18:29:26Z", "updated_at": "2021-11-19T18:29:26Z", "author_association": "OWNER", "body": "The solution that jumps to mind first is that it would be neat if routes could return something that meant \"actually my bad, I can't handle this after all - move to the next one in the list\".\r\n\r\nA related idea: it might be useful for custom views like my one here to say \"no actually call the default view for this, but give me back the response so I can modify it in some way\". 
Kind of like Django or ASGI middleware.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058803238, "label": "Pattern for avoiding accidental URL over-rides"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1518#issuecomment-974300823", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1518", "id": 974300823, "node_id": "IC_kwDOBm6k_c46EqaX", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T18:18:32Z", "updated_at": "2021-11-19T18:18:32Z", "author_association": "OWNER", "body": "> This may be an argument for continuing to allow non-JSON-objects through to the HTML templates. Need to think about that a bit more.\r\n\r\nI can definitely support this using pure-JSON - I could make two versions of the row available, one that's an array of cell objects and the other that's an object mapping column names to column raw values.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058072543, "label": "Complete refactor of TableView and table.html template"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1518#issuecomment-974285803", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1518", "id": 974285803, "node_id": "IC_kwDOBm6k_c46Emvr", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T17:56:48Z", "updated_at": "2021-11-19T18:14:30Z", "author_association": "OWNER", "body": "Very confused by this piece of code here: https://github.com/simonw/datasette/blob/1c13e1af0664a4dfb1e69714c56523279cae09e4/datasette/views/table.py#L37-L63\r\n\r\nI added it in https://github.com/simonw/datasette/commit/754836eef043676e84626c4fd3cb993eed0d2976 - in the new world that should probably be 
replaced by pure JSON.\r\n\r\nAha - this comment explains it: https://github.com/simonw/datasette/issues/521#issuecomment-505279560\r\n\r\n> I think the trick is to redefine what a \"cell_row\" is. Each row is currently a list of cells:\r\n> \r\n> https://github.com/simonw/datasette/blob/6341f8cbc7833022012804dea120b838ec1f6558/datasette/views/table.py#L159-L163\r\n> \r\n> I can redefine the row (the `cells` variable in the above example) as a thing-that-iterates-cells (hence behaving like a list) but that also supports `__getitem__` access for looking up cell values if you know the name of the column.\r\n\r\nThe goal was to support neater custom templates like this:\r\n```html+jinja\r\n{% for row in display_rows %}\r\n    <div>\r\n        <h2>{{ row[\"First_Name\"] }} {{ row[\"Last_Name\"] }}</h2>\r\n    </div>\r\n    ...\r\n```\r\nThis may be an argument for continuing to allow non-JSON-objects through to the HTML templates. Need to think about that a bit more.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058072543, "label": "Complete refactor of TableView and table.html template"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1518#issuecomment-974287570", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1518", "id": 974287570, "node_id": "IC_kwDOBm6k_c46EnLS", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T17:59:33Z", "updated_at": "2021-11-19T17:59:33Z", "author_association": "OWNER", "body": "I'm going to try leaning into the `asyncinject` mechanism a bit here. One method can execute and return the raw rows. Another can turn that into the default minimal JSON representation. Then a third can take that (or take both) and use it to inflate out the JSON that the HTML template needs, with those extras and with the rendered cells from plugins.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058072543, "label": "Complete refactor of TableView and table.html template"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1495#issuecomment-974108455", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1495", "id": 974108455, "node_id": "IC_kwDOBm6k_c46D7cn", "user": {"value": 192568, "label": "mroswell"}, "created_at": "2021-11-19T14:14:35Z", "updated_at": "2021-11-19T14:14:35Z", "author_association": "CONTRIBUTOR", "body": "A nudge on this.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue":
{"value": 1033678984, "label": "Allow routes to have extra options"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/342#issuecomment-973820125", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/342", "id": 973820125, "node_id": "IC_kwDOCGYnMM46C1Dd", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T07:25:55Z", "updated_at": "2021-11-19T07:25:55Z", "author_association": "OWNER", "body": "`alter=True` doesn't make sense to support here either, because `.lookup()` already adds missing columns: https://github.com/simonw/sqlite-utils/blob/3b8abe608796e99e4ffc5f3f4597a85e605c0e9b/sqlite_utils/db.py#L2743-L2746", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058196641, "label": "Extra options to `lookup()` which get passed to `insert()`"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/342#issuecomment-973802998", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/342", "id": 973802998, "node_id": "IC_kwDOCGYnMM46Cw32", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T06:59:22Z", "updated_at": "2021-11-19T06:59:32Z", "author_association": "OWNER", "body": "I don't think I need the `DEFAULT` defaults for `.insert()` either, since it just passes through to `.insert()`.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058196641, "label": "Extra options to `lookup()` which get passed to `insert()`"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/342#issuecomment-973802766", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/342", "id": 973802766, "node_id": "IC_kwDOCGYnMM46Cw0O", "user": {"value": 
9599, "label": "simonw"}, "created_at": "2021-11-19T06:58:45Z", "updated_at": "2021-11-19T06:58:45Z", "author_association": "OWNER", "body": "And neither does `hash_id`. On that basis I'm going to specifically list the ones that DO make sense, and hope that I remember to add any new ones in the future. I can add a code comment hint to `.insert()` about that.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058196641, "label": "Extra options to `lookup()` which get passed to `insert()`"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/342#issuecomment-973802469", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/342", "id": 973802469, "node_id": "IC_kwDOCGYnMM46Cwvl", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T06:58:03Z", "updated_at": "2021-11-19T06:58:03Z", "author_association": "OWNER", "body": "Also: I don't think `ignore=` and `replace=` make sense in the context of `lookup()`.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058196641, "label": "Extra options to `lookup()` which get passed to `insert()`"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/342#issuecomment-973802308", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/342", "id": 973802308, "node_id": "IC_kwDOCGYnMM46CwtE", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T06:57:37Z", "updated_at": "2021-11-19T06:57:37Z", "author_association": "OWNER", "body": "Here's the current full method signature for `.insert()`: https://github.com/simonw/sqlite-utils/blob/3b8abe608796e99e4ffc5f3f4597a85e605c0e9b/sqlite_utils/db.py#L2462-L2477\r\n\r\nI could add a test which uses introspection 
(`inspect.signature(method).parameters`) to confirm that `.lookup()` has a super-set of the arguments accepted by `.insert()`.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058196641, "label": "Extra options to `lookup()` which get passed to `insert()`"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/342#issuecomment-973801650", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/342", "id": 973801650, "node_id": "IC_kwDOCGYnMM46Cwiy", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T06:55:56Z", "updated_at": "2021-11-19T06:55:56Z", "author_association": "OWNER", "body": "`pk` needs to be an explicit argument to `.lookup()`. The rest could be `**kwargs` passed through to `.insert()`, like this hacked together version (docstring removed for brevity):\r\n\r\n```python\r\n    def lookup(\r\n        self,\r\n        lookup_values: Dict[str, Any],\r\n        extra_values: Optional[Dict[str, Any]] = None,\r\n        pk=\"id\",\r\n        **insert_kwargs,\r\n    ):\r\n        assert isinstance(lookup_values, dict)\r\n        if extra_values is not None:\r\n            assert isinstance(extra_values, dict)\r\n        combined_values = dict(lookup_values)\r\n        if extra_values is not None:\r\n            combined_values.update(extra_values)\r\n        if self.exists():\r\n            self.add_missing_columns([combined_values])\r\n            unique_column_sets = [set(i.columns) for i in self.indexes]\r\n            if set(lookup_values.keys()) not in unique_column_sets:\r\n                self.create_index(lookup_values.keys(), unique=True)\r\n            wheres = [\"[{}] = ?\".format(column) for column in lookup_values]\r\n            rows = list(\r\n                self.rows_where(\r\n                    \" and \".join(wheres), [value for _, value in lookup_values.items()]\r\n                )\r\n            )\r\n            try:\r\n                return rows[0][pk]\r\n            except IndexError:\r\n                return self.insert(combined_values, pk=pk, **insert_kwargs).last_pk\r\n        else:\r\n            pk = self.insert(combined_values, pk=pk, **insert_kwargs).last_pk\r\n            self.create_index(lookup_values.keys(), unique=True)\r\n            return pk\r\n```\r\nI think I'll explicitly list the parameters, mainly so they can be typed and covered by automatic documentation.\r\n\r\nI do worry that I'll add more keyword arguments to `.insert()` in the future and forget to mirror them to `.lookup()` though.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058196641, "label": "Extra options to `lookup()` which get passed to `insert()`"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/342#issuecomment-973800795", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/342", "id": 973800795, "node_id": "IC_kwDOCGYnMM46CwVb", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T06:54:08Z", "updated_at": "2021-11-19T06:54:08Z", "author_association": "OWNER", "body": "Looking at the code for `lookup()` it currently hard-codes `pk` to `\"id\"` - but it actually only calls `.insert()` in two places, both of which could be passed extra arguments.\r\n\r\nhttps://github.com/simonw/sqlite-utils/blob/3b8abe608796e99e4ffc5f3f4597a85e605c0e9b/sqlite_utils/db.py#L2756-L2763", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058196641, "label": "Extra options to `lookup()` which get passed to `insert()`"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1518#issuecomment-973700549", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1518", "id": 973700549, "node_id": "IC_kwDOBm6k_c46CX3F", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T03:31:20Z", "updated_at": "2021-11-19T03:31:26Z", "author_association":
"OWNER", "body": "... and while I'm doing all of this I can rewrite the templates to not use those cheating magical functions AND document the template context at the same time, refs:\r\n- #1510.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058072543, "label": "Complete refactor of TableView and table.html template"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1518#issuecomment-973700322", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1518", "id": 973700322, "node_id": "IC_kwDOBm6k_c46CXzi", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T03:30:30Z", "updated_at": "2021-11-19T03:30:30Z", "author_association": "OWNER", "body": "Right now the HTML version gets to cheat - it passes through objects that are not JSON serializable, including custom functions that can then be called by Jinja.\r\n\r\nI'm interested in maybe removing this cheating - if the HTML version could only request JSON-serializable extras those could be exposed in the API as well.\r\n\r\nIt would also help cleanup the kind-of-nasty pattern I use in the current `BaseView` where everything returns both a bunch of JSON-serializable data AND an awaitable function that then gets to add extra things to the HTML context.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058072543, "label": "Complete refactor of TableView and table.html template"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1518#issuecomment-973698917", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1518", "id": 973698917, "node_id": "IC_kwDOBm6k_c46CXdl", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T03:26:18Z", 
"updated_at": "2021-11-19T03:29:03Z", "author_association": "OWNER", "body": "A (likely incomplete) list of features on the table page:\r\n\r\n- [ ] Display table/database/instance metadata\r\n- [ ] Show count of all results\r\n- [ ] Display table of results\r\n - [ ] Special table display treatment for URLs, numbers\r\n - [ ] Allow plugins to modify table cells\r\n - [ ] Respect `?_col=` and `?_nocol=`\r\n- [ ] Show interface for filtering by columns and operations\r\n- [ ] Show search box, support executing FTS searches\r\n- [ ] Sort table by specified column\r\n- [ ] Paginate table\r\n- [ ] Show facet results\r\n- [ ] Show suggested facets\r\n- [ ] Link to available exports\r\n- [ ] Display schema for table\r\n - [ ] Maybe it should show the SQL for the query too?\r\n- [ ] Handle various non-obvious querystring options, like `?_where=` and `?_through=`", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058072543, "label": "Complete refactor of TableView and table.html template"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1518#issuecomment-973699424", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1518", "id": 973699424, "node_id": "IC_kwDOBm6k_c46CXlg", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T03:27:49Z", "updated_at": "2021-11-19T03:27:49Z", "author_association": "OWNER", "body": "My goal is to break up a lot of this functionality into separate methods. 
These methods can be executed in parallel by `asyncinject`, but more importantly they can be used to build a much better JSON representation, where the default representation is lighter and `?_extra=x` options can be used to execute more expensive portions and add them to the response.\r\n\r\nSo the HTML version itself needs to be re-written to use those JSON extras.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058072543, "label": "Complete refactor of TableView and table.html template"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1517#issuecomment-973696604", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1517", "id": 973696604, "node_id": "IC_kwDOBm6k_c46CW5c", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T03:20:00Z", "updated_at": "2021-11-19T03:20:00Z", "author_association": "OWNER", "body": "Confirmed - my test plugin is indeed correctly over-riding the table page.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1057996111, "label": "Let `register_routes()` over-ride default routes within Datasette"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1518#issuecomment-973687978", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1518", "id": 973687978, "node_id": "IC_kwDOBm6k_c46CUyq", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T03:07:47Z", "updated_at": "2021-11-19T03:07:47Z", "author_association": "OWNER", "body": "I was wrong about that, you CAN over-ride default routes already.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 
1058072543, "label": "Complete refactor of TableView and table.html template"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1517#issuecomment-973686874", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1517", "id": 973686874, "node_id": "IC_kwDOBm6k_c46CUha", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T03:06:58Z", "updated_at": "2021-11-19T03:06:58Z", "author_association": "OWNER", "body": "I made a mistake: I just wrote a test that proves that plugins CAN over-ride default routes, plus if you look at the code here the plugins get to register themselves first: https://github.com/simonw/datasette/blob/0156c6b5e52d541e93f0d68e9245f20ae83bc933/datasette/app.py#L965-L981", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1057996111, "label": "Let `register_routes()` over-ride default routes within Datasette"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1518#issuecomment-973682389", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1518", "id": 973682389, "node_id": "IC_kwDOBm6k_c46CTbV", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T02:57:39Z", "updated_at": "2021-11-19T02:57:39Z", "author_association": "OWNER", "body": "Ideally I'd like to execute the existing test suite against the new implementation - that would require me to solve this so I can replace the view with the plugin version though:\r\n\r\n- #1517 ", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058072543, "label": "Complete refactor of TableView and table.html template"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1518#issuecomment-973681970", 
"issue_url": "https://api.github.com/repos/simonw/datasette/issues/1518", "id": 973681970, "node_id": "IC_kwDOBm6k_c46CTUy", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T02:56:31Z", "updated_at": "2021-11-19T02:56:53Z", "author_association": "OWNER", "body": "Here's where I got to with my hacked-together initial plugin prototype - it managed to render the table page with some rows on it (and a bunch of missing functionality such as filters): https://gist.github.com/simonw/281eac9c73b062c3469607ad86470eb2\r\n\r\n\"fixtures__roadside_attractions__4_rows_and__11__Liked___Twitter\"\r\n\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058072543, "label": "Complete refactor of TableView and table.html template"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/878#issuecomment-973678931", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/878", "id": 973678931, "node_id": "IC_kwDOBm6k_c46CSlT", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T02:51:17Z", "updated_at": "2021-11-19T02:51:17Z", "author_association": "OWNER", "body": "OK, I managed to get a table to render! Here's the code I used - I had to copy a LOT of stuff. 
https://gist.github.com/simonw/281eac9c73b062c3469607ad86470eb2\r\n\r\nI'm going to move this work into a new, separate issue.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 648435885, "label": "New pattern for views that return either JSON or HTML, available for plugins"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/878#issuecomment-973635157", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/878", "id": 973635157, "node_id": "IC_kwDOBm6k_c46CH5V", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T01:07:08Z", "updated_at": "2021-11-19T01:07:08Z", "author_association": "OWNER", "body": "This exercise is proving so useful in getting my head around how the enormous and complex `TableView` class works again.\r\n\r\nHere's where I've got to now - I'm systematically working through the variables that are returned for HTML and for JSON copying across code to get it to work:\r\n\r\n```python\r\nfrom datasette.database import QueryInterrupted\r\nfrom datasette.utils import escape_sqlite\r\nfrom datasette.utils.asgi import Response, NotFound, Forbidden\r\nfrom datasette.views.base import DatasetteError\r\nfrom datasette import hookimpl\r\nfrom asyncinject import AsyncInject, inject\r\nfrom pprint import pformat\r\n\r\n\r\nclass Table(AsyncInject):\r\n @inject\r\n async def database(self, request, datasette):\r\n # TODO: all that nasty hash resolving stuff can go here\r\n db_name = request.url_vars[\"db_name\"]\r\n try:\r\n db = datasette.databases[db_name]\r\n except KeyError:\r\n raise NotFound(f\"Database '{db_name}' does not exist\")\r\n return db\r\n\r\n @inject\r\n async def table_and_format(self, request, database, datasette):\r\n table_and_format = request.url_vars[\"table_and_format\"]\r\n # TODO: be a lot smarter here\r\n if \".\" in table_and_format:\r\n return 
table_and_format.split(\".\", 2)\r\n else:\r\n return table_and_format, \"html\"\r\n\r\n @inject\r\n async def main(self, request, database, table_and_format, datasette):\r\n # TODO: if this is actually a canned query, dispatch to it\r\n\r\n table, format = table_and_format\r\n\r\n is_view = bool(await database.get_view_definition(table))\r\n table_exists = bool(await database.table_exists(table))\r\n if not is_view and not table_exists:\r\n raise NotFound(f\"Table not found: {table}\")\r\n\r\n await check_permissions(\r\n datasette,\r\n request,\r\n [\r\n (\"view-table\", (database.name, table)),\r\n (\"view-database\", database.name),\r\n \"view-instance\",\r\n ],\r\n )\r\n\r\n private = not await datasette.permission_allowed(\r\n None, \"view-table\", (database.name, table), default=True\r\n )\r\n\r\n pks = await database.primary_keys(table)\r\n table_columns = await database.table_columns(table)\r\n\r\n specified_columns = await columns_to_select(datasette, database, table, request)\r\n select_specified_columns = \", \".join(\r\n escape_sqlite(t) for t in specified_columns\r\n )\r\n select_all_columns = \", \".join(escape_sqlite(t) for t in table_columns)\r\n\r\n use_rowid = not pks and not is_view\r\n if use_rowid:\r\n select_specified_columns = f\"rowid, {select_specified_columns}\"\r\n select_all_columns = f\"rowid, {select_all_columns}\"\r\n order_by = \"rowid\"\r\n order_by_pks = \"rowid\"\r\n else:\r\n order_by_pks = \", \".join([escape_sqlite(pk) for pk in pks])\r\n order_by = order_by_pks\r\n\r\n if is_view:\r\n order_by = \"\"\r\n\r\n nocount = request.args.get(\"_nocount\")\r\n nofacet = request.args.get(\"_nofacet\")\r\n\r\n if request.args.get(\"_shape\") in (\"array\", \"object\"):\r\n nocount = True\r\n nofacet = True\r\n\r\n # Next, a TON of SQL to build where_params and filters and suchlike\r\n # skipping that and jumping straight to...\r\n where_clauses = []\r\n where_clause = \"\"\r\n if where_clauses:\r\n where_clause = f\"where {' and 
'.join(where_clauses)} \"\r\n\r\n from_sql = \"from {table_name} {where}\".format(\r\n table_name=escape_sqlite(table),\r\n where=(\"where {} \".format(\" and \".join(where_clauses)))\r\n if where_clauses\r\n else \"\",\r\n )\r\n from_sql_params ={}\r\n params = {}\r\n count_sql = f\"select count(*) {from_sql}\"\r\n sql_no_order_no_limit = (\r\n \"select {select_all_columns} from {table_name} {where}\".format(\r\n select_all_columns=select_all_columns,\r\n table_name=escape_sqlite(table),\r\n where=where_clause,\r\n )\r\n )\r\n\r\n page_size = 100\r\n offset = \" offset 0\"\r\n\r\n sql = \"select {select_specified_columns} from {table_name} {where}{order_by} limit {page_size}{offset}\".format(\r\n select_specified_columns=select_specified_columns,\r\n table_name=escape_sqlite(table),\r\n where=where_clause,\r\n order_by=order_by,\r\n page_size=page_size + 1,\r\n offset=offset,\r\n )\r\n\r\n # Fetch rows\r\n results = await database.execute(sql, params, truncate=True)\r\n columns = [r[0] for r in results.description]\r\n rows = list(results.rows)\r\n\r\n # Fetch count\r\n filtered_table_rows_count = None\r\n if count_sql:\r\n try:\r\n count_rows = list(await database.execute(count_sql, from_sql_params))\r\n filtered_table_rows_count = count_rows[0][0]\r\n except QueryInterrupted:\r\n pass\r\n\r\n\r\n vars = {\r\n \"json\": {\r\n # THIS STUFF is from the regular JSON\r\n \"database\": database.name,\r\n \"table\": table,\r\n \"is_view\": is_view,\r\n # \"human_description_en\": human_description_en,\r\n \"rows\": rows[:page_size],\r\n \"truncated\": results.truncated,\r\n \"filtered_table_rows_count\": filtered_table_rows_count,\r\n # \"expanded_columns\": expanded_columns,\r\n # \"expandable_columns\": expandable_columns,\r\n \"columns\": columns,\r\n \"primary_keys\": pks,\r\n # \"units\": units,\r\n \"query\": {\"sql\": sql, \"params\": params},\r\n # \"facet_results\": facet_results,\r\n # \"suggested_facets\": suggested_facets,\r\n # \"next\": next_value and 
str(next_value) or None,\r\n # \"next_url\": next_url,\r\n \"private\": private,\r\n \"allow_execute_sql\": await datasette.permission_allowed(\r\n request.actor, \"execute-sql\", database, default=True\r\n ),\r\n },\r\n \"html\": {\r\n # ... this is the HTML special stuff\r\n # \"table_actions\": table_actions,\r\n # \"supports_search\": bool(fts_table),\r\n # \"search\": search or \"\",\r\n \"use_rowid\": use_rowid,\r\n # \"filters\": filters,\r\n # \"display_columns\": display_columns,\r\n # \"filter_columns\": filter_columns,\r\n # \"display_rows\": display_rows,\r\n # \"facets_timed_out\": facets_timed_out,\r\n # \"sorted_facet_results\": sorted(\r\n # facet_results.values(),\r\n # key=lambda f: (len(f[\"results\"]), f[\"name\"]),\r\n # reverse=True,\r\n # ),\r\n # \"show_facet_counts\": special_args.get(\"_facet_size\") == \"max\",\r\n # \"extra_wheres_for_ui\": extra_wheres_for_ui,\r\n # \"form_hidden_args\": form_hidden_args,\r\n # \"is_sortable\": any(c[\"sortable\"] for c in display_columns),\r\n # \"path_with_replaced_args\": path_with_replaced_args,\r\n # \"path_with_removed_args\": path_with_removed_args,\r\n # \"append_querystring\": append_querystring,\r\n \"request\": request,\r\n # \"sort\": sort,\r\n # \"sort_desc\": sort_desc,\r\n \"disable_sort\": is_view,\r\n # \"custom_table_templates\": [\r\n # f\"_table-{to_css_class(database)}-{to_css_class(table)}.html\",\r\n # f\"_table-table-{to_css_class(database)}-{to_css_class(table)}.html\",\r\n # \"_table.html\",\r\n # ],\r\n # \"metadata\": metadata,\r\n # \"view_definition\": await db.get_view_definition(table),\r\n # \"table_definition\": await db.get_table_definition(table),\r\n },\r\n }\r\n\r\n # I'm just trying to get HTML to work for the moment\r\n if format == \"json\":\r\n return Response.json(dict(vars, locals=locals()), default=repr)\r\n else:\r\n return Response.html(repr(vars[\"html\"]))\r\n\r\n async def view(self, request, datasette):\r\n return await self.main(request=request, 
datasette=datasette)\r\n\r\n\r\n@hookimpl\r\ndef register_routes():\r\n    return [\r\n        (r\"/t/(?P<db_name>[^/]+)/(?P<table_and_format>[^/]+?$)\", Table().view),\r\n    ]\r\n\r\n\r\nasync def check_permissions(datasette, request, permissions):\r\n    \"\"\"permissions is a list of (action, resource) tuples or 'action' strings\"\"\"\r\n    for permission in permissions:\r\n        if isinstance(permission, str):\r\n            action = permission\r\n            resource = None\r\n        elif isinstance(permission, (tuple, list)) and len(permission) == 2:\r\n            action, resource = permission\r\n        else:\r\n            assert (\r\n                False\r\n            ), \"permission should be string or tuple of two items: {}\".format(\r\n                repr(permission)\r\n            )\r\n        ok = await datasette.permission_allowed(\r\n            request.actor,\r\n            action,\r\n            resource=resource,\r\n            default=None,\r\n        )\r\n        if ok is not None:\r\n            if ok:\r\n                return\r\n            else:\r\n                raise Forbidden(action)\r\n\r\n\r\nasync def columns_to_select(datasette, database, table, request):\r\n    table_columns = await database.table_columns(table)\r\n    pks = await database.primary_keys(table)\r\n    columns = list(table_columns)\r\n    if \"_col\" in request.args:\r\n        columns = list(pks)\r\n        _cols = request.args.getlist(\"_col\")\r\n        bad_columns = [column for column in _cols if column not in table_columns]\r\n        if bad_columns:\r\n            raise DatasetteError(\r\n                \"_col={} - invalid columns\".format(\", \".join(bad_columns)),\r\n                status=400,\r\n            )\r\n        # De-duplicate maintaining order:\r\n        columns.extend(dict.fromkeys(_cols))\r\n    if \"_nocol\" in request.args:\r\n        # Return all columns EXCEPT these\r\n        bad_columns = [\r\n            column\r\n            for column in request.args.getlist(\"_nocol\")\r\n            if (column not in table_columns) or (column in pks)\r\n        ]\r\n        if bad_columns:\r\n            raise DatasetteError(\r\n                \"_nocol={} - invalid columns\".format(\", \".join(bad_columns)),\r\n                status=400,\r\n            )\r\n        tmp_columns = [\r\n            column for column in columns if column not in request.args.getlist(\"_nocol\")\r\n        ]\r\n        columns = tmp_columns\r\n    return columns\r\n```", "reactions": 
"{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 648435885, "label": "New pattern for views that return either JSON or HTML, available for plugins"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/878#issuecomment-973568285", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/878", "id": 973568285, "node_id": "IC_kwDOBm6k_c46B3kd", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T00:29:20Z", "updated_at": "2021-11-19T00:29:20Z", "author_association": "OWNER", "body": "This is working!\r\n```python\r\nfrom datasette.utils.asgi import Response\r\nfrom datasette import hookimpl\r\nimport html\r\nfrom asyncinject import AsyncInject, inject\r\n\r\n\r\nclass Table(AsyncInject):\r\n    @inject\r\n    async def database(self, request):\r\n        return request.url_vars[\"db_name\"]\r\n\r\n    @inject\r\n    async def main(self, request, database):\r\n        return Response.html(\"Database: {}\".format(\r\n            html.escape(database)\r\n        ))\r\n\r\n    async def view(self, request):\r\n        return await self.main(request=request)\r\n\r\n\r\n@hookimpl\r\ndef register_routes():\r\n    return [\r\n        (r\"/t/(?P<db_name>[^/]+)/(?P<table_and_format>[^/]+?$)\", Table().view),\r\n    ]\r\n```\r\nThis project will definitely show me if I actually like the `asyncinject` patterns or not.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 648435885, "label": "New pattern for views that return either JSON or HTML, available for plugins"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/878#issuecomment-973564260", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/878", "id": 973564260, "node_id": "IC_kwDOBm6k_c46B2lk", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T00:27:06Z", 
"updated_at": "2021-11-19T00:27:06Z", "author_association": "OWNER", "body": "Problem: the fancy `asyncinject` stuff interferes with the fancy Datasette thing that introspects view functions to look for what parameters they take:\r\n```python\r\nclass Table(asyncinject.AsyncInjectAll):\r\n    async def view(self, request):\r\n        return Response.html(\"Hello from {}\".format(\r\n            html.escape(repr(request.url_vars))\r\n        ))\r\n\r\n\r\n@hookimpl\r\ndef register_routes():\r\n    return [\r\n        (r\"/t/(?P<db_name>[^/]+)/(?P<table_and_format>[^/]+?$)\", Table().view),\r\n    ]\r\n```\r\nThis failed with error: \"Table.view() takes 1 positional argument but 2 were given\"\r\n\r\nSo I'm going to use `AsyncInject` and have the `view` function NOT use the `@inject` decorator.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 648435885, "label": "New pattern for views that return either JSON or HTML, available for plugins"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/878#issuecomment-973554024", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/878", "id": 973554024, "node_id": "IC_kwDOBm6k_c46B0Fo", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T00:21:20Z", "updated_at": "2021-11-19T00:21:20Z", "author_association": "OWNER", "body": "That's annoying: it looks like plugins can't use `register_routes()` to over-ride default routes within Datasette itself. 
This didn't work:\r\n```python\r\nfrom datasette.utils.asgi import Response\r\nfrom datasette import hookimpl\r\nimport html\r\n\r\n\r\nasync def table(request):\r\n    return Response.html(\"Hello from {}\".format(\r\n        html.escape(repr(request.url_vars))\r\n    ))\r\n\r\n\r\n@hookimpl\r\ndef register_routes():\r\n    return [\r\n        (r\"/(?P<db_name>[^/]+)/(?P<table_and_format>[^/]+?$)\", table),\r\n    ]\r\n```\r\nI'll use a `/t/` prefix for the moment, but this is probably something I'll fix in Datasette itself later.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 648435885, "label": "New pattern for views that return either JSON or HTML, available for plugins"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/878#issuecomment-973542284", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/878", "id": 973542284, "node_id": "IC_kwDOBm6k_c46BxOM", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T00:16:44Z", "updated_at": "2021-11-19T00:16:44Z", "author_association": "OWNER", "body": "```\r\nDevelopment % cookiecutter gh:simonw/datasette-plugin\r\nYou've downloaded /Users/simon/.cookiecutters/datasette-plugin before. Is it okay to delete and re-download it? 
[yes]: yes\r\nplugin_name []: table-new\r\ndescription []: New implementation of TableView, see https://github.com/simonw/datasette/issues/878\r\nhyphenated [table-new]: \r\nunderscored [table_new]: \r\ngithub_username []: simonw\r\nauthor_name []: Simon Willison\r\ninclude_static_directory []: \r\ninclude_templates_directory []: \r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 648435885, "label": "New pattern for views that return either JSON or HTML, available for plugins"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/878#issuecomment-973527870", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/878", "id": 973527870, "node_id": "IC_kwDOBm6k_c46Bts-", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T00:13:43Z", "updated_at": "2021-11-19T00:13:43Z", "author_association": "OWNER", "body": "New plan: I'm going to build a brand new implementation of `TableView` starting out as a plugin, using the `register_routes()` plugin hook.\r\n\r\nIt will reuse the existing HTML template but will be a completely new Python implementation, based on `asyncinject`.\r\n\r\nI'm going to start by just getting the table to show up on the page - then I'll add faceting, suggested facets, filters and so-on.\r\n\r\nBonus: I'm going to see if I can get it to work for arbitrary SQL queries too (stretch goal).", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 648435885, "label": "New pattern for views that return either JSON or HTML, available for plugins"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1516#issuecomment-972858458", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1516", "id": 972858458, 
"node_id": "IC_kwDOBm6k_c45_KRa", "user": {"value": 22429695, "label": "codecov[bot]"}, "created_at": "2021-11-18T13:19:01Z", "updated_at": "2021-11-18T13:19:01Z", "author_association": "NONE", "body": "# [Codecov](https://codecov.io/gh/simonw/datasette/pull/1516?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison) Report\n> Merging [#1516](https://codecov.io/gh/simonw/datasette/pull/1516?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison) (a82c620) into [main](https://codecov.io/gh/simonw/datasette/commit/0156c6b5e52d541e93f0d68e9245f20ae83bc933?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison) (0156c6b) will **not change** coverage.\n> The diff coverage is `n/a`.\n\n[![Impacted file tree graph](https://codecov.io/gh/simonw/datasette/pull/1516/graphs/tree.svg?width=650&height=150&src=pr&token=eSahVY7kw1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison)](https://codecov.io/gh/simonw/datasette/pull/1516?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison)\n\n```diff\n@@ Coverage Diff @@\n## main #1516 +/- ##\n=======================================\n Coverage 91.82% 91.82% \n=======================================\n Files 34 34 \n Lines 4430 4430 \n=======================================\n Hits 4068 4068 \n Misses 362 362 \n```\n\n\n\n------\n\n[Continue to review full report at Codecov](https://codecov.io/gh/simonw/datasette/pull/1516?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison).\n> **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison)\n> `\u0394 = absolute (impact)`, `\u00f8 = not affected`, `? = missing data`\n> Powered by [Codecov](https://codecov.io/gh/simonw/datasette/pull/1516?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison). Last update [0156c6b...a82c620](https://codecov.io/gh/simonw/datasette/pull/1516?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison).\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1057340779, "label": "Bump black from 21.9b0 to 21.11b1"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1514#issuecomment-972852184", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1514", "id": 972852184, "node_id": "IC_kwDOBm6k_c45_IvY", "user": {"value": 49699333, "label": "dependabot[bot]"}, "created_at": "2021-11-18T13:11:15Z", "updated_at": "2021-11-18T13:11:15Z", "author_association": "CONTRIBUTOR", "body": "Superseded by #1516.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1056117435, "label": "Bump black from 21.9b0 to 21.11b0"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1514#issuecomment-971575746", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1514", "id": 971575746, "node_id": "IC_kwDOBm6k_c456RHC", "user": {"value": 
22429695, "label": "codecov[bot]"}, "created_at": "2021-11-17T13:18:58Z", "updated_at": "2021-11-17T13:18:58Z", "author_association": "NONE", "body": "# [Codecov](https://codecov.io/gh/simonw/datasette/pull/1514?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison) Report\n> Merging [#1514](https://codecov.io/gh/simonw/datasette/pull/1514?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison) (b02c35a) into [main](https://codecov.io/gh/simonw/datasette/commit/0156c6b5e52d541e93f0d68e9245f20ae83bc933?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison) (0156c6b) will **not change** coverage.\n> The diff coverage is `n/a`.\n\n[![Impacted file tree graph](https://codecov.io/gh/simonw/datasette/pull/1514/graphs/tree.svg?width=650&height=150&src=pr&token=eSahVY7kw1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison)](https://codecov.io/gh/simonw/datasette/pull/1514?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison)\n\n```diff\n@@ Coverage Diff @@\n## main #1514 +/- ##\n=======================================\n Coverage 91.82% 91.82% \n=======================================\n Files 34 34 \n Lines 4430 4430 \n=======================================\n Hits 4068 4068 \n Misses 362 362 \n```\n\n\n\n------\n\n[Continue to review full report at Codecov](https://codecov.io/gh/simonw/datasette/pull/1514?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison).\n> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison)\n> `\u0394 = absolute (impact)`, 
`\u00f8 = not affected`, `? = missing data`\n> Powered by [Codecov](https://codecov.io/gh/simonw/datasette/pull/1514?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison). Last update [0156c6b...b02c35a](https://codecov.io/gh/simonw/datasette/pull/1514?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison).\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1056117435, "label": "Bump black from 21.9b0 to 21.11b0"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1500#issuecomment-971568829", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1500", "id": 971568829, "node_id": "IC_kwDOBm6k_c456Pa9", "user": {"value": 49699333, "label": "dependabot[bot]"}, "created_at": "2021-11-17T13:13:58Z", "updated_at": "2021-11-17T13:13:58Z", "author_association": "CONTRIBUTOR", "body": "Superseded by #1514.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1041158024, "label": "Bump black from 21.9b0 to 21.10b0"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/878#issuecomment-971209475", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/878", "id": 971209475, "node_id": "IC_kwDOBm6k_c4543sD", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-17T05:41:42Z", "updated_at": "2021-11-17T05:41:42Z", "author_association": "OWNER", "body": "I'm going to build a brand new implementation of the 
`TableView` class that doesn't subclass `BaseView` at all, instead using `asyncinject`. If I'm lucky that will clean up the grungiest part of the codebase.\r\n\r\nI can maybe even run the tests against old `TableView` and `TableView2` to check that they behave the same.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 648435885, "label": "New pattern for views that return either JSON or HTML, available for plugins"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/878#issuecomment-971057553", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/878", "id": 971057553, "node_id": "IC_kwDOBm6k_c454SmR", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-17T01:40:45Z", "updated_at": "2021-11-17T01:40:45Z", "author_association": "OWNER", "body": "I shipped that code as a new library, `asyncinject`: https://pypi.org/project/asyncinject/ - I'll open a new PR to attempt to refactor `TableView` to use it.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 648435885, "label": "New pattern for views that return either JSON or HTML, available for plugins"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1512#issuecomment-971056169", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1512", "id": 971056169, "node_id": "IC_kwDOBm6k_c454SQp", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-17T01:39:44Z", "updated_at": "2021-11-17T01:39:44Z", "author_association": "OWNER", "body": "Closing this PR because I shipped the code in it as a separate library instead.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 
0}", "issue": {"value": 1055402144, "label": "New pattern for async view classes"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1512#issuecomment-971055677", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1512", "id": 971055677, "node_id": "IC_kwDOBm6k_c454SI9", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-17T01:39:25Z", "updated_at": "2021-11-17T01:39:25Z", "author_association": "OWNER", "body": "https://github.com/simonw/asyncinject version 0.1a0 is now live on PyPI: https://pypi.org/project/asyncinject/", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1055402144, "label": "New pattern for async view classes"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1512#issuecomment-971010724", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1512", "id": 971010724, "node_id": "IC_kwDOBm6k_c454HKk", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-17T01:12:22Z", "updated_at": "2021-11-17T01:12:22Z", "author_association": "OWNER", "body": "I'm going to extract out the `asyncinject` stuff into a separate library.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1055402144, "label": "New pattern for async view classes"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1512#issuecomment-970718652", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1512", "id": 970718652, "node_id": "IC_kwDOBm6k_c452_28", "user": {"value": 22429695, "label": "codecov[bot]"}, "created_at": "2021-11-16T22:02:59Z", "updated_at": "2021-11-16T23:51:48Z", "author_association": "NONE", "body": "# 
[Codecov](https://codecov.io/gh/simonw/datasette/pull/1512?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison) Report\n> Merging [#1512](https://codecov.io/gh/simonw/datasette/pull/1512?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison) (8f757da) into [main](https://codecov.io/gh/simonw/datasette/commit/0156c6b5e52d541e93f0d68e9245f20ae83bc933?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison) (0156c6b) will **decrease** coverage by `2.10%`.\n> The diff coverage is `36.20%`.\n\n[![Impacted file tree graph](https://codecov.io/gh/simonw/datasette/pull/1512/graphs/tree.svg?width=650&height=150&src=pr&token=eSahVY7kw1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison)](https://codecov.io/gh/simonw/datasette/pull/1512?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison)\n\n```diff\n@@ Coverage Diff @@\n## main #1512 +/- ##\n==========================================\n- Coverage 91.82% 89.72% -2.11% \n==========================================\n Files 34 36 +2 \n Lines 4430 4604 +174 \n==========================================\n+ Hits 4068 4131 +63 \n- Misses 362 473 +111 \n```\n\n\n| [Impacted Files](https://codecov.io/gh/simonw/datasette/pull/1512?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison) | Coverage \u0394 | |\n|---|---|---|\n| [datasette/utils/vendored\\_graphlib.py](https://codecov.io/gh/simonw/datasette/pull/1512/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison#diff-ZGF0YXNldHRlL3V0aWxzL3ZlbmRvcmVkX2dyYXBobGliLnB5) | `0.00% <0.00%> (\u00f8)` | |\n| 
[datasette/utils/asyncdi.py](https://codecov.io/gh/simonw/datasette/pull/1512/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison#diff-ZGF0YXNldHRlL3V0aWxzL2FzeW5jZGkucHk=) | `96.92% <96.92%> (\u00f8)` | |\n\n------\n\n[Continue to review full report at Codecov](https://codecov.io/gh/simonw/datasette/pull/1512?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison).\n> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison)\n> `\u0394 = absolute (impact)`, `\u00f8 = not affected`, `? = missing data`\n> Powered by [Codecov](https://codecov.io/gh/simonw/datasette/pull/1512?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison). Last update [0156c6b...8f757da](https://codecov.io/gh/simonw/datasette/pull/1512?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison). 
Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison).\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1055402144, "label": "New pattern for async view classes"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1512#issuecomment-970861628", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1512", "id": 970861628, "node_id": "IC_kwDOBm6k_c453iw8", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-16T23:46:07Z", "updated_at": "2021-11-16T23:46:07Z", "author_association": "OWNER", "body": "I made the changes locally and tested them with Python 3.6 like so:\r\n```\r\ncd /tmp\r\nmkdir v\r\ncd v\r\npipenv shell --python=python3.6\r\ncd ~/Dropbox/Development/datasette\r\npip install -e '.[test]'\r\npytest tests/test_asyncdi.py\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1055402144, "label": "New pattern for async view classes"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1512#issuecomment-970857411", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1512", "id": 970857411, "node_id": "IC_kwDOBm6k_c453hvD", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-16T23:43:21Z", "updated_at": "2021-11-16T23:43:21Z", "author_association": "OWNER", "body": "```\r\nE File \"/home/runner/work/datasette/datasette/datasette/utils/vendored_graphlib.py\", line 56\r\nE if (result := self._node2info.get(node)) is None:\r\nE ^\r\nE SyntaxError: invalid syntax\r\n```\r\nOh no - the vendored code I use has `:=` so doesn't work on Python 3.6! 
Will have to backport it more thoroughly.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1055402144, "label": "New pattern for async view classes"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1513#issuecomment-970855084", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1513", "id": 970855084, "node_id": "IC_kwDOBm6k_c453hKs", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-16T23:41:46Z", "updated_at": "2021-11-16T23:41:46Z", "author_association": "OWNER", "body": "Conclusion: using a giant convoluted CTE and UNION ALL query to attempt to calculate facets at the same time as retrieving rows is a net LOSS for performance! Very surprised to see that.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1055469073, "label": "Research: CTEs and union all to calculate facets AND query at the same time"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1513#issuecomment-970853917", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1513", "id": 970853917, "node_id": "IC_kwDOBm6k_c453g4d", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-16T23:41:01Z", "updated_at": "2021-11-16T23:41:01Z", "author_association": "OWNER", "body": "One very interesting difference between the two: on the single giant query page:\r\n\r\n```json\r\n{\r\n \"request_duration_ms\": 376.4317020000476,\r\n \"sum_trace_duration_ms\": 370.0828700000329,\r\n \"num_traces\": 5\r\n}\r\n```\r\nAnd on the page that uses separate queries:\r\n```json\r\n{\r\n \"request_duration_ms\": 819.012272000009,\r\n \"sum_trace_duration_ms\": 201.52852100000018,\r\n \"num_traces\": 19\r\n}\r\n```\r\nThe separate pages page takes 
819ms total to render the page, but spends 201ms across 19 SQL queries.\r\n\r\nThe single big query takes 376ms total to render the page, spending 370ms in 5 queries.\r\n\r\n
Those 5 queries, if you're interested\r\n\r\n```sql\r\nselect database_name, schema_version from databases\r\nPRAGMA schema_version\r\nPRAGMA schema_version\r\nexplain with cte as (\\r\\n select rowid, date, county, state, fips, cases, deaths\\r\\n from ny_times_us_counties\\r\\n),\\r\\ntruncated as (\\r\\n select null as _facet, null as facet_name, null as facet_count, rowid, date, county, state, fips, cases, deaths\\r\\n from cte order by date desc limit 4\\r\\n),\\r\\nstate_facet as (\\r\\n select 'state' as _facet, state as facet_name, count(*) as facet_count,\\r\\n null, null, null, null, null, null, null\\r\\n from cte group by facet_name order by facet_count desc limit 3\\r\\n),\\r\\nfips_facet as (\\r\\n select 'fips' as _facet, fips as facet_name, count(*) as facet_count,\\r\\n null, null, null, null, null, null, null\\r\\n from cte group by facet_name order by facet_count desc limit 3\\r\\n),\\r\\ncounty_facet as (\\r\\n select 'county' as _facet, county as facet_name, count(*) as facet_count,\\r\\n null, null, null, null, null, null, null\\r\\n from cte group by facet_name order by facet_count desc limit 3\\r\\n)\\r\\nselect * from truncated\\r\\nunion all select * from state_facet\\r\\nunion all select * from fips_facet\\r\\nunion all select * from county_facet\r\nwith cte as (\\r\\n select rowid, date, county, state, fips, cases, deaths\\r\\n from ny_times_us_counties\\r\\n),\\r\\ntruncated as (\\r\\n select null as _facet, null as facet_name, null as facet_count, rowid, date, county, state, fips, cases, deaths\\r\\n from cte order by date desc limit 4\\r\\n),\\r\\nstate_facet as (\\r\\n select 'state' as _facet, state as facet_name, count(*) as facet_count,\\r\\n null, null, null, null, null, null, null\\r\\n from cte group by facet_name order by facet_count desc limit 3\\r\\n),\\r\\nfips_facet as (\\r\\n select 'fips' as _facet, fips as facet_name, count(*) as facet_count,\\r\\n null, null, null, null, null, null, null\\r\\n from cte group by 
facet_name order by facet_count desc limit 3\\r\\n),\\r\\ncounty_facet as (\\r\\n select 'county' as _facet, county as facet_name, count(*) as facet_count,\\r\\n null, null, null, null, null, null, null\\r\\n from cte group by facet_name order by facet_count desc limit 3\\r\\n)\\r\\nselect * from truncated\\r\\nunion all select * from state_facet\\r\\nunion all select * from fips_facet\\r\\nunion all select * from county_facet\r\n```\r\n
\r\n\r\nAll of that additional non-SQL overhead must be stuff relating to Python and template rendering code running on the page. I'm really surprised at how much overhead that is! This is worth researching separately.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1055469073, "label": "Research: CTEs and union all to calculate facets AND query at the same time"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1513#issuecomment-970845844", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1513", "id": 970845844, "node_id": "IC_kwDOBm6k_c453e6U", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-16T23:35:38Z", "updated_at": "2021-11-16T23:35:38Z", "author_association": "OWNER", "body": "I tried adding `cases > 10000` but the SQL query now takes too long - so moving this to my laptop.\r\n\r\n```\r\ncd /tmp\r\nwget https://covid-19.datasettes.com/covid.db\r\ndatasette covid.db \\\r\n --setting facet_time_limit_ms 10000 \\\r\n --setting sql_time_limit_ms 10000 \\\r\n --setting trace_debug 1\r\n```\r\n`http://127.0.0.1:8006/covid/ny_times_us_counties?_trace=1&_facet_size=3&_size=2&cases__gt=10000` shows in the traces:\r\n\r\n```json\r\n[\r\n {\r\n \"type\": \"sql\",\r\n \"start\": 12.693033525,\r\n \"end\": 12.694056904,\r\n \"duration_ms\": 1.0233789999993803,\r\n \"traceback\": [\r\n \" File \\\"/usr/local/Cellar/datasette/0.58.1/libexec/lib/python3.9/site-packages/datasette/views/base.py\\\", line 262, in get\\n return await self.view_get(\\n\",\r\n \" File \\\"/usr/local/Cellar/datasette/0.58.1/libexec/lib/python3.9/site-packages/datasette/views/base.py\\\", line 477, in view_get\\n response_or_template_contexts = await self.data(\\n\",\r\n \" File \\\"/usr/local/Cellar/datasette/0.58.1/libexec/lib/python3.9/site-packages/datasette/views/table.py\\\", line 705, in 
data\\n results = await db.execute(sql, params, truncate=True, **extra_args)\\n\"\r\n ],\r\n \"database\": \"covid\",\r\n \"sql\": \"select rowid, date, county, state, fips, cases, deaths from ny_times_us_counties where \\\"cases\\\" > :p0 order by rowid limit 3\",\r\n \"params\": {\r\n \"p0\": 10000\r\n }\r\n },\r\n {\r\n \"type\": \"sql\",\r\n \"start\": 12.694285093,\r\n \"end\": 12.814936275,\r\n \"duration_ms\": 120.65118200000136,\r\n \"traceback\": [\r\n \" File \\\"/usr/local/Cellar/datasette/0.58.1/libexec/lib/python3.9/site-packages/datasette/views/base.py\\\", line 262, in get\\n return await self.view_get(\\n\",\r\n \" File \\\"/usr/local/Cellar/datasette/0.58.1/libexec/lib/python3.9/site-packages/datasette/views/base.py\\\", line 477, in view_get\\n response_or_template_contexts = await self.data(\\n\",\r\n \" File \\\"/usr/local/Cellar/datasette/0.58.1/libexec/lib/python3.9/site-packages/datasette/views/table.py\\\", line 723, in data\\n count_rows = list(await db.execute(count_sql, from_sql_params))\\n\"\r\n ],\r\n \"database\": \"covid\",\r\n \"sql\": \"select count(*) from ny_times_us_counties where \\\"cases\\\" > :p0\",\r\n \"params\": {\r\n \"p0\": 10000\r\n }\r\n },\r\n {\r\n \"type\": \"sql\",\r\n \"start\": 12.818812089,\r\n \"end\": 12.851172544,\r\n \"duration_ms\": 32.360455000000954,\r\n \"traceback\": [\r\n \" File \\\"/usr/local/Cellar/datasette/0.58.1/libexec/lib/python3.9/site-packages/datasette/views/table.py\\\", line 856, in data\\n suggested_facets.extend(await facet.suggest())\\n\",\r\n \" File \\\"/usr/local/Cellar/datasette/0.58.1/libexec/lib/python3.9/site-packages/datasette/facets.py\\\", line 164, in suggest\\n distinct_values = await self.ds.execute(\\n\",\r\n \" File \\\"/usr/local/Cellar/datasette/0.58.1/libexec/lib/python3.9/site-packages/datasette/app.py\\\", line 634, in execute\\n return await self.databases[db_name].execute(\\n\"\r\n ],\r\n \"database\": \"covid\",\r\n \"sql\": \"select county, count(*) as n from 
(\\n select rowid, date, county, state, fips, cases, deaths from ny_times_us_counties where \\\"cases\\\" > :p0 \\n ) where county is not null\\n group by county\\n limit 4\",\r\n \"params\": {\r\n \"p0\": 10000\r\n }\r\n },\r\n {\r\n \"type\": \"sql\",\r\n \"start\": 12.851418868,\r\n \"end\": 12.871268359,\r\n \"duration_ms\": 19.84949100000044,\r\n \"traceback\": [\r\n \" File \\\"/usr/local/Cellar/datasette/0.58.1/libexec/lib/python3.9/site-packages/datasette/views/table.py\\\", line 856, in data\\n suggested_facets.extend(await facet.suggest())\\n\",\r\n \" File \\\"/usr/local/Cellar/datasette/0.58.1/libexec/lib/python3.9/site-packages/datasette/facets.py\\\", line 164, in suggest\\n distinct_values = await self.ds.execute(\\n\",\r\n \" File \\\"/usr/local/Cellar/datasette/0.58.1/libexec/lib/python3.9/site-packages/datasette/app.py\\\", line 634, in execute\\n return await self.databases[db_name].execute(\\n\"\r\n ],\r\n \"database\": \"covid\",\r\n \"sql\": \"select state, count(*) as n from (\\n select rowid, date, county, state, fips, cases, deaths from ny_times_us_counties where \\\"cases\\\" > :p0 \\n ) where state is not null\\n group by state\\n limit 4\",\r\n \"params\": {\r\n \"p0\": 10000\r\n }\r\n },\r\n {\r\n \"type\": \"sql\",\r\n \"start\": 12.871497655,\r\n \"end\": 12.897715027,\r\n \"duration_ms\": 26.217371999999628,\r\n \"traceback\": [\r\n \" File \\\"/usr/local/Cellar/datasette/0.58.1/libexec/lib/python3.9/site-packages/datasette/views/table.py\\\", line 856, in data\\n suggested_facets.extend(await facet.suggest())\\n\",\r\n \" File \\\"/usr/local/Cellar/datasette/0.58.1/libexec/lib/python3.9/site-packages/datasette/facets.py\\\", line 164, in suggest\\n distinct_values = await self.ds.execute(\\n\",\r\n \" File \\\"/usr/local/Cellar/datasette/0.58.1/libexec/lib/python3.9/site-packages/datasette/app.py\\\", line 634, in execute\\n return await self.databases[db_name].execute(\\n\"\r\n ],\r\n \"database\": \"covid\",\r\n \"sql\": \"select 
fips, count(*) as n from (\\n select rowid, date, county, state, fips, cases, deaths from ny_times_us_counties where \\\"cases\\\" > :p0 \\n ) where fips is not null\\n group by fips\\n limit 4\",\r\n \"params\": {\r\n \"p0\": 10000\r\n }\r\n }\r\n]\r\n```\r\nSo that's:\r\n```\r\nfetch rows: 1.0233789999993803 ms\r\ncount: 120.65118200000136 ms\r\nfacet county: 32.360455000000954 ms\r\nfacet state: 19.84949100000044 ms\r\nfacet fips: 26.217371999999628 ms\r\n```\r\n= 200.1 ms total\r\n\r\nCompared to: `http://127.0.0.1:8006/covid?sql=with+cte+as+(%0D%0A++select+rowid%2C+date%2C+county%2C+state%2C+fips%2C+cases%2C+deaths%0D%0A++from+ny_times_us_counties%0D%0A)%2C%0D%0Atruncated+as+(%0D%0A++select+null+as+_facet%2C+null+as+facet_name%2C+null+as+facet_count%2C+rowid%2C+date%2C+county%2C+state%2C+fips%2C+cases%2C+deaths%0D%0A++from+cte+order+by+date+desc+limit+4%0D%0A)%2C%0D%0Astate_facet+as+(%0D%0A++select+%27state%27+as+_facet%2C+state+as+facet_name%2C+count(*)+as+facet_count%2C%0D%0A++null%2C+null%2C+null%2C+null%2C+null%2C+null%2C+null%0D%0A++from+cte+group+by+facet_name+order+by+facet_count+desc+limit+3%0D%0A)%2C%0D%0Afips_facet+as+(%0D%0A++select+%27fips%27+as+_facet%2C+fips+as+facet_name%2C+count(*)+as+facet_count%2C%0D%0A++null%2C+null%2C+null%2C+null%2C+null%2C+null%2C+null%0D%0A++from+cte+group+by+facet_name+order+by+facet_count+desc+limit+3%0D%0A)%2C%0D%0Acounty_facet+as+(%0D%0A++select+%27county%27+as+_facet%2C+county+as+facet_name%2C+count(*)+as+facet_count%2C%0D%0A++null%2C+null%2C+null%2C+null%2C+null%2C+null%2C+null%0D%0A++from+cte+group+by+facet_name+order+by+facet_count+desc+limit+3%0D%0A)%0D%0Aselect+*+from+truncated%0D%0Aunion+all+select+*+from+state_facet%0D%0Aunion+all+select+*+from+fips_facet%0D%0Aunion+all+select+*+from+county_facet&_trace=1`\r\n\r\nWhich is 353ms total.\r\n\r\nThe separate queries ran faster! 
Really surprising result there.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1055469073, "label": "Research: CTEs and union all to calculate facets AND query at the same time"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1513#issuecomment-970828568", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1513", "id": 970828568, "node_id": "IC_kwDOBm6k_c453asY", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-16T23:27:11Z", "updated_at": "2021-11-16T23:27:11Z", "author_association": "OWNER", "body": "One last experiment: I'm going to try running an expensive query in the CTE portion.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1055469073, "label": "Research: CTEs and union all to calculate facets AND query at the same time"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1513#issuecomment-970827674", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1513", "id": 970827674, "node_id": "IC_kwDOBm6k_c453aea", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-16T23:26:58Z", "updated_at": "2021-11-16T23:26:58Z", "author_association": "OWNER", "body": "With trace.\r\n\r\nhttps://covid-19.datasettes.com/covid/ny_times_us_counties?_trace=1&_facet_size=3&_size=2&_trace=1 shows the following:\r\n\r\n```\r\nfetch rows: 0.41762600005768036 ms\r\nfacet state: 284.30423800000426 ms\r\nfacet county: 273.2565999999679 ms\r\nfacet fips: 197.80996999998024 ms\r\n```\r\n= 755.78843400001ms total\r\n\r\nIt didn't run a count because that's the homepage and the count is cached. 
So I dropped the count from the query and ran it:\r\n\r\nhttps://covid-19.datasettes.com/covid?sql=with+cte+as+(%0D%0A++select+rowid%2C+date%2C+county%2C+state%2C+fips%2C+cases%2C+deaths%0D%0A++from+ny_times_us_counties%0D%0A)%2C%0D%0Atruncated+as+(%0D%0A++select+null+as+_facet%2C+null+as+facet_name%2C+null+as+facet_count%2C+rowid%2C+date%2C+county%2C+state%2C+fips%2C+cases%2C+deaths%0D%0A++from+cte+order+by+date+desc+limit+4%0D%0A)%2C%0D%0Astate_facet+as+(%0D%0A++select+%27state%27+as+_facet%2C+state+as+facet_name%2C+count(*)+as+facet_count%2C%0D%0A++null%2C+null%2C+null%2C+null%2C+null%2C+null%2C+null%0D%0A++from+cte+group+by+facet_name+order+by+facet_count+desc+limit+3%0D%0A)%2C%0D%0Afips_facet+as+(%0D%0A++select+%27fips%27+as+_facet%2C+fips+as+facet_name%2C+count(*)+as+facet_count%2C%0D%0A++null%2C+null%2C+null%2C+null%2C+null%2C+null%2C+null%0D%0A++from+cte+group+by+facet_name+order+by+facet_count+desc+limit+3%0D%0A)%2C%0D%0Acounty_facet+as+(%0D%0A++select+%27county%27+as+_facet%2C+county+as+facet_name%2C+count(*)+as+facet_count%2C%0D%0A++null%2C+null%2C+null%2C+null%2C+null%2C+null%2C+null%0D%0A++from+cte+group+by+facet_name+order+by+facet_count+desc+limit+3%0D%0A)%0D%0Aselect+*+from+truncated%0D%0Aunion+all+select+*+from+state_facet%0D%0Aunion+all+select+*+from+fips_facet%0D%0Aunion+all+select+*+from+county_facet&_trace=1\r\n\r\nShows 649.4359889999259 ms for the query - compared to 755.78843400001ms for the separate. 
So it saved about 100ms.\r\n\r\nStill not a huge difference though!\r\n\r\n\r\n\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1055469073, "label": "Research: CTEs and union all to calculate facets AND query at the same time"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1513#issuecomment-970780866", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1513", "id": 970780866, "node_id": "IC_kwDOBm6k_c453PDC", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-16T23:01:57Z", "updated_at": "2021-11-16T23:01:57Z", "author_association": "OWNER", "body": "One disadvantage to this approach: if you have a SQL time limit of 1s and it takes 0.9s to return the rows but then 0.5s to calculate each of the requested facets the entire query will exceed the time limit.\r\n\r\nCould work around this by catching that error and then re-running the query just for the rows, but that would result in the user having to wait longer for the results.\r\n\r\nCould try to remember if that has happened using an in-memory Python data structure and skip the faceting optimization if it's caused problems in the past? That seems a bit gross.\r\n\r\nMaybe this becomes an opt-in optimization you can request in your `metadata.json` setting for that table, which massively increases the time limit? That's a bit weird too - now there are two separate implementations of the faceting logic, which had better have a REALLY big pay-off to be worth maintaining.\r\n\r\nWhat if we kept the query that returns the rows to be displayed on the page separate from the facets, but then executed all of the facets together using this method such that the `cte` only (presumably) has to be calculated once? 
That would still lead to multiple facets potentially exceeding the SQL time limit when single facets would not have.\r\n\r\nMaybe a better optimization would be to move facets to happening via `fetch()` calls from the client, so the user gets to see their rows instantly and the facets then appear as and when they are available (though it would cause page jank).\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1055469073, "label": "Research: CTEs and union all to calculate facets AND query at the same time"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1513#issuecomment-970766486", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1513", "id": 970766486, "node_id": "IC_kwDOBm6k_c453LiW", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-16T22:52:56Z", "updated_at": "2021-11-16T22:56:07Z", "author_association": "OWNER", "body": "https://covid-19.datasettes.com/covid is 805.2MB\r\n\r\nhttps://covid-19.datasettes.com/covid/ny_times_us_counties?_trace=1&_facet_size=3&_size=2\r\n\r\nEquivalent 
SQL:\r\n\r\nhttps://covid-19.datasettes.com/covid?sql=with+cte+as+%28%0D%0A++select+rowid%2C+date%2C+county%2C+state%2C+fips%2C+cases%2C+deaths%0D%0A++from+ny_times_us_counties%0D%0A%29%2C%0D%0Atruncated+as+%28%0D%0A++select+null+as+_facet%2C+null+as+facet_name%2C+null+as+facet_count%2C+rowid%2C+date%2C+county%2C+state%2C+fips%2C+cases%2C+deaths%0D%0A++from+cte+order+by+date+desc+limit+4%0D%0A%29%2C%0D%0Astate_facet+as+%28%0D%0A++select+%27state%27+as+_facet%2C+state+as+facet_name%2C+count%28*%29+as+facet_count%2C%0D%0A++null%2C+null%2C+null%2C+null%2C+null%2C+null%2C+null%0D%0A++from+cte+group+by+facet_name+order+by+facet_count+desc+limit+3%0D%0A%29%2C%0D%0Afips_facet+as+%28%0D%0A++select+%27fips%27+as+_facet%2C+fips+as+facet_name%2C+count%28*%29+as+facet_count%2C%0D%0A++null%2C+null%2C+null%2C+null%2C+null%2C+null%2C+null%0D%0A++from+cte+group+by+facet_name+order+by+facet_count+desc+limit+3%0D%0A%29%2C%0D%0Acounty_facet+as+%28%0D%0A++select+%27county%27+as+_facet%2C+county+as+facet_name%2C+count%28*%29+as+facet_count%2C%0D%0A++null%2C+null%2C+null%2C+null%2C+null%2C+null%2C+null%0D%0A++from+cte+group+by+facet_name+order+by+facet_count+desc+limit+3%0D%0A%29%2C%0D%0Atotal_count+as+%28%0D%0A++select+%27COUNT%27+as+_facet%2C+%27%27+as+facet_name%2C+count%28*%29+as+facet_count%2C%0D%0A++null%2C+null%2C+null%2C+null%2C+null%2C+null%2C+null%0D%0A++from+cte%0D%0A%29%0D%0Aselect+*+from+truncated%0D%0Aunion+all+select+*+from+state_facet%0D%0Aunion+all+select+*+from+fips_facet%0D%0Aunion+all+select+*+from+county_facet%0D%0Aunion+all+select+*+from+total_count\r\n\r\n```sql\r\nwith cte as (\r\n select rowid, date, county, state, fips, cases, deaths\r\n from ny_times_us_counties\r\n),\r\ntruncated as (\r\n select null as _facet, null as facet_name, null as facet_count, rowid, date, county, state, fips, cases, deaths\r\n from cte order by date desc limit 4\r\n),\r\nstate_facet as (\r\n select 'state' as _facet, state as facet_name, count(*) as facet_count,\r\n null, null, null, 
null, null, null, null\r\n from cte group by facet_name order by facet_count desc limit 3\r\n),\r\nfips_facet as (\r\n select 'fips' as _facet, fips as facet_name, count(*) as facet_count,\r\n null, null, null, null, null, null, null\r\n from cte group by facet_name order by facet_count desc limit 3\r\n),\r\ncounty_facet as (\r\n select 'county' as _facet, county as facet_name, count(*) as facet_count,\r\n null, null, null, null, null, null, null\r\n from cte group by facet_name order by facet_count desc limit 3\r\n),\r\ntotal_count as (\r\n select 'COUNT' as _facet, '' as facet_name, count(*) as facet_count,\r\n null, null, null, null, null, null, null\r\n from cte\r\n)\r\nselect * from truncated\r\nunion all select * from state_facet\r\nunion all select * from fips_facet\r\nunion all select * from county_facet\r\nunion all select * from total_count\r\n```\r\n\r\n_facet | facet_name | facet_count | rowid | date | county | state | fips | cases | deaths\r\n-- | -- | -- | -- | -- | -- | -- | -- | -- | --\r\n\u00a0 | \u00a0 | \u00a0 | 1917344 | 2021-11-15 | Autauga | Alabama | 1001 | 10407 | 154\r\n\u00a0 | \u00a0 | \u00a0 | 1917345 | 2021-11-15 | Baldwin | Alabama | 1003 | 37875 | 581\r\n\u00a0 | \u00a0 | \u00a0 | 1917346 | 2021-11-15 | Barbour | Alabama | 1005 | 3648 | 79\r\n\u00a0 | \u00a0 | \u00a0 | 1917347 | 2021-11-15 | Bibb | Alabama | 1007 | 4317 | 92\r\nstate | Texas | 148028 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0\r\nstate | Georgia | 96249 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0\r\nstate | Virginia | 79315 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0\r\nfips | \u00a0 | 17580 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0\r\nfips | 53061 | 665 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0\r\nfips | 17031 | 662 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0\r\ncounty | Washington | 18666 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | 
\u00a0\r\ncounty | Unknown | 15840 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0\r\ncounty | Jefferson | 15637 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0\r\nCOUNT | \u00a0 | 1920593 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1055469073, "label": "Research: CTEs and union all to calculate facets AND query at the same time"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1513#issuecomment-970770304", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1513", "id": 970770304, "node_id": "IC_kwDOBm6k_c453MeA", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-16T22:55:19Z", "updated_at": "2021-11-16T22:55:19Z", "author_association": "OWNER", "body": "(One thing I really like about this pattern is that it should work exactly the same when used to facet the results of arbitrary SQL queries as it does when faceting results from the table page.)", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1055469073, "label": "Research: CTEs and union all to calculate facets AND query at the same time"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1513#issuecomment-970767952", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1513", "id": 970767952, "node_id": "IC_kwDOBm6k_c453L5Q", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-16T22:53:52Z", "updated_at": "2021-11-16T22:53:52Z", "author_association": "OWNER", "body": "It's going to take another 15 minutes for the build to finish and deploy the version with `_trace=1`: https://github.com/simonw/covid-19-datasette/actions/runs/1469150112", 
"reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1055469073, "label": "Research: CTEs and union all to calculate facets AND query at the same time"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1513#issuecomment-970758179", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1513", "id": 970758179, "node_id": "IC_kwDOBm6k_c453Jgj", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-16T22:47:38Z", "updated_at": "2021-11-16T22:47:38Z", "author_association": "OWNER", "body": "Trace now enabled: https://global-power-plants.datasettes.com/global-power-plants/global-power-plants?_facet_size=3&_size=2&_nocount=1&_trace=1\r\n\r\nHere are the relevant traces:\r\n```json\r\n[\r\n {\r\n \"type\": \"sql\",\r\n \"start\": 31.214430154,\r\n \"end\": 31.214817089,\r\n \"duration_ms\": 0.3869350000016425,\r\n \"traceback\": [\r\n \" File \\\"/usr/local/lib/python3.8/site-packages/datasette/views/base.py\\\", line 262, in get\\n return await self.view_get(\\n\",\r\n \" File \\\"/usr/local/lib/python3.8/site-packages/datasette/views/base.py\\\", line 477, in view_get\\n response_or_template_contexts = await self.data(\\n\",\r\n \" File \\\"/usr/local/lib/python3.8/site-packages/datasette/views/table.py\\\", line 705, in data\\n results = await db.execute(sql, params, truncate=True, **extra_args)\\n\"\r\n ],\r\n \"database\": \"global-power-plants\",\r\n \"sql\": \"select rowid, country, country_long, name, gppd_idnr, capacity_mw, latitude, longitude, primary_fuel, other_fuel1, other_fuel2, other_fuel3, commissioning_year, owner, source, url, geolocation_source, wepp_id, year_of_capacity_data, generation_gwh_2013, generation_gwh_2014, generation_gwh_2015, generation_gwh_2016, generation_gwh_2017, generation_data_source, estimated_generation_gwh from [global-power-plants] order by rowid limit 
3\",\r\n \"params\": {}\r\n },\r\n {\r\n \"type\": \"sql\",\r\n \"start\": 31.215234586,\r\n \"end\": 31.220110342,\r\n \"duration_ms\": 4.875756000000564,\r\n \"traceback\": [\r\n \" File \\\"/usr/local/lib/python3.8/site-packages/datasette/views/table.py\\\", line 760, in data\\n ) = await facet.facet_results()\\n\",\r\n \" File \\\"/usr/local/lib/python3.8/site-packages/datasette/facets.py\\\", line 212, in facet_results\\n facet_rows_results = await self.ds.execute(\\n\",\r\n \" File \\\"/usr/local/lib/python3.8/site-packages/datasette/app.py\\\", line 634, in execute\\n return await self.databases[db_name].execute(\\n\"\r\n ],\r\n \"database\": \"global-power-plants\",\r\n \"sql\": \"select country_long as value, count(*) as count from (\\n select rowid, country, country_long, name, gppd_idnr, capacity_mw, latitude, longitude, primary_fuel, other_fuel1, other_fuel2, other_fuel3, commissioning_year, owner, source, url, geolocation_source, wepp_id, year_of_capacity_data, generation_gwh_2013, generation_gwh_2014, generation_gwh_2015, generation_gwh_2016, generation_gwh_2017, generation_data_source, estimated_generation_gwh from [global-power-plants] \\n )\\n where country_long is not null\\n group by country_long order by count desc, value limit 4\",\r\n \"params\": []\r\n },\r\n {\r\n \"type\": \"sql\",\r\n \"start\": 31.221062485,\r\n \"end\": 31.228968364,\r\n \"duration_ms\": 7.905878999999061,\r\n \"traceback\": [\r\n \" File \\\"/usr/local/lib/python3.8/site-packages/datasette/views/table.py\\\", line 760, in data\\n ) = await facet.facet_results()\\n\",\r\n \" File \\\"/usr/local/lib/python3.8/site-packages/datasette/facets.py\\\", line 212, in facet_results\\n facet_rows_results = await self.ds.execute(\\n\",\r\n \" File \\\"/usr/local/lib/python3.8/site-packages/datasette/app.py\\\", line 634, in execute\\n return await self.databases[db_name].execute(\\n\"\r\n ],\r\n \"database\": \"global-power-plants\",\r\n \"sql\": \"select owner as value, count(*) 
as count from (\\n select rowid, country, country_long, name, gppd_idnr, capacity_mw, latitude, longitude, primary_fuel, other_fuel1, other_fuel2, other_fuel3, commissioning_year, owner, source, url, geolocation_source, wepp_id, year_of_capacity_data, generation_gwh_2013, generation_gwh_2014, generation_gwh_2015, generation_gwh_2016, generation_gwh_2017, generation_data_source, estimated_generation_gwh from [global-power-plants] \\n )\\n where owner is not null\\n group by owner order by count desc, value limit 4\",\r\n \"params\": []\r\n },\r\n {\r\n \"type\": \"sql\",\r\n \"start\": 31.229809757,\r\n \"end\": 31.253902162,\r\n \"duration_ms\": 24.09240499999754,\r\n \"traceback\": [\r\n \" File \\\"/usr/local/lib/python3.8/site-packages/datasette/views/table.py\\\", line 760, in data\\n ) = await facet.facet_results()\\n\",\r\n \" File \\\"/usr/local/lib/python3.8/site-packages/datasette/facets.py\\\", line 212, in facet_results\\n facet_rows_results = await self.ds.execute(\\n\",\r\n \" File \\\"/usr/local/lib/python3.8/site-packages/datasette/app.py\\\", line 634, in execute\\n return await self.databases[db_name].execute(\\n\"\r\n ],\r\n \"database\": \"global-power-plants\",\r\n \"sql\": \"select primary_fuel as value, count(*) as count from (\\n select rowid, country, country_long, name, gppd_idnr, capacity_mw, latitude, longitude, primary_fuel, other_fuel1, other_fuel2, other_fuel3, commissioning_year, owner, source, url, geolocation_source, wepp_id, year_of_capacity_data, generation_gwh_2013, generation_gwh_2014, generation_gwh_2015, generation_gwh_2016, generation_gwh_2017, generation_data_source, estimated_generation_gwh from [global-power-plants] \\n )\\n where primary_fuel is not null\\n group by primary_fuel order by count desc, value limit 4\",\r\n \"params\": []\r\n },\r\n {\r\n \"type\": \"sql\",\r\n \"start\": 31.255699745,\r\n \"end\": 31.256243889,\r\n \"duration_ms\": 0.544143999999136,\r\n \"traceback\": [\r\n \" File 
\\\"/usr/local/lib/python3.8/site-packages/datasette/facets.py\\\", line 145, in suggest\\n row_count = await self.get_row_count()\\n\",\r\n \" File \\\"/usr/local/lib/python3.8/site-packages/datasette/facets.py\\\", line 132, in get_row_count\\n await self.ds.execute(\\n\",\r\n \" File \\\"/usr/local/lib/python3.8/site-packages/datasette/app.py\\\", line 634, in execute\\n return await self.databases[db_name].execute(\\n\"\r\n ],\r\n \"database\": \"global-power-plants\",\r\n \"sql\": \"select count(*) from (select rowid, country, country_long, name, gppd_idnr, capacity_mw, latitude, longitude, primary_fuel, other_fuel1, other_fuel2, other_fuel3, commissioning_year, owner, source, url, geolocation_source, wepp_id, year_of_capacity_data, generation_gwh_2013, generation_gwh_2014, generation_gwh_2015, generation_gwh_2016, generation_gwh_2017, generation_data_source, estimated_generation_gwh from [global-power-plants] )\",\r\n \"params\": []\r\n }\r\n]\r\n```\r\n```\r\nfetch rows: 0.3869350000016425 ms\r\nfacet country_long: 4.875756000000564 ms\r\nfacet owner: 7.905878999999061 ms\r\nfacet primary_fuel: 24.09240499999754 ms\r\ncount: 0.544143999999136 ms\r\n```\r\nTotal = 37.8ms\r\n\r\nI modified the query to include the total count as well: 
https://global-power-plants.datasettes.com/global-power-plants?sql=with+cte+as+%28%0D%0A++select+rowid%2C+country%2C+country_long%2C+name%2C+owner%2C+primary_fuel%0D%0A++from+%5Bglobal-power-plants%5D%0D%0A%29%2C%0D%0Atruncated+as+%28%0D%0A++select+null+as+_facet%2C+null+as+facet_name%2C+null+as+facet_count%2C+rowid%2C+country%2C+country_long%2C+name%2C+owner%2C+primary_fuel%0D%0A++from+cte+order+by+rowid+limit+4%0D%0A%29%2C%0D%0Acountry_long_facet+as+%28%0D%0A++select+%27country_long%27+as+_facet%2C+country_long+as+facet_name%2C+count%28*%29+as+facet_count%2C%0D%0A++null%2C+null%2C+null%2C+null%2C+null%2C+null%0D%0A++from+cte+group+by+facet_name+order+by+facet_count+desc+limit+3%0D%0A%29%2C%0D%0Aowner_facet+as+%28%0D%0A++select+%27owner%27+as+_facet%2C+owner+as+facet_name%2C+count%28*%29+as+facet_count%2C%0D%0A++null%2C+null%2C+null%2C+null%2C+null%2C+null%0D%0A++from+cte+group+by+facet_name+order+by+facet_count+desc+limit+3%0D%0A%29%2C%0D%0Aprimary_fuel_facet+as+%28%0D%0A++select+%27primary_fuel%27+as+_facet%2C+primary_fuel+as+facet_name%2C+count%28*%29+as+facet_count%2C%0D%0A++null%2C+null%2C+null%2C+null%2C+null%2C+null%0D%0A++from+cte+group+by+facet_name+order+by+facet_count+desc+limit+3%0D%0A%29%2C%0D%0Atotal_count+as+%28%0D%0A++select+%27COUNT%27+as+_facet%2C+%27%27+as+facet_name%2C+count%28*%29+as+facet_count%2C%0D%0A++null%2C+null%2C+null%2C+null%2C+null%2C+null%0D%0A++from+cte%0D%0A%29%0D%0Aselect+*+from+truncated%0D%0Aunion+all+select+*+from+country_long_facet%0D%0Aunion+all+select+*+from+owner_facet%0D%0Aunion+all+select+*+from+primary_fuel_facet%0D%0Aunion+all+select+*+from+total_count&_trace=1\r\n\r\n```sql\r\nwith cte as (\r\n select rowid, country, country_long, name, owner, primary_fuel\r\n from [global-power-plants]\r\n),\r\ntruncated as (\r\n select null as _facet, null as facet_name, null as facet_count, rowid, country, country_long, name, owner, primary_fuel\r\n from cte order by rowid limit 4\r\n),\r\ncountry_long_facet as (\r\n select 
'country_long' as _facet, country_long as facet_name, count(*) as facet_count,\r\n null, null, null, null, null, null\r\n from cte group by facet_name order by facet_count desc limit 3\r\n),\r\nowner_facet as (\r\n select 'owner' as _facet, owner as facet_name, count(*) as facet_count,\r\n null, null, null, null, null, null\r\n from cte group by facet_name order by facet_count desc limit 3\r\n),\r\nprimary_fuel_facet as (\r\n select 'primary_fuel' as _facet, primary_fuel as facet_name, count(*) as facet_count,\r\n null, null, null, null, null, null\r\n from cte group by facet_name order by facet_count desc limit 3\r\n),\r\ntotal_count as (\r\n select 'COUNT' as _facet, '' as facet_name, count(*) as facet_count,\r\n null, null, null, null, null, null\r\n from cte\r\n)\r\nselect * from truncated\r\nunion all select * from country_long_facet\r\nunion all select * from owner_facet\r\nunion all select * from primary_fuel_facet\r\nunion all select * from total_count\r\n```\r\nThe trace says that query took 34.801436999998714 ms.\r\n\r\nTo my huge surprise, this convoluted optimization only shaves the total query time down from 37.8ms to 34.8ms!\r\n\r\nThat entire database file is just 11.1 MB though. 
Maybe it would make a meaningful difference on something larger?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1055469073, "label": "Research: CTEs and union all to calculate facets AND query at the same time"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1513#issuecomment-970742415", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1513", "id": 970742415, "node_id": "IC_kwDOBm6k_c453FqP", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-16T22:37:14Z", "updated_at": "2021-11-16T22:37:14Z", "author_association": "OWNER", "body": "The query takes 42.794ms to run.\r\n\r\nHere's the equivalent page using separate queries: https://global-power-plants.datasettes.com/global-power-plants/global-power-plants?_facet_size=3&_size=2&_nocount=1\r\n\r\nAnnoyingly I can't disable facet suggestions but keep facets.\r\n\r\nI'm going to turn on tracing so I can see how long the separate queries took.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1055469073, "label": "Research: CTEs and union all to calculate facets AND query at the same time"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1513#issuecomment-970738130", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1513", "id": 970738130, "node_id": "IC_kwDOBm6k_c453EnS", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-16T22:32:19Z", "updated_at": "2021-11-16T22:32:19Z", "author_association": "OWNER", "body": "I came up with the following query which seems to work!\r\n\r\n```sql\r\nwith cte as (\r\n select rowid, country, country_long, name, owner, primary_fuel\r\n from [global-power-plants]\r\n),\r\ntruncated as (\r\n select 
null as _facet, null as facet_name, null as facet_count, rowid, country, country_long, name, owner, primary_fuel\r\n from cte order by rowid limit 4\r\n),\r\ncountry_long_facet as (\r\n select 'country_long' as _facet, country_long as facet_name, count(*) as facet_count,\r\n null, null, null, null, null, null\r\n from cte group by facet_name order by facet_count desc limit 3\r\n),\r\nowner_facet as (\r\n select 'owner' as _facet, owner as facet_name, count(*) as facet_count,\r\n null, null, null, null, null, null\r\n from cte group by facet_name order by facet_count desc limit 3\r\n),\r\nprimary_fuel_facet as (\r\n select 'primary_fuel' as _facet, primary_fuel as facet_name, count(*) as facet_count,\r\n null, null, null, null, null, null\r\n from cte group by facet_name order by facet_count desc limit 3\r\n)\r\nselect * from truncated\r\nunion all select * from country_long_facet\r\nunion all select * from owner_facet\r\nunion all select * from primary_fuel_facet\r\n```\r\n(Limits should be 101, 31, 31, 31 but I reduced size to get a shorter example table).\r\n\r\nResults [look like 
this](https://global-power-plants.datasettes.com/global-power-plants?sql=with+cte+as+%28%0D%0A++select+rowid%2C+country%2C+country_long%2C+name%2C+owner%2C+primary_fuel%0D%0A++from+%5Bglobal-power-plants%5D%0D%0A%29%2C%0D%0Atruncated+as+%28%0D%0A++select+null+as+_facet%2C+null+as+facet_name%2C+null+as+facet_count%2C+rowid%2C+country%2C+country_long%2C+name%2C+owner%2C+primary_fuel%0D%0A++from+cte+order+by+rowid+limit+4%0D%0A%29%2C%0D%0Acountry_long_facet+as+%28%0D%0A++select+%27country_long%27+as+_facet%2C+country_long+as+facet_name%2C+count%28*%29+as+facet_count%2C%0D%0A++null%2C+null%2C+null%2C+null%2C+null%2C+null%0D%0A++from+cte+group+by+facet_name+order+by+facet_count+desc+limit+3%0D%0A%29%2C%0D%0Aowner_facet+as+%28%0D%0A++select+%27owner%27+as+_facet%2C+owner+as+facet_name%2C+count%28*%29+as+facet_count%2C%0D%0A++null%2C+null%2C+null%2C+null%2C+null%2C+null%0D%0A++from+cte+group+by+facet_name+order+by+facet_count+desc+limit+3%0D%0A%29%2C%0D%0Aprimary_fuel_facet+as+%28%0D%0A++select+%27primary_fuel%27+as+_facet%2C+primary_fuel+as+facet_name%2C+count%28*%29+as+facet_count%2C%0D%0A++null%2C+null%2C+null%2C+null%2C+null%2C+null%0D%0A++from+cte+group+by+facet_name+order+by+facet_count+desc+limit+3%0D%0A%29%0D%0Aselect+*+from+truncated%0D%0Aunion+all+select+*+from+country_long_facet%0D%0Aunion+all+select+*+from+owner_facet%0D%0Aunion+all+select+*+from+primary_fuel_facet):\r\n\r\n_facet | facet_name | facet_count | rowid | country | country_long | name | owner | primary_fuel\r\n-- | -- | -- | -- | -- | -- | -- | -- | --\r\n\u00a0 | \u00a0 | \u00a0 | 1 | AFG | Afghanistan | Kajaki Hydroelectric Power Plant Afghanistan | \u00a0 | Hydro\r\n\u00a0 | \u00a0 | \u00a0 | 2 | AFG | Afghanistan | Kandahar DOG | \u00a0 | Solar\r\n\u00a0 | \u00a0 | \u00a0 | 3 | AFG | Afghanistan | Kandahar JOL | \u00a0 | Solar\r\n\u00a0 | \u00a0 | \u00a0 | 4 | AFG | Afghanistan | Mahipar Hydroelectric Power Plant Afghanistan | \u00a0 | Hydro\r\ncountry_long | United States of America | 8688 | 
\u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0\r\ncountry_long | China | 4235 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0\r\ncountry_long | United Kingdom | 2603 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0\r\nowner | \u00a0 | 14112 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0\r\nowner | Lightsource Renewable Energy | 120 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0\r\nowner | Cypress Creek Renewables | 109 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0\r\nprimary_fuel | Solar | 9662 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0\r\nprimary_fuel | Hydro | 7155 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0\r\nprimary_fuel | Wind | 5188 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0\r\n\r\nThis is a neat proof of concept. ", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1055469073, "label": "Research: CTEs and union all to calculate facets AND query at the same time"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1512#issuecomment-970718337", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1512", "id": 970718337, "node_id": "IC_kwDOBm6k_c452_yB", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-16T22:02:30Z", "updated_at": "2021-11-16T22:02:30Z", "author_association": "OWNER", "body": "I've decided to make the clever `asyncio` dependency injection opt-in - so you can either decorate with `@inject` or you can set `inject_all = True` on the class - for example:\r\n```python\r\nimport asyncio\r\nfrom datasette.utils.asyncdi import AsyncBase, inject\r\n\r\n\r\nclass Simple(AsyncBase):\r\n def __init__(self):\r\n self.log = []\r\n\r\n @inject\r\n async def two(self):\r\n self.log.append(\"two\")\r\n\r\n @inject\r\n async def one(self, two):\r\n self.log.append(\"one\")\r\n return self.log\r\n\r\n async 
def not_inject(self, one, two):\r\n return one + two\r\n\r\n\r\nclass Complex(AsyncBase):\r\n inject_all = True\r\n\r\n def __init__(self):\r\n self.log = []\r\n\r\n async def b(self):\r\n self.log.append(\"b\")\r\n\r\n async def a(self, b):\r\n self.log.append(\"a\")\r\n\r\n async def go(self, a):\r\n self.log.append(\"go\")\r\n return self.log\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1055402144, "label": "New pattern for async view classes"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/878#issuecomment-970712713", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/878", "id": 970712713, "node_id": "IC_kwDOBm6k_c452-aJ", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-16T21:54:33Z", "updated_at": "2021-11-16T21:54:33Z", "author_association": "OWNER", "body": "I'm going to continue working on this in a PR.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 648435885, "label": "New pattern for views that return either JSON or HTML, available for plugins"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/878#issuecomment-970705738", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/878", "id": 970705738, "node_id": "IC_kwDOBm6k_c4528tK", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-16T21:44:31Z", "updated_at": "2021-11-16T21:44:31Z", "author_association": "OWNER", "body": "Wrote a TIL about what I learned using `TopologicalSorter`: https://til.simonwillison.net/python/graphlib-topologicalsorter", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 
648435885, "label": "New pattern for views that return either JSON or HTML, available for plugins"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/878#issuecomment-970673085", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/878", "id": 970673085, "node_id": "IC_kwDOBm6k_c4520u9", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-16T20:58:24Z", "updated_at": "2021-11-16T20:58:24Z", "author_association": "OWNER", "body": "New test:\r\n```python\r\n\r\nclass Complex(AsyncBase):\r\n def __init__(self):\r\n self.log = []\r\n\r\n async def d(self):\r\n await asyncio.sleep(random() * 0.1)\r\n print(\"LOG: d\")\r\n self.log.append(\"d\")\r\n\r\n async def c(self):\r\n await asyncio.sleep(random() * 0.1)\r\n print(\"LOG: c\")\r\n self.log.append(\"c\")\r\n\r\n async def b(self, c, d):\r\n print(\"LOG: b\")\r\n self.log.append(\"b\")\r\n\r\n async def a(self, b, c):\r\n print(\"LOG: a\")\r\n self.log.append(\"a\")\r\n\r\n async def go(self, a):\r\n print(\"LOG: go\")\r\n self.log.append(\"go\")\r\n return self.log\r\n\r\n\r\n@pytest.mark.asyncio\r\nasync def test_complex():\r\n result = await Complex().go()\r\n # 'c' should only be called once\r\n assert tuple(result) in (\r\n # c and d could happen in either order\r\n (\"c\", \"d\", \"b\", \"a\", \"go\"),\r\n (\"d\", \"c\", \"b\", \"a\", \"go\"),\r\n )\r\n```\r\nAnd this code passes it:\r\n```python\r\nimport asyncio\r\nfrom functools import wraps\r\nimport inspect\r\n\r\ntry:\r\n import graphlib\r\nexcept ImportError:\r\n from . 
import vendored_graphlib as graphlib\r\n\r\n\r\nclass AsyncMeta(type):\r\n def __new__(cls, name, bases, attrs):\r\n # Decorate any items that are 'async def' methods\r\n _registry = {}\r\n new_attrs = {\"_registry\": _registry}\r\n for key, value in attrs.items():\r\n if inspect.iscoroutinefunction(value) and not value.__name__ == \"resolve\":\r\n new_attrs[key] = make_method(value)\r\n _registry[key] = new_attrs[key]\r\n else:\r\n new_attrs[key] = value\r\n # Gather graph for later dependency resolution\r\n graph = {\r\n key: {\r\n p\r\n for p in inspect.signature(method).parameters.keys()\r\n if p != \"self\" and not p.startswith(\"_\")\r\n }\r\n for key, method in _registry.items()\r\n }\r\n new_attrs[\"_graph\"] = graph\r\n return super().__new__(cls, name, bases, new_attrs)\r\n\r\n\r\ndef make_method(method):\r\n parameters = inspect.signature(method).parameters.keys()\r\n\r\n @wraps(method)\r\n async def inner(self, _results=None, **kwargs):\r\n print(\"\\n{}.{}({}) _results={}\".format(self, method.__name__, kwargs, _results))\r\n\r\n # Any parameters not provided by kwargs are resolved from registry\r\n to_resolve = [p for p in parameters if p not in kwargs and p != \"self\"]\r\n missing = [p for p in to_resolve if p not in self._registry]\r\n assert (\r\n not missing\r\n ), \"The following DI parameters could not be found in the registry: {}\".format(\r\n missing\r\n )\r\n\r\n results = {}\r\n results.update(kwargs)\r\n if to_resolve:\r\n resolved_parameters = await self.resolve(to_resolve, _results)\r\n results.update(resolved_parameters)\r\n return_value = await method(self, **results)\r\n if _results is not None:\r\n _results[method.__name__] = return_value\r\n return return_value\r\n\r\n return inner\r\n\r\n\r\nclass AsyncBase(metaclass=AsyncMeta):\r\n async def resolve(self, names, results=None):\r\n print(\"\\n resolve: \", names)\r\n if results is None:\r\n results = {}\r\n\r\n # Come up with an execution plan, just for these nodes\r\n ts = 
graphlib.TopologicalSorter()\r\n to_do = set(names)\r\n done = set()\r\n while to_do:\r\n item = to_do.pop()\r\n dependencies = self._graph[item]\r\n ts.add(item, *dependencies)\r\n done.add(item)\r\n # Add any not-done dependencies to the queue\r\n to_do.update({k for k in dependencies if k not in done})\r\n\r\n ts.prepare()\r\n plan = []\r\n while ts.is_active():\r\n node_group = ts.get_ready()\r\n plan.append(node_group)\r\n ts.done(*node_group)\r\n\r\n print(\"plan:\", plan)\r\n\r\n results = {}\r\n for node_group in plan:\r\n awaitables = [\r\n self._registry[name](\r\n self,\r\n _results=results,\r\n **{k: v for k, v in results.items() if k in self._graph[name]},\r\n )\r\n for name in node_group\r\n ]\r\n print(\" results = \", results)\r\n print(\" awaitables: \", awaitables)\r\n awaitable_results = await asyncio.gather(*awaitables)\r\n results.update(\r\n {p[0].__name__: p[1] for p in zip(awaitables, awaitable_results)}\r\n )\r\n\r\n print(\" End of resolve(), returning\", results)\r\n return {key: value for key, value in results.items() if key in names}\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 648435885, "label": "New pattern for views that return either JSON or HTML, available for plugins"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/878#issuecomment-970660299", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/878", "id": 970660299, "node_id": "IC_kwDOBm6k_c452xnL", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-16T20:39:43Z", "updated_at": "2021-11-16T20:42:27Z", "author_association": "OWNER", "body": "But that does seem to be the plan that `TopologicalSorter` provides:\r\n```python\r\ngraph = {\"go\": {\"a\"}, \"a\": {\"b\", \"c\"}, \"b\": {\"c\", \"d\"}}\r\n\r\nts = TopologicalSorter(graph)\r\nts.prepare()\r\nwhile ts.is_active():\r\n 
nodes = ts.get_ready()\r\n print(nodes)\r\n ts.done(*nodes)\r\n```\r\nOutputs:\r\n```\r\n('c', 'd')\r\n('b',)\r\n('a',)\r\n('go',)\r\n```\r\nAlso:\r\n```python\r\ngraph = {\"go\": {\"d\", \"e\", \"f\"}, \"d\": {\"b\", \"c\"}, \"b\": {\"c\"}}\r\n\r\nts = TopologicalSorter(graph)\r\nts.prepare()\r\nwhile ts.is_active():\r\n nodes = ts.get_ready()\r\n print(nodes)\r\n ts.done(*nodes)\r\n```\r\nGives:\r\n```\r\n('e', 'f', 'c')\r\n('b',)\r\n('d',)\r\n('go',)\r\n```\r\nI'm confident that `TopologicalSorter` is the way to do this. I think I need to rewrite my code to call it once to get that plan, then `await asyncio.gather(*nodes)` in turn to execute it.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 648435885, "label": "New pattern for views that return either JSON or HTML, available for plugins"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/878#issuecomment-970657874", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/878", "id": 970657874, "node_id": "IC_kwDOBm6k_c452xBS", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-16T20:36:01Z", "updated_at": "2021-11-16T20:36:01Z", "author_association": "OWNER", "body": "My goal here is to calculate the most efficient way to resolve the different nodes, running them in parallel where possible.\r\n\r\nSo for this class:\r\n\r\n```python\r\nclass Complex(AsyncBase):\r\n async def d(self):\r\n pass\r\n\r\n async def c(self):\r\n pass\r\n\r\n async def b(self, c, d):\r\n pass\r\n\r\n async def a(self, b, c):\r\n pass\r\n\r\n async def go(self, a):\r\n pass\r\n```\r\nA call to `go()` should do this:\r\n\r\n- `c` and `d` in parallel\r\n- `b`\r\n- `a`\r\n- `go`", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 
648435885, "label": "New pattern for views that return either JSON or HTML, available for plugins"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/878#issuecomment-970655927", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/878", "id": 970655927, "node_id": "IC_kwDOBm6k_c452wi3", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-16T20:33:11Z", "updated_at": "2021-11-16T20:33:11Z", "author_association": "OWNER", "body": "What should be happening here instead is it should resolve the full graph and notice that `c` is depended on by both `b` and `a` - so it should run `c` first, then run the next ones in parallel.\r\n\r\nSo maybe the algorithm I'm inheriting from https://docs.python.org/3/library/graphlib.html isn't the correct algorithm?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 648435885, "label": "New pattern for views that return either JSON or HTML, available for plugins"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/878#issuecomment-970655304", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/878", "id": 970655304, "node_id": "IC_kwDOBm6k_c452wZI", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-16T20:32:16Z", "updated_at": "2021-11-16T20:32:16Z", "author_association": "OWNER", "body": "This code is really fiddly. I just got to this version:\r\n```python\r\nimport asyncio\r\nfrom functools import wraps\r\nimport inspect\r\n\r\ntry:\r\n import graphlib\r\nexcept ImportError:\r\n from . 
import vendored_graphlib as graphlib\r\n\r\n\r\nclass AsyncMeta(type):\r\n def __new__(cls, name, bases, attrs):\r\n # Decorate any items that are 'async def' methods\r\n _registry = {}\r\n new_attrs = {\"_registry\": _registry}\r\n for key, value in attrs.items():\r\n if inspect.iscoroutinefunction(value) and not value.__name__ == \"resolve\":\r\n new_attrs[key] = make_method(value)\r\n _registry[key] = new_attrs[key]\r\n else:\r\n new_attrs[key] = value\r\n # Gather graph for later dependency resolution\r\n graph = {\r\n key: {\r\n p\r\n for p in inspect.signature(method).parameters.keys()\r\n if p != \"self\" and not p.startswith(\"_\")\r\n }\r\n for key, method in _registry.items()\r\n }\r\n new_attrs[\"_graph\"] = graph\r\n return super().__new__(cls, name, bases, new_attrs)\r\n\r\n\r\ndef make_method(method):\r\n @wraps(method)\r\n async def inner(self, _results=None, **kwargs):\r\n print(\"inner - _results=\", _results)\r\n parameters = inspect.signature(method).parameters.keys()\r\n # Any parameters not provided by kwargs are resolved from registry\r\n to_resolve = [p for p in parameters if p not in kwargs and p != \"self\"]\r\n missing = [p for p in to_resolve if p not in self._registry]\r\n assert (\r\n not missing\r\n ), \"The following DI parameters could not be found in the registry: {}\".format(\r\n missing\r\n )\r\n results = {}\r\n results.update(kwargs)\r\n if to_resolve:\r\n resolved_parameters = await self.resolve(to_resolve, _results)\r\n results.update(resolved_parameters)\r\n return_value = await method(self, **results)\r\n if _results is not None:\r\n _results[method.__name__] = return_value\r\n return return_value\r\n\r\n return inner\r\n\r\n\r\nclass AsyncBase(metaclass=AsyncMeta):\r\n async def resolve(self, names, results=None):\r\n print(\"\\n resolve: \", names)\r\n if results is None:\r\n results = {}\r\n\r\n # Resolve them in the correct order\r\n ts = graphlib.TopologicalSorter()\r\n for name in names:\r\n ts.add(name, 
*self._graph[name])\r\n ts.prepare()\r\n\r\n async def resolve_nodes(nodes):\r\n print(\" resolve_nodes\", nodes)\r\n print(\" (current results = {})\".format(repr(results)))\r\n awaitables = [\r\n self._registry[name](\r\n self,\r\n _results=results,\r\n **{k: v for k, v in results.items() if k in self._graph[name]},\r\n )\r\n for name in nodes\r\n if name not in results\r\n ]\r\n print(\" awaitables: \", awaitables)\r\n awaitable_results = await asyncio.gather(*awaitables)\r\n results.update(\r\n {p[0].__name__: p[1] for p in zip(awaitables, awaitable_results)}\r\n )\r\n\r\n if not ts.is_active():\r\n # Nothing has dependencies - just resolve directly\r\n print(\" no dependencies, resolve directly\")\r\n await resolve_nodes(names)\r\n else:\r\n # Resolve in topological order\r\n while ts.is_active():\r\n nodes = ts.get_ready()\r\n print(\" ts.get_ready() returned nodes:\", nodes)\r\n await resolve_nodes(nodes)\r\n for node in nodes:\r\n ts.done(node)\r\n\r\n print(\" End of resolve(), returning\", results)\r\n return {key: value for key, value in results.items() if key in names}\r\n```\r\nWith this test:\r\n```python\r\nclass Complex(AsyncBase):\r\n def __init__(self):\r\n self.log = []\r\n\r\n async def c(self):\r\n print(\"LOG: c\")\r\n self.log.append(\"c\")\r\n\r\n async def b(self, c):\r\n print(\"LOG: b\")\r\n self.log.append(\"b\")\r\n\r\n async def a(self, b, c):\r\n print(\"LOG: a\")\r\n self.log.append(\"a\")\r\n\r\n async def go(self, a):\r\n print(\"LOG: go\")\r\n self.log.append(\"go\")\r\n return self.log\r\n\r\n\r\n@pytest.mark.asyncio\r\nasync def test_complex():\r\n result = await Complex().go()\r\n # 'c' should only be called once\r\n assert result == [\"c\", \"b\", \"a\", \"go\"]\r\n```\r\nThis test sometimes passes, and sometimes fails!\r\n\r\nOutput for a pass:\r\n```\r\ntests/test_asyncdi.py inner - _results= None\r\n\r\n resolve: ['a']\r\n ts.get_ready() returned nodes: ('c', 'b')\r\n resolve_nodes ('c', 'b')\r\n (current results = {})\r\n 
awaitables: [, ]\r\ninner - _results= {}\r\nLOG: c\r\ninner - _results= {'c': None}\r\n\r\n resolve: ['c']\r\n ts.get_ready() returned nodes: ('c',)\r\n resolve_nodes ('c',)\r\n (current results = {'c': None})\r\n awaitables: []\r\n End of resolve(), returning {'c': None}\r\nLOG: b\r\n ts.get_ready() returned nodes: ('a',)\r\n resolve_nodes ('a',)\r\n (current results = {'c': None, 'b': None})\r\n awaitables: []\r\ninner - _results= {'c': None, 'b': None}\r\nLOG: a\r\n End of resolve(), returning {'c': None, 'b': None, 'a': None}\r\nLOG: go\r\n```\r\nOutput for a fail:\r\n```\r\ntests/test_asyncdi.py inner - _results= None\r\n\r\n resolve: ['a']\r\n ts.get_ready() returned nodes: ('b', 'c')\r\n resolve_nodes ('b', 'c')\r\n (current results = {})\r\n awaitables: [, ]\r\ninner - _results= {}\r\n\r\n resolve: ['c']\r\n ts.get_ready() returned nodes: ('c',)\r\n resolve_nodes ('c',)\r\n (current results = {})\r\n awaitables: []\r\ninner - _results= {}\r\nLOG: c\r\ninner - _results= {'c': None}\r\nLOG: c\r\n End of resolve(), returning {'c': None}\r\nLOG: b\r\n ts.get_ready() returned nodes: ('a',)\r\n resolve_nodes ('a',)\r\n (current results = {'c': None, 'b': None})\r\n awaitables: []\r\ninner - _results= {'c': None, 'b': None}\r\nLOG: a\r\n End of resolve(), returning {'c': None, 'b': None, 'a': None}\r\nLOG: go\r\nF\r\n\r\n=================================================================================================== FAILURES ===================================================================================================\r\n_________________________________________________________________________________________________ test_complex _________________________________________________________________________________________________\r\n\r\n @pytest.mark.asyncio\r\n async def test_complex():\r\n result = await Complex().go()\r\n # 'c' should only be called once\r\n> assert result == [\"c\", \"b\", \"a\", \"go\"]\r\nE AssertionError: assert ['c', 'c', 'b', 'a', 
'go'] == ['c', 'b', 'a', 'go']\r\nE At index 1 diff: 'c' != 'b'\r\nE Left contains one more item: 'go'\r\nE Use -v to get the full diff\r\n\r\ntests/test_asyncdi.py:48: AssertionError\r\n================== short test summary info ================================\r\nFAILED tests/test_asyncdi.py::test_complex - AssertionError: assert ['c', 'c', 'b', 'a', 'go'] == ['c', 'b', 'a', 'go']\r\n```\r\nI figured out why this is happening.\r\n\r\n`a` requires `b` and `c`\r\n\r\n`b` also requires `c`\r\n\r\nThe code decides to run `b` and `c` in parallel.\r\n\r\nIf `c` completes first, then when `b` runs it gets to use the already-calculated result for `c` - so it doesn't need to call `c` again.\r\n\r\nIf `b` gets to that point before `c` does, it also needs to call `c`.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 648435885, "label": "New pattern for views that return either JSON or HTML, available for plugins"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/878#issuecomment-970624197", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/878", "id": 970624197, "node_id": "IC_kwDOBm6k_c452ozF", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-16T19:49:05Z", "updated_at": "2021-11-16T19:49:05Z", "author_association": "OWNER", "body": "Here's the latest version of my weird dependency injection async class:\r\n```python\r\nimport inspect\r\n\r\nclass AsyncMeta(type):\r\n def __new__(cls, name, bases, attrs):\r\n # Decorate any items that are 'async def' methods\r\n _registry = {}\r\n new_attrs = {\"_registry\": _registry}\r\n for key, value in attrs.items():\r\n if inspect.iscoroutinefunction(value) and not value.__name__ == \"resolve\":\r\n new_attrs[key] = make_method(value)\r\n _registry[key] = new_attrs[key]\r\n else:\r\n new_attrs[key] = value\r\n\r\n # Topological sort of 
_registry by parameter dependencies\r\n graph = {\r\n key: {\r\n p for p in inspect.signature(method).parameters.keys()\r\n if p != \"self\" and not p.startswith(\"_\")\r\n }\r\n for key, method in _registry.items()\r\n }\r\n new_attrs[\"_graph\"] = graph\r\n return super().__new__(cls, name, bases, new_attrs)\r\n\r\n\r\ndef make_method(method):\r\n @wraps(method)\r\n async def inner(self, **kwargs):\r\n parameters = inspect.signature(method).parameters.keys()\r\n # Any parameters not provided by kwargs are resolved from registry\r\n to_resolve = [p for p in parameters if p not in kwargs and p != \"self\"]\r\n missing = [p for p in to_resolve if p not in self._registry]\r\n assert (\r\n not missing\r\n ), \"The following DI parameters could not be found in the registry: {}\".format(\r\n missing\r\n )\r\n results = {}\r\n results.update(kwargs)\r\n results.update(await self.resolve(to_resolve))\r\n return await method(self, **results)\r\n\r\n return inner\r\n\r\n\r\nbad = [0]\r\n\r\nclass AsyncBase(metaclass=AsyncMeta):\r\n async def resolve(self, names):\r\n print(\" resolve({})\".format(names))\r\n results = {}\r\n # Resolve them in the correct order\r\n ts = TopologicalSorter()\r\n ts2 = TopologicalSorter()\r\n print(\" names = \", names)\r\n print(\" self._graph = \", self._graph)\r\n for name in names:\r\n if self._graph[name]:\r\n ts.add(name, *self._graph[name])\r\n ts2.add(name, *self._graph[name])\r\n print(\" static_order =\", tuple(ts2.static_order()))\r\n ts.prepare()\r\n while ts.is_active():\r\n print(\" is_active, i = \", bad[0])\r\n bad[0] += 1\r\n if bad[0] > 20:\r\n print(\" Infinite loop?\")\r\n break\r\n nodes = ts.get_ready()\r\n print(\" Do nodes:\", nodes)\r\n awaitables = [self._registry[name](self, **{\r\n k: v for k, v in results.items() if k in self._graph[name]\r\n }) for name in nodes]\r\n print(\" awaitables: \", awaitables)\r\n awaitable_results = await asyncio.gather(*awaitables)\r\n results.update({\r\n p[0].__name__: p[1] for p in 
zip(awaitables, awaitable_results)\r\n })\r\n print(results)\r\n for node in nodes:\r\n ts.done(node)\r\n\r\n return results\r\n```\r\nExample usage:\r\n```python\r\nclass Foo(AsyncBase):\r\n async def graa(self, boff):\r\n print(\"graa\")\r\n return 5\r\n async def boff(self):\r\n print(\"boff\")\r\n return 8\r\n async def other(self, boff, graa):\r\n print(\"other\")\r\n return 5 + boff + graa\r\n\r\nfoo = Foo()\r\nawait foo.other()\r\n```\r\nOutput:\r\n```\r\n resolve(['boff', 'graa'])\r\n names = ['boff', 'graa']\r\n self._graph = {'graa': {'boff'}, 'boff': set(), 'other': {'graa', 'boff'}}\r\n static_order = ('boff', 'graa')\r\n is_active, i = 0\r\n Do nodes: ('boff',)\r\n awaitables: []\r\n resolve([])\r\n names = []\r\n self._graph = {'graa': {'boff'}, 'boff': set(), 'other': {'graa', 'boff'}}\r\n static_order = ()\r\nboff\r\n{'boff': 8}\r\n is_active, i = 1\r\n Do nodes: ('graa',)\r\n awaitables: []\r\n resolve([])\r\n names = []\r\n self._graph = {'graa': {'boff'}, 'boff': set(), 'other': {'graa', 'boff'}}\r\n static_order = ()\r\ngraa\r\n{'boff': 8, 'graa': 5}\r\nother\r\n18\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 648435885, "label": "New pattern for views that return either JSON or HTML, available for plugins"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/782#issuecomment-970554697", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/782", "id": 970554697, "node_id": "IC_kwDOBm6k_c452X1J", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-16T18:32:03Z", "updated_at": "2021-11-16T18:32:03Z", "author_association": "OWNER", "body": "I'm going to take another look at this:\r\n- https://github.com/simonw/datasette/issues/878", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, 
\"rocket\": 0, \"eyes\": 0}", "issue": {"value": 627794879, "label": "Redesign default .json format"}, "performed_via_github_app": null}