github
html_url | issue_url | id | node_id | user | created_at | updated_at | author_association | body | reactions | issue | performed_via_github_app |
---|---|---|---|---|---|---|---|---|---|---|---|
https://github.com/simonw/datasette/issues/1519#issuecomment-974562942 | https://api.github.com/repos/simonw/datasette/issues/1519 | 974562942 | IC_kwDOBm6k_c46FqZ- | 9599 | 2021-11-20T00:59:32Z | 2021-11-20T00:59:32Z | OWNER | Ouch a nasty bug crept through there - https://datasette-apache-proxy-demo-j7hipcg4aq-uc.a.run.app/prefix/fixtures/compound_three_primary_keys says > 500: name 'ds' is not defined | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058790545 | |
https://github.com/simonw/datasette/issues/1519#issuecomment-974561593 | https://api.github.com/repos/simonw/datasette/issues/1519 | 974561593 | IC_kwDOBm6k_c46FqE5 | 9599 | 2021-11-20T00:53:19Z | 2021-11-20T00:53:19Z | OWNER | Adding that test found (I hope!) all of the remaining `base_url` bugs. There were a bunch! I think I finally get to close #838 too. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058790545 | |
https://github.com/simonw/datasette/issues/1519#issuecomment-974559176 | https://api.github.com/repos/simonw/datasette/issues/1519 | 974559176 | IC_kwDOBm6k_c46FpfI | 9599 | 2021-11-20T00:42:08Z | 2021-11-20T00:42:08Z | OWNER | > In the meantime I can catch these errors by changing the test to run each path twice, once with and once without the prefix. This should accurately simulate how Apache is working here. This worked, I managed to get the tests to fail! Here's the change I made: ```diff diff --git a/tests/test_html.py b/tests/test_html.py index f24165b..dbdfe59 100644 --- a/tests/test_html.py +++ b/tests/test_html.py @@ -1614,12 +1614,19 @@ def test_metadata_sort_desc(app_client): "/fixtures/compound_three_primary_keys/a,a,a", "/fixtures/paginated_view", "/fixtures/facetable", + "/fixtures?sql=select+1", ], ) -def test_base_url_config(app_client_base_url_prefix, path): +@pytest.mark.parametrize("use_prefix", (True, False)) +def test_base_url_config(app_client_base_url_prefix, path, use_prefix): client = app_client_base_url_prefix - response = client.get("/prefix/" + path.lstrip("/")) + path_to_get = path + if use_prefix: + path_to_get = "/prefix/" + path.lstrip("/") + response = client.get(path_to_get) soup = Soup(response.body, "html.parser") + if path == "/fixtures?sql=select+1": + assert False for el in soup.findAll(["a", "link", "script"]): if "href" in el.attrs: href = el["href"] @@ -1642,11 +1649,12 @@ def test_base_url_config(app_client_base_url_prefix, path): # If this has been made absolute it may start http://localhost/ if href.startswith("http://localhost/"): href = href[len("http://localost/") :] - assert href.startswith("/prefix/"), { + assert href.startswith("/prefix/"), json.dumps({ "path": path, + "path_to_get": path_to_get, "href_or_src": href, "element_parent": str(el.parent), - } + }, indent=4, default=repr) def test_base_url_affects_metadata_extra_css_urls(app_client_base_url_prefix): ``` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, 
"hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058790545 | |
https://github.com/simonw/datasette/issues/1519#issuecomment-974558267 | https://api.github.com/repos/simonw/datasette/issues/1519 | 974558267 | IC_kwDOBm6k_c46FpQ7 | 9599 | 2021-11-20T00:37:57Z | 2021-11-20T00:37:57Z | OWNER | Thanks to #1522 I have a live demo that exhibits this bug now: https://apache-proxy-demo.datasette.io/prefix/fixtures/attraction_characteristic | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058790545 | |
https://github.com/simonw/datasette/issues/1522#issuecomment-974558076 | https://api.github.com/repos/simonw/datasette/issues/1522 | 974558076 | IC_kwDOBm6k_c46FpN8 | 9599 | 2021-11-20T00:36:56Z | 2021-11-20T00:36:56Z | OWNER | That 503 error is _really_ frustrating: I have a deploy running at https://apache-proxy-demo.datasette.io/prefix/ and after a fresh deploy it serves 503 errors for quite a while - then eventually starts working. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058896236 | |
https://github.com/simonw/datasette/issues/1522#issuecomment-974557766 | https://api.github.com/repos/simonw/datasette/issues/1522 | 974557766 | IC_kwDOBm6k_c46FpJG | 9599 | 2021-11-20T00:35:25Z | 2021-11-20T00:35:25Z | OWNER | Wrote a TIL about `--build-arg` and Cloud Run: https://til.simonwillison.net/cloudrun/using-build-args-with-cloud-run | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058896236 | |
https://github.com/simonw/datasette/issues/1522#issuecomment-974542348 | https://api.github.com/repos/simonw/datasette/issues/1522 | 974542348 | IC_kwDOBm6k_c46FlYM | 9599 | 2021-11-19T23:41:47Z | 2021-11-19T23:44:07Z | OWNER | Do I have to use `cloudbuild.yml` to specify these? https://stackoverflow.com/a/58327340/6083 and https://stackoverflow.com/a/66232670/6083 suggest I do. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058896236 | |
https://github.com/simonw/datasette/issues/1522#issuecomment-974541971 | https://api.github.com/repos/simonw/datasette/issues/1522 | 974541971 | IC_kwDOBm6k_c46FlST | 9599 | 2021-11-19T23:40:32Z | 2021-11-19T23:40:32Z | OWNER | I want to be able to use build arguments to specify which commit version or branch of Datasette to deploy. This is proving hard to work out. I have this in my Dockerfile now: ``` ARG DATASETTE_REF RUN pip install https://github.com/simonw/datasette/archive/${DATASETTE_REF}.zip ``` Which works locally: docker build -t datasette-apache-proxy-demo . \ --build-arg DATASETTE_REF=c617e1769ea27e045b0f2907ef49a9a1244e577d But I can't figure out the right incantation to pass to `gcloud builds submit`. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058896236 | |
https://github.com/simonw/datasette/issues/1522#issuecomment-974523569 | https://api.github.com/repos/simonw/datasette/issues/1522 | 974523569 | IC_kwDOBm6k_c46Fgyx | 9599 | 2021-11-19T22:51:10Z | 2021-11-19T22:51:10Z | OWNER | I want a GitHub Action which I can manually activate to deploy a new version of that demo... and I want it to bake in the latest release of Datasette so I can use it to demonstrate bug fixes. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058896236 | |
https://github.com/simonw/datasette/issues/1522#issuecomment-974523297 | https://api.github.com/repos/simonw/datasette/issues/1522 | 974523297 | IC_kwDOBm6k_c46Fguh | 9599 | 2021-11-19T22:50:31Z | 2021-11-19T22:50:31Z | OWNER | Demo code is now at: https://github.com/simonw/datasette/tree/main/demos/apache-proxy | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058896236 | |
https://github.com/simonw/datasette/issues/1522#issuecomment-974521687 | https://api.github.com/repos/simonw/datasette/issues/1522 | 974521687 | IC_kwDOBm6k_c46FgVX | 9599 | 2021-11-19T22:46:26Z | 2021-11-19T22:46:26Z | OWNER | Oh weird, it started working: https://datasette-apache-proxy-demo-j7hipcg4aq-uc.a.run.app/prefix/fixtures/sortable | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058896236 | |
https://github.com/simonw/datasette/issues/1522#issuecomment-974506401 | https://api.github.com/repos/simonw/datasette/issues/1522 | 974506401 | IC_kwDOBm6k_c46Fcmh | 9599 | 2021-11-19T22:11:51Z | 2021-11-19T22:11:51Z | OWNER | This is frustrating: I have the following Dockerfile: ```dockerfile FROM python:3-alpine RUN apk add --no-cache \ apache2 \ apache2-proxy \ bash RUN pip install datasette ENV TINI_VERSION v0.18.0 ADD https://github.com/krallin/tini/releases/download/${TINI_VERSION}/tini-static /tini RUN chmod +x /tini # Append this to the end of the default httpd.conf file RUN echo $'ServerName localhost\n\ \n\ <Proxy *>\n\ Order deny,allow\n\ Allow from all\n\ </Proxy>\n\ \n\ ProxyPass /prefix/ http://localhost:8001/\n\ Header add X-Proxied-By "Apache2"' >> /etc/apache2/httpd.conf RUN echo $'<a href="/prefix/">Datasette</a>' > /var/www/localhost/htdocs/index.html WORKDIR /app ADD https://latest.datasette.io/fixtures.db /app/fixtures.db RUN echo $'#!/usr/bin/env bash\n\ set -e\n\ \n\ httpd -D FOREGROUND &\n\ datasette fixtures.db --setting base_url "/prefix/" -h 0.0.0.0 -p 8001 &\n\ \n\ wait -n' > /app/start.sh RUN chmod +x /app/start.sh EXPOSE 80 ENTRYPOINT ["/tini", "--", "/app/start.sh"] ``` It works fine when I run it locally: ``` docker build -t datasette-apache-proxy-demo . docker run -p 5000:80 datasette-apache-proxy-demo ``` But when I deploy it to Cloud Run with the following script: ```bash #!/bin/bash # https://til.simonwillison.net/cloudrun/ship-dockerfile-to-cloud-run NAME="datasette-apache-proxy-demo" PROJECT=$(gcloud config get-value project) IMAGE="gcr.io/$PROJECT/$NAME" gcloud builds submit --tag $IMAGE gcloud run deploy \ --allow-unauthenticated \ --platform=managed \ --image $IMAGE $NAME \ --port 80 ``` It serves the `/` page successfully, but hits to `/prefix/` return the following 503 error: > Service Unavailable > > The server is temporarily unable to service your request due to maintenance downtime or capacity problems. 
Please try again later. > > Apache/2.4.51 (Unix) Server at datasette-apache-proxy-demo-j7hipcg4aq-uc.a.run.app Port 80 Cloud Run logs: <img width="1347" alt="Screen Shot 2… | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058896236 | |
https://github.com/simonw/datasette/issues/1519#issuecomment-974478126 | https://api.github.com/repos/simonw/datasette/issues/1519 | 974478126 | IC_kwDOBm6k_c46FVsu | 9599 | 2021-11-19T21:16:36Z | 2021-11-19T21:16:36Z | OWNER | In the meantime I can catch these errors by changing the test to run each path twice, once with and once without the prefix. This should accurately simulate how Apache is working here. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058790545 | |
https://github.com/simonw/datasette/issues/1519#issuecomment-974477465 | https://api.github.com/repos/simonw/datasette/issues/1519 | 974477465 | IC_kwDOBm6k_c46FViZ | 9599 | 2021-11-19T21:15:30Z | 2021-11-19T21:15:30Z | OWNER | I think what's happening here is Apache is actually making a request to `/fixtures` rather than making a request to `/prefix/fixtures` - and Datasette is replying to requests on both the prefixed and the non-prefixed paths. This is pretty confusing! I think Datasette should ONLY reply to `/prefix/fixtures` instead and return a 404 for `/fixtures` - this would make things a whole lot easier to debug. But shipping that change could break existing deployments. Maybe that should be a breaking change for 1.0. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058790545 | |
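
The stricter routing proposed in the comment above could look roughly like the following. This is a minimal sketch in plain Python, not Datasette's actual router: `resolve_path` and its `strict` flag are hypothetical names introduced for illustration, and the sketch assumes `base_url` starts and ends with `/`.

```python
from typing import Optional


def resolve_path(
    request_path: str, base_url: str = "/prefix/", strict: bool = True
) -> Optional[str]:
    """Return the internal path to route, or None to signal a 404.

    Hypothetical helper for illustration; not part of Datasette.
    """
    if request_path.startswith(base_url):
        # Strip the prefix but keep a leading slash: /prefix/fixtures -> /fixtures
        return "/" + request_path[len(base_url):]
    if strict:
        # Unprefixed request: refuse it so a proxy that drops the prefix fails loudly
        return None
    # Lenient behaviour described above: answer on the unprefixed path too
    return request_path
```

With `strict=True`, a proxy that forwards `/fixtures` instead of `/prefix/fixtures` would get a 404 straight away, which is the "easier to debug" outcome the comment argues for. The lenient branch is kept to show why changing the default could break existing deployments.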
https://github.com/simonw/datasette/issues/1519#issuecomment-974450232 | https://api.github.com/repos/simonw/datasette/issues/1519 | 974450232 | IC_kwDOBm6k_c46FO44 | 9599 | 2021-11-19T20:41:53Z | 2021-11-19T20:42:19Z | OWNER | https://docs.datasette.io/en/stable/deploying.html#apache-proxy-configuration says I should use `ProxyPreserveHost on`. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058790545 | |
https://github.com/simonw/datasette/issues/1519#issuecomment-974447950 | https://api.github.com/repos/simonw/datasette/issues/1519 | 974447950 | IC_kwDOBm6k_c46FOVO | 9599 | 2021-11-19T20:40:19Z | 2021-11-19T20:40:19Z | OWNER | Figured it out! The test is not an accurate recreation of what is happening, because it doesn't simulate a request that reaches Datasette with a path of `/fixtures` after the proxy has stripped the prefix from the original `/prefix/fixtures` URL. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058790545 | |
https://github.com/simonw/datasette/issues/1522#issuecomment-974435661 | https://api.github.com/repos/simonw/datasette/issues/1522 | 974435661 | IC_kwDOBm6k_c46FLVN | 9599 | 2021-11-19T20:33:42Z | 2021-11-19T20:33:42Z | OWNER | Should just be a case of deploying this `Dockerfile`: ```Dockerfile FROM python:3-alpine RUN apk add --no-cache \ apache2 \ apache2-proxy \ bash RUN pip install datasette ENV TINI_VERSION v0.18.0 ADD https://github.com/krallin/tini/releases/download/${TINI_VERSION}/tini-static /tini RUN chmod +x /tini # Append this to the end of the default httpd.conf file RUN echo $'ServerName localhost\n\ \n\ <Proxy *>\n\ Order deny,allow\n\ Allow from all\n\ </Proxy>\n\ \n\ ProxyPass /foo/bar/ http://localhost:9000/\n\ Header add X-Proxied-By "Apache2"' >> /etc/apache2/httpd.conf RUN echo $'<a href="/foo/bar/">Datasette</a>' > /var/www/localhost/htdocs/index.html WORKDIR /app ADD https://latest.datasette.io/fixtures.db /app/fixtures.db RUN echo $'#!/usr/bin/env bash\n\ set -e\n\ \n\ httpd -D FOREGROUND &\n\ datasette fixtures.db --setting base_url "/foo/bar/" -p 9000 &\n\ \n\ wait -n' > /app/start.sh RUN chmod +x /app/start.sh EXPOSE 80 ENTRYPOINT ["/tini", "--", "/app/start.sh"] ``` I can follow this TIL: https://til.simonwillison.net/cloudrun/ship-dockerfile-to-cloud-run | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058896236 | |
https://github.com/simonw/datasette/issues/1521#issuecomment-974433520 | https://api.github.com/repos/simonw/datasette/issues/1521 | 974433520 | IC_kwDOBm6k_c46FKzw | 9599 | 2021-11-19T20:32:29Z | 2021-11-19T20:32:29Z | OWNER | This configuration works great. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058815557 | |
https://github.com/simonw/datasette/issues/1519#issuecomment-974433320 | https://api.github.com/repos/simonw/datasette/issues/1519 | 974433320 | IC_kwDOBm6k_c46FKwo | 9599 | 2021-11-19T20:32:04Z | 2021-11-19T20:32:04Z | OWNER | Still not clear why the tests pass but the live example fails. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058790545 | |
https://github.com/simonw/datasette/issues/1519#issuecomment-974433206 | https://api.github.com/repos/simonw/datasette/issues/1519 | 974433206 | IC_kwDOBm6k_c46FKu2 | 9599 | 2021-11-19T20:31:52Z | 2021-11-19T20:31:52Z | OWNER | Modified my `Dockerfile` to do this: RUN pip install https://github.com/simonw/datasette/archive/ff0dd4da38d48c2fa9250ecf336002c9ed724e36.zip And now the `request` in that debug `?_context=1` looks like this: ``` "request": "<asgi.Request method=\"GET\" url=\"http://localhost:9000/fixtures?sql=select+*+from+compound_three_primary_keys+limit+1&_context=1\">" ``` That explains the bug - that request doesn't maintain the original path prefix of `http://localhost:5000/foo/bar/fixtures?sql=` (also it's been rewritten to `localhost:9000` instead of `localhost:5000`). | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058790545 | |
https://github.com/simonw/datasette/issues/1519#issuecomment-974422829 | https://api.github.com/repos/simonw/datasette/issues/1519 | 974422829 | IC_kwDOBm6k_c46FIMt | 9599 | 2021-11-19T20:26:35Z | 2021-11-19T20:26:35Z | OWNER | In the `?_context=` debug view the request looks like this: ``` "request": "<datasette.utils.asgi.Request object at 0x7faf9fe06200>", ``` I'm going to add a `repr()` to it such that it's a bit more useful. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058790545 | |
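
A `repr()` along the lines described above might look like this. It is only a sketch: the `Request` class here is a hypothetical stand-in with two attributes, not Datasette's `datasette.utils.asgi.Request`, and the output format is assumed from the string quoted elsewhere in this thread.

```python
class Request:
    """Stand-in for an ASGI request wrapper; hypothetical, not Datasette's class."""

    def __init__(self, method: str, url: str) -> None:
        self.method = method
        self.url = url

    def __repr__(self) -> str:
        # Show the method and full URL so debug output explains itself
        return '<asgi.Request method="{}" url="{}">'.format(self.method, self.url)
```

Having the URL in the `repr()` is what makes a rewritten host or missing path prefix visible in the `?_context=1` output, instead of an opaque object address.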
https://github.com/simonw/datasette/issues/1519#issuecomment-974420619 | https://api.github.com/repos/simonw/datasette/issues/1519 | 974420619 | IC_kwDOBm6k_c46FHqL | 9599 | 2021-11-19T20:25:19Z | 2021-11-19T20:25:19Z | OWNER | The implementations of `path_with_removed_args` and `path_with_format`: https://github.com/simonw/datasette/blob/85849935292e500ab7a99f8fe0f9546e903baad3/datasette/utils/__init__.py#L228-L254 https://github.com/simonw/datasette/blob/85849935292e500ab7a99f8fe0f9546e903baad3/datasette/utils/__init__.py#L710-L729 | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058790545 | |
https://github.com/simonw/datasette/issues/1519#issuecomment-974418496 | https://api.github.com/repos/simonw/datasette/issues/1519 | 974418496 | IC_kwDOBm6k_c46FHJA | 9599 | 2021-11-19T20:24:16Z | 2021-11-19T20:24:16Z | OWNER | Here's the code that generates `edit_sql_url` correctly: https://github.com/simonw/datasette/blob/85849935292e500ab7a99f8fe0f9546e903baad3/datasette/views/database.py#L416-L420 And here's the code for `show_hide_link`: https://github.com/simonw/datasette/blob/85849935292e500ab7a99f8fe0f9546e903baad3/datasette/views/database.py#L432-L433 And for `url_csv`: https://github.com/simonw/datasette/blob/85849935292e500ab7a99f8fe0f9546e903baad3/datasette/views/base.py#L600-L602 | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058790545 | |
https://github.com/simonw/datasette/issues/1519#issuecomment-974398399 | https://api.github.com/repos/simonw/datasette/issues/1519 | 974398399 | IC_kwDOBm6k_c46FCO_ | 9599 | 2021-11-19T20:08:20Z | 2021-11-19T20:22:02Z | OWNER | The relevant test is this one: https://github.com/simonw/datasette/blob/30255055150d7bc0affc8156adc18295495020ff/tests/test_html.py#L1608-L1649 I modified that test to add `"/fixtures?sql=select+1"` as one of the tested paths, and dropped in an `assert False` to pause it in the debugger: ``` @pytest.mark.parametrize( "path", [ "/", "/fixtures", "/fixtures/compound_three_primary_keys", "/fixtures/compound_three_primary_keys/a,a,a", "/fixtures/paginated_view", "/fixtures/facetable", "/fixtures?sql=select+1", ], ) def test_base_url_config(app_client_base_url_prefix, path): client = app_client_base_url_prefix response = client.get("/prefix/" + path.lstrip("/")) soup = Soup(response.body, "html.parser") if path == "/fixtures?sql=select+1": > assert False E assert False ``` BUT... in the debugger: ``` (Pdb) print(soup) ... <p class="export-links">This data as <a href="/prefix/fixtures.json?sql=select+1">json</a>, <a href="/prefix/fixtures.testall?sql=select+1">testall</a>, <a href="/prefix/fixtures.testnone?sql=select+1">testnone</a>, <a href="/prefix/fixtures.testresponse?sql=select+1">testresponse</a>, <a href="/prefix/fixtures.csv?sql=select+1&_size=max">CSV</a></p> ``` Those all have the correct prefix! But that's not what I'm seeing in my `Dockerfile` reproduction of the issue. Something very weird is going on here. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058790545 | |
https://github.com/simonw/datasette/issues/1519#issuecomment-974405016 | https://api.github.com/repos/simonw/datasette/issues/1519 | 974405016 | IC_kwDOBm6k_c46FD2Y | 9599 | 2021-11-19T20:14:19Z | 2021-11-19T20:15:05Z | OWNER | I added `template_debug` in the Dockerfile: ``` datasette fixtures.db --setting template_debug 1 --setting base_url "/foo/bar/" -p 9000 &\n\ ``` And then hit `http://localhost:5000/foo/bar/fixtures?sql=select+*+from+compound_three_primary_keys+limit+1&_context=1` to view the template context - and it showed the bug, output edited to just show relevant keys: ```json { "edit_sql_url": "/foo/bar/fixtures?sql=select+%2A+from+compound_three_primary_keys+limit+1", "settings": { "force_https_urls": false, "template_debug": true, "trace_debug": false, "base_url": "/foo/bar/" }, "show_hide_link": "/fixtures?sql=select+%2A+from+compound_three_primary_keys+limit+1&_context=1&_hide_sql=1", "show_hide_text": "hide", "show_hide_hidden": "", "renderers": { "json": "/fixtures.json?sql=select+*+from+compound_three_primary_keys+limit+1&_context=1" }, "url_csv": "/fixtures.csv?sql=select+*+from+compound_three_primary_keys+limit+1&_context=1&_size=max", "url_csv_path": "/fixtures.csv", "base_url": "/foo/bar/" } ``` This is so strange. `edit_sql_url` and `base_url` are correct, but `show_hide_link` and `url_csv` and `renderers.json` are not. And it's _really strange_ that the bug doesn't show up in the tests. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058790545 | |
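
One reading of the context dump above is that some links are built from the incoming request path and never pass through the `base_url` setting. The sketch below shows that missing step in isolation. `apply_base_url` is a hypothetical helper, not Datasette's API, and it assumes `base_url` starts and ends with `/`.

```python
def apply_base_url(path: str, base_url: str = "/foo/bar/") -> str:
    """Prefix a root-relative path with base_url unless it is already prefixed.

    Hypothetical helper for illustration; not part of Datasette.
    """
    if path.startswith(base_url):
        # Already carries the prefix, so leave it alone to avoid doubling it
        return path
    # Drop the leading slash so the join does not produce "//"
    return base_url + path.lstrip("/")
```

Applied to the values in the dump, `url_csv_path` of `/fixtures.csv` would become `/foo/bar/fixtures.csv`, while the already-correct `edit_sql_url` would pass through unchanged.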
https://github.com/simonw/datasette/issues/1519#issuecomment-974391204 | https://api.github.com/repos/simonw/datasette/issues/1519 | 974391204 | IC_kwDOBm6k_c46FAek | 9599 | 2021-11-19T20:02:41Z | 2021-11-19T20:02:41Z | OWNER | Bug confirmed: ![proxy-bug](https://user-images.githubusercontent.com/9599/142684666-112136bf-9243-4b6e-8202-339fcfe91bcc.gif) | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058790545 | |
https://github.com/simonw/datasette/issues/1519#issuecomment-974389472 | https://api.github.com/repos/simonw/datasette/issues/1519 | 974389472 | IC_kwDOBm6k_c46FADg | 9599 | 2021-11-19T20:01:02Z | 2021-11-19T20:01:02Z | OWNER | I now have a `Dockerfile` in https://github.com/simonw/datasette/issues/1521#issuecomment-974388295 that I can use to run a local Apache 2 with `mod_proxy` to investigate this class of bugs! | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058790545 | |
https://github.com/simonw/datasette/issues/1521#issuecomment-974388295 | https://api.github.com/repos/simonw/datasette/issues/1521 | 974388295 | IC_kwDOBm6k_c46E_xH | 9599 | 2021-11-19T20:00:06Z | 2021-11-19T20:00:06Z | OWNER | And this is the version that proxies to a `base_url` of `/foo/bar/`: ```Dockerfile FROM python:3-alpine RUN apk add --no-cache \ apache2 \ apache2-proxy \ bash RUN pip install datasette ENV TINI_VERSION v0.18.0 ADD https://github.com/krallin/tini/releases/download/${TINI_VERSION}/tini-static /tini RUN chmod +x /tini # Append this to the end of the default httpd.conf file RUN echo $'ServerName localhost\n\ \n\ <Proxy *>\n\ Order deny,allow\n\ Allow from all\n\ </Proxy>\n\ \n\ ProxyPass /foo/bar/ http://localhost:9000/\n\ Header add X-Proxied-By "Apache2"' >> /etc/apache2/httpd.conf RUN echo $'<a href="/foo/bar/">Datasette</a>' > /var/www/localhost/htdocs/index.html WORKDIR /app ADD https://latest.datasette.io/fixtures.db /app/fixtures.db RUN echo $'#!/usr/bin/env bash\n\ set -e\n\ \n\ httpd -D FOREGROUND &\n\ datasette fixtures.db --setting base_url "/foo/bar/" -p 9000 &\n\ \n\ wait -n' > /app/start.sh RUN chmod +x /app/start.sh EXPOSE 80 ENTRYPOINT ["/tini", "--", "/app/start.sh"] ``` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058815557 | |
https://github.com/simonw/datasette/issues/1521#issuecomment-974380798 | https://api.github.com/repos/simonw/datasette/issues/1521 | 974380798 | IC_kwDOBm6k_c46E97- | 9599 | 2021-11-19T19:54:26Z | 2021-11-19T19:54:26Z | OWNER | Got it working! Here's a `Dockerfile` which runs completely stand-alone (thanks to using the `echo $'` trick to write out the config files it needs) and successfully serves Datasette behind Apache and `mod_proxy`: ```Dockerfile FROM python:3-alpine RUN apk add --no-cache \ apache2 \ apache2-proxy \ bash RUN pip install datasette ENV TINI_VERSION v0.18.0 ADD https://github.com/krallin/tini/releases/download/${TINI_VERSION}/tini-static /tini RUN chmod +x /tini # Append this to the end of the default httpd.conf file RUN echo $'ServerName localhost\n\ \n\ <Proxy *>\n\ Order deny,allow\n\ Allow from all\n\ </Proxy>\n\ \n\ ProxyPass / http://localhost:9000/\n\ ProxyPassReverse / http://localhost:9000/\n\ Header add X-Proxied-By "Apache2"' >> /etc/apache2/httpd.conf WORKDIR /app RUN echo $'#!/usr/bin/env bash\n\ set -e\n\ \n\ httpd -D FOREGROUND &\n\ datasette -p 9000 &\n\ \n\ wait -n' > /app/start.sh RUN chmod +x /app/start.sh EXPOSE 80 ENTRYPOINT ["/tini", "--", "/app/start.sh"] ``` Run it like this: ``` docker build -t datasette-apache2-proxy . 
docker run -p 5000:80 --rm datasette-apache2-proxy ``` Then run this to confirm: ``` ~ % curl -i 'http://localhost:5000/-/versions.json' HTTP/1.1 200 OK Date: Fri, 19 Nov 2021 19:54:05 GMT Server: uvicorn content-type: application/json; charset=utf-8 X-Proxied-By: Apache2 Transfer-Encoding: chunked {"python": {"version": "3.10.0", "full": "3.10.0 (default, Nov 13 2021, 03:23:03) [GCC 10.3.1 20210424]"}, "datasette": {"version": "0.59.2"}, "asgi": "3.0", "uvicorn": "0.15.0", "sqlite": {"version": "3.35.5", "fts_versions": ["FTS5", "FTS4", "FTS3"], "extensions": {"json1": null}, "compile_options": ["COMPILER=gcc-10.3.1 20210424", "ENABLE_COLUMN_METADATA", "ENABLE_DBSTAT_VTAB", "ENABLE_FTS3", "ENABLE_FTS3_PARENTHESIS", "ENABLE_FTS4", "ENABLE_FTS5", "ENABLE_GEOPOLY", "ENABLE_JSON1", "ENABLE_MATH_FUNCTIONS", "ENABLE_RTREE", "ENABLE_UNLOCK_NOTIFY", "MAX_VARIABLE_NUMBER=250000", "SECURE_DELETE", … | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058815557 | |
https://github.com/simonw/datasette/issues/1521#issuecomment-974371116 | https://api.github.com/repos/simonw/datasette/issues/1521 | 974371116 | IC_kwDOBm6k_c46E7ks | 9599 | 2021-11-19T19:45:47Z | 2021-11-19T19:45:47Z | OWNER | https://github.com/krallin/tini says: > *NOTE: If you are using Docker 1.13 or greater, Tini is included in Docker itself. This includes all versions of Docker CE. To enable Tini, just [pass the `--init` flag to `docker run`](https://docs.docker.com/engine/reference/commandline/run/).* | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058815557 | |
https://github.com/simonw/datasette/issues/1521#issuecomment-974336020 | https://api.github.com/repos/simonw/datasette/issues/1521 | 974336020 | IC_kwDOBm6k_c46EzAU | 9599 | 2021-11-19T19:10:48Z | 2021-11-19T19:10:48Z | OWNER | There's a promising looking minimal Apache 2 proxy config here: https://stackoverflow.com/questions/26474476/minimal-configuration-for-apache-reverse-proxy-in-docker-container | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058815557 | |
https://github.com/simonw/datasette/issues/1521#issuecomment-974334278 | https://api.github.com/repos/simonw/datasette/issues/1521 | 974334278 | IC_kwDOBm6k_c46EylG | 9599 | 2021-11-19T19:08:09Z | 2021-11-19T19:08:09Z | OWNER | Stripping comments using this Stack Exchange recipe: https://unix.stackexchange.com/a/157619 docker run -it --entrypoint sh alpine-apache2-sh \ -c "cat /etc/apache2/httpd.conf" | sed '/^[[:blank:]]*#/d;s/#.*//' Result is here: https://gist.github.com/simonw/0a05090df5fcff8e8b3334621fa17976 | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058815557 | |
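
For readers less familiar with `sed`, the expression above does two things: it deletes lines whose first non-blank character is `#`, then removes everything from a `#` to the end of each remaining line. A rough Python equivalent, as a sketch (`strip_comments` is a name made up for this example):

```python
import re


def strip_comments(text: str) -> str:
    """Mimic sed '/^[[:blank:]]*#/d;s/#.*//' on a block of text."""
    kept = []
    for line in text.splitlines():
        # /^[[:blank:]]*#/d : drop lines that contain only a comment
        if re.match(r"[ \t]*#", line):
            continue
        # s/#.*// : cut a trailing comment, leaving any whitespace before it
        kept.append(re.sub(r"#.*", "", line))
    return "\n".join(kept)
```

Like the `sed` version, this treats every `#` as a comment marker, so it would also truncate a directive whose value legitimately contains `#`.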
https://github.com/simonw/datasette/issues/1521#issuecomment-974332787 | https://api.github.com/repos/simonw/datasette/issues/1521 | 974332787 | IC_kwDOBm6k_c46EyNz | 9599 | 2021-11-19T19:05:52Z | 2021-11-19T19:05:52Z | OWNER | Made myself this Dockerfile to let me explore a bit: ```Dockerfile FROM python:3-alpine RUN apk add --no-cache \ apache2 CMD ["sh"] ``` Then: ``` % docker run alpine-apache2-sh % docker run -it alpine-apache2-sh / # ls /etc/apache2/httpd.conf /etc/apache2/httpd.conf / # cat /etc/apache2/httpd.conf # # This is the main Apache HTTP server configuration file. It contains the # configuration directives that give the server its instructions. ... ``` Copying that into a Gist like so: ``` docker run -it --entrypoint sh alpine-apache2-sh -c "cat /etc/apache2/httpd.conf" | pbcopy ``` Gist here: https://gist.github.com/simonw/5ea0db6049192cb9f761fbd6beb3a84a | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058815557 | |
https://github.com/simonw/datasette/issues/1521#issuecomment-974327812 | https://api.github.com/repos/simonw/datasette/issues/1521 | 974327812 | IC_kwDOBm6k_c46ExAE | 9599 | 2021-11-19T18:58:49Z | 2021-11-19T18:59:55Z | OWNER | From this example: https://github.com/tigelane/dockerfiles/blob/06cff2ac8cdc920ebd64f50965115eaa3d0afb84/Alpine-Apache2/Dockerfile#L25-L31 it looks like running `apk add apache2` installs a config file at `/etc/apache2/httpd.conf` - so one approach is to then modify that file. ``` # APACHE - Alpine ################# RUN apk --update add apache2 php5-apache2 && \ #apk add openrc --no-cache && \ rm -rf /var/cache/apk/* && \ sed -i 's/#ServerName www.example.com:80/ServerName localhost/' /etc/apache2/httpd.conf && \ mkdir -p /run/apache2/ # Upload our files from folder "dist". COPY dist /var/www/localhost/htdocs # Manually set up the apache environment variables ENV APACHE_RUN_USER www-data ENV APACHE_RUN_GROUP www-data ENV APACHE_LOG_DIR /var/log/apache2 ENV APACHE_LOCK_DIR /var/lock/apache2 ENV APACHE_PID_FILE /var/run/apache2.pid # Execute apache2 on run ######################## EXPOSE 80 ENTRYPOINT ["httpd"] CMD ["-D", "FOREGROUND"] ``` I think I'll create my own separate copy and modify that. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058815557 | |
https://github.com/simonw/datasette/issues/1521#issuecomment-974321391 | https://api.github.com/repos/simonw/datasette/issues/1521 | 974321391 | IC_kwDOBm6k_c46Evbv | 9599 | 2021-11-19T18:49:15Z | 2021-11-19T18:57:18Z | OWNER | This pattern looks like it can help: https://ahmet.im/blog/cloud-run-multiple-processes-easy-way/ - see example in https://github.com/ahmetb/multi-process-container-lazy-solution I got that demo working locally like this: ```bash cd /tmp git clone https://github.com/ahmetb/multi-process-container-lazy-solution cd multi-process-container-lazy-solution docker build -t multi-process-container-lazy-solution . docker run -p 5000:8080 --rm multi-process-container-lazy-solution ``` I want to use `apache2` rather than `nginx` though. I found a few relevant examples of Apache in Alpine: - https://github.com/Hacking-Lab/alpine-apache2-reverse-proxy/blob/master/Dockerfile - https://www.sentiatechblog.com/running-apache-in-a-docker-container - https://github.com/search?l=Dockerfile&q=alpine+apache2&type=code | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058815557 | |
https://github.com/simonw/datasette/issues/1521#issuecomment-974322178 | https://api.github.com/repos/simonw/datasette/issues/1521 | 974322178 | IC_kwDOBm6k_c46EvoC | 9599 | 2021-11-19T18:50:22Z | 2021-11-19T18:50:22Z | OWNER | I'll get this working on my laptop first, but then I want to get it up and running on Cloud Run - maybe with a GitHub Actions workflow in this repo that re-deploys it on manual execution. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058815557 | |
https://github.com/simonw/datasette/issues/1519#issuecomment-974310208 | https://api.github.com/repos/simonw/datasette/issues/1519 | 974310208 | IC_kwDOBm6k_c46EstA | 9599 | 2021-11-19T18:32:31Z | 2021-11-19T18:32:31Z | OWNER | Having a live demo running on Cloud Run that proxies through Apache and uses `base_url` would be incredibly useful for replicating and debugging this kind of thing. I wonder how hard it is to run Apache and `mod_proxy` in the same Docker container as Datasette? | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058790545 | |
https://github.com/simonw/datasette/issues/1519#issuecomment-974309591 | https://api.github.com/repos/simonw/datasette/issues/1519 | 974309591 | IC_kwDOBm6k_c46EsjX | 9599 | 2021-11-19T18:31:32Z | 2021-11-19T18:31:32Z | OWNER | `base_url` has been a source of so many bugs like this! I often find them quite hard to replicate, likely because I haven't made myself a good Apache `mod_proxy` testing environment yet. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058790545 | |
https://github.com/simonw/datasette/issues/1520#issuecomment-974308215 | https://api.github.com/repos/simonw/datasette/issues/1520 | 974308215 | IC_kwDOBm6k_c46EsN3 | 9599 | 2021-11-19T18:29:26Z | 2021-11-19T18:29:26Z | OWNER | The solution that jumps to mind first is that it would be neat if routes could return something that meant "actually my bad, I can't handle this after all - move to the next one in the list". A related idea: it might be useful for custom views like my one here to say "no actually call the default view for this, but give me back the response so I can modify it in some way". Kind of like Django or ASGI middleware. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058803238 | |
https://github.com/simonw/datasette/issues/1518#issuecomment-974300823 | https://api.github.com/repos/simonw/datasette/issues/1518 | 974300823 | IC_kwDOBm6k_c46EqaX | 9599 | 2021-11-19T18:18:32Z | 2021-11-19T18:18:32Z | OWNER | > This may be an argument for continuing to allow non-JSON-objects through to the HTML templates. Need to think about that a bit more. I can definitely support this using pure-JSON - I could make two versions of the row available, one that's an array of cell objects and the other that's an object mapping column names to column raw values. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058072543 | |
https://github.com/simonw/datasette/issues/1518#issuecomment-974285803 | https://api.github.com/repos/simonw/datasette/issues/1518 | 974285803 | IC_kwDOBm6k_c46Emvr | 9599 | 2021-11-19T17:56:48Z | 2021-11-19T18:14:30Z | OWNER | Very confused by this piece of code here: https://github.com/simonw/datasette/blob/1c13e1af0664a4dfb1e69714c56523279cae09e4/datasette/views/table.py#L37-L63 I added it in https://github.com/simonw/datasette/commit/754836eef043676e84626c4fd3cb993eed0d2976 - in the new world that should probably be replaced by pure JSON. Aha - this comment explains it: https://github.com/simonw/datasette/issues/521#issuecomment-505279560 > I think the trick is to redefine what a "cell_row" is. Each row is currently a list of cells: > > https://github.com/simonw/datasette/blob/6341f8cbc7833022012804dea120b838ec1f6558/datasette/views/table.py#L159-L163 > > I can redefine the row (the `cells` variable in the above example) as a thing-that-iterates-cells (hence behaving like a list) but that also supports `__getitem__` access for looking up cell values if you know the name of the column. The goal was to support neater custom templates like this: ```html+jinja {% for row in display_rows %} <h2 class="scientist">{{ row["First_Name"] }} {{ row["Last_Name"] }}</h2> ... ``` This may be an argument for continuing to allow non-JSON-objects through to the HTML templates. Need to think about that a bit more. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058072543 | |
https://github.com/simonw/datasette/issues/1518#issuecomment-974287570 | https://api.github.com/repos/simonw/datasette/issues/1518 | 974287570 | IC_kwDOBm6k_c46EnLS | 9599 | 2021-11-19T17:59:33Z | 2021-11-19T17:59:33Z | OWNER | I'm going to try leaning into the `asyncinject` mechanism a bit here. One method can execute and return the raw rows. Another can turn that into the default minimal JSON representation. Then a third can take that (or take both) and use it to inflate out the JSON that the HTML template needs, with those extras and with the rendered cells from plugins. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058072543 | |
https://github.com/simonw/datasette/pull/1495#issuecomment-974108455 | https://api.github.com/repos/simonw/datasette/issues/1495 | 974108455 | IC_kwDOBm6k_c46D7cn | 192568 | 2021-11-19T14:14:35Z | 2021-11-19T14:14:35Z | CONTRIBUTOR | A nudge on this. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1033678984 | |
https://github.com/simonw/sqlite-utils/issues/342#issuecomment-973820125 | https://api.github.com/repos/simonw/sqlite-utils/issues/342 | 973820125 | IC_kwDOCGYnMM46C1Dd | 9599 | 2021-11-19T07:25:55Z | 2021-11-19T07:25:55Z | OWNER | `alter=True` doesn't make sense to support here either, because `.lookup()` already adds missing columns: https://github.com/simonw/sqlite-utils/blob/3b8abe608796e99e4ffc5f3f4597a85e605c0e9b/sqlite_utils/db.py#L2743-L2746 | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058196641 | |
https://github.com/simonw/sqlite-utils/issues/342#issuecomment-973802998 | https://api.github.com/repos/simonw/sqlite-utils/issues/342 | 973802998 | IC_kwDOCGYnMM46Cw32 | 9599 | 2021-11-19T06:59:22Z | 2021-11-19T06:59:32Z | OWNER | I don't think I need the `DEFAULT` defaults for `.lookup()` either, since it just passes through to `.insert()`. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058196641 | |
https://github.com/simonw/sqlite-utils/issues/342#issuecomment-973802766 | https://api.github.com/repos/simonw/sqlite-utils/issues/342 | 973802766 | IC_kwDOCGYnMM46Cw0O | 9599 | 2021-11-19T06:58:45Z | 2021-11-19T06:58:45Z | OWNER | And neither does `hash_id`. On that basis I'm going to specifically list the ones that DO make sense, and hope that I remember to add any new ones in the future. I can add a code comment hint to `.insert()` about that. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058196641 | |
https://github.com/simonw/sqlite-utils/issues/342#issuecomment-973802469 | https://api.github.com/repos/simonw/sqlite-utils/issues/342 | 973802469 | IC_kwDOCGYnMM46Cwvl | 9599 | 2021-11-19T06:58:03Z | 2021-11-19T06:58:03Z | OWNER | Also: I don't think `ignore=` and `replace=` make sense in the context of `lookup()`. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058196641 | |
https://github.com/simonw/sqlite-utils/issues/342#issuecomment-973802308 | https://api.github.com/repos/simonw/sqlite-utils/issues/342 | 973802308 | IC_kwDOCGYnMM46CwtE | 9599 | 2021-11-19T06:57:37Z | 2021-11-19T06:57:37Z | OWNER | Here's the current full method signature for `.insert()`: https://github.com/simonw/sqlite-utils/blob/3b8abe608796e99e4ffc5f3f4597a85e605c0e9b/sqlite_utils/db.py#L2462-L2477 I could add a test which uses introspection (`inspect.signature(method).parameters`) to confirm that `.lookup()` has a super-set of the arguments accepted by `.insert()`. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058196641 | |
https://github.com/simonw/sqlite-utils/issues/342#issuecomment-973801650 | https://api.github.com/repos/simonw/sqlite-utils/issues/342 | 973801650 | IC_kwDOCGYnMM46Cwiy | 9599 | 2021-11-19T06:55:56Z | 2021-11-19T06:55:56Z | OWNER | `pk` needs to be an explicit argument to `.lookup()`. The rest could be `**kwargs` passed through to `.insert()`, like this hacked-together version (docstring removed for brevity): ```python def lookup( self, lookup_values: Dict[str, Any], extra_values: Optional[Dict[str, Any]] = None, pk="id", **insert_kwargs, ): assert isinstance(lookup_values, dict) if extra_values is not None: assert isinstance(extra_values, dict) combined_values = dict(lookup_values) if extra_values is not None: combined_values.update(extra_values) if self.exists(): self.add_missing_columns([combined_values]) unique_column_sets = [set(i.columns) for i in self.indexes] if set(lookup_values.keys()) not in unique_column_sets: self.create_index(lookup_values.keys(), unique=True) wheres = ["[{}] = ?".format(column) for column in lookup_values] rows = list( self.rows_where( " and ".join(wheres), [value for _, value in lookup_values.items()] ) ) try: return rows[0][pk] except IndexError: return self.insert(combined_values, pk=pk, **insert_kwargs).last_pk else: pk = self.insert(combined_values, pk=pk, **insert_kwargs).last_pk self.create_index(lookup_values.keys(), unique=True) return pk ``` I think I'll explicitly list the parameters, mainly so they can be typed and covered by automatic documentation. I do worry that I'll add more keyword arguments to `.insert()` in the future and forget to mirror them to `.lookup()` though. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058196641 | |
https://github.com/simonw/sqlite-utils/issues/342#issuecomment-973800795 | https://api.github.com/repos/simonw/sqlite-utils/issues/342 | 973800795 | IC_kwDOCGYnMM46CwVb | 9599 | 2021-11-19T06:54:08Z | 2021-11-19T06:54:08Z | OWNER | Looking at the code for `lookup()` it currently hard-codes `pk` to `"id"` - but it actually only calls `.insert()` in two places, both of which could be passed extra arguments. https://github.com/simonw/sqlite-utils/blob/3b8abe608796e99e4ffc5f3f4597a85e605c0e9b/sqlite_utils/db.py#L2756-L2763 | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058196641 | |
https://github.com/simonw/datasette/issues/1518#issuecomment-973700549 | https://api.github.com/repos/simonw/datasette/issues/1518 | 973700549 | IC_kwDOBm6k_c46CX3F | 9599 | 2021-11-19T03:31:20Z | 2021-11-19T03:31:26Z | OWNER | ... and while I'm doing all of this I can rewrite the templates to not use those cheating magical functions AND document the template context at the same time, refs: - #1510. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058072543 | |
https://github.com/simonw/datasette/issues/1518#issuecomment-973700322 | https://api.github.com/repos/simonw/datasette/issues/1518 | 973700322 | IC_kwDOBm6k_c46CXzi | 9599 | 2021-11-19T03:30:30Z | 2021-11-19T03:30:30Z | OWNER | Right now the HTML version gets to cheat - it passes through objects that are not JSON serializable, including custom functions that can then be called by Jinja. I'm interested in maybe removing this cheating - if the HTML version could only request JSON-serializable extras those could be exposed in the API as well. It would also help clean up the kind-of-nasty pattern I use in the current `BaseView` where everything returns both a bunch of JSON-serializable data AND an awaitable function that then gets to add extra things to the HTML context. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058072543 | |
https://github.com/simonw/datasette/issues/1518#issuecomment-973698917 | https://api.github.com/repos/simonw/datasette/issues/1518 | 973698917 | IC_kwDOBm6k_c46CXdl | 9599 | 2021-11-19T03:26:18Z | 2021-11-19T03:29:03Z | OWNER | A (likely incomplete) list of features on the table page: - [ ] Display table/database/instance metadata - [ ] Show count of all results - [ ] Display table of results - [ ] Special table display treatment for URLs, numbers - [ ] Allow plugins to modify table cells - [ ] Respect `?_col=` and `?_nocol=` - [ ] Show interface for filtering by columns and operations - [ ] Show search box, support executing FTS searches - [ ] Sort table by specified column - [ ] Paginate table - [ ] Show facet results - [ ] Show suggested facets - [ ] Link to available exports - [ ] Display schema for table - [ ] Maybe it should show the SQL for the query too? - [ ] Handle various non-obvious querystring options, like `?_where=` and `?_through=` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058072543 | |
https://github.com/simonw/datasette/issues/1518#issuecomment-973699424 | https://api.github.com/repos/simonw/datasette/issues/1518 | 973699424 | IC_kwDOBm6k_c46CXlg | 9599 | 2021-11-19T03:27:49Z | 2021-11-19T03:27:49Z | OWNER | My goal is to break up a lot of this functionality into separate methods. These methods can be executed in parallel by `asyncinject`, but more importantly they can be used to build a much better JSON representation, where the default representation is lighter and `?_extra=x` options can be used to execute more expensive portions and add them to the response. So the HTML version itself needs to be re-written to use those JSON extras. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058072543 | |
https://github.com/simonw/datasette/issues/1517#issuecomment-973696604 | https://api.github.com/repos/simonw/datasette/issues/1517 | 973696604 | IC_kwDOBm6k_c46CW5c | 9599 | 2021-11-19T03:20:00Z | 2021-11-19T03:20:00Z | OWNER | Confirmed - my test plugin is indeed correctly over-riding the table page. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1057996111 | |
https://github.com/simonw/datasette/issues/1518#issuecomment-973687978 | https://api.github.com/repos/simonw/datasette/issues/1518 | 973687978 | IC_kwDOBm6k_c46CUyq | 9599 | 2021-11-19T03:07:47Z | 2021-11-19T03:07:47Z | OWNER | I was wrong about that, you CAN over-ride default routes already. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058072543 | |
https://github.com/simonw/datasette/issues/1517#issuecomment-973686874 | https://api.github.com/repos/simonw/datasette/issues/1517 | 973686874 | IC_kwDOBm6k_c46CUha | 9599 | 2021-11-19T03:06:58Z | 2021-11-19T03:06:58Z | OWNER | I made a mistake: I just wrote a test that proves that plugins CAN over-ride default routes, plus if you look at the code here the plugins get to register themselves first: https://github.com/simonw/datasette/blob/0156c6b5e52d541e93f0d68e9245f20ae83bc933/datasette/app.py#L965-L981 | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1057996111 | |
https://github.com/simonw/datasette/issues/1518#issuecomment-973682389 | https://api.github.com/repos/simonw/datasette/issues/1518 | 973682389 | IC_kwDOBm6k_c46CTbV | 9599 | 2021-11-19T02:57:39Z | 2021-11-19T02:57:39Z | OWNER | Ideally I'd like to execute the existing test suite against the new implementation - that would require me to solve this so I can replace the view with the plugin version though: - #1517 | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058072543 | |
https://github.com/simonw/datasette/issues/1518#issuecomment-973681970 | https://api.github.com/repos/simonw/datasette/issues/1518 | 973681970 | IC_kwDOBm6k_c46CTUy | 9599 | 2021-11-19T02:56:31Z | 2021-11-19T02:56:53Z | OWNER | Here's where I got to with my hacked-together initial plugin prototype - it managed to render the table page with some rows on it (and a bunch of missing functionality such as filters): https://gist.github.com/simonw/281eac9c73b062c3469607ad86470eb2 <img width="899" alt="fixtures__roadside_attractions__4_rows_and__11__Liked___Twitter" src="https://user-images.githubusercontent.com/9599/142557265-6e10c808-5898-49ed-a8e3-7b5207b2872a.png"> | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058072543 | |
https://github.com/simonw/datasette/issues/878#issuecomment-973678931 | https://api.github.com/repos/simonw/datasette/issues/878 | 973678931 | IC_kwDOBm6k_c46CSlT | 9599 | 2021-11-19T02:51:17Z | 2021-11-19T02:51:17Z | OWNER | OK, I managed to get a table to render! Here's the code I used - I had to copy a LOT of stuff. https://gist.github.com/simonw/281eac9c73b062c3469607ad86470eb2 I'm going to move this work into a new, separate issue. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
648435885 | |
https://github.com/simonw/datasette/issues/878#issuecomment-973635157 | https://api.github.com/repos/simonw/datasette/issues/878 | 973635157 | IC_kwDOBm6k_c46CH5V | 9599 | 2021-11-19T01:07:08Z | 2021-11-19T01:07:08Z | OWNER | This exercise is proving so useful in getting my head around how the enormous and complex `TableView` class works again. Here's where I've got to now - I'm systematically working through the variables that are returned for HTML and for JSON copying across code to get it to work: ```python from datasette.database import QueryInterrupted from datasette.utils import escape_sqlite from datasette.utils.asgi import Response, NotFound, Forbidden from datasette.views.base import DatasetteError from datasette import hookimpl from asyncinject import AsyncInject, inject from pprint import pformat class Table(AsyncInject): @inject async def database(self, request, datasette): # TODO: all that nasty hash resolving stuff can go here db_name = request.url_vars["db_name"] try: db = datasette.databases[db_name] except KeyError: raise NotFound(f"Database '{db_name}' does not exist") return db @inject async def table_and_format(self, request, database, datasette): table_and_format = request.url_vars["table_and_format"] # TODO: be a lot smarter here if "." in table_and_format: return table_and_format.split(".", 2) else: return table_and_format, "html" @inject async def main(self, request, database, table_and_format, datasette): # TODO: if this is actually a canned query, dispatch to it table, format = table_and_format is_view = bool(await database.get_view_definition(table)) table_exists = bool(await database.table_exists(table)) if not is_view and not table_exists: raise NotFound(f"Table not found: {table}") await check_permissions( datasette, request, [ ("view-table", (database.name, table)), ("view-database", database.name), "view-instance", ], ) private =… | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
648435885 | |
https://github.com/simonw/datasette/issues/878#issuecomment-973568285 | https://api.github.com/repos/simonw/datasette/issues/878 | 973568285 | IC_kwDOBm6k_c46B3kd | 9599 | 2021-11-19T00:29:20Z | 2021-11-19T00:29:20Z | OWNER | This is working! ```python from datasette.utils.asgi import Response from datasette import hookimpl import html from asyncinject import AsyncInject, inject class Table(AsyncInject): @inject async def database(self, request): return request.url_vars["db_name"] @inject async def main(self, request, database): return Response.html("Database: {}".format( html.escape(database) )) async def view(self, request): return await self.main(request=request) @hookimpl def register_routes(): return [ (r"/t/(?P<db_name>[^/]+)/(?P<table_and_format>[^/]+?$)", Table().view), ] ``` This project will definitely show me if I actually like the `asyncinject` patterns or not. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
648435885 | |
https://github.com/simonw/datasette/issues/878#issuecomment-973564260 | https://api.github.com/repos/simonw/datasette/issues/878 | 973564260 | IC_kwDOBm6k_c46B2lk | 9599 | 2021-11-19T00:27:06Z | 2021-11-19T00:27:06Z | OWNER | Problem: the fancy `asyncinject` stuff interferes with the fancy Datasette thing that introspects view functions to look for what parameters they take: ```python class Table(asyncinject.AsyncInjectAll): async def view(self, request): return Response.html("Hello from {}".format( html.escape(repr(request.url_vars)) )) @hookimpl def register_routes(): return [ (r"/t/(?P<db_name>[^/]+)/(?P<table_and_format>[^/]+?$)", Table().view), ] ``` This failed with error: "Table.view() takes 1 positional argument but 2 were given" So I'm going to use `AsyncInject` and have the `view` function NOT use the `@inject` decorator. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
648435885 | |
https://github.com/simonw/datasette/issues/878#issuecomment-973554024 | https://api.github.com/repos/simonw/datasette/issues/878 | 973554024 | IC_kwDOBm6k_c46B0Fo | 9599 | 2021-11-19T00:21:20Z | 2021-11-19T00:21:20Z | OWNER | That's annoying: it looks like plugins can't use `register_routes()` to over-ride default routes within Datasette itself. This didn't work: ```python from datasette.utils.asgi import Response from datasette import hookimpl import html async def table(request): return Response.html("Hello from {}".format( html.escape(repr(request.url_vars)) )) @hookimpl def register_routes(): return [ (r"/(?P<db_name>[^/]+)/(?P<table_and_format>[^/]+?$)", table), ] ``` I'll use a `/t/` prefix for the moment, but this is probably something I'll fix in Datasette itself later. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
648435885 | |
https://github.com/simonw/datasette/issues/878#issuecomment-973542284 | https://api.github.com/repos/simonw/datasette/issues/878 | 973542284 | IC_kwDOBm6k_c46BxOM | 9599 | 2021-11-19T00:16:44Z | 2021-11-19T00:16:44Z | OWNER | ``` Development % cookiecutter gh:simonw/datasette-plugin You've downloaded /Users/simon/.cookiecutters/datasette-plugin before. Is it okay to delete and re-download it? [yes]: yes plugin_name []: table-new description []: New implementation of TableView, see https://github.com/simonw/datasette/issues/878 hyphenated [table-new]: underscored [table_new]: github_username []: simonw author_name []: Simon Willison include_static_directory []: include_templates_directory []: ``` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
648435885 | |
https://github.com/simonw/datasette/issues/878#issuecomment-973527870 | https://api.github.com/repos/simonw/datasette/issues/878 | 973527870 | IC_kwDOBm6k_c46Bts- | 9599 | 2021-11-19T00:13:43Z | 2021-11-19T00:13:43Z | OWNER | New plan: I'm going to build a brand new implementation of `TableView` starting out as a plugin, using the `register_routes()` plugin hook. It will reuse the existing HTML template but will be a completely new Python implementation, based on `asyncinject`. I'm going to start by just getting the table to show up on the page - then I'll add faceting, suggested facets, filters and so on. Bonus: I'm going to see if I can get it to work for arbitrary SQL queries too (stretch goal). | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
648435885 | |
https://github.com/simonw/datasette/pull/1516#issuecomment-972858458 | https://api.github.com/repos/simonw/datasette/issues/1516 | 972858458 | IC_kwDOBm6k_c45_KRa | 22429695 | 2021-11-18T13:19:01Z | 2021-11-18T13:19:01Z | NONE | # [Codecov](https://codecov.io/gh/simonw/datasette/pull/1516?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison) Report > Merging [#1516](https://codecov.io/gh/simonw/datasette/pull/1516?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison) (a82c620) into [main](https://codecov.io/gh/simonw/datasette/commit/0156c6b5e52d541e93f0d68e9245f20ae83bc933?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison) (0156c6b) will **not change** coverage. > The diff coverage is `n/a`. [![Impacted file tree graph](https://codecov.io/gh/simonw/datasette/pull/1516/graphs/tree.svg?width=650&height=150&src=pr&token=eSahVY7kw1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison)](https://codecov.io/gh/simonw/datasette/pull/1516?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison) ```diff @@ Coverage Diff @@ ## main #1516 +/- ## ======================================= Coverage 91.82% 91.82% ======================================= Files 34 34 Lines 4430 4430 ======================================= Hits 4068 4068 Misses 362 362 ``` ------ [Continue to review full report at Codecov](https://codecov.io/gh/simonw/datasette/pull/1516?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison). > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison) > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data` > Powered by [Codecov](https://codecov… | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1057340779 | |
https://github.com/simonw/datasette/pull/1514#issuecomment-972852184 | https://api.github.com/repos/simonw/datasette/issues/1514 | 972852184 | IC_kwDOBm6k_c45_IvY | 49699333 | 2021-11-18T13:11:15Z | 2021-11-18T13:11:15Z | CONTRIBUTOR | Superseded by #1516. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1056117435 | |
https://github.com/simonw/datasette/pull/1514#issuecomment-971575746 | https://api.github.com/repos/simonw/datasette/issues/1514 | 971575746 | IC_kwDOBm6k_c456RHC | 22429695 | 2021-11-17T13:18:58Z | 2021-11-17T13:18:58Z | NONE | # [Codecov](https://codecov.io/gh/simonw/datasette/pull/1514?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison) Report > Merging [#1514](https://codecov.io/gh/simonw/datasette/pull/1514?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison) (b02c35a) into [main](https://codecov.io/gh/simonw/datasette/commit/0156c6b5e52d541e93f0d68e9245f20ae83bc933?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison) (0156c6b) will **not change** coverage. > The diff coverage is `n/a`. [![Impacted file tree graph](https://codecov.io/gh/simonw/datasette/pull/1514/graphs/tree.svg?width=650&height=150&src=pr&token=eSahVY7kw1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison)](https://codecov.io/gh/simonw/datasette/pull/1514?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison) ```diff @@ Coverage Diff @@ ## main #1514 +/- ## ======================================= Coverage 91.82% 91.82% ======================================= Files 34 34 Lines 4430 4430 ======================================= Hits 4068 4068 Misses 362 362 ``` ------ [Continue to review full report at Codecov](https://codecov.io/gh/simonw/datasette/pull/1514?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison). > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison) > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data` > Powered by [Codecov](https://codecov… | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1056117435 | |
https://github.com/simonw/datasette/pull/1500#issuecomment-971568829 | https://api.github.com/repos/simonw/datasette/issues/1500 | 971568829 | IC_kwDOBm6k_c456Pa9 | 49699333 | 2021-11-17T13:13:58Z | 2021-11-17T13:13:58Z | CONTRIBUTOR | Superseded by #1514. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1041158024 | |
https://github.com/simonw/datasette/issues/878#issuecomment-971209475 | https://api.github.com/repos/simonw/datasette/issues/878 | 971209475 | IC_kwDOBm6k_c4543sD | 9599 | 2021-11-17T05:41:42Z | 2021-11-17T05:41:42Z | OWNER | I'm going to build a brand new implementation of the `TableView` class that doesn't subclass `BaseView` at all, instead using `asyncinject`. If I'm lucky that will clean up the grungiest part of the codebase. I can maybe even run the tests against old `TableView` and `TableView2` to check that they behave the same. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
648435885 | |
https://github.com/simonw/datasette/issues/878#issuecomment-971057553 | https://api.github.com/repos/simonw/datasette/issues/878 | 971057553 | IC_kwDOBm6k_c454SmR | 9599 | 2021-11-17T01:40:45Z | 2021-11-17T01:40:45Z | OWNER | I shipped that code as a new library, `asyncinject`: https://pypi.org/project/asyncinject/ - I'll open a new PR to attempt to refactor `TableView` to use it. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
648435885 | |
https://github.com/simonw/datasette/pull/1512#issuecomment-971056169 | https://api.github.com/repos/simonw/datasette/issues/1512 | 971056169 | IC_kwDOBm6k_c454SQp | 9599 | 2021-11-17T01:39:44Z | 2021-11-17T01:39:44Z | OWNER | Closing this PR because I shipped the code in it as a separate library instead. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1055402144 | |
https://github.com/simonw/datasette/pull/1512#issuecomment-971055677 | https://api.github.com/repos/simonw/datasette/issues/1512 | 971055677 | IC_kwDOBm6k_c454SI9 | 9599 | 2021-11-17T01:39:25Z | 2021-11-17T01:39:25Z | OWNER | https://github.com/simonw/asyncinject version 0.1a0 is now live on PyPI: https://pypi.org/project/asyncinject/ | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1055402144 | |
https://github.com/simonw/datasette/pull/1512#issuecomment-971010724 | https://api.github.com/repos/simonw/datasette/issues/1512 | 971010724 | IC_kwDOBm6k_c454HKk | 9599 | 2021-11-17T01:12:22Z | 2021-11-17T01:12:22Z | OWNER | I'm going to extract out the `asyncinject` stuff into a separate library. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1055402144 | |
https://github.com/simonw/datasette/pull/1512#issuecomment-970718652 | https://api.github.com/repos/simonw/datasette/issues/1512 | 970718652 | IC_kwDOBm6k_c452_28 | 22429695 | 2021-11-16T22:02:59Z | 2021-11-16T23:51:48Z | NONE | # [Codecov](https://codecov.io/gh/simonw/datasette/pull/1512?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison) Report > Merging [#1512](https://codecov.io/gh/simonw/datasette/pull/1512?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison) (8f757da) into [main](https://codecov.io/gh/simonw/datasette/commit/0156c6b5e52d541e93f0d68e9245f20ae83bc933?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison) (0156c6b) will **decrease** coverage by `2.10%`. > The diff coverage is `36.20%`. [![Impacted file tree graph](https://codecov.io/gh/simonw/datasette/pull/1512/graphs/tree.svg?width=650&height=150&src=pr&token=eSahVY7kw1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison)](https://codecov.io/gh/simonw/datasette/pull/1512?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison) ```diff @@ Coverage Diff @@ ## main #1512 +/- ## ========================================== - Coverage 91.82% 89.72% -2.11% ========================================== Files 34 36 +2 Lines 4430 4604 +174 ========================================== + Hits 4068 4131 +63 - Misses 362 473 +111 ``` | [Impacted Files](https://codecov.io/gh/simonw/datasette/pull/1512?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison) | Coverage Δ | | |---|---|---| | [datasette/utils/vendored\_graphlib.py](https://codecov.io/gh/simonw/datasette/pull/1512/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison#diff-ZGF0YXNldHRlL3V0aWxzL3ZlbmRvcmVkX2dyYXBobGliLnB5) |… | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1055402144 | |
https://github.com/simonw/datasette/pull/1512#issuecomment-970861628 | https://api.github.com/repos/simonw/datasette/issues/1512 | 970861628 | IC_kwDOBm6k_c453iw8 | 9599 | 2021-11-16T23:46:07Z | 2021-11-16T23:46:07Z | OWNER | I made the changes locally and tested them with Python 3.6 like so: ``` cd /tmp mkdir v cd v pipenv shell --python=python3.6 cd ~/Dropbox/Development/datasette pip install -e '.[test]' pytest tests/test_asyncdi.py ``` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1055402144 | |
https://github.com/simonw/datasette/pull/1512#issuecomment-970857411 | https://api.github.com/repos/simonw/datasette/issues/1512 | 970857411 | IC_kwDOBm6k_c453hvD | 9599 | 2021-11-16T23:43:21Z | 2021-11-16T23:43:21Z | OWNER | ``` E File "/home/runner/work/datasette/datasette/datasette/utils/vendored_graphlib.py", line 56 E if (result := self._node2info.get(node)) is None: E ^ E SyntaxError: invalid syntax ``` Oh no - the vendored code I use has `:=` so doesn't work on Python 3.6! Will have to backport it more thoroughly. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1055402144 | |
https://github.com/simonw/datasette/issues/1513#issuecomment-970855084 | https://api.github.com/repos/simonw/datasette/issues/1513 | 970855084 | IC_kwDOBm6k_c453hKs | 9599 | 2021-11-16T23:41:46Z | 2021-11-16T23:41:46Z | OWNER | Conclusion: using a giant convoluted CTE and UNION ALL query to attempt to calculate facets at the same time as retrieving rows is a net LOSS for performance! Very surprised to see that. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1055469073 | |
https://github.com/simonw/datasette/issues/1513#issuecomment-970853917 | https://api.github.com/repos/simonw/datasette/issues/1513 | 970853917 | IC_kwDOBm6k_c453g4d | 9599 | 2021-11-16T23:41:01Z | 2021-11-16T23:41:01Z | OWNER | One very interesting difference between the two: on the single giant query page: ```json { "request_duration_ms": 376.4317020000476, "sum_trace_duration_ms": 370.0828700000329, "num_traces": 5 } ``` And on the page that uses separate queries: ```json { "request_duration_ms": 819.012272000009, "sum_trace_duration_ms": 201.52852100000018, "num_traces": 19 } ``` The separate queries page takes 819ms total to render the page, but spends only 201ms across 19 SQL queries. The single big query takes 376ms total to render the page, spending 370ms in 5 queries. <details><summary>Those 5 queries, if you're interested</summary> ```sql select database_name, schema_version from databases PRAGMA schema_version PRAGMA schema_version explain with cte as (\r\n select rowid, date, county, state, fips, cases, deaths\r\n from ny_times_us_counties\r\n),\r\ntruncated as (\r\n select null as _facet, null as facet_name, null as facet_count, rowid, date, county, state, fips, cases, deaths\r\n from cte order by date desc limit 4\r\n),\r\nstate_facet as (\r\n select 'state' as _facet, state as facet_name, count(*) as facet_count,\r\n null, null, null, null, null, null, null\r\n from cte group by facet_name order by facet_count desc limit 3\r\n),\r\nfips_facet as (\r\n select 'fips' as _facet, fips as facet_name, count(*) as facet_count,\r\n null, null, null, null, null, null, null\r\n from cte group by facet_name order by facet_count desc limit 3\r\n),\r\ncounty_facet as (\r\n select 'county' as _facet, county as facet_name, count(*) as facet_count,\r\n null, null, null, null, null, null, null\r\n from cte group by facet_name order by facet_count desc limit 3\r\n)\r\nselect * from truncated\r\nunion all select * from state_facet\r\nunion all select * from fips_facet\r\nunion 
all select * from county_facet with cte as (\r\n select rowid, date, county, state, fips, cases, deaths\r\n from ny_times_us_counties\r\n),\r\ntruncated as (\r\n select null as _facet, null as facet_name, null as face… | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1055469073 | |
https://github.com/simonw/datasette/issues/1513#issuecomment-970845844 | https://api.github.com/repos/simonw/datasette/issues/1513 | 970845844 | IC_kwDOBm6k_c453e6U | 9599 | 2021-11-16T23:35:38Z | 2021-11-16T23:35:38Z | OWNER | I tried adding `cases > 10000` but the SQL query now takes too long - so moving this to my laptop. ``` cd /tmp wget https://covid-19.datasettes.com/covid.db datasette covid.db \ --setting facet_time_limit_ms 10000 \ --setting sql_time_limit_ms 10000 \ --setting trace_debug 1 ``` `http://127.0.0.1:8006/covid/ny_times_us_counties?_trace=1&_facet_size=3&_size=2&cases__gt=10000` shows in the traces: ```json [ { "type": "sql", "start": 12.693033525, "end": 12.694056904, "duration_ms": 1.0233789999993803, "traceback": [ " File \"/usr/local/Cellar/datasette/0.58.1/libexec/lib/python3.9/site-packages/datasette/views/base.py\", line 262, in get\n return await self.view_get(\n", " File \"/usr/local/Cellar/datasette/0.58.1/libexec/lib/python3.9/site-packages/datasette/views/base.py\", line 477, in view_get\n response_or_template_contexts = await self.data(\n", " File \"/usr/local/Cellar/datasette/0.58.1/libexec/lib/python3.9/site-packages/datasette/views/table.py\", line 705, in data\n results = await db.execute(sql, params, truncate=True, **extra_args)\n" ], "database": "covid", "sql": "select rowid, date, county, state, fips, cases, deaths from ny_times_us_counties where \"cases\" > :p0 order by rowid limit 3", "params": { "p0": 10000 } }, { "type": "sql", "start": 12.694285093, "end": 12.814936275, "duration_ms": 120.65118200000136, "traceback": [ " File \"/usr/local/Cellar/datasette/0.58.1/libexec/lib/python3.9/site-packages/datasette/views/base.py\", line 262, in get\n return await self.view_get(\n", " File \"/usr/local/Cellar/datasette/0.58.1/libexec/lib/python3.9/site-packages/datasette/views/base.py\", line 477, in view_get\n response_or_template_contexts = await self.data(\n", " File 
\"/usr/local/Cellar/datasette/0.58.1/libexec/lib/python3.9/site-packages/datasette/views/table.py\", line 723, i… | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1055469073 | |
https://github.com/simonw/datasette/issues/1513#issuecomment-970828568 | https://api.github.com/repos/simonw/datasette/issues/1513 | 970828568 | IC_kwDOBm6k_c453asY | 9599 | 2021-11-16T23:27:11Z | 2021-11-16T23:27:11Z | OWNER | One last experiment: I'm going to try running an expensive query in the CTE portion. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1055469073 | |
https://github.com/simonw/datasette/issues/1513#issuecomment-970827674 | https://api.github.com/repos/simonw/datasette/issues/1513 | 970827674 | IC_kwDOBm6k_c453aea | 9599 | 2021-11-16T23:26:58Z | 2021-11-16T23:26:58Z | OWNER | With trace. https://covid-19.datasettes.com/covid/ny_times_us_counties?_trace=1&_facet_size=3&_size=2&_trace=1 shows the following: ``` fetch rows: 0.41762600005768036 ms facet state: 284.30423800000426 ms facet county: 273.2565999999679 ms facet fips: 197.80996999998024 ms ``` = 755.78843400001ms total It didn't run a count because that's the homepage and the count is cached. So I dropped the count from the query and ran it: https://covid-19.datasettes.com/covid?sql=with+cte+as+(%0D%0A++select+rowid%2C+date%2C+county%2C+state%2C+fips%2C+cases%2C+deaths%0D%0A++from+ny_times_us_counties%0D%0A)%2C%0D%0Atruncated+as+(%0D%0A++select+null+as+_facet%2C+null+as+facet_name%2C+null+as+facet_count%2C+rowid%2C+date%2C+county%2C+state%2C+fips%2C+cases%2C+deaths%0D%0A++from+cte+order+by+date+desc+limit+4%0D%0A)%2C%0D%0Astate_facet+as+(%0D%0A++select+%27state%27+as+_facet%2C+state+as+facet_name%2C+count(*)+as+facet_count%2C%0D%0A++null%2C+null%2C+null%2C+null%2C+null%2C+null%2C+null%0D%0A++from+cte+group+by+facet_name+order+by+facet_count+desc+limit+3%0D%0A)%2C%0D%0Afips_facet+as+(%0D%0A++select+%27fips%27+as+_facet%2C+fips+as+facet_name%2C+count(*)+as+facet_count%2C%0D%0A++null%2C+null%2C+null%2C+null%2C+null%2C+null%2C+null%0D%0A++from+cte+group+by+facet_name+order+by+facet_count+desc+limit+3%0D%0A)%2C%0D%0Acounty_facet+as+(%0D%0A++select+%27county%27+as+_facet%2C+county+as+facet_name%2C+count(*)+as+facet_count%2C%0D%0A++null%2C+null%2C+null%2C+null%2C+null%2C+null%2C+null%0D%0A++from+cte+group+by+facet_name+order+by+facet_count+desc+limit+3%0D%0A)%0D%0Aselect+*+from+truncated%0D%0Aunion+all+select+*+from+state_facet%0D%0Aunion+all+select+*+from+fips_facet%0D%0Aunion+all+select+*+from+county_facet&_trace=1 Shows 649.4359889999259 ms for the query - 
compared to 755.78843400001ms for the separate queries. So it saved about 100ms. Still not a huge difference though! | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1055469073 | |
https://github.com/simonw/datasette/issues/1513#issuecomment-970780866 | https://api.github.com/repos/simonw/datasette/issues/1513 | 970780866 | IC_kwDOBm6k_c453PDC | 9599 | 2021-11-16T23:01:57Z | 2021-11-16T23:01:57Z | OWNER | One disadvantage to this approach: if you have a SQL time limit of 1s and it takes 0.9s to return the rows but then 0.5s to calculate each of the requested facets the entire query will exceed the time limit. Could work around this by catching that error and then re-running the query just for the rows, but that would result in the user having to wait longer for the results. Could try to remember if that has happened using an in-memory Python data structure and skip the faceting optimization if it's caused problems in the past? That seems a bit gross. Maybe this becomes an opt-in optimization you can request in your `metadata.json` setting for that table, which massively increases the time limit? That's a bit weird too - now there are two separate implementations of the faceting logic, which had better have a REALLY big pay-off to be worth maintaining. What if we kept the query that returns the rows to be displayed on the page separate from the facets, but then executed all of the facets together using this method such that the `cte` only (presumably) has to be calculated once? That would still lead to multiple facets potentially exceeding the SQL time limit when single facets would not have. Maybe a better optimization would be to move facets to happening via `fetch()` calls from the client, so the user gets to see their rows instantly and the facets then appear as and when they are available (though it would cause page jank). | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1055469073 | |
https://github.com/simonw/datasette/issues/1513#issuecomment-970766486 | https://api.github.com/repos/simonw/datasette/issues/1513 | 970766486 | IC_kwDOBm6k_c453LiW | 9599 | 2021-11-16T22:52:56Z | 2021-11-16T22:56:07Z | OWNER | https://covid-19.datasettes.com/covid is 805.2MB https://covid-19.datasettes.com/covid/ny_times_us_counties?_trace=1&_facet_size=3&_size=2 Equivalent SQL: https://covid-19.datasettes.com/covid?sql=with+cte+as+%28%0D%0A++select+rowid%2C+date%2C+county%2C+state%2C+fips%2C+cases%2C+deaths%0D%0A++from+ny_times_us_counties%0D%0A%29%2C%0D%0Atruncated+as+%28%0D%0A++select+null+as+_facet%2C+null+as+facet_name%2C+null+as+facet_count%2C+rowid%2C+date%2C+county%2C+state%2C+fips%2C+cases%2C+deaths%0D%0A++from+cte+order+by+date+desc+limit+4%0D%0A%29%2C%0D%0Astate_facet+as+%28%0D%0A++select+%27state%27+as+_facet%2C+state+as+facet_name%2C+count%28*%29+as+facet_count%2C%0D%0A++null%2C+null%2C+null%2C+null%2C+null%2C+null%2C+null%0D%0A++from+cte+group+by+facet_name+order+by+facet_count+desc+limit+3%0D%0A%29%2C%0D%0Afips_facet+as+%28%0D%0A++select+%27fips%27+as+_facet%2C+fips+as+facet_name%2C+count%28*%29+as+facet_count%2C%0D%0A++null%2C+null%2C+null%2C+null%2C+null%2C+null%2C+null%0D%0A++from+cte+group+by+facet_name+order+by+facet_count+desc+limit+3%0D%0A%29%2C%0D%0Acounty_facet+as+%28%0D%0A++select+%27county%27+as+_facet%2C+county+as+facet_name%2C+count%28*%29+as+facet_count%2C%0D%0A++null%2C+null%2C+null%2C+null%2C+null%2C+null%2C+null%0D%0A++from+cte+group+by+facet_name+order+by+facet_count+desc+limit+3%0D%0A%29%2C%0D%0Atotal_count+as+%28%0D%0A++select+%27COUNT%27+as+_facet%2C+%27%27+as+facet_name%2C+count%28*%29+as+facet_count%2C%0D%0A++null%2C+null%2C+null%2C+null%2C+null%2C+null%2C+null%0D%0A++from+cte%0D%0A%29%0D%0Aselect+*+from+truncated%0D%0Aunion+all+select+*+from+state_facet%0D%0Aunion+all+select+*+from+fips_facet%0D%0Aunion+all+select+*+from+county_facet%0D%0Aunion+all+select+*+from+total_count ```sql with cte as ( select rowid, date, county, 
state, fips, cases, deaths from ny_times_us_counties ), truncated as ( select null as _facet, null as facet_name, null as facet_count, rowid, date, county, state, fips, cases, deaths from cte order by date desc limit 4 ), state_facet as ( select 's… | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1055469073 | |
https://github.com/simonw/datasette/issues/1513#issuecomment-970770304 | https://api.github.com/repos/simonw/datasette/issues/1513 | 970770304 | IC_kwDOBm6k_c453MeA | 9599 | 2021-11-16T22:55:19Z | 2021-11-16T22:55:19Z | OWNER | (One thing I really like about this pattern is that it should work exactly the same when used to facet the results of arbitrary SQL queries as it does when faceting results from the table page.) | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1055469073 | |
https://github.com/simonw/datasette/issues/1513#issuecomment-970767952 | https://api.github.com/repos/simonw/datasette/issues/1513 | 970767952 | IC_kwDOBm6k_c453L5Q | 9599 | 2021-11-16T22:53:52Z | 2021-11-16T22:53:52Z | OWNER | It's going to take another 15 minutes for the build to finish and deploy the version with `_trace=1`: https://github.com/simonw/covid-19-datasette/actions/runs/1469150112 | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1055469073 | |
https://github.com/simonw/datasette/issues/1513#issuecomment-970758179 | https://api.github.com/repos/simonw/datasette/issues/1513 | 970758179 | IC_kwDOBm6k_c453Jgj | 9599 | 2021-11-16T22:47:38Z | 2021-11-16T22:47:38Z | OWNER | Trace now enabled: https://global-power-plants.datasettes.com/global-power-plants/global-power-plants?_facet_size=3&_size=2&_nocount=1&_trace=1 Here are the relevant traces: ```json [ { "type": "sql", "start": 31.214430154, "end": 31.214817089, "duration_ms": 0.3869350000016425, "traceback": [ " File \"/usr/local/lib/python3.8/site-packages/datasette/views/base.py\", line 262, in get\n return await self.view_get(\n", " File \"/usr/local/lib/python3.8/site-packages/datasette/views/base.py\", line 477, in view_get\n response_or_template_contexts = await self.data(\n", " File \"/usr/local/lib/python3.8/site-packages/datasette/views/table.py\", line 705, in data\n results = await db.execute(sql, params, truncate=True, **extra_args)\n" ], "database": "global-power-plants", "sql": "select rowid, country, country_long, name, gppd_idnr, capacity_mw, latitude, longitude, primary_fuel, other_fuel1, other_fuel2, other_fuel3, commissioning_year, owner, source, url, geolocation_source, wepp_id, year_of_capacity_data, generation_gwh_2013, generation_gwh_2014, generation_gwh_2015, generation_gwh_2016, generation_gwh_2017, generation_data_source, estimated_generation_gwh from [global-power-plants] order by rowid limit 3", "params": {} }, { "type": "sql", "start": 31.215234586, "end": 31.220110342, "duration_ms": 4.875756000000564, "traceback": [ " File \"/usr/local/lib/python3.8/site-packages/datasette/views/table.py\", line 760, in data\n ) = await facet.facet_results()\n", " File \"/usr/local/lib/python3.8/site-packages/datasette/facets.py\", line 212, in facet_results\n facet_rows_results = await self.ds.execute(\n", " File \"/usr/local/lib/python3.8/site-packages/datasette/app.py\", line 634, in execute\n return await self.databases[db_name].execute(\n" 
], "database": "global-power-plants", "sql": "select countr… | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1055469073 | |
https://github.com/simonw/datasette/issues/1513#issuecomment-970742415 | https://api.github.com/repos/simonw/datasette/issues/1513 | 970742415 | IC_kwDOBm6k_c453FqP | 9599 | 2021-11-16T22:37:14Z | 2021-11-16T22:37:14Z | OWNER | The query takes 42.794ms to run. Here's the equivalent page using separate queries: https://global-power-plants.datasettes.com/global-power-plants/global-power-plants?_facet_size=3&_size=2&_nocount=1 Annoyingly I can't disable facet suggestions but keep facets. I'm going to turn on tracing so I can see how long the separate queries took. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1055469073 | |
https://github.com/simonw/datasette/issues/1513#issuecomment-970738130 | https://api.github.com/repos/simonw/datasette/issues/1513 | 970738130 | IC_kwDOBm6k_c453EnS | 9599 | 2021-11-16T22:32:19Z | 2021-11-16T22:32:19Z | OWNER | I came up with the following query which seems to work! ```sql with cte as ( select rowid, country, country_long, name, owner, primary_fuel from [global-power-plants] ), truncated as ( select null as _facet, null as facet_name, null as facet_count, rowid, country, country_long, name, owner, primary_fuel from cte order by rowid limit 4 ), country_long_facet as ( select 'country_long' as _facet, country_long as facet_name, count(*) as facet_count, null, null, null, null, null, null from cte group by facet_name order by facet_count desc limit 3 ), owner_facet as ( select 'owner' as _facet, owner as facet_name, count(*) as facet_count, null, null, null, null, null, null from cte group by facet_name order by facet_count desc limit 3 ), primary_fuel_facet as ( select 'primary_fuel' as _facet, primary_fuel as facet_name, count(*) as facet_count, null, null, null, null, null, null from cte group by facet_name order by facet_count desc limit 3 ) select * from truncated union all select * from country_long_facet union all select * from owner_facet union all select * from primary_fuel_facet ``` (Limits should be 101, 31, 31, 31 but I reduced size to get a shorter example table). 
Results [look like this](https://global-power-plants.datasettes.com/global-power-plants?sql=with+cte+as+%28%0D%0A++select+rowid%2C+country%2C+country_long%2C+name%2C+owner%2C+primary_fuel%0D%0A++from+%5Bglobal-power-plants%5D%0D%0A%29%2C%0D%0Atruncated+as+%28%0D%0A++select+null+as+_facet%2C+null+as+facet_name%2C+null+as+facet_count%2C+rowid%2C+country%2C+country_long%2C+name%2C+owner%2C+primary_fuel%0D%0A++from+cte+order+by+rowid+limit+4%0D%0A%29%2C%0D%0Acountry_long_facet+as+%28%0D%0A++select+%27country_long%27+as+_facet%2C+country_long+as+facet_name%2C+count%28*%29+as+facet_count%2C%0D%0A++null%2C+null%2C+null%2C+null%2C+null%2C+null%0D%0A++from+cte+group+by+facet_name+order+by+facet_count+desc+limit+3%0D%0A%29%2C%0D%0Aowner_facet+as+%28%0D%0A++select+%27owner%27+as+_facet%2C+owner+as+fa… | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1055469073 | |
https://github.com/simonw/datasette/pull/1512#issuecomment-970718337 | https://api.github.com/repos/simonw/datasette/issues/1512 | 970718337 | IC_kwDOBm6k_c452_yB | 9599 | 2021-11-16T22:02:30Z | 2021-11-16T22:02:30Z | OWNER | I've decided to make the clever `asyncio` dependency injection opt-in - so you can either decorate with `@inject` or you can set `inject_all = True` on the class - for example: ```python import asyncio from datasette.utils.asyncdi import AsyncBase, inject class Simple(AsyncBase): def __init__(self): self.log = [] @inject async def two(self): self.log.append("two") @inject async def one(self, two): self.log.append("one") return self.log async def not_inject(self, one, two): return one + two class Complex(AsyncBase): inject_all = True def __init__(self): self.log = [] async def b(self): self.log.append("b") async def a(self, b): self.log.append("a") async def go(self, a): self.log.append("go") return self.log ``` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1055402144 | |
https://github.com/simonw/datasette/issues/878#issuecomment-970712713 | https://api.github.com/repos/simonw/datasette/issues/878 | 970712713 | IC_kwDOBm6k_c452-aJ | 9599 | 2021-11-16T21:54:33Z | 2021-11-16T21:54:33Z | OWNER | I'm going to continue working on this in a PR. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
648435885 | |
https://github.com/simonw/datasette/issues/878#issuecomment-970705738 | https://api.github.com/repos/simonw/datasette/issues/878 | 970705738 | IC_kwDOBm6k_c4528tK | 9599 | 2021-11-16T21:44:31Z | 2021-11-16T21:44:31Z | OWNER | Wrote a TIL about what I learned using `TopologicalSorter`: https://til.simonwillison.net/python/graphlib-topologicalsorter | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
648435885 | |
https://github.com/simonw/datasette/issues/878#issuecomment-970673085 | https://api.github.com/repos/simonw/datasette/issues/878 | 970673085 | IC_kwDOBm6k_c4520u9 | 9599 | 2021-11-16T20:58:24Z | 2021-11-16T20:58:24Z | OWNER | New test: ```python class Complex(AsyncBase): def __init__(self): self.log = [] async def d(self): await asyncio.sleep(random() * 0.1) print("LOG: d") self.log.append("d") async def c(self): await asyncio.sleep(random() * 0.1) print("LOG: c") self.log.append("c") async def b(self, c, d): print("LOG: b") self.log.append("b") async def a(self, b, c): print("LOG: a") self.log.append("a") async def go(self, a): print("LOG: go") self.log.append("go") return self.log @pytest.mark.asyncio async def test_complex(): result = await Complex().go() # 'c' should only be called once assert tuple(result) in ( # c and d could happen in either order ("c", "d", "b", "a", "go"), ("d", "c", "b", "a", "go"), ) ``` And this code passes it: ```python import asyncio from functools import wraps import inspect try: import graphlib except ImportError: from . import vendored_graphlib as graphlib class AsyncMeta(type): def __new__(cls, name, bases, attrs): # Decorate any items that are 'async def' methods _registry = {} new_attrs = {"_registry": _registry} for key, value in attrs.items(): if inspect.iscoroutinefunction(value) and not value.__name__ == "resolve": new_attrs[key] = make_method(value) _registry[key] = new_attrs[key] else: new_attrs[key] = value # Gather graph for later dependency resolution graph = { key: { p for p in inspect.signature(method).parameters.keys() if p != "self" and not p.startswith("_") } for key, method in _registry.items() } new_attrs["_graph"] = graph return super().__new__(cls, name, bases, new_attrs) def make_met… | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
648435885 | |
https://github.com/simonw/datasette/issues/878#issuecomment-970660299 | https://api.github.com/repos/simonw/datasette/issues/878 | 970660299 | IC_kwDOBm6k_c452xnL | 9599 | 2021-11-16T20:39:43Z | 2021-11-16T20:42:27Z | OWNER | But that does seem to be the plan that `TopologicalSorter` provides: ```python graph = {"go": {"a"}, "a": {"b", "c"}, "b": {"c", "d"}} ts = TopologicalSorter(graph) ts.prepare() while ts.is_active(): nodes = ts.get_ready() print(nodes) ts.done(*nodes) ``` Outputs: ``` ('c', 'd') ('b',) ('a',) ('go',) ``` Also: ```python graph = {"go": {"d", "e", "f"}, "d": {"b", "c"}, "b": {"c"}} ts = TopologicalSorter(graph) ts.prepare() while ts.is_active(): nodes = ts.get_ready() print(nodes) ts.done(*nodes) ``` Gives: ``` ('e', 'f', 'c') ('b',) ('d',) ('go',) ``` I'm confident that `TopologicalSorter` is the way to do this. I think I need to rewrite my code to call it once to get that plan, then `await asyncio.gather(*nodes)` in turn to execute it. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
648435885 | |
https://github.com/simonw/datasette/issues/878#issuecomment-970657874 | https://api.github.com/repos/simonw/datasette/issues/878 | 970657874 | IC_kwDOBm6k_c452xBS | 9599 | 2021-11-16T20:36:01Z | 2021-11-16T20:36:01Z | OWNER | My goal here is to calculate the most efficient way to resolve the different nodes, running them in parallel where possible. So for this class: ```python class Complex(AsyncBase): async def d(self): pass async def c(self): pass async def b(self, c, d): pass async def a(self, b, c): pass async def go(self, a): pass ``` A call to `go()` should do this: - `c` and `d` in parallel - `b` - `a` - `go` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
648435885 | |
https://github.com/simonw/datasette/issues/878#issuecomment-970655927 | https://api.github.com/repos/simonw/datasette/issues/878 | 970655927 | IC_kwDOBm6k_c452wi3 | 9599 | 2021-11-16T20:33:11Z | 2021-11-16T20:33:11Z | OWNER | What should be happening here instead is it should resolve the full graph and notice that `c` is depended on by both `b` and `a` - so it should run `c` first, then run the next ones in parallel. So maybe the algorithm I'm inheriting from https://docs.python.org/3/library/graphlib.html isn't the correct algorithm? | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
648435885 | |
https://github.com/simonw/datasette/issues/878#issuecomment-970655304 | https://api.github.com/repos/simonw/datasette/issues/878 | 970655304 | IC_kwDOBm6k_c452wZI | 9599 | 2021-11-16T20:32:16Z | 2021-11-16T20:32:16Z | OWNER | This code is really fiddly. I just got to this version: ```python import asyncio from functools import wraps import inspect try: import graphlib except ImportError: from . import vendored_graphlib as graphlib class AsyncMeta(type): def __new__(cls, name, bases, attrs): # Decorate any items that are 'async def' methods _registry = {} new_attrs = {"_registry": _registry} for key, value in attrs.items(): if inspect.iscoroutinefunction(value) and not value.__name__ == "resolve": new_attrs[key] = make_method(value) _registry[key] = new_attrs[key] else: new_attrs[key] = value # Gather graph for later dependency resolution graph = { key: { p for p in inspect.signature(method).parameters.keys() if p != "self" and not p.startswith("_") } for key, method in _registry.items() } new_attrs["_graph"] = graph return super().__new__(cls, name, bases, new_attrs) def make_method(method): @wraps(method) async def inner(self, _results=None, **kwargs): print("inner - _results=", _results) parameters = inspect.signature(method).parameters.keys() # Any parameters not provided by kwargs are resolved from registry to_resolve = [p for p in parameters if p not in kwargs and p != "self"] missing = [p for p in to_resolve if p not in self._registry] assert ( not missing ), "The following DI parameters could not be found in the registry: {}".format( missing ) results = {} results.update(kwargs) if to_resolve: resolved_parameters = await self.resolve(to_resolve, _results) results.update(resolved_parameters) return_value = await method(self, **results) if _results is not None: _res… | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
648435885 | |
https://github.com/simonw/datasette/issues/878#issuecomment-970624197 | https://api.github.com/repos/simonw/datasette/issues/878 | 970624197 | IC_kwDOBm6k_c452ozF | 9599 | 2021-11-16T19:49:05Z | 2021-11-16T19:49:05Z | OWNER | Here's the latest version of my weird dependency injection async class: ```python import inspect class AsyncMeta(type): def __new__(cls, name, bases, attrs): # Decorate any items that are 'async def' methods _registry = {} new_attrs = {"_registry": _registry} for key, value in attrs.items(): if inspect.iscoroutinefunction(value) and not value.__name__ == "resolve": new_attrs[key] = make_method(value) _registry[key] = new_attrs[key] else: new_attrs[key] = value # Topological sort of _registry by parameter dependencies graph = { key: { p for p in inspect.signature(method).parameters.keys() if p != "self" and not p.startswith("_") } for key, method in _registry.items() } new_attrs["_graph"] = graph return super().__new__(cls, name, bases, new_attrs) def make_method(method): @wraps(method) async def inner(self, **kwargs): parameters = inspect.signature(method).parameters.keys() # Any parameters not provided by kwargs are resolved from registry to_resolve = [p for p in parameters if p not in kwargs and p != "self"] missing = [p for p in to_resolve if p not in self._registry] assert ( not missing ), "The following DI parameters could not be found in the registry: {}".format( missing ) results = {} results.update(kwargs) results.update(await self.resolve(to_resolve)) return await method(self, **results) return inner bad = [0] class AsyncBase(metaclass=AsyncMeta): async def resolve(self, names): print(" resolve({})".format(names)) results = {} # Resolve them in the correct order ts = TopologicalSorter() ts2 = TopologicalSorter() print(" names = ", names) print(" s… | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
648435885 | |
https://github.com/simonw/datasette/issues/782#issuecomment-970554697 | https://api.github.com/repos/simonw/datasette/issues/782 | 970554697 | IC_kwDOBm6k_c452X1J | 9599 | 2021-11-16T18:32:03Z | 2021-11-16T18:32:03Z | OWNER | I'm going to take another look at this: - https://github.com/simonw/datasette/issues/878 | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
627794879 | |
https://github.com/simonw/datasette/issues/782#issuecomment-970553780 | https://api.github.com/repos/simonw/datasette/issues/782 | 970553780 | IC_kwDOBm6k_c452Xm0 | 9599 | 2021-11-16T18:30:51Z | 2021-11-16T18:30:58Z | OWNER | OK, I'm ready to start working on this today. I'm going to go with a default representation that looks like this: ```json { "rows": [ {"id": 1, "name": "One"}, {"id": 2, "name": "Two"} ], "next_url": null } ``` Note that there's no `count` - all it provides is the current selection of results and an indication as to how the next can be retrieved (`null` if there are no more results). I'll implement `?_extra=` to provide everything else. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
627794879 |