issue_comments


60 rows where "updated_at" is on date 2021-11-19 sorted by updated_at descending


issue 9

  • base_url is omitted in JSON and CSV views 15
  • Complete refactor of TableView and table.html template 10
  • Docker configuration for exercising Datasette behind Apache mod_proxy 10
  • New pattern for views that return either JSON or HTML, available for plugins 7
  • Extra options to `lookup()` which get passed to `insert()` 7
  • Deploy a live instance of demos/apache-proxy 7
  • Let `register_routes()` over-ride default routes within Datasette 2
  • Allow routes to have extra options 1
  • Pattern for avoiding accidental URL over-rides 1

user 2

  • simonw 59
  • mroswell 1

author_association 2

  • OWNER 59
  • CONTRIBUTOR 1
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions issue performed_via_github_app
974542348 https://github.com/simonw/datasette/issues/1522#issuecomment-974542348 https://api.github.com/repos/simonw/datasette/issues/1522 IC_kwDOBm6k_c46FlYM simonw 9599 2021-11-19T23:41:47Z 2021-11-19T23:44:07Z OWNER

Do I have to use cloudbuild.yml to specify these? https://stackoverflow.com/a/58327340/6083 and https://stackoverflow.com/a/66232670/6083 suggest I do.
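Along the lines those Stack Overflow answers suggest, a `cloudbuild.yaml` sketch (untested here; `_DATASETTE_REF` is a user-defined substitution, which Cloud Build requires to start with an underscore):

```yaml
steps:
- name: gcr.io/cloud-builders/docker
  args:
  - build
  - --build-arg
  - DATASETTE_REF=${_DATASETTE_REF}
  - -t
  - gcr.io/$PROJECT_ID/datasette-apache-proxy-demo
  - .
images:
- gcr.io/$PROJECT_ID/datasette-apache-proxy-demo
substitutions:
  _DATASETTE_REF: main
```

Then something like `gcloud builds submit --config cloudbuild.yaml --substitutions _DATASETTE_REF=<ref>` should pass the ref through to the Dockerfile's `ARG`.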

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Deploy a live instance of demos/apache-proxy 1058896236  
974541971 https://github.com/simonw/datasette/issues/1522#issuecomment-974541971 https://api.github.com/repos/simonw/datasette/issues/1522 IC_kwDOBm6k_c46FlST simonw 9599 2021-11-19T23:40:32Z 2021-11-19T23:40:32Z OWNER

I want to be able to use build arguments to specify which commit version or branch of Datasette to deploy.

This is proving hard to work out. I have this in my Dockerfile now:

```
ARG DATASETTE_REF

RUN pip install https://github.com/simonw/datasette/archive/${DATASETTE_REF}.zip
```

Which works locally:

docker build -t datasette-apache-proxy-demo . \
  --build-arg DATASETTE_REF=c617e1769ea27e045b0f2907ef49a9a1244e577d

But I can't figure out the right incantation to pass to `gcloud builds submit`.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Deploy a live instance of demos/apache-proxy 1058896236  
974523569 https://github.com/simonw/datasette/issues/1522#issuecomment-974523569 https://api.github.com/repos/simonw/datasette/issues/1522 IC_kwDOBm6k_c46Fgyx simonw 9599 2021-11-19T22:51:10Z 2021-11-19T22:51:10Z OWNER

I want a GitHub Action which I can manually activate to deploy a new version of that demo... and I want it to bake in the latest release of Datasette so I can use it to demonstrate bug fixes.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Deploy a live instance of demos/apache-proxy 1058896236  
974523297 https://github.com/simonw/datasette/issues/1522#issuecomment-974523297 https://api.github.com/repos/simonw/datasette/issues/1522 IC_kwDOBm6k_c46Fguh simonw 9599 2021-11-19T22:50:31Z 2021-11-19T22:50:31Z OWNER

Demo code is now at: https://github.com/simonw/datasette/tree/main/demos/apache-proxy

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Deploy a live instance of demos/apache-proxy 1058896236  
974521687 https://github.com/simonw/datasette/issues/1522#issuecomment-974521687 https://api.github.com/repos/simonw/datasette/issues/1522 IC_kwDOBm6k_c46FgVX simonw 9599 2021-11-19T22:46:26Z 2021-11-19T22:46:26Z OWNER

Oh weird, it started working: https://datasette-apache-proxy-demo-j7hipcg4aq-uc.a.run.app/prefix/fixtures/sortable

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Deploy a live instance of demos/apache-proxy 1058896236  
974506401 https://github.com/simonw/datasette/issues/1522#issuecomment-974506401 https://api.github.com/repos/simonw/datasette/issues/1522 IC_kwDOBm6k_c46Fcmh simonw 9599 2021-11-19T22:11:51Z 2021-11-19T22:11:51Z OWNER

This is frustrating: I have the following Dockerfile:

```dockerfile
FROM python:3-alpine

RUN apk add --no-cache \
    apache2 \
    apache2-proxy \
    bash

RUN pip install datasette

ENV TINI_VERSION v0.18.0
ADD https://github.com/krallin/tini/releases/download/${TINI_VERSION}/tini-static /tini
RUN chmod +x /tini

# Append this to the end of the default httpd.conf file
RUN echo $'ServerName localhost\n\
\n\
<Proxy *>\n\
    Order deny,allow\n\
    Allow from all\n\
</Proxy>\n\
\n\
ProxyPass /prefix/ http://localhost:8001/\n\
Header add X-Proxied-By "Apache2"' >> /etc/apache2/httpd.conf

RUN echo $'Datasette' > /var/www/localhost/htdocs/index.html

WORKDIR /app

ADD https://latest.datasette.io/fixtures.db /app/fixtures.db

RUN echo $'#!/usr/bin/env bash\n\
set -e\n\
\n\
httpd -D FOREGROUND &\n\
datasette fixtures.db --setting base_url "/prefix/" -h 0.0.0.0 -p 8001 &\n\
\n\
wait -n' > /app/start.sh

RUN chmod +x /app/start.sh

EXPOSE 80
ENTRYPOINT ["/tini", "--", "/app/start.sh"]
```

It works fine when I run it locally:

```
docker build -t datasette-apache-proxy-demo .
docker run -p 5000:80 datasette-apache-proxy-demo
```

But when I deploy it to Cloud Run with the following script:

```bash
#!/bin/bash
# https://til.simonwillison.net/cloudrun/ship-dockerfile-to-cloud-run

NAME="datasette-apache-proxy-demo"
PROJECT=$(gcloud config get-value project)
IMAGE="gcr.io/$PROJECT/$NAME"

gcloud builds submit --tag $IMAGE
gcloud run deploy \
    --allow-unauthenticated \
    --platform=managed \
    --image $IMAGE $NAME \
    --port 80
```

It serves the `/` page successfully, but hits to `/prefix/` return the following 503 error:

Service Unavailable

The server is temporarily unable to service your request due to maintenance downtime or capacity problems. Please try again later.

Apache/2.4.51 (Unix) Server at datasette-apache-proxy-demo-j7hipcg4aq-uc.a.run.app Port 80

Cloud Run logs:

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Deploy a live instance of demos/apache-proxy 1058896236  
974478126 https://github.com/simonw/datasette/issues/1519#issuecomment-974478126 https://api.github.com/repos/simonw/datasette/issues/1519 IC_kwDOBm6k_c46FVsu simonw 9599 2021-11-19T21:16:36Z 2021-11-19T21:16:36Z OWNER

In the meantime I can catch these errors by changing the test to run each path twice, once with and once without the prefix. This should accurately simulate how Apache is working here.
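That "each path twice" idea could be sketched like this (an illustrative helper for building the parametrize list, not the actual code in `tests/test_html.py`):

```python
import itertools

# Hypothetical sketch: pair every test path with both the proxy prefix
# and the empty prefix, so each URL is requested once with /prefix/ and
# once without - simulating how Apache forwards both variants.
PATHS = ["/", "/fixtures", "/fixtures/facetable"]
PREFIXES = ["/prefix", ""]

def prefixed_paths(paths=PATHS, prefixes=PREFIXES):
    # Produce every (prefix, full_url) combination, ready to feed
    # into @pytest.mark.parametrize("prefix,url", prefixed_paths())
    return [
        (prefix, prefix + path)
        for path, prefix in itertools.product(paths, prefixes)
    ]
```

Each test case can then assert that links in the response body start with the prefix the request was made under.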

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
base_url is omitted in JSON and CSV views 1058790545  
974477465 https://github.com/simonw/datasette/issues/1519#issuecomment-974477465 https://api.github.com/repos/simonw/datasette/issues/1519 IC_kwDOBm6k_c46FViZ simonw 9599 2021-11-19T21:15:30Z 2021-11-19T21:15:30Z OWNER

I think what's happening here is Apache is actually making a request to /fixtures rather than making a request to /prefix/fixtures - and Datasette is replying to requests on both the prefixed and the non-prefixed paths.

This is pretty confusing! I think Datasette should ONLY reply to /prefix/fixtures instead and return a 404 for /fixtures - this would make things a whole lot easier to debug.

But shipping that change could break existing deployments. Maybe that should be a breaking change for 1.0.
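The "only reply under the prefix" idea could be sketched as an ASGI wrapper (illustrative only, not Datasette's implementation):

```python
# Sketch: an ASGI middleware that answers requests under base_url and
# returns 404 for everything else, so the un-prefixed paths stop
# silently working behind the proxy.
def require_base_url(app, base_url="/prefix/"):
    async def wrapper(scope, receive, send):
        if scope["type"] == "http" and not scope["path"].startswith(base_url):
            await send({
                "type": "http.response.start",
                "status": 404,
                "headers": [(b"content-type", b"text/plain")],
            })
            await send({"type": "http.response.body", "body": b"404"})
            return
        await app(scope, receive, send)
    return wrapper
```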

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
base_url is omitted in JSON and CSV views 1058790545  
974450232 https://github.com/simonw/datasette/issues/1519#issuecomment-974450232 https://api.github.com/repos/simonw/datasette/issues/1519 IC_kwDOBm6k_c46FO44 simonw 9599 2021-11-19T20:41:53Z 2021-11-19T20:42:19Z OWNER

https://docs.datasette.io/en/stable/deploying.html#apache-proxy-configuration says I should use ProxyPreserveHost on.
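For reference, the relevant pair of directives looks something like this (a sketch adapted to this demo's `/prefix/` path and port, not copied verbatim from the docs):

```apache
ProxyPreserveHost On
ProxyPass /prefix/ http://localhost:8001/prefix/
```

With `ProxyPreserveHost On`, Apache passes the original `Host:` header through, so Datasette can construct URLs with the hostname the client actually used.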

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
base_url is omitted in JSON and CSV views 1058790545  
974447950 https://github.com/simonw/datasette/issues/1519#issuecomment-974447950 https://api.github.com/repos/simonw/datasette/issues/1519 IC_kwDOBm6k_c46FOVO simonw 9599 2021-11-19T20:40:19Z 2021-11-19T20:40:19Z OWNER

Figured it out! The test is not an accurate recreation of what is happening, because it doesn't simulate a request with a path of /fixtures that has been redirected by the proxy to /prefix/fixtures.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
base_url is omitted in JSON and CSV views 1058790545  
974435661 https://github.com/simonw/datasette/issues/1522#issuecomment-974435661 https://api.github.com/repos/simonw/datasette/issues/1522 IC_kwDOBm6k_c46FLVN simonw 9599 2021-11-19T20:33:42Z 2021-11-19T20:33:42Z OWNER

Should just be a case of deploying this Dockerfile:

```Dockerfile
FROM python:3-alpine

RUN apk add --no-cache \
    apache2 \
    apache2-proxy \
    bash

RUN pip install datasette

ENV TINI_VERSION v0.18.0
ADD https://github.com/krallin/tini/releases/download/${TINI_VERSION}/tini-static /tini
RUN chmod +x /tini

# Append this to the end of the default httpd.conf file
RUN echo $'ServerName localhost\n\
\n\
<Proxy *>\n\
    Order deny,allow\n\
    Allow from all\n\
</Proxy>\n\
\n\
ProxyPass /foo/bar/ http://localhost:9000/\n\
Header add X-Proxied-By "Apache2"' >> /etc/apache2/httpd.conf

RUN echo $'Datasette' > /var/www/localhost/htdocs/index.html

WORKDIR /app

ADD https://latest.datasette.io/fixtures.db /app/fixtures.db

RUN echo $'#!/usr/bin/env bash\n\
set -e\n\
\n\
httpd -D FOREGROUND &\n\
datasette fixtures.db --setting base_url "/foo/bar/" -p 9000 &\n\
\n\
wait -n' > /app/start.sh

RUN chmod +x /app/start.sh

EXPOSE 80
ENTRYPOINT ["/tini", "--", "/app/start.sh"]
```

I can follow this TIL: https://til.simonwillison.net/cloudrun/ship-dockerfile-to-cloud-run

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Deploy a live instance of demos/apache-proxy 1058896236  
974433520 https://github.com/simonw/datasette/issues/1521#issuecomment-974433520 https://api.github.com/repos/simonw/datasette/issues/1521 IC_kwDOBm6k_c46FKzw simonw 9599 2021-11-19T20:32:29Z 2021-11-19T20:32:29Z OWNER

This configuration works great.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Docker configuration for exercising Datasette behind Apache mod_proxy 1058815557  
974433320 https://github.com/simonw/datasette/issues/1519#issuecomment-974433320 https://api.github.com/repos/simonw/datasette/issues/1519 IC_kwDOBm6k_c46FKwo simonw 9599 2021-11-19T20:32:04Z 2021-11-19T20:32:04Z OWNER

Still not clear why the tests pass but the live example fails.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
base_url is omitted in JSON and CSV views 1058790545  
974433206 https://github.com/simonw/datasette/issues/1519#issuecomment-974433206 https://api.github.com/repos/simonw/datasette/issues/1519 IC_kwDOBm6k_c46FKu2 simonw 9599 2021-11-19T20:31:52Z 2021-11-19T20:31:52Z OWNER

Modified my Dockerfile to do this:

```
RUN pip install https://github.com/simonw/datasette/archive/ff0dd4da38d48c2fa9250ecf336002c9ed724e36.zip
```

And now the request in that debug `?_context=1` looks like this:

```
"request": "<asgi.Request method=\"GET\" url=\"http://localhost:9000/fixtures?sql=select+*+from+compound_three_primary_keys+limit+1&_context=1\">"
```

That explains the bug - that request doesn't maintain the original path prefix of `http://localhost:5000/foo/bar/fixtures?sql=` (also it's been rewritten to `localhost:9000` instead of `localhost:5000`).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
base_url is omitted in JSON and CSV views 1058790545  
974422829 https://github.com/simonw/datasette/issues/1519#issuecomment-974422829 https://api.github.com/repos/simonw/datasette/issues/1519 IC_kwDOBm6k_c46FIMt simonw 9599 2021-11-19T20:26:35Z 2021-11-19T20:26:35Z OWNER

In the `?_context=` debug view the request looks like this:

```
"request": "<datasette.utils.asgi.Request object at 0x7faf9fe06200>",
```

I'm going to add a `repr()` to it such that it's a bit more useful.
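Something along these lines (an illustrative sketch of that kind of repr, not the exact patch):

```python
# Hypothetical sketch: replace the default object repr with one that
# surfaces the request method and full URL, which is what you actually
# want to see in template debug output.
class Request:
    def __init__(self, method, url):
        self.method = method
        self.url = url

    def __repr__(self):
        return '<asgi.Request method="{}" url="{}">'.format(self.method, self.url)
```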

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
base_url is omitted in JSON and CSV views 1058790545  
974420619 https://github.com/simonw/datasette/issues/1519#issuecomment-974420619 https://api.github.com/repos/simonw/datasette/issues/1519 IC_kwDOBm6k_c46FHqL simonw 9599 2021-11-19T20:25:19Z 2021-11-19T20:25:19Z OWNER

The implementations of `path_with_removed_args` and `path_with_format`:

https://github.com/simonw/datasette/blob/85849935292e500ab7a99f8fe0f9546e903baad3/datasette/utils/__init__.py#L228-L254

https://github.com/simonw/datasette/blob/85849935292e500ab7a99f8fe0f9546e903baad3/datasette/utils/__init__.py#L710-L729

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
base_url is omitted in JSON and CSV views 1058790545  
974418496 https://github.com/simonw/datasette/issues/1519#issuecomment-974418496 https://api.github.com/repos/simonw/datasette/issues/1519 IC_kwDOBm6k_c46FHJA simonw 9599 2021-11-19T20:24:16Z 2021-11-19T20:24:16Z OWNER

Here's the code that generates edit_sql_url correctly: https://github.com/simonw/datasette/blob/85849935292e500ab7a99f8fe0f9546e903baad3/datasette/views/database.py#L416-L420

And here's the code for show_hide_link: https://github.com/simonw/datasette/blob/85849935292e500ab7a99f8fe0f9546e903baad3/datasette/views/database.py#L432-L433

And for url_csv: https://github.com/simonw/datasette/blob/85849935292e500ab7a99f8fe0f9546e903baad3/datasette/views/base.py#L600-L602

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
base_url is omitted in JSON and CSV views 1058790545  
974398399 https://github.com/simonw/datasette/issues/1519#issuecomment-974398399 https://api.github.com/repos/simonw/datasette/issues/1519 IC_kwDOBm6k_c46FCO_ simonw 9599 2021-11-19T20:08:20Z 2021-11-19T20:22:02Z OWNER

The relevant test is this one: https://github.com/simonw/datasette/blob/30255055150d7bc0affc8156adc18295495020ff/tests/test_html.py#L1608-L1649

I modified that test to add "/fixtures/facetable?sql=select+1" as one of the tested paths, and dropped in an `assert False` to pause it in the debugger:

```python
@pytest.mark.parametrize(
    "path",
    [
        "/",
        "/fixtures",
        "/fixtures/compound_three_primary_keys",
        "/fixtures/compound_three_primary_keys/a,a,a",
        "/fixtures/paginated_view",
        "/fixtures/facetable",
        "/fixtures?sql=select+1",
    ],
)
def test_base_url_config(app_client_base_url_prefix, path):
    client = app_client_base_url_prefix
    response = client.get("/prefix/" + path.lstrip("/"))
    soup = Soup(response.body, "html.parser")
    if path == "/fixtures?sql=select+1":
        assert False
```

```
E       assert False
```

BUT... in the debugger:

```
(Pdb) print(soup)
...
This data as json, testall, testnone, testresponse, CSV
...
```

Those all have the correct prefix! But that's not what I'm seeing in my `Dockerfile` reproduction of the issue.

Something very weird is going on here.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
base_url is omitted in JSON and CSV views 1058790545  
974405016 https://github.com/simonw/datasette/issues/1519#issuecomment-974405016 https://api.github.com/repos/simonw/datasette/issues/1519 IC_kwDOBm6k_c46FD2Y simonw 9599 2021-11-19T20:14:19Z 2021-11-19T20:15:05Z OWNER

I added `template_debug` in the Dockerfile:

```
datasette fixtures.db --setting template_debug 1 --setting base_url "/foo/bar/" -p 9000 &\n\
```

And then hit `http://localhost:5000/foo/bar/fixtures?sql=select+*+from+compound_three_primary_keys+limit+1&_context=1` to view the template context - and it showed the bug. Output edited to just show relevant keys:

```json
{
    "edit_sql_url": "/foo/bar/fixtures?sql=select+%2A+from+compound_three_primary_keys+limit+1",
    "settings": {
        "force_https_urls": false,
        "template_debug": true,
        "trace_debug": false,
        "base_url": "/foo/bar/"
    },
    "show_hide_link": "/fixtures?sql=select+%2A+from+compound_three_primary_keys+limit+1&_context=1&_hide_sql=1",
    "show_hide_text": "hide",
    "show_hide_hidden": "",
    "renderers": {
        "json": "/fixtures.json?sql=select+*+from+compound_three_primary_keys+limit+1&_context=1"
    },
    "url_csv": "/fixtures.csv?sql=select+*+from+compound_three_primary_keys+limit+1&_context=1&_size=max",
    "url_csv_path": "/fixtures.csv",
    "base_url": "/foo/bar/"
}
```

This is so strange. `edit_sql_url` and `base_url` are correct, but `show_hide_link`, `url_csv` and `renderers.json` are not.

And it's really strange that the bug doesn't show up in the tests.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
base_url is omitted in JSON and CSV views 1058790545  
974391204 https://github.com/simonw/datasette/issues/1519#issuecomment-974391204 https://api.github.com/repos/simonw/datasette/issues/1519 IC_kwDOBm6k_c46FAek simonw 9599 2021-11-19T20:02:41Z 2021-11-19T20:02:41Z OWNER

Bug confirmed:

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
base_url is omitted in JSON and CSV views 1058790545  
974389472 https://github.com/simonw/datasette/issues/1519#issuecomment-974389472 https://api.github.com/repos/simonw/datasette/issues/1519 IC_kwDOBm6k_c46FADg simonw 9599 2021-11-19T20:01:02Z 2021-11-19T20:01:02Z OWNER

I now have a Dockerfile in https://github.com/simonw/datasette/issues/1521#issuecomment-974388295 that I can use to run a local Apache 2 with mod_proxy to investigate this class of bugs!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
base_url is omitted in JSON and CSV views 1058790545  
974388295 https://github.com/simonw/datasette/issues/1521#issuecomment-974388295 https://api.github.com/repos/simonw/datasette/issues/1521 IC_kwDOBm6k_c46E_xH simonw 9599 2021-11-19T20:00:06Z 2021-11-19T20:00:06Z OWNER

And this is the version that proxies to a base_url of /foo/bar/:

```Dockerfile
FROM python:3-alpine

RUN apk add --no-cache \
    apache2 \
    apache2-proxy \
    bash

RUN pip install datasette

ENV TINI_VERSION v0.18.0
ADD https://github.com/krallin/tini/releases/download/${TINI_VERSION}/tini-static /tini
RUN chmod +x /tini

# Append this to the end of the default httpd.conf file
RUN echo $'ServerName localhost\n\
\n\
<Proxy *>\n\
    Order deny,allow\n\
    Allow from all\n\
</Proxy>\n\
\n\
ProxyPass /foo/bar/ http://localhost:9000/\n\
Header add X-Proxied-By "Apache2"' >> /etc/apache2/httpd.conf

RUN echo $'Datasette' > /var/www/localhost/htdocs/index.html

WORKDIR /app

ADD https://latest.datasette.io/fixtures.db /app/fixtures.db

RUN echo $'#!/usr/bin/env bash\n\
set -e\n\
\n\
httpd -D FOREGROUND &\n\
datasette fixtures.db --setting base_url "/foo/bar/" -p 9000 &\n\
\n\
wait -n' > /app/start.sh

RUN chmod +x /app/start.sh

EXPOSE 80
ENTRYPOINT ["/tini", "--", "/app/start.sh"]
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Docker configuration for exercising Datasette behind Apache mod_proxy 1058815557  
974380798 https://github.com/simonw/datasette/issues/1521#issuecomment-974380798 https://api.github.com/repos/simonw/datasette/issues/1521 IC_kwDOBm6k_c46E97- simonw 9599 2021-11-19T19:54:26Z 2021-11-19T19:54:26Z OWNER

Got it working! Here's a Dockerfile which runs completely stand-alone (thanks to using the echo $' trick to write out the config files it needs) and successfully serves Datasette behind Apache and mod_proxy:

```Dockerfile
FROM python:3-alpine

RUN apk add --no-cache \
    apache2 \
    apache2-proxy \
    bash

RUN pip install datasette

ENV TINI_VERSION v0.18.0
ADD https://github.com/krallin/tini/releases/download/${TINI_VERSION}/tini-static /tini
RUN chmod +x /tini

# Append this to the end of the default httpd.conf file
RUN echo $'ServerName localhost\n\
\n\
<Proxy *>\n\
    Order deny,allow\n\
    Allow from all\n\
</Proxy>\n\
\n\
ProxyPass / http://localhost:9000/\n\
ProxyPassReverse / http://localhost:9000/\n\
Header add X-Proxied-By "Apache2"' >> /etc/apache2/httpd.conf

WORKDIR /app

RUN echo $'#!/usr/bin/env bash\n\
set -e\n\
\n\
httpd -D FOREGROUND &\n\
datasette -p 9000 &\n\
\n\
wait -n' > /app/start.sh

RUN chmod +x /app/start.sh

EXPOSE 80
ENTRYPOINT ["/tini", "--", "/app/start.sh"]
```

Run it like this:

```
docker build -t datasette-apache2-proxy .
docker run -p 5000:80 --rm datasette-apache2-proxy
```

Then run this to confirm:

```
~ % curl -i 'http://localhost:5000/-/versions.json'
HTTP/1.1 200 OK
Date: Fri, 19 Nov 2021 19:54:05 GMT
Server: uvicorn
content-type: application/json; charset=utf-8
X-Proxied-By: Apache2
Transfer-Encoding: chunked

{"python": {"version": "3.10.0", "full": "3.10.0 (default, Nov 13 2021, 03:23:03) [GCC 10.3.1 20210424]"}, "datasette": {"version": "0.59.2"}, "asgi": "3.0", "uvicorn": "0.15.0", "sqlite": {"version": "3.35.5", "fts_versions": ["FTS5", "FTS4", "FTS3"], "extensions": {"json1": null}, "compile_options": ["COMPILER=gcc-10.3.1 20210424", "ENABLE_COLUMN_METADATA", "ENABLE_DBSTAT_VTAB", "ENABLE_FTS3", "ENABLE_FTS3_PARENTHESIS", "ENABLE_FTS4", "ENABLE_FTS5", "ENABLE_GEOPOLY", "ENABLE_JSON1", "ENABLE_MATH_FUNCTIONS", "ENABLE_RTREE", "ENABLE_UNLOCK_NOTIFY", "MAX_VARIABLE_NUMBER=250000", "SECURE_DELETE", "THREADSAFE=1", "USE_URI"]}}
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Docker configuration for exercising Datasette behind Apache mod_proxy 1058815557  
974371116 https://github.com/simonw/datasette/issues/1521#issuecomment-974371116 https://api.github.com/repos/simonw/datasette/issues/1521 IC_kwDOBm6k_c46E7ks simonw 9599 2021-11-19T19:45:47Z 2021-11-19T19:45:47Z OWNER

https://github.com/krallin/tini says:

NOTE: If you are using Docker 1.13 or greater, Tini is included in Docker itself. This includes all versions of Docker CE. To enable Tini, just pass the --init flag to docker run.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Docker configuration for exercising Datasette behind Apache mod_proxy 1058815557  
974336020 https://github.com/simonw/datasette/issues/1521#issuecomment-974336020 https://api.github.com/repos/simonw/datasette/issues/1521 IC_kwDOBm6k_c46EzAU simonw 9599 2021-11-19T19:10:48Z 2021-11-19T19:10:48Z OWNER

There's a promising looking minimal Apache 2 proxy config here: https://stackoverflow.com/questions/26474476/minimal-configuration-for-apache-reverse-proxy-in-docker-container

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Docker configuration for exercising Datasette behind Apache mod_proxy 1058815557  
974334278 https://github.com/simonw/datasette/issues/1521#issuecomment-974334278 https://api.github.com/repos/simonw/datasette/issues/1521 IC_kwDOBm6k_c46EylG simonw 9599 2021-11-19T19:08:09Z 2021-11-19T19:08:09Z OWNER

Stripping comments using this StackOverflow recipe: https://unix.stackexchange.com/a/157619

docker run -it --entrypoint sh alpine-apache2-sh \
  -c "cat /etc/apache2/httpd.conf" | sed '/^[[:blank:]]*#/d;s/#.*//'

Result is here: https://gist.github.com/simonw/0a05090df5fcff8e8b3334621fa17976

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Docker configuration for exercising Datasette behind Apache mod_proxy 1058815557  
974332787 https://github.com/simonw/datasette/issues/1521#issuecomment-974332787 https://api.github.com/repos/simonw/datasette/issues/1521 IC_kwDOBm6k_c46EyNz simonw 9599 2021-11-19T19:05:52Z 2021-11-19T19:05:52Z OWNER

Made myself this Dockerfile to let me explore a bit:

```Dockerfile
FROM python:3-alpine

RUN apk add --no-cache \
    apache2

CMD ["sh"]
```

Then:

```
% docker run alpine-apache2-sh
% docker run -it alpine-apache2-sh
/ # ls /etc/apache2/httpd.conf
/etc/apache2/httpd.conf
/ # cat /etc/apache2/httpd.conf
# This is the main Apache HTTP server configuration file. It contains the
# configuration directives that give the server its instructions.
...
```

Copying that into a Gist like so:

```
docker run -it --entrypoint sh alpine-apache2-sh -c "cat /etc/apache2/httpd.conf" | pbcopy
```

Gist here: https://gist.github.com/simonw/5ea0db6049192cb9f761fbd6beb3a84a

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Docker configuration for exercising Datasette behind Apache mod_proxy 1058815557  
974327812 https://github.com/simonw/datasette/issues/1521#issuecomment-974327812 https://api.github.com/repos/simonw/datasette/issues/1521 IC_kwDOBm6k_c46ExAE simonw 9599 2021-11-19T18:58:49Z 2021-11-19T18:59:55Z OWNER

From this example: https://github.com/tigelane/dockerfiles/blob/06cff2ac8cdc920ebd64f50965115eaa3d0afb84/Alpine-Apache2/Dockerfile#L25-L31 it looks like running apk add apache2 installs a config file at /etc/apache2/httpd.conf - so one approach is to then modify that file.

```Dockerfile
# APACHE - Alpine
RUN apk --update add apache2 php5-apache2 && \
    #apk add openrc --no-cache && \
    rm -rf /var/cache/apk/* && \
    sed -i 's/#ServerName www.example.com:80/ServerName localhost/' /etc/apache2/httpd.conf && \
    mkdir -p /run/apache2/

# Upload our files from folder "dist".
COPY dist /var/www/localhost/htdocs

# Manually set up the apache environment variables
ENV APACHE_RUN_USER www-data
ENV APACHE_RUN_GROUP www-data
ENV APACHE_LOG_DIR /var/log/apache2
ENV APACHE_LOCK_DIR /var/lock/apache2
ENV APACHE_PID_FILE /var/run/apache2.pid

# Execute apache2 on run
EXPOSE 80
ENTRYPOINT ["httpd"]
CMD ["-D", "FOREGROUND"]
```

I think I'll create my own separate copy and modify that.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Docker configuration for exercising Datasette behind Apache mod_proxy 1058815557  
974321391 https://github.com/simonw/datasette/issues/1521#issuecomment-974321391 https://api.github.com/repos/simonw/datasette/issues/1521 IC_kwDOBm6k_c46Evbv simonw 9599 2021-11-19T18:49:15Z 2021-11-19T18:57:18Z OWNER

This pattern looks like it can help: https://ahmet.im/blog/cloud-run-multiple-processes-easy-way/ - see example in https://github.com/ahmetb/multi-process-container-lazy-solution

I got that demo working locally like this:

```bash
cd /tmp
git clone https://github.com/ahmetb/multi-process-container-lazy-solution
cd multi-process-container-lazy-solution
docker build -t multi-process-container-lazy-solution .
docker run -p 5000:8080 --rm multi-process-container-lazy-solution
```

I want to use apache2 rather than nginx though. I found a few relevant examples of Apache in Alpine:

  • https://github.com/Hacking-Lab/alpine-apache2-reverse-proxy/blob/master/Dockerfile
  • https://www.sentiatechblog.com/running-apache-in-a-docker-container
  • https://github.com/search?l=Dockerfile&q=alpine+apache2&type=code
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Docker configuration for exercising Datasette behind Apache mod_proxy 1058815557  
974322178 https://github.com/simonw/datasette/issues/1521#issuecomment-974322178 https://api.github.com/repos/simonw/datasette/issues/1521 IC_kwDOBm6k_c46EvoC simonw 9599 2021-11-19T18:50:22Z 2021-11-19T18:50:22Z OWNER

I'll get this working on my laptop first, but then I want to get it up and running on Cloud Run - maybe with a GitHub Actions workflow in this repo that re-deploys it on manual execution.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Docker configuration for exercising Datasette behind Apache mod_proxy 1058815557  
974310208 https://github.com/simonw/datasette/issues/1519#issuecomment-974310208 https://api.github.com/repos/simonw/datasette/issues/1519 IC_kwDOBm6k_c46EstA simonw 9599 2021-11-19T18:32:31Z 2021-11-19T18:32:31Z OWNER

Having a live demo running on Cloud Run that proxies through Apache and uses base_url would be incredibly useful for replicating and debugging this kind of thing. I wonder how hard it is to run Apache and mod_proxy in the same Docker container as Datasette?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
base_url is omitted in JSON and CSV views 1058790545  
974309591 https://github.com/simonw/datasette/issues/1519#issuecomment-974309591 https://api.github.com/repos/simonw/datasette/issues/1519 IC_kwDOBm6k_c46EsjX simonw 9599 2021-11-19T18:31:32Z 2021-11-19T18:31:32Z OWNER

base_url has been a source of so many bugs like this! I often find them quite hard to replicate, likely because I haven't made myself a good Apache mod_proxy testing environment yet.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
base_url is omitted in JSON and CSV views 1058790545  
974308215 https://github.com/simonw/datasette/issues/1520#issuecomment-974308215 https://api.github.com/repos/simonw/datasette/issues/1520 IC_kwDOBm6k_c46EsN3 simonw 9599 2021-11-19T18:29:26Z 2021-11-19T18:29:26Z OWNER

The solution that jumps to mind first is that it would be neat if routes could return something that meant "actually my bad, I can't handle this after all - move to the next one in the list".

A related idea: it might be useful for custom views like my one here to say "no actually call the default view for this, but give me back the response so I can modify it in some way". Kind of like Django or ASGI middleware.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Pattern for avoiding accidental URL over-rides 1058803238  
974300823 https://github.com/simonw/datasette/issues/1518#issuecomment-974300823 https://api.github.com/repos/simonw/datasette/issues/1518 IC_kwDOBm6k_c46EqaX simonw 9599 2021-11-19T18:18:32Z 2021-11-19T18:18:32Z OWNER

This may be an argument for continuing to allow non-JSON-objects through to the HTML templates. Need to think about that a bit more.

I can definitely support this using pure-JSON - I could make two versions of the row available, one that's an array of cell objects and the other that's an object mapping column names to column raw values.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Complete refactor of TableView and table.html template 1058072543  
974285803 https://github.com/simonw/datasette/issues/1518#issuecomment-974285803 https://api.github.com/repos/simonw/datasette/issues/1518 IC_kwDOBm6k_c46Emvr simonw 9599 2021-11-19T17:56:48Z 2021-11-19T18:14:30Z OWNER

Very confused by this piece of code here: https://github.com/simonw/datasette/blob/1c13e1af0664a4dfb1e69714c56523279cae09e4/datasette/views/table.py#L37-L63

I added it in https://github.com/simonw/datasette/commit/754836eef043676e84626c4fd3cb993eed0d2976 - in the new world that should probably be replaced by pure JSON.

Aha - this comment explains it: https://github.com/simonw/datasette/issues/521#issuecomment-505279560

I think the trick is to redefine what a "cell_row" is. Each row is currently a list of cells:

https://github.com/simonw/datasette/blob/6341f8cbc7833022012804dea120b838ec1f6558/datasette/views/table.py#L159-L163

I can redefine the row (the cells variable in the above example) as a thing-that-iterates-cells (hence behaving like a list) but that also supports __getitem__ access for looking up cell values if you know the name of the column.
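A hypothetical sketch of such a wrapper (the cell dict keys are made up):

```python
class CellRow:
    """A row that iterates like a list of cells but can also be
    indexed by column name, so templates can use row["First_Name"]."""

    def __init__(self, cells):
        # cells: a list of {"column": ..., "raw": ...} dicts
        self._cells = cells

    def __iter__(self):
        # Behaves like a list when looped over
        return iter(self._cells)

    def __len__(self):
        return len(self._cells)

    def __getitem__(self, key):
        # Lookup-by-column-name for templates
        for cell in self._cells:
            if cell["column"] == key:
                return cell["raw"]
        raise KeyError(key)


row = CellRow([
    {"column": "First_Name", "raw": "Simon"},
    {"column": "Last_Name", "raw": "Willison"},
])
print(row["First_Name"], len(list(row)))
```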

The goal was to support neater custom templates like this:

```html+jinja
{% for row in display_rows %}
    {{ row["First_Name"] }} {{ row["Last_Name"] }}
    ...
```

This may be an argument for continuing to allow non-JSON-objects through to the HTML templates. Need to think about that a bit more.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Complete refactor of TableView and table.html template 1058072543  
974287570 https://github.com/simonw/datasette/issues/1518#issuecomment-974287570 https://api.github.com/repos/simonw/datasette/issues/1518 IC_kwDOBm6k_c46EnLS simonw 9599 2021-11-19T17:59:33Z 2021-11-19T17:59:33Z OWNER

I'm going to try leaning into the asyncinject mechanism a bit here. One method can execute and return the raw rows. Another can turn that into the default minimal JSON representation. Then a third can take that (or take both) and use it to inflate out the JSON that the HTML template needs, with those extras and with the rendered cells from plugins.
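As a plain-asyncio sketch of that three-layer split (no asyncinject, data invented):

```python
import asyncio

async def raw_rows():
    # Layer 1: execute the SQL and return raw rows
    return [(1, "Willow"), (2, "Banks")]

async def minimal_json(rows):
    # Layer 2: the default, lightweight JSON representation
    return {"columns": ["id", "name"], "rows": [list(r) for r in rows]}

async def html_context(minimal):
    # Layer 3: inflate out the extras the HTML template needs
    display_rows = [dict(zip(minimal["columns"], row)) for row in minimal["rows"]]
    return dict(minimal, display_rows=display_rows)

async def main():
    rows = await raw_rows()
    minimal = await minimal_json(rows)
    return await html_context(minimal)

context = asyncio.run(main())
print(context["display_rows"][0])
```

With asyncinject the point is that these dependencies get resolved (and where possible parallelized) automatically rather than being chained by hand like this.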

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Complete refactor of TableView and table.html template 1058072543  
974108455 https://github.com/simonw/datasette/pull/1495#issuecomment-974108455 https://api.github.com/repos/simonw/datasette/issues/1495 IC_kwDOBm6k_c46D7cn mroswell 192568 2021-11-19T14:14:35Z 2021-11-19T14:14:35Z CONTRIBUTOR

A nudge on this.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Allow routes to have extra options 1033678984  
973820125 https://github.com/simonw/sqlite-utils/issues/342#issuecomment-973820125 https://api.github.com/repos/simonw/sqlite-utils/issues/342 IC_kwDOCGYnMM46C1Dd simonw 9599 2021-11-19T07:25:55Z 2021-11-19T07:25:55Z OWNER

alter=True doesn't make sense to support here either, because .lookup() already adds missing columns: https://github.com/simonw/sqlite-utils/blob/3b8abe608796e99e4ffc5f3f4597a85e605c0e9b/sqlite_utils/db.py#L2743-L2746

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Extra options to `lookup()` which get passed to `insert()` 1058196641  
973802998 https://github.com/simonw/sqlite-utils/issues/342#issuecomment-973802998 https://api.github.com/repos/simonw/sqlite-utils/issues/342 IC_kwDOCGYnMM46Cw32 simonw 9599 2021-11-19T06:59:22Z 2021-11-19T06:59:32Z OWNER

I don't think I need the DEFAULT defaults for .lookup() either, since it just passes through to .insert().

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Extra options to `lookup()` which get passed to `insert()` 1058196641  
973802766 https://github.com/simonw/sqlite-utils/issues/342#issuecomment-973802766 https://api.github.com/repos/simonw/sqlite-utils/issues/342 IC_kwDOCGYnMM46Cw0O simonw 9599 2021-11-19T06:58:45Z 2021-11-19T06:58:45Z OWNER

And neither does hash_id. On that basis I'm going to specifically list the ones that DO make sense, and hope that I remember to add any new ones in the future. I can add a code comment hint to .insert() about that.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Extra options to `lookup()` which get passed to `insert()` 1058196641  
973802469 https://github.com/simonw/sqlite-utils/issues/342#issuecomment-973802469 https://api.github.com/repos/simonw/sqlite-utils/issues/342 IC_kwDOCGYnMM46Cwvl simonw 9599 2021-11-19T06:58:03Z 2021-11-19T06:58:03Z OWNER

Also: I don't think ignore= and replace= make sense in the context of lookup().

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Extra options to `lookup()` which get passed to `insert()` 1058196641  
973802308 https://github.com/simonw/sqlite-utils/issues/342#issuecomment-973802308 https://api.github.com/repos/simonw/sqlite-utils/issues/342 IC_kwDOCGYnMM46CwtE simonw 9599 2021-11-19T06:57:37Z 2021-11-19T06:57:37Z OWNER

Here's the current full method signature for .insert(): https://github.com/simonw/sqlite-utils/blob/3b8abe608796e99e4ffc5f3f4597a85e605c0e9b/sqlite_utils/db.py#L2462-L2477

I could add a test which uses introspection (inspect.signature(method).parameters) to confirm that .lookup() has a super-set of the arguments accepted by .insert().
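That introspection test could be as simple as comparing parameter sets - a sketch against hypothetical signatures (the real ones live in sqlite_utils.db):

```python
import inspect


class Table:
    # Hypothetical signatures, just to illustrate the introspection trick
    def insert(self, record, pk=None, alter=False, ignore=False, replace=False):
        ...

    def lookup(self, lookup_values, extra_values=None, pk="id", alter=False):
        ...


def missing_insert_params(cls):
    # insert() arguments that lookup() does not (yet) accept;
    # skip the ones that are renamed or deliberately unsupported
    skip = {"self", "record", "ignore", "replace"}
    insert_params = set(inspect.signature(cls.insert).parameters) - skip
    lookup_params = set(inspect.signature(cls.lookup).parameters)
    return insert_params - lookup_params


print(missing_insert_params(Table))
```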

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Extra options to `lookup()` which get passed to `insert()` 1058196641  
973801650 https://github.com/simonw/sqlite-utils/issues/342#issuecomment-973801650 https://api.github.com/repos/simonw/sqlite-utils/issues/342 IC_kwDOCGYnMM46Cwiy simonw 9599 2021-11-19T06:55:56Z 2021-11-19T06:55:56Z OWNER

pk needs to be an explicit argument to .lookup(). The rest could be **kwargs passed through to .insert(), like this hacked together version (docstring removed for brevity):

```python
def lookup(
    self,
    lookup_values: Dict[str, Any],
    extra_values: Optional[Dict[str, Any]] = None,
    pk="id",
    **insert_kwargs,
):
    assert isinstance(lookup_values, dict)
    if extra_values is not None:
        assert isinstance(extra_values, dict)
    combined_values = dict(lookup_values)
    if extra_values is not None:
        combined_values.update(extra_values)
    if self.exists():
        self.add_missing_columns([combined_values])
        unique_column_sets = [set(i.columns) for i in self.indexes]
        if set(lookup_values.keys()) not in unique_column_sets:
            self.create_index(lookup_values.keys(), unique=True)
        wheres = ["[{}] = ?".format(column) for column in lookup_values]
        rows = list(
            self.rows_where(
                " and ".join(wheres), [value for _, value in lookup_values.items()]
            )
        )
        try:
            return rows[0][pk]
        except IndexError:
            return self.insert(combined_values, pk=pk, **insert_kwargs).last_pk
    else:
        pk = self.insert(combined_values, pk=pk, **insert_kwargs).last_pk
        self.create_index(lookup_values.keys(), unique=True)
        return pk
```

I think I'll explicitly list the parameters, mainly so they can be typed and covered by automatic documentation.

I do worry that I'll add more keyword arguments to .insert() in the future and forget to mirror them to .lookup() though.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Extra options to `lookup()` which get passed to `insert()` 1058196641  
973800795 https://github.com/simonw/sqlite-utils/issues/342#issuecomment-973800795 https://api.github.com/repos/simonw/sqlite-utils/issues/342 IC_kwDOCGYnMM46CwVb simonw 9599 2021-11-19T06:54:08Z 2021-11-19T06:54:08Z OWNER

Looking at the code for lookup() it currently hard-codes pk to "id" - but it actually only calls .insert() in two places, both of which could be passed extra arguments.

https://github.com/simonw/sqlite-utils/blob/3b8abe608796e99e4ffc5f3f4597a85e605c0e9b/sqlite_utils/db.py#L2756-L2763

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Extra options to `lookup()` which get passed to `insert()` 1058196641  
973700549 https://github.com/simonw/datasette/issues/1518#issuecomment-973700549 https://api.github.com/repos/simonw/datasette/issues/1518 IC_kwDOBm6k_c46CX3F simonw 9599 2021-11-19T03:31:20Z 2021-11-19T03:31:26Z OWNER

... and while I'm doing all of this I can rewrite the templates to not use those cheating magical functions AND document the template context at the same time, refs: - #1510.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Complete refactor of TableView and table.html template 1058072543  
973700322 https://github.com/simonw/datasette/issues/1518#issuecomment-973700322 https://api.github.com/repos/simonw/datasette/issues/1518 IC_kwDOBm6k_c46CXzi simonw 9599 2021-11-19T03:30:30Z 2021-11-19T03:30:30Z OWNER

Right now the HTML version gets to cheat - it passes through objects that are not JSON serializable, including custom functions that can then be called by Jinja.

I'm interested in maybe removing this cheating - if the HTML version could only request JSON-serializable extras those could be exposed in the API as well.

It would also help cleanup the kind-of-nasty pattern I use in the current BaseView where everything returns both a bunch of JSON-serializable data AND an awaitable function that then gets to add extra things to the HTML context.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Complete refactor of TableView and table.html template 1058072543  
973698917 https://github.com/simonw/datasette/issues/1518#issuecomment-973698917 https://api.github.com/repos/simonw/datasette/issues/1518 IC_kwDOBm6k_c46CXdl simonw 9599 2021-11-19T03:26:18Z 2021-11-19T03:29:03Z OWNER

A (likely incomplete) list of features on the table page:

  • [ ] Display table/database/instance metadata
  • [ ] Show count of all results
  • [ ] Display table of results
  • [ ] Special table display treatment for URLs, numbers
  • [ ] Allow plugins to modify table cells
  • [ ] Respect ?_col= and ?_nocol=
  • [ ] Show interface for filtering by columns and operations
  • [ ] Show search box, support executing FTS searches
  • [ ] Sort table by specified column
  • [ ] Paginate table
  • [ ] Show facet results
  • [ ] Show suggested facets
  • [ ] Link to available exports
  • [ ] Display schema for table
  • [ ] Maybe it should show the SQL for the query too?
  • [ ] Handle various non-obvious querystring options, like ?_where= and ?_through=
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Complete refactor of TableView and table.html template 1058072543  
973699424 https://github.com/simonw/datasette/issues/1518#issuecomment-973699424 https://api.github.com/repos/simonw/datasette/issues/1518 IC_kwDOBm6k_c46CXlg simonw 9599 2021-11-19T03:27:49Z 2021-11-19T03:27:49Z OWNER

My goal is to break up a lot of this functionality into separate methods. These methods can be executed in parallel by asyncinject, but more importantly they can be used to build a much better JSON representation, where the default representation is lighter and ?_extra=x options can be used to execute more expensive portions and add them to the response.

So the HTML version itself needs to be re-written to use those JSON extras.
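A stripped-down sketch of that ?_extra= dispatch idea (the function and key names here are invented):

```python
import asyncio

# Hypothetical registry of optional, potentially expensive "extras"
async def extra_count(data):
    return len(data["rows"])

async def extra_columns(data):
    return sorted(data["rows"][0]) if data["rows"] else []

EXTRAS = {"count": extra_count, "columns": extra_columns}

async def render_json(data, requested_extras):
    # Start from the lightweight default representation...
    response = {"rows": data["rows"]}
    # ...then run only the extras the client asked for, in parallel
    names = [n for n in requested_extras if n in EXTRAS]
    results = await asyncio.gather(*(EXTRAS[n](data) for n in names))
    response.update(dict(zip(names, results)))
    return response

data = {"rows": [{"id": 1, "name": "Willow"}, {"id": 2, "name": "Banks"}]}
print(asyncio.run(render_json(data, ["count"])))
```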

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Complete refactor of TableView and table.html template 1058072543  
973696604 https://github.com/simonw/datasette/issues/1517#issuecomment-973696604 https://api.github.com/repos/simonw/datasette/issues/1517 IC_kwDOBm6k_c46CW5c simonw 9599 2021-11-19T03:20:00Z 2021-11-19T03:20:00Z OWNER

Confirmed - my test plugin is indeed correctly over-riding the table page.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Let `register_routes()` over-ride default routes within Datasette 1057996111  
973687978 https://github.com/simonw/datasette/issues/1518#issuecomment-973687978 https://api.github.com/repos/simonw/datasette/issues/1518 IC_kwDOBm6k_c46CUyq simonw 9599 2021-11-19T03:07:47Z 2021-11-19T03:07:47Z OWNER

I was wrong about that, you CAN over-ride default routes already.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Complete refactor of TableView and table.html template 1058072543  
973686874 https://github.com/simonw/datasette/issues/1517#issuecomment-973686874 https://api.github.com/repos/simonw/datasette/issues/1517 IC_kwDOBm6k_c46CUha simonw 9599 2021-11-19T03:06:58Z 2021-11-19T03:06:58Z OWNER

I made a mistake: I just wrote a test that proves that plugins CAN over-ride default routes, plus if you look at the code here the plugins get to register themselves first: https://github.com/simonw/datasette/blob/0156c6b5e52d541e93f0d68e9245f20ae83bc933/datasette/app.py#L965-L981

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Let `register_routes()` over-ride default routes within Datasette 1057996111  
973682389 https://github.com/simonw/datasette/issues/1518#issuecomment-973682389 https://api.github.com/repos/simonw/datasette/issues/1518 IC_kwDOBm6k_c46CTbV simonw 9599 2021-11-19T02:57:39Z 2021-11-19T02:57:39Z OWNER

Ideally I'd like to execute the existing test suite against the new implementation - that would require me to solve this so I can replace the view with the plugin version though:

  • #1517

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Complete refactor of TableView and table.html template 1058072543  
973681970 https://github.com/simonw/datasette/issues/1518#issuecomment-973681970 https://api.github.com/repos/simonw/datasette/issues/1518 IC_kwDOBm6k_c46CTUy simonw 9599 2021-11-19T02:56:31Z 2021-11-19T02:56:53Z OWNER

Here's where I got to with my hacked-together initial plugin prototype - it managed to render the table page with some rows on it (and a bunch of missing functionality such as filters): https://gist.github.com/simonw/281eac9c73b062c3469607ad86470eb2

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Complete refactor of TableView and table.html template 1058072543  
973678931 https://github.com/simonw/datasette/issues/878#issuecomment-973678931 https://api.github.com/repos/simonw/datasette/issues/878 IC_kwDOBm6k_c46CSlT simonw 9599 2021-11-19T02:51:17Z 2021-11-19T02:51:17Z OWNER

OK, I managed to get a table to render! Here's the code I used - I had to copy a LOT of stuff. https://gist.github.com/simonw/281eac9c73b062c3469607ad86470eb2

I'm going to move this work into a new, separate issue.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
New pattern for views that return either JSON or HTML, available for plugins 648435885  
973635157 https://github.com/simonw/datasette/issues/878#issuecomment-973635157 https://api.github.com/repos/simonw/datasette/issues/878 IC_kwDOBm6k_c46CH5V simonw 9599 2021-11-19T01:07:08Z 2021-11-19T01:07:08Z OWNER

This exercise is proving so useful in getting my head around how the enormous and complex TableView class works again.

Here's where I've got to now - I'm systematically working through the variables that are returned for HTML and for JSON copying across code to get it to work:

```python
from datasette.database import QueryInterrupted
from datasette.utils import escape_sqlite
from datasette.utils.asgi import Response, NotFound, Forbidden
from datasette.views.base import DatasetteError
from datasette import hookimpl
from asyncinject import AsyncInject, inject
from pprint import pformat


class Table(AsyncInject):
    @inject
    async def database(self, request, datasette):
        # TODO: all that nasty hash resolving stuff can go here
        db_name = request.url_vars["db_name"]
        try:
            db = datasette.databases[db_name]
        except KeyError:
            raise NotFound(f"Database '{db_name}' does not exist")
        return db

    @inject
    async def table_and_format(self, request, database, datasette):
        table_and_format = request.url_vars["table_and_format"]
        # TODO: be a lot smarter here
        if "." in table_and_format:
            return table_and_format.split(".", 2)
        else:
            return table_and_format, "html"

    @inject
    async def main(self, request, database, table_and_format, datasette):
        # TODO: if this is actually a canned query, dispatch to it

        table, format = table_and_format

        is_view = bool(await database.get_view_definition(table))
        table_exists = bool(await database.table_exists(table))
        if not is_view and not table_exists:
            raise NotFound(f"Table not found: {table}")

        await check_permissions(
            datasette,
            request,
            [
                ("view-table", (database.name, table)),
                ("view-database", database.name),
                "view-instance",
            ],
        )

        private = not await datasette.permission_allowed(
            None, "view-table", (database.name, table), default=True
        )

        pks = await database.primary_keys(table)
        table_columns = await database.table_columns(table)

        specified_columns = await columns_to_select(datasette, database, table, request)
        select_specified_columns = ", ".join(
            escape_sqlite(t) for t in specified_columns
        )
        select_all_columns = ", ".join(escape_sqlite(t) for t in table_columns)

        use_rowid = not pks and not is_view
        if use_rowid:
            select_specified_columns = f"rowid, {select_specified_columns}"
            select_all_columns = f"rowid, {select_all_columns}"
            order_by = "rowid"
            order_by_pks = "rowid"
        else:
            order_by_pks = ", ".join([escape_sqlite(pk) for pk in pks])
            order_by = order_by_pks

        if is_view:
            order_by = ""

        nocount = request.args.get("_nocount")
        nofacet = request.args.get("_nofacet")

        if request.args.get("_shape") in ("array", "object"):
            nocount = True
            nofacet = True

        # Next, a TON of SQL to build where_params and filters and suchlike
        # skipping that and jumping straight to...
        where_clauses = []
        where_clause = ""
        if where_clauses:
            where_clause = f"where {' and '.join(where_clauses)} "

        from_sql = "from {table_name} {where}".format(
            table_name=escape_sqlite(table),
            where=("where {} ".format(" and ".join(where_clauses)))
            if where_clauses
            else "",
        )
        from_sql_params = {}
        params = {}
        count_sql = f"select count(*) {from_sql}"
        sql_no_order_no_limit = (
            "select {select_all_columns} from {table_name} {where}".format(
                select_all_columns=select_all_columns,
                table_name=escape_sqlite(table),
                where=where_clause,
            )
        )

        page_size = 100
        offset = " offset 0"

        sql = "select {select_specified_columns} from {table_name} {where}{order_by} limit {page_size}{offset}".format(
            select_specified_columns=select_specified_columns,
            table_name=escape_sqlite(table),
            where=where_clause,
            order_by=order_by,
            page_size=page_size + 1,
            offset=offset,
        )

        # Fetch rows
        results = await database.execute(sql, params, truncate=True)
        columns = [r[0] for r in results.description]
        rows = list(results.rows)

        # Fetch count
        filtered_table_rows_count = None
        if count_sql:
            try:
                count_rows = list(await database.execute(count_sql, from_sql_params))
                filtered_table_rows_count = count_rows[0][0]
            except QueryInterrupted:
                pass

        vars = {
            "json": {
                # THIS STUFF is from the regular JSON
                "database": database.name,
                "table": table,
                "is_view": is_view,
                # "human_description_en": human_description_en,
                "rows": rows[:page_size],
                "truncated": results.truncated,
                "filtered_table_rows_count": filtered_table_rows_count,
                # "expanded_columns": expanded_columns,
                # "expandable_columns": expandable_columns,
                "columns": columns,
                "primary_keys": pks,
                # "units": units,
                "query": {"sql": sql, "params": params},
                # "facet_results": facet_results,
                # "suggested_facets": suggested_facets,
                # "next": next_value and str(next_value) or None,
                # "next_url": next_url,
                "private": private,
                "allow_execute_sql": await datasette.permission_allowed(
                    request.actor, "execute-sql", database, default=True
                ),
            },
            "html": {
                # ... this is the HTML special stuff
                # "table_actions": table_actions,
                # "supports_search": bool(fts_table),
                # "search": search or "",
                "use_rowid": use_rowid,
                # "filters": filters,
                # "display_columns": display_columns,
                # "filter_columns": filter_columns,
                # "display_rows": display_rows,
                # "facets_timed_out": facets_timed_out,
                # "sorted_facet_results": sorted(
                #     facet_results.values(),
                #     key=lambda f: (len(f["results"]), f["name"]),
                #     reverse=True,
                # ),
                # "show_facet_counts": special_args.get("_facet_size") == "max",
                # "extra_wheres_for_ui": extra_wheres_for_ui,
                # "form_hidden_args": form_hidden_args,
                # "is_sortable": any(c["sortable"] for c in display_columns),
                # "path_with_replaced_args": path_with_replaced_args,
                # "path_with_removed_args": path_with_removed_args,
                # "append_querystring": append_querystring,
                "request": request,
                # "sort": sort,
                # "sort_desc": sort_desc,
                "disable_sort": is_view,
                # "custom_table_templates": [
                #     f"_table-{to_css_class(database)}-{to_css_class(table)}.html",
                #     f"_table-table-{to_css_class(database)}-{to_css_class(table)}.html",
                #     "_table.html",
                # ],
                # "metadata": metadata,
                # "view_definition": await db.get_view_definition(table),
                # "table_definition": await db.get_table_definition(table),
            },
        }

        # I'm just trying to get HTML to work for the moment
        if format == "json":
            return Response.json(dict(vars, locals=locals()), default=repr)
        else:
            return Response.html(repr(vars["html"]))

    async def view(self, request, datasette):
        return await self.main(request=request, datasette=datasette)


@hookimpl
def register_routes():
    return [
        (r"/t/(?P<db_name>[^/]+)/(?P<table_and_format>[^/]+?$)", Table().view),
    ]


async def check_permissions(datasette, request, permissions):
    """permissions is a list of (action, resource) tuples or 'action' strings"""
    for permission in permissions:
        if isinstance(permission, str):
            action = permission
            resource = None
        elif isinstance(permission, (tuple, list)) and len(permission) == 2:
            action, resource = permission
        else:
            assert (
                False
            ), "permission should be string or tuple of two items: {}".format(
                repr(permission)
            )
        ok = await datasette.permission_allowed(
            request.actor,
            action,
            resource=resource,
            default=None,
        )
        if ok is not None:
            if ok:
                return
            else:
                raise Forbidden(action)


async def columns_to_select(datasette, database, table, request):
    table_columns = await database.table_columns(table)
    pks = await database.primary_keys(table)
    columns = list(table_columns)
    if "_col" in request.args:
        columns = list(pks)
        _cols = request.args.getlist("_col")
        bad_columns = [column for column in _cols if column not in table_columns]
        if bad_columns:
            raise DatasetteError(
                "_col={} - invalid columns".format(", ".join(bad_columns)),
                status=400,
            )
        # De-duplicate maintaining order:
        columns.extend(dict.fromkeys(_cols))
    if "_nocol" in request.args:
        # Return all columns EXCEPT these
        bad_columns = [
            column
            for column in request.args.getlist("_nocol")
            if (column not in table_columns) or (column in pks)
        ]
        if bad_columns:
            raise DatasetteError(
                "_nocol={} - invalid columns".format(", ".join(bad_columns)),
                status=400,
            )
        tmp_columns = [
            column
            for column in columns
            if column not in request.args.getlist("_nocol")
        ]
        columns = tmp_columns
    return columns
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
New pattern for views that return either JSON or HTML, available for plugins 648435885  
973568285 https://github.com/simonw/datasette/issues/878#issuecomment-973568285 https://api.github.com/repos/simonw/datasette/issues/878 IC_kwDOBm6k_c46B3kd simonw 9599 2021-11-19T00:29:20Z 2021-11-19T00:29:20Z OWNER

This is working!

```python
from datasette.utils.asgi import Response
from datasette import hookimpl
import html
from asyncinject import AsyncInject, inject


class Table(AsyncInject):
    @inject
    async def database(self, request):
        return request.url_vars["db_name"]

    @inject
    async def main(self, request, database):
        return Response.html("Database: {}".format(
            html.escape(database)
        ))

    async def view(self, request):
        return await self.main(request=request)


@hookimpl
def register_routes():
    return [
        (r"/t/(?P<db_name>[^/]+)/(?P<table_and_format>[^/]+?$)", Table().view),
    ]
```

This project will definitely show me if I actually like the `asyncinject` patterns or not.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
New pattern for views that return either JSON or HTML, available for plugins 648435885  
973564260 https://github.com/simonw/datasette/issues/878#issuecomment-973564260 https://api.github.com/repos/simonw/datasette/issues/878 IC_kwDOBm6k_c46B2lk simonw 9599 2021-11-19T00:27:06Z 2021-11-19T00:27:06Z OWNER

Problem: the fancy asyncinject stuff interferes with the fancy Datasette thing that introspects view functions to look for what parameters they take:

```python
class Table(asyncinject.AsyncInjectAll):
    async def view(self, request):
        return Response.html("Hello from {}".format(
            html.escape(repr(request.url_vars))
        ))


@hookimpl
def register_routes():
    return [
        (r"/t/(?P<db_name>[^/]+)/(?P<table_and_format>[^/]+?$)", Table().view),
    ]
```

This failed with error: "Table.view() takes 1 positional argument but 2 were given"

So I'm going to use AsyncInject and have the view function NOT use the @inject decorator.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
New pattern for views that return either JSON or HTML, available for plugins 648435885  
973554024 https://github.com/simonw/datasette/issues/878#issuecomment-973554024 https://api.github.com/repos/simonw/datasette/issues/878 IC_kwDOBm6k_c46B0Fo simonw 9599 2021-11-19T00:21:20Z 2021-11-19T00:21:20Z OWNER

That's annoying: it looks like plugins can't use register_routes() to over-ride default routes within Datasette itself. This didn't work:

```python
from datasette.utils.asgi import Response
from datasette import hookimpl
import html


async def table(request):
    return Response.html("Hello from {}".format(
        html.escape(repr(request.url_vars))
    ))


@hookimpl
def register_routes():
    return [
        (r"/(?P<db_name>[^/]+)/(?P<table_and_format>[^/]+?$)", table),
    ]
```

I'll use a `/t/` prefix for the moment, but this is probably something I'll fix in Datasette itself later.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
New pattern for views that return either JSON or HTML, available for plugins 648435885  
973542284 https://github.com/simonw/datasette/issues/878#issuecomment-973542284 https://api.github.com/repos/simonw/datasette/issues/878 IC_kwDOBm6k_c46BxOM simonw 9599 2021-11-19T00:16:44Z 2021-11-19T00:16:44Z OWNER

```
Development % cookiecutter gh:simonw/datasette-plugin
You've downloaded /Users/simon/.cookiecutters/datasette-plugin before. Is it okay to delete and re-download it? [yes]: yes
plugin_name []: table-new
description []: New implementation of TableView, see https://github.com/simonw/datasette/issues/878
hyphenated [table-new]:
underscored [table_new]:
github_username []: simonw
author_name []: Simon Willison
include_static_directory []:
include_templates_directory []:
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
New pattern for views that return either JSON or HTML, available for plugins 648435885  
973527870 https://github.com/simonw/datasette/issues/878#issuecomment-973527870 https://api.github.com/repos/simonw/datasette/issues/878 IC_kwDOBm6k_c46Bts- simonw 9599 2021-11-19T00:13:43Z 2021-11-19T00:13:43Z OWNER

New plan: I'm going to build a brand new implementation of TableView starting out as a plugin, using the register_routes() plugin hook.

It will reuse the existing HTML template but will be a completely new Python implementation, based on asyncinject.

I'm going to start by just getting the table to show up on the page - then I'll add faceting, suggested facets, filters and so on.

Bonus: I'm going to see if I can get it to work for arbitrary SQL queries too (stretch goal).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
New pattern for views that return either JSON or HTML, available for plugins 648435885  

Advanced export

JSON shape: default, array, newline-delimited, object

CSV options:

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
, [performed_via_github_app] TEXT);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);
Powered by Datasette · Queries took 1603.819ms · About: github-to-sqlite