5,261 rows where user = 9599 sorted by updated_at descending

id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions issue performed_via_github_app
882138084 https://github.com/simonw/datasette/issues/123#issuecomment-882138084 https://api.github.com/repos/simonw/datasette/issues/123 IC_kwDOBm6k_c40lFvk simonw 9599 2021-07-19T00:04:31Z 2021-07-19T00:04:31Z OWNER

I've been thinking more about this one today too. An extension of this (touched on in #417, Datasette Library) would be to support pointing Datasette at a directory and having it automatically load any CSV files it finds anywhere in that folder or its descendants - either loading them fully, or providing a UI that allows users to select a file to open it in Datasette.

For larger files I think the right thing to do is import them into an on-disk SQLite database, which is limited only by available disk space. For smaller files loading them into an in-memory database should work fine.
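A minimal sketch of that size-based heuristic, assuming a hypothetical threshold and using sqlite-utils rather than any actual Datasette loading code:

```python
import csv
import os

import sqlite_utils

THRESHOLD_BYTES = 100 * 1024 * 1024  # hypothetical cut-off, not from this issue


def load_csv(path):
    # Larger files go to an on-disk database, limited only by disk space;
    # smaller files load into a disposable in-memory database.
    if os.path.getsize(path) > THRESHOLD_BYTES:
        db = sqlite_utils.Database("imported.db")
    else:
        db = sqlite_utils.Database(memory=True)
    with open(path, newline="") as fp:
        db["data"].insert_all(csv.DictReader(fp))
    return db
```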

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Datasette serve should accept paths/URLs to CSVs and other file formats 275125561  
882052852 https://github.com/simonw/sqlite-utils/issues/297#issuecomment-882052852 https://api.github.com/repos/simonw/sqlite-utils/issues/297 IC_kwDOCGYnMM40kw70 simonw 9599 2021-07-18T12:59:20Z 2021-07-18T12:59:20Z OWNER

I'm not too worried about sqlite-utils memory because if your data is large enough that you can benefit from this optimization you probably should use a real file as opposed to a disposable memory database when analyzing it.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Option for importing CSV data using the SQLite .import mechanism 944846776  
882052693 https://github.com/simonw/sqlite-utils/issues/297#issuecomment-882052693 https://api.github.com/repos/simonw/sqlite-utils/issues/297 IC_kwDOCGYnMM40kw5V simonw 9599 2021-07-18T12:57:54Z 2021-07-18T12:57:54Z OWNER

Another implementation option would be to use the CSV virtual table mechanism. This could avoid shelling out to the sqlite3 binary, but requires solving the harder problem of compiling and distributing a loadable SQLite module: https://www.sqlite.org/csv.html

This would also help solve the challenge of making this optimization available to the sqlite-utils memory command. That command operates against an in-memory database so it's not obvious how it could shell out to a binary.
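For reference, a sketch of the usage pattern from the linked documentation, driven from Python; it assumes a compiled csv loadable module is available on disk, and shows the virtual table working against an in-memory database:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.enable_load_extension(True)
conn.load_extension("./csv")  # the compiled csv.c loadable module
conn.execute("CREATE VIRTUAL TABLE temp.t1 USING csv(filename='thefile.csv')")
print(conn.execute("select count(*) from t1").fetchone())
```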

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Option for importing CSV data using the SQLite .import mechanism 944846776  
881932880 https://github.com/simonw/datasette/issues/1199#issuecomment-881932880 https://api.github.com/repos/simonw/datasette/issues/1199 IC_kwDOBm6k_c40kTpQ simonw 9599 2021-07-17T17:39:17Z 2021-07-17T17:39:17Z OWNER

I asked about optimizing performance on the SQLite forum and this came up as a suggestion: https://sqlite.org/forum/forumpost/9a6b9ae8e2048c8b?t=c

I can start by trying this:

PRAGMA mmap_size=268435456;
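A sketch of what trying that could look like on a fresh connection (hypothetical wiring, not Datasette's actual connection code):

```python
import sqlite3

conn = sqlite3.connect("fixtures.db")
# 268435456 bytes = 256MB of memory-mapped I/O per database file
conn.execute("PRAGMA mmap_size=268435456")
```
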
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Experiment with PRAGMA mmap_size=N 792652391  
881686662 https://github.com/simonw/datasette/issues/1396#issuecomment-881686662 https://api.github.com/repos/simonw/datasette/issues/1396 IC_kwDOBm6k_c40jXiG simonw 9599 2021-07-16T20:02:44Z 2021-07-16T20:02:44Z OWNER

Confirmed fixed: 0.58.1 was successfully published to Docker Hub in https://github.com/simonw/datasette/runs/3089447346 and the latest tag on https://hub.docker.com/r/datasetteproject/datasette/tags was updated.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
"invalid reference format" publishing Docker image 944903881  
881677620 https://github.com/simonw/datasette/issues/1231#issuecomment-881677620 https://api.github.com/repos/simonw/datasette/issues/1231 IC_kwDOBm6k_c40jVU0 simonw 9599 2021-07-16T19:44:12Z 2021-07-16T19:44:12Z OWNER

That fixed the race condition in the datasette-graphql tests, which is the only place that I've been able to successfully replicate this. I'm going to land this change.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Race condition errors in new refresh_schemas() mechanism 811367257  
881674857 https://github.com/simonw/datasette/issues/1231#issuecomment-881674857 https://api.github.com/repos/simonw/datasette/issues/1231 IC_kwDOBm6k_c40jUpp simonw 9599 2021-07-16T19:38:39Z 2021-07-16T19:38:39Z OWNER

I can't replicate the race condition locally with or without this patch. I'm going to push the commit and then test the CI run from datasette-graphql that was failing against it.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Race condition errors in new refresh_schemas() mechanism 811367257  
881671706 https://github.com/simonw/datasette/issues/1231#issuecomment-881671706 https://api.github.com/repos/simonw/datasette/issues/1231 IC_kwDOBm6k_c40jT4a simonw 9599 2021-07-16T19:32:05Z 2021-07-16T19:32:05Z OWNER

The test suite passes with that change.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Race condition errors in new refresh_schemas() mechanism 811367257  
881668759 https://github.com/simonw/datasette/issues/1231#issuecomment-881668759 https://api.github.com/repos/simonw/datasette/issues/1231 IC_kwDOBm6k_c40jTKX simonw 9599 2021-07-16T19:27:46Z 2021-07-16T19:27:46Z OWNER

Second attempt at this:

```diff
diff --git a/datasette/app.py b/datasette/app.py
index 5976d8b..5f348cb 100644
--- a/datasette/app.py
+++ b/datasette/app.py
@@ -224,6 +224,7 @@ class Datasette:
         self.inspect_data = inspect_data
         self.immutables = set(immutables or [])
         self.databases = collections.OrderedDict()
+        self._refresh_schemas_lock = asyncio.Lock()
         self.crossdb = crossdb
         if memory or crossdb or not self.files:
             self.add_database(Database(self, is_memory=True), name="_memory")
@@ -332,6 +333,12 @@ class Datasette:
         self.client = DatasetteClient(self)

     async def refresh_schemas(self):
+        if self._refresh_schemas_lock.locked():
+            return
+        async with self._refresh_schemas_lock:
+            await self._refresh_schemas()
+
+    async def _refresh_schemas(self):
         internal_db = self.databases["_internal"]
         if not self.internal_db_created:
             await init_internal_db(internal_db)
```
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Race condition errors in new refresh_schemas() mechanism 811367257  
881665383 https://github.com/simonw/datasette/issues/1231#issuecomment-881665383 https://api.github.com/repos/simonw/datasette/issues/1231 IC_kwDOBm6k_c40jSVn simonw 9599 2021-07-16T19:21:35Z 2021-07-16T19:21:35Z OWNER

https://stackoverflow.com/a/25799871/6083 has a good example of using asyncio.Lock():

```python
import asyncio

import aiohttp

cache = {}
stuff_lock = asyncio.Lock()


async def get_stuff(url):
    async with stuff_lock:
        if url in cache:
            return cache[url]
        stuff = await aiohttp.request('GET', url)
        cache[url] = stuff
        return stuff
```
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Race condition errors in new refresh_schemas() mechanism 811367257  
881664408 https://github.com/simonw/datasette/issues/1231#issuecomment-881664408 https://api.github.com/repos/simonw/datasette/issues/1231 IC_kwDOBm6k_c40jSGY simonw 9599 2021-07-16T19:19:35Z 2021-07-16T19:19:35Z OWNER

The only place that calls refresh_schemas() is here: https://github.com/simonw/datasette/blob/dd5ee8e66882c94343cd3f71920878c6cfd0da41/datasette/views/base.py#L120-L124

Ideally only one call to refresh_schemas() would be running at any one time.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Race condition errors in new refresh_schemas() mechanism 811367257  
881663968 https://github.com/simonw/datasette/issues/1231#issuecomment-881663968 https://api.github.com/repos/simonw/datasette/issues/1231 IC_kwDOBm6k_c40jR_g simonw 9599 2021-07-16T19:18:42Z 2021-07-16T19:18:42Z OWNER

The race condition happens inside this method - initially with the call to await init_internal_db(): https://github.com/simonw/datasette/blob/dd5ee8e66882c94343cd3f71920878c6cfd0da41/datasette/app.py#L334-L359

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Race condition errors in new refresh_schemas() mechanism 811367257  
881204782 https://github.com/simonw/datasette/issues/1231#issuecomment-881204782 https://api.github.com/repos/simonw/datasette/issues/1231 IC_kwDOBm6k_c40hh4u simonw 9599 2021-07-16T06:14:12Z 2021-07-16T06:14:12Z OWNER

Here's the traceback I got from datasette-graphql (annoyingly only running the tests in GitHub Actions CI - I've not been able to replicate on my laptop yet):

tests/test_utils.py .                                                    [100%]

=================================== FAILURES ===================================
_________________________ test_graphql_examples[path0] _________________________

ds = <datasette.app.Datasette object at 0x7f6b8b6f8fd0>
path = PosixPath('/home/runner/work/datasette-graphql/datasette-graphql/examples/filters.md')

    @pytest.mark.asyncio
    @pytest.mark.parametrize(
        "path", (pathlib.Path(__file__).parent.parent / "examples").glob("*.md")
    )
    async def test_graphql_examples(ds, path):
        content = path.read_text()
        query = graphql_re.search(content)[1]
        try:
            variables = variables_re.search(content)[1]
        except TypeError:
            variables = "{}"
        expected = json.loads(json_re.search(content)[1])
        response = await ds.client.post(
            "/graphql",
            json={
                "query": query,
                "variables": json.loads(variables),
            },
        )
>       assert response.status_code == 200, response.json()
E       AssertionError: {'data': {'repos_arraycontains': None, 'users_contains': None, 'users_date': None, 'users_endswith': None, ...}, 'erro...", 'path': ['users_gt']}, {'locations': [{'column': 5, 'line': 34}], 'message': "'rows'", 'path': ['users_gte']}, ...]}
E       assert 500 == 200
E        +  where 500 = <Response [500 Internal Server Error]>.status_code

tests/test_graphql.py:142: AssertionError
----------------------------- Captured stderr call -----------------------------
table databases already exists
table databases already exists
table databases already exists
table databases already exists
table databases already exists
table databases already exists
table databases already exists
table databases already exists
table databases already exists
table databases already exists
table databases already exists
table databases already exists
table databases already exists
table databases already exists
table databases already exists
table databases already exists
table databases already exists
table databases already exists
table databases already exists
table databases already exists
table databases already exists
Traceback (most recent call last):
  File "/opt/hostedtoolcache/Python/3.7.11/x64/lib/python3.7/site-packages/datasette/app.py", line 1171, in route_path
    response = await view(request, send)
  File "/opt/hostedtoolcache/Python/3.7.11/x64/lib/python3.7/site-packages/datasette/views/base.py", line 151, in view
    request, **request.scope["url_route"]["kwargs"]
  File "/opt/hostedtoolcache/Python/3.7.11/x64/lib/python3.7/site-packages/datasette/views/base.py", line 123, in dispatch_request
    await self.ds.refresh_schemas()
  File "/opt/hostedtoolcache/Python/3.7.11/x64/lib/python3.7/site-packages/datasette/app.py", line 338, in refresh_schemas
    await init_internal_db(internal_db)
  File "/opt/hostedtoolcache/Python/3.7.11/x64/lib/python3.7/site-packages/datasette/utils/internal_db.py", line 16, in init_internal_db
    block=True,
  File "/opt/hostedtoolcache/Python/3.7.11/x64/lib/python3.7/site-packages/datasette/database.py", line 102, in execute_write
    return await self.execute_write_fn(_inner, block=block)
  File "/opt/hostedtoolcache/Python/3.7.11/x64/lib/python3.7/site-packages/datasette/database.py", line 118, in execute_write_fn
    raise result
  File "/opt/hostedtoolcache/Python/3.7.11/x64/lib/python3.7/site-packages/datasette/database.py", line 139, in _execute_writes
    result = task.fn(conn)
  File "/opt/hostedtoolcache/Python/3.7.11/x64/lib/python3.7/site-packages/datasette/database.py", line 100, in _inner
    return conn.execute(sql, params or [])
sqlite3.OperationalError: table databases already exists
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Race condition errors in new refresh_schemas() mechanism 811367257  
881204343 https://github.com/simonw/datasette/issues/1231#issuecomment-881204343 https://api.github.com/repos/simonw/datasette/issues/1231 IC_kwDOBm6k_c40hhx3 simonw 9599 2021-07-16T06:13:11Z 2021-07-16T06:13:11Z OWNER

This just broke the datasette-graphql test suite: https://github.com/simonw/datasette-graphql/issues/77 - I need to figure out a solution here.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Race condition errors in new refresh_schemas() mechanism 811367257  
881129149 https://github.com/simonw/datasette/issues/1394#issuecomment-881129149 https://api.github.com/repos/simonw/datasette/issues/1394 IC_kwDOBm6k_c40hPa9 simonw 9599 2021-07-16T02:23:32Z 2021-07-16T02:23:32Z OWNER

Wrote about this in the annotated release notes for 0.58: https://simonwillison.net/2021/Jul/16/datasette-058/

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Big performance boost on faceting: skip the inner order by 944870799  
881125124 https://github.com/simonw/datasette/issues/759#issuecomment-881125124 https://api.github.com/repos/simonw/datasette/issues/759 IC_kwDOBm6k_c40hOcE simonw 9599 2021-07-16T02:11:48Z 2021-07-16T02:11:54Z OWNER

I added "searchmode": "raw" as a supported option for table metadata in #1389 and released that in Datasette 0.58.
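The shape that option takes in metadata.json, with hypothetical database and table names:

```json
{
    "databases": {
        "mydatabase": {
            "tables": {
                "mytable": {
                    "searchmode": "raw"
                }
            }
        }
    }
}
```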

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
fts search on a column doesn't work anymore due to escape_fts 612673948  
880967052 https://github.com/simonw/datasette/issues/1396#issuecomment-880967052 https://api.github.com/repos/simonw/datasette/issues/1396 MDEyOklzc3VlQ29tbWVudDg4MDk2NzA1Mg== simonw 9599 2021-07-15T19:47:25Z 2021-07-15T19:47:25Z OWNER

Actually I'm going to close this now and re-open it if the problem occurs again in the future.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
"invalid reference format" publishing Docker image 944903881  
880900534 https://github.com/simonw/datasette/issues/1394#issuecomment-880900534 https://api.github.com/repos/simonw/datasette/issues/1394 MDEyOklzc3VlQ29tbWVudDg4MDkwMDUzNA== simonw 9599 2021-07-15T17:58:03Z 2021-07-15T17:58:03Z OWNER

Started a conversation about this on the SQLite forum: https://sqlite.org/forum/forumpost/2d76f2bcf65d256a?t=h

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Big performance boost on faceting: skip the inner order by 944870799  
880374156 https://github.com/simonw/datasette/issues/1396#issuecomment-880374156 https://api.github.com/repos/simonw/datasette/issues/1396 MDEyOklzc3VlQ29tbWVudDg4MDM3NDE1Ng== simonw 9599 2021-07-15T04:03:18Z 2021-07-15T04:03:18Z OWNER

I fixed datasette:latest by running the following on my laptop:

```
docker pull datasetteproject/datasette:0.58
docker tag datasetteproject/datasette:0.58 datasetteproject/datasette:latest
docker login -u datasetteproject -p ...
docker push datasetteproject/datasette:latest
```

Confirmed on https://hub.docker.com/r/datasetteproject/datasette/tags?page=1&ordering=last_updated that datasette:latest and datasette:0.58 both now have the same digest of 3b5ba478040e.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
"invalid reference format" publishing Docker image 944903881  
880372149 https://github.com/simonw/datasette/issues/1396#issuecomment-880372149 https://api.github.com/repos/simonw/datasette/issues/1396 MDEyOklzc3VlQ29tbWVudDg4MDM3MjE0OQ== simonw 9599 2021-07-15T03:56:49Z 2021-07-15T03:56:49Z OWNER

I'm going to leave this open until I next successfully publish a new version.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
"invalid reference format" publishing Docker image 944903881  
880326049 https://github.com/simonw/datasette/issues/1396#issuecomment-880326049 https://api.github.com/repos/simonw/datasette/issues/1396 MDEyOklzc3VlQ29tbWVudDg4MDMyNjA0OQ== simonw 9599 2021-07-15T01:50:05Z 2021-07-15T01:50:05Z OWNER
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
"invalid reference format" publishing Docker image 944903881  
880325362 https://github.com/simonw/datasette/issues/1396#issuecomment-880325362 https://api.github.com/repos/simonw/datasette/issues/1396 MDEyOklzc3VlQ29tbWVudDg4MDMyNTM2Mg== simonw 9599 2021-07-15T01:48:11Z 2021-07-15T01:48:11Z OWNER
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
"invalid reference format" publishing Docker image 944903881  
880325004 https://github.com/simonw/datasette/issues/1396#issuecomment-880325004 https://api.github.com/repos/simonw/datasette/issues/1396 MDEyOklzc3VlQ29tbWVudDg4MDMyNTAwNA== simonw 9599 2021-07-15T01:47:17Z 2021-07-15T01:47:17Z OWNER

This is the part of the publish workflow that failed and threw the "invalid reference format" error: https://github.com/simonw/datasette/blob/084cfe1e00e1a4c0515390a513aca286eeea20c2/.github/workflows/publish.yml#L100-L119

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
"invalid reference format" publishing Docker image 944903881  
880324637 https://github.com/simonw/datasette/issues/1396#issuecomment-880324637 https://api.github.com/repos/simonw/datasette/issues/1396 MDEyOklzc3VlQ29tbWVudDg4MDMyNDYzNw== simonw 9599 2021-07-15T01:46:26Z 2021-07-15T01:46:26Z OWNER

I manually published the Docker image using https://github.com/simonw/datasette/actions/workflows/push_docker_tag.yml https://github.com/simonw/datasette/runs/3072505126

The 0.58 release shows up on https://hub.docker.com/r/datasetteproject/datasette/tags?page=1&ordering=last_updated now - BUT the latest tag still points to a version from a month ago.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
"invalid reference format" publishing Docker image 944903881  
880287483 https://github.com/simonw/datasette/issues/1394#issuecomment-880287483 https://api.github.com/repos/simonw/datasette/issues/1394 MDEyOklzc3VlQ29tbWVudDg4MDI4NzQ4Mw== simonw 9599 2021-07-15T00:01:47Z 2021-07-15T00:01:47Z OWNER

I wrote this code:

```python
import re

import pytest

_order_by_re = re.compile(r"(^.*) order by [a-zA-Z_][a-zA-Z0-9_]+( desc)?$", re.DOTALL)
_order_by_braces_re = re.compile(r"(^.*) order by \[[^\]]+\]( desc)?$", re.DOTALL)


def strip_order_by(sql):
    for regex in (_order_by_re, _order_by_braces_re):
        match = regex.match(sql)
        if match is not None:
            return match.group(1)
    return sql


@pytest.mark.parametrize(
    "sql,expected",
    [
        ("blah", "blah"),
        ("select * from foo", "select * from foo"),
        ("select * from foo order by bah", "select * from foo"),
        ("select * from foo order by bah desc", "select * from foo"),
        ("select * from foo order by [select]", "select * from foo"),
        ("select * from foo order by [select] desc", "select * from foo"),
    ],
)
def test_strip_order_by(sql, expected):
    assert strip_order_by(sql) == expected
```

But it turns out I don't need it! The SQL that is passed to the facet class is created by this code: https://github.com/simonw/datasette/blob/ba11ef27edd6981eeb26d7ecf5aa236707f5f8ce/datasette/views/table.py#L677-L684

And the only place that uses that sql_no_limit variable is here: https://github.com/simonw/datasette/blob/ba11ef27edd6981eeb26d7ecf5aa236707f5f8ce/datasette/views/table.py#L733-L745

So I can change that to sql_no_limit_no_order and fix the bug that way instead.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Big performance boost on faceting: skip the inner order by 944870799  
880278256 https://github.com/simonw/datasette/issues/1394#issuecomment-880278256 https://api.github.com/repos/simonw/datasette/issues/1394 MDEyOklzc3VlQ29tbWVudDg4MDI3ODI1Ng== simonw 9599 2021-07-14T23:35:18Z 2021-07-14T23:35:18Z OWNER

The challenge here is that faceting doesn't currently modify the inner SQL at all - it wraps it so that it can work against any SQL statement (though Datasette itself does not yet take advantage of that ability, only offering faceting on table pages).

So just removing the order by wouldn't be appropriate if the inner query looked something like this:

select * from items order by created desc limit 100

Since the intent there would be to return facet counts against only the most recent 100 items.

In SQLite the limit has to come after the order by though, so the fix here could be as easy as using a regular expression to identify queries that end with order by COLUMN (desc)? and stripping off that clause.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Big performance boost on faceting: skip the inner order by 944870799  
880259255 https://github.com/simonw/sqlite-utils/issues/297#issuecomment-880259255 https://api.github.com/repos/simonw/sqlite-utils/issues/297 MDEyOklzc3VlQ29tbWVudDg4MDI1OTI1NQ== simonw 9599 2021-07-14T22:48:41Z 2021-07-14T22:48:41Z OWNER

Should also take advantage of .mode tabs to support sqlite-utils insert blah.db blah blah.csv --tsv --fast
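A sketch of what that --fast path could shell out to for TSV, with hypothetical file and table names:

```python
import subprocess

# .mode tabs makes .import treat the input as tab-separated
subprocess.run(
    ["sqlite3", "blah.db", ".mode tabs", ".import blah.tsv blah"],
    check=True,
)
```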

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Option for importing CSV data using the SQLite .import mechanism 944846776  
880257587 https://github.com/simonw/sqlite-utils/issues/297#issuecomment-880257587 https://api.github.com/repos/simonw/sqlite-utils/issues/297 MDEyOklzc3VlQ29tbWVudDg4MDI1NzU4Nw== simonw 9599 2021-07-14T22:44:05Z 2021-07-14T22:44:05Z OWNER

https://unix.stackexchange.com/a/642364 suggests you can also use this to import from stdin, like so:

sqlite3 -csv $database_file_name ".import '|cat -' $table_name"

Here the sqlite3 -csv is an alternative to using .mode csv.
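A sketch of driving that from Python: the spawned cat - inherits sqlite3's stdin, so piping the CSV into the sqlite3 process feeds the import (hypothetical names):

```python
import subprocess

with open("blah.csv", "rb") as fp:
    subprocess.run(
        ["sqlite3", "-csv", "blah.db", ".import '|cat -' blah"],
        stdin=fp,
        check=True,
    )
```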

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Option for importing CSV data using the SQLite .import mechanism 944846776  
880256865 https://github.com/simonw/sqlite-utils/issues/297#issuecomment-880256865 https://api.github.com/repos/simonw/sqlite-utils/issues/297 MDEyOklzc3VlQ29tbWVudDg4MDI1Njg2NQ== simonw 9599 2021-07-14T22:42:11Z 2021-07-14T22:42:11Z OWNER

Potential workaround for the missing --skip implementation is that the filename can be a command instead, so maybe it could shell out to `tail -n +2 filename` to skip the header row (see the sketch after the quoted docs):

> The source argument is the name of a file to be read or, if it begins with a "|" character, specifies a command which will be run to produce the input CSV data.
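
A sketch of that workaround, using hypothetical file and table names; `tail -n +2` emits everything from the second line onwards:

```python
import subprocess

subprocess.run(
    ["sqlite3", "-csv", "blah.db", ".import '|tail -n +2 blah.csv' blah"],
    check=True,
)
```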

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Option for importing CSV data using the SQLite .import mechanism 944846776  
880256058 https://github.com/simonw/sqlite-utils/issues/297#issuecomment-880256058 https://api.github.com/repos/simonw/sqlite-utils/issues/297 MDEyOklzc3VlQ29tbWVudDg4MDI1NjA1OA== simonw 9599 2021-07-14T22:40:01Z 2021-07-14T22:40:47Z OWNER

Full docs here: https://www.sqlite.org/draft/cli.html#csv

One catch: how this works has changed in recent SQLite versions: https://www.sqlite.org/changes.html

  • 2020-12-01 (3.34.0) - "Table name quoting works correctly for the .import dot-command"
  • 2020-05-22 (3.32.0) - "Add options to the .import command: --csv, --ascii, --skip"
  • 2017-08-01 (3.20.0) - "The .import command ignores an initial UTF-8 BOM."

The "skip" feature is particularly important to understand. https://www.sqlite.org/draft/cli.html#csv says:

> There are two cases to consider: (1) Table "tab1" does not previously exist and (2) table "tab1" does already exist.

> In the first case, when the table does not previously exist, the table is automatically created and the content of the first row of the input CSV file is used to determine the name of all the columns in the table. In other words, if the table does not previously exist, the first row of the CSV file is interpreted to be column names and the actual data starts on the second row of the CSV file.

> For the second case, when the table already exists, every row of the CSV file, including the first row, is assumed to be actual content. If the CSV file contains an initial row of column labels, you can cause the .import command to skip that initial row using the "--skip 1" option.

But the --skip 1 option is only available in 3.32.0 and higher.
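For 3.32.0 and higher, a sketch of the direct form (hypothetical file and table names):

```python
import subprocess

# --csv forces CSV interpretation; --skip 1 drops the header row
subprocess.run(
    ["sqlite3", "blah.db", ".import --csv --skip 1 blah.csv blah"],
    check=True,
)
```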

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Option for importing CSV data using the SQLite .import mechanism 944846776  
880153069 https://github.com/simonw/datasette/issues/268#issuecomment-880153069 https://api.github.com/repos/simonw/datasette/issues/268 MDEyOklzc3VlQ29tbWVudDg4MDE1MzA2OQ== simonw 9599 2021-07-14T19:31:00Z 2021-07-14T19:31:00Z OWNER

... though interestingly I can't replicate that error on latest.datasette.io - https://latest.datasette.io/fixtures/searchable?_search=park.&_searchmode=raw

That's running https://latest.datasette.io/-/versions SQLite 3.35.4 whereas https://www.niche-museums.com/-/versions is running 3.27.2 (the most recent version available with Vercel) - but there's nothing in the SQLite changelog between those two versions that suggests changes to how the FTS5 parser works. https://www.sqlite.org/changes.html

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Mechanism for ranking results from SQLite full-text search 323718842  
880150755 https://github.com/simonw/datasette/issues/268#issuecomment-880150755 https://api.github.com/repos/simonw/datasette/issues/268 MDEyOklzc3VlQ29tbWVudDg4MDE1MDc1NQ== simonw 9599 2021-07-14T19:26:47Z 2021-07-14T19:29:08Z OWNER

> What are the side-effects of turning that on in the query string, or even by default as you suggested? I see that you stated in the docs... "to ensure they do not cause any confusion for users who are not aware of them", but I'm not sure what those could be.

Mainly that it's possible to generate SQL queries that crash with an error. This was the example that convinced me to default to escaping:

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Mechanism for ranking results from SQLite full-text search 323718842  
879477586 https://github.com/dogsheep/healthkit-to-sqlite/issues/12#issuecomment-879477586 https://api.github.com/repos/dogsheep/healthkit-to-sqlite/issues/12 MDEyOklzc3VlQ29tbWVudDg3OTQ3NzU4Ng== simonw 9599 2021-07-13T23:50:06Z 2021-07-13T23:50:06Z MEMBER

Unfortunately I don't think updating the database is practical, because the export doesn't include unique identifiers which can be used to update existing records and create new ones. Recreating from scratch works around that limitation.

I've not explored workouts with SpatiaLite but that's a really good idea.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Some workout columns should be float, not text 727848625  
879309636 https://github.com/simonw/datasette/pull/1393#issuecomment-879309636 https://api.github.com/repos/simonw/datasette/issues/1393 MDEyOklzc3VlQ29tbWVudDg3OTMwOTYzNg== simonw 9599 2021-07-13T18:32:25Z 2021-07-13T18:32:25Z OWNER

Thanks

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Update deploying.rst 941412189  
879277953 https://github.com/simonw/datasette/pull/1392#issuecomment-879277953 https://api.github.com/repos/simonw/datasette/issues/1392 MDEyOklzc3VlQ29tbWVudDg3OTI3Nzk1Mw== simonw 9599 2021-07-13T17:42:31Z 2021-07-13T17:42:31Z OWNER

Thanks!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Update deploying.rst 941403676  
877835171 https://github.com/simonw/datasette/issues/511#issuecomment-877835171 https://api.github.com/repos/simonw/datasette/issues/511 MDEyOklzc3VlQ29tbWVudDg3NzgzNTE3MQ== simonw 9599 2021-07-11T17:23:05Z 2021-07-11T17:23:05Z OWNER
== 87 failed, 819 passed, 7 skipped, 29 errors in 2584.85s (0:43:04) ==

https://github.com/simonw/datasette/runs/3038188870?check_suite_focus=true

Full copy of log here: https://gist.github.com/simonw/4b1fdd24496b989fca56bc757be345ad

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Get Datasette tests passing on Windows in GitHub Actions 456578474  
877726495 https://github.com/simonw/datasette/issues/511#issuecomment-877726495 https://api.github.com/repos/simonw/datasette/issues/511 MDEyOklzc3VlQ29tbWVudDg3NzcyNjQ5NQ== simonw 9599 2021-07-11T01:32:27Z 2021-07-11T01:32:27Z OWNER

I'm using pytest-xdist and this:

pytest -n auto -m "not serial"

I'll try not using the -n auto bit on Windows and see if that helps.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Get Datasette tests passing on Windows in GitHub Actions 456578474  
877726288 https://github.com/simonw/datasette/issues/511#issuecomment-877726288 https://api.github.com/repos/simonw/datasette/issues/511 MDEyOklzc3VlQ29tbWVudDg3NzcyNjI4OA== simonw 9599 2021-07-11T01:29:41Z 2021-07-11T01:29:41Z OWNER

Lots of errors that look like this:

2021-07-11T00:40:32.1189321Z E           NotADirectoryError: [WinError 267] The directory name is invalid: 'C:\\Users\\RUNNER~1\\AppData\\Local\\Temp\\tmpdr41pgwg\\data.db'
2021-07-11T00:40:32.1190083Z 
2021-07-11T00:40:32.1191128Z c:\hostedtoolcache\windows\python\3.8.10\x64\lib\shutil.py:596: NotADirectoryError
2021-07-11T00:40:32.1191999Z ___________________ ERROR at teardown of test_insert_error ____________________
2021-07-11T00:40:32.1192842Z [gw1] win32 -- Python 3.8.10 c:\hostedtoolcache\windows\python\3.8.10\x64\python.exe
2021-07-11T00:40:32.1193387Z 
2021-07-11T00:40:32.1193930Z path = 'C:\\Users\\RUNNER~1\\AppData\\Local\\Temp\\tmpry729pq_'
2021-07-11T00:40:32.1194876Z onerror = <function TemporaryDirectory._rmtree.<locals>.onerror at 0x00000291FCEA93A0>
2021-07-11T00:40:32.1195480Z 
2021-07-11T00:40:32.1195927Z     def _rmtree_unsafe(path, onerror):
2021-07-11T00:40:32.1196435Z         try:
2021-07-11T00:40:32.1196910Z             with os.scandir(path) as scandir_it:
2021-07-11T00:40:32.1197504Z                 entries = list(scandir_it)
2021-07-11T00:40:32.1198002Z         except OSError:
2021-07-11T00:40:32.1198607Z             onerror(os.scandir, path, sys.exc_info())
2021-07-11T00:40:32.1199137Z             entries = []
2021-07-11T00:40:32.1199637Z         for entry in entries:
2021-07-11T00:40:32.1200184Z             fullname = entry.path
2021-07-11T00:40:32.1200692Z             if _rmtree_isdir(entry):
2021-07-11T00:40:32.1201198Z                 try:
2021-07-11T00:40:32.1201643Z                     if entry.is_symlink():
2021-07-11T00:40:32.1202280Z                         # This can only happen if someone replaces
2021-07-11T00:40:32.1202944Z                         # a directory with a symlink after the call to
2021-07-11T00:40:32.1203623Z                         # os.scandir or entry.is_dir above.
2021-07-11T00:40:32.1204303Z                         raise OSError("Cannot call rmtree on a symbolic link")
2021-07-11T00:40:32.1204942Z                 except OSError:
2021-07-11T00:40:32.1206416Z                     onerror(os.path.islink, fullname, sys.exc_info())
2021-07-11T00:40:32.1207022Z                     continue
2021-07-11T00:40:32.1207584Z                 _rmtree_unsafe(fullname, onerror)
2021-07-11T00:40:32.1208074Z             else:
2021-07-11T00:40:32.1208496Z                 try:
2021-07-11T00:40:32.1208926Z >                   os.unlink(fullname)
2021-07-11T00:40:32.1210053Z E                   PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\\Users\\RUNNER~1\\AppData\\Local\\Temp\\tmpry729pq_\\data.db'
2021-07-11T00:40:32.1210974Z 
2021-07-11T00:40:32.1211638Z c:\hostedtoolcache\windows\python\3.8.10\x64\lib\shutil.py:616: PermissionError
2021-07-11T00:40:32.1212211Z 
2021-07-11T00:40:32.1212846Z During handling of the above exception, another exception occurred:
2021-07-11T00:40:32.1213320Z 
2021-07-11T00:40:32.1213797Z func = <built-in function unlink>
2021-07-11T00:40:32.1214529Z path = 'C:\\Users\\RUNNER~1\\AppData\\Local\\Temp\\tmpry729pq_\\data.db'
2021-07-11T00:40:32.1215763Z exc_info = (<class 'PermissionError'>, PermissionError(13, 'The process cannot access the file because it is being used by another process'), <traceback object at 0x00000291FB4D7040>)
2021-07-11T00:40:32.1217263Z 
2021-07-11T00:40:32.1217777Z     def onerror(func, path, exc_info):
2021-07-11T00:40:32.1218421Z         if issubclass(exc_info[0], PermissionError):
2021-07-11T00:40:32.1219079Z             def resetperms(path):
2021-07-11T00:40:32.1219518Z                 try:
2021-07-11T00:40:32.1219992Z                     _os.chflags(path, 0)
2021-07-11T00:40:32.1220535Z                 except AttributeError:
2021-07-11T00:40:32.1221110Z                     pass
2021-07-11T00:40:32.1221545Z                 _os.chmod(path, 0o700)
2021-07-11T00:40:32.1221984Z     
2021-07-11T00:40:32.1222330Z             try:
2021-07-11T00:40:32.1222768Z                 if path != name:
2021-07-11T00:40:32.1223332Z                     resetperms(_os.path.dirname(path))
2021-07-11T00:40:32.1223963Z                 resetperms(path)
2021-07-11T00:40:32.1224408Z     
2021-07-11T00:40:32.1224749Z                 try:
2021-07-11T00:40:32.1225954Z >                   _os.unlink(path)
2021-07-11T00:40:32.1227032Z E                   PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\\Users\\RUNNER~1\\AppData\\Local\\Temp\\tmpry729pq_\\data.db'
2021-07-11T00:40:32.1227927Z 
2021-07-11T00:40:32.1228646Z c:\hostedtoolcache\windows\python\3.8.10\x64\lib\tempfile.py:802: PermissionError
2021-07-11T00:40:32.1229200Z 
2021-07-11T00:40:32.1229842Z During handling of the above exception, another exception occurred:
2021-07-11T00:40:32.1230355Z 
2021-07-11T00:40:32.1230783Z     @pytest.fixture
2021-07-11T00:40:32.1231322Z     def canned_write_client():
2021-07-11T00:40:32.1231805Z         with make_app_client(
2021-07-11T00:40:32.1232467Z             extra_databases={"data.db": "create table names (name text)"},
2021-07-11T00:40:32.1233104Z             metadata={
2021-07-11T00:40:32.1233535Z                 "databases": {
2021-07-11T00:40:32.1233989Z                     "data": {
2021-07-11T00:40:32.1234416Z                         "queries": {
2021-07-11T00:40:32.1235001Z                             "canned_read": {"sql": "select * from names"},
2021-07-11T00:40:32.1235527Z                             "add_name": {
2021-07-11T00:40:32.1236117Z                                 "sql": "insert into names (name) values (:name)",
2021-07-11T00:40:32.1236686Z                                 "write": True,
2021-07-11T00:40:32.1237317Z                                 "on_success_redirect": "/data/add_name?success",
2021-07-11T00:40:32.1237882Z                             },
2021-07-11T00:40:32.1238331Z                             "add_name_specify_id": {
2021-07-11T00:40:32.1239009Z                                 "sql": "insert into names (rowid, name) values (:rowid, :name)",
2021-07-11T00:40:32.1239610Z                                 "write": True,
2021-07-11T00:40:32.1240259Z                                 "on_error_redirect": "/data/add_name_specify_id?error",
2021-07-11T00:40:32.1240839Z                             },
2021-07-11T00:40:32.1241320Z                             "delete_name": {
2021-07-11T00:40:32.1242504Z                                 "sql": "delete from names where rowid = :rowid",
2021-07-11T00:40:32.1243127Z                                 "write": True,
2021-07-11T00:40:32.1243721Z                                 "on_success_message": "Name deleted",
2021-07-11T00:40:32.1244282Z                                 "allow": {"id": "root"},
2021-07-11T00:40:32.1244749Z                             },
2021-07-11T00:40:32.1245959Z                             "update_name": {
2021-07-11T00:40:32.1246614Z                                 "sql": "update names set name = :name where rowid = :rowid",
2021-07-11T00:40:32.1247267Z                                 "params": ["rowid", "name", "extra"],
2021-07-11T00:40:32.1247828Z                                 "write": True,
2021-07-11T00:40:32.1248247Z                             },
2021-07-11T00:40:32.1248653Z                         }
2021-07-11T00:40:32.1249166Z                     }
2021-07-11T00:40:32.1249577Z                 }
2021-07-11T00:40:32.1249962Z             },
2021-07-11T00:40:32.1250333Z         ) as client:
2021-07-11T00:40:32.1250822Z >           yield client
2021-07-11T00:40:32.1251078Z 
2021-07-11T00:40:32.1251678Z D:\a\datasette\datasette\tests\test_canned_queries.py:43: 
2021-07-11T00:40:32.1252347Z _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
2021-07-11T00:40:32.1253040Z c:\hostedtoolcache\windows\python\3.8.10\x64\lib\contextlib.py:120: in __exit__
2021-07-11T00:40:32.1253759Z     next(self.gen)
2021-07-11T00:40:32.1254398Z D:\a\datasette\datasette\tests\fixtures.py:156: in make_app_client
2021-07-11T00:40:32.1255098Z     yield TestClient(ds)
2021-07-11T00:40:32.1255796Z c:\hostedtoolcache\windows\python\3.8.10\x64\lib\tempfile.py:827: in __exit__
2021-07-11T00:40:32.1256510Z     self.cleanup()
2021-07-11T00:40:32.1257200Z c:\hostedtoolcache\windows\python\3.8.10\x64\lib\tempfile.py:831: in cleanup
2021-07-11T00:40:32.1257961Z     self._rmtree(self.name)
2021-07-11T00:40:32.1258712Z c:\hostedtoolcache\windows\python\3.8.10\x64\lib\tempfile.py:813: in _rmtree
2021-07-11T00:40:32.1259487Z     _shutil.rmtree(name, onerror=onerror)
2021-07-11T00:40:32.1260280Z c:\hostedtoolcache\windows\python\3.8.10\x64\lib\shutil.py:740: in rmtree
2021-07-11T00:40:32.1261039Z     return _rmtree_unsafe(path, onerror)
2021-07-11T00:40:32.1261843Z c:\hostedtoolcache\windows\python\3.8.10\x64\lib\shutil.py:618: in _rmtree_unsafe
2021-07-11T00:40:32.1262633Z     onerror(os.unlink, fullname, sys.exc_info())
2021-07-11T00:40:32.1263456Z c:\hostedtoolcache\windows\python\3.8.10\x64\lib\tempfile.py:805: in onerror
2021-07-11T00:40:32.1264175Z     cls._rmtree(path)
2021-07-11T00:40:32.1264848Z c:\hostedtoolcache\windows\python\3.8.10\x64\lib\tempfile.py:813: in _rmtree
2021-07-11T00:40:32.1266329Z     _shutil.rmtree(name, onerror=onerror)
2021-07-11T00:40:32.1267082Z c:\hostedtoolcache\windows\python\3.8.10\x64\lib\shutil.py:740: in rmtree
2021-07-11T00:40:32.1267858Z     return _rmtree_unsafe(path, onerror)
2021-07-11T00:40:32.1268615Z c:\hostedtoolcache\windows\python\3.8.10\x64\lib\shutil.py:599: in _rmtree_unsafe
2021-07-11T00:40:32.1269440Z     onerror(os.scandir, path, sys.exc_info())
2021-07-11T00:40:32.1269979Z _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
2021-07-11T00:40:32.1270287Z 
2021-07-11T00:40:32.1270947Z path = 'C:\\Users\\RUNNER~1\\AppData\\Local\\Temp\\tmpry729pq_\\data.db'
2021-07-11T00:40:32.1273356Z onerror = <function TemporaryDirectory._rmtree.<locals>.onerror at 0x00000291FCF40E50>
2021-07-11T00:40:32.1273999Z 
2021-07-11T00:40:32.1274493Z     def _rmtree_unsafe(path, onerror):
2021-07-11T00:40:32.1274953Z         try:
2021-07-11T00:40:32.1275461Z >           with os.scandir(path) as scandir_it:
2021-07-11T00:40:32.1276459Z E           NotADirectoryError: [WinError 267] The directory name is invalid: 'C:\\Users\\RUNNER~1\\AppData\\Local\\Temp\\tmpry729pq_\\data.db'
2021-07-11T00:40:32.1277220Z 
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Get Datasette tests passing on Windows in GitHub Actions 456578474  
877725742 https://github.com/simonw/datasette/issues/511#issuecomment-877725742 https://api.github.com/repos/simonw/datasette/issues/511 MDEyOklzc3VlQ29tbWVudDg3NzcyNTc0Mg== simonw 9599 2021-07-11T01:25:01Z 2021-07-11T01:26:38Z OWNER

That's weird. https://github.com/simonw/datasette/runs/3037862798 finished running and came up green - but actually a TON of the tests failed on Windows. Not sure why that didn't fail the whole test suite:

https://user-images.githubusercontent.com/9599/125180192-12257000-e1ac-11eb-8657-d46b7bcdc1b2.png

Also the test suite took 50 minutes on Windows!

Here's a copy of the full log file for the tests on Python 3.8 on Windows: https://gist.github.com/simonw/2900ef33693c1bbda09188eb31c8212d

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Get Datasette tests passing on Windows in GitHub Actions 456578474  
877725193 https://github.com/simonw/datasette/issues/1388#issuecomment-877725193 https://api.github.com/repos/simonw/datasette/issues/1388 MDEyOklzc3VlQ29tbWVudDg3NzcyNTE5Mw== simonw 9599 2021-07-11T01:18:38Z 2021-07-11T01:18:38Z OWNER
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Serve using UNIX domain socket 939051549  
877721003 https://github.com/simonw/datasette/issues/1388#issuecomment-877721003 https://api.github.com/repos/simonw/datasette/issues/1388 MDEyOklzc3VlQ29tbWVudDg3NzcyMTAwMw== simonw 9599 2021-07-11T00:21:19Z 2021-07-11T00:21:19Z OWNER
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Serve using UNIX domain socket 939051549  
877718364 https://github.com/simonw/datasette/issues/511#issuecomment-877718364 https://api.github.com/repos/simonw/datasette/issues/511 MDEyOklzc3VlQ29tbWVudDg3NzcxODM2NA== simonw 9599 2021-07-10T23:54:37Z 2021-07-10T23:54:37Z OWNER

Looks like it's not even 10% of the way through, and already a bunch of errors:

https://user-images.githubusercontent.com/9599/125179059-81e12e00-e19f-11eb-94d9-0f2d9ce8afad.png

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Get Datasette tests passing on Windows in GitHub Actions 456578474  
877718286 https://github.com/simonw/datasette/issues/511#issuecomment-877718286 https://api.github.com/repos/simonw/datasette/issues/511 MDEyOklzc3VlQ29tbWVudDg3NzcxODI4Ng== simonw 9599 2021-07-10T23:53:29Z 2021-07-10T23:53:29Z OWNER
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Get Datasette tests passing on Windows in GitHub Actions 456578474  
877717791 https://github.com/simonw/datasette/issues/511#issuecomment-877717791 https://api.github.com/repos/simonw/datasette/issues/511 MDEyOklzc3VlQ29tbWVudDg3NzcxNzc5MQ== simonw 9599 2021-07-10T23:45:35Z 2021-07-10T23:45:35Z OWNER

> Trying to run on Windows today, I get an error from the utils/asgi.py module.
>
> It's trying `from os import EX_CANTCREAT` which is Unix-only. I commented this line out, and (so far) it's working.

Good news: that line was removed in #1094.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Get Datasette tests passing on Windows in GitHub Actions 456578474  
877717392 https://github.com/simonw/datasette/pull/557#issuecomment-877717392 https://api.github.com/repos/simonw/datasette/issues/557 MDEyOklzc3VlQ29tbWVudDg3NzcxNzM5Mg== simonw 9599 2021-07-10T23:39:48Z 2021-07-10T23:39:48Z OWNER

Abandoning this - need to switch to using GitHub Actions for this instead.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Get tests running on Windows using Travis CI 466996584  
877717262 https://github.com/simonw/datasette/issues/1388#issuecomment-877717262 https://api.github.com/repos/simonw/datasette/issues/1388 MDEyOklzc3VlQ29tbWVudDg3NzcxNzI2Mg== simonw 9599 2021-07-10T23:37:54Z 2021-07-10T23:37:54Z OWNER

I wonder if --fd is worth supporting too?

I'm going to hold off on implementing this until someone asks for it.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Serve using UNIX domain socket 939051549  
877716993 https://github.com/simonw/datasette/issues/1388#issuecomment-877716993 https://api.github.com/repos/simonw/datasette/issues/1388 MDEyOklzc3VlQ29tbWVudDg3NzcxNjk5Mw== simonw 9599 2021-07-10T23:34:02Z 2021-07-10T23:34:02Z OWNER

Figured out an example nginx configuration. This in nginx.conf:

```nginx
daemon off;
events {
  worker_connections  1024;
}
http {
  server {
    listen 8092;
    location / {
      proxy_pass              http://datasette;
      proxy_set_header        X-Real-IP $remote_addr;
      proxy_set_header        X-Forwarded-For $proxy_add_x_forwarded_for;
    }
  }
  upstream datasette {
    server unix:/tmp/datasette.sock;
  }
}
```

Then run `datasette --uds /tmp/datasette.sock`

Then run nginx like this:

```
nginx -c ./nginx.conf
```

Then hits to http://localhost:8092/ will be proxied to Datasette.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Serve using UNIX domain socket 939051549  
877716359 https://github.com/simonw/datasette/issues/1388#issuecomment-877716359 https://api.github.com/repos/simonw/datasette/issues/1388 MDEyOklzc3VlQ29tbWVudDg3NzcxNjM1OQ== simonw 9599 2021-07-10T23:24:58Z 2021-07-10T23:24:58Z OWNER

Apparently Windows 10 has Unix domain socket support: https://bugs.python.org/issue33408

> Unix socket (AF_UNIX) is now avalible in Windows 10 (April 2018 Update). Please add Python support for it.
> More details about it on https://blogs.msdn.microsoft.com/commandline/2017/12/19/af_unix-comes-to-windows/

But it's not clear if this is going to work. That same issue thread (the issue is still open) suggests using hasattr(socket, 'AF_UNIX')) to detect support in tests.
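A sketch of that detection applied as a pytest skip marker:

```python
import socket

import pytest


@pytest.mark.skipif(
    not hasattr(socket, "AF_UNIX"),
    reason="Unix domain sockets are not supported on this platform",
)
def test_serve_unix_domain_socket():
    ...  # exercise datasette --uds against a temporary socket path here
```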

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Serve using UNIX domain socket 939051549  
877716156 https://github.com/simonw/datasette/issues/1388#issuecomment-877716156 https://api.github.com/repos/simonw/datasette/issues/1388 MDEyOklzc3VlQ29tbWVudDg3NzcxNjE1Ng== simonw 9599 2021-07-10T23:22:21Z 2021-07-10T23:22:21Z OWNER

I don't have the Datasette test suite running on Windows yet, but I'd like it to run there some day - so ideally this test would be skipped if Unix domain sockets are not supported by the underlying operating system.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Serve using UNIX domain socket 939051549  
877715654 https://github.com/simonw/datasette/issues/1388#issuecomment-877715654 https://api.github.com/repos/simonw/datasette/issues/1388 MDEyOklzc3VlQ29tbWVudDg3NzcxNTY1NA== simonw 9599 2021-07-10T23:15:06Z 2021-07-10T23:15:06Z OWNER

I can run tests against it using httpx: https://www.python-httpx.org/advanced/#usage_1

```python
import httpx

# Connect to the Docker API via a Unix Socket.
transport = httpx.HTTPTransport(uds="/var/run/docker.sock")
client = httpx.Client(transport=transport)
response = client.get("http://docker/info")
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Serve using UNIX domain socket 939051549  
877714698 https://github.com/simonw/datasette/issues/1388#issuecomment-877714698 https://api.github.com/repos/simonw/datasette/issues/1388 MDEyOklzc3VlQ29tbWVudDg3NzcxNDY5OA== simonw 9599 2021-07-10T23:01:37Z 2021-07-10T23:01:37Z OWNER

Can test this with:

curl --unix-socket ${socket} -i "http://localhost/" 
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Serve using UNIX domain socket 939051549  
877691558 https://github.com/simonw/datasette/issues/1391#issuecomment-877691558 https://api.github.com/repos/simonw/datasette/issues/1391 MDEyOklzc3VlQ29tbWVudDg3NzY5MTU1OA== simonw 9599 2021-07-10T19:26:57Z 2021-07-10T19:26:57Z OWNER

The https://latest.datasette.io/fixtures.db file no longer includes generated columns, which will help avoid confusion such as seen in #1376.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Stop using generated columns in fixtures.db 941300946  
877691427 https://github.com/simonw/datasette/issues/1391#issuecomment-877691427 https://api.github.com/repos/simonw/datasette/issues/1391 MDEyOklzc3VlQ29tbWVudDg3NzY5MTQyNw== simonw 9599 2021-07-10T19:26:00Z 2021-07-10T19:26:00Z OWNER

I had to run the tests locally on my macOS laptop using pysqlite3 to get a version that supported generated columns - wrote up a TIL about that here: https://til.simonwillison.net/sqlite/pysqlite3-on-macos

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Stop using generated columns in fixtures.db 941300946  
877687196 https://github.com/simonw/datasette/issues/1391#issuecomment-877687196 https://api.github.com/repos/simonw/datasette/issues/1391 MDEyOklzc3VlQ29tbWVudDg3NzY4NzE5Ng== simonw 9599 2021-07-10T18:58:40Z 2021-07-10T18:58:40Z OWNER
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Stop using generated columns in fixtures.db 941300946  
877686784 https://github.com/simonw/datasette/issues/1391#issuecomment-877686784 https://api.github.com/repos/simonw/datasette/issues/1391 MDEyOklzc3VlQ29tbWVudDg3NzY4Njc4NA== simonw 9599 2021-07-10T18:56:03Z 2021-07-10T18:56:03Z OWNER
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Stop using generated columns in fixtures.db 941300946  
877682533 https://github.com/simonw/datasette/issues/1391#issuecomment-877682533 https://api.github.com/repos/simonw/datasette/issues/1391 MDEyOklzc3VlQ29tbWVudDg3NzY4MjUzMw== simonw 9599 2021-07-10T18:28:05Z 2021-07-10T18:28:05Z OWNER

Here's the test in question: https://github.com/simonw/datasette/blob/a6c55afe8c82ead8deb32f90c9324022fd422324/tests/test_api.py#L2033-L2046

Various other places in the test code also need changing - anything that calls supports_generated_columns().

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Stop using generated columns in fixtures.db 941300946  
877681031 https://github.com/simonw/datasette/issues/1389#issuecomment-877681031 https://api.github.com/repos/simonw/datasette/issues/1389 MDEyOklzc3VlQ29tbWVudDg3NzY4MTAzMQ== simonw 9599 2021-07-10T18:17:29Z 2021-07-10T18:17:29Z OWNER

I don't like `?_searchmode=default` because it suggests "use the default" - but it actually overrides the default that was specified by `"searchmode": "raw"` in metadata.json.

I'm going with ?_searchmode=escaped instead.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
"searchmode": "raw" in table metadata 940077168  
877310125 https://github.com/simonw/datasette/issues/1390#issuecomment-877310125 https://api.github.com/repos/simonw/datasette/issues/1390 MDEyOklzc3VlQ29tbWVudDg3NzMxMDEyNQ== simonw 9599 2021-07-09T16:32:57Z 2021-07-09T16:32:57Z OWNER
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Mention restarting systemd in documentation 940891698  
877308310 https://github.com/simonw/datasette/issues/1390#issuecomment-877308310 https://api.github.com/repos/simonw/datasette/issues/1390 MDEyOklzc3VlQ29tbWVudDg3NzMwODMxMA== simonw 9599 2021-07-09T16:29:48Z 2021-07-09T16:29:48Z OWNER
sudo systemctl restart datasette.service
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Mention restarting systemd in documentation 940891698  
876620095 https://github.com/simonw/datasette/issues/1389#issuecomment-876620095 https://api.github.com/repos/simonw/datasette/issues/1389 MDEyOklzc3VlQ29tbWVudDg3NjYyMDA5NQ== simonw 9599 2021-07-08T17:35:09Z 2021-07-08T17:35:09Z OWNER
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
"searchmode": "raw" in table metadata 940077168  
876619531 https://github.com/simonw/datasette/issues/1389#issuecomment-876619531 https://api.github.com/repos/simonw/datasette/issues/1389 MDEyOklzc3VlQ29tbWVudDg3NjYxOTUzMQ== simonw 9599 2021-07-08T17:34:16Z 2021-07-08T17:34:16Z OWNER

If I implement this I'll also set it up so ?_searchmode=default can be used to over-ride that and go back to the default behaviour.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
"searchmode": "raw" in table metadata 940077168  
876619271 https://github.com/simonw/datasette/issues/1389#issuecomment-876619271 https://api.github.com/repos/simonw/datasette/issues/1389 MDEyOklzc3VlQ29tbWVudDg3NjYxOTI3MQ== simonw 9599 2021-07-08T17:33:49Z 2021-07-08T17:33:49Z OWNER
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
"searchmode": "raw" in table metadata 940077168  
876618847 https://github.com/simonw/datasette/issues/1389#issuecomment-876618847 https://api.github.com/repos/simonw/datasette/issues/1389 MDEyOklzc3VlQ29tbWVudDg3NjYxODg0Nw== simonw 9599 2021-07-08T17:33:08Z 2021-07-08T17:33:08Z OWNER
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
"searchmode": "raw" in table metadata 940077168  
876618582 https://github.com/simonw/datasette/issues/1389#issuecomment-876618582 https://api.github.com/repos/simonw/datasette/issues/1389 MDEyOklzc3VlQ29tbWVudDg3NjYxODU4Mg== simonw 9599 2021-07-08T17:32:40Z 2021-07-08T17:32:40Z OWNER

This makes sense to me since other useful querystring arguments like this can be set as defaults in the metadata.json configuration.
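
A sketch of how that could look in metadata.json - the exact nesting shown here is an assumption, matching how other table-level settings are configured:

    {
        "databases": {
            "mydatabase": {
                "tables": {
                    "mytable": {
                        "searchmode": "raw"
                    }
                }
            }
        }
    }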

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
"searchmode": "raw" in table metadata 940077168  
876616414 https://github.com/simonw/datasette/issues/268#issuecomment-876616414 https://api.github.com/repos/simonw/datasette/issues/268 MDEyOklzc3VlQ29tbWVudDg3NjYxNjQxNA== simonw 9599 2021-07-08T17:29:04Z 2021-07-08T17:29:04Z OWNER

I had set up a full text search on my instance of Datasette for title data for our public library, and was noticing that some of the features of the SQLite FTS weren't working as expected ... and maybe the issue is in the escape_fts() function

That's a deliberate feature (albeit controversial, see #759) - the main problem here is that it's easy to construct a SQLite full-text search string which results in a database error. This is a bad user experience!

You can opt-in to raw SQL queries by appending ?_searchmode=raw to the page, see https://docs.datasette.io/en/stable/full_text_search.html#advanced-sqlite-search-queries

But maybe there should be an option for turning that on by default without needing the query string?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Mechanism for ranking results from SQLite full-text search 323718842  
875742910 https://github.com/simonw/datasette/issues/1388#issuecomment-875742910 https://api.github.com/repos/simonw/datasette/issues/1388 MDEyOklzc3VlQ29tbWVudDg3NTc0MjkxMA== simonw 9599 2021-07-07T16:20:50Z 2021-07-07T16:23:02Z OWNER

I wonder if --fd is worth supporting too? Uvicorn documentation says that's useful for running under process managers - I'd want to understand exactly how to use that (and test it) before adding the feature though.

https://www.uvicorn.org/settings/

Docs on how to use a process manager: https://www.uvicorn.org/deployment/#supervisor

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Serve using UNIX domain socket 939051549  
875741410 https://github.com/simonw/datasette/issues/1388#issuecomment-875741410 https://api.github.com/repos/simonw/datasette/issues/1388 MDEyOklzc3VlQ29tbWVudDg3NTc0MTQxMA== simonw 9599 2021-07-07T16:18:50Z 2021-07-07T16:18:50Z OWNER

You could actually run Datasette like this today without modifications by running a thin Python script that imports from datasette.app, instantiates the ASGI app and passes that to uvicorn.run - but I like this as a supported feature too.
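
A rough sketch of that thin script - assuming a single database file and Uvicorn's uds option for the socket path:

    import uvicorn
    from datasette.app import Datasette

    # Build the ASGI application from one or more SQLite files:
    ds = Datasette(["fixtures.db"])

    # Serve it over a UNIX domain socket instead of a TCP port:
    uvicorn.run(ds.app(), uds="/tmp/datasette.sock")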

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Serve using UNIX domain socket 939051549  
875740085 https://github.com/simonw/datasette/issues/1388#issuecomment-875740085 https://api.github.com/repos/simonw/datasette/issues/1388 MDEyOklzc3VlQ29tbWVudDg3NTc0MDA4NQ== simonw 9599 2021-07-07T16:17:08Z 2021-07-07T16:17:08Z OWNER

Looks pretty easy to implement - here's a hint from Uvicorn source code: https://github.com/encode/uvicorn/blob/b5af1049e63c059dc750a450c807b9768f91e906/uvicorn/main.py#L368

Need to work out a simple pattern for testing this too.
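
One candidate pattern (a sketch, assuming a server is already listening on the socket): httpx can send requests over a UNIX domain socket via its transport option, so a test could do something like this:

    import httpx

    # Route HTTP requests over the UNIX domain socket:
    transport = httpx.HTTPTransport(uds="/tmp/datasette.sock")
    with httpx.Client(transport=transport) as client:
        # The hostname is ignored - the socket determines the server:
        response = client.get("http://localhost/")
    print(response.status_code)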

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Serve using UNIX domain socket 939051549  
875738149 https://github.com/simonw/datasette/issues/1388#issuecomment-875738149 https://api.github.com/repos/simonw/datasette/issues/1388 MDEyOklzc3VlQ29tbWVudDg3NTczODE0OQ== simonw 9599 2021-07-07T16:14:29Z 2021-07-07T16:14:29Z OWNER

This sounds like a valuable feature for people running Datasette behind a proxy.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Serve using UNIX domain socket 939051549  
873156408 https://github.com/simonw/datasette/issues/1387#issuecomment-873156408 https://api.github.com/repos/simonw/datasette/issues/1387 MDEyOklzc3VlQ29tbWVudDg3MzE1NjQwOA== simonw 9599 2021-07-02T17:37:30Z 2021-07-02T17:37:30Z OWNER
{
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 1,
    "rocket": 0,
    "eyes": 0
}
absolute_url() behind a proxy assembles incorrect http://127.0.0.1:8001/ URLs 935930820  
873141222 https://github.com/simonw/datasette/issues/1387#issuecomment-873141222 https://api.github.com/repos/simonw/datasette/issues/1387 MDEyOklzc3VlQ29tbWVudDg3MzE0MTIyMg== simonw 9599 2021-07-02T17:09:32Z 2021-07-02T17:09:32Z OWNER

I'm going to add this to the suggested Apache configuration at https://docs.datasette.io/en/stable/deploying.html#apache-proxy-configuration

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
absolute_url() behind a proxy assembles incorrect http://127.0.0.1:8001/ URLs 935930820  
873140742 https://github.com/simonw/datasette/issues/1387#issuecomment-873140742 https://api.github.com/repos/simonw/datasette/issues/1387 MDEyOklzc3VlQ29tbWVudDg3MzE0MDc0Mg== simonw 9599 2021-07-02T17:08:40Z 2021-07-02T17:08:40Z OWNER

ProxyPreserveHost On is the Apache setting - it defaults to Off: https://httpd.apache.org/docs/2.4/mod/mod_proxy.html#proxypreservehost
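
So the documented fix would look something like this sketch - the path prefix and port here are taken from the report above:

    ProxyPreserveHost On
    ProxyPass /collection-analysis/ http://127.0.0.1:8010/
    ProxyPassReverse /collection-analysis/ http://127.0.0.1:8010/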

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
absolute_url() behind a proxy assembles incorrect http://127.0.0.1:8001/ URLs 935930820  
873139138 https://github.com/simonw/datasette/issues/1387#issuecomment-873139138 https://api.github.com/repos/simonw/datasette/issues/1387 MDEyOklzc3VlQ29tbWVudDg3MzEzOTEzOA== simonw 9599 2021-07-02T17:05:47Z 2021-07-02T17:05:47Z OWNER

In this case the proxy is Apache. So there are a couple of potential fixes:

  • Configure Apache to pass the original HTTP request Host: header through to the proxied application. This should then be documented.
  • Add a new optional feature to Datasette called something like base_host which, if set, is always used in place of the host in request.url when constructing new URLs.
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
absolute_url() behind a proxy assembles incorrect http://127.0.0.1:8001/ URLs 935930820  
873137935 https://github.com/simonw/datasette/issues/1387#issuecomment-873137935 https://api.github.com/repos/simonw/datasette/issues/1387 MDEyOklzc3VlQ29tbWVudDg3MzEzNzkzNQ== simonw 9599 2021-07-02T17:03:36Z 2021-07-02T17:03:36Z OWNER

And the links to apply a facet value are broken too! https://ilsweb.cincinnatilibrary.org/collection-analysis/current_collection-3d4a4b7/bib?_facet=bib_level_callnumber

        {
          "value": "g l fiction",
          "label": "g l fiction",
          "count": 212,
          "toggle_url": "https://127.0.0.1:8010/collection-analysis/current_collection-3d4a4b7/bib.json?_facet=bib_level_callnumber&bib_level_callnumber=g+l+fiction",
          "selected": false
        }

Same problem: https://github.com/simonw/datasette/blob/ea627baccf980d7d8ebc9e1ffff1fe34d556e56f/datasette/facets.py#L251-L261

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
absolute_url() behind a proxy assembles incorrect http://127.0.0.1:8001/ URLs 935930820  
873136440 https://github.com/simonw/datasette/issues/1387#issuecomment-873136440 https://api.github.com/repos/simonw/datasette/issues/1387 MDEyOklzc3VlQ29tbWVudDg3MzEzNjQ0MA== simonw 9599 2021-07-02T17:01:48Z 2021-07-02T17:01:48Z OWNER

Here's what's happening: https://github.com/simonw/datasette/blob/d23a2671386187f61872b9f6b58e0f80ac61f8fe/datasette/views/table.py#L827-L829

This is being run through absolute_url() - defined here: https://github.com/simonw/datasette/blob/d23a2671386187f61872b9f6b58e0f80ac61f8fe/datasette/app.py#L633-L637

That's because the next_url in the JSON needs to be a full URL that a client can retrieve - as opposed to the other links on that page which are all relative links that start with /: https://github.com/simonw/datasette/blob/ea627baccf980d7d8ebc9e1ffff1fe34d556e56f/datasette/templates/_table.html#L11-L15
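
For context, absolute_url() is essentially a urljoin against the incoming request's URL - which is why the proxy's internal hostname leaks into next_url. A simplified sketch (not the exact implementation):

    import urllib.parse

    def absolute_url(request_url: str, path: str) -> str:
        # The scheme and host come from whatever URL the proxy forwarded,
        # e.g. http://127.0.0.1:8001/ rather than the public hostname:
        return urllib.parse.urljoin(request_url, path)

    print(absolute_url("http://127.0.0.1:8001/db/table", "/db/table.json?_next=5"))
    # -> http://127.0.0.1:8001/db/table.json?_next=5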

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
absolute_url() behind a proxy assembles incorrect http://127.0.0.1:8001/ URLs 935930820  
873134866 https://github.com/simonw/datasette/issues/1387#issuecomment-873134866 https://api.github.com/repos/simonw/datasette/issues/1387 MDEyOklzc3VlQ29tbWVudDg3MzEzNDg2Ng== simonw 9599 2021-07-02T16:58:52Z 2021-07-02T16:58:52Z OWNER

What's weird here is that the path itself is correct - it starts with /collection-analysis/ as expected - but the hostname is wrong.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
absolute_url() behind a proxy assembles incorrect http://127.0.0.1:8001/ URLs 935930820  
869812567 https://github.com/simonw/datasette/issues/1101#issuecomment-869812567 https://api.github.com/repos/simonw/datasette/issues/1101 MDEyOklzc3VlQ29tbWVudDg2OTgxMjU2Nw== simonw 9599 2021-06-28T16:06:57Z 2021-06-28T16:07:24Z OWNER

Relevant blog post: https://simonwillison.net/2021/Jun/25/streaming-large-api-responses/ - including notes on efficiently streaming formats with some kind of separator in between the records (regular JSON).

Some export formats are friendlier for streaming than others. CSV and TSV are pretty easy to stream, as is newline-delimited JSON.

Regular JSON requires a bit more thought: you can output a [ character, then output each row in a stream with a comma suffix, then skip the comma for the last row and output a ]. Doing that requires peeking ahead (looping two at a time) to verify that you haven't yet reached the end.

Or... Martin De Wulf pointed out that you can output the first row, then output every other row with a preceding comma - which avoids the whole "iterate two at a time" problem entirely.
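
A minimal sketch of that second approach as a Python generator (illustrative only, not the code from the blog post):

    import json

    def stream_json_array(rows):
        # Open the array, emit the first row bare, then prefix every
        # subsequent row with a comma - no look-ahead required:
        yield "["
        for i, row in enumerate(rows):
            if i:
                yield ","
            yield json.dumps(row)
        yield "]"

    print("".join(stream_json_array([{"id": 1}, {"id": 2}])))
    # -> [{"id": 1},{"id": 2}]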

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
register_output_renderer() should support streaming data 749283032  
869075395 https://github.com/simonw/datasette/issues/1384#issuecomment-869075395 https://api.github.com/repos/simonw/datasette/issues/1384 MDEyOklzc3VlQ29tbWVudDg2OTA3NTM5NQ== simonw 9599 2021-06-26T23:54:21Z 2021-06-26T23:59:21Z OWNER

(It may well be that implementing #1168 involves a switch to async metadata)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Plugin hook for dynamic metadata 930807135  
869075368 https://github.com/simonw/datasette/issues/1384#issuecomment-869075368 https://api.github.com/repos/simonw/datasette/issues/1384 MDEyOklzc3VlQ29tbWVudDg2OTA3NTM2OA== simonw 9599 2021-06-26T23:53:55Z 2021-06-26T23:53:55Z OWNER

Great, let's drop fallback then.

My instinct at the moment is to ship this plugin hook as-is but with a warning that it may change before Datasette 1.0 - then before 1.0 either figure out an async variant or finish the database-backed metadata concept from #1168 and recommend that as an alternative.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Plugin hook for dynamic metadata 930807135  
869071790 https://github.com/simonw/datasette/issues/1384#issuecomment-869071790 https://api.github.com/repos/simonw/datasette/issues/1384 MDEyOklzc3VlQ29tbWVudDg2OTA3MTc5MA== simonw 9599 2021-06-26T23:04:12Z 2021-06-26T23:04:12Z OWNER

Hmmm... that's tricky, since one of the most obvious ways to use this hook is to load metadata from database tables using SQL queries.

@brandonrobertz do you have a working example of using this hook to populate metadata from database tables I can try?

Answering my own question: here's how Brandon implements it in his datasette-live-config plugin: https://github.com/next-LI/datasette-live-config/blob/72e335e887f1c69c54c6c2441e07148955b0fc9f/datasette_live_config/__init__.py#L50-L160

That's using a completely separate SQLite connection (actually wrapped in sqlite-utils) and making blocking synchronous calls to it.

This is a pragmatic solution, which works - and likely performs just fine, because SQL queries like this against a small database are so fast that not running them asynchronously isn't actually a problem.

But... it's weird. Everywhere else in Datasette land uses await db.execute(...) - but here's an example where users are encouraged to use blocking calls instead.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Plugin hook for dynamic metadata 930807135  
869071435 https://github.com/simonw/datasette/issues/1384#issuecomment-869071435 https://api.github.com/repos/simonw/datasette/issues/1384 MDEyOklzc3VlQ29tbWVudDg2OTA3MTQzNQ== simonw 9599 2021-06-26T22:59:26Z 2021-06-26T22:59:26Z OWNER

The other alternative is to finish the work to build a _metadata internal table, see #1168. The idea there was that if we want to support efficient pagination and search across the metadata for thousands of attached tables powering it with a plugin hook doesn't work well - we don't want to call the hook once for every one of 1,000+ tables just to implement the homepage.

So instead, all metadata for all attached databases would be loaded into an in-memory database called _metadata. Plugins that want to modify stored metadata could then do so by directly writing to that table.
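
A hypothetical sketch of that table's shape, using sqlite-utils - the real schema being explored in #1168 may differ:

    from sqlite_utils import Database

    db = Database(memory=True)

    # One row per metadata key, scoped to a database and table:
    db["_metadata"].create(
        {"database": str, "table": str, "key": str, "value": str},
        pk=("database", "table", "key"),
    )
    db["_metadata"].insert(
        {"database": "fixtures", "table": "facetable", "key": "title", "value": "My title"}
    )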

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Plugin hook for dynamic metadata 930807135  
869071236 https://github.com/simonw/datasette/issues/860#issuecomment-869071236 https://api.github.com/repos/simonw/datasette/issues/860 MDEyOklzc3VlQ29tbWVudDg2OTA3MTIzNg== simonw 9599 2021-06-26T22:56:28Z 2021-06-26T22:56:28Z OWNER

This work is continuing in #1384.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Plugin hook for instance/database/table metadata 642651572  
869071167 https://github.com/simonw/datasette/issues/1384#issuecomment-869071167 https://api.github.com/repos/simonw/datasette/issues/1384 MDEyOklzc3VlQ29tbWVudDg2OTA3MTE2Nw== simonw 9599 2021-06-26T22:55:36Z 2021-06-26T22:55:36Z OWNER

Just realized I already have an issue open for this, at #860. I'm going to close that one and continue the work here in this issue.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Plugin hook for dynamic metadata 930807135  
869070941 https://github.com/simonw/datasette/issues/1384#issuecomment-869070941 https://api.github.com/repos/simonw/datasette/issues/1384 MDEyOklzc3VlQ29tbWVudDg2OTA3MDk0MQ== simonw 9599 2021-06-26T22:53:34Z 2021-06-26T22:53:34Z OWNER

The await thing is worrying me a lot - it feels like this plugin hook is massively less useful if it can't make its own DB queries and generally do asynchronous stuff - but I'd also like not to break every existing plugin that calls datasette.metadata(...).

One solution that could work: introduce a new method, maybe await datasette.get_metadata(...), which uses this plugin hook - and keep the existing datasette.metadata() method (which doesn't call the hook) around. This would ensure existing plugins keep on working.

Then upgrade those plugins separately, with the goal of deprecating and removing .metadata() entirely in Datasette 1.0 once they have all been migrated.
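
A hypothetical sketch of the shape get_metadata() could take - the merge logic and hook interface here are illustrative, not a settled design:

    import asyncio

    async def get_metadata(static_metadata, plugin_hooks):
        # Start from the statically configured metadata, then let each
        # plugin hook override it - awaiting any hooks that are async:
        merged = dict(static_metadata)
        for hook in plugin_hooks:
            result = hook()
            if asyncio.iscoroutine(result):
                result = await result
            merged.update(result or {})
        return merged

    async def demo():
        async def plugin_hook():
            return {"title": "Title from a plugin"}
        print(await get_metadata({"title": "Static title"}, [plugin_hook]))

    asyncio.run(demo())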

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Plugin hook for dynamic metadata 930807135  
869070348 https://github.com/simonw/datasette/issues/1384#issuecomment-869070348 https://api.github.com/repos/simonw/datasette/issues/1384 MDEyOklzc3VlQ29tbWVudDg2OTA3MDM0OA== simonw 9599 2021-06-26T22:46:18Z 2021-06-26T22:46:18Z OWNER

Here's where the plugin hook is called, demonstrating the fallback= argument: https://github.com/simonw/datasette/blob/05a312caf3debb51aa1069939923a49e21cd2bd1/datasette/app.py#L426-L472

I'm not convinced of the use-case for passing fallback= to the hook here - is there a reason a plugin might care whether fallback is True or False, seeing as the metadata() method already respects that fallback logic on line 459?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Plugin hook for dynamic metadata 930807135  
869070076 https://github.com/simonw/datasette/issues/1384#issuecomment-869070076 https://api.github.com/repos/simonw/datasette/issues/1384 MDEyOklzc3VlQ29tbWVudDg2OTA3MDA3Ng== simonw 9599 2021-06-26T22:42:21Z 2021-06-26T22:42:21Z OWNER

Hmmm... that's tricky, since one of the most obvious ways to use this hook is to load metadata from database tables using SQL queries.

@brandonrobertz do you have a working example of using this hook to populate metadata from database tables I can try?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Plugin hook for dynamic metadata 930807135  
869069926 https://github.com/simonw/datasette/issues/1384#issuecomment-869069926 https://api.github.com/repos/simonw/datasette/issues/1384 MDEyOklzc3VlQ29tbWVudDg2OTA2OTkyNg== simonw 9599 2021-06-26T22:40:15Z 2021-06-26T22:40:53Z OWNER

The documentation says:

datasette: You can use this to access plugin configuration options via datasette.plugin_config(your_plugin_name), or to execute SQL queries.

That's not accurate: since the plugin hook is a regular function, not an awaitable, you can't use it to run await db.execute(...) so you can't execute SQL queries.

I can fix this with the await-me-maybe pattern, used for other plugin hooks: https://simonwillison.net/2020/Sep/2/await-me-maybe/

BUT... that requires changing the ds.metadata() function to be awaitable, which will affect every existing plugin that uses that documented internal method!
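
The pattern itself is tiny - roughly this, as described in that blog post:

    import asyncio

    async def await_me_maybe(value):
        # Accept a plain value, a callable, or an awaitable - so plugin
        # hooks can be implemented as either sync or async functions:
        if callable(value):
            value = value()
        if asyncio.iscoroutine(value):
            value = await value
        return value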

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Plugin hook for dynamic metadata 930807135  
869069768 https://github.com/simonw/datasette/issues/1384#issuecomment-869069768 https://api.github.com/repos/simonw/datasette/issues/1384 MDEyOklzc3VlQ29tbWVudDg2OTA2OTc2OA== simonw 9599 2021-06-26T22:37:53Z 2021-06-26T22:37:53Z OWNER

The documentation doesn't describe the fallback argument at the moment.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Plugin hook for dynamic metadata 930807135  
869069655 https://github.com/simonw/datasette/issues/1384#issuecomment-869069655 https://api.github.com/repos/simonw/datasette/issues/1384 MDEyOklzc3VlQ29tbWVudDg2OTA2OTY1NQ== simonw 9599 2021-06-26T22:36:14Z 2021-06-26T22:37:37Z OWNER
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Plugin hook for dynamic metadata 930807135  
869068554 https://github.com/simonw/datasette/pull/1368#issuecomment-869068554 https://api.github.com/repos/simonw/datasette/issues/1368 MDEyOklzc3VlQ29tbWVudDg2OTA2ODU1NA== simonw 9599 2021-06-26T22:23:57Z 2021-06-26T22:23:57Z OWNER

The only test failure is Black. I'm going to merge this and then reformat.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
DRAFT: A new plugin hook for dynamic metadata 913865304  
868881190 https://github.com/simonw/sqlite-utils/issues/37#issuecomment-868881190 https://api.github.com/repos/simonw/sqlite-utils/issues/37 MDEyOklzc3VlQ29tbWVudDg2ODg4MTE5MA== simonw 9599 2021-06-25T23:24:28Z 2021-06-25T23:24:28Z OWNER

Maybe I could release a separate Python package, types-sqlite-utils-numpy, which adds an overridden type definition that includes the numpy types?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Experiment with type hints 465815372  
868881033 https://github.com/simonw/sqlite-utils/issues/37#issuecomment-868881033 https://api.github.com/repos/simonw/sqlite-utils/issues/37 MDEyOklzc3VlQ29tbWVudDg2ODg4MTAzMw== simonw 9599 2021-06-25T23:23:49Z 2021-06-25T23:23:49Z OWNER

Twitter conversation about how to add types to the .create_table(columns=) parameter: https://twitter.com/simonw/status/1408532867592818693

Anyone know how to write a mypy type definition for this?

{"id": int, "name": str, "image": bytes, "weight": float}

It's a dict where keys are strings and values are one of int/str/bytes/float (weird API design, I know - but I designed this long before I was thinking about mypy)

Looks like this could work:

    def create_table(
        self,
        name,
        columns: Dict[str, Union[Type[int], Type[bytes], Type[str], Type[float]]],
        pk=None,
        foreign_keys=None,
        column_order=None,
        not_null=None,
        defaults=None,
        hash_id=None,
        extracts=None,
    ):

Except... that method can optionally also accept numpy types if numpy is installed. I don't know if it's possible to dynamically change a signature based on an import, since mypy is a static type analyzer and doesn't ever execute the code.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Experiment with type hints 465815372  
868728092 https://github.com/simonw/sqlite-utils/pull/293#issuecomment-868728092 https://api.github.com/repos/simonw/sqlite-utils/issues/293 MDEyOklzc3VlQ29tbWVudDg2ODcyODA5Mg== simonw 9599 2021-06-25T17:39:35Z 2021-06-25T17:39:35Z OWNER

Here's more about this problem: https://github.com/numpy/numpy/issues/15947

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Test against Python 3.10-dev 929748885  
868134040 https://github.com/simonw/sqlite-utils/pull/293#issuecomment-868134040 https://api.github.com/repos/simonw/sqlite-utils/issues/293 MDEyOklzc3VlQ29tbWVudDg2ODEzNDA0MA== simonw 9599 2021-06-25T01:49:44Z 2021-06-25T01:50:33Z OWNER

Test failed on 3.10 with numpy on macOS:

sqlite_utils/__init__.py:1: in <module>
    from .db import Database
sqlite_utils/db.py:48: in <module>
    import numpy as np  # type: ignore
../../../hostedtoolcache/Python/3.10.0-beta.3/x64/lib/python3.10/site-packages/numpy/__init__.py:391: in <module>
    raise RuntimeError(msg)
E   RuntimeError: Polyfit sanity test emitted a warning, most likely due to using a buggy Accelerate backend. If you compiled yourself, more information is available at https://numpy.org/doc/stable/user/building.html#accelerated-blas-lapack-libraries Otherwise report this to the vendor that provided NumPy.
E   RankWarning: Polyfit may be poorly conditioned
Error: Process completed with exit code 4.
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Test against Python 3.10-dev 929748885  
868021624 https://github.com/simonw/sqlite-utils/issues/290#issuecomment-868021624 https://api.github.com/repos/simonw/sqlite-utils/issues/290 MDEyOklzc3VlQ29tbWVudDg2ODAyMTYyNA== simonw 9599 2021-06-24T23:17:38Z 2021-06-24T23:17:38Z OWNER
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
`db.query()` method (renamed `db.execute_returning_dicts()`) 926777310  
867209791 https://github.com/simonw/datasette/issues/1377#issuecomment-867209791 https://api.github.com/repos/simonw/datasette/issues/1377 MDEyOklzc3VlQ29tbWVudDg2NzIwOTc5MQ== simonw 9599 2021-06-23T22:51:32Z 2021-06-23T22:51:32Z OWNER
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Mechanism for plugins to exclude certain paths from CSRF checks 920884085  
867102944 https://github.com/simonw/datasette/pull/1368#issuecomment-867102944 https://api.github.com/repos/simonw/datasette/issues/1368 MDEyOklzc3VlQ29tbWVudDg2NzEwMjk0NA== simonw 9599 2021-06-23T19:32:01Z 2021-06-23T19:32:01Z OWNER

Yes, let's move ahead with getting this into an alpha.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
DRAFT: A new plugin hook for dynamic metadata 913865304  
866466388 https://github.com/simonw/sqlite-utils/issues/291#issuecomment-866466388 https://api.github.com/repos/simonw/sqlite-utils/issues/291 MDEyOklzc3VlQ29tbWVudDg2NjQ2NjM4OA== simonw 9599 2021-06-23T02:10:24Z 2021-06-23T02:10:24Z OWNER
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Adopt flake8 927766296  
866461926 https://github.com/simonw/sqlite-utils/issues/291#issuecomment-866461926 https://api.github.com/repos/simonw/sqlite-utils/issues/291 MDEyOklzc3VlQ29tbWVudDg2NjQ2MTkyNg== simonw 9599 2021-06-23T01:59:57Z 2021-06-23T01:59:57Z OWNER

That shouldn't be failing: it's complaining about these: https://github.com/simonw/sqlite-utils/blob/02898bf7af4a4e484ecc8ec852d5fee98463277b/tests/test_register_function.py#L56-L67

But I added # noqa: F811 to them.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Adopt flake8 927766296  
865511810 https://github.com/simonw/sqlite-utils/issues/290#issuecomment-865511810 https://api.github.com/repos/simonw/sqlite-utils/issues/290 MDEyOklzc3VlQ29tbWVudDg2NTUxMTgxMA== simonw 9599 2021-06-22T04:07:34Z 2021-06-22T18:26:21Z OWNER

That documentation section is pretty weak at the moment - here's the whole thing:

Executing queries

The db.execute() and db.executescript() methods provide wrappers around .execute() and .executescript() on the underlying SQLite connection. These wrappers log to the tracer function if one has been registered.

    db = Database(memory=True)
    db["dogs"].insert({"name": "Cleo"})
    db.execute("update dogs set name = 'Cleopaws'")

You can pass parameters as an optional second argument, using either a list or a dictionary. These will be correctly quoted and escaped.

    # Using ? and a list:
    db.execute("update dogs set name = ?", ["Cleopaws"])

    # Or using :name and a dictionary:
    db.execute("update dogs set name = :name", {"name": "Cleopaws"})

  • Talks about .execute() - I want to talk about .query() instead
  • Doesn't clarify that .execute() returns a Cursor - and assumes you know what to do with one
  • Doesn't show an example of a select query at all
  • The "tracer function" bit is confusing (should at least link to docs further down)
  • For an UPDATE it should show how to access the number of rows modified (probably using .execute() there)

It does at least cover the two types of parameters, though that could be bulked out.
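
A sketch of what an improved section could show instead, using the renamed db.query() method from this issue:

    from sqlite_utils import Database

    db = Database(memory=True)
    db["dogs"].insert({"name": "Cleo"})

    # .query() yields rows as dictionaries - no Cursor handling needed:
    rows = list(db.query("select * from dogs where name = :name", {"name": "Cleo"}))
    # -> [{'name': 'Cleo'}]

    # .execute() returns a Cursor, which is what you want for an UPDATE
    # where the number of modified rows matters:
    cursor = db.execute("update dogs set name = ?", ["Cleopaws"])
    print(cursor.rowcount)  # -> 1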

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
`db.query()` method (renamed `db.execute_returning_dicts()`) 926777310  

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
, [performed_via_github_app] TEXT);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);