issue_comments

28 rows where "created_at" is on date 2022-02-02 sorted by updated_at descending

issue 16

  • Reconsider policy on blocking queries containing the string "pragma" 6
  • Add `Link: rel="alternate"` header pointing to JSON for a table/query 3
  • Link: rel="alternate" to JSON for queries too 3
  • JSON link on row page is 404 if base_url setting is used 2
  • /-/patterns returns link: alternate JSON header to 404 2
  • Try test suite against macOS and Windows 2
  • Maybe return JSON from HTML pages if `Accept: application/json` is sent 1
  • Traces should include SQL executed by subtasks created with `asyncio.gather` 1
  • run analyze on all databases as part of start up or publishing 1
  • More detailed information about installed SpatiaLite version 1
  • Avoid ever running count(*) against SpatiaLite KNN table 1
  • Potential simplified publishing mechanism 1
  • Bump black from 21.12b0 to 22.1.0 1
  • Ensure template_path always uses "/" to match jinja 1
  • Test against Python 3.11-dev 1
  • Index page `/` has no CORS headers 1

user 3

  • simonw 24
  • codecov[bot] 3
  • strada 1

author_association 2

  • OWNER 24
  • NONE 4
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions issue performed_via_github_app
1028419517 https://github.com/simonw/datasette/pull/1617#issuecomment-1028419517 https://api.github.com/repos/simonw/datasette/issues/1617 IC_kwDOBm6k_c49TG-9 codecov[bot] 22429695 2022-02-02T22:30:26Z 2022-02-03T01:36:07Z NONE

Codecov Report

Merging #1617 (af293c9) into main (2aa686c) will increase coverage by 0.06%. The diff coverage is 100.00%.

```diff
@@            Coverage Diff             @@
##             main    #1617      +/-   ##
==========================================
+ Coverage   92.09%   92.16%   +0.06%
==========================================
  Files          34       34
  Lines        4518     4531      +13
==========================================
+ Hits         4161     4176      +15
+ Misses        357      355       -2
```

| Impacted Files | Coverage Δ | |
|---|---|---|
| datasette/app.py | 95.37% <100.00%> (ø) | |
| datasette/views/table.py | 96.19% <0.00%> (ø) | |
| datasette/utils/__init__.py | 94.79% <0.00%> (+<0.01%) | :arrow_up: |
| datasette/views/base.py | 95.49% <0.00%> (+0.07%) | :arrow_up: |
| datasette/views/special.py | 95.09% <0.00%> (+2.38%) | :arrow_up: |


Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 2aa686c...af293c9. Read the comment docs.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Ensure template_path always uses "/" to match jinja 1120990806  
1028461220 https://github.com/simonw/datasette/issues/1534#issuecomment-1028461220 https://api.github.com/repos/simonw/datasette/issues/1534 IC_kwDOBm6k_c49TRKk simonw 9599 2022-02-02T23:39:33Z 2022-02-02T23:39:33Z OWNER

I've decided not to do this, because of the risk that Cloudflare could cache the JSON version for an HTML page or vice-versa.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Maybe return JSON from HTML pages if `Accept: application/json` is sent 1065432388  
1028423514 https://github.com/simonw/datasette/pull/1626#issuecomment-1028423514 https://api.github.com/repos/simonw/datasette/issues/1626 IC_kwDOBm6k_c49TH9a codecov[bot] 22429695 2022-02-02T22:36:37Z 2022-02-02T22:39:52Z NONE

Codecov Report

Merging #1626 (4b4d0e1) into main (b5e6b1a) will not change coverage. The diff coverage is n/a.

```diff
@@           Coverage Diff           @@
##             main    #1626   +/-   ##
=======================================
  Coverage   92.16%   92.16%
=======================================
  Files          34       34
  Lines        4531     4531
=======================================
  Hits         4176     4176
  Misses        355      355
```


Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update b5e6b1a...4b4d0e1. Read the comment docs.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Try test suite against macOS and Windows 1122451096  
1028420821 https://github.com/simonw/datasette/pull/1626#issuecomment-1028420821 https://api.github.com/repos/simonw/datasette/issues/1626 IC_kwDOBm6k_c49THTV simonw 9599 2022-02-02T22:32:26Z 2022-02-02T22:33:31Z OWNER

That broke on a macOS test: https://github.com/simonw/datasette/runs/5044036993?check_suite_focus=true

I'm going to remove macOS and Ubuntu and just try Windows purely to see what happens there.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Try test suite against macOS and Windows 1122451096  
1028414871 https://github.com/simonw/datasette/pull/1616#issuecomment-1028414871 https://api.github.com/repos/simonw/datasette/issues/1616 IC_kwDOBm6k_c49TF2X simonw 9599 2022-02-02T22:23:45Z 2022-02-02T22:23:45Z OWNER

First stable Black release!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Bump black from 21.12b0 to 22.1.0 1119413338  
1028397935 https://github.com/simonw/datasette/issues/1623#issuecomment-1028397935 https://api.github.com/repos/simonw/datasette/issues/1623 IC_kwDOBm6k_c49TBtv simonw 9599 2022-02-02T21:59:43Z 2022-02-02T21:59:43Z OWNER

Here's the new test: https://github.com/simonw/datasette/blob/23a09b0f6af33c52acf8c1d9002fe475b42fee10/tests/test_html.py#L927-L936

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
/-/patterns returns link: alternate JSON header to 404 1122416919  
1028396866 https://github.com/simonw/datasette/issues/1624#issuecomment-1028396866 https://api.github.com/repos/simonw/datasette/issues/1624 IC_kwDOBm6k_c49TBdC simonw 9599 2022-02-02T21:58:06Z 2022-02-02T21:58:06Z OWNER

It looks like this is because IndexView extends BaseView rather than extending DataView which is where all that CORS stuff happens:

https://github.com/simonw/datasette/blob/23a09b0f6af33c52acf8c1d9002fe475b42fee10/datasette/views/index.py#L18-L21

Another thing I should address with the refactor project in #878.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Index page `/` has no CORS headers 1122427321  
1028393259 https://github.com/simonw/datasette/issues/1620#issuecomment-1028393259 https://api.github.com/repos/simonw/datasette/issues/1620 IC_kwDOBm6k_c49TAkr simonw 9599 2022-02-02T21:53:02Z 2022-02-02T21:53:02Z OWNER

I ran the following on https://www.google.com/ in the console to demonstrate that these work as intended:

    [
      "https://latest.datasette.io/fixtures",
      "https://latest.datasette.io/fixtures?sql=select+1",
      "https://latest.datasette.io/fixtures/facetable"
    ].forEach(async (url) => {
      response = await fetch(url, {method: "HEAD"});
      console.log(response.headers.get("Link"));
    });

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Link: rel="alternate" to JSON for queries too 1121618041  
1028389953 https://github.com/simonw/datasette/issues/1623#issuecomment-1028389953 https://api.github.com/repos/simonw/datasette/issues/1623 IC_kwDOBm6k_c49S_xB simonw 9599 2022-02-02T21:48:34Z 2022-02-02T21:48:34Z OWNER

A few other pages do that too, including:

  • https://latest.datasette.io/-/messages
  • https://latest.datasette.io/-/allow-debug

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
/-/patterns returns link: alternate JSON header to 404 1122416919  
1028387529 https://github.com/simonw/datasette/pull/1622#issuecomment-1028387529 https://api.github.com/repos/simonw/datasette/issues/1622 IC_kwDOBm6k_c49S_LJ codecov[bot] 22429695 2022-02-02T21:45:21Z 2022-02-02T21:45:21Z NONE

Codecov Report

Merging #1622 (fbaf317) into main (8d5779a) will not change coverage. The diff coverage is n/a.

```diff
@@           Coverage Diff           @@
##             main    #1622   +/-   ##
=======================================
  Coverage   92.11%   92.11%
=======================================
  Files          34       34
  Lines        4525     4525
=======================================
  Hits         4168     4168
  Misses        357      357
```


Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 8d5779a...fbaf317. Read the comment docs.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Test against Python 3.11-dev 1122414274  
1028385067 https://github.com/simonw/datasette/issues/1620#issuecomment-1028385067 https://api.github.com/repos/simonw/datasette/issues/1620 IC_kwDOBm6k_c49S-kr simonw 9599 2022-02-02T21:42:23Z 2022-02-02T21:42:23Z OWNER

    % curl -s -I 'https://latest.datasette.io/' | grep link
    link: https://latest.datasette.io/.json; rel="alternate"; type="application/json+datasette"
    % curl -s -I 'https://latest.datasette.io/fixtures' | grep link
    link: https://latest.datasette.io/fixtures.json; rel="alternate"; type="application/json+datasette"
    % curl -s -I 'https://latest.datasette.io/fixtures?sql=select+1' | grep link
    link: https://latest.datasette.io/fixtures.json?sql=select+1; rel="alternate"; type="application/json+datasette"
    % curl -s -I 'https://latest.datasette.io/-/plugins' | grep link
    link: https://latest.datasette.io/-/plugins.json; rel="alternate"; type="application/json+datasette"
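
A client could pick the JSON URL out of a header like the ones above. The following is a minimal sketch, not part of Datasette: `json_alternate_url` is a hypothetical helper, and it assumes the header carries a single link in the bare-URL form shown in the curl output (it also tolerates an angle-bracketed URL). It deliberately avoids splitting on commas, since row URLs such as `/fixtures/sortable/a,a.json` contain them.

```python
from typing import Optional


def json_alternate_url(link_header: str) -> Optional[str]:
    """Return the URL from a single-link header if it is the JSON alternate.

    Hypothetical helper: assumes one link per header value, shaped like
    'URL; rel="alternate"; type="application/json+datasette"'.
    """
    # The first segment is the URL, the remaining segments are parameters.
    url, *params = [segment.strip() for segment in link_header.split(";")]
    attrs = {}
    for param in params:
        key, _, value = param.partition("=")
        attrs[key.strip().lower()] = value.strip().strip('"')
    if (
        attrs.get("rel") == "alternate"
        and attrs.get("type") == "application/json+datasette"
    ):
        # Tolerate the <URL> form as well as the bare form.
        return url.strip("<>")
    return None


header = (
    "https://latest.datasette.io/fixtures.json?sql=select+1; "
    'rel="alternate"; type="application/json+datasette"'
)
print(json_alternate_url(header))
```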

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Link: rel="alternate" to JSON for queries too 1121618041  
1028374330 https://github.com/simonw/datasette/issues/1620#issuecomment-1028374330 https://api.github.com/repos/simonw/datasette/issues/1620 IC_kwDOBm6k_c49S786 simonw 9599 2022-02-02T21:28:16Z 2022-02-02T21:28:16Z OWNER

I just realized I can refactor this to make it much simpler.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Link: rel="alternate" to JSON for queries too 1121618041  
1028294089 https://github.com/simonw/datasette/issues/1618#issuecomment-1028294089 https://api.github.com/repos/simonw/datasette/issues/1618 IC_kwDOBm6k_c49SoXJ strada 770231 2022-02-02T19:42:03Z 2022-02-02T19:42:03Z NONE

Thanks for looking into this. It might have been nice if explain surfaced these function calls. Looks like explain query plan does, but only for basic queries.

```
sqlite-utils fixtures.db 'explain query plan select * from pragma_function_list(), pragma_database_list(), pragma_module_list()' -t
  id    parent    notused  detail
----  --------  ---------  ------------------------------------------------
   4         0          0  SCAN pragma_function_list VIRTUAL TABLE INDEX 0:
   8         0          0  SCAN pragma_database_list VIRTUAL TABLE INDEX 0:
  12         0          0  SCAN pragma_module_list VIRTUAL TABLE INDEX 0:
```

```
sqlite-utils fixtures.db 'explain query plan select * from pragma_function_list() as fl, pragma_database_list() as dl, pragma_module_list() as ml' -t
  id    parent    notused  detail
----  --------  ---------  --------------------------------
   4         0          0  SCAN fl VIRTUAL TABLE INDEX 0:
   8         0          0  SCAN dl VIRTUAL TABLE INDEX 0:
  12         0          0  SCAN ml VIRTUAL TABLE INDEX 0:
```
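
The same comparison can be run from Python's standard `sqlite3` module. This is only a sketch under assumptions: `plan_details` is a hypothetical helper, it uses an in-memory database rather than `fixtures.db`, and the exact wording of the `detail` column varies between SQLite versions.

```python
import sqlite3


def plan_details(sql):
    """Return the 'detail' column of EXPLAIN QUERY PLAN for a query."""
    conn = sqlite3.connect(":memory:")
    try:
        rows = conn.execute("explain query plan " + sql).fetchall()
    finally:
        conn.close()
    # detail is the fourth column of each plan row
    return [row[3] for row in rows]


plain = plan_details("select * from pragma_database_list()")
aliased = plan_details("select * from pragma_database_list() as dl")

# Without an alias the plan names the pragma function; with an alias the
# name may be replaced, as in the second output above.
print(plain)
print(aliased)
```

Because the alias can hide the function name, inspecting the plan text alone does not look like a dependable way to detect pragma function calls.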

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Reconsider policy on blocking queries containing the string "pragma" 1121121305  
1027672617 https://github.com/simonw/datasette/issues/1533#issuecomment-1027672617 https://api.github.com/repos/simonw/datasette/issues/1533 IC_kwDOBm6k_c49QQop simonw 9599 2022-02-02T07:56:51Z 2022-02-02T07:56:51Z OWNER

Demos - these pages both have <link rel=... if you view source on them:

  • https://latest.datasette.io/fixtures/sortable
  • https://latest.datasette.io/fixtures/sortable/a,a

And you can hit them with curl like so:
```
% curl -I 'https://latest.datasette.io/fixtures/sortable'
HTTP/1.1 200 OK
link: https://latest.datasette.io/fixtures/sortable.json; rel="alternate"; type="application/json+datasette"
cache-control: max-age=5
referrer-policy: no-referrer
access-control-allow-origin: *
access-control-allow-headers: Authorization
access-control-expose-headers: Link
content-type: text/html; charset=utf-8
x-databases: _memory, _internal, fixtures, extra_database
Date: Wed, 02 Feb 2022 07:56:17 GMT
Server: Google Frontend
Transfer-Encoding: chunked

% curl -I 'https://latest.datasette.io/fixtures/sortable/a,a'
HTTP/1.1 200 OK
link: https://latest.datasette.io/fixtures/sortable/a,a.json; rel="alternate"; type="application/json+datasette"
cache-control: max-age=5
referrer-policy: no-referrer
access-control-allow-origin: *
access-control-allow-headers: Authorization
access-control-expose-headers: Link
content-type: text/html; charset=utf-8
x-databases: _memory, _internal, fixtures, extra_database
Date: Wed, 02 Feb 2022 07:56:24 GMT
Server: Google Frontend
Transfer-Encoding: chunked
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add `Link: rel="alternate"` header pointing to JSON for a table/query 1065431383  
1027669851 https://github.com/simonw/datasette/issues/1533#issuecomment-1027669851 https://api.github.com/repos/simonw/datasette/issues/1533 IC_kwDOBm6k_c49QP9b simonw 9599 2022-02-02T07:51:57Z 2022-02-02T07:51:57Z OWNER

Documentation: https://docs.datasette.io/en/latest/json_api.html#discovering-the-json-for-a-page

https://docs.datasette.io/en/latest/json_api.html top --cors section mentions the new Access-Control-Expose-Headers: Link header.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add `Link: rel="alternate"` header pointing to JSON for a table/query 1065431383  
1027659890 https://github.com/simonw/datasette/issues/1615#issuecomment-1027659890 https://api.github.com/repos/simonw/datasette/issues/1615 IC_kwDOBm6k_c49QNhy simonw 9599 2022-02-02T07:34:17Z 2022-02-02T07:34:17Z OWNER

I've been thinking about this a bunch too. If I build anything along these lines it will be as part of the Datasette Cloud hosted service I'm working on, maybe as a free tier.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Potential simplified publishing mechanism 1117132741  
1027659018 https://github.com/simonw/datasette/issues/1618#issuecomment-1027659018 https://api.github.com/repos/simonw/datasette/issues/1618 IC_kwDOBm6k_c49QNUK simonw 9599 2022-02-02T07:32:47Z 2022-02-02T07:32:47Z OWNER

I was hoping that explain select ... might be able to easily spot when people are calling PRAGMA functions, but this output doesn't look very helpful:
```
% sqlite-utils fixtures.db 'explain select * from pragma_database_list()' -t
  addr  opcode         p1    p2    p3  p4                   p5  comment
------  -----------  ----  ----  ----  -----------------  ----  ---------
     0  Init            0    11     0                        0
     1  VOpen           0     0     0  vtab:7F9C90AC3070     0
     2  Integer         0     1     0                        0
     3  Integer         0     2     0                        0
     4  VFilter         0    10     1                        0
     5  VColumn         0     0     3                        0
     6  VColumn         0     1     4                        0
     7  VColumn         0     2     5                        0
     8  ResultRow       3     3     0                        0
     9  VNext           0     5     0                        0
    10  Halt            0     0     0                        0
    11  Transaction     0     0    35  0                     1
    12  Goto            0     1     0                        0
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Reconsider policy on blocking queries containing the string "pragma" 1121121305  
1027656518 https://github.com/simonw/datasette/issues/1618#issuecomment-1027656518 https://api.github.com/repos/simonw/datasette/issues/1618 IC_kwDOBm6k_c49QMtG simonw 9599 2022-02-02T07:28:14Z 2022-02-02T07:31:30Z OWNER

I also need to consider if supposedly harmless side-effect free pragma functions could be used to work around the Datasette permissions system. My hunch is that wouldn't be a problem, because if you're allowing arbitrary SQL queries you're already letting people ignore the permissions system.

One example:
```
sqlite-utils fixtures.db 'pragma database_list' -t
  seq  name    file
-----  ------  ------------------------------------------------------
    0  main    /Users/simon/Dropbox/Development/datasette/fixtures.db
```
Though it looks like I already allow-listed that one in #761: https://latest.datasette.io/_memory?sql=select+*+from+pragma_database_list%28%29

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Reconsider policy on blocking queries containing the string "pragma" 1121121305  
1027656000 https://github.com/simonw/datasette/issues/1618#issuecomment-1027656000 https://api.github.com/repos/simonw/datasette/issues/1618 IC_kwDOBm6k_c49QMlA simonw 9599 2022-02-02T07:27:14Z 2022-02-02T07:27:14Z OWNER

I also just realized that pragma pragma_list can be used to generate a list of all known pragmas for the connection:

sqlite-utils fixtures.db 'pragma pragma_list' --fmt github

| name |
|---------------------------|
| analysis_limit |
| application_id |
| auto_vacuum |
| automatic_index |
| busy_timeout |
| cache_size |
| cache_spill |
| case_sensitive_like |
| cell_size_check |
| checkpoint_fullfsync |
| collation_list |
| compile_options |
| count_changes |
| data_version |
| database_list |
| default_cache_size |
| defer_foreign_keys |
| empty_result_callbacks |
| encoding |
| foreign_key_check |
| foreign_key_list |
| foreign_keys |
| freelist_count |
| full_column_names |
| fullfsync |
| function_list |
| hard_heap_limit |
| ignore_check_constraints |
| incremental_vacuum |
| index_info |
| index_list |
| index_xinfo |
| integrity_check |
| journal_mode |
| journal_size_limit |
| legacy_alter_table |
| lock_proxy_file |
| locking_mode |
| max_page_count |
| mmap_size |
| module_list |
| optimize |
| page_count |
| page_size |
| pragma_list |
| query_only |
| quick_check |
| read_uncommitted |
| recursive_triggers |
| reverse_unordered_selects |
| schema_version |
| secure_delete |
| short_column_names |
| shrink_memory |
| soft_heap_limit |
| synchronous |
| table_info |
| table_list |
| table_xinfo |
| temp_store |
| temp_store_directory |
| threads |
| trusted_schema |
| user_version |
| wal_autocheckpoint |
| wal_checkpoint |
| writable_schema |

So I could use that list to create a much more specific regular expression, which would then allow the word "pragma" to be used more freely while still protecting against any known pragma function being called.
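
One way such a regular expression might be built is sketched below. This is an illustration, not Datasette's actual validation code: `build_pragma_function_regex` and `pragma_names_from_connection` are hypothetical names, and the example uses a small subset of the names listed above. A text-level check like this still cannot tell a function call from a string literal that happens to contain `pragma_index_info`.

```python
import re
import sqlite3


def pragma_names_from_connection(conn):
    # Assumes the SQLite build supports "pragma pragma_list".
    return [row[0] for row in conn.execute("pragma pragma_list")]


def build_pragma_function_regex(pragma_names):
    """Compile a pattern matching pragma_<name> for each known pragma."""
    if not pragma_names:
        raise ValueError("at least one pragma name is required")
    # Longest names first so that e.g. table_xinfo is tried before table_info.
    ordered = sorted(pragma_names, key=len, reverse=True)
    alternation = "|".join(re.escape(name) for name in ordered)
    return re.compile(r"\bpragma_(?:" + alternation + r")\b", re.IGNORECASE)


# Subset of the list above, for illustration only.
known = ["database_list", "function_list", "index_info", "table_info"]
pragma_function_re = build_pragma_function_regex(known)

blocked = "select * from pragma_index_info('idx52')"
allowed = "select * from books where title = 'The Pragmatic Programmer'"
print(bool(pragma_function_re.search(blocked)))
print(bool(pragma_function_re.search(allowed)))
```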

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Reconsider policy on blocking queries containing the string "pragma" 1121121305  
1027654979 https://github.com/simonw/datasette/issues/1618#issuecomment-1027654979 https://api.github.com/repos/simonw/datasette/issues/1618 IC_kwDOBm6k_c49QMVD simonw 9599 2022-02-02T07:25:22Z 2022-02-02T07:25:22Z OWNER

But... I just noticed something I had missed in the docs for https://www.sqlite.org/pragma.html#pragfunc

Table-valued functions exist only for PRAGMAs that return results and that have no side-effects.

So it's possible I'm being overly paranoid here after all: what I want to block here is people running things like PRAGMA case_sensitive_like = 1 which could affect the global state for that connection and cause unexpected behaviour later on.

So maybe I should allow all pragma functions. I previously added an allow-list of them in #761.
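
The connection-level side effect described above can be reproduced with Python's standard `sqlite3` module. This is a minimal sketch on an in-memory connection; `like_matches` is a hypothetical helper name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def like_matches(connection):
    # LIKE is case-insensitive for ASCII characters by default.
    return connection.execute("select 'a' like 'A'").fetchone()[0]


before = like_matches(conn)

# A PRAGMA statement with a side effect: it changes how LIKE behaves
# for every later query on this same connection.
conn.execute("PRAGMA case_sensitive_like = 1")

after = like_matches(conn)
print(before, after)
```

The identical query gives a different answer once the pragma has run, which is the kind of lingering state change the filtering is meant to prevent.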

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Reconsider policy on blocking queries containing the string "pragma" 1121121305  
1027653005 https://github.com/simonw/datasette/issues/1618#issuecomment-1027653005 https://api.github.com/repos/simonw/datasette/issues/1618 IC_kwDOBm6k_c49QL2N simonw 9599 2022-02-02T07:22:13Z 2022-02-02T07:22:13Z OWNER

There's a workaround for this at the moment, which is to use parameterized SQL queries. For example, this:

https://fivethirtyeight.datasettes.com/polls?sql=select+*+from+books+where+title+%3D+%3Atitle&title=The+Pragmatic+Programmer

So the SQL query is select * from books where title = :title and then &title=... is added to the URL.

The reason behind the quite aggressive pragma filtering is that SQLite allows you to execute pragmas using function calls, like this one:

    SELECT * FROM pragma_index_info('idx52');

These can be nested arbitrarily deeply in sub-queries, so it's difficult to write a regular expression that will definitely catch them.

I'm open to relaxing the regex a bit, but I need to be very confident that it's safe to do so.
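
The parameterized workaround can be illustrated with a simplified stand-in for the filter. This is a sketch under assumptions: `BLOCK_PRAGMA` and `run_if_allowed` are hypothetical and much cruder than whatever check Datasette applies. The point is only that the check inspects the SQL text, while the value travels separately as a named parameter.

```python
import re
import sqlite3

# Hypothetical, deliberately aggressive check: reject any SQL text
# that mentions "pragma" anywhere.
BLOCK_PRAGMA = re.compile(r"pragma", re.IGNORECASE)


def run_if_allowed(conn, sql, params=None):
    if BLOCK_PRAGMA.search(sql):
        raise ValueError("SQL text contains 'pragma'")
    return conn.execute(sql, params or {}).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("create table books (title text)")
conn.execute("insert into books values ('The Pragmatic Programmer')")

# The value is passed as a named parameter, so the SQL text itself
# never contains the word and passes the check.
rows = run_if_allowed(
    conn,
    "select * from books where title = :title",
    {"title": "The Pragmatic Programmer"},
)
print(rows)
```

Writing the same title inline as a string literal would make `run_if_allowed` raise `ValueError`, which mirrors the behaviour reported in this issue.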

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Reconsider policy on blocking queries containing the string "pragma" 1121121305  
1027648180 https://github.com/simonw/datasette/issues/1586#issuecomment-1027648180 https://api.github.com/repos/simonw/datasette/issues/1586 IC_kwDOBm6k_c49QKq0 simonw 9599 2022-02-02T07:13:31Z 2022-02-02T07:13:31Z OWNER

Running it as part of datasette publish is a smart idea - I'm slightly nervous about modifying the database file that has been published though, since part of the undocumented contract right now is that the bytes served are the exact same bytes as the ones you ran the publish against.

But there's no reason for that expectation to exist, and I doubt anyone is relying on that.
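
To make the byte-identity concern concrete, here is a minimal sketch showing that running `ANALYZE` rewrites the database file. It uses a throwaway database; the table name, index name and helper functions are hypothetical and unrelated to `datasette publish` itself.

```python
import hashlib
import os
import sqlite3
import tempfile


def sha256_of(path):
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def analyze_changes_bytes():
    """Return True if ANALYZE changed the bytes of a small demo database."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")
        conn = sqlite3.connect(path)
        conn.execute("create table t (id integer primary key, name text)")
        conn.execute("create index idx_t_name on t (name)")
        conn.executemany(
            "insert into t (name) values (?)", [(str(i),) for i in range(100)]
        )
        conn.commit()
        conn.close()
        before = sha256_of(path)

        # ANALYZE stores its statistics in the sqlite_stat1 table,
        # which lives inside the same database file.
        conn = sqlite3.connect(path)
        conn.execute("analyze")
        conn.commit()
        conn.close()
        after = sha256_of(path)
        return before != after


print(analyze_changes_bytes())
```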

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
run analyze on all databases as part of start up or publishing 1096536240  
1027647257 https://github.com/simonw/datasette/issues/1619#issuecomment-1027647257 https://api.github.com/repos/simonw/datasette/issues/1619 IC_kwDOBm6k_c49QKcZ simonw 9599 2022-02-02T07:11:43Z 2022-02-02T07:11:43Z OWNER

Weirdly the bug does NOT exhibit itself on this demo: https://datasette-apache-proxy-demo.datasette.io/prefix/fixtures/no_primary_key/1 - which correctly links to https://datasette-apache-proxy-demo.datasette.io/prefix/fixtures/no_primary_key/1.json

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
JSON link on row page is 404 if base_url setting is used 1121583414  
1027646659 https://github.com/simonw/datasette/issues/1619#issuecomment-1027646659 https://api.github.com/repos/simonw/datasette/issues/1619 IC_kwDOBm6k_c49QKTD simonw 9599 2022-02-02T07:10:37Z 2022-02-02T07:10:37Z OWNER

It's not just the table with slashes in the name. Same thing on http://127.0.0.1:3344/foo/bar/fixtures/attraction_characteristic/1 - the json link goes to a JSON-rendered 404 on http://127.0.0.1:3344/foo/bar/foo/bar/fixtures/attraction_characteristic/1.json

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
JSON link on row page is 404 if base_url setting is used 1121583414  
1027635925 https://github.com/simonw/datasette/issues/1576#issuecomment-1027635925 https://api.github.com/repos/simonw/datasette/issues/1576 IC_kwDOBm6k_c49QHrV simonw 9599 2022-02-02T06:47:20Z 2022-02-02T06:47:20Z OWNER

Here's what I was hacking around with when I uncovered this problem:
```diff
diff --git a/datasette/views/table.py b/datasette/views/table.py
index 77fb285..8c57d08 100644
--- a/datasette/views/table.py
+++ b/datasette/views/table.py
@@ -1,3 +1,4 @@
+import asyncio
 import urllib
 import itertools
 import json
@@ -615,44 +616,37 @@ class TableView(RowTableShared):
         if request.args.get("_timelimit"):
             extra_args["custom_time_limit"] = int(request.args.get("_timelimit"))
 
-        # Execute the main query!
-        results = await db.execute(sql, params, truncate=True, **extra_args)
-
-        # Calculate the total count for this query
-        filtered_table_rows_count = None
-        if (
-            not db.is_mutable
-            and self.ds.inspect_data
-            and count_sql == f"select count(*) from {table} "
-        ):
-            # We can use a previously cached table row count
-            try:
-                filtered_table_rows_count = self.ds.inspect_data[database]["tables"][
-                    table
-                ]["count"]
-            except KeyError:
-                pass
-
-        # Otherwise run a select count(*) ...
-        if count_sql and filtered_table_rows_count is None and not nocount:
-            try:
-                count_rows = list(await db.execute(count_sql, from_sql_params))
-                filtered_table_rows_count = count_rows[0][0]
-            except QueryInterrupted:
-                pass
-
-        # Faceting
-        if not self.ds.setting("allow_facet") and any(
-            arg.startswith("_facet") for arg in request.args
-        ):
-            raise BadRequest("_facet= is not allowed")
+        async def execute_count():
+            # Calculate the total count for this query
+            filtered_table_rows_count = None
+            if (
+                not db.is_mutable
+                and self.ds.inspect_data
+                and count_sql == f"select count(*) from {table} "
+            ):
+                # We can use a previously cached table row count
+                try:
+                    filtered_table_rows_count = self.ds.inspect_data[database][
+                        "tables"
+                    ][table]["count"]
+                except KeyError:
+                    pass
+
+            if count_sql and filtered_table_rows_count is None and not nocount:
+                try:
+                    count_rows = list(await db.execute(count_sql, from_sql_params))
+                    filtered_table_rows_count = count_rows[0][0]
+                except QueryInterrupted:
+                    pass
+
+            return filtered_table_rows_count
+
+        filtered_table_rows_count = await execute_count()
 
         # pylint: disable=no-member
         facet_classes = list(
             itertools.chain.from_iterable(pm.hook.register_facet_classes())
         )
-        facet_results = {}
-        facets_timed_out = []
         facet_instances = []
         for klass in facet_classes:
             facet_instances.append(
@@ -668,33 +662,58 @@ class TableView(RowTableShared):
                 )
             )
 
-        if not nofacet:
-            for facet in facet_instances:
-                (
-                    instance_facet_results,
-                    instance_facets_timed_out,
-                ) = await facet.facet_results()
-                for facet_info in instance_facet_results:
-                    base_key = facet_info["name"]
-                    key = base_key
-                    i = 1
-                    while key in facet_results:
-                        i += 1
-                        key = f"{base_key}_{i}"
-                    facet_results[key] = facet_info
-                facets_timed_out.extend(instance_facets_timed_out)
-
-        # Calculate suggested facets
-        suggested_facets = []
-        if (
-            self.ds.setting("suggest_facets")
-            and self.ds.setting("allow_facet")
-            and not _next
-            and not nofacet
-            and not nosuggest
-        ):
-            for facet in facet_instances:
-                suggested_facets.extend(await facet.suggest())
+        async def execute_suggested_facets():
+            # Calculate suggested facets
+            suggested_facets = []
+            if (
+                self.ds.setting("suggest_facets")
+                and self.ds.setting("allow_facet")
+                and not _next
+                and not nofacet
+                and not nosuggest
+            ):
+                for facet in facet_instances:
+                    suggested_facets.extend(await facet.suggest())
+            return suggested_facets
+
+        async def execute_facets():
+            facet_results = {}
+            facets_timed_out = []
+            if not self.ds.setting("allow_facet") and any(
+                arg.startswith("_facet") for arg in request.args
+            ):
+                raise BadRequest("_facet= is not allowed")
+
+            if not nofacet:
+                for facet in facet_instances:
+                    (
+                        instance_facet_results,
+                        instance_facets_timed_out,
+                    ) = await facet.facet_results()
+                    for facet_info in instance_facet_results:
+                        base_key = facet_info["name"]
+                        key = base_key
+                        i = 1
+                        while key in facet_results:
+                            i += 1
+                            key = f"{base_key}_{i}"
+                        facet_results[key] = facet_info
+                    facets_timed_out.extend(instance_facets_timed_out)
+
+            return facet_results, facets_timed_out
+
+        # Execute the main query, facets and facet suggestions in parallel:
+        (
+            results,
+            suggested_facets,
+            (facet_results, facets_timed_out),
+        ) = await asyncio.gather(
+            db.execute(sql, params, truncate=True, **extra_args),
+            execute_suggested_facets(),
+            execute_facets(),
+        )
+
+        results = await db.execute(sql, params, truncate=True, **extra_args)
 
         # Figure out columns and rows for the query
         columns = [r[0] for r in results.description]
```
It's a hacky attempt at running some of the table page queries in parallel to see what happens.
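
The `asyncio.gather` pattern used in that diff can be seen in isolation in the following self-contained sketch. The coroutines here are stand-ins: `fake_query`, `fake_facets` and `load_table_page` are hypothetical names that simulate awaitable database calls with `asyncio.sleep`, and nothing in it touches Datasette.

```python
import asyncio


async def fake_query(label, delay=0.01):
    # Stand-in for an awaitable database call such as db.execute(...)
    await asyncio.sleep(delay)
    return label


async def fake_facets():
    # Returns a pair, like execute_facets() in the diff above.
    facet_results = {"state": await fake_query("facet rows")}
    facets_timed_out = []
    return facet_results, facets_timed_out


async def load_table_page():
    # gather() schedules all three awaitables concurrently and returns
    # their results in the order they were passed in.
    (
        results,
        suggested_facets,
        (facet_results, facets_timed_out),
    ) = await asyncio.gather(
        fake_query("main rows"),
        fake_query("suggested facets"),
        fake_facets(),
    )
    return {
        "results": results,
        "suggested_facets": suggested_facets,
        "facet_results": facet_results,
        "facets_timed_out": facets_timed_out,
    }


if __name__ == "__main__":
    print(asyncio.run(load_table_page()))
```

Each awaitable passed to `gather` runs as its own task, which is why SQL executed inside those tasks is the case raised in this issue's title.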

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Traces should include SQL executed by subtasks created with `asyncio.gather` 1087181951  
1027635175 https://github.com/simonw/datasette/issues/1611#issuecomment-1027635175 https://api.github.com/repos/simonw/datasette/issues/1611 IC_kwDOBm6k_c49QHfn simonw 9599 2022-02-02T06:45:47Z 2022-02-02T06:45:47Z OWNER

Prototype, not sure that this actually works yet:

    diff --git a/datasette/database.py b/datasette/database.py
    index 6ce8721..0c4aec7 100644
    --- a/datasette/database.py
    +++ b/datasette/database.py
    @@ -256,18 +256,26 @@ class Database:
             # Try to get counts for each table, $limit timeout for each count
             counts = {}
             for table in await self.table_names():
    -            try:
    -                table_count = (
    -                    await self.execute(
    -                        f"select count(*) from [{table}]",
    -                        custom_time_limit=limit,
    -                    )
    -                ).rows[0][0]
    -                counts[table] = table_count
    -            # In some cases I saw "SQL Logic Error" here in addition to
    -            # QueryInterrupted - so we catch that too:
    -            except (QueryInterrupted, sqlite3.OperationalError, sqlite3.DatabaseError):
    -                counts[table] = None
    +            print(table.lower())
    +            if table.lower() == "knn":
    +                counts[table] = 0
    +            else:
    +                try:
    +                    table_count = (
    +                        await self.execute(
    +                            f"select count(*) from [{table}]",
    +                            custom_time_limit=limit,
    +                        )
    +                    ).rows[0][0]
    +                    counts[table] = table_count
    +                # In some cases I saw "SQL Logic Error" here in addition to
    +                # QueryInterrupted - so we catch that too:
    +                except (
    +                    QueryInterrupted,
    +                    sqlite3.OperationalError,
    +                    sqlite3.DatabaseError,
    +                ):
    +                    counts[table] = None
             if not self.is_mutable:
                 self._cached_table_counts = counts
             return counts

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Avoid ever running count(*) against SpatiaLite KNN table 1113384383  
1027634490 https://github.com/simonw/datasette/issues/1607#issuecomment-1027634490 https://api.github.com/repos/simonw/datasette/issues/1607 IC_kwDOBm6k_c49QHU6 simonw 9599 2022-02-02T06:44:30Z 2022-02-02T06:44:30Z OWNER

Prototype:

    diff --git a/datasette/app.py b/datasette/app.py
    index 09d7d03..e2a5aea 100644
    --- a/datasette/app.py
    +++ b/datasette/app.py
    @@ -724,6 +724,47 @@ class Datasette:
                     sqlite_extensions[extension] = None
                 except Exception:
                     pass
    +        # More details on SpatiaLite
    +        if "spatialite" in sqlite_extensions:
    +            spatialite_details = {}
    +            fns = (
    +                "spatialite_version",
    +                "spatialite_target_cpu",
    +                "rcheck_strict_sql_quoting",
    +                "freexl_version",
    +                "proj_version",
    +                "geos_version",
    +                "rttopo_version",
    +                "libxml2_version",
    +                "HasIconv",
    +                "HasMathSQL",
    +                "HasGeoCallbacks",
    +                "HasProj",
    +                "HasProj6",
    +                "HasGeos",
    +                "HasGeosAdvanced",
    +                "HasGeosTrunk",
    +                "HasGeosReentrant",
    +                "HasGeosOnlyReentrant",
    +                "HasMiniZip",
    +                "HasRtTopo",
    +                "HasLibXML2",
    +                "HasEpsg",
    +                "HasFreeXL",
    +                "HasGeoPackage",
    +                "HasGCP",
    +                "HasTopology",
    +                "HasKNN",
    +                "HasRouting",
    +            )
    +            for fn in fns:
    +                try:
    +                    result = conn.execute("select {}()".format(fn))
    +                    spatialite_details[fn] = result.fetchone()[0]
    +                except Exception:
    +                    pass
    +            sqlite_extensions["spatialite"] = spatialite_details
    +
             # Figure out supported FTS versions

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
More detailed information about installed SpatiaLite version 1109783030  
1027633686 https://github.com/simonw/datasette/issues/1533#issuecomment-1027633686 https://api.github.com/repos/simonw/datasette/issues/1533 IC_kwDOBm6k_c49QHIW simonw 9599 2022-02-02T06:42:53Z 2022-02-02T06:42:53Z OWNER

I'm going to apply the hack, then fix it again in #1518.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add `Link: rel="alternate"` header pointing to JSON for a table/query 1065431383  

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
, [performed_via_github_app] TEXT);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);
Powered by Datasette · Queries took 3402.464ms · About: github-to-sqlite