issues

1,141 rows sorted by updated_at descending

id node_id number title user state locked assignee milestone comments created_at updated_at ▲ closed_at author_association pull_request body repo type active_lock_reason
610829227 MDU6SXNzdWU2MTA4MjkyMjc= 749 Respect Cloud Run max response size of 32MB simonw 9599 open 0   Datasette 1.0 3268330 1 2020-05-01T16:06:46Z 2020-06-06T20:01:54Z   OWNER  

https://cloud.google.com/run/quotas lists the maximum response size as 32MB.

I spotted a bug where attempting to download a database file larger than that from a Cloud Run deployment (in this case it was https://github-to-sqlite.dogsheep.net/github.db after I accidentally increased the size of that database) returned a 500 error because of this.
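
One way to fail gracefully would be to check the file size before offering the download; a minimal sketch (function and constant names are illustrative, not Datasette's API):

```python
import os

CLOUD_RUN_MAX_RESPONSE = 32 * 1024 * 1024  # 32MB, per the quotas page above

def download_allowed(db_path, max_bytes=CLOUD_RUN_MAX_RESPONSE):
    # Only offer the .db download if the file fits in a single response
    return os.path.getsize(db_path) <= max_bytes
```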

datasette 107914493 issue  
449854604 MDU6SXNzdWU0NDk4NTQ2MDQ= 492 Facets not correctly persisted in hidden form fields simonw 9599 open 0   Datasette 1.0 3268330 3 2019-05-29T14:49:39Z 2020-06-06T20:01:53Z   OWNER  

Steps to reproduce: visit https://2a4b892.datasette.io/fixtures/roadside_attractions?_facet_m2m=attraction_characteristic and click "Apply"

Result is a 500: no such column: attraction_characteristic

The error occurs because of this hidden HTML input:

<input type="hidden" name="_facet" value="attraction_characteristic">

This should be:

<input type="hidden" name="_facet_m2m" value="attraction_characteristic">
datasette 107914493 issue  
450032134 MDU6SXNzdWU0NTAwMzIxMzQ= 495 facet_m2m gets confused by multiple relationships simonw 9599 open 0   Datasette 1.0 3268330 2 2019-05-29T21:37:28Z 2020-06-06T20:01:53Z   OWNER  

I got this for a database I was playing with:

I think this is because of these three tables:

datasette 107914493 issue  
463492815 MDU6SXNzdWU0NjM0OTI4MTU= 534 500 error on m2m facet detection simonw 9599 open 0   Datasette 1.0 3268330 1 2019-07-03T00:42:42Z 2020-06-06T20:01:53Z   OWNER  

This may help debug:

diff --git a/datasette/facets.py b/datasette/facets.py
index 76d73e5..07a4034 100644
--- a/datasette/facets.py
+++ b/datasette/facets.py
@@ -499,11 +499,14 @@ class ManyToManyFacet(Facet):
                 "outgoing"
             ]
             if len(other_table_outgoing_foreign_keys) == 2:
-                destination_table = [
-                    t
-                    for t in other_table_outgoing_foreign_keys
-                    if t["other_table"] != self.table
-                ][0]["other_table"]
+                try:
+                    destination_table = [
+                        t
+                        for t in other_table_outgoing_foreign_keys
+                        if t["other_table"] != self.table
+                    ][0]["other_table"]
+                except IndexError:
+                    import pdb; pdb.pm()
                 # Only suggest if it's not selected already
                 if ("_facet_m2m", destination_table) in args:
                     continue
datasette 107914493 issue  
520740741 MDU6SXNzdWU1MjA3NDA3NDE= 625 If you apply ?_facet_array=tags then &_facet=tags does nothing simonw 9599 open 0   Datasette 1.0 3268330 0 2019-11-11T04:59:29Z 2020-06-06T20:01:53Z   OWNER  

Start here: https://v0-30-2.datasette.io/fixtures/facetable?_facet_array=tags

Note that tags is offered as a suggested facet. But if you click that you get this:

https://v0-30-2.datasette.io/fixtures/facetable?_facet_array=tags&_facet=tags

The _facet=tags is added to the URL and it's removed from the list of suggested facets... but the facet itself is not displayed:

The _facet=tags facet should look like this:

datasette 107914493 issue  
542553350 MDU6SXNzdWU1NDI1NTMzNTA= 655 Copy and paste doesn't work reliably on iPhone for SQL editor simonw 9599 open 0   Datasette 1.0 3268330 2 2019-12-26T13:15:10Z 2020-06-06T20:01:53Z   OWNER  

I'm having a lot of trouble copying and pasting from the codemirror editor on my iPhone.

datasette 107914493 issue  
576722115 MDU6SXNzdWU1NzY3MjIxMTU= 696 Single failing unit test when run inside the Docker image simonw 9599 open 0   Datasette 1.0 3268330 1 2020-03-06T06:16:36Z 2020-06-06T20:01:53Z   OWNER  
docker run -it -v `pwd`:/mnt datasetteproject/datasette:latest /bin/bash
root@0e1928cfdf79:/# cd /mnt
root@0e1928cfdf79:/mnt# pip install -e .[test]
root@0e1928cfdf79:/mnt# pytest

I get one failure!

It was for test_searchable[/fixtures/searchable.json?_search=te*+AND+do*&_searchmode=raw-expected_rows3]

    def test_searchable(app_client, path, expected_rows):
        response = app_client.get(path)
>       assert expected_rows == response.json["rows"]
E       AssertionError: assert [[1, 'barry c...sel', 'puma']] == []
E         Left contains 2 more items, first extra item: [1, 'barry cat', 'terry dog', 'panther']
E         Full diff:
E         + []
E         - [[1, 'barry cat', 'terry dog', 'panther'],
E         -  [2, 'terry dog', 'sara weasel', 'puma']]

Originally posted by @simonw in https://github.com/simonw/datasette/issues/695#issuecomment-595614469

datasette 107914493 issue  
398011658 MDU6SXNzdWUzOTgwMTE2NTg= 398 Ensure downloading a 100+MB SQLite database file works simonw 9599 open 0   Datasette 1.0 3268330 2 2019-01-10T20:57:52Z 2020-06-06T20:01:52Z   OWNER  

I've seen attempted downloads of large files fail after about ten seconds.

datasette 107914493 issue  
440222719 MDU6SXNzdWU0NDAyMjI3MTk= 448 _facet_array should work against views simonw 9599 open 0   Datasette 1.0 3268330 1 2019-05-03T21:08:04Z 2020-06-06T20:01:52Z   OWNER  

I created this view: https://json-view-facet-bug-demo-j7hipcg4aq-uc.a.run.app/russian-ads-8dbda00/ads_with_targets

CREATE VIEW ads_with_targets as select ads.*, json_group_array(targets.name) as target_names from ads
  join ad_targets on ad_targets.ad_id = ads.id
  join targets on ad_targets.target_id = targets.id
  group by ad_targets.ad_id

When I try to apply faceting by array it appears to work at first: https://json-view-facet-bug-demo-j7hipcg4aq-uc.a.run.app/russian-ads/ads_with_targets?_facet_array=target_names

But actually it's doing the wrong thing - the SQL for the facets uses rowid, but rowid is not present on views at all! These results are incorrect, and clicking to select a facet will fail to produce any rows: https://json-view-facet-bug-demo-j7hipcg4aq-uc.a.run.app/russian-ads/ads_with_targets?_facet_array=target_names&target_names__arraycontains=people_who_match%3Ainterests%3AAfrican-American+Civil+Rights+Movement+%281954%E2%80%9468%29

Here's the SQL it should be using when you select a facet (note that it does not use a rowid):

https://json-view-facet-bug-demo-j7hipcg4aq-uc.a.run.app/russian-ads?sql=select+*+from+ads_with_targets+where+id+in+%28%0D%0A++++++++++++select+ads_with_targets.id+from+ads_with_targets%2C+json_each%28ads_with_targets.target_names%29+j%0D%0A++++++++++++where+j.value+%3D+%3Ap0%0D%0A++++++++%29+limit+101&p0=people_who_match%3Ainterests%3ABlack+%28Color%29
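
Decoded, that query selects by the view's id column instead of rowid. A sketch of what the facet SQL generation might look like if the key column were configurable (function and parameter names are assumptions):

```python
def array_facet_filter_sql(table, column, key_column="id"):
    # Mirrors the decoded query above: rows whose JSON array column
    # contains :p0, keyed on key_column instead of rowid
    return (
        "select * from {t} where {k} in (\n"
        "    select {t}.{k} from {t}, json_each({t}.{c}) j\n"
        "    where j.value = :p0\n"
        ") limit 101"
    ).format(t=table, k=key_column, c=column)

# array_facet_filter_sql("ads_with_targets", "target_names")
```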

So we need to do something a lot smarter here. I'm not sure what the fix will look like, or even if it's feasible given that views don't have a rowid to hook into so the JSON faceting SQL may have to be completely rewritten.

datasette publish cloudrun \
    russian-ads.db \
    --name json-view-facet-bug-demo \
    --branch master \
    --extra-options "--config sql_time_limit_ms:5000 --config facet_time_limit_ms:5000"
datasette 107914493 issue  
582517965 MDU6SXNzdWU1ODI1MTc5NjU= 698 Ability for a canned query to write to the database simonw 9599 closed 0   Datasette 0.44 5512395 26 2020-03-16T18:31:59Z 2020-06-06T19:43:49Z 2020-06-06T19:43:48Z OWNER  

Canned queries are currently read-only: https://datasette.readthedocs.io/en/0.38/sql_queries.html#canned-queries

Add a "write": true option to their definition in metadata.json which turns them into queries that are submitted via POST and send their queries to the write queue.

Then they can be used as a really quick way to define a writable interface and JSON API!
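
The definition might look something like this in metadata.json (database and query names here are illustrative):

```json
{
    "databases": {
        "mydatabase": {
            "queries": {
                "add_name": {
                    "sql": "insert into names (name) values (:name)",
                    "write": true
                }
            }
        }
    }
}
```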

datasette 107914493 issue  
582526961 MDU6SXNzdWU1ODI1MjY5NjE= 699 Authentication (and permissions) as a core concept simonw 9599 closed 0   Datasette 0.44 5512395 40 2020-03-16T18:48:00Z 2020-06-06T19:42:11Z 2020-06-06T19:42:11Z OWNER  

Right now Datasette authentication is provided exclusively by plugins:

This is an all-or-nothing approach: either your Datasette instance requires authentication at the top level or it does not.

But... as I build new plugins like https://github.com/simonw/datasette-configure-fts and https://github.com/simonw/datasette-edit-tables I increasingly have individual features which should be reserved for logged-in users while still wanting other parts of Datasette to be open to all.

This is too much for plugins to own independently of Datasette core. Datasette needs to ship a single "user is authenticated" concept (independent of how users actually sign in) so that different plugins can integrate with it.

datasette 107914493 issue  
632645865 MDExOlB1bGxSZXF1ZXN0NDI5MzY2NjQx 803 Canned query permissions simonw 9599 closed 0     0 2020-06-06T18:20:00Z 2020-06-06T19:40:21Z 2020-06-06T19:40:20Z OWNER simonw/datasette/pulls/803

Refs #800. Closes #786

datasette 107914493 pull  
628087971 MDU6SXNzdWU2MjgwODc5NzE= 786 Documentation page describing Datasette's authentication system simonw 9599 closed 0   Datasette 0.44 5512395 2 2020-06-01T01:10:06Z 2020-06-06T19:40:20Z 2020-06-06T19:40:20Z OWNER  

Originally posted by @simonw in https://github.com/simonw/datasette/issues/699#issuecomment-636562999

datasette 107914493 issue  
629524205 MDU6SXNzdWU2Mjk1MjQyMDU= 793 CSRF protection for /-/messages tool and writable canned queries simonw 9599 closed 0   Datasette 0.44 5512395 3 2020-06-02T21:22:21Z 2020-06-06T00:43:41Z 2020-06-05T19:05:59Z OWNER  

The /-/messages debug tool will need CSRF protection or people will be able to add messages using a hidden form on another website.
Originally posted by @simonw in https://github.com/simonw/datasette/issues/790#issuecomment-637790860

datasette 107914493 issue  
631300342 MDExOlB1bGxSZXF1ZXN0NDI4MjEyNDIx 798 CSRF protection simonw 9599 closed 0   Datasette 0.44 5512395 5 2020-06-05T04:22:35Z 2020-06-06T00:43:41Z 2020-06-05T19:05:58Z OWNER simonw/datasette/pulls/798

Refs #793

datasette 107914493 pull  
628025100 MDU6SXNzdWU2MjgwMjUxMDA= 785 Datasette secret mechanism - initially for signed cookies simonw 9599 closed 0   Datasette 0.44 5512395 11 2020-05-31T19:14:52Z 2020-06-06T00:43:40Z 2020-06-01T00:18:40Z OWNER  

See comment in https://github.com/simonw/datasette/issues/784#issuecomment-636514974

Datasette needs to be able to set signed cookies - which means it needs a mechanism for safely handling a signing secret.

Since Datasette is a long-running process the default behaviour here can be to create a random secret on startup. This means that if the server restarts any signed cookies will be invalidated.

If the user wants a persistent secret they'll have to generate it themselves - maybe by setting an environment variable?
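
A minimal sketch of that default behaviour (the environment variable name here is an assumption):

```python
import os
import secrets

def get_secret():
    # A persistent secret if the user set one; otherwise a fresh random
    # secret per process, which invalidates signed cookies on restart
    return os.environ.get("DATASETTE_SECRET") or secrets.token_hex(32)
```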

datasette 107914493 issue  
628121234 MDU6SXNzdWU2MjgxMjEyMzQ= 788 /-/permissions debugging tool simonw 9599 closed 0   Datasette 0.44 5512395 2 2020-06-01T03:13:47Z 2020-06-06T00:43:40Z 2020-06-01T05:01:01Z OWNER  

Debugging tool idea: /-/permissions page which shows you the actor and lets you type in the strings for action, resource_type and resource_identifier - then shows you EVERY plugin hook that would have executed and what it would have said, plus when the chain would have terminated.

Bonus: if you're logged in as the root user (or a user that matches some kind of permission check, maybe a check for permissions_debug) you get to see a rolling log of the last 30 permission checks and what the results were across the whole of Datasette. This should make figuring out permissions policies a whole lot easier.

Originally posted by @simonw in https://github.com/simonw/datasette/issues/699#issuecomment-636576603

datasette 107914493 issue  
632056825 MDU6SXNzdWU2MzIwNTY4MjU= 802 "datasette plugins" command is broken simonw 9599 closed 0     1 2020-06-05T23:33:01Z 2020-06-05T23:46:43Z 2020-06-05T23:46:43Z OWNER  

I broke it in https://github.com/simonw/datasette/commit/a7137dfe069e5fceca56f78631baebd4a6a19967 - and it turns out there was no test coverage so I didn't realize it was broken.

datasette 107914493 issue  
631789422 MDU6SXNzdWU2MzE3ODk0MjI= 799 TestResponse needs to handle multiple set-cookie headers simonw 9599 closed 0     2 2020-06-05T17:39:52Z 2020-06-05T18:34:10Z 2020-06-05T18:34:10Z OWNER  

Seeing this test failure on #798:

_______________________ test_auth_token _______________________
app_client = <tests.fixtures.TestClient object at 0x11285c910>
    def test_auth_token(app_client):
        "The /-/auth-token endpoint sets the correct cookie"
        assert app_client.ds._root_token is not None
        path = "/-/auth-token?token={}".format(app_client.ds._root_token)
        response = app_client.get(path, allow_redirects=False,)
        assert 302 == response.status
        assert "/" == response.headers["Location"]
>       assert {"id": "root"} == app_client.ds.unsign(response.cookies["ds_actor"], "actor")
E       KeyError: 'ds_actor'
datasette/tests/test_auth.py:12: KeyError

It looks like that's happening because the ASGI middleware is adding another set-cookie header - but those two set-cookie headers are combined into one when the TestResponse is constructed:

https://github.com/simonw/datasette/blob/0c064c5fe220b7b3d8dcf85b02b4e60452c47232/tests/fixtures.py#L113-L127
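
The fix is presumably to stop collapsing duplicate headers and keep set-cookie values as a list; a sketch:

```python
def split_headers(raw_headers):
    # Collapse everything except set-cookie, which must stay multi-valued
    headers = {}
    set_cookies = []
    for name, value in raw_headers:
        name = name.decode("utf8").lower()
        value = value.decode("utf8")
        if name == "set-cookie":
            set_cookies.append(value)
        else:
            headers[name] = value
    return headers, set_cookies
```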

datasette 107914493 issue  
570301333 MDU6SXNzdWU1NzAzMDEzMzM= 684 Add documentation on Database introspection methods to internals.rst simonw 9599 closed 0   Datasette 1.0 3268330 4 2020-02-25T04:20:24Z 2020-06-04T18:56:15Z 2020-05-30T18:40:39Z OWNER  

internals.rst will be landing as part of #683

datasette 107914493 issue  
275082158 MDU6SXNzdWUyNzUwODIxNTg= 119 Build an "export this data to google sheets" plugin simonw 9599 closed 0     1 2017-11-18T14:14:51Z 2020-06-04T18:46:40Z 2020-06-04T18:46:39Z OWNER  

Inspired by https://github.com/kren1/tosheets

It should be a plug-in because I'd like to keep all interactions with proprietary / non-open-source software encapsulated in plugins rather than shipped as part of core.

datasette 107914493 issue  
629595228 MDExOlB1bGxSZXF1ZXN0NDI2ODkxNDcx 796 New WIP writable canned queries simonw 9599 closed 0   Datasette 1.0 3268330 9 2020-06-03T00:08:00Z 2020-06-03T15:16:52Z 2020-06-03T15:16:50Z OWNER simonw/datasette/pulls/796

Refs #698. Replaces #703

Still todo:

  • Unit tests
  • Figure out .json mode (done)
  • Flash message solution
  • CSRF protection (done)
  • Better error message display on errors
  • Documentation
  • Maybe widgets? (deferred: I'll do these later)
datasette 107914493 pull  
585597133 MDExOlB1bGxSZXF1ZXN0MzkxOTI0NTA5 703 WIP implementation of writable canned queries simonw 9599 closed 0     3 2020-03-21T22:23:51Z 2020-06-03T00:08:14Z 2020-06-02T23:57:35Z OWNER simonw/datasette/pulls/703

Refs #698.

datasette 107914493 pull  
629535669 MDU6SXNzdWU2Mjk1MzU2Njk= 794 Show hooks implemented by each plugin on /-/plugins simonw 9599 closed 0   Datasette 1.0 3268330 2 2020-06-02T21:44:38Z 2020-06-02T22:30:17Z 2020-06-02T21:50:10Z OWNER  

e.g.

    {
        "name": "qs_actor.py",
        "static": false,
        "templates": false,
        "version": null,
        "hooks": [
            "actor_from_request"
        ]
    }
datasette 107914493 issue  
626593402 MDU6SXNzdWU2MjY1OTM0MDI= 780 Internals documentation for datasette.metadata() method simonw 9599 open 0   Datasette 1.0 3268330 2 2020-05-28T15:14:22Z 2020-06-02T22:13:12Z   OWNER  

https://github.com/simonw/datasette/blob/40885ef24e32d91502b6b8bbad1c7376f50f2830/datasette/app.py#L297-L328

datasette 107914493 issue  
497170355 MDU6SXNzdWU0OTcxNzAzNTU= 576 Documented internals API for use in plugins simonw 9599 open 0   Datasette 1.0 3268330 8 2019-09-23T15:28:50Z 2020-06-02T22:13:09Z   OWNER  

Quite a few of the plugin hooks make a "datasette" instance of the Datasette class available to the plugins, so that they can look up configuration settings and execute database queries.

This means it should provide a documented, stable API so that plugin authors can rely on it.

datasette 107914493 issue  
440134714 MDU6SXNzdWU0NDAxMzQ3MTQ= 446 Define mechanism for plugins to return structured data simonw 9599 open 0   Datasette 1.0 3268330 6 2019-05-03T17:00:16Z 2020-06-02T22:12:15Z   OWNER  

Several plugin hooks now expect plugins to return data in a specific shape - notably the new output format hook and the custom facet hook.

These use Python dictionaries right now but that's quite error prone: it would be good to have a mechanism that supported a more structured format.

Full list of current hooks is here: https://datasette.readthedocs.io/en/latest/plugins.html#plugin-hooks
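
One candidate for "more structured" is a dataclass per hook return type, e.g. (names illustrative, not a committed design):

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class OutputRenderer:
    extension: str
    callback: Callable
    can_render: Optional[Callable] = None
```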

datasette 107914493 issue  
629459637 MDU6SXNzdWU2Mjk0NTk2Mzc= 792 Replace response.body.decode("utf8") with response.text in tests simonw 9599 closed 0     0 2020-06-02T19:32:24Z 2020-06-02T21:29:58Z 2020-06-02T21:29:58Z OWNER  

Make use of the response.text property to clean up the tests a tiny bit:

https://github.com/simonw/datasette/blob/57cf5139c552cb7feab9947daa949ca434cc0a66/tests/fixtures.py#L26-L38
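
The property itself is presumably a one-liner on TestResponse along these lines:

```python
@property
def text(self):
    return self.body.decode("utf8")
```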

datasette 107914493 issue  
629473827 MDU6SXNzdWU2Mjk0NzM4Mjc= 5 Suggestion: Add output example to readme harryvederci 26745575 open 0     0 2020-06-02T19:56:49Z 2020-06-02T19:56:49Z   NONE  

First off, thanks for open sourcing this application! This is a suggestion to increase the number of people who would make use of it: an example in the readme file would help.

Currently, users have to clone the app, install it, authorize through Pocket, run a command, and then find out if this application does what they hope it does.

Another possibility is to add a file example-output.db, containing one (mock) Pocket article.

Keep up the good work!

pocket-to-sqlite 213286752 issue  
628156527 MDU6SXNzdWU2MjgxNTY1Mjc= 789 Mechanism for enabling pluggy tracing simonw 9599 open 0     2 2020-06-01T05:10:14Z 2020-06-01T05:11:03Z   OWNER  

Could be useful for debugging plugins: https://pluggy.readthedocs.io/en/latest/#call-tracing

I tried this out by adding these two lines in plugins.py:

pm = pluggy.PluginManager("datasette")
pm.add_hookspecs(hookspecs)
# Added these:
pm.trace.root.setwriter(print)
pm.enable_tracing()

Output looked something like this:

INFO:     127.0.0.1:52724 - "GET /-/-/static/app.css HTTP/1.1" 404 Not Found
  actor_from_request [hook]
      datasette: <datasette.app.Datasette object at 0x106277ad0>
      request: <datasette.utils.asgi.Request object at 0x106550a50>

  finish actor_from_request --> [] [hook]

  extra_body_script [hook]
      template: show_json.html
      database: None
      table: None
      view_name: json_data
      datasette: <datasette.app.Datasette object at 0x106277ad0>

  finish extra_body_script --> [] [hook]

  extra_template_vars [hook]
      template: show_json.html
      database: None
      table: None
      view_name: json_data
      request: <datasette.utils.asgi.Request object at 0x1065504d0>
      datasette: <datasette.app.Datasette object at 0x106277ad0>

  finish extra_template_vars --> [] [hook]

  extra_css_urls [hook]
      template: show_json.html
      database: None
      table: None
      datasette: <datasette.app.Datasette object at 0x106277ad0>

  finish extra_css_urls --> [] [hook]

  extra_js_urls [hook]
      template: show_json.html
      database: None
      table: None
      datasette: <datasette.app.Datasette object at 0x106277ad0>

  finish extra_js_urls --> [] [hook]

INFO:     127.0.0.1:52724 - "GET /-/actor HTTP/1.1" 200 OK
  actor_from_request [hook]
      datasette: <datasette.app.Datasette object at 0x106277ad0>
      request: <datasette.utils.asgi.Request object at 0x1065500d0>

  finish actor_from_request --> [] [hook]
datasette 107914493 issue  
627836898 MDExOlB1bGxSZXF1ZXN0NDI1NTMxMjA1 783 Authentication: plugin hooks plus default --root auth mechanism simonw 9599 closed 0     0 2020-05-30T22:25:47Z 2020-06-01T01:16:44Z 2020-06-01T01:16:43Z OWNER simonw/datasette/pulls/783

See #699

datasette 107914493 pull  
459590021 MDU6SXNzdWU0NTk1OTAwMjE= 519 Decide what goes into Datasette 1.0 simonw 9599 open 0   Datasette 1.0 3268330 2 2019-06-23T15:47:41Z 2020-05-30T18:55:24Z   OWNER  

Datasette ASGI #272 is a big part of it... but 1.0 will generally be an indicator that Datasette is a stable platform for developers to write plugins and custom templates against. So lots to think about.

datasette 107914493 issue  
326800219 MDU6SXNzdWUzMjY4MDAyMTk= 292 Mechanism for customizing the SQL used to select specific columns in the table view simonw 9599 open 0     14 2018-05-27T09:05:52Z 2020-05-30T18:45:38Z   OWNER  

Some columns don't make a lot of sense in their default representation - binary blobs such as SpatiaLite geometries for example, or lengthy columns that really should be truncated somehow.

We may also find that there are tables where we don't want to show all of the columns - so a mechanism to select a subset of columns would be nice.

I think there are two features here:

  • the ability to request a subset of columns on the table view
  • the ability to override the SQL for a specific column and/or add extra columns - AsGeoJSON(Geometry) for example

Both features should be available via both querystring arguments and in metadata.json

The querystring argument for custom SQL should only work if allow_sql config is turned on.

Refs #276

datasette 107914493 issue  
445850934 MDU6SXNzdWU0NDU4NTA5MzQ= 473 Plugin hook: register_filters simonw 9599 open 0     7 2019-05-19T18:44:33Z 2020-05-30T18:44:55Z   OWNER  

I meant to add this as part of the facets plugin mechanism but didn't quite get to it. This will allow plugins to register extra filters, as seen in datasette/filters.py:

https://github.com/simonw/datasette/blob/260085838887ee343f4d3b177c422e7aef5ade9d/datasette/filters.py#L83-L98

datasette 107914493 issue  
570101428 MDExOlB1bGxSZXF1ZXN0Mzc5MTkyMjU4 683 .execute_write() and .execute_write_fn() methods on Database simonw 9599 closed 0   Datasette 1.0 3268330 14 2020-02-24T19:51:58Z 2020-05-30T18:40:20Z 2020-02-25T04:45:08Z OWNER simonw/datasette/pulls/683

See #682

  • Come up with design for .execute_write() and .execute_write_fn()
  • Build some quick demo plugins to exercise the design
  • Write some unit tests
  • Write the documentation
datasette 107914493 pull  
268453968 MDU6SXNzdWUyNjg0NTM5Njg= 37 Ability to serialize massive JSON without blocking event loop simonw 9599 closed 0     2 2017-10-25T15:58:03Z 2020-05-30T17:29:20Z 2020-05-30T17:29:20Z OWNER  

We run the risk of someone attempting a select statement that returns thousands of rows and hence takes several seconds just to JSON encode the response, effectively blocking the event loop and pausing all other traffic.

The Twisted community have a solution for this, can we adapt that in some way? http://as.ynchrono.us/2010/06/asynchronous-json_18.html?m=1
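
In asyncio terms the standard workaround is to push the encode onto a worker thread; a sketch (not necessarily how Datasette addressed it):

```python
import asyncio
import json

async def dumps_off_loop(obj):
    # json.dumps runs in a worker thread, keeping the event loop free
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, json.dumps, obj)
```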

datasette 107914493 issue  
451513541 MDU6SXNzdWU0NTE1MTM1NDE= 498 Full text search of all tables at once? chrismp 7936571 closed 0     12 2019-06-03T14:24:43Z 2020-05-30T17:26:02Z 2020-05-30T17:26:02Z NONE  

Does datasette have a built-in way, in a browser, to do a full-text search of all columns, in all databases and tables, that have full-text search enabled? Is there a plugin that does this?

datasette 107914493 issue  
374953006 MDU6SXNzdWUzNzQ5NTMwMDY= 369 Interface should show same JSON shape options for custom SQL queries slygent 416374 open 0   Datasette 1.0 3268330 2 2018-10-29T10:39:15Z 2020-05-30T17:24:06Z   NONE  

At the moment the page returning a custom SQL query shows the JSON and CSV APIs, but not the multiple JSON shapes. However, adding the _shape parameter to the JSON API URL manually still works, so perhaps there should be consistency in the interface by having the same "Advanced Export" box for custom SQL queries.

datasette 107914493 issue  
459397625 MDU6SXNzdWU0NTkzOTc2MjU= 514 Documentation with recommendations on running Datasette in production without using Docker chrismp 7936571 open 0   Datasette 1.0 3268330 26 2019-06-21T22:48:12Z 2020-05-30T17:22:56Z   NONE  

I've got some SQLite databases too big to push to Heroku or the other services with built-in support in datasette.

So instead I moved my datasette code and databases to a remote server on Kimsufi. In the folder containing the SQLite databases I run the following code.

nohup datasette serve -h 0.0.0.0 *.db --cors --port 8000 --metadata metadata.json > output.log 2>&1 &

When I go to http://my-remote-server.com:8000, the site loads. But I know this is not a good long-term solution to running datasette on this server.

What is the "correct" way to have this site run, preferably on server port 80?

datasette 107914493 issue  
520667773 MDU6SXNzdWU1MjA2Njc3NzM= 620 Mechanism for indicating foreign key relationships in the table and query page URLs simonw 9599 open 0   Datasette 1.0 3268330 5 2019-11-10T22:26:27Z 2020-05-30T17:22:56Z   OWNER  

Datasette currently only inflates foreign keys (into named hyperlinks) if it detects them as foreign key constraints in the underlying database.

It would be useful if you could specify additional "foreign keys" using both metadata.json and the querystring - similar to how you can pass ?_fts_table=x https://datasette.readthedocs.io/en/stable/full_text_search.html#configuring-full-text-search-for-a-table-or-view

datasette 107914493 issue  
520681725 MDU6SXNzdWU1MjA2ODE3MjU= 621 Syntax for ?_through= that works as a form field simonw 9599 open 0   Datasette 1.0 3268330 3 2019-11-11T00:19:03Z 2020-05-30T17:22:56Z   OWNER  

The current syntax for ?_through= uses JSON to avoid any risk of confusion with table or column names that contain special characters.

This means you can't target a form field at it.

We should be able to support both - ?x.y.z=value for tables and columns with "regular" names, falling back to the current JSON syntax for columns or tables that won't work with the key/value syntax.

datasette 107914493 issue  
531502365 MDU6SXNzdWU1MzE1MDIzNjU= 646 Make database level information from metadata.json available in the index.html template lagolucas 18017473 open 0   Datasette 1.0 3268330 3 2019-12-02T19:55:10Z 2020-05-30T17:22:56Z   NONE  

Did a search on the issues here and didn't find anything related to what I want.

I want to take database-level information from the metadata JSON (like title, source and source_url) and use it on the index page.

I tried some small tweaks on the python and html files, but failed to get that result.

Is there a way? Thanks!

datasette 107914493 issue  
570309546 MDU6SXNzdWU1NzAzMDk1NDY= 685 Document (and reconsider design of) Database.execute() and Database.execute_against_connection_in_thread() simonw 9599 closed 0   Datasette 1.0 3268330 15 2020-02-25T04:49:44Z 2020-05-30T13:20:50Z 2020-05-08T17:42:18Z OWNER  

In #683 I started a new section of internals documentation covering the Database class: https://datasette.readthedocs.io/en/latest/internals.html#database-class

I decided not to document .execute() and .execute_against_connection_in_thread() yet because I'm not 100% happy with their API design yet.

datasette 107914493 issue  
585633142 MDU6SXNzdWU1ODU2MzMxNDI= 706 Documentation for the "request" object simonw 9599 closed 0   Datasette 1.0 3268330 6 2020-03-22T02:55:50Z 2020-05-30T13:20:00Z 2020-05-27T22:31:22Z OWNER  

Since that object is passed to the extra_template_vars hooks AND the classes registered by register_facet_classes it should be part of the documented interface on https://datasette.readthedocs.io/en/stable/internals.html

I could also start passing it to the register_output_renderer callback.

datasette 107914493 issue  
626078521 MDU6SXNzdWU2MjYwNzg1MjE= 774 Consolidate request.raw_args and request.args simonw 9599 closed 0   Datasette 1.0 3268330 8 2020-05-27T22:30:59Z 2020-05-29T23:27:35Z 2020-05-29T23:22:38Z OWNER  

request.raw_args is not documented, and I'd like to remove it entirely.
Originally posted by @simonw in https://github.com/simonw/datasette/issues/706#issuecomment-634975252

I use it in a few places in other projects though, so I'll have to fix those first: https://github.com/search?q=user%3Asimonw+raw_args&type=Code

datasette 107914493 issue  
345469355 MDU6SXNzdWUzNDU0NjkzNTU= 351 Automatically create a GitHub release linking to release notes for every tagged release simonw 9599 closed 0     1 2018-07-28T18:31:12Z 2020-05-28T18:56:16Z 2020-05-28T18:56:15Z OWNER  

Can use this API called from Travis: https://developer.github.com/v3/repos/releases/#create-a-release

The release it generates should look like this one: https://github.com/simonw/datasette/releases/tag/0.24

datasette 107914493 issue  
455965174 MDU6SXNzdWU0NTU5NjUxNzQ= 508 Ability to set default sort order for a table or view in metadata.json simonw 9599 closed 0 simonw 9599   1 2019-06-13T21:40:51Z 2020-05-28T18:53:03Z 2020-05-28T18:53:02Z OWNER  

It can go here in the documentation: https://datasette.readthedocs.io/en/stable/metadata.html#setting-which-columns-can-be-used-for-sorting

Also need to fix this sentence which is no longer true:

By default, database views in Datasette do not support sorting

datasette 107914493 issue  
626663119 MDU6SXNzdWU2MjY2NjMxMTk= 781 request.url and request.scheme should obey force_https_urls config setting simonw 9599 closed 0     3 2020-05-28T16:54:47Z 2020-05-28T17:39:54Z 2020-05-28T17:10:13Z OWNER  

I'm trying to get the https://www.niche-museums.com/browse/feed.atom feed to validate, and I got this from https://validator.w3.org/feed/check.cgi?url=https%3A%2F%2Fwww.niche-museums.com%2Fbrowse%2Ffeed.atom

This feed is valid, but interoperability with the widest range of feed readers could be improved by implementing the following recommendations.

line 6, column 73: Self reference doesn't match document location [help]

<link href="http://www.niche-museums.com/browse/feed.atom" rel="self"/>

I tried to fix this using force_https_urls (commit) but it didn't work - because that setting isn't respected by the Request class:

https://github.com/simonw/datasette/blob/40885ef24e32d91502b6b8bbad1c7376f50f2830/datasette/utils/asgi.py#L15-L32
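
One way to make the setting stick (an assumption, not necessarily the actual fix) is an ASGI wrapper that rewrites the scope's scheme before the Request class sees it:

```python
def force_https_scheme(app):
    # ASGI middleware: rewrite the scope so request.scheme and
    # request.url come out as https
    async def wrapped(scope, receive, send):
        if scope["type"] == "http":
            scope = dict(scope, scheme="https")
        await app(scope, receive, send)
    return wrapped
```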

datasette 107914493 issue  
626582657 MDU6SXNzdWU2MjY1ODI2NTc= 779 Make human_description_en explicitly available to output renderers simonw 9599 open 0     0 2020-05-28T14:59:54Z 2020-05-28T14:59:54Z   OWNER  

datasette-atom uses this:

https://github.com/simonw/datasette-atom/blob/df98a6c43a443224b6cd232f84703ec297ef046b/datasette_atom/init.py#L36-L37

    if data.get("human_description_en"):
        title += ": " + data["human_description_en"]

It's a nice way to generate a useful title for a filtered table.

datasette 107914493 issue  
608058890 MDU6SXNzdWU2MDgwNTg4OTA= 744 link_or_copy_directory() error - Invalid cross-device link aborruso 30607 closed 0     28 2020-04-28T06:26:45Z 2020-05-28T14:32:53Z 2020-05-27T06:01:28Z NONE  

Hi,
when I run

datasette  publish heroku -n myapp --template-dir ./template mydb.db

I have this error

Traceback (most recent call last):
  File "/home/aborruso/.local/lib/python3.7/site-packages/datasette/utils/__init__.py", line 607, in link_or_copy_directory
    shutil.copytree(src, dst, copy_function=os.link)
  File "/usr/lib/python3.7/shutil.py", line 365, in copytree
    raise Error(errors)
shutil.Error: [('/myfolder/youtubeComunePalermo/processing/./template/base.html', '/tmp/tmps9_4mzc4/templates/base.html', "[Errno 18] Invalid cross-device link: '/myfolder/youtubeComunePalermo/processing/./template/base.html' -> '/tmp/tmps9_4mzc4/templates/base.html'"), ('/myfolder/youtubeComunePalermo/processing/./template/index.html', '/tmp/tmps9_4mzc4/templates/index.html', "[Errno 18] Invalid cross-device link: '/myfolder/youtubeComunePalermo/processing/./template/index.html' -> '/tmp/tmps9_4mzc4/templates/index.html'")]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/aborruso/.local/bin/datasette", line 8, in <module>
    sys.exit(cli())
  File "/home/aborruso/.local/lib/python3.7/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/home/aborruso/.local/lib/python3.7/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/home/aborruso/.local/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/aborruso/.local/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/aborruso/.local/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/aborruso/.local/lib/python3.7/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/home/aborruso/.local/lib/python3.7/site-packages/datasette/publish/heroku.py", line 103, in heroku
    extra_metadata,
  File "/usr/lib/python3.7/contextlib.py", line 112, in __enter__
    return next(self.gen)
  File "/home/aborruso/.local/lib/python3.7/site-packages/datasette/publish/heroku.py", line 191, in temporary_heroku_directory
    os.path.join(tmp.name, "templates"),
  File "/home/aborruso/.local/lib/python3.7/site-packages/datasette/utils/__init__.py", line 609, in link_or_copy_directory
    shutil.copytree(src, dst)
  File "/usr/lib/python3.7/shutil.py", line 321, in copytree
    os.makedirs(dst)
  File "/usr/lib/python3.7/os.py", line 221, in makedirs
    mkdir(name, mode)
FileExistsError: [Errno 17] File exists: '/tmp/tmps9_4mzc4/templates'

I'm attaching my very basic template folder.

Thank you

template.zip

datasette 107914493 issue  
612382643 MDU6SXNzdWU2MTIzODI2NDM= 758 Question: Access to immutable database-path clausjuhl 2181410 open 0     6 2020-05-05T07:01:18Z 2020-05-28T08:23:27Z   NONE  

Hi Simon

Is there anywhere in the app context where one can access the hashed urlpath of the database? Currently it's included in the template context (databases[0]["path"]) when rendering urls of the database (eg. /db-44b06v9/cases...), but where can I find the hashed url when rendering the index page? I'm trying to avoid redirects. Thanks!

datasette 107914493 issue  
625930207 MDU6SXNzdWU2MjU5MzAyMDc= 770 register_output_renderer can_render mechanism simonw 9599 closed 0   Datasette 0.43 5471110 4 2020-05-27T18:29:14Z 2020-05-28T05:57:16Z 2020-05-28T05:57:16Z OWNER  

What I would like is the ability for renderers to opt in to / opt out of being displayed as options on the page.

https://www.niche-museums.com/browse/museums for example shows an atom link because the datasette-atom plugin is installed... but clicking it will give you a 400 error because the correct columns are not present.

Here's the code that passes a list of renderers to the template:

https://github.com/simonw/datasette/blob/2d099ad9c657d2cab59de91cdb8bfed2da236ef6/datasette/views/base.py#L411-L423

A renderer is currently defined as a two-key dictionary:

@hookimpl
def register_output_renderer(datasette):
    return {
        'extension': 'test',
        'callback': render_test
    }

I can add a third key, "should_suggest" which is a function that returns True or False for a given query. If that key is missing it is assumed to return True.

One catch: what arguments should be passed to the should_suggest(...) function?

UPDATE: now calling it can_render instead.

Originally posted by @simonw in https://github.com/simonw/datasette/issues/581#issuecomment-634856748
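
With the third key, the plugin-side definition becomes (a sketch of the proposal above; can_render_test is a hypothetical callable):

```python
@hookimpl
def register_output_renderer(datasette):
    return {
        "extension": "test",
        "callback": render_test,
        # Returns True/False for a given query; a missing key means True
        "can_render": can_render_test,
    }
```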

datasette 107914493 issue  
611540797 MDU6SXNzdWU2MTE1NDA3OTc= 751 Ability to set custom default _size on a per-table basis simonw 9599 closed 0   Datasette 0.43 5471110 4 2020-05-04T00:13:03Z 2020-05-28T05:00:22Z 2020-05-28T05:00:20Z OWNER  

I have some tables where I'd like the default page size to be 10, without affecting the rest of my Datasette instance.

datasette 107914493 issue  
626211658 MDU6SXNzdWU2MjYyMTE2NTg= 778 Ability to configure keyset pagination for views simonw 9599 open 0     0 2020-05-28T04:48:56Z 2020-05-28T04:48:56Z   OWNER  

Currently views offer pagination, but it uses offset/limit - e.g. https://latest.datasette.io/fixtures/paginated_view?_next=100

This means pagination will perform poorly on deeper pages.

If a view is based on a table that has a primary key it should be possible to configure efficient keyset pagination that works the same way that table pagination works.

This may be as simple as configuring a column that can be treated as a "primary key" for the purpose of pagination using metadata.json - or with a ?_view_pk=colname querystring argument.
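
A sketch of the keyset approach keyed on a configured column (names are assumptions):

```python
def keyset_page_sql(view, key_column, page_size=100):
    # Seek-style pagination: resume after the last key seen, which stays
    # fast at any depth, unlike offset/limit
    return (
        "select * from {v} where {k} > :after "
        "order by {k} limit {n}"
    ).format(v=view, k=key_column, n=page_size)
```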

datasette 107914493 issue  
626001501 MDU6SXNzdWU2MjYwMDE1MDE= 773 All plugin hooks should have unit tests simonw 9599 closed 0   Datasette 0.43 5471110 2 2020-05-27T20:17:41Z 2020-05-28T04:12:11Z 2020-05-28T04:09:25Z OWNER  

Four hooks currently missing tests:

  • prepare_jinja2_environment
  • publish_subcommand
  • register_facet_classes
  • register_output_renderer

```
$ pytest -k test_plugin_hooks_have_tests -vv
====================================== test session starts ======================================
platform darwin -- Python 3.7.7, pytest-5.2.4, py-1.8.1, pluggy-0.13.1 -- /Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/bin/python
cachedir: .pytest_cache
rootdir: /Users/simon/Dropbox/Development/datasette, inifile: pytest.ini
plugins: asyncio-0.10.0
collected 486 items / 475 deselected / 11 selected

tests/test_plugins.py::test_plugin_hooks_have_tests[asgi_wrapper] XPASS [ 9%]
tests/test_plugins.py::test_plugin_hooks_have_tests[extra_body_script] XPASS [ 18%]
tests/test_plugins.py::test_plugin_hooks_have_tests[extra_css_urls] XPASS [ 27%]
tests/test_plugins.py::test_plugin_hooks_have_tests[extra_js_urls] XPASS [ 36%]
tests/test_plugins.py::test_plugin_hooks_have_tests[extra_template_vars] XPASS [ 45%]
tests/test_plugins.py::test_plugin_hooks_have_tests[prepare_connection] XPASS [ 54%]
tests/test_plugins.py::test_plugin_hooks_have_tests[prepare_jinja2_environment] XFAIL [ 63%]
tests/test_plugins.py::test_plugin_hooks_have_tests[publish_subcommand] XFAIL [ 72%]
tests/test_plugins.py::test_plugin_hooks_have_tests[register_facet_classes] XFAIL [ 81%]
tests/test_plugins.py::test_plugin_hooks_have_tests[register_output_renderer] XFAIL [ 90%]
tests/test_plugins.py::test_plugin_hooks_have_tests[render_cell] XPASS [100%]

========================= 475 deselected, 4 xfailed, 7 xpassed in 1.70s =========================
```

Originally posted by @simonw in https://github.com/simonw/datasette/issues/771#issuecomment-634915104

datasette 107914493 issue  
626163974 MDU6SXNzdWU2MjYxNjM5NzQ= 776 register_output_renderer render callback should be optionally awaitable simonw 9599 closed 0   Datasette 0.43 5471110 1 2020-05-28T02:26:29Z 2020-05-28T02:43:36Z 2020-05-28T02:43:36Z OWNER  

In #581 I made a bunch of improvements to this, including making datasette available to it so it could execute queries.

But... it needs to be able to await in order to do that. Which means it should be optionally-awaitable.

Original idea here: https://github.com/simonw/datasette/issues/645#issuecomment-560036740
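
The usual pattern for "optionally awaitable" looks like this (a sketch, not necessarily the code that landed):

```python
import inspect

async def call_maybe_async(fn, *args, **kwargs):
    result = fn(*args, **kwargs)
    if inspect.isawaitable(result):
        # The callback was async (or returned an awaitable)
        result = await result
    return result
```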

datasette 107914493 issue  
502993509 MDU6SXNzdWU1MDI5OTM1MDk= 581 Redesign register_output_renderer callback simonw 9599 closed 0   Datasette 0.43 5471110 24 2019-10-05T17:43:23Z 2020-05-28T02:24:14Z 2020-05-28T02:21:50Z OWNER  

In building https://github.com/simonw/datasette-atom it became clear that the callback function (which currently accepts just args, data and view_name) would also benefit from access to a mechanism to render templates and a datasette instance so it can execute SQL.

To maintain backwards compatibility with existing plugins, we can introspect the callback function to see if it wants those new arguments or not.

At a minimum I want to make datasette and ASGI scope available.
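
The introspection trick for backwards compatibility is roughly this (a sketch):

```python
import inspect

def call_with_supported_args(fn, **available):
    # Only pass the keyword arguments the callback actually declares,
    # so existing plugins keep working unchanged
    wanted = inspect.signature(fn).parameters
    return fn(**{k: v for k, v in available.items() if k in wanted})
```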

datasette 107914493 issue  
530653633 MDU6SXNzdWU1MzA2NTM2MzM= 645 Mechanism for register_output_renderer to suggest extension or not simonw 9599 closed 0     4 2019-12-01T01:26:27Z 2020-05-28T02:22:18Z 2020-05-28T02:22:12Z OWNER  

datasette-atom only works if the user constructs a SQL query with specific output columns (atom_id, atom_updated, etc).

It would be good if the .atom link wasn't shown on the query/table page unless those columns were present. Right now you get a link which results in a 400 error:

See also #581.

datasette 107914493 issue  
626131309 MDU6SXNzdWU2MjYxMzEzMDk= 775 Move test plugins into datasette/tests/plugins/ directory simonw 9599 closed 0     1 2020-05-28T00:46:58Z 2020-05-28T00:57:31Z 2020-05-28T00:57:31Z OWNER  

Right now the plugins used during test runs are baked into strings. It would be nicer if they were actual files on disk.

Will make #581 easier to write tests for.

datasette 107914493 issue  
620969465 MDU6SXNzdWU2MjA5Njk0NjU= 767 Allow to specify a URL fragment for canned queries rixx 2657547 closed 0   Datasette 0.43 5471110 2 2020-05-19T13:17:42Z 2020-05-27T21:52:25Z 2020-05-27T21:52:25Z CONTRIBUTOR  

Canned queries are very useful to direct users to prepared data and views. I like to use them with charts using datasette-vega a lot, because people get a direct impression at first glance.

datasette-vega doesn't show up by default though, and users have to click through to it. Also, datasette-vega does not always guess the best way to render columns correctly, so it would be nice if I could specify a URL fragment in my canned queries to make sure people see what I want them to see.

My current workaround is to include a fragement link in description_html and ask people to reload the page, like here, which is a bit hacky.

datasette 107914493 issue  
622672640 MDExOlB1bGxSZXF1ZXN0NDIxNDkxODEw 768 Use dirs_exist_ok=True simonw 9599 closed 0   Datasette 0.43 5471110 0 2020-05-21T17:53:44Z 2020-05-27T20:21:56Z 2020-05-21T17:53:51Z OWNER simonw/datasette/pulls/768

Refs #744

datasette 107914493 pull  
625922239 MDExOlB1bGxSZXF1ZXN0NDI0MDMyNDQ1 769 Backport of Python 3.8 shutil.copytree simonw 9599 closed 0   Datasette 0.43 5471110 0 2020-05-27T18:17:15Z 2020-05-27T20:21:56Z 2020-05-27T18:17:44Z OWNER simonw/datasette/pulls/769

Closes #744

datasette 107914493 pull  
625991831 MDExOlB1bGxSZXF1ZXN0NDI0MDg1MjY0 772 Test that plugin hooks are unit tested simonw 9599 closed 0   Datasette 0.43 5471110 0 2020-05-27T20:01:32Z 2020-05-27T20:21:56Z 2020-05-27T20:16:03Z OWNER simonw/datasette/pulls/772

Refs #771

datasette 107914493 pull  
616012427 MDU6SXNzdWU2MTYwMTI0Mjc= 764 Add PyPI project urls to setup.py simonw 9599 closed 0   Datasette 0.43 5471110 3 2020-05-11T16:23:08Z 2020-05-27T20:21:36Z 2020-05-11T18:28:55Z OWNER  

Spotted this example here:

    project_urls={
        "Issues": "https://gitlab.com/Cyb3r-Jak3/ExifReader/issues",
        "Source Code": "https://gitlab.com/Cyb3r-Jak3/ExifReader/-/tree/publish",
        "CI": "https://gitlab.com/Cyb3r-Jak3/ExifReader/pipelines",
        "Releases": "https://github.com/Cyb3r-Jak3/ExifReader"
    },

Results in this on https://pypi.org/project/ExifReader/

datasette 107914493 issue  
625980317 MDU6SXNzdWU2MjU5ODAzMTc= 771 Unit test that checks that all plugin hooks have corresponding unit tests simonw 9599 closed 0   Datasette 0.43 5471110 5 2020-05-27T19:42:35Z 2020-05-27T20:21:36Z 2020-05-27T20:17:13Z OWNER  

Turns out some hooks are missing unit test coverage: https://github.com/simonw/datasette/issues/581#issuecomment-634893744

datasette 107914493 issue  
624490929 MDU6SXNzdWU2MjQ0OTA5Mjk= 28 Invalid SQL no such table: main.uploads dmd 41439 open 0     0 2020-05-25T21:25:39Z 2020-05-25T21:25:39Z   NONE  

http://127.0.0.1:8001/photos/photos_with_apple_metadata gives "Invalid SQL no such table: main.uploads"

dogsheep-photos 256834907 issue  
613006393 MDU6SXNzdWU2MTMwMDYzOTM= 20 Ability to serve thumbnailed Apple Photo from its place on disk simonw 9599 closed 0     10 2020-05-06T02:17:50Z 2020-05-25T20:14:22Z 2020-05-25T20:09:41Z MEMBER  

A custom Datasette plugin that can be run locally on a Mac laptop which knows how to serve photos such that they can be seen in the browser.

Originally posted by @simonw in https://github.com/dogsheep/photos-to-sqlite/issues/19#issuecomment-624406285

dogsheep-photos 256834907 issue  
621332242 MDU6SXNzdWU2MjEzMzIyNDI= 25 Create a public demo simonw 9599 closed 0     5 2020-05-19T22:47:20Z 2020-05-21T22:26:16Z 2020-05-20T05:54:18Z MEMBER  

So I can show people what this does, using some of my photos.

dogsheep-photos 256834907 issue  
621486115 MDU6SXNzdWU2MjE0ODYxMTU= 27 photos_with_apple_metadata view should include labels simonw 9599 open 0     0 2020-05-20T06:06:17Z 2020-05-20T06:06:17Z   MEMBER  

https://dogsheep-photos.dogsheep.net/public/photos_with_apple_metadata?place_city=New+Orleans&_facet=place_city&_facet_array=albums&_facet_array=persons

Here's one way to add that:

        select
          rowid,
          photo,
          (
            select
              json_group_array(
                json_object(
                  'label',
                  normalized_string,
                  'href',
                  '/photos/labelled?_hide_sql=1&label=' || normalized_string
                )
              )
            from
              labels
            where
              labels.uuid = photos_with_apple_metadata.uuid
          ) as labels,
          date,
dogsheep-photos 256834907 issue  
621323348 MDU6SXNzdWU2MjEzMjMzNDg= 24 Configurable URL for images simonw 9599 open 0     1 2020-05-19T22:25:56Z 2020-05-20T06:00:29Z   MEMBER  

This is hard-coded at the moment, which is bad:
https://github.com/dogsheep/photos-to-sqlite/blob/d5d69b9019703c47bc251444838578dd752801e2/photos_to_sqlite/cli.py#L269-L272

dogsheep-photos 256834907 issue  
621444763 MDU6SXNzdWU2MjE0NDQ3NjM= 26 Rename project to dogsheep-photos simonw 9599 closed 0     8 2020-05-20T04:12:34Z 2020-05-20T04:31:02Z 2020-05-20T04:30:40Z MEMBER  

photos-to-sqlite doesn't really capture the full scope of this project anymore.

dogsheep-photos 256834907 issue  
621280529 MDU6SXNzdWU2MjEyODA1Mjk= 23 create-subset command for creating a publishable subset of a photos database simonw 9599 closed 0     1 2020-05-19T20:58:20Z 2020-05-19T22:32:48Z 2020-05-19T22:32:37Z MEMBER  

I want to share a subset of my photos, without sharing everything. Idea:

$ photos-to-sqlite create-subset photos.db public.db "select sha256 from ... where ..."

So the command takes a SQL query that returns sha256 hashes, then creates a new file called public.db containing just the data corresponding to those photos.

dogsheep-photos 256834907 issue  
621286870 MDU6SXNzdWU2MjEyODY4NzA= 113 Syntactic sugar for ATTACH DATABASE simonw 9599 open 0     1 2020-05-19T21:10:00Z 2020-05-19T21:11:22Z   OWNER  

https://www.sqlite.org/lang_attach.html

Maybe something like this:

db.attach("other_db", "other_db.db")
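
A minimal implementation sketch of the proposed sugar (an assumption, not the shipped code):

```python
def attach(self, alias, filepath):
    # The file path can be bound as a parameter; the schema alias
    # cannot, so it is interpolated as a bracketed identifier
    self.conn.execute(
        "ATTACH DATABASE ? AS [{}]".format(alias), [str(filepath)]
    )
```
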
sqlite-utils 140912432 issue  
613002220 MDU6SXNzdWU2MTMwMDIyMjA= 19 apple-photos command should work even if upload has not run simonw 9599 closed 0     1 2020-05-06T02:02:25Z 2020-05-19T20:59:59Z 2020-05-19T20:59:59Z MEMBER  

I want people to be able to query their Apple Photos metadata without having to first run upload to upload all of their files to their own S3 bucket.

To do this I can have apple-photos calculate SHA256 hashes of each photo if the uploads table does not yet exist (or does not contain that photo).

dogsheep-photos 256834907 issue  
603295970 MDU6SXNzdWU2MDMyOTU5NzA= 729 Visually distinguish integer and text columns simonw 9599 closed 0     8 2020-04-20T14:47:26Z 2020-05-18T17:20:02Z 2020-05-15T18:16:56Z OWNER  

It would be useful if I could tell from looking at the table page if a column was an integer or text (or a float, I guess?). This is particularly important for knowing if it is safe to sort by that column.

datasette 107914493 issue  
617323873 MDU6SXNzdWU2MTczMjM4NzM= 766 Enable wildcard-searches by default clausjuhl 2181410 open 0     0 2020-05-13T10:14:48Z 2020-05-15T10:12:25Z   NONE  

Hi Simon.

It seems that datasette currently has wildcard-searches disabled by default (along with the boolean search-options, NEAR-queries and more, and despite the docs). If I try out the search-url provided in the docs (https://fara.datasettes.com/fara/FARA_All_ShortForms?_search=manafort), it does not handle wildcard-searches, and I'm unable to make it work on my datasette-instance.

I would argue that wildcard-searches is such a standard query, that it should be enabled by default. Requiring "_searchmode=raw" when using prefix-searches seems unnecessary. Plus: What happens to non-ascii searches when using "_searchmode=raw"? Is the "escape_fts"-function from datasette.utils ignored?

Thanks!

/Claus

datasette 107914493 issue  
615626118 MDU6SXNzdWU2MTU2MjYxMTg= 22 Try out ExifReader simonw 9599 open 0     4 2020-05-11T06:32:13Z 2020-05-14T05:59:53Z   MEMBER  

https://pypi.org/project/ExifReader/

New fork that should be able to handle EXIF in HEIC files.

Forked here: https://github.com/ianare/exif-py/issues/102#issuecomment-626376522

Refs #3

dogsheep-photos 256834907 issue  
610517472 MDU6SXNzdWU2MTA1MTc0NzI= 103 sqlite3.OperationalError: too many SQL variables in insert_all when using rows with varying numbers of columns b0b5h4rp13 32605365 closed 0     8 2020-05-01T02:26:14Z 2020-05-14T00:18:57Z 2020-05-14T00:18:57Z CONTRIBUTOR  

If you use insert_all to insert 1000 rows of data with a varying number of columns, it fails with sqlite3.OperationalError: too many SQL variables when the number of columns is larger in later records (past the first row).

I've reduced SQLITE_MAX_VARS by 100 to 899 at the top of db.py to add wiggle room, so that if the column count increases it won't go past SQLite's batch limit as calculated by this line of code, which is based on the count of the first row's dict keys:

    batch_size = max(1, min(batch_size, SQLITE_MAX_VARS // num_columns))
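
An alternative to shrinking SQLITE_MAX_VARS would be to size batches by the widest row rather than the first one; a sketch (rows, batch_size and SQLITE_MAX_VARS come from the surrounding code):

```python
# Assumes rows is a list (not a generator), so it can be scanned twice
num_columns = max(len(row) for row in rows)
batch_size = max(1, min(batch_size, SQLITE_MAX_VARS // num_columns))
```
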
sqlite-utils 140912432 issue  
616271236 MDU6SXNzdWU2MTYyNzEyMzY= 112 add_foreign_key(...., ignore=True) simonw 9599 open 0     4 2020-05-12T00:24:00Z 2020-05-12T00:27:24Z   OWNER  

When using this library I often find myself wanting to "add this foreign key, but only if it doesn't exist yet". The ignore=True parameter is increasingly being used for this elsewhere in the library (e.g. in create_view()).

sqlite-utils 140912432 issue  
461215118 MDU6SXNzdWU0NjEyMTUxMTg= 30 Option to open database in read-only mode simonw 9599 closed 0     1 2019-06-26T22:50:38Z 2020-05-11T19:17:17Z 2020-05-11T19:17:17Z OWNER  

Would this make it 100% safe to run reads against a database file that is being written to by another process?
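
The standard library already supports this via a SQLite URI (shown as a hint, not sqlite-utils' API):

```python
import sqlite3

# mode=ro opens the file read-only: any write raises OperationalError
conn = sqlite3.connect("file:data.db?mode=ro", uri=True)
```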

sqlite-utils 140912432 issue  
616087149 MDU6SXNzdWU2MTYwODcxNDk= 765 publish heroku should default to currently tagged version simonw 9599 open 0     1 2020-05-11T18:24:06Z 2020-05-11T18:25:43Z   OWNER  

Had a report that deploying to Heroku was using the previously installed version of Datasette, not the latest.

Could be because of this:

https://github.com/simonw/datasette/blob/af6c6c5d6f929f951c0e63bfd1c82e37a071b50f/datasette/publish/heroku.py#L172-L179

Heroku documentation recommends pinning to specific versions https://devcenter.heroku.com/articles/python-pip

So... we could ensure we default to an install value of ["datasette>=current_tag"].
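
A sketch of that default (assuming datasette.__version__ reflects the version being published):

```python
from datasette import __version__

# Pin the Heroku requirement to at least the running version
install = ["datasette>={}".format(__version__)]
```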

datasette 107914493 issue  
615477131 MDU6SXNzdWU2MTU0NzcxMzE= 111 sqlite-utils drop-table and drop-view commands simonw 9599 closed 0     2 2020-05-10T21:10:42Z 2020-05-11T01:58:36Z 2020-05-11T00:44:26Z OWNER  

Would be useful to be able to drop views and tables from the CLI.

sqlite-utils 140912432 issue  
613755043 MDU6SXNzdWU2MTM3NTUwNDM= 110 Support decimal.Decimal type dvhthomas 134771 closed 0     6 2020-05-07T03:57:19Z 2020-05-11T01:58:20Z 2020-05-11T01:50:11Z NONE  

Decimal types in Postgres cause a failure in db.py data type selection

I have a Django app using a MoneyField, which uses a numeric(14,0) data type in Postgres (https://www.postgresql.org/docs/9.3/datatype-numeric.html). When attempting to export that table I get the following error:

$ db-to-sqlite --table isaweb_proposal "postgres://connection" test.db
....
    column_type=COLUMN_TYPE_MAPPING[column_type],
KeyError: <class 'decimal.Decimal'>

Looking at sqlite_utils/db.py around line 292, it's clear that there is no matching type for what I assume SQLAlchemy interprets as Python decimal.Decimal.

From the SQLite docs it looks like DECIMAL in other DBs are considered numeric.

I'm not quite sure if it's as simple as adding a data type to that list or if there are repercussions beyond it.

Thanks for a great tool!

sqlite-utils 140912432 issue  
615474990 MDU6SXNzdWU2MTU0NzQ5OTA= 21 bpylist.archiver.CircularReference: archive has a cycle with uid(13) simonw 9599 closed 0     10 2020-05-10T20:58:06Z 2020-05-10T22:01:48Z 2020-05-10T21:57:13Z MEMBER  
% python -i $(which photos-to-sqlite) apple-photos photos.db                     
Traceback (most recent call last):
  File "/Users/simon/.local/share/virtualenvs/photos-to-sqlite-0uGSHd6e/lib/python3.8/site-packages/osxphotos/photoinfo.py", line 611, in place
    return self._place  # pylint: disable=access-member-before-definition
AttributeError: 'PhotoInfo' object has no attribute '_place'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/simon/.local/share/virtualenvs/photos-to-sqlite-0uGSHd6e/bin/photos-to-sqlite", line 11, in <module>
    load_entry_point('photos-to-sqlite', 'console_scripts', 'photos-to-sqlite')()
  File "/Users/simon/.local/share/virtualenvs/photos-to-sqlite-0uGSHd6e/lib/python3.8/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/Users/simon/.local/share/virtualenvs/photos-to-sqlite-0uGSHd6e/lib/python3.8/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/Users/simon/.local/share/virtualenvs/photos-to-sqlite-0uGSHd6e/lib/python3.8/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/simon/.local/share/virtualenvs/photos-to-sqlite-0uGSHd6e/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/simon/.local/share/virtualenvs/photos-to-sqlite-0uGSHd6e/lib/python3.8/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/Users/simon/Dropbox/Development/photos-to-sqlite/photos_to_sqlite/cli.py", line 249, in apple_photos
    photo_row = osxphoto_to_row(sha256, photo)
  File "/Users/simon/Dropbox/Development/photos-to-sqlite/photos_to_sqlite/utils.py", line 91, in osxphoto_to_row
    place = photo.place
  File "/Users/simon/.local/share/virtualenvs/photos-to-sqlite-0uGSHd6e/lib/python3.8/site-packages/osxphotos/photoinfo.py", line 614, in place
    self._place = PlaceInfo5(self._info["reverse_geolocation"])
  File "/Users/simon/.local/share/virtualenvs/photos-to-sqlite-0uGSHd6e/lib/python3.8/site-packages/osxphotos/placeinfo.py", line 505, in __init__
    self._plrevgeoloc = archiver.unarchive(revgeoloc_bplist)
  File "/Users/simon/.local/share/virtualenvs/photos-to-sqlite-0uGSHd6e/lib/python3.8/site-packages/bpylist/archiver.py", line 16, in unarchive
    return Unarchive(plist).top_object()
  File "/Users/simon/.local/share/virtualenvs/photos-to-sqlite-0uGSHd6e/lib/python3.8/site-packages/bpylist/archiver.py", line 256, in top_object
    return self.decode_object(self.top_uid)
  File "/Users/simon/.local/share/virtualenvs/photos-to-sqlite-0uGSHd6e/lib/python3.8/site-packages/bpylist/archiver.py", line 247, in decode_object
    obj = klass.decode_archive(ArchivedObject(raw_obj, self))
  File "/Users/simon/.local/share/virtualenvs/photos-to-sqlite-0uGSHd6e/lib/python3.8/site-packages/osxphotos/placeinfo.py", line 126, in decode_archive
    mapItem = archive.decode("mapItem")
  File "/Users/simon/.local/share/virtualenvs/photos-to-sqlite-0uGSHd6e/lib/python3.8/site-packages/bpylist/archiver.py", line 140, in decode
    return self._unarchiver.decode_key(self._object, key)
  File "/Users/simon/.local/share/virtualenvs/photos-to-sqlite-0uGSHd6e/lib/python3.8/site-packages/bpylist/archiver.py", line 216, in decode_key
    return self.decode_object(val)
  File "/Users/simon/.local/share/virtualenvs/photos-to-sqlite-0uGSHd6e/lib/python3.8/site-packages/bpylist/archiver.py", line 247, in decode_object
    obj = klass.decode_archive(ArchivedObject(raw_obj, self))
  File "/Users/simon/.local/share/virtualenvs/photos-to-sqlite-0uGSHd6e/lib/python3.8/site-packages/osxphotos/placeinfo.py", line 180, in decode_archive
    sortedPlaceInfos = archive.decode("sortedPlaceInfos")
  File "/Users/simon/.local/share/virtualenvs/photos-to-sqlite-0uGSHd6e/lib/python3.8/site-packages/bpylist/archiver.py", line 140, in decode
    return self._unarchiver.decode_key(self._object, key)
  File "/Users/simon/.local/share/virtualenvs/photos-to-sqlite-0uGSHd6e/lib/python3.8/site-packages/bpylist/archiver.py", line 216, in decode_key
    return self.decode_object(val)
  File "/Users/simon/.local/share/virtualenvs/photos-to-sqlite-0uGSHd6e/lib/python3.8/site-packages/bpylist/archiver.py", line 247, in decode_object
    obj = klass.decode_archive(ArchivedObject(raw_obj, self))
  File "/Users/simon/.local/share/virtualenvs/photos-to-sqlite-0uGSHd6e/lib/python3.8/site-packages/bpylist/archiver.py", line 112, in decode_archive
    return [archive._decode_index(index) for index in uids]
  File "/Users/simon/.local/share/virtualenvs/photos-to-sqlite-0uGSHd6e/lib/python3.8/site-packages/bpylist/archiver.py", line 112, in <listcomp>
    return [archive._decode_index(index) for index in uids]
  File "/Users/simon/.local/share/virtualenvs/photos-to-sqlite-0uGSHd6e/lib/python3.8/site-packages/bpylist/archiver.py", line 137, in _decode_index
    return self._unarchiver.decode_object(index)
  File "/Users/simon/.local/share/virtualenvs/photos-to-sqlite-0uGSHd6e/lib/python3.8/site-packages/bpylist/archiver.py", line 247, in decode_object
    obj = klass.decode_archive(ArchivedObject(raw_obj, self))
  File "/Users/simon/.local/share/virtualenvs/photos-to-sqlite-0uGSHd6e/lib/python3.8/site-packages/osxphotos/placeinfo.py", line 217, in decode_archive
    placeType = archive.decode("placeType")
  File "/Users/simon/.local/share/virtualenvs/photos-to-sqlite-0uGSHd6e/lib/python3.8/site-packages/bpylist/archiver.py", line 140, in decode
    return self._unarchiver.decode_key(self._object, key)
  File "/Users/simon/.local/share/virtualenvs/photos-to-sqlite-0uGSHd6e/lib/python3.8/site-packages/bpylist/archiver.py", line 216, in decode_key
    return self.decode_object(val)
  File "/Users/simon/.local/share/virtualenvs/photos-to-sqlite-0uGSHd6e/lib/python3.8/site-packages/bpylist/archiver.py", line 227, in decode_object
    raise CircularReference(index)
bpylist.archiver.CircularReference: archive has a cycle with uid(13)

In the debugger I traced this back to:

178         @staticmethod
179         def decode_archive(archive):
180  ->         sortedPlaceInfos = archive.decode("sortedPlaceInfos")
181             finalPlaceInfos = archive.decode("finalPlaceInfos")
182             return PLRevGeoMapItem(sortedPlaceInfos, finalPlaceInfos)
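A possible defensive workaround, sketched below (safe_place is a hypothetical helper, not the project's actual fix): treat photos whose reverse-geolocation archive cannot be unarchived as having no place data.

from bpylist.archiver import CircularReference

def safe_place(photo):
    # Return photo.place, or None when the bplist archive contains a cycle,
    # e.g. "archive has a cycle with uid(13)" as in the traceback above
    try:
        return photo.place
    except CircularReference:
        return None
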
dogsheep-photos 256834907 issue  
322283067 MDU6SXNzdWUzMjIyODMwNjc= 254 Escaping named parameters in canned queries philroche 247131 closed 0     4 2018-05-11T12:43:30Z 2020-05-10T14:54:14Z 2020-05-10T14:54:13Z NONE  

Thank you very much for this project.

I have created some canned queries, but some of the filters include a colon, e.g. "com.ubuntu.cloud:server:18.04:amd64". When saved, these colons are parsed as named parameters.

Is there a way to escape colons in a canned query?
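
One workaround sketch, using a hypothetical products table and product_name column: bind the colon-laden value as a single named parameter so it never appears in the SQL text at all.

import sqlite3

conn = sqlite3.connect("data.db")
# the colons live in the bound value, not in the SQL text, so nothing
# gets mis-parsed as a :named parameter
rows = conn.execute(
    "select * from products where product_name = :product_name",
    {"product_name": "com.ubuntu.cloud:server:18.04:amd64"},
).fetchall()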

datasette 107914493 issue  
520655983 MDU6SXNzdWU1MjA2NTU5ODM= 619 "Invalid SQL" page should let you edit the SQL simonw 9599 open 0     1 2019-11-10T20:54:12Z 2020-05-08T20:29:12Z   OWNER  

https://latest.datasette.io/fixtures?sql=select%0D%0A++*%0D%0Afrom%0D%0A++%5Bfoo%5D

Would be useful if this page showed you the invalid SQL you entered so you can edit it and try again.

datasette 107914493 issue  
534492501 MDU6SXNzdWU1MzQ0OTI1MDE= 648 Mechanism for adding arbitrary pages like /about simonw 9599 closed 0     13 2019-12-08T04:55:19Z 2020-05-07T15:21:19Z 2020-04-26T18:46:45Z OWNER  

For www.niche-museums.com I solved this by creating an empty about.db database file - see https://simonwillison.net/2019/Nov/25/niche-museums/

I want a neater mechanism for this.

datasette 107914493 issue  
613777056 MDU6SXNzdWU2MTM3NzcwNTY= 39 issues foreign key to repo isn't working simonw 9599 open 0     0 2020-05-07T05:11:48Z 2020-05-07T05:11:48Z   MEMBER  

https://dogsheep.simonwillison.net/github/issues?_facet=repo

If the foreign key was working those would be repository names.

From the schema at the bottom of the page:

   [repo] TEXT,

That's the wrong type and not a foreign key.
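
A repair sketch using sqlite-utils (table.transform() post-dates this issue, so this assumes a recent version of the library):

import sqlite_utils

db = sqlite_utils.Database("github.db")
# convert [repo] TEXT to INTEGER, then declare the foreign key
db["issues"].transform(types={"repo": int})
db["issues"].add_foreign_key("repo", "repos", "id")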

github-to-sqlite 207052882 issue  
613467382 MDU6SXNzdWU2MTM0NjczODI= 761 Allow-list pragma_table_info(tablename) and similar simonw 9599 closed 0     8 2020-05-06T16:54:14Z 2020-05-07T03:09:05Z 2020-05-06T17:18:38Z OWNER  

It would be great if pragma_table_info(tablename) was allowed to be used in queries. See also https://github.com/simonw/til/blob/master/sqlite/list-all-columns-in-a-database.md

select * from pragma_table_info(tablename); is currently disallowed for user-provided queries via a regex restriction - but could help here too.

https://github.com/simonw/datasette/blob/d349d57cdf3d577afb62bdf784af342a4d5be660/datasette/utils/init.py#L174

Originally posted by @simonw in https://github.com/simonw/datasette/issues/760#issuecomment-624729459
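
For reference, a sketch of the query this would unlock, following the pattern from the linked TIL (run here through Python's sqlite3 module against any database file):

import sqlite3

conn = sqlite3.connect("fixtures.db")
# pragma_table_info() as a table-valued function needs SQLite >= 3.16
sql = """
select m.name as table_name, ti.name as column_name
from sqlite_master as m
join pragma_table_info(m.name) as ti
where m.type = 'table'
order by table_name, ti.cid
"""
for table_name, column_name in conn.execute(sql):
    print(table_name, column_name)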

datasette 107914493 issue  
613491342 MDU6SXNzdWU2MTM0OTEzNDI= 762 Experiment with PRAGMA hard_heap_limit simonw 9599 open 0     0 2020-05-06T17:33:23Z 2020-05-07T03:08:44Z   OWNER  

This was added in SQLite 2020-01-22 (3.31.0): https://www.sqlite.org/changes.html#version_3_31_0

Add the sqlite3_hard_heap_limit64() interface and the corresponding PRAGMA hard_heap_limit command.

This sounds like it could be a nice extra safety measure.
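
A minimal sketch of the experiment, per connection (assumes the underlying SQLite library is 3.31.0 or newer):

import sqlite3

conn = sqlite3.connect("fixtures.db")
# allocations beyond the hard limit make queries fail with SQLITE_NOMEM
# instead of growing memory without bound
conn.execute("PRAGMA hard_heap_limit = 134217728")  # 128 MB, in bytes
print(conn.execute("PRAGMA hard_heap_limit").fetchone())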

datasette 107914493 issue  
613422636 MDU6SXNzdWU2MTM0MjI2MzY= 760 Way of seeing full schema for a database simonw 9599 open 0     3 2020-05-06T15:46:08Z 2020-05-06T23:49:06Z   OWNER  

I find myself wanting to quickly figure out all of the BLOB columns in a database.

A /-/schema page showing the full schema (actually since it's per-database probably /dbname/-/schema or /-/schema/dbname) would be really handy.

It would need to be carefully constructed from various queries against sqlite_master - just doing select * from sqlite_master where type='table' isn't quite enough because I also want to show indexes, triggers etc.
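
A sketch of the sqlite_master query such a page could be built on, covering tables, indexes, triggers and views in one pass:

import sqlite3

conn = sqlite3.connect("fixtures.db")
# auto-created objects (e.g. internal autoindexes) have sql = null
sql = """
select type, name, sql from sqlite_master
where sql is not null
order by type, name
"""
for obj_type, name, schema_sql in conn.execute(sql):
    print(f"-- {obj_type}: {name}")
    print(schema_sql + ";")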

datasette 107914493 issue  
612673948 MDU6SXNzdWU2MTI2NzM5NDg= 759 fts search on a column doesn't work anymore due to escape_fts Krazybug 133845 closed 0     2 2020-05-05T15:03:44Z 2020-05-06T20:04:42Z 2020-05-06T17:50:57Z NONE  

Hi, and first of all, thank you for the awesome work you are doing on this project.

On a database indexed for full-text search, I can no longer query against an indexed column.

The query "cauvin language:ita" runs smoothly on an old version of Datasette but not on the current version.

Compare the current version's query:
select uuid, title, authors, year, series, language, formats, publisher, tags, identifiers from summary where rowid in (select rowid from summary_fts where summary_fts match escape_fts(:search)) order by uuid limit 101

To an older version:

select title, authors, series, uuid, language, identifiers, tags, publisher, formats, year, links from summary where rowid in (select rowid from summary_fts where summary_fts match :search) order by uuid limit 101

language is a searchable column, but now the entire string "cauvin language:ita" is treated literally as a single search term; the column filters are not parsed.
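
For illustration, a sketch of the old behaviour (summary.db is an assumed filename; the table names match the queries above): FTS5's column-filter syntax only works when the string reaches MATCH unescaped.

import sqlite3

conn = sqlite3.connect("summary.db")
# "language:ita" is an FTS5 column filter here; escape_fts() would quote
# the whole string into one literal phrase and lose the filter
query = "cauvin language:ita"
rows = conn.execute(
    "select rowid from summary_fts where summary_fts match ?", [query]
).fetchall()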

datasette 107914493 issue  
612378203 MDU6SXNzdWU2MTIzNzgyMDM= 757 Question: Any fixed date for the release with the uft8-encoding fix? clausjuhl 2181410 closed 0     3 2020-05-05T06:51:20Z 2020-05-06T18:41:29Z 2020-05-06T18:41:29Z NONE  

Just a little impatient :)

datasette 107914493 issue  
612860758 MDU6SXNzdWU2MTI4NjA3NTg= 18 Switch CI solution to GitHub Actions with a macOS runner simonw 9599 open 0     1 2020-05-05T20:03:50Z 2020-05-05T23:49:18Z   MEMBER  

Refs #17.

dogsheep-photos 256834907 issue  
612860531 MDU6SXNzdWU2MTI4NjA1MzE= 17 Only install osxphotos if running on macOS simonw 9599 closed 0     3 2020-05-05T20:03:26Z 2020-05-05T20:20:05Z 2020-05-05T20:11:23Z MEMBER  

The build is broken right now because you can't pip install osxphotos on Ubuntu.
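
A sketch of the standard fix, a PEP 508 environment marker in setup.py, so pip only pulls in osxphotos on macOS:

from setuptools import setup

setup(
    name="dogsheep-photos",
    install_requires=[
        "sqlite-utils",
        # the marker restricts osxphotos to macOS, so the Ubuntu
        # build no longer tries (and fails) to install it
        'osxphotos ; sys_platform == "darwin"',
    ],
)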

dogsheep-photos 256834907 issue  
612658444 MDU6SXNzdWU2MTI2NTg0NDQ= 109 table.create_index(..., ignore=True) simonw 9599 closed 0     1 2020-05-05T14:44:21Z 2020-05-05T14:46:53Z 2020-05-05T14:46:53Z OWNER  

Option to silently do nothing if the index already exists.
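
A sketch of the proposed behaviour (ignore is the keyword argument this issue asks for):

import sqlite_utils

db = sqlite_utils.Database("data.db")
db["dogs"].insert({"name": "Cleo"})
db["dogs"].create_index(["name"])
# would normally raise because the index already exists; with
# ignore=True it silently does nothing
db["dogs"].create_index(["name"], ignore=True)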

sqlite-utils 140912432 issue  
612287234 MDU6SXNzdWU2MTIyODcyMzQ= 16 Import machine-learning detected labels (dog, llama etc) from Apple Photos simonw 9599 open 0     13 2020-05-05T02:45:43Z 2020-05-05T05:38:16Z   MEMBER  

Follow-on from #1. Apple Photos runs some very sophisticated machine learning on-device to figure out if photos are of dogs, llamas and so on. I really want to extract those labels out into my own database.

dogsheep-photos 256834907 issue  
612151767 MDU6SXNzdWU2MTIxNTE3Njc= 15 Expose scores from ZCOMPUTEDASSETATTRIBUTES simonw 9599 closed 0     4 2020-05-04T20:36:07Z 2020-05-05T00:11:45Z 2020-05-05T00:11:45Z MEMBER  

The Apple Photos database has a ZCOMPUTEDASSETATTRIBUTES table that looks absurdly interesting... it has calculated scores for every photo.
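
A sketch for exploring that table's columns without guessing their names (photos.db is assumed to be a copy of the Photos library database):

import sqlite3

conn = sqlite3.connect("photos.db")
# pragma_table_info() lists every column and its declared type
for cid, name, col_type, *_ in conn.execute(
    "select * from pragma_table_info('ZCOMPUTEDASSETATTRIBUTES')"
):
    print(cid, name, col_type)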

dogsheep-photos 256834907 issue  
612089949 MDU6SXNzdWU2MTIwODk5NDk= 756 Add pipx to installation documentation simonw 9599 closed 0     2 2020-05-04T18:49:01Z 2020-05-04T19:19:06Z 2020-05-04T19:10:33Z OWNER  

Add to this page: https://datasette.readthedocs.io/en/stable/installation.html

Here's how to install plugins: https://twitter.com/simonw/status/1257348687979778050

$ datasette plugins
[]

$ pipx inject datasette datasette-json-html            
  injected package datasette-json-html into venv datasette
done! ✨ 🌟 ✨

$ datasette plugins
[
    {
        "name": "datasette-json-html",
        "static": false,
        "templates": false,
        "version": "0.6"
    }
]
datasette 107914493 issue  
612082842 MDU6SXNzdWU2MTIwODI4NDI= 755 Fix "no such column: id" output in tests simonw 9599 closed 0     1 2020-05-04T18:37:49Z 2020-05-04T18:42:14Z 2020-05-04T18:42:14Z OWNER  
pytest
...
tests/test_custom_pages.py ........                                                                                                                                            [ 33%]
tests/test_database.py ......no such column: id
...                                                                                                                                               [ 35%]
datasette 107914493 issue  

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [pull_request] TEXT,
   [body] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
, [active_lock_reason] TEXT);
CREATE INDEX [idx_issues_repo]
                ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
                ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
                ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
                ON [issues] ([user]);