5,263 rows sorted by updated_at descending

id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions issue performed_via_github_app
786925280 https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-786925280 https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5 MDEyOklzc3VlQ29tbWVudDc4NjkyNTI4MA== simonw 9599 2021-02-26T22:23:10Z 2021-02-26T22:23:10Z MEMBER

Thanks!

I requested my Gmail export from takeout - once that arrives I'll test it against this and then merge the PR.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
WIP: Add Gmail takeout mbox import 813880401  
786849095 https://github.com/simonw/datasette/issues/1238#issuecomment-786849095 https://api.github.com/repos/simonw/datasette/issues/1238 MDEyOklzc3VlQ29tbWVudDc4Njg0OTA5NQ== simonw 9599 2021-02-26T19:29:38Z 2021-02-26T19:29:38Z OWNER

Here's the test I wrote:

git diff tests/test_custom_pages.py
diff --git a/tests/test_custom_pages.py b/tests/test_custom_pages.py
index 6a23192..5a71f56 100644
--- a/tests/test_custom_pages.py
+++ b/tests/test_custom_pages.py
@@ -2,11 +2,19 @@ import pathlib
 import pytest
 from .fixtures import make_app_client

+TEST_TEMPLATE_DIRS = str(pathlib.Path(__file__).parent / "test_templates")
+

 @pytest.fixture(scope="session")
 def custom_pages_client():
+    with make_app_client(template_dir=TEST_TEMPLATE_DIRS) as client:
+        yield client
+
+
+@pytest.fixture(scope="session")
+def custom_pages_client_with_base_url():
     with make_app_client(
-        template_dir=str(pathlib.Path(__file__).parent / "test_templates")
+        template_dir=TEST_TEMPLATE_DIRS, config={"base_url": "/prefix/"}
     ) as client:
         yield client

@@ -23,6 +31,12 @@ def test_request_is_available(custom_pages_client):
     assert "path:/request" == response.text


+def test_custom_pages_with_base_url(custom_pages_client_with_base_url):
+    response = custom_pages_client_with_base_url.get("/prefix/request")
+    assert 200 == response.status
+    assert "path:/prefix/request" == response.text
+
+
 def test_custom_pages_nested(custom_pages_client):
     response = custom_pages_client.get("/nested/nest")
     assert 200 == response.status
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Custom pages don't work with base_url setting 813899472  
786848654 https://github.com/simonw/datasette/issues/1238#issuecomment-786848654 https://api.github.com/repos/simonw/datasette/issues/1238 MDEyOklzc3VlQ29tbWVudDc4Njg0ODY1NA== simonw 9599 2021-02-26T19:28:48Z 2021-02-26T19:28:48Z OWNER

I added a debug line just before the for regex, wildcard_template loop here:

https://github.com/simonw/datasette/blob/afed51b1e36cf275c39e71c7cb262d6c5bdbaa31/datasette/app.py#L1148-L1155

And it showed that for some reason request.path is /prefix/prefix/request here - the prefix got doubled somehow.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Custom pages don't work with base_url setting 813899472  
786841261 https://github.com/simonw/datasette/issues/1238#issuecomment-786841261 https://api.github.com/repos/simonw/datasette/issues/1238 MDEyOklzc3VlQ29tbWVudDc4Njg0MTI2MQ== simonw 9599 2021-02-26T19:13:44Z 2021-02-26T19:13:44Z OWNER

Sounds like a bug - thanks for reporting this.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Custom pages don't work with base_url setting 813899472  
786840734 https://github.com/simonw/datasette/issues/1246#issuecomment-786840734 https://api.github.com/repos/simonw/datasette/issues/1246 MDEyOklzc3VlQ29tbWVudDc4Njg0MDczNA== simonw 9599 2021-02-26T19:12:39Z 2021-02-26T19:12:47Z OWNER

Could I take this part:

             suggested_facet_sql = """ 
                 select distinct json_type({column}) 
                 from ({sql}) 
             """.format( 
                 column=escape_sqlite(column), sql=self.sql 
             ) 

And add where {column} is not null and {column} != '' perhaps?
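A minimal sketch of the amended query, using a simplified stand-in for Datasette's escape_sqlite helper and an in-memory table; the added where clause filters out null and blank values before json_type runs:

```python
import sqlite3

def escape_sqlite(name):
    # simplified stand-in for datasette.utils.escape_sqlite
    return "[{}]".format(name)

sql = "select * from facetable"
column = "tags"
suggested_facet_sql = """
    select distinct json_type({column})
    from ({sql})
    where {column} is not null and {column} != ''
""".format(
    column=escape_sqlite(column), sql=sql
)

db = sqlite3.connect(":memory:")
db.execute("create table facetable (tags text)")
db.executemany(
    "insert into facetable values (?)", [('["a", "b"]',), (None,), ("",)]
)
# only the genuine JSON value survives the filter
types = [r[0] for r in db.execute(suggested_facet_sql)]
```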

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Suggest for ArrayFacet possibly confused by blank values 817597268  
786840425 https://github.com/simonw/datasette/issues/1246#issuecomment-786840425 https://api.github.com/repos/simonw/datasette/issues/1246 MDEyOklzc3VlQ29tbWVudDc4Njg0MDQyNQ== simonw 9599 2021-02-26T19:11:56Z 2021-02-26T19:11:56Z OWNER
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Suggest for ArrayFacet possibly confused by blank values 817597268  
786830832 https://github.com/simonw/sqlite-utils/issues/239#issuecomment-786830832 https://api.github.com/repos/simonw/sqlite-utils/issues/239 MDEyOklzc3VlQ29tbWVudDc4NjgzMDgzMg== simonw 9599 2021-02-26T18:52:40Z 2021-02-26T18:52:40Z OWNER

Could this handle lists of objects too? That would be pretty amazing - if the column has a [{...}, {...}] list in it, this could turn that into a many-to-many.
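This isn't implemented yet; here's a hypothetical sketch of the behavior using plain sqlite3, with invented table names (people, reports_people) standing in for whatever the extraction would actually generate:

```python
import json
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("create table reports (id integer primary key, reported_by text)")
db.execute(
    "insert into reports values (1, ?)",
    (json.dumps([{"name": "Alice"}, {"name": "Bob"}]),),
)

# hypothetical output schema: a lookup table plus a junction table
db.execute("create table people (id integer primary key, name text unique)")
db.execute("create table reports_people (report_id integer, person_id integer)")

for report_id, blob in db.execute("select id, reported_by from reports").fetchall():
    for obj in json.loads(blob):
        # each object in the list becomes one lookup row...
        db.execute("insert or ignore into people (name) values (?)", (obj["name"],))
        person_id = db.execute(
            "select id from people where name = ?", (obj["name"],)
        ).fetchone()[0]
        # ...plus one junction row tying it back to the original record
        db.execute("insert into reports_people values (?, ?)", (report_id, person_id))
```

The junction table is what makes it a many-to-many: the same person row can be linked to any number of reports.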

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
sqlite-utils extract could handle nested objects 816526538  
786813506 https://github.com/simonw/datasette/issues/1240#issuecomment-786813506 https://api.github.com/repos/simonw/datasette/issues/1240 MDEyOklzc3VlQ29tbWVudDc4NjgxMzUwNg== simonw 9599 2021-02-26T18:19:46Z 2021-02-26T18:19:46Z OWNER

Linking to rows from custom queries is a lot harder - because given an arbitrary string of SQL it's difficult to analyze it and figure out which (if any) of the returned columns represent a primary key.

It's possible to manually write a SQL query that returns a column that will be treated as a link to another page using this plugin, but it's not particularly straightforward: https://datasette.io/plugins/datasette-json-html

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Allow facetting on custom queries 814591962  
786812716 https://github.com/simonw/datasette/issues/1240#issuecomment-786812716 https://api.github.com/repos/simonw/datasette/issues/1240 MDEyOklzc3VlQ29tbWVudDc4NjgxMjcxNg== simonw 9599 2021-02-26T18:18:18Z 2021-02-26T18:18:18Z OWNER

Agreed, this would be extremely useful. I'd love to be able to facet against custom queries. It's a fair bit of work to implement but it's not impossible. Closing this as a duplicate of #972.

{
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 1,
    "rocket": 0,
    "eyes": 0
}
Allow facetting on custom queries 814591962  
786795132 https://github.com/simonw/sqlite-utils/issues/239#issuecomment-786795132 https://api.github.com/repos/simonw/sqlite-utils/issues/239 MDEyOklzc3VlQ29tbWVudDc4Njc5NTEzMg== simonw 9599 2021-02-26T17:45:53Z 2021-02-26T17:45:53Z OWNER

If there's no primary key in the JSON could use the hash_id mechanism.
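The hash_id mechanism derives a primary key from the row's own content. An illustrative stand-in for the idea (not necessarily the library's exact serialization or digest choice):

```python
import hashlib
import json

def hash_id(record):
    # stable hash of the record's content; sort_keys makes key order irrelevant
    return hashlib.md5(
        json.dumps(record, sort_keys=True).encode("utf-8")
    ).hexdigest()
```

Two records with the same content always get the same id, so re-importing the same JSON doesn't create duplicate lookup rows.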

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
sqlite-utils extract could handle nested objects 816526538  
786794435 https://github.com/simonw/sqlite-utils/issues/239#issuecomment-786794435 https://api.github.com/repos/simonw/sqlite-utils/issues/239 MDEyOklzc3VlQ29tbWVudDc4Njc5NDQzNQ== simonw 9599 2021-02-26T17:44:38Z 2021-02-26T17:44:38Z OWNER

This came up in office hours!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
sqlite-utils extract could handle nested objects 816526538  
786786645 https://github.com/simonw/datasette/issues/1244#issuecomment-786786645 https://api.github.com/repos/simonw/datasette/issues/1244 MDEyOklzc3VlQ29tbWVudDc4Njc4NjY0NQ== simonw 9599 2021-02-26T17:30:38Z 2021-02-26T17:30:38Z OWNER

New paragraph at the top of https://docs.datasette.io/en/latest/writing_plugins.html

Want to start by looking at an example? The Datasette plugins directory lists more than 50 open source plugins with code you can explore. The plugin hooks page includes links to example plugins for each of the documented hooks.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Plugin tip: look at the examples linked from the hooks page 817528452  
786050562 https://github.com/simonw/sqlite-utils/issues/237#issuecomment-786050562 https://api.github.com/repos/simonw/sqlite-utils/issues/237 MDEyOklzc3VlQ29tbWVudDc4NjA1MDU2Mg== simonw 9599 2021-02-25T16:57:56Z 2021-02-25T16:57:56Z OWNER

sqlite-utils create-view currently has a --ignore option, so adding that to sqlite-utils drop-view and sqlite-utils drop-table makes sense as well.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
db["my_table"].drop(ignore=True) parameter, plus sqlite-utils drop-table --ignore and drop-view --ignore 815554385  
786049686 https://github.com/simonw/sqlite-utils/issues/237#issuecomment-786049686 https://api.github.com/repos/simonw/sqlite-utils/issues/237 MDEyOklzc3VlQ29tbWVudDc4NjA0OTY4Ng== simonw 9599 2021-02-25T16:56:42Z 2021-02-25T16:56:42Z OWNER

So:

    db["my_table"].drop(ignore=True)
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
db["my_table"].drop(ignore=True) parameter, plus sqlite-utils drop-table --ignore and drop-view --ignore 815554385  
786049394 https://github.com/simonw/sqlite-utils/issues/237#issuecomment-786049394 https://api.github.com/repos/simonw/sqlite-utils/issues/237 MDEyOklzc3VlQ29tbWVudDc4NjA0OTM5NA== simonw 9599 2021-02-25T16:56:14Z 2021-02-25T16:56:14Z OWNER

Other methods (db.create_view() for example) have ignore=True to mean "don't throw an error if this causes a problem", so I'm good with adding that to .drop_view().

I don't like using it as the default partly because that would be a very minor breaking API change, but mainly because I don't want to hide mistakes people make - e.g. if you mistype the name of the table you are trying to drop.
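A sketch of what .drop(ignore=True) could look like, assuming the table object wraps a sqlite3 connection; the error is swallowed only when the caller explicitly opts in:

```python
import sqlite3

class Table:
    def __init__(self, db, name):
        self.db = db
        self.name = name

    def drop(self, ignore=False):
        try:
            self.db.execute("drop table [{}]".format(self.name))
        except sqlite3.OperationalError:
            # "no such table" - re-raise unless the caller opted out
            if not ignore:
                raise

db = sqlite3.connect(":memory:")
Table(db, "missing").drop(ignore=True)  # no error raised
```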

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
db["my_table"].drop(ignore=True) parameter, plus sqlite-utils drop-table --ignore and drop-view --ignore 815554385  
786037219 https://github.com/simonw/sqlite-utils/issues/240#issuecomment-786037219 https://api.github.com/repos/simonw/sqlite-utils/issues/240 MDEyOklzc3VlQ29tbWVudDc4NjAzNzIxOQ== simonw 9599 2021-02-25T16:39:23Z 2021-02-25T16:39:23Z OWNER

Example from the docs:

>>> db = sqlite_utils.Database(memory=True)
>>> db["dogs"].insert({"name": "Cleo"})
>>> for pk, row in db["dogs"].pks_and_rows_where():
...     print(pk, row)
1 {'rowid': 1, 'name': 'Cleo'}

>>> db["dogs_with_pk"].insert({"id": 5, "name": "Cleo"}, pk="id")
>>> for pk, row in db["dogs_with_pk"].pks_and_rows_where():
...     print(pk, row)
5 {'id': 5, 'name': 'Cleo'}

>>> db["dogs_with_compound_pk"].insert(
...     {"species": "dog", "id": 3, "name": "Cleo"},
...     pk=("species", "id")
... )
>>> for pk, row in db["dogs_with_compound_pk"].pks_and_rows_where():
...     print(pk, row)
('dog', 3) {'species': 'dog', 'id': 3, 'name': 'Cleo'}
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.pks_and_rows_where() method returning primary keys along with the rows 816560819  
786036355 https://github.com/simonw/sqlite-utils/issues/240#issuecomment-786036355 https://api.github.com/repos/simonw/sqlite-utils/issues/240 MDEyOklzc3VlQ29tbWVudDc4NjAzNjM1NQ== simonw 9599 2021-02-25T16:38:07Z 2021-02-25T16:38:07Z OWNER
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.pks_and_rows_where() method returning primary keys along with the rows 816560819  
786035142 https://github.com/simonw/sqlite-utils/issues/239#issuecomment-786035142 https://api.github.com/repos/simonw/sqlite-utils/issues/239 MDEyOklzc3VlQ29tbWVudDc4NjAzNTE0Mg== simonw 9599 2021-02-25T16:36:17Z 2021-02-25T16:36:17Z OWNER

WIP in a pull request.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
sqlite-utils extract could handle nested objects 816526538  
786016380 https://github.com/simonw/sqlite-utils/issues/240#issuecomment-786016380 https://api.github.com/repos/simonw/sqlite-utils/issues/240 MDEyOklzc3VlQ29tbWVudDc4NjAxNjM4MA== simonw 9599 2021-02-25T16:10:01Z 2021-02-25T16:10:01Z OWNER

I prototyped this and I like it:

In [1]: import sqlite_utils
In [2]: db = sqlite_utils.Database("/Users/simon/Dropbox/Development/datasette/fixtures.db")
In [3]: list(db["compound_primary_key"].pks_and_rows_where())
Out[3]: [(('a', 'b'), {'pk1': 'a', 'pk2': 'b', 'content': 'c'})]
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.pks_and_rows_where() method returning primary keys along with the rows 816560819  
786007209 https://github.com/simonw/sqlite-utils/issues/240#issuecomment-786007209 https://api.github.com/repos/simonw/sqlite-utils/issues/240 MDEyOklzc3VlQ29tbWVudDc4NjAwNzIwOQ== simonw 9599 2021-02-25T15:57:50Z 2021-02-25T15:57:50Z OWNER

table.pks_and_rows_where(...) is explicit and I think less ambiguous than the other options.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.pks_and_rows_where() method returning primary keys along with the rows 816560819  
786006794 https://github.com/simonw/sqlite-utils/issues/240#issuecomment-786006794 https://api.github.com/repos/simonw/sqlite-utils/issues/240 MDEyOklzc3VlQ29tbWVudDc4NjAwNjc5NA== simonw 9599 2021-02-25T15:57:17Z 2021-02-25T15:57:28Z OWNER

I quite like pks_with_rows_where(...) - but grammatically it suggests it will return the primary keys that exist where their rows match the criteria - "pks with rows" can be interpreted as "pks for the rows that..." as opposed to "pks accompanied by rows"

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.pks_and_rows_where() method returning primary keys along with the rows 816560819  
786005078 https://github.com/simonw/sqlite-utils/issues/240#issuecomment-786005078 https://api.github.com/repos/simonw/sqlite-utils/issues/240 MDEyOklzc3VlQ29tbWVudDc4NjAwNTA3OA== simonw 9599 2021-02-25T15:54:59Z 2021-02-25T15:56:16Z OWNER

Is pk_rows_where() a good name? It sounds like it returns "primary key rows" which isn't a thing. It actually returns rows along with their primary key.

Other options:

  • table.rows_with_pk_where(...) - should this return (row, pk) rather than (pk, row)?
  • table.rows_where_pk(...)
  • table.pk_and_rows_where(...)
  • table.pk_with_rows_where(...)
  • table.pks_with_rows_where(...) - because rows is pluralized, so pks should be pluralized too?
  • table.pks_rows_where(...)
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.pks_and_rows_where() method returning primary keys along with the rows 816560819  
786001768 https://github.com/simonw/sqlite-utils/issues/240#issuecomment-786001768 https://api.github.com/repos/simonw/sqlite-utils/issues/240 MDEyOklzc3VlQ29tbWVudDc4NjAwMTc2OA== simonw 9599 2021-02-25T15:50:28Z 2021-02-25T15:52:12Z OWNER

One option: .rows_where() could grow an ensure_pk=True option which checks to see if the table is a rowid table and, if it is, includes that in the select.

Or... how about you can call .rows_where(..., pks=True) and it will yield (pk, rowdict) tuple pairs instead of just returning the sequence of dictionaries?

I'm always a little bit nervous of methods that vary their return type based on their arguments. Maybe this would be a separate method instead?

    for pk, row in table.pk_rows_where(...):
        # ...
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.pks_and_rows_where() method returning primary keys along with the rows 816560819  
785992158 https://github.com/simonw/sqlite-utils/issues/239#issuecomment-785992158 https://api.github.com/repos/simonw/sqlite-utils/issues/239 MDEyOklzc3VlQ29tbWVudDc4NTk5MjE1OA== simonw 9599 2021-02-25T15:37:04Z 2021-02-25T15:37:04Z OWNER

Here's the current implementation of .extract(): https://github.com/simonw/sqlite-utils/blob/806c21044ac8d31da35f4c90600e98115aade7c6/sqlite_utils/db.py#L1049-L1074

Tricky detail here: I create the lookup table first, based on the types of the columns that are being extracted.

I need to do this because extraction currently uses unique tuples of values, so the table has to be created in advance.

But if I'm using these new expand functions to figure out what's going to be extracted, I don't know the names of the columns and their types in advance. I'm only going to find those out during the transformation.

This may turn out to be incompatible with how .extract() works at the moment. I may need a new method, .extract_expand() perhaps? It could be simpler - work only against a single column for example.

I can still use the existing sqlite-utils extract CLI command though, with a --json flag and a rule that you can't run it against multiple columns.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
sqlite-utils extract could handle nested objects 816526538  
785983837 https://github.com/simonw/sqlite-utils/issues/239#issuecomment-785983837 https://api.github.com/repos/simonw/sqlite-utils/issues/239 MDEyOklzc3VlQ29tbWVudDc4NTk4MzgzNw== simonw 9599 2021-02-25T15:25:21Z 2021-02-25T15:28:57Z OWNER

Problem with calling this argument transform= is that the term "transform" already means something else in this library.

I could use convert= instead.

... but that doesn't instantly make me think of turning a value into multiple columns.

How about expand=? I've not used that term anywhere yet.

db["Reports"].extract(["Reported by"], expand={"Reported by": json.loads})

I think that works. You're expanding a single value into several columns of information.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
sqlite-utils extract could handle nested objects 816526538  
785983070 https://github.com/simonw/sqlite-utils/issues/239#issuecomment-785983070 https://api.github.com/repos/simonw/sqlite-utils/issues/239 MDEyOklzc3VlQ29tbWVudDc4NTk4MzA3MA== simonw 9599 2021-02-25T15:24:17Z 2021-02-25T15:24:17Z OWNER

I'm going to go with last-wins - so if multiple transform functions return the same key the last one will over-write the others.
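A sketch of the last-wins merge, assuming transforms maps column names to functions that each return a dictionary of new columns:

```python
import json

def merge_transformed(row, transforms):
    merged = {}
    # dicts iterate in insertion order, so keys from later
    # columns overwrite keys produced by earlier ones
    for column, fn in transforms.items():
        merged.update(fn(row[column]))
    return merged

row = {"a": '{"name": "first"}', "b": '{"name": "second"}'}
result = merge_transformed(row, {"a": json.loads, "b": json.loads})
# the "name" key from column "b" wins
```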

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
sqlite-utils extract could handle nested objects 816526538  
785980813 https://github.com/simonw/sqlite-utils/issues/239#issuecomment-785980813 https://api.github.com/repos/simonw/sqlite-utils/issues/239 MDEyOklzc3VlQ29tbWVudDc4NTk4MDgxMw== simonw 9599 2021-02-25T15:21:02Z 2021-02-25T15:23:47Z OWNER

Maybe the Python version takes an optional dictionary mapping column names to transformation functions? It could then merge all of those results together - and maybe throw an error if the same key is produced by more than one column.

    db["Reports"].extract(["Reported by"], transform={"Reported by": json.loads})

Or it could have an option for different strategies if keys collide: first wins, last wins, throw exception, add a prefix to the new column name. That feels a bit too complex for an edge-case though.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
sqlite-utils extract could handle nested objects 816526538  
785980083 https://github.com/simonw/sqlite-utils/issues/239#issuecomment-785980083 https://api.github.com/repos/simonw/sqlite-utils/issues/239 MDEyOklzc3VlQ29tbWVudDc4NTk4MDA4Mw== simonw 9599 2021-02-25T15:20:02Z 2021-02-25T15:20:02Z OWNER

It would be OK if the CLI version only allows you to specify a single column if you are using the --json option.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
sqlite-utils extract could handle nested objects 816526538  
785979769 https://github.com/simonw/sqlite-utils/issues/239#issuecomment-785979769 https://api.github.com/repos/simonw/sqlite-utils/issues/239 MDEyOklzc3VlQ29tbWVudDc4NTk3OTc2OQ== simonw 9599 2021-02-25T15:19:37Z 2021-02-25T15:19:37Z OWNER

For the Python version I'd like to be able to provide a transformation callback function - which can be json.loads but could also be anything else which accepts the value of the current column and returns a Python dictionary of columns and their values to use in the new table.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
sqlite-utils extract could handle nested objects 816526538  
785979192 https://github.com/simonw/sqlite-utils/issues/239#issuecomment-785979192 https://api.github.com/repos/simonw/sqlite-utils/issues/239 MDEyOklzc3VlQ29tbWVudDc4NTk3OTE5Mg== simonw 9599 2021-02-25T15:18:46Z 2021-02-25T15:18:46Z OWNER

Likewise the sqlite-utils extract command takes one or more columns:

Usage: sqlite-utils extract [OPTIONS] PATH TABLE COLUMNS...

  Extract one or more columns into a separate table

Options:
  --table TEXT             Name of the other table to extract columns to
  --fk-column TEXT         Name of the foreign key column to add to the table
  --rename <TEXT TEXT>...  Rename this column in extracted table
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
sqlite-utils extract could handle nested objects 816526538  
785978689 https://github.com/simonw/sqlite-utils/issues/239#issuecomment-785978689 https://api.github.com/repos/simonw/sqlite-utils/issues/239 MDEyOklzc3VlQ29tbWVudDc4NTk3ODY4OQ== simonw 9599 2021-02-25T15:18:03Z 2021-02-25T15:18:03Z OWNER

The Python .extract() method currently starts like this:

    def extract(self, columns, table=None, fk_column=None, rename=None):
        rename = rename or {}
        if isinstance(columns, str):
            columns = [columns]
        if not set(columns).issubset(self.columns_dict.keys()):
            raise InvalidColumns(
                "Invalid columns {} for table with columns {}".format(
                    columns, list(self.columns_dict.keys())
                )
            )
        ...

Note that it takes a list of columns (and treats a string as a single item list). That's because it can be called with a list of columns and it will use them to populate another table of unique tuples of those column values.

So a new mechanism that can instead read JSON values from a single column needs to be compatible with that existing design.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
sqlite-utils extract could handle nested objects 816526538  
785972074 https://github.com/simonw/sqlite-utils/issues/238#issuecomment-785972074 https://api.github.com/repos/simonw/sqlite-utils/issues/238 MDEyOklzc3VlQ29tbWVudDc4NTk3MjA3NA== simonw 9599 2021-02-25T15:08:36Z 2021-02-25T15:08:36Z OWNER
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
.add_foreign_key() corrupts database if column contains a space 816523763  
785485597 https://github.com/simonw/datasette/pull/1243#issuecomment-785485597 https://api.github.com/repos/simonw/datasette/issues/1243 MDEyOklzc3VlQ29tbWVudDc4NTQ4NTU5Nw== codecov[bot] 22429695 2021-02-25T00:28:30Z 2021-02-25T00:28:30Z NONE

Codecov Report

Merging #1243 (887bfd2) into main (726f781) will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##             main    #1243   +/-   ##
=======================================
  Coverage   91.56%   91.56%           
=======================================
  Files          34       34           
  Lines        4242     4242           
=======================================
  Hits         3884     3884           
  Misses        358      358           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 726f781...32652d9. Read the comment docs.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
fix small typo 815955014  
784638394 https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-784638394 https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5 MDEyOklzc3VlQ29tbWVudDc4NDYzODM5NA== UtahDave 306240 2021-02-24T00:36:18Z 2021-02-24T00:36:18Z NONE

I noticed that @simonw is using black for formatting. I ran black on my additions in this PR.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
WIP: Add Gmail takeout mbox import 813880401  
784567547 https://github.com/simonw/datasette/issues/1241#issuecomment-784567547 https://api.github.com/repos/simonw/datasette/issues/1241 MDEyOklzc3VlQ29tbWVudDc4NDU2NzU0Nw== simonw 9599 2021-02-23T22:45:56Z 2021-02-23T22:46:12Z OWNER

I really like the way the Share feature on Stack Overflow works: https://stackoverflow.com/questions/18934149/how-can-i-use-postgresqls-text-column-type-in-django

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
[Feature request] Button to copy URL 814595021  
784347646 https://github.com/simonw/datasette/issues/1241#issuecomment-784347646 https://api.github.com/repos/simonw/datasette/issues/1241 MDEyOklzc3VlQ29tbWVudDc4NDM0NzY0Ng== Kabouik 7107523 2021-02-23T16:55:26Z 2021-02-23T16:57:39Z NONE

I think it's possible that many users these days no longer assume they can paste a URL from the browser address bar (if they ever understood that at all) because too many apps are SPAs with broken URLs.

Absolutely, that's why I thought my corner case with iframe preventing access to the datasette URL could actually be relevant in more general situations.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
[Feature request] Button to copy URL 814595021  
784334931 https://github.com/simonw/datasette/issues/1241#issuecomment-784334931 https://api.github.com/repos/simonw/datasette/issues/1241 MDEyOklzc3VlQ29tbWVudDc4NDMzNDkzMQ== simonw 9599 2021-02-23T16:37:26Z 2021-02-23T16:37:26Z OWNER

A "Share link" button would only be needed on the table page and the arbitrary query page I think - and maybe on the row page, especially as that page starts to grow more features in the future.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
[Feature request] Button to copy URL 814595021  
784333768 https://github.com/simonw/datasette/issues/1241#issuecomment-784333768 https://api.github.com/repos/simonw/datasette/issues/1241 MDEyOklzc3VlQ29tbWVudDc4NDMzMzc2OA== simonw 9599 2021-02-23T16:35:51Z 2021-02-23T16:35:51Z OWNER

This can definitely be done with a plugin.

Adding to Datasette itself is an interesting idea. I think it's possible that many users these days no longer assume they can paste a URL from the browser address bar (if they ever understood that at all) because too many apps are SPAs with broken URLs.

The shareable URLs are actually a key feature of Datasette - so maybe they should be highlighted in the default UI?

I built a "copy to clipboard" feature for datasette-copyable and wrote up how that works here: https://til.simonwillison.net/javascript/copy-button

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
[Feature request] Button to copy URL 814595021  
784312460 https://github.com/simonw/datasette/issues/1240#issuecomment-784312460 https://api.github.com/repos/simonw/datasette/issues/1240 MDEyOklzc3VlQ29tbWVudDc4NDMxMjQ2MA== Kabouik 7107523 2021-02-23T16:07:10Z 2021-02-23T16:08:28Z NONE

Likewise, while answering another issue regarding the Vega plugin, I realized that there is no such way of linking rows after a custom query; I only get this "Link" column with individual URLs for the default SQL view:

Or is it there and I am just missing the option in my custom queries?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Allow facetting on custom queries 814591962  
784157345 https://github.com/simonw/datasette/issues/1218#issuecomment-784157345 https://api.github.com/repos/simonw/datasette/issues/1218 MDEyOklzc3VlQ29tbWVudDc4NDE1NzM0NQ== soobrosa 1244799 2021-02-23T12:12:17Z 2021-02-23T12:12:17Z NONE

Topline this fixed the same problem for me.

brew install python@3.7
ln -s /usr/local/opt/python@3.7/bin/python3.7 /usr/local/opt/python/bin/python3.7
pip3 uninstall -y numpy
pip3 uninstall -y setuptools
pip3 install setuptools
pip3 install numpy
pip3 install datasette-publish-fly
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
/usr/local/opt/python3/bin/python3.6: bad interpreter: No such file or directory 803356942  
783794520 https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-783794520 https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5 MDEyOklzc3VlQ29tbWVudDc4Mzc5NDUyMA== UtahDave 306240 2021-02-23T01:13:54Z 2021-02-23T01:13:54Z NONE

Also, @simonw I created a test based off the existing tests. I think it's working correctly

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
WIP: Add Gmail takeout mbox import 813880401  
783774084 https://github.com/simonw/datasette/issues/1239#issuecomment-783774084 https://api.github.com/repos/simonw/datasette/issues/1239 MDEyOklzc3VlQ29tbWVudDc4Mzc3NDA4NA== simonw 9599 2021-02-23T00:18:56Z 2021-02-23T00:19:18Z OWNER

Bug is here: https://github.com/simonw/datasette/blob/42caabf7e9e6e4d69ef6dd7de16f2cd96bc79d5b/datasette/filters.py#L149-L165

Those json_each lines should be:

select {t}.rowid from {t}, json_each([{t}].[{c}]) j
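The fix is to bracket-quote both the table and column names so that columns containing spaces parse correctly. A minimal sketch of the corrected pattern using Python's sqlite3 (table and column names here are made up for illustration; requires a SQLite build with JSON support, which modern Python builds include):

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE facetable ("tags with spaces" TEXT)')
conn.execute(
    "INSERT INTO facetable VALUES (?)", (json.dumps(["tag1", "tag2"]),)
)

# Unquoted json_each(facetable.tags with spaces) is a syntax error;
# bracket-quoting both the table and the column makes it valid:
sql = (
    "select [facetable].rowid from [facetable], "
    "json_each([facetable].[tags with spaces]) j "
    "where j.value = ?"
)
rows = conn.execute(sql, ("tag2",)).fetchall()
print(rows)
```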
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
JSON filter fails if column contains spaces 813978858  
783688547 https://github.com/dogsheep/google-takeout-to-sqlite/issues/4#issuecomment-783688547 https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/4 MDEyOklzc3VlQ29tbWVudDc4MzY4ODU0Nw== UtahDave 306240 2021-02-22T21:31:28Z 2021-02-22T21:31:28Z NONE

@Btibert3 I've opened a PR with my initial attempt at this. Would you be willing to give this a try?

https://github.com/dogsheep/google-takeout-to-sqlite/pull/5

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Feature Request: Gmail 778380836  
783676548 https://github.com/simonw/datasette/issues/1237#issuecomment-783676548 https://api.github.com/repos/simonw/datasette/issues/1237 MDEyOklzc3VlQ29tbWVudDc4MzY3NjU0OA== simonw 9599 2021-02-22T21:10:19Z 2021-02-22T21:10:25Z OWNER

This is another change which is a little bit hard to figure out because I haven't solved #878 yet.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
?_pretty=1 option for pretty-printing JSON output 812704869  
783674659 https://github.com/simonw/datasette/issues/1234#issuecomment-783674659 https://api.github.com/repos/simonw/datasette/issues/1234 MDEyOklzc3VlQ29tbWVudDc4MzY3NDY1OQ== simonw 9599 2021-02-22T21:06:28Z 2021-02-22T21:06:28Z OWNER

I'm not going to work on this for a while, but if anyone has needs or ideas around that they can add them to this issue.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Runtime support for ATTACHing multiple databases 811505638  
783674038 https://github.com/simonw/datasette/issues/1236#issuecomment-783674038 https://api.github.com/repos/simonw/datasette/issues/1236 MDEyOklzc3VlQ29tbWVudDc4MzY3NDAzOA== simonw 9599 2021-02-22T21:05:21Z 2021-02-22T21:05:21Z OWNER

It's good on mobile - iOS at least. Going to close this - open new issues if anyone reports bugs.

{
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 1,
    "eyes": 0
}
Ability to increase size of the SQL editor window 812228314  
783662968 https://github.com/simonw/sqlite-utils/issues/220#issuecomment-783662968 https://api.github.com/repos/simonw/sqlite-utils/issues/220 MDEyOklzc3VlQ29tbWVudDc4MzY2Mjk2OA== mhalle 649467 2021-02-22T20:44:51Z 2021-02-22T20:44:51Z NONE

Actually, coming back to this, I have a clearer use case for enabling fts generation for views: making it easier to bring in text from lookup tables and other joins.

The datasette documentation describes populating an fts table like so:

INSERT INTO "items_fts" (rowid, name, description, category_name)
    SELECT items.rowid,
    items.name,
    items.description,
    categories.name
    FROM items JOIN categories ON items.category_id=categories.id;

Alternatively, if you have fts support in sqlite_utils for views (which sqlite and fts5 support), you can do the same thing just by creating a view that captures the above joins as columns, then creating an fts table from that view. Such an fts table can be created using sqlite_utils, whereas one created with your method can't.

The resulting fts table can then be used by a whole family of related tables and views in the manner you described earlier in this issue.
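The view-then-fts pattern described above can be sketched with plain sqlite3 and FTS5 (all table, view, and column names are illustrative, and this assumes your SQLite build includes FTS5, as most Python builds do):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE items (
        id INTEGER PRIMARY KEY, name TEXT, description TEXT,
        category_id INTEGER REFERENCES categories(id)
    );
    INSERT INTO categories VALUES (1, 'museums');
    INSERT INTO items VALUES (1, 'Mystery Spot', 'A tourist attraction', 1);

    -- A view that flattens the join into plain columns
    CREATE VIEW items_with_category AS
        SELECT items.id, items.name, items.description,
               categories.name AS category_name
        FROM items JOIN categories ON items.category_id = categories.id;

    -- Populate an FTS5 table directly from the view
    CREATE VIRTUAL TABLE items_fts USING fts5(name, description, category_name);
    INSERT INTO items_fts (rowid, name, description, category_name)
        SELECT id, name, description, category_name FROM items_with_category;
    """
)
hits = conn.execute(
    "SELECT rowid FROM items_fts WHERE items_fts MATCH 'museums'"
).fetchall()
print(hits)
```

Searching for text that only exists in the joined-in category column now finds the item, which is the point of routing the join through a view.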

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Better error message for *_fts methods against views 783778672  
783560017 https://github.com/simonw/datasette/issues/1166#issuecomment-783560017 https://api.github.com/repos/simonw/datasette/issues/1166 MDEyOklzc3VlQ29tbWVudDc4MzU2MDAxNw== thorn0 94334 2021-02-22T18:00:57Z 2021-02-22T18:13:11Z NONE

Hi! I don't think Prettier supports this syntax for globs: datasette/static/*[!.min].js. Are you sure that works?
Prettier uses https://github.com/mrmlnc/fast-glob, which in turn uses https://github.com/micromatch/micromatch, and the docs for these packages don't mention this syntax. As per the docs, square brackets should work as in regexes (foo-[1-5].js).

Tested it. Apparently, it works as a negated character class in regexes (like [^.min]). I wonder where this syntax comes from. Micromatch doesn't support that:

micromatch(['static/table.js', 'static/n.js'], ['static/*[!.min].js']);
// result: ["static/n.js"] -- brackets are treated like [!.min] in regexes, without negation
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Adopt Prettier for JavaScript code formatting 777140799  
783265830 https://github.com/simonw/datasette/issues/782#issuecomment-783265830 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDc4MzI2NTgzMA== frankieroberto 30665 2021-02-22T10:21:14Z 2021-02-22T10:21:14Z NONE

@simonw:

The problem there is that ?_size=x isn't actually doing the same thing as the SQL limit keyword.

Interesting! Although I don't think it matters too much what the underlying implementation is - I more meant that limit is familiar to developers conceptually as "up to and including this number, if they exist", whereas "size" is potentially more ambiguous. However, it's probably no big deal either way.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default JSON format in preparation for Datasette 1.0 627794879  
782789598 https://github.com/simonw/datasette/issues/782#issuecomment-782789598 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDc4Mjc4OTU5OA== simonw 9599 2021-02-21T03:30:02Z 2021-02-21T03:30:02Z OWNER

Another benefit to default:object - I could include a key that shows a list of available extras. I could then use that to power an interactive API explorer.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default JSON format in preparation for Datasette 1.0 627794879  
782765665 https://github.com/simonw/datasette/issues/782#issuecomment-782765665 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDc4Mjc2NTY2NQ== simonw 9599 2021-02-20T23:34:41Z 2021-02-20T23:34:41Z OWNER

OK, I'm back to the "top level object as the default" side of things now - it's pretty much unanimous at this point, and it's certainly true that it's not a decision you'll ever regret.

{
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default JSON format in preparation for Datasette 1.0 627794879  
782756398 https://github.com/simonw/datasette/issues/782#issuecomment-782756398 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDc4Mjc1NjM5OA== simonrjones 601316 2021-02-20T22:05:48Z 2021-02-20T22:05:48Z NONE

I think it’s a good idea if the top level item of the response JSON is always an object, rather than an array, at least as the default.

I agree it is more predictable if the top level item is an object with a rows or data object that contains an array of data, which then allows for other top-level meta data.

I can see the argument for removing this and just using an array for convenience - but I think that's OK as an option (as you have now).

Rather than have lots of top-level keys you could have a "meta" object to contain non-data stuff. You could use something like "links" for API endpoint URLs (or use a standard like HAL). Which would then leave the top level a bit cleaner - if that's what you want.

Have you had much feedback from users who use the Datasette API a lot?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default JSON format in preparation for Datasette 1.0 627794879  
782748501 https://github.com/simonw/datasette/issues/782#issuecomment-782748501 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDc4Mjc0ODUwMQ== simonw 9599 2021-02-20T20:58:18Z 2021-02-20T20:58:18Z OWNER

Yet another option: support a ?_path=x option which returns a nested path from the result. So you could do this:

/github/commits.json?_path=rows - to get back a top-level array pulled from the "rows" key.
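The idea can be sketched as a small helper that follows a path into the response object - note that ?_path= here is a proposed option, not an implemented Datasette feature, and the dot-separated extension is my own assumption:

```python
def extract_path(data, path):
    """Follow a dot-separated path like 'rows' or 'meta.next_url'
    into a nested structure, returning the value found there."""
    for key in path.split("."):
        data = data[key]
    return data

response = {"rows": [{"pk": 1}, {"pk": 2}], "total": 2}
print(extract_path(response, "rows"))
```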

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default JSON format in preparation for Datasette 1.0 627794879  
782748093 https://github.com/simonw/datasette/issues/782#issuecomment-782748093 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDc4Mjc0ODA5Mw== simonw 9599 2021-02-20T20:54:52Z 2021-02-20T20:54:52Z OWNER

Have you given any thought as to whether to pretty print (format with spaces) the output or not? Can be useful for debugging/exploring in a browser or other basic tools which don’t parse the JSON. Could be default (can’t be much bigger with gzip?) or opt-in.

Adding a ?_pretty=1 option that does that is a great idea, I'm filing a ticket for it: #1237

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default JSON format in preparation for Datasette 1.0 627794879  
782747878 https://github.com/simonw/datasette/issues/782#issuecomment-782747878 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDc4Mjc0Nzg3OA== simonw 9599 2021-02-20T20:53:11Z 2021-02-20T20:53:11Z OWNER

... though thinking about this further, I could re-implement the select * from commits (but only return a max of 10 results) feature using a nested select * from (select * from commits) limit 10 query.
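A minimal sketch of that wrapping trick - the outer select caps the row count without modifying the user's inner SQL (the helper name is made up; a real implementation would also need to reject trailing semicolons and similar):

```python
import sqlite3

def limited(user_sql, limit=10):
    # Wrap the user's query in a subselect so the outer
    # limit caps the result set without touching the inner SQL.
    return "select * from ({}) limit {}".format(user_sql, int(limit))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE commits (id INTEGER)")
conn.executemany("INSERT INTO commits VALUES (?)", [(i,) for i in range(50)])
rows = conn.execute(limited("select * from commits", 10)).fetchall()
print(len(rows))
```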

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default JSON format in preparation for Datasette 1.0 627794879  
782747743 https://github.com/simonw/datasette/issues/782#issuecomment-782747743 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDc4Mjc0Nzc0Mw== simonw 9599 2021-02-20T20:52:10Z 2021-02-20T20:52:10Z OWNER

Minor suggestion: rename size query param to limit, to better reflect that it’s a maximum number of rows returned rather than a guarantee of getting that number, and also for consistency with the SQL keyword?

The problem there is that ?_size=x isn't actually doing the same thing as the SQL limit keyword. Consider this query:

https://latest-with-plugins.datasette.io/github?sql=select+*+from+commits - select * from commits

Datasette returns 1,000 results, and shows a "Custom SQL query returning more than 1,000 rows" message at the top. That's the size kicking in - I only fetch the first 1,000 results from the cursor to avoid exhausting resources. In the JSON version of that at https://latest-with-plugins.datasette.io/github.json?sql=select+*+from+commits there's a "truncated": true key to let you know what happened.

I find myself using ?_size=2 against Datasette occasionally if I know the rows being returned are really big and I don't want to load 10+MB of HTML.

This is only really a concern for arbitrary SQL queries though - for table pages such as https://latest-with-plugins.datasette.io/github/commits?_size=10 adding ?_size=10 actually puts a limit 10 on the underlying SQL query.
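The fetch-and-truncate behaviour described above can be sketched like this - pull one row more than the cap off the cursor, and use the presence of that extra row to set the truncated flag (function name and cap are illustrative):

```python
import sqlite3

def fetch_truncated(conn, sql, max_rows=1000):
    # Fetch at most max_rows + 1 rows; the extra row tells us
    # whether the result set was cut short, without running count(*).
    rows = conn.execute(sql).fetchmany(max_rows + 1)
    truncated = len(rows) > max_rows
    return rows[:max_rows], truncated

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE commits (id INTEGER)")
conn.executemany("INSERT INTO commits VALUES (?)", [(i,) for i in range(1500)])
rows, truncated = fetch_truncated(conn, "select * from commits")
print(len(rows), truncated)
```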

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default JSON format in preparation for Datasette 1.0 627794879  
782747164 https://github.com/simonw/datasette/issues/782#issuecomment-782747164 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDc4Mjc0NzE2NA== simonw 9599 2021-02-20T20:47:16Z 2021-02-20T20:47:16Z OWNER

(I started a thread on Twitter about this: https://twitter.com/simonw/status/1363220355318358016)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default JSON format in preparation for Datasette 1.0 627794879  
782746755 https://github.com/simonw/datasette/issues/782#issuecomment-782746755 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDc4Mjc0Njc1NQ== frankieroberto 30665 2021-02-20T20:44:05Z 2021-02-20T20:44:05Z NONE

Minor suggestion: rename size query param to limit, to better reflect that it’s a maximum number of rows returned rather than a guarantee of getting that number, and also for consistency with the SQL keyword?

I like the idea of specifying a limit of 0 if you don’t want any rows data - and returning an empty array under the rows key seems fine.

Have you given any thought as to whether to pretty print (format with spaces) the output or not? Can be useful for debugging/exploring in a browser or other basic tools which don’t parse the JSON. Could be default (can’t be much bigger with gzip?) or opt-in.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default JSON format in preparation for Datasette 1.0 627794879  
782746633 https://github.com/simonw/datasette/issues/782#issuecomment-782746633 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDc4Mjc0NjYzMw== simonw 9599 2021-02-20T20:43:07Z 2021-02-20T20:43:07Z OWNER

Another option: .json always returns an object with a list of keys that gets increased through adding ?_extra= parameters.

.jsona always returns a JSON array of objects

I had something similar to this in Datasette a few years ago - a .jsono extension, which still redirects to the shape=array version.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default JSON format in preparation for Datasette 1.0 627794879  
782745199 https://github.com/simonw/datasette/issues/782#issuecomment-782745199 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDc4Mjc0NTE5OQ== frankieroberto 30665 2021-02-20T20:32:03Z 2021-02-20T20:32:03Z NONE

I think it’s a good idea if the top level item of the response JSON is always an object, rather than an array, at least as the default. Mainly because it allows you to add extra keys in a backwards-compatible way. Also just seems more expected somehow.

The API design guidance for the UK government also recommends this: https://www.gov.uk/guidance/gds-api-technical-and-data-standards#use-json

I also strongly dislike having versioned APIs (e.g. with a /v1/ path prefix), as it invariably means that old versions stop working at some point, even though the bit of the API you're using might not have changed at all.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default JSON format in preparation for Datasette 1.0 627794879  
782742233 https://github.com/simonw/datasette/issues/782#issuecomment-782742233 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDc4Mjc0MjIzMw== simonw 9599 2021-02-20T20:09:16Z 2021-02-20T20:09:16Z OWNER

I just noticed that https://latest-with-plugins.datasette.io/github/commits.json-preview?_extra=total&_size=0&_trace=1 executes 35 SQL queries at the moment! A great reminder that a big improvement from this change will be a reduction in queries through not calculating things like suggested facets unless they are explicitly requested.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default JSON format in preparation for Datasette 1.0 627794879  
782741719 https://github.com/simonw/datasette/issues/782#issuecomment-782741719 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDc4Mjc0MTcxOQ== simonw 9599 2021-02-20T20:05:04Z 2021-02-20T20:05:04Z OWNER

The only advantage of headers is that you don’t need to do .rows, but that’s actually good as a data validation step anyway—if .rows is missing assume there’s an error and do your error handling path instead of parsing the rest.

This is something I've not thought very hard about. If there's an error, I need to return a top-level object, not a top-level array, so I can provide details of the error.

But this means that client code will have to handle this difference - it will have to know that the returned data can be array-shaped if nothing went wrong, and object-shaped if there's an error.

The HTTP status code helps here - calling client code can know that a 200 status code means there will be an array, but an error status code means an object.

If developers really hate that the shape could be different, they can always use ?_extra=next to ensure that the top level item is an object whether or not an error occurred. So I think this is OK.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default JSON format in preparation for Datasette 1.0 627794879  
782741107 https://github.com/simonw/datasette/issues/782#issuecomment-782741107 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDc4Mjc0MTEwNw== simonw 9599 2021-02-20T20:00:22Z 2021-02-20T20:00:22Z OWNER

A really exciting opportunity this opens up is for parallel execution - the facets() and suggested_facets() and total() async functions could be called in parallel, which could speed things up if I'm confident the SQLite thread pool can execute on multiple CPU cores (it should be able to because the Python sqlite3 module releases the GIL while it's executing C code).
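The parallel idea can be sketched with asyncio.gather over the per-extra coroutines - the function bodies here are stand-ins for the real SQL work:

```python
import asyncio

async def facets():
    return {"state": {"CA": 3}}

async def suggested_facets():
    return ["city"]

async def total():
    return 4

async def gather_extras():
    # Run the independent extras concurrently instead of one after another
    f, s, t = await asyncio.gather(facets(), suggested_facets(), total())
    return {"facets": f, "suggested_facets": s, "total": t}

result = asyncio.run(gather_extras())
print(result)
```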

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default JSON format in preparation for Datasette 1.0 627794879  
782740985 https://github.com/simonw/datasette/issues/782#issuecomment-782740985 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDc4Mjc0MDk4NQ== simonw 9599 2021-02-20T19:59:21Z 2021-02-20T19:59:21Z OWNER

This design should be influenced by how it's implemented.

One implementation that could be nice is that each of the keys that can be requested - next_url, total etc - maps to an async def function which can do the work. So that expensive count(*) will only be executed by the async def total function if it is requested.

This raises more questions: both next and next_url work off the same underlying data, so if they are both requested can we re-use the work that next does somehow? Maybe by letting these functions depend on each other (so next_url() knows to first call next(), but only if it hasn't been called already).

I think I need to flesh out the full default collection of ?_extra= parameters in order to design how they will work under the hood.
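The key-maps-to-async-function idea might look something like this sketch - a registry of coroutines where only the extras the client requested are ever awaited, so the expensive count(*) is skipped unless asked for (all names are hypothetical):

```python
import asyncio

async def total():
    # Stand-in for an expensive count(*) query
    return 4

async def next_url():
    return None

EXTRAS = {"total": total, "next_url": next_url}

async def run_extras(requested):
    # Only the requested extras are awaited; unrequested
    # expensive work never runs at all.
    return {name: await EXTRAS[name]() for name in requested}

out = asyncio.run(run_extras(["total"]))
print(out)
```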

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default JSON format in preparation for Datasette 1.0 627794879  
782740604 https://github.com/simonw/datasette/issues/782#issuecomment-782740604 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDc4Mjc0MDYwNA== simonw 9599 2021-02-20T19:56:21Z 2021-02-20T19:56:33Z OWNER

I think I want to support ?_extra=next_url,total in addition to ?_extra=next_url&_extra=total - partly because it's fewer characters to type, and also because I know there exist URL handling libraries that don't know how to handle the same parameter multiple times (though they're going to break against Datasette already, so it's not a big deal).
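Accepting both forms is a small amount of parsing; a sketch with the standard library (the helper name is made up):

```python
from urllib.parse import parse_qs

def parse_extras(query_string):
    # Accept both ?_extra=a&_extra=b and ?_extra=a,b -
    # parse_qs already collects repeated parameters into a list,
    # so we just additionally split each value on commas.
    extras = []
    for value in parse_qs(query_string).get("_extra", []):
        extras.extend(v for v in value.split(",") if v)
    return extras

print(parse_extras("_extra=next_url,total"))
print(parse_extras("_extra=next_url&_extra=total"))
```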

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default JSON format in preparation for Datasette 1.0 627794879  
782740488 https://github.com/simonw/datasette/issues/782#issuecomment-782740488 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDc4Mjc0MDQ4OA== simonw 9599 2021-02-20T19:55:23Z 2021-02-20T19:55:23Z OWNER

Am I saying you won't get back a key in the response unless you explicitly request it, either by name or by specifying a bundle of extras (e.g. all or paginated)?

The "truncated": true key that tells you that your arbitrary query returned more than X results but was truncated is pretty important, do I really want people to have to opt-in to that one?

Also: having bundles like all or paginated live in the same namespace as single keys like next_url or total is a little odd - you can't tell by looking at them if they'll add a key called all or if they'll add a bunch of other stuff.

Maybe bundles could be prefixed with something, perhaps an underscore? ?_extra=_all and ?_extra=_paginated for example.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default JSON format in preparation for Datasette 1.0 627794879  
782739926 https://github.com/simonw/datasette/issues/782#issuecomment-782739926 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDc4MjczOTkyNg== simonw 9599 2021-02-20T19:51:30Z 2021-02-20T19:52:19Z OWNER
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default JSON format in preparation for Datasette 1.0 627794879  
782709425 https://github.com/simonw/datasette/issues/782#issuecomment-782709425 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDc4MjcwOTQyNQ== simonw 9599 2021-02-20T16:24:54Z 2021-02-20T16:24:54Z OWNER

Having shortcuts means I could support ?_extra=all for returning ALL possible keys.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default JSON format in preparation for Datasette 1.0 627794879  
782709270 https://github.com/simonw/datasette/issues/782#issuecomment-782709270 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDc4MjcwOTI3MA== simonw 9599 2021-02-20T16:23:51Z 2021-02-20T16:24:11Z OWNER

Also how would you opt out of returning the "rows" key? I sometimes want to do this - if I want to get back just the count or just the facets for example.

Some options:

  • /fixtures/roadside_attractions.json?_extra=total&_extra=-rows
  • /fixtures/roadside_attractions.json?_extra=total&_skip=rows
  • /fixtures/roadside_attractions.json?_extra=total&_size=0

I quite like that last one with ?_size=0. I think it would still return "rows": [] but that's OK.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default JSON format in preparation for Datasette 1.0 627794879  
782708938 https://github.com/simonw/datasette/issues/782#issuecomment-782708938 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDc4MjcwODkzOA== simonw 9599 2021-02-20T16:22:14Z 2021-02-20T16:22:14Z OWNER

I'm leaning back in the direction of a flat JSON array of objects as the default - this:

/fixtures/roadside_attractions.json

Would return:

[
    {
      "pk": 1,
      "name": "The Mystery Spot",
      "address": "465 Mystery Spot Road, Santa Cruz, CA 95065",
      "latitude": 37.0167,
      "longitude": -122.0024
    },
    {
      "pk": 2,
      "name": "Winchester Mystery House",
      "address": "525 South Winchester Boulevard, San Jose, CA 95128",
      "latitude": 37.3184,
      "longitude": -121.9511
    },
    {
      "pk": 3,
      "name": "Burlingame Museum of PEZ Memorabilia",
      "address": "214 California Drive, Burlingame, CA 94010",
      "latitude": 37.5793,
      "longitude": -122.3442
    },
    {
      "pk": 4,
      "name": "Bigfoot Discovery Museum",
      "address": "5497 Highway 9, Felton, CA 95018",
      "latitude": 37.0414,
      "longitude": -122.0725
    }
]

To get the version that includes pagination information you would use the ?_extra= parameter. For example:

/fixtures/roadside_attractions.json?_extra=total&_extra=next_url

{
  "rows": [
    {
      "pk": 1,
      "name": "The Mystery Spot",
      "address": "465 Mystery Spot Road, Santa Cruz, CA 95065",
      "latitude": 37.0167,
      "longitude": -122.0024
    },
    {
      "pk": 2,
      "name": "Winchester Mystery House",
      "address": "525 South Winchester Boulevard, San Jose, CA 95128",
      "latitude": 37.3184,
      "longitude": -121.9511
    },
    {
      "pk": 3,
      "name": "Burlingame Museum of PEZ Memorabilia",
      "address": "214 California Drive, Burlingame, CA 94010",
      "latitude": 37.5793,
      "longitude": -122.3442
    },
    {
      "pk": 4,
      "name": "Bigfoot Discovery Museum",
      "address": "5497 Highway 9, Felton, CA 95018",
      "latitude": 37.0414,
      "longitude": -122.0725
    }
  ],
  "total": 4,
  "next_url": null
}

ANY usage of the ?_extra= parameter would turn the list into an object with a "rows" key.

Opting in to the total is nice because it's actually expensive to run a count, so only doing a count if the user requests it feels good.

But... having to add ?_extra=total&_extra=next_url for the common case of wanting both the total count and the URL to get the next page of results is a bit verbose. So maybe support aliases, like ?_extra=paginated which is a shortcut for ?_extra=total&_extra=next_url?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default JSON format in preparation for Datasette 1.0 627794879  
782464306 https://github.com/simonw/datasette/issues/1236#issuecomment-782464306 https://api.github.com/repos/simonw/datasette/issues/1236 MDEyOklzc3VlQ29tbWVudDc4MjQ2NDMwNg== simonw 9599 2021-02-19T23:57:32Z 2021-02-19T23:57:32Z OWNER

Need to test this on mobile.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Ability to increase size of the SQL editor window 812228314  
782464215 https://github.com/simonw/datasette/issues/1236#issuecomment-782464215 https://api.github.com/repos/simonw/datasette/issues/1236 MDEyOklzc3VlQ29tbWVudDc4MjQ2NDIxNQ== simonw 9599 2021-02-19T23:57:13Z 2021-02-19T23:57:13Z OWNER
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Ability to increase size of the SQL editor window 812228314  
782462049 https://github.com/simonw/datasette/issues/1236#issuecomment-782462049 https://api.github.com/repos/simonw/datasette/issues/1236 MDEyOklzc3VlQ29tbWVudDc4MjQ2MjA0OQ== simonw 9599 2021-02-19T23:51:12Z 2021-02-19T23:51:12Z OWNER

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Ability to increase size of the SQL editor window 812228314  
782459550 https://github.com/simonw/datasette/issues/1236#issuecomment-782459550 https://api.github.com/repos/simonw/datasette/issues/1236 MDEyOklzc3VlQ29tbWVudDc4MjQ1OTU1MA== simonw 9599 2021-02-19T23:45:30Z 2021-02-19T23:45:30Z OWNER

Encoded using https://meyerweb.com/eric/tools/dencoder/

%3Csvg%20aria-labelledby%3D%22cm-drag-to-resize%22%20role%3D%22img%22%20fill%3D%22%23ccc%22%20stroke%3D%22%23ccc%22%20xmlns%3D%22http%3A%2F%2Fwww.w3.org%2F2000%2Fsvg%22%20viewBox%3D%220%200%2016%2016%22%20width%3D%2216%22%20height%3D%2216%22%3E%0A%20%20%3Ctitle%20id%3D%22cm-drag-to-resize%22%3EDrag%20to%20resize%3C%2Ftitle%3E%0A%20%20%3Cpath%20fill-rule%3D%22evenodd%22%20d%3D%22M1%202.75A.75.75%200%20011.75%202h12.5a.75.75%200%20110%201.5H1.75A.75.75%200%20011%202.75zm0%205A.75.75%200%20011.75%207h12.5a.75.75%200%20110%201.5H1.75A.75.75%200%20011%207.75zM1.75%2012a.75.75%200%20100%201.5h12.5a.75.75%200%20100-1.5H1.75z%22%3E%3C%2Fpath%3E%0A%3C%2Fsvg%3E

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Ability to increase size of the SQL editor window 812228314  
782459405 https://github.com/simonw/datasette/issues/1236#issuecomment-782459405 https://api.github.com/repos/simonw/datasette/issues/1236 MDEyOklzc3VlQ29tbWVudDc4MjQ1OTQwNQ== simonw 9599 2021-02-19T23:45:02Z 2021-02-19T23:45:02Z OWNER

I'm going to use a variant of the Datasette menu icon. Here it is in #ccc with an ARIA label:

<svg aria-labelledby="cm-drag-to-resize" role="img" fill="#ccc" stroke="#ccc" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 16 16" width="16" height="16">
  <title id="cm-drag-to-resize">Drag to resize</title>
  <path fill-rule="evenodd" d="M1 2.75A.75.75 0 011.75 2h12.5a.75.75 0 110 1.5H1.75A.75.75 0 011 2.75zm0 5A.75.75 0 011.75 7h12.5a.75.75 0 110 1.5H1.75A.75.75 0 011 7.75zM1.75 12a.75.75 0 100 1.5h12.5a.75.75 0 100-1.5H1.75z"></path>
</svg>
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Ability to increase size of the SQL editor window 812228314  
782458983 https://github.com/simonw/datasette/issues/1236#issuecomment-782458983 https://api.github.com/repos/simonw/datasette/issues/1236 MDEyOklzc3VlQ29tbWVudDc4MjQ1ODk4Mw== simonw 9599 2021-02-19T23:43:34Z 2021-02-19T23:43:34Z OWNER

I only want it to resize up and down, not left to right - so I'm not keen on the default resize handle:

https://user-images.githubusercontent.com/9599/108573363-364de680-72c9-11eb-8741-5112463ebfaa.png

https://rawgit.com/Sphinxxxx/cm-resize/master/demo/index.html

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Ability to increase size of the SQL editor window 812228314  
782458744 https://github.com/simonw/datasette/issues/1236#issuecomment-782458744 https://api.github.com/repos/simonw/datasette/issues/1236 MDEyOklzc3VlQ29tbWVudDc4MjQ1ODc0NA== simonw 9599 2021-02-19T23:42:42Z 2021-02-19T23:42:42Z OWNER
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Ability to increase size of the SQL editor window 812228314  
782430028 https://github.com/simonw/datasette/issues/1212#issuecomment-782430028 https://api.github.com/repos/simonw/datasette/issues/1212 MDEyOklzc3VlQ29tbWVudDc4MjQzMDAyOA== kbaikov 4488943 2021-02-19T22:54:13Z 2021-02-19T22:54:13Z NONE

I will close this issue since it appears only in my particular setup.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Tests are very slow.  797651831  
782246111 https://github.com/simonw/datasette/issues/619#issuecomment-782246111 https://api.github.com/repos/simonw/datasette/issues/619 MDEyOklzc3VlQ29tbWVudDc4MjI0NjExMQ== simonw 9599 2021-02-19T18:11:22Z 2021-02-19T18:11:22Z OWNER

Big usability improvement, see also #1236

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
"Invalid SQL" page should let you edit the SQL 520655983  
782053455 https://github.com/simonw/datasette/pull/1229#issuecomment-782053455 https://api.github.com/repos/simonw/datasette/issues/1229 MDEyOklzc3VlQ29tbWVudDc4MjA1MzQ1NQ== camallen 295329 2021-02-19T12:47:19Z 2021-02-19T12:47:19Z NONE

I believe this PR and #1031 are related and fix the same issue.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
ensure immutable databases when starting in configuration directory mode with 810507413  
781825726 https://github.com/simonw/sqlite-utils/issues/236#issuecomment-781825726 https://api.github.com/repos/simonw/sqlite-utils/issues/236 MDEyOklzc3VlQ29tbWVudDc4MTgyNTcyNg== simonw 9599 2021-02-19T05:10:41Z 2021-02-19T05:10:41Z OWNER
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
--attach command line option for attaching extra databases 811680502  
781825187 https://github.com/simonw/sqlite-utils/issues/113#issuecomment-781825187 https://api.github.com/repos/simonw/sqlite-utils/issues/113 MDEyOklzc3VlQ29tbWVudDc4MTgyNTE4Nw== simonw 9599 2021-02-19T05:09:12Z 2021-02-19T05:09:12Z OWNER
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Syntactic sugar for ATTACH DATABASE 621286870  
781764561 https://github.com/simonw/datasette/issues/283#issuecomment-781764561 https://api.github.com/repos/simonw/datasette/issues/283 MDEyOklzc3VlQ29tbWVudDc4MTc2NDU2MQ== simonw 9599 2021-02-19T02:10:21Z 2021-02-19T02:10:21Z OWNER
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Support cross-database joins 325958506  
781736855 https://github.com/simonw/datasette/issues/1235#issuecomment-781736855 https://api.github.com/repos/simonw/datasette/issues/1235 MDEyOklzc3VlQ29tbWVudDc4MTczNjg1NQ== simonw 9599 2021-02-19T00:52:47Z 2021-02-19T01:47:53Z OWNER

I bumped the two lines in the Dockerfile to FROM python:3.7.10-slim-stretch as build and ran this to build it:

docker build -f Dockerfile -t datasetteproject/datasette:python-3-7-10 .

Then I ran it with:

docker run -p 8001:8001 -v `pwd`:/mnt datasetteproject/datasette:python-3-7-10 datasette -p 8001 -h 0.0.0.0 /mnt/fixtures.db

http://0.0.0.0:8001/-/versions confirmed that it was now running Python 3.7.10

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Upgrade Python version used by official Datasette Docker image 811589344  
781735887 https://github.com/simonw/datasette/issues/1235#issuecomment-781735887 https://api.github.com/repos/simonw/datasette/issues/1235 MDEyOklzc3VlQ29tbWVudDc4MTczNTg4Nw== simonw 9599 2021-02-19T00:50:21Z 2021-02-19T00:50:55Z OWNER

I'll bump to 3.7.10 for the moment - the fix for 3.8 isn't out until March 1st according to https://news.ycombinator.com/item?id=26186434

https://www.python.org/downloads/release/python-3710/

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Upgrade Python version used by official Datasette Docker image 811589344  
781670827 https://github.com/simonw/datasette/issues/283#issuecomment-781670827 https://api.github.com/repos/simonw/datasette/issues/283 MDEyOklzc3VlQ29tbWVudDc4MTY3MDgyNw== simonw 9599 2021-02-18T22:16:46Z 2021-02-18T22:16:46Z OWNER
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Support cross-database joins 325958506  
781599929 https://github.com/simonw/datasette/pull/1232#issuecomment-781599929 https://api.github.com/repos/simonw/datasette/issues/1232 MDEyOklzc3VlQ29tbWVudDc4MTU5OTkyOQ== codecov[bot] 22429695 2021-02-18T19:59:54Z 2021-02-18T22:06:42Z NONE

Codecov Report

Merging #1232 (8876499) into main (4df548e) will increase coverage by 0.03%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main    #1232      +/-   ##
==========================================
+ Coverage   91.42%   91.46%   +0.03%     
==========================================
  Files          32       32              
  Lines        3955     3970      +15     
==========================================
+ Hits         3616     3631      +15     
  Misses        339      339              
Impacted Files                 Coverage Δ
datasette/app.py               95.68% <100.00%> (+0.06%) :arrow_up:
datasette/cli.py               76.62% <100.00%> (+0.36%) :arrow_up:
datasette/views/database.py    97.19% <100.00%> (+0.01%) :arrow_up:

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4df548e...8876499. Read the comment docs.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
--crossdb option for joining across databases 811407131  
781665560 https://github.com/simonw/datasette/issues/283#issuecomment-781665560 https://api.github.com/repos/simonw/datasette/issues/283 MDEyOklzc3VlQ29tbWVudDc4MTY2NTU2MA== simonw 9599 2021-02-18T22:06:14Z 2021-02-18T22:06:14Z OWNER

The implementation in #1232 is ready to land. It's the simplest-thing-that-could-possibly-work: you can run datasette one.db two.db three.db --crossdb and then use the /_memory page to run joins across tables from multiple databases.

It only works on the first 10 databases that were passed to the command-line. This means that if you have a Datasette instance with hundreds of attached databases (see Datasette Library) this won't be particularly useful for you.

So... a better, future version of this feature would be one that lets you join across databases on demand - maybe by hitting /_memory?attach=db1&attach=db2 to get a special connection.

Also worth noting: plugins that implement the prepare_connection() hook can attach additional databases - so if you need better, customized support for this one way to handle that would be with a custom plugin.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Support cross-database joins 325958506  
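The mechanism the comment above describes - an in-memory connection with each database file attached under its own schema name, which a single query can then join across - can be sketched with Python's sqlite3 module. The file names (one.db, two.db) and the authors/repos tables here are made up for illustration; they are not part of Datasette's fixtures.

```python
import os
import sqlite3
import tempfile

# Build two separate database files, each with one table.
d = tempfile.mkdtemp()
one = os.path.join(d, "one.db")
two = os.path.join(d, "two.db")

db1 = sqlite3.connect(one)
db1.execute("create table authors (id integer primary key, name text)")
db1.execute("insert into authors values (1, 'simonw')")
db1.commit()
db1.close()

db2 = sqlite3.connect(two)
db2.execute("create table repos (id integer primary key, author_id integer, name text)")
db2.execute("insert into repos values (1, 1, 'datasette')")
db2.commit()
db2.close()

# An in-memory connection with both files attached, similar in spirit
# to the /_memory connection that --crossdb sets up.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ? AS one", (one,))
conn.execute("ATTACH DATABASE ? AS two", (two,))

# A join that spans both databases, qualified by schema name.
row = conn.execute(
    """
    select authors.name, repos.name
    from one.authors authors
    join two.repos repos on repos.author_id = authors.id
    """
).fetchone()
```

The schema-name qualification (one.authors, two.repos) is standard SQLite ATTACH behavior, which is all the cross-database join feature relies on.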
781651283 https://github.com/simonw/datasette/pull/1232#issuecomment-781651283 https://api.github.com/repos/simonw/datasette/issues/1232 MDEyOklzc3VlQ29tbWVudDc4MTY1MTI4Mw== simonw 9599 2021-02-18T21:37:55Z 2021-02-18T21:37:55Z OWNER
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
--crossdb option for joining across databases 811407131  
781641728 https://github.com/simonw/datasette/pull/1232#issuecomment-781641728 https://api.github.com/repos/simonw/datasette/issues/1232 MDEyOklzc3VlQ29tbWVudDc4MTY0MTcyOA== simonw 9599 2021-02-18T21:19:34Z 2021-02-18T21:19:34Z OWNER

I tested the demo deployment like this:

datasette publish cloudrun fixtures.db extra_database.db \                                      
            -m fixtures.json \
            --plugins-dir=plugins \
            --branch=crossdb \
            --extra-options="--setting template_debug 1 --crossdb" \
            --install=pysqlite3-binary \
            --service=datasette-latest-crossdb
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
--crossdb option for joining across databases 811407131  
781637292 https://github.com/simonw/datasette/pull/1232#issuecomment-781637292 https://api.github.com/repos/simonw/datasette/issues/1232 MDEyOklzc3VlQ29tbWVudDc4MTYzNzI5Mg== simonw 9599 2021-02-18T21:11:31Z 2021-02-18T21:11:31Z OWNER

Due to bug #1233 I'm going to publish the additional database as extra_database.db rather than extra database.db as it is used in the tests.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
--crossdb option for joining across databases 811407131  
781636590 https://github.com/simonw/datasette/issues/1233#issuecomment-781636590 https://api.github.com/repos/simonw/datasette/issues/1233 MDEyOklzc3VlQ29tbWVudDc4MTYzNjU5MA== simonw 9599 2021-02-18T21:10:08Z 2021-02-18T21:10:08Z OWNER
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
"datasette publish cloudrun" cannot publish files with spaces in their name 811458446  
781634819 https://github.com/simonw/datasette/pull/1232#issuecomment-781634819 https://api.github.com/repos/simonw/datasette/issues/1232 MDEyOklzc3VlQ29tbWVudDc4MTYzNDgxOQ== simonw 9599 2021-02-18T21:06:43Z 2021-02-18T21:06:43Z OWNER
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
--crossdb option for joining across databases 811407131  
781629841 https://github.com/simonw/datasette/pull/1232#issuecomment-781629841 https://api.github.com/repos/simonw/datasette/issues/1232 MDEyOklzc3VlQ29tbWVudDc4MTYyOTg0MQ== simonw 9599 2021-02-18T20:57:23Z 2021-02-18T20:57:23Z OWNER
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
--crossdb option for joining across databases 811407131  
781598585 https://github.com/simonw/datasette/pull/1232#issuecomment-781598585 https://api.github.com/repos/simonw/datasette/issues/1232 MDEyOklzc3VlQ29tbWVudDc4MTU5ODU4NQ== simonw 9599 2021-02-18T19:57:30Z 2021-02-18T19:57:30Z OWNER

It would also be neat if https://latest.datasette.io/ had multiple databases attached in order to demonstrate this feature.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
--crossdb option for joining across databases 811407131  
781594632 https://github.com/simonw/datasette/pull/1232#issuecomment-781594632 https://api.github.com/repos/simonw/datasette/issues/1232 MDEyOklzc3VlQ29tbWVudDc4MTU5NDYzMg== simonw 9599 2021-02-18T19:50:21Z 2021-02-18T19:50:21Z OWNER

It would be neat if the /_memory page showed a list of attached databases, to indicate that the --crossdb option is working and give people links to click to start running queries.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
--crossdb option for joining across databases 811407131  
781593169 https://github.com/simonw/datasette/issues/283#issuecomment-781593169 https://api.github.com/repos/simonw/datasette/issues/283 MDEyOklzc3VlQ29tbWVudDc4MTU5MzE2OQ== simonw 9599 2021-02-18T19:47:34Z 2021-02-18T19:47:34Z OWNER

I have a working version now, moving development to a pull request.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Support cross-database joins 325958506  
781591015 https://github.com/simonw/datasette/issues/283#issuecomment-781591015 https://api.github.com/repos/simonw/datasette/issues/283 MDEyOklzc3VlQ29tbWVudDc4MTU5MTAxNQ== simonw 9599 2021-02-18T19:44:02Z 2021-02-18T19:44:02Z OWNER

For the moment I'm going to hard-code a SQLITE_LIMIT_ATTACHED=10 constant and only attach the first 10 databases to the _memory connection.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Support cross-database joins 325958506  
781574786 https://github.com/simonw/datasette/issues/283#issuecomment-781574786 https://api.github.com/repos/simonw/datasette/issues/283 MDEyOklzc3VlQ29tbWVudDc4MTU3NDc4Ng== simonw 9599 2021-02-18T19:15:37Z 2021-02-18T19:15:37Z OWNER

select * from pragma_database_list(); is useful - shows all attached databases for the current connection.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Support cross-database joins 325958506  
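The pragma_database_list() table-valued function mentioned above can be exercised directly from Python; each returned row is (seq, name, file), so listing attached database names is a one-liner. The extra alias below is just an example name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("ATTACH ':memory:' AS extra")

# pragma_database_list() returns one row per attached database:
# (seq, name, file) - 'main' is always present.
rows = conn.execute("select * from pragma_database_list()").fetchall()
names = [name for _seq, name, _file in rows]
```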
781573676 https://github.com/simonw/datasette/issues/283#issuecomment-781573676 https://api.github.com/repos/simonw/datasette/issues/283 MDEyOklzc3VlQ29tbWVudDc4MTU3MzY3Ng== simonw 9599 2021-02-18T19:13:30Z 2021-02-18T19:13:30Z OWNER

It turns out SQLite defaults to a maximum of 10 attached databases. This can be increased using a compile-time constant, but even with that it cannot be more than 62: https://stackoverflow.com/questions/9845448/attach-limit-10

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Support cross-database joins 325958506  
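The 10-database default described above (SQLITE_MAX_ATTACHED) is easy to observe: on a stock SQLite build, the eleventh ATTACH on a connection fails with "too many attached databases". A small sketch, assuming the default compile-time limit:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
attached = 0
error = None
try:
    # Keep attaching fresh in-memory databases until SQLite refuses.
    # 'main' and 'temp' do not count toward the limit.
    for i in range(12):
        conn.execute(f"ATTACH ':memory:' AS extra_{i}")
        attached += 1
except sqlite3.OperationalError as e:
    error = e  # "too many attached databases" once the limit is hit
```

This is why the --crossdb implementation hard-codes a cutoff of 10 rather than attaching every database Datasette knows about.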

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
, [performed_via_github_app] TEXT);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);