issue_comments

14 rows where issue = 326800219 sorted by updated_at descending

id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions issue
549584753 https://github.com/simonw/datasette/issues/292#issuecomment-549584753 https://api.github.com/repos/simonw/datasette/issues/292 MDEyOklzc3VlQ29tbWVudDU0OTU4NDc1Mw== simonw 9599 2019-11-04T22:54:26Z 2019-11-04T22:54:26Z OWNER

I'm going to split off an issue just for ?_col= and ?_nocol=

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Mechanism for customizing the SQL used to select specific columns in the table view 326800219
549169101 https://github.com/simonw/datasette/issues/292#issuecomment-549169101 https://api.github.com/repos/simonw/datasette/issues/292 MDEyOklzc3VlQ29tbWVudDU0OTE2OTEwMQ== simonw 9599 2019-11-03T19:17:08Z 2019-11-03T19:17:16Z OWNER

A good basic starting point for this would be to ignore the ability to add custom SQL fragments and instead focus on being able to show and hide specific columns. This will play particularly well with #613.

Proposed syntax for that:

/db/table?_col=id&_col=name - just show the id and name columns
/db/table?_nocol=extras&_nocol=age - show all columns except for extras and age

I don't think it makes sense to allow both ?_col= and ?_nocol= arguments in the same request, so if you provide both I think we throw a 400 error.
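A minimal sketch of how that rule could be enforced, assuming the query string arrives as a plain string. The names resolve_columns and BadRequest are hypothetical, not part of Datasette; BadRequest stands in for whatever produces the 400 response. Returning ?_col= columns in request order is also an assumption.

```python
from urllib.parse import parse_qs


class BadRequest(Exception):
    """Hypothetical error type standing in for an HTTP 400 response."""


def resolve_columns(query_string, table_columns):
    """Return the columns to display for a ?_col= / ?_nocol= query string."""
    params = parse_qs(query_string)
    cols = params.get("_col", [])
    nocols = params.get("_nocol", [])
    if cols and nocols:
        # Mixing the two forms is rejected rather than guessed at
        raise BadRequest("Cannot use _col and _nocol in the same request")
    if cols:
        # Assumption: keep the order in which the columns were requested
        return cols
    # With no ?_nocol= either, this returns every column unchanged
    return [c for c in table_columns if c not in nocols]


if __name__ == "__main__":
    columns = ["id", "name", "age", "extras"]
    print(resolve_columns("_col=id&_col=name", columns))
    print(resolve_columns("_nocol=extras&_nocol=age", columns))
```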

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Mechanism for customizing the SQL used to select specific columns in the table view 326800219
423543060 https://github.com/simonw/datasette/issues/292#issuecomment-423543060 https://api.github.com/repos/simonw/datasette/issues/292 MDEyOklzc3VlQ29tbWVudDQyMzU0MzA2MA== simonw 9599 2018-09-21T14:06:31Z 2018-09-21T14:09:06Z OWNER

I keep on finding new reasons that I want this.

The latest is that I'm playing with the more advanced features of FTS5 - in particular the highlight() function and the ability to sort by rank.

The problem is... in order to do this, I need to properly join against the _fts table. Here's an example query:

select
  highlight(events_fts, 0, '<b>', '</b>'),
  events_fts.rank,
  events.*
from events
  join events_fts on events.rowid = events_fts.rowid
where events_fts match :search 
order by rank

Note that this is a different query from the usual FTS one (which does where rowid in (select rowid from events_fts...)) because I need the rank column somewhere I can sort against.

I'd like to be able to use this on the table view page so I can get faceting etc for free, but this is a completely different query from the default. Maybe I need a way to customize the entire query? That feels weird though - why am I not using a view in that case?

Answer: because views can't accept :search style parameters. I could use a canned query, but canned queries don't get faceting etc.
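To try the join-based query outside Datasette, here is a rough sketch using Python's sqlite3 module. It assumes the SQLite build has FTS5 compiled in. The events schema and sample rows are invented for illustration, and events_fts is set up as an external-content FTS5 table, which is one possible arrangement rather than necessarily the one used above.

```python
import sqlite3

# The query from the comment above, with an alias added for the first column
FTS_QUERY = """
select
  highlight(events_fts, 0, '<b>', '</b>') as highlighted,
  events_fts.rank,
  events.*
from events
  join events_fts on events.rowid = events_fts.rowid
where events_fts match :search
order by rank
"""


def build_demo_db():
    """Create an in-memory database with a made-up events table and FTS index."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        create table events (id integer primary key, title text);
        insert into events (id, title) values
          (1, 'PyCon sprint day'),
          (2, 'Board game night'),
          (3, 'SQLite sprint');
        create virtual table events_fts using fts5(
          title, content='events', content_rowid='id'
        );
        insert into events_fts (rowid, title) select id, title from events;
        """
    )
    return conn


def search_events(conn, search):
    """Run the join query with :search bound as a named parameter."""
    return conn.execute(FTS_QUERY, {"search": search}).fetchall()


if __name__ == "__main__":
    for row in search_events(build_demo_db(), "sprint"):
        print(row)
```

Each row comes back as (highlighted, rank, id, title), so the rank value is available to sort against, which the where rowid in (...) form does not offer.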

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Mechanism for customizing the SQL used to select specific columns in the table view 326800219
392343839 https://github.com/simonw/datasette/issues/292#issuecomment-392343839 https://api.github.com/repos/simonw/datasette/issues/292 MDEyOklzc3VlQ29tbWVudDM5MjM0MzgzOQ== simonw 9599 2018-05-27T16:10:09Z 2018-06-04T17:38:04Z OWNER

The more efficient way of doing this kind of count would be to provide a mechanism which can also add extra fragments to a GROUP BY clause used for the SELECT.

Or... how about a mechanism similar to Django's prefetch_related which lets you define extra queries that will be called with a list of primary keys (or values from other columns) and used to populate a new column? A little unconventional but could be extremely useful and efficient.

Related to that: since the per-query overhead in SQLite is tiny, could even define an extra query to be run once-per-row before returning results.
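A sketch of what the prefetch_related-style idea might look like, assuming hypothetical license and license_frequency tables where license_frequency.license references license.id. The function name add_prefetched_counts is made up for illustration. The main query runs first, then one extra query is called with the collected primary keys and used to fill a new count column.

```python
import sqlite3


def build_demo_db():
    """Create made-up license / license_frequency tables in memory."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        create table license (id integer primary key, name text);
        create table license_frequency (
          id integer primary key, license integer, frequency real
        );
        insert into license (id, name) values (1, 'A'), (2, 'B'), (3, 'C');
        insert into license_frequency (license, frequency) values
          (1, 100.0), (1, 101.5), (2, 88.1);
        """
    )
    return conn


def fetch_licenses(conn):
    """The main table query, returned as a list of dicts."""
    cursor = conn.execute("select id, name from license order by id")
    return [{"id": row[0], "name": row[1]} for row in cursor]


def add_prefetched_counts(conn, rows, key="id"):
    """Populate a 'count' column using one extra query for all rows."""
    keys = [row[key] for row in rows]
    if not keys:
        return rows
    placeholders = ", ".join("?" for _ in keys)
    sql = (
        "select license, count(*) from license_frequency "
        "where license in ({}) group by license".format(placeholders)
    )
    counts = dict(conn.execute(sql, keys).fetchall())
    for row in rows:
        # Keys with no matching rows are absent from the result, so default to 0
        row["count"] = counts.get(row[key], 0)
    return rows


if __name__ == "__main__":
    conn = build_demo_db()
    print(add_prefetched_counts(conn, fetch_licenses(conn)))
```

This issues two queries per page regardless of how many rows are shown. A very long key list would need batching to stay under SQLite's bound-parameter limit.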

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Mechanism for customizing the SQL used to select specific columns in the table view 326800219
392350980 https://github.com/simonw/datasette/issues/292#issuecomment-392350980 https://api.github.com/repos/simonw/datasette/issues/292 MDEyOklzc3VlQ29tbWVudDM5MjM1MDk4MA== simonw 9599 2018-05-27T17:56:30Z 2018-05-27T17:56:50Z OWNER

Should ?_raw=1 also turn off foreign key expansions? No, we will eventually provide a separate mechanism for that (or leave it to nerds who care to figure out using JSON or CSV export).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Mechanism for customizing the SQL used to select specific columns in the table view 326800219
392350568 https://github.com/simonw/datasette/issues/292#issuecomment-392350568 https://api.github.com/repos/simonw/datasette/issues/292 MDEyOklzc3VlQ29tbWVudDM5MjM1MDU2OA== simonw 9599 2018-05-27T17:48:45Z 2018-05-27T17:54:41Z OWNER

If any ?_column= parameters are provided the metadata version is completely ignored.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Mechanism for customizing the SQL used to select specific columns in the table view 326800219
392350495 https://github.com/simonw/datasette/issues/292#issuecomment-392350495 https://api.github.com/repos/simonw/datasette/issues/292 MDEyOklzc3VlQ29tbWVudDM5MjM1MDQ5NQ== simonw 9599 2018-05-27T17:47:31Z 2018-05-27T17:47:31Z OWNER

Querystring design:

  • ?_column=a&_column=b - equivalent of "columns": ["a", "b"] in metadata.json
  • ?_select_nameupper=upper(name) - equivalent of "column_selects": {"nameupper": "upper(name)"}
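
One way this mapping could be sketched, assuming the raw query string is available as text. The function name querystring_to_metadata is hypothetical; it only illustrates turning the two parameter styles into the equivalent metadata keys.

```python
from urllib.parse import parse_qsl

SELECT_PREFIX = "_select_"


def querystring_to_metadata(query_string):
    """Translate ?_column= and ?_select_NAME= into metadata-style keys."""
    columns = []
    column_selects = {}
    for key, value in parse_qsl(query_string):
        if key == "_column":
            columns.append(value)
        elif key.startswith(SELECT_PREFIX) and len(key) > len(SELECT_PREFIX):
            # Everything after the prefix is the name of the output column
            column_selects[key[len(SELECT_PREFIX):]] = value
    result = {}
    if columns:
        result["columns"] = columns
    if column_selects:
        result["column_selects"] = column_selects
    return result


if __name__ == "__main__":
    print(querystring_to_metadata("_column=a&_column=b"))
    print(querystring_to_metadata("_select_nameupper=upper(name)"))
```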
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Mechanism for customizing the SQL used to select specific columns in the table view 326800219
392345062 https://github.com/simonw/datasette/issues/292#issuecomment-392345062 https://api.github.com/repos/simonw/datasette/issues/292 MDEyOklzc3VlQ29tbWVudDM5MjM0NTA2Mg== simonw 9599 2018-05-27T16:26:53Z 2018-05-27T16:26:53Z OWNER

There needs to be a way to turn this off and return to Datasette's default behaviour. Maybe a ?_raw=1 querystring parameter for the table view.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Mechanism for customizing the SQL used to select specific columns in the table view 326800219
392343690 https://github.com/simonw/datasette/issues/292#issuecomment-392343690 https://api.github.com/repos/simonw/datasette/issues/292 MDEyOklzc3VlQ29tbWVudDM5MjM0MzY5MA== simonw 9599 2018-05-27T16:08:25Z 2018-05-27T16:08:40Z OWNER

Turns out it's actually possible to pull data from other tables using the mechanism in the prototype:

{
    "databases": {
        "wtr": {
            "tables": {
                "license": {
                    "column_selects": {
                        "count": "(select count(*) from license_frequency where license_frequency.license = license.id)"
                    }
                }
            }
        }
    }
}

Performance using this technique is pretty terrible though.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Mechanism for customizing the SQL used to select specific columns in the table view 326800219
392342947 https://github.com/simonw/datasette/issues/292#issuecomment-392342947 https://api.github.com/repos/simonw/datasette/issues/292 MDEyOklzc3VlQ29tbWVudDM5MjM0Mjk0Nw== simonw 9599 2018-05-27T16:01:43Z 2018-05-27T16:01:43Z OWNER

I'd still like to be able to over-ride this using querystring arguments.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Mechanism for customizing the SQL used to select specific columns in the table view 326800219
392342269 https://github.com/simonw/datasette/issues/292#issuecomment-392342269 https://api.github.com/repos/simonw/datasette/issues/292 MDEyOklzc3VlQ29tbWVudDM5MjM0MjI2OQ== simonw 9599 2018-05-27T15:55:40Z 2018-05-27T16:01:26Z OWNER

Here's the metadata I tried against that first working prototype:

{
    "databases": {
        "timezones": {
            "tables": {
                "timezones": {
                    "columns": ["PK_UID"],
                    "column_selects": {
                        "upper_tzid": "upper(tzid)",
                        "Geometry": "AsGeoJSON(Geometry)"
                    }
                }
            }
        },
        "wtr": {
            "tables": {
                "license_frequency": {
                    "columns": ["id", "license", "tx_rx", "frequency"],
                    "column_selects": {
                        "latitude": "Y(Geometry)",
                        "longitude": "X(Geometry)"
                    }
                }
            }
        }
    }
}

Run using this:

datasette timezones.db wtr.db \
    --reload --debug --load-extension=/usr/local/lib/mod_spatialite.dylib \
    -m column-metadata.json --config sql_time_limit_ms:10000

Usefully, the --reload flag detects changes to the metadata.json file as well as Datasette's own Python code.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Mechanism for customizing the SQL used to select specific columns in the table view 326800219
392338130 https://github.com/simonw/datasette/issues/292#issuecomment-392338130 https://api.github.com/repos/simonw/datasette/issues/292 MDEyOklzc3VlQ29tbWVudDM5MjMzODEzMA== simonw 9599 2018-05-27T15:09:18Z 2018-05-27T15:09:28Z OWNER

Here's my first sketch at a metadata format for this:

  • columns: optional list of columns to include - if missing, shows all
  • column_selects: dictionary mapping column names to alternative select clauses

column_selects can also invent new keys and use them to create derived columns. These new keys will be selected at the end of the list of columns UNLESS they are mentioned in columns, in which case that sequence will define the order.

Can you facet by things that are customized using column_selects? Yes, and let's try running suggested facets against those columns as well.

{
    "databases": {
        "databasename": {
            "tables": {
                "tablename": {
                    "columns": [
                        "id", "name", "size"
                    ],
                    "column_selects": {
                        "name": "upper(name)",
                        "geo_json": "AsGeoJSON(Geometry)"
                    },
                    "row_columns": [...],
                    "row_column_selects": {...}
                }
            }
        }
    }
}

The row_columns and row_column_selects properties work the same as the column* ones, except they are applied on the row page instead.

If omitted, the column* ones will be used on the row page as well.

If you want the row page to switch back to Datasette's default behaviour you can set "row_columns": [], "row_column_selects": {}.
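
A rough sketch of how this metadata might be turned into a SELECT statement, under my reading of the rules above. The names build_select, row_page_config and quote are hypothetical. Appending every column_selects key that is not listed in columns, including keys that override an existing column, is an interpretation rather than something the sketch above spells out.

```python
def quote(name):
    """Quote an identifier for SQLite using double quotes."""
    return '"{}"'.format(name.replace('"', '""'))


def build_select(table, columns=None, column_selects=None, all_columns=()):
    """Build a SELECT from 'columns' and 'column_selects' style metadata."""
    column_selects = column_selects or {}
    # A missing or empty 'columns' list means: show every column
    names = list(columns) if columns else list(all_columns)
    # Keys not already placed by 'columns' go at the end of the list
    for key in column_selects:
        if key not in names:
            names.append(key)
    parts = []
    for name in names:
        if name in column_selects:
            parts.append("{} as {}".format(column_selects[name], quote(name)))
        else:
            parts.append(quote(name))
    return "select {} from {}".format(", ".join(parts), quote(table))


def row_page_config(table_metadata):
    """Use the row_* properties if present, else fall back to the column* ones."""
    return {
        "columns": table_metadata.get(
            "row_columns", table_metadata.get("columns")
        ),
        "column_selects": table_metadata.get(
            "row_column_selects", table_metadata.get("column_selects")
        ),
    }


if __name__ == "__main__":
    metadata = {
        "columns": ["id", "name", "size"],
        "column_selects": {
            "name": "upper(name)",
            "geo_json": "AsGeoJSON(Geometry)",
        },
    }
    print(build_select("tablename", all_columns=["id", "name", "size", "Geometry"], **metadata))
```

With "row_columns": [] and "row_column_selects": {} the fallback is skipped and build_select receives empty values, which selects every column unchanged.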

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Mechanism for customizing the SQL used to select specific columns in the table view 326800219
392316701 https://github.com/simonw/datasette/issues/292#issuecomment-392316701 https://api.github.com/repos/simonw/datasette/issues/292 MDEyOklzc3VlQ29tbWVudDM5MjMxNjcwMQ== simonw 9599 2018-05-27T09:08:49Z 2018-05-27T09:08:49Z OWNER

I could certainly see people wanting different custom column selects for the row page compared to the table page.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Mechanism for customizing the SQL used to select specific columns in the table view 326800219
392316673 https://github.com/simonw/datasette/issues/292#issuecomment-392316673 https://api.github.com/repos/simonw/datasette/issues/292 MDEyOklzc3VlQ29tbWVudDM5MjMxNjY3Mw== simonw 9599 2018-05-27T09:08:06Z 2018-05-27T09:08:06Z OWNER

Open question: how should this affect the row page? Just because columns were hidden on the table page doesn't necessarily mean they should be hidden on the row page as well.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Mechanism for customizing the SQL used to select specific columns in the table view 326800219

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);