github
html_url | issue_url | id | node_id | user | created_at | updated_at | author_association | body | reactions | issue | performed_via_github_app |
---|---|---|---|---|---|---|---|---|---|---|---|
https://github.com/simonw/datasette/issues/894#issuecomment-709575818 | https://api.github.com/repos/simonw/datasette/issues/894 | 709575818 | MDEyOklzc3VlQ29tbWVudDcwOTU3NTgxOA== | 9599 | 2020-10-15T20:35:03Z | 2020-10-15T20:35:03Z | OWNER | Prototype so far: ```diff diff --git a/datasette/views/table.py b/datasette/views/table.py index ea11a51..d61f8bd 100644 --- a/datasette/views/table.py +++ b/datasette/views/table.py @@ -497,17 +497,32 @@ class TableView(RowTableShared): if sort and sort_desc: raise DatasetteError("Cannot use _sort and _sort_desc at the same time") + def parse_sort(sort): + if "~" in sort: + if sort.endswith("~default"): + col = sort.rsplit("~", 1)[0] + return col, escape_sqlite(col) + elif sort.endswith("~numeric"): + col = sort.rsplit("~", 1)[0] + return col, "cast(nullif({}, '') as real)".format(escape_sqlite(col)) + else: + return sort, escape_sqlite(sort) + else: + return sort, escape_sqlite(sort) + if sort: - if sort not in sortable_columns: - raise DatasetteError("Cannot sort table by {}".format(sort)) + sort_column, sort_clause = parse_sort(sort) + if sort_column not in sortable_columns: + raise DatasetteError("Cannot sort table by {}".format(sort_column)) - order_by = escape_sqlite(sort) + order_by = sort_clause if sort_desc: - if sort_desc not in sortable_columns: - raise DatasetteError("Cannot sort table by {}".format(sort_desc)) + sort_column, sort_clause = parse_sort(sort_desc) + if sort_column not in sortable_columns: + raise DatasetteError("Cannot sort table by {}".format(sort_column)) - order_by = "{} desc".format(escape_sqlite(sort_desc)) + order_by = "{} desc".format(sort_clause) from_sql = "from {table_name} {where}".format( table_name=escape_sqlite(table), ``` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
657572753 | |
https://github.com/simonw/datasette/issues/894#issuecomment-709572425 | https://api.github.com/repos/simonw/datasette/issues/894 | 709572425 | MDEyOklzc3VlQ29tbWVudDcwOTU3MjQyNQ== | 9599 | 2020-10-15T20:28:18Z | 2020-10-15T20:28:18Z | OWNER | Also need to rethink this template logic that decides if to show a column as sorted or not: https://github.com/simonw/datasette/blob/4f7c0ebd85ccd8c1853d7aa0147628f7c1b749cc/datasette/templates/_table.html#L10-L14 | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
657572753 | |
https://github.com/simonw/datasette/issues/894#issuecomment-709571143 | https://api.github.com/repos/simonw/datasette/issues/894 | 709571143 | MDEyOklzc3VlQ29tbWVudDcwOTU3MTE0Mw== | 9599 | 2020-10-15T20:25:35Z | 2020-10-15T20:25:35Z | OWNER | `cast(nullif(colname, '') as real)` can fix this - it will treat `''` the same as `null`. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
657572753 | |
https://github.com/simonw/datasette/issues/894#issuecomment-709569951 | https://api.github.com/repos/simonw/datasette/issues/894 | 709569951 | MDEyOklzc3VlQ29tbWVudDcwOTU2OTk1MQ== | 9599 | 2020-10-15T20:23:02Z | 2020-10-15T20:23:02Z | OWNER | Something to watch out for: `""` empty strings cast to `0.0`: `select cast("100" as real), "100", cast(null as real), cast("" as real)` cast("100" as real) | "100" | cast(null as real) | cast("" as real) -- | -- | -- | -- 100.0 | 100 | | 0.0 https://latest.datasette.io/fixtures?sql=select+cast%28%22100%22+as+real%29%2C+%22100%22%2C+cast%28null+as+real%29%2C+cast%28%22%22+as+real%29 | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
657572753 | |
https://github.com/simonw/datasette/issues/894#issuecomment-709562940 | https://api.github.com/repos/simonw/datasette/issues/894 | 709562940 | MDEyOklzc3VlQ29tbWVudDcwOTU2Mjk0MA== | 9599 | 2020-10-15T20:08:16Z | 2020-10-15T20:08:16Z | OWNER | Relevant code: https://github.com/simonw/datasette/blob/4f7c0ebd85ccd8c1853d7aa0147628f7c1b749cc/datasette/views/table.py#L485-L510 | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
657572753 | |
https://github.com/simonw/datasette/issues/894#issuecomment-709546976 | https://api.github.com/repos/simonw/datasette/issues/894 | 709546976 | MDEyOklzc3VlQ29tbWVudDcwOTU0Njk3Ng== | 9599 | 2020-10-15T19:35:55Z | 2020-10-15T19:36:38Z | OWNER | Much easier solution: if the suffix is `~numeric` then treat it as the column name sorted numerically. If the suffix is missing OR the suffix is `~default`, sort without casting. Only add the `~default` suffix if the column name itself contains at least one `~` symbol. Using `~` because it doesn't need to be URL-encoded. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
657572753 | |
https://github.com/simonw/datasette/issues/894#issuecomment-709539257 | https://api.github.com/repos/simonw/datasette/issues/894 | 709539257 | MDEyOklzc3VlQ29tbWVudDcwOTUzOTI1Nw== | 9599 | 2020-10-15T19:19:29Z | 2020-10-15T19:34:07Z | OWNER | Urgh this isn't going to work. `%7E~%7E` gets decoded as `~~~` so I wouldn't be able to tell the difference. I could use double-percentage-encoding here instead. I feel like there's a simpler solution that I'm missing (and that may well be in use within Datasette already, I'm not doing great thinking this morning). | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
657572753 | |
https://github.com/simonw/datasette/issues/894#issuecomment-709534197 | https://api.github.com/repos/simonw/datasette/issues/894 | 709534197 | MDEyOklzc3VlQ29tbWVudDcwOTUzNDE5Nw== | 9599 | 2020-10-15T19:08:53Z | 2020-10-15T19:17:55Z | OWNER | Even better solution: use URL encoding in the parameter details. This is consistent with how `?_next=` tokens work, e.g. `?_next=0.291861560261786%2Ce%2Cj`. So the format can be: - `mycolumn` - `urlencoded-mycolumn$castname` For most columns this will look like: `?_sort=score$numeric` For columns with a `$` in their name it will be `?_sort=score%24hasdollar$numeric` Problem: both `$` and `,` are usually URL encoded anyway. I need a character which isn't encoded by default, so that I can use its encoded form to show it is part of the column name and its un-encoded form to split the cast indicator. `_` is a candidate here - not encoded by default, but can be encoded as `%5F`. The other unreserved non-alphanumeric characters are `-`, `.`, `_`, `~`. Of these, `~` is least likely to show up in a column name. So I'll use that. - `mycolumn` - `mycolumn~numeric` - `mycolumn%7Ewith%7Etildes~numeric` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
657572753 | |
https://github.com/simonw/datasette/issues/894#issuecomment-709532369 | https://api.github.com/repos/simonw/datasette/issues/894 | 709532369 | MDEyOklzc3VlQ29tbWVudDcwOTUzMjM2OQ== | 9599 | 2020-10-15T19:05:07Z | 2020-10-15T19:07:35Z | OWNER | Simpler option: `?_sort=` column values look like this: - `mycolumn` - for sort by column - `mycolumn$numeric` - for sort by column after cast to float - `mycolumn$morename$default` - for the edge case where the column name itself contains a $ symbol | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
657572753 | |
https://github.com/simonw/datasette/issues/894#issuecomment-709531343 | https://api.github.com/repos/simonw/datasette/issues/894 | 709531343 | MDEyOklzc3VlQ29tbWVudDcwOTUzMTM0Mw== | 9599 | 2020-10-15T19:03:12Z | 2020-10-15T19:03:12Z | OWNER | The Sort by `<select>` menu needs a rethink: https://github.com/simonw/datasette/blob/4f7c0ebd85ccd8c1853d7aa0147628f7c1b749cc/datasette/templates/table.html#L80-L87 If it's going to include sort by numeric options it needs a different format - since `?_sort=colname` needs to also support `?_sort=numeric_colname` - but shouldn't clash with a weird column already named `numeric_colname`. Can I come up with a value syntax for this that is guaranteed not to clash with a weirdly named existing column? I think so. I could use `{"type": "numeric", "column": "mycolumn"}` as the value - then if a column has a name that is itself valid JSON (weird but possible) the `_sort=` value would be `{"type": "default", "column": "{\"type\": \"numeric\", \"name\": \"column\"}"}`. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
657572753 | |
https://github.com/simonw/datasette/issues/894#issuecomment-709525082 | https://api.github.com/repos/simonw/datasette/issues/894 | 709525082 | MDEyOklzc3VlQ29tbWVudDcwOTUyNTA4Mg== | 9599 | 2020-10-15T18:51:59Z | 2020-10-15T18:51:59Z | OWNER | This is enough of a design to build a working prototype. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
657572753 | |
https://github.com/simonw/datasette/issues/894#issuecomment-709505147 | https://api.github.com/repos/simonw/datasette/issues/894 | 709505147 | MDEyOklzc3VlQ29tbWVudDcwOTUwNTE0Nw== | 9599 | 2020-10-15T18:18:45Z | 2020-10-15T18:50:50Z | OWNER | This needs querystring parameter design. Some options: * `?_sort_clause=cast(mycol+as+integer)` and `?_sort_clause_desc=cast(mycol+as+integer)` - allowing any expression. This would need to be disabled if arbitrary SQL was turned off, similar to the restrictions on `?_where=`. * `?_sort_numeric=mycol` and `?_sort_numeric_desc=mycol` - this would cast to `real` which would work on integer values as well. It could be allowed even when arbitrary SQL was disabled. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
657572753 | |
https://github.com/simonw/datasette/issues/894#issuecomment-709524123 | https://api.github.com/repos/simonw/datasette/issues/894 | 709524123 | MDEyOklzc3VlQ29tbWVudDcwOTUyNDEyMw== | 9599 | 2020-10-15T18:50:25Z | 2020-10-15T18:50:25Z | OWNER | For the "Sort by X" select menu case... I could automatically expand that menu to contain extra options for "Sort numerically by X" for each TEXT column in the table. That's a pretty good option. For the action cog menu, I can add the extra options to the cog menu - and rely on the fact that the title of the page will say "Sorted numerically by colname descending". | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
657572753 | |
https://github.com/simonw/datasette/issues/894#issuecomment-709513483 | https://api.github.com/repos/simonw/datasette/issues/894 | 709513483 | MDEyOklzc3VlQ29tbWVudDcwOTUxMzQ4Mw== | 9599 | 2020-10-15T18:31:56Z | 2020-10-15T18:31:56Z | OWNER | I think the first version of this feature involves implementing `?_sort_numeric=col` and `?_sort_numeric_desc=col` plus the JavaScript to detect if those values should be shown in the column actions menu. One question: how to reflect that this is happening in the current sort UI. This menu here for example: <img width="440" alt="fixtures__compound_three_primary_keys__1_001_rows_where_sorted_by_pk3_descending" src="https://user-images.githubusercontent.com/9599/96171519-e35ac180-0ed9-11eb-9945-6780054e4ad5.png"> And this interface: how should it indicate that a text is currently sorted numerically v.s. sorted alphabetically, and allow the user to switch from one to the other? <img width="794" alt="fixtures__sortable__201_rows_where_sorted_by_sortable_with_nulls_descending" src="https://user-images.githubusercontent.com/9599/96171618-04bbad80-0eda-11eb-946c-ed3b17a8c749.png"> | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
657572753 | |
https://github.com/simonw/datasette/issues/894#issuecomment-709511399 | https://api.github.com/repos/simonw/datasette/issues/894 | 709511399 | MDEyOklzc3VlQ29tbWVudDcwOTUxMTM5OQ== | 9599 | 2020-10-15T18:28:09Z | 2020-10-15T18:28:09Z | OWNER | The simplest solution would be for Python code to scan all of the visible values on the current page and show the column action for "sort by this numeric" based purely on that. I already do that in the JavaScript for "are there any blank values in the first page?" here: https://github.com/simonw/datasette/blob/4f7c0ebd85ccd8c1853d7aa0147628f7c1b749cc/datasette/static/table.js#L106-L118 | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
657572753 | |
https://github.com/simonw/datasette/issues/894#issuecomment-709510422 | https://api.github.com/repos/simonw/datasette/issues/894 | 709510422 | MDEyOklzc3VlQ29tbWVudDcwOTUxMDQyMg== | 9599 | 2020-10-15T18:26:25Z | 2020-10-15T18:26:25Z | OWNER | There's something interesting about figuring out which sort options should be offered in the column actions menu. Two options: - Try to detect if a text column is all-integers or all-floats, either by scanning the entire table (if it's small enough) or by scanning the first X rows, where X might be the size of the first page or maybe the first 1,000 or similar. - Could also let users define this in `metadata.yml` for the table. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
657572753 | |
https://github.com/simonw/datasette/issues/894#issuecomment-709509635 | https://api.github.com/repos/simonw/datasette/issues/894 | 709509635 | MDEyOklzc3VlQ29tbWVudDcwOTUwOTYzNQ== | 9599 | 2020-10-15T18:24:57Z | 2020-10-15T18:24:57Z | OWNER | This does feel like a weird plugin hook just because there aren't really THAT many different use-cases that plugins could solve. The ones I can think of are: - Sort numeric - Sort by parsed date - Sort by FTS rank Could this work if I just allow `_sort_clause=`? One possible solution for the no-arbitrary-SQL case: users can define sort orders in `metadata.json/yml`. So if you want to enable sort-by-distance without enabling arbitrary SQL you could add something like this: ```yaml databases: mydb: tables: museums: sort_clause: bm25(fts) ``` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
657572753 | |
https://github.com/simonw/datasette/issues/894#issuecomment-709505953 | https://api.github.com/repos/simonw/datasette/issues/894 | 709505953 | MDEyOklzc3VlQ29tbWVudDcwOTUwNTk1Mw== | 9599 | 2020-10-15T18:20:13Z | 2020-10-15T18:20:13Z | OWNER | Sorting by date when the column has a junk date format in it is such a column need it should maybe ship in Datasette by default - though I've been trying to avoid adding heavy dependencies like `dateutil` if I can get away with it. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
657572753 | |
https://github.com/simonw/datasette/issues/894#issuecomment-709505608 | https://api.github.com/repos/simonw/datasette/issues/894 | 709505608 | MDEyOklzc3VlQ29tbWVudDcwOTUwNTYwOA== | 9599 | 2020-10-15T18:19:35Z | 2020-10-15T18:19:35Z | OWNER | I could even let plugins define new sort types. Imagine a plugin that enables this: `?_sort_date_desc=mycol` - where it knows how to handle specific date formats, or even uses `dateutil.parser.parse`. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
657572753 |