id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,pull_request,body,repo,type,active_lock_reason,performed_via_github_app,reactions,draft,state_reason 1163369515,I_kwDOBm6k_c5FV5wr,1655,query result page is using 400mb of browser memory 40x size of html page and 400x size of csv data,536941,open,0,,,8,2022-03-09T00:56:40Z,2023-10-17T21:53:17Z,,CONTRIBUTOR,,"[this page](https://labordata.bunkum.us/opdr-8335ea3?sql=with+most_recent_lu+as+%28%0D%0A++select%0D%0A++++*%0D%0A++from%0D%0A++++%28%0D%0A++++++select%0D%0A++++++++*%0D%0A++++++from%0D%0A++++++++lm_data%0D%0A++++++order+by%0D%0A++++++++f_num%2C%0D%0A++++++++receive_date+desc%0D%0A++++%29+t%0D%0A++group+by%0D%0A++++f_num%0D%0A%29%0D%0Aselect%0D%0A++aff_abbr+%7C%7C+coalesce%28%27+local+%27+%7C%7C+desig_num%2C+%27+%27+%7C%7C+unit_name%29+as+abbr_local_name%2C%0D%0A++coalesce%28%0D%0A++++regexp_match%28%27%28.*%3F%29%28%2C%3F+AFL-CIO%24%29%27%2C+union_name%29%2C%0D%0A++++regexp_match%28%27%28.*%3F%29%28+IND%24%29%27%2C+union_name%29%2C%0D%0A++++union_name%0D%0A++%29+%7C%7C+coalesce%28%27+local+%27+%7C%7C+desig_num%2C+%27+%27+%7C%7C+unit_name%29+as+full_local_name%2C%0D%0A++*%0D%0Afrom%0D%0A++most_recent_lu%0D%0Awhere+%28desig_num+IS+NOT+NULL+OR+unit_name+IS+NOT+NULL%29+AND+desig_name+%21%3D+%27HQ%27%0D%0Alimit%0D%0A++5000+offset+0) is using about 400 mb in firefox 97 on mac os x. if you download the html for the page, it's about 11mb and if you get the csv for the data its about 1mb. it's using over a 1G on chrome 99. i found this because, i was trying to figure out why editing the SQL was getting very slow. ",107914493,issue,,,"{""url"": ""https://api.github.com/repos/simonw/datasette/issues/1655/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 1781530343,I_kwDOBm6k_c5qL_7n,2093,"Proposal: Combine settings, metadata, static, etc. into a single `datasette.yaml` File",15178711,open,0,,,8,2023-06-29T21:18:23Z,2023-09-11T20:19:32Z,,CONTRIBUTOR,,"Very often I get tripped up when trying to configure my Datasette instances. For example: if I want to change the port my app listen too, do I do that with a CLI flag, a `--setting` flag, inside `metadata.json`, or an env var? If I want to up the time limit of SQL statements, is that under `metadata.json` or a setting? Where does my plugin configuration go? Normally I need to look it up in Datasette docs, and I quickly find my answer, but the number of places where ""config"" goes it overwhelming. - Flat CLI flags like `--port`, `--host`, `--cors`, etc. - `--setting`, like `default_page_size`, `sql_time_limit_ms` etc - Inside `metadata.json`, including plugin configuration Typically my Datasette deploys are extremely long shell commands, with multiple `--setting` and other CLI flags. ## Proposal: Consolidate all ""config"" into `datasette.toml` I propose that we add a new `datasette.toml` that combines ""settings"", ""metadata"", and other common CLI flags like `--port` and `--cors` into a single file. It would be similar to ""Cargo.toml"" in Rust projects, ""package.json"" in Node projects, and ""pyproject.toml"" in Python, etc. A sample of what it could look like: ```toml # ""top level"" configuration that are currently CLI flags on `datasette serve` [config] port = 8020 host = ""0.0.0.0"" cors = true # replaces multiple `--setting` flags [settings] base_url = ""/app/datasette/"" default_allow_sql = true sql_time_limit_ms = 3500 # replaces `metadata.json`. # The contents of datasette-metadata.json could be defined in this file instead, but supporting separate files is nice (since those are easy to machine-generate) [metadata] include=""./datasette-metadata.json"" # plugin-specific [plugins] [plugins.datasette-auth-github] client_id = {env = ""DATASETTE_AUTH_GITHUB_CLIENT_ID""} client_secret = {env = ""GITHUB_CLIENT_SECRET""} [plugins.datasette-cluster-map] latitude_column = ""lat"" longitude_column = ""lon"" ``` ## Pros - Instead of multiple files and CLI flags, everything could be in one tidy file - Editing config in a separate file is easier than editing CLI flags, since you don't have to kill a process + edit a command every time - New users will know ""just edit my `datasette.toml` instead of needing to learn metadata + settings + CLI flags - Better dev experience for multiple environment. For example, could have `datasette -c datasette-dev.toml` for local dev environments (enables SQL, debug plugins, long timeouts, etc.), and a `datasette -c datasette-prod.toml` for ""production"" (lower timeouts, less plugins, monitoring plugins, etc.) ## Cons - Yet another config-management system. Now Datasette users will need to know about metadata, settings, CLI flags, _and_ `datasette.toml`. However with enough documentation + announcements + examples, I think we can get ahead of it. - If toml is chosen, would need to add a toml parser for Python version <3.11 - Multiple sources of config require priority. For example: Would `--setting default_allow_sql off` override the value inside `[settings]`? What about `--port`? ## Other Notes ### Toml I chose toml over json because toml supports comments. I chose toml over yaml because Python 3.11 has builtin support for it. I also find toml easier to work with since it doesn't have the odd ""gotchas"" that YAML has (""ex `3.10` resolving to `3.1`, Norway `NO` resolving to `false`, etc.). It also mimics `pyproject.toml` which is nice. Happy to change my mind about this however ### Plugin config will be difficult Plugin config is currently in `metadata.json` in two places: 1. Top level, under `""plugins.[plugin-name]""`. This fits well into `datasette.toml` as `[plugins.plugin-name]` 2. Table level, under `""databases.[db-name].tables.[table-name].plugins.[plugin-name]`. This doesn't fit that well into `datasette.toml`, unless it's nested under `[metadata]`? ### Extensions, static, one-off plugins? We could also include equivalents of `--plugins-dir`, `--static`, and `--load-extension` into `datasette.toml`, but I'd imagine there's a few security concerns there to think through. ### Explicitly list with plugins to use? I believe Datasette by default will load all install plugins on startup, but maybe `datasette.toml` can specify a list of plugins to use? For example, a dev version of `datasette.toml` can specify `datasette-pretty-traces`, but the prod version can leave it out",107914493,issue,,,"{""url"": ""https://api.github.com/repos/simonw/datasette/issues/2093/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 1560982210,PR_kwDOBm6k_c5IvYKw,2008,array facet: don't materialize unnecessary columns,193185,open,0,,,8,2023-01-28T19:33:40Z,2023-01-29T18:17:40Z,,CONTRIBUTOR,simonw/datasette/pulls/2008,"The presence of `inner.*` causes SQLite to materialize a row with all the columns. Those columns will be discarded later. Instead, we can select only the column we'll use. This lets SQLite's optimizer realize that the other columns in the CTE definition aren't needed. On a test table with 278K rows, 98K of which had an array, this speeds up the facet calculation from 4 sec to 1 sec. ---- :books: Documentation preview :books:: https://datasette--2008.org.readthedocs.build/en/2008/ ",107914493,pull,,,"{""url"": ""https://api.github.com/repos/simonw/datasette/issues/2008/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",0, 1453813400,I_kwDOBm6k_c5Wp26Y,1901,"Some plugins show ""home"" breadcrumbs twice in the top left",95570,closed,0,,,8,2022-11-17T18:44:58Z,2022-11-18T07:22:37Z,2022-11-18T07:02:56Z,CONTRIBUTOR,," ",107914493,issue,,,"{""url"": ""https://api.github.com/repos/simonw/datasette/issues/1901/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed 1160034488,I_kwDOCGYnMM5FJLi4,411,Support for generated columns,25778,open,0,,,8,2022-03-04T20:41:33Z,2022-03-11T22:32:43Z,,CONTRIBUTOR,,"This is a fairly new feature -- SQLite version 3.31.0 (2020-01-22) -- that I, admittedly, haven't gotten to work yet. But it looks _incredibly_ useful: https://dgl.cx/2020/06/sqlite-json-support I'm not sure if this is an option on `add-column` or a separate command like `add-generated-column`. Either way, it needs an argument to populate it. It could be something like this: ```sh sqlite-utils add-column data.db table-name generated --as 'json_extract(data, ""$.field"")' --virtual ``` More here: https://www.sqlite.org/gencol.html",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/411/reactions"", ""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 1077102934,I_kwDOCGYnMM5AM0lW,353,"Allow passing a file of code to ""sqlite-utils convert""",536941,closed,0,,,8,2021-12-10T18:06:14Z,2021-12-11T01:38:29Z,2021-12-11T01:09:39Z,CONTRIBUTOR,,"sqlite-utils is so nice, but the ergonomics of the multiline code in kind of tough. It's really hard (maybe impossible) to make the newlines play well with Makefiles. it would be great to write your code fragment in a separate file and direct it into the sqlite-utils either like ```sqlite-utils convert my.db my_table my_column < custom_code.py``` or ```sqlite-utils convert my.db my_table my_column --custom-code=custom_code.py``` Thanks, as ever, for these great tools!",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/353/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed 610517472,MDU6SXNzdWU2MTA1MTc0NzI=,103,sqlite3.OperationalError: too many SQL variables in insert_all when using rows with varying numbers of columns,32605365,closed,0,,,8,2020-05-01T02:26:14Z,2020-05-14T00:18:57Z,2020-05-14T00:18:57Z,CONTRIBUTOR,,"If using insert_all to put in 1000 rows of data with varying number of columns, it comes up with this message `sqlite3.OperationalError: too many SQL variables` if the number of columns is larger in later records (past the first row) I've reduced `SQLITE_MAX_VARS` by 100 to 899 at the top of `db.py` to add wiggle room, so that if the column count increases it wont go past SQLite's batch limit as calculated by this line of code based on the count of the first row's dict keys batch_size = max(1, min(batch_size, SQLITE_MAX_VARS // num_columns))",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/103/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed 541331755,MDExOlB1bGxSZXF1ZXN0MzU2MDA0MjQy,653,allow leading comments in SQL input field,418191,closed,0,,,8,2019-12-21T14:19:52Z,2020-02-05T02:35:41Z,2020-02-05T02:13:25Z,CONTRIBUTOR,simonw/datasette/pulls/653,"this changes the SQL validation to allow for lines that are commented out my main use case for this is that I like to write a succession of queries when trying to solve a problem. In most native SQL clients there is a key binding that will run just the current highlighted query or the program is smart enough to run just the query that the cursor is in if it's properly delimited with a ';'. Typically my workflow will start with a single simple query and I'll copy/paste it to a new query below when I want to make big changes while debugging. This makes it easy to go back to a working version above when the query doesn't work. Since datasette sends the whole query to the DB I have to comment out the older queries by prefixing each line with `--`. This gets caught by the validators when I use my typical strategy of copy/pasting each successive query below the last one. so this is just a simple fix to allow for a query to be sent to the DB with leading comments. ",107914493,pull,,,"{""url"": ""https://api.github.com/repos/simonw/datasette/issues/653/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",0, 434321685,MDExOlB1bGxSZXF1ZXN0MjcxMzM4NDA1,434,"""datasette publish cloudrun"" command to publish to Google Cloud Run",10352819,closed,0,,,8,2019-04-17T14:41:18Z,2019-05-03T21:50:44Z,2019-05-03T13:59:02Z,CONTRIBUTOR,simonw/datasette/pulls/434,"This is a very rough draft to start a discussion on a possible datasette cloud run publish plugin (see issue #400). The main change was to dynamically set the listening port in `make_dockerfile` to satisfy cloud run's [requirements](https://cloud.google.com/run/docs/reference/container-contract). This was done by running `datasette` through `sh` to get environment variable substitution. Not sure if that's the right approach? ",107914493,pull,,,"{""url"": ""https://api.github.com/repos/simonw/datasette/issues/434/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",0, 438437973,MDExOlB1bGxSZXF1ZXN0Mjc0NDY4ODM2,441,Add register_output_renderer hook,45057,closed,0,,,8,2019-04-29T18:03:21Z,2019-05-01T23:01:57Z,2019-05-01T23:01:57Z,CONTRIBUTOR,simonw/datasette/pulls/441,"This changeset refactors out the JSON renderer and then adds a hook and dispatcher system to allow custom output renderers to be registered. The CSV output renderer is untouched because supporting streaming renderers through this system would be significantly more complex, and probably not worthwhile. We can't simply allow hooks to be called at request time because we need a list of supported file extensions when the request is being routed in order to resolve ambiguous database/table names. So, renderers need to be registered at startup. I've tried to make this API independent of Sanic's request/response objects so that this can remain stable during the switch to ASGI. I'm using dictionaries to keep it simple and to make adding additional options in the future easy. Fixes #440",107914493,pull,,,"{""url"": ""https://api.github.com/repos/simonw/datasette/issues/441/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",0,