{"html_url": "https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006311742", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/361", "id": 1006311742, "node_id": "IC_kwDOCGYnMM47-xk-", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-01-06T06:12:19Z", "updated_at": "2022-01-06T06:12:19Z", "author_association": "OWNER", "body": "Got that working:\r\n```\r\n% echo 'This is cool' | sqlite-utils insert words.db words - --text --convert '({\"word\": w} for w in text.split())'\r\n% sqlite-utils dump words.db                                                                                       \r\nBEGIN TRANSACTION;\r\nCREATE TABLE [words] (\r\n   [word] TEXT\r\n);\r\nINSERT INTO \"words\" VALUES('This');\r\nINSERT INTO \"words\" VALUES('is');\r\nINSERT INTO \"words\" VALUES('cool');\r\nCOMMIT;\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1094890366, "label": "--lines and --text and --convert and --import"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006309834", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/361", "id": 1006309834, "node_id": "IC_kwDOCGYnMM47-xHK", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-01-06T06:08:01Z", "updated_at": "2022-01-06T06:08:01Z", "author_association": "OWNER", "body": "For `--text` the conversion function should be allowed to return an iterable instead of a dictionary, in which case it will be treated as the full list of records to be inserted.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1094890366, "label": "--lines and --text and --convert and --import"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006301546", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/361", "id": 1006301546, "node_id": "IC_kwDOCGYnMM47-vFq", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-01-06T05:44:47Z", "updated_at": "2022-01-06T05:44:47Z", "author_association": "OWNER", "body": "Just need documentation for `--convert` now against the various different types of input.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1094890366, "label": "--lines and --text and --convert and --import"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006300280", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/361", "id": 1006300280, "node_id": "IC_kwDOCGYnMM47-ux4", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-01-06T05:40:45Z", "updated_at": "2022-01-06T05:40:45Z", "author_association": "OWNER", "body": "I'm going to rename `--all` to `--text`:\r\n\r\n> - Use `--text` to write the entire input to a column called \"text\"\r\n\r\nTo avoid that clash with Python's `all()` function.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1094890366, "label": "--lines and --text and --convert and --import"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006299778", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/361", "id": 1006299778, "node_id": "IC_kwDOCGYnMM47-uqC", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-01-06T05:39:10Z", "updated_at": "2022-01-06T05:39:10Z", "author_association": "OWNER", "body": "`all` is a bad variable name because it clashes with the Python `all()` built-in function.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1094890366, "label": "--lines and --text and --convert and --import"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006295276", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/361", "id": 1006295276, "node_id": "IC_kwDOCGYnMM47-tjs", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-01-06T05:26:11Z", "updated_at": "2022-01-06T05:26:11Z", "author_association": "OWNER", "body": "Here's the traceback if your `--convert` function doesn't return a dict right now:\r\n```\r\n% sqlite-utils insert /tmp/all.db blah /tmp/log.log --convert 'all.upper()' --all         \r\n\r\nTraceback (most recent call last):\r\n  File \"/Users/simon/.local/share/virtualenvs/sqlite-utils-C4Ilevlm/bin/sqlite-utils\", line 33, in <module>\r\n    sys.exit(load_entry_point('sqlite-utils', 'console_scripts', 'sqlite-utils')())\r\n  File \"/Users/simon/.local/share/virtualenvs/sqlite-utils-C4Ilevlm/lib/python3.8/site-packages/click/core.py\", line 1137, in __call__\r\n    return self.main(*args, **kwargs)\r\n  File \"/Users/simon/.local/share/virtualenvs/sqlite-utils-C4Ilevlm/lib/python3.8/site-packages/click/core.py\", line 1062, in main\r\n    rv = self.invoke(ctx)\r\n  File \"/Users/simon/.local/share/virtualenvs/sqlite-utils-C4Ilevlm/lib/python3.8/site-packages/click/core.py\", line 1668, in invoke\r\n    return _process_result(sub_ctx.command.invoke(sub_ctx))\r\n  File \"/Users/simon/.local/share/virtualenvs/sqlite-utils-C4Ilevlm/lib/python3.8/site-packages/click/core.py\", line 1404, in invoke\r\n    return ctx.invoke(self.callback, **ctx.params)\r\n  File \"/Users/simon/.local/share/virtualenvs/sqlite-utils-C4Ilevlm/lib/python3.8/site-packages/click/core.py\", line 763, in invoke\r\n    return __callback(*args, **kwargs)\r\n  File \"/Users/simon/Dropbox/Development/sqlite-utils/sqlite_utils/cli.py\", line 949, in insert\r\n    insert_upsert_implementation(\r\n  File \"/Users/simon/Dropbox/Development/sqlite-utils/sqlite_utils/cli.py\", line 834, in insert_upsert_implementation\r\n    
db[table].insert_all(\r\n  File \"/Users/simon/Dropbox/Development/sqlite-utils/sqlite_utils/db.py\", line 2602, in insert_all\r\n    first_record = next(records)\r\n  File \"/Users/simon/Dropbox/Development/sqlite-utils/sqlite_utils/db.py\", line 3044, in fix_square_braces\r\n    for record in records:\r\n  File \"/Users/simon/Dropbox/Development/sqlite-utils/sqlite_utils/cli.py\", line 831, in <genexpr>\r\n    docs = (decode_base64_values(doc) for doc in docs)\r\n  File \"/Users/simon/Dropbox/Development/sqlite-utils/sqlite_utils/utils.py\", line 86, in decode_base64_values\r\n    to_fix = [\r\n  File \"/Users/simon/Dropbox/Development/sqlite-utils/sqlite_utils/utils.py\", line 89, in <listcomp>\r\n    if isinstance(doc[k], dict)\r\nTypeError: string indices must be integers\r\n```\r\nI can live with that for the moment.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1094890366, "label": "--lines and --text and --convert and --import"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006294777", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/361", "id": 1006294777, "node_id": "IC_kwDOCGYnMM47-tb5", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-01-06T05:24:54Z", "updated_at": "2022-01-06T05:24:54Z", "author_association": "OWNER", "body": "> I added a custom error message for if the user's `--convert` code doesn't return a dict.\r\n\r\nThat turned out to be a bad idea because it meant exhausting the iterator early for the check - before we got to the `.insert_all()` code that breaks the iterator up into chunks. I tried fixing that with `itertools.tee()` to run the generator twice but that's grossly memory-inefficient for large imports.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1094890366, "label": "--lines and --text and --convert and --import"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006288444", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/361", "id": 1006288444, "node_id": "IC_kwDOCGYnMM47-r48", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-01-06T05:07:10Z", "updated_at": "2022-01-06T05:07:10Z", "author_association": "OWNER", "body": "And here's a demo of `--convert` used with `--all` - I added a custom error message for if the user's `--convert` code doesn't return a dict.\r\n\r\n```\r\n% sqlite-utils insert /tmp/all.db blah /tmp/log.log --convert 'all.upper()' --all         \r\nError: Records returned by your --convert function must be dicts\r\n% sqlite-utils insert /tmp/all.db blah /tmp/log.log --convert '{\"all\": all.upper()}' --all\r\n% sqlite-utils dump /tmp/all.db                                                           \r\nBEGIN TRANSACTION;\r\nCREATE TABLE [blah] (\r\n   [all] TEXT\r\n);\r\nINSERT INTO \"blah\" VALUES('INFO:     127.0.0.1:60581 - \"GET / HTTP/1.1\" 200 OK\r\nINFO:     127.0.0.1:60581 - \"GET /FOO/-/STATIC/APP.CSS?CEAD5A HTTP/1.1\" 200 OK\r\nINFO:     127.0.0.1:60581 - \"GET /FAVICON.ICO HTTP/1.1\" 200 OK\r\nINFO:     127.0.0.1:60581 - \"GET /FOO/TIDDLYWIKI HTTP/1.1\" 200 OK\r\nINFO:     127.0.0.1:60581 - \"GET /FOO/-/STATIC/APP.CSS?CEAD5A HTTP/1.1\" 200 OK\r\nINFO:     127.0.0.1:60584 - \"GET /FOO/-/STATIC/SQL-FORMATTER-2.3.3.MIN.JS HTTP/1.1\" 200 OK\r\nINFO:     127.0.0.1:60586 - \"GET /FOO/-/STATIC/CODEMIRROR-5.57.0.MIN.JS HTTP/1.1\" 200 OK\r\nINFO:     127.0.0.1:60585 - \"GET /FOO/-/STATIC/CODEMIRROR-5.57.0.MIN.CSS HTTP/1.1\" 200 OK\r\nINFO:     127.0.0.1:60588 - \"GET /FOO/-/STATIC/CODEMIRROR-5.57.0-SQL.MIN.JS HTTP/1.1\" 200 OK\r\nINFO:     127.0.0.1:60587 - \"GET /FOO/-/STATIC/CM-RESIZE-1.0.1.MIN.JS HTTP/1.1\" 200 OK\r\nINFO:     127.0.0.1:60586 - \"GET /FOO/TIDDLYWIKI/TIDDLERS HTTP/1.1\" 200 OK\r\nINFO:     127.0.0.1:60586 - \"GET /FOO/-/STATIC/APP.CSS?CEAD5A HTTP/1.1\" 200 OK\r\nINFO:     
127.0.0.1:60584 - \"GET /FOO/-/STATIC/TABLE.JS HTTP/1.1\" 200 OK\r\n');\r\nCOMMIT;\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1094890366, "label": "--lines and --text and --convert and --import"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006284673", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/361", "id": 1006284673, "node_id": "IC_kwDOCGYnMM47-q-B", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-01-06T04:55:52Z", "updated_at": "2022-01-06T04:55:52Z", "author_association": "OWNER", "body": "Test code that just worked for me:\r\n```\r\nsqlite-utils insert /tmp/blah.db blah /tmp/log.log --convert '\r\nbits = line.split()\r\nreturn dict([(\"b_{}\".format(i), bit) for i, bit in enumerate(bits)])' --lines\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1094890366, "label": "--lines and --text and --convert and --import"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006232013", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/361", "id": 1006232013, "node_id": "IC_kwDOCGYnMM47-eHN", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-01-06T02:21:35Z", "updated_at": "2022-01-06T02:21:35Z", "author_association": "OWNER", "body": "I'm having second thoughts about this bit:\r\n\r\n> Your Python code will be passed a \"row\" variable representing the imported row, and can return a modified row.\r\n>\r\n> If you are using `--lines` your code will be passed a \"line\" variable, and for `--all` an \"all\" variable.\r\n\r\nThe code in question is this:\r\n\r\nhttps://github.com/simonw/sqlite-utils/blob/500a35ad4d91c8a6232134ce9406efec11bedff8/sqlite_utils/utils.py#L296-L303\r\n\r\nDo I really want to add the complexity of supporting different variable names there? I think always using `value` might be better.\r\n\r\nExcept... `value` made sense for the existing `sqlite-utils convert` command where you are running a conversion function against the value for the column in the current row - is it confusing if applied to lines or documents or `all`?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1094890366, "label": "--lines and --text and --convert and --import"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006230411", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/361", "id": 1006230411, "node_id": "IC_kwDOCGYnMM47-duL", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-01-06T02:17:35Z", "updated_at": "2022-01-06T02:17:35Z", "author_association": "OWNER", "body": "Documentation: https://github.com/simonw/sqlite-utils/blob/33223856ff7fe746b7b77750fbe5b218531d0545/docs/cli.rst#inserting-unstructured-data-with---lines-and---all - I went with a single section titled \"Inserting unstructured data with --lines and --all\"", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1094890366, "label": "--lines and --text and --convert and --import"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006220129", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/361", "id": 1006220129, "node_id": "IC_kwDOCGYnMM47-bNh", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-01-06T01:52:26Z", "updated_at": "2022-01-06T01:52:26Z", "author_association": "OWNER", "body": "I'm going to refactor all of the tests for `sqlite-utils insert` into a new `test_cli_insert.py` module.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1094890366, "label": "--lines and --text and --convert and --import"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006219848", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/361", "id": 1006219848, "node_id": "IC_kwDOCGYnMM47-bJI", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-01-06T01:51:36Z", "updated_at": "2022-01-06T01:51:36Z", "author_association": "OWNER", "body": "So far I've just implemented the new help:\r\n```\r\n% sqlite-utils insert --help\r\nUsage: sqlite-utils insert [OPTIONS] PATH TABLE FILE\r\n\r\n  Insert records from FILE into a table, creating the table if it does not\r\n  already exist.\r\n\r\n  By default the input is expected to be a JSON array of objects. Or:\r\n\r\n  - Use --nl for newline-delimited JSON objects\r\n  - Use --csv or --tsv for comma-separated or tab-separated input\r\n  - Use --lines to write each incoming line to a column called \"line\"\r\n  - Use --all to write the entire input to a column called \"all\"\r\n\r\n  You can also use --convert to pass a fragment of Python code that will be\r\n  used to convert each input.\r\n\r\n  Your Python code will be passed a \"row\" variable representing the imported\r\n  row, and can return a modified row.\r\n\r\n  If you are using --lines your code will be passed a \"line\" variable, and for\r\n  --all an \"all\" variable.\r\n\r\nOptions:\r\n  --pk TEXT                 Columns to use as the primary key, e.g. 
id\r\n  --flatten                 Flatten nested JSON objects, so {\"a\": {\"b\": 1}}\r\n                            becomes {\"a_b\": 1}\r\n  --nl                      Expect newline-delimited JSON\r\n  -c, --csv                 Expect CSV input\r\n  --tsv                     Expect TSV input\r\n  --lines                   Treat each line as a single value called 'line'\r\n  --all                     Treat input as a single value called 'all'\r\n  --convert TEXT            Python code to convert each item\r\n  --import TEXT             Python modules to import\r\n  --delimiter TEXT          Delimiter to use for CSV files\r\n  --quotechar TEXT          Quote character to use for CSV/TSV\r\n  --sniff                   Detect delimiter and quote character\r\n  --no-headers              CSV file has no header row\r\n  --batch-size INTEGER      Commit every X records\r\n  --alter                   Alter existing table to add any missing columns\r\n  --not-null TEXT           Columns that should be created as NOT NULL\r\n  --default <TEXT TEXT>...  Default value that should be set for a column\r\n  --encoding TEXT           Character encoding for input, defaults to utf-8\r\n  -d, --detect-types        Detect types for columns in CSV/TSV data\r\n  --load-extension TEXT     SQLite extensions to load\r\n  --silent                  Do not show progress bar\r\n  --ignore                  Ignore records if pk already exists\r\n  --replace                 Replace records if pk already exists\r\n  --truncate                Truncate table before inserting records, if table\r\n                            already exists\r\n  -h, --help                Show this message and exit.\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1094890366, "label": "--lines and --text and --convert and --import"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/356#issuecomment-997496626", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/356", "id": 997496626, "node_id": "IC_kwDOCGYnMM47dJcy", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-20T00:38:15Z", "updated_at": "2022-01-06T01:29:03Z", "author_association": "OWNER", "body": "The implementation of this gets a tiny bit complicated.\r\n\r\nIgnoring `--convert`, the `--lines` option can internally produce `{\"line\": ...}` records and the `--all` option can produce `{\"all\": ...}` records.\r\n\r\nBut... when `--convert` is used, what should the code run against?\r\n\r\nIt could run against those already-converted records but that's a little bit strange, since you'd have to do this:\r\n\r\n    sqlite-utils insert blah.db blah myfile.txt --all --convert '{\"item\": s for s in value[\"all\"].split(\"-\")}'\r\n\r\nHaving to use `value[\"all\"]` there is unintuitive. It would be nicer to have a `all` variable to work against.\r\n\r\nBut then for `--lines` should the local variable be called `line`? And how best to summarize these different names for local variables in the inline help for the feature?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1077431957, "label": "`sqlite-utils insert --convert` option"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/360#issuecomment-1006211113", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/360", "id": 1006211113, "node_id": "IC_kwDOCGYnMM47-ZAp", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-01-06T01:27:53Z", "updated_at": "2022-01-06T01:27:53Z", "author_association": "OWNER", "body": "It looks like you were using `sqlite-utils memory` - that works by loading the entire file into an in-memory database, so 170GB is very likely to run out of RAM.\r\n\r\nThe line of code there exhibits another problem: it's reading the entire JSON file into a Python string, so it looks like it's going to run out of RAM even before it gets to the SQLite in-memory database section.\r\n\r\nTo handle a file of this size you'd need to write it to a SQLite database on-disk first. The `sqlite-utils insert` command can do this, and it should be able to \"stream\" records in from a file without loading the entire thing into memory - but only for JSON-NL and CSV/TSV formats, not for JSON arrays.\r\n\r\nThe code in question is here:\r\n\r\nhttps://github.com/simonw/sqlite-utils/blob/f3fd8613113d21d44238a6ec54b375f5aa72c4e0/sqlite_utils/cli.py#L738-L773\r\n\r\nThat's using Python generators for the CSV/TSV/JSON-NL variants... but it's doing this for regular JSON which requires reading the entire thing into memory:\r\n\r\nhttps://github.com/simonw/sqlite-utils/blob/f3fd8613113d21d44238a6ec54b375f5aa72c4e0/sqlite_utils/cli.py#L767\r\n\r\nIf you have the ability to control how your 170GB file is generated you may have more luck converting it to CSV or TSV or newline-delimited JSON, then using `sqlite-utils insert` to insert it into a database file.\r\n\r\nTo be honest though I've never tested this tooling with anything nearly that big, so it's possible you'll still run into problems. 
If you do I'd love to hear about them!\r\n\r\nI would be tempted to tackle this size of job by writing a custom Python script, either using the `sqlite_utils` Python library or even calling `sqlite3` directly.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1091819089, "label": "MemoryError"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1534#issuecomment-1005975080", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1534", "id": 1005975080, "node_id": "IC_kwDOBm6k_c479fYo", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-01-05T18:29:06Z", "updated_at": "2022-01-05T18:29:06Z", "author_association": "OWNER", "body": "A really big downside to this is that it turns out many CDNs - apparently including Cloudflare - don't support the Vary header at all!\r\n\r\nMore in this thread: https://twitter.com/simonw/status/1478470282931163137", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1065432388, "label": "Maybe return JSON from HTML pages if `Accept: application/json` is sent"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1585#issuecomment-1003575286", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1585", "id": 1003575286, "node_id": "IC_kwDOBm6k_c470Vf2", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-01-01T15:40:38Z", "updated_at": "2022-01-01T15:40:38Z", "author_association": "OWNER", "body": "API tutorial: https://firebase.google.com/docs/hosting/api-deploy", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1091838742, "label": "Fire base caching for `publish cloudrun`"}, "performed_via_github_app": null}
{"html_url": "https://github.com/dogsheep/google-takeout-to-sqlite/pull/8#issuecomment-1003437288", "issue_url": "https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/8", "id": 1003437288, "node_id": "IC_kwDODFE5qs47zzzo", "user": {"value": 28565, "label": "maxhawkins"}, "created_at": "2021-12-31T19:06:20Z", "updated_at": "2021-12-31T19:06:20Z", "author_association": "NONE", "body": "> @maxhawkins how hard would it be to add an entry to the table that includes the HTML version of the email, if it exists? I just tried your PR branch on a very small mbox file, and it worked great. My use case is a research project and I need to access more than just the body plain text.\r\n\r\nShouldn't be hard. The easiest way is probably to remove the `if body.content_type == \"text/html\"` clause from [utils.py:254](https://github.com/dogsheep/google-takeout-to-sqlite/pull/8/commits/8e6d487b697ce2e8ad885acf613a157bfba84c59#diff-25ad9dd1ced1b8bfc37fda8444819c803232c08891e4af3d4064aa205d8174eaR254) and just return content directly without parsing.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 954546309, "label": "Add Gmail takeout mbox import (v2)"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1583#issuecomment-1002825217", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1583", "id": 1002825217, "node_id": "IC_kwDOBm6k_c47xeYB", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2021-12-30T00:34:16Z", "updated_at": "2021-12-30T00:34:16Z", "author_association": "CONTRIBUTOR", "body": "if that is not desirable, it might be good to document that users might want to set up a lifecycle rule to automatically delete these build artifacts. something like https://stackoverflow.com/questions/59937542/can-i-delete-container-images-from-google-cloud-storage-artifacts-bucket", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1090810196, "label": "consider adding deletion step of cloudbuild artifacts to gcloud publish"}, "performed_via_github_app": null}
{"html_url": "https://github.com/dogsheep/google-takeout-to-sqlite/pull/8#issuecomment-1002735370", "issue_url": "https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/8", "id": 1002735370, "node_id": "IC_kwDODFE5qs47xIcK", "user": {"value": 203343, "label": "Btibert3"}, "created_at": "2021-12-29T18:58:23Z", "updated_at": "2021-12-29T18:58:23Z", "author_association": "NONE", "body": "@maxhawkins how hard would it be to add an entry to the table that includes the HTML version of the email, if it exists?  I just tried your PR branch on a very small mbox file, and it worked great.  My use case is a research project and I need to access more than just the body plain text.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 954546309, "label": "Add Gmail takeout mbox import (v2)"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1152#issuecomment-1001791592", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1152", "id": 1001791592, "node_id": "IC_kwDOBm6k_c47tiBo", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-27T23:04:31Z", "updated_at": "2021-12-27T23:04:31Z", "author_association": "OWNER", "body": "Another option: rethink permissions to always work in terms of where clauses used as part of a SQL query that returns the overall allowed set of databases or tables. This would require rethinking existing permissions but it might be worthwhile prior to 1.0.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 770598024, "label": "Efficiently calculate list of databases/tables a user can view"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/878#issuecomment-1001699559", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/878", "id": 1001699559, "node_id": "IC_kwDOBm6k_c47tLjn", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-27T18:53:04Z", "updated_at": "2021-12-27T18:53:04Z", "author_association": "OWNER", "body": "I'm going to see if I can come up with the simplest possible version of this pattern for the `/-/metadata` and `/-/metadata.json` page, then try it for the database query page, before tackling the much more complex table page.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 648435885, "label": "New pattern for views that return either JSON or HTML, available for plugins"}, "performed_via_github_app": null}
{"html_url": "https://github.com/dogsheep/twitter-to-sqlite/issues/62#issuecomment-1001222213", "issue_url": "https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/62", "id": 1001222213, "node_id": "IC_kwDODEm0Qs47rXBF", "user": {"value": 6764957, "label": "swyxio"}, "created_at": "2021-12-26T17:59:25Z", "updated_at": "2021-12-26T17:59:25Z", "author_association": "NONE", "body": "just confirmed that this error does not occur when i use my public main account. gets more interesting!", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1088816961, "label": "KeyError: 'created_at' for private accounts?"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/228#issuecomment-1001115286", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/228", "id": 1001115286, "node_id": "IC_kwDOCGYnMM47q86W", "user": {"value": 1206106, "label": "agguser"}, "created_at": "2021-12-26T07:01:31Z", "updated_at": "2021-12-26T07:01:31Z", "author_association": "NONE", "body": "`--no-headers` does not work?\r\n```\r\n$ echo 'a,1\\nb,2' | sqlite-utils memory --no-headers -t - 'select * from stdin'\r\na      1                                                                                                                             \r\n---  ---                                                                                                                             \r\nb      2 \r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 807437089, "label": "--no-headers option for CSV and TSV"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1576#issuecomment-1000935523", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1576", "id": 1000935523, "node_id": "IC_kwDOBm6k_c47qRBj", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-24T21:33:05Z", "updated_at": "2021-12-24T21:33:05Z", "author_association": "OWNER", "body": "Another option would be to attempt to import `contextvars` and, if the import fails (for Python 3.6) continue using the current mechanism - then let Python 3.6 users know in the documentation that under Python 3.6 they will miss out on nested traces.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1087181951, "label": "Traces should include SQL executed by subtasks created with `asyncio.gather`"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1577#issuecomment-1000673444", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1577", "id": 1000673444, "node_id": "IC_kwDOBm6k_c47pRCk", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-24T06:08:58Z", "updated_at": "2021-12-24T06:08:58Z", "author_association": "OWNER", "body": "https://pypistats.org/packages/datasette shows a breakdown of downloads by Python version:\r\n\r\n<img width=\"986\" alt=\"image\" src=\"https://user-images.githubusercontent.com/9599/147323253-1ee22d93-3be2-472b-8ead-495d925958e5.png\">\r\n\r\nIt looks like on a recent day I had 4,071 downloads from Python 3.7... and just 2 downloads from Python 3.6!", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1087913724, "label": "Drop support for Python 3.6"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1534#issuecomment-1000535904", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1534", "id": 1000535904, "node_id": "IC_kwDOBm6k_c47ovdg", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-23T21:44:31Z", "updated_at": "2021-12-23T21:44:31Z", "author_association": "OWNER", "body": "A big downside to this is that I would need to use `Vary: Accept` for when Datasette is running behind a cache such as Cloudflare - would that greatly reduce overall cache efficiency due to subtle variations in the accept headers sent by common browsers?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1065432388, "label": "Maybe return JSON from HTML pages if `Accept: application/json` is sent"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1579#issuecomment-1000485719", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1579", "id": 1000485719, "node_id": "IC_kwDOBm6k_c47ojNX", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-23T19:19:45Z", "updated_at": "2021-12-23T19:19:45Z", "author_association": "OWNER", "body": "All of those removed `block=True` lines in 8c401ee0f054de2f568c3a8302c9223555146407 really help confirm to me that this was a good decision.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1087931918, "label": "`.execute_write(... block=True)` should be the default behaviour"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1579#issuecomment-1000485505", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1579", "id": 1000485505, "node_id": "IC_kwDOBm6k_c47ojKB", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-23T19:19:13Z", "updated_at": "2021-12-23T19:19:13Z", "author_association": "OWNER", "body": "Updated docs for `execute_write_fn()`: https://github.com/simonw/datasette/blob/75153ea9b94d09ec3d61f7c6ebdf378e0c0c7a0b/docs/internals.rst#await-dbexecute_write_fnfn-blocktrue", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1087931918, "label": "`.execute_write(... block=True)` should be the default behaviour"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1579#issuecomment-1000481686", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1579", "id": 1000481686, "node_id": "IC_kwDOBm6k_c47oiOW", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-23T19:09:23Z", "updated_at": "2021-12-23T19:09:23Z", "author_association": "OWNER", "body": "Re-opening this because I missed updating some of the docs, and I also need to update Datasette's own code to not use `block=True` in a bunch of places.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1087931918, "label": "`.execute_write(... block=True)` should be the default behaviour"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1579#issuecomment-1000479737", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1579", "id": 1000479737, "node_id": "IC_kwDOBm6k_c47ohv5", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-23T19:04:23Z", "updated_at": "2021-12-23T19:04:23Z", "author_association": "OWNER", "body": "Updated documentation: https://github.com/simonw/datasette/blob/00a2895cd2dc42c63846216b36b2dc9f41170129/docs/internals.rst#await-dbexecute_writesql-paramsnone-blocktrue", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1087931918, "label": "`.execute_write(... block=True)` should be the default behaviour"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1579#issuecomment-1000477813", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1579", "id": 1000477813, "node_id": "IC_kwDOBm6k_c47ohR1", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-23T18:59:41Z", "updated_at": "2021-12-23T18:59:41Z", "author_association": "OWNER", "body": "I'm going to go with `execute_write(..., block=False)` as the mechanism for fire-and-forget write queries.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1087931918, "label": "`.execute_write(... block=True)` should be the default behaviour"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1579#issuecomment-1000477621", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1579", "id": 1000477621, "node_id": "IC_kwDOBm6k_c47ohO1", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-23T18:59:12Z", "updated_at": "2021-12-23T18:59:12Z", "author_association": "OWNER", "body": "The easiest way to change this would be to default to `block=True` such that you need to pass `block=False` to the APIs to have them do fire-and-forget.\r\n\r\nAn alternative would be to add new, separately named methods which do the fire-and-forget thing.\r\n\r\nIf I hadn't recently added `execute_write_script` and `execute_write_many` in #1570 I'd be more into this idea, but I don't want to end up with eight methods - `execute_write`, `execute_write_queue`, `execute_write_many`, `execute_write_many_queue`, `execute_write_script`, `execute_write_script_queue`, `execute_write_fn`, `execute_write_fn_queue`.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1087931918, "label": "`.execute_write(... block=True)` should be the default behaviour"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1579#issuecomment-1000476413", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1579", "id": 1000476413, "node_id": "IC_kwDOBm6k_c47og79", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-23T18:56:06Z", "updated_at": "2021-12-23T18:56:06Z", "author_association": "OWNER", "body": "This is technically a breaking change, but a GitHub code search at https://cs.github.com/?scopeName=All+repos&scope=&q=execute_write%20datasette%20-owner%3Asimonw shows only one repo not-owned-by-me using this, and they're using `block=True`: https://github.com/mfa/datasette-webhook-write/blob/e82440f372a2f2e3ed27d1bd34c9fa3a53b49b94/datasette_webhook_write/__init__.py#L88-L89", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1087931918, "label": "`.execute_write(... block=True)` should be the default behaviour"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1578#issuecomment-1000471782", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1578", "id": 1000471782, "node_id": "IC_kwDOBm6k_c47ofzm", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-23T18:44:01Z", "updated_at": "2021-12-23T18:44:01Z", "author_association": "OWNER", "body": "The example nginx config on https://docs.datasette.io/en/stable/deploying.html#nginx-proxy-configuration is currently:\r\n\r\n```\r\ndaemon off;\r\n\r\nevents {\r\n  worker_connections  1024;\r\n}\r\nhttp {\r\n  server {\r\n    listen 80;\r\n    location /my-datasette {\r\n      proxy_pass http://127.0.0.1:8009/my-datasette;\r\n      proxy_set_header Host $host;\r\n    }\r\n  }\r\n}\r\n```\r\nThis looks to me like it might exhibit the bug. Need to confirm that and figure out an alternative.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1087919372, "label": "Confirm if documented nginx proxy config works for row pages with escaped characters in their primary key"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1578#issuecomment-1000471371", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1578", "id": 1000471371, "node_id": "IC_kwDOBm6k_c47oftL", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-23T18:42:50Z", "updated_at": "2021-12-23T18:42:50Z", "author_association": "OWNER", "body": "Confirmed, that fixed the bug for me on my server.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1087919372, "label": "Confirm if documented nginx proxy config works for row pages with escaped characters in their primary key"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1578#issuecomment-1000470652", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1578", "id": 1000470652, "node_id": "IC_kwDOBm6k_c47ofh8", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-23T18:40:46Z", "updated_at": "2021-12-23T18:40:46Z", "author_association": "OWNER", "body": "[This Server Fault answer](https://serverfault.com/a/463932) suggests that the fix is to change this:\r\n\r\n    proxy_pass http://127.0.0.1:8000/;\r\n\r\nTo this:\r\n\r\n    proxy_pass http://127.0.0.1:8000;\r\n\r\nQuoting the nginx documentation: http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_pass\r\n\r\n> A request URI is passed to the server as follows:\r\n> \r\n> -   If the `proxy_pass` directive is specified with a URI, then when a request is passed to the server, the part of a [normalized](http://nginx.org/en/docs/http/ngx_http_core_module.html#location) request URI matching the location is replaced by a URI specified in the directive:\r\n> \r\n>         location /name/ {\r\n>             proxy_pass http://127.0.0.1/remote/;\r\n>         }\r\n> \r\n> -   If `proxy_pass` is specified without a URI, the request URI is passed to the server in the same form as sent by a client when the original request is processed, or the full normalized request URI is passed when processing the changed URI:\r\n> \r\n>         location /some/path/ {\r\n>             proxy_pass http://127.0.0.1;\r\n>         }", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1087919372, "label": "Confirm if documented nginx proxy config works for row pages with escaped characters in their primary key"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1578#issuecomment-1000469107", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1578", "id": 1000469107, "node_id": "IC_kwDOBm6k_c47ofJz", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-23T18:36:38Z", "updated_at": "2021-12-23T18:36:38Z", "author_association": "OWNER", "body": "This problem doesn't occur on my `localhost` running Uvicorn directly - but I'm seeing it in my production environment that runs Datasette behind an nginx proxy:\r\n\r\n```\r\n    location / {\r\n        proxy_pass http://127.0.0.1:8000/;\r\n\tproxy_set_header Host $host;\r\n    }\r\n```\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1087919372, "label": "Confirm if documented nginx proxy config works for row pages with escaped characters in their primary key"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1577#issuecomment-1000462309", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1577", "id": 1000462309, "node_id": "IC_kwDOBm6k_c47odfl", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-23T18:20:46Z", "updated_at": "2021-12-23T18:20:46Z", "author_association": "OWNER", "body": "There are a lot of improvements to `asyncio` in 3.7: https://docs.python.org/3/whatsnew/3.7.html#whatsnew37-asyncio", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1087913724, "label": "Drop support for Python 3.6"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1577#issuecomment-1000461900", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1577", "id": 1000461900, "node_id": "IC_kwDOBm6k_c47odZM", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-23T18:19:44Z", "updated_at": "2021-12-23T18:19:44Z", "author_association": "OWNER", "body": "The 3.7 feature I want to use today is [contextvars](https://docs.python.org/3/library/contextvars.html) - but I have a workaround for the moment, see https://github.com/simonw/datasette/issues/1576#issuecomment-999987418\r\n\r\nSo I'm going to hold off on dropping 3.6 for a little bit longer. I imagine I'll drop it before Datasette 1.0 though.\r\n\r\nLeaving this issue open to gather thoughts and feedback on this issue from Datasette users and potential users.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1087913724, "label": "Drop support for Python 3.6"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1577#issuecomment-1000461275", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1577", "id": 1000461275, "node_id": "IC_kwDOBm6k_c47odPb", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-23T18:18:11Z", "updated_at": "2021-12-23T18:18:11Z", "author_association": "OWNER", "body": "From the Twitter thread, there are still a decent number of LTS Linux releases out there that are stuck on pre-3.7 Python.\r\n\r\nThough many of those are 3.5 and Datasette dropped support for 3.5 in November 2019: cf7776d36fbacefa874cbd6e5fcdc9fff7661203", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1087913724, "label": "Drop support for Python 3.6"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1576#issuecomment-999990414", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1576", "id": 999990414, "node_id": "IC_kwDOBm6k_c47mqSO", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-23T02:08:39Z", "updated_at": "2021-12-23T18:16:35Z", "author_association": "OWNER", "body": "It's tiny: I'm tempted to vendor it. https://github.com/Skyscanner/aiotask-context/blob/master/aiotask_context/__init__.py\r\n\r\nNo, I'll add it as a pinned dependency, which I can then drop when I drop 3.6 support.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1087181951, "label": "Traces should include SQL executed by subtasks created with `asyncio.gather`"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1576#issuecomment-999987418", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1576", "id": 999987418, "node_id": "IC_kwDOBm6k_c47mpja", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-23T01:59:58Z", "updated_at": "2021-12-23T02:02:12Z", "author_association": "OWNER", "body": "Another option: https://github.com/Skyscanner/aiotask-context - looks like it might be better as it's been updated for Python 3.7 in this commit https://github.com/Skyscanner/aiotask-context/commit/67108c91d2abb445655cc2af446fdb52ca7890c4\r\n\r\nThe Skyscanner one doesn't attempt to wrap any existing factories, but that's OK for my purposes since I don't need to handle arbitrary `asyncio` code written by other people.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1087181951, "label": "Traces should include SQL executed by subtasks created with `asyncio.gather`"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1576#issuecomment-999876666", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1576", "id": 999876666, "node_id": "IC_kwDOBm6k_c47mOg6", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-22T20:59:22Z", "updated_at": "2021-12-22T21:18:09Z", "author_association": "OWNER", "body": "This article is relevant: [Context information storage for asyncio](https://blog.sqreen.com/asyncio/) - in particular the section https://blog.sqreen.com/asyncio/#context-inheritance-between-tasks which describes exactly the problem I have and their solution, which involves this trickery:\r\n\r\n```python\r\ndef request_task_factory(loop, coro):\r\n    child_task = asyncio.tasks.Task(coro, loop=loop)\r\n    parent_task = asyncio.Task.current_task(loop=loop)\r\n    current_request = getattr(parent_task, 'current_request', None)\r\n    setattr(child_task, 'current_request', current_request)\r\n    return child_task\r\n\r\nloop = asyncio.get_event_loop()\r\nloop.set_task_factory(request_task_factory)\r\n```\r\n\r\nThey released their solution as a library: https://pypi.org/project/aiocontext/ and https://github.com/sqreen/AioContext - but that company was acquired by Datadog back in April and doesn't seem to be actively maintaining their open source stuff any more: https://twitter.com/SqreenIO/status/1384906075506364417", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1087181951, "label": "Traces should include SQL executed by subtasks created with `asyncio.gather`"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1576#issuecomment-999878907", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1576", "id": 999878907, "node_id": "IC_kwDOBm6k_c47mPD7", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-22T21:03:49Z", "updated_at": "2021-12-22T21:10:46Z", "author_association": "OWNER", "body": "`contextvars` can solve this but they were introduced in Python 3.7: https://www.python.org/dev/peps/pep-0567/\r\n\r\nPython 3.6 support ends in a few days' time, and it looks like Glitch has updated to 3.7 now - so maybe I can get away with Datasette needing 3.7 these days?\r\n\r\nTweeted about that here: https://twitter.com/simonw/status/1473761478155010048", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1087181951, "label": "Traces should include SQL executed by subtasks created with `asyncio.gather`"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1576#issuecomment-999874886", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1576", "id": 999874886, "node_id": "IC_kwDOBm6k_c47mOFG", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-22T20:55:42Z", "updated_at": "2021-12-22T20:57:28Z", "author_association": "OWNER", "body": "One way to solve this would be to introduce a `set_task_id()` method, which sets an ID which will be returned by `get_task_id()` instead of using `id(current_task(loop=loop))`.\r\n\r\nIt would be really nice if I could solve this using `with` syntax somehow. Something like:\r\n```python\r\nwith trace_child_tasks():\r\n    (\r\n        suggested_facets,\r\n        (facet_results, facets_timed_out),\r\n    ) = await asyncio.gather(\r\n        execute_suggested_facets(),\r\n        execute_facets(),\r\n    )\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1087181951, "label": "Traces should include SQL executed by subtasks created with `asyncio.gather`"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1576#issuecomment-999874484", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1576", "id": 999874484, "node_id": "IC_kwDOBm6k_c47mN-0", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-22T20:54:52Z", "updated_at": "2021-12-22T20:54:52Z", "author_association": "OWNER", "body": "Here's the full current relevant code from `tracer.py`: https://github.com/simonw/datasette/blob/ace86566b28280091b3844cf5fbecd20158e9004/datasette/tracer.py#L8-L64\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1087181951, "label": "Traces should include SQL executed by subtasks created with `asyncio.gather`"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1518#issuecomment-999870993", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1518", "id": 999870993, "node_id": "IC_kwDOBm6k_c47mNIR", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-22T20:47:18Z", "updated_at": "2021-12-22T20:50:24Z", "author_association": "OWNER", "body": "The reason they aren't showing up in the traces is that traces are stored just for the currently executing `asyncio` task ID: https://github.com/simonw/datasette/blob/ace86566b28280091b3844cf5fbecd20158e9004/datasette/tracer.py#L13-L25\r\n\r\nThis is so traces for other incoming requests don't end up mixed together. But there's no current mechanism to track async tasks that are effectively \"child tasks\" of the current request, and hence should be tracked the same.\r\n\r\nhttps://stackoverflow.com/a/69349501/6083 suggests that you pass the task ID as an argument to the child tasks that are executed using `asyncio.gather()` to work around this kind of problem.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058072543, "label": "Complete refactor of TableView and table.html template"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1518#issuecomment-999870282", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1518", "id": 999870282, "node_id": "IC_kwDOBm6k_c47mM9K", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-22T20:45:56Z", "updated_at": "2021-12-22T20:46:08Z", "author_association": "OWNER", "body": "> New short-term goal: get facets and suggested facets to execute in parallel with the main query. Generate a trace graph that proves that is happening using `datasette-pretty-traces`.\r\n\r\nI wrote code to execute those in parallel using `asyncio.gather()` - which seems to work but causes the SQL run inside the parallel `async def` functions not to show up in the trace graph at all.\r\n\r\n```diff\r\ndiff --git a/datasette/views/table.py b/datasette/views/table.py\r\nindex 9808fd2..ec9db64 100644\r\n--- a/datasette/views/table.py\r\n+++ b/datasette/views/table.py\r\n@@ -1,3 +1,4 @@\r\n+import asyncio\r\n import urllib\r\n import itertools\r\n import json\r\n@@ -615,44 +616,37 @@ class TableView(RowTableShared):\r\n         if request.args.get(\"_timelimit\"):\r\n             extra_args[\"custom_time_limit\"] = int(request.args.get(\"_timelimit\"))\r\n \r\n-        # Execute the main query!\r\n-        results = await db.execute(sql, params, truncate=True, **extra_args)\r\n-\r\n-        # Calculate the total count for this query\r\n-        filtered_table_rows_count = None\r\n-        if (\r\n-            not db.is_mutable\r\n-            and self.ds.inspect_data\r\n-            and count_sql == f\"select count(*) from {table} \"\r\n-        ):\r\n-            # We can use a previously cached table row count\r\n-            try:\r\n-                filtered_table_rows_count = self.ds.inspect_data[database][\"tables\"][\r\n-                    table\r\n-                ][\"count\"]\r\n-            except KeyError:\r\n-                pass\r\n-\r\n-        # Otherwise run a select count(*) ...\r\n-    
    if count_sql and filtered_table_rows_count is None and not nocount:\r\n-            try:\r\n-                count_rows = list(await db.execute(count_sql, from_sql_params))\r\n-                filtered_table_rows_count = count_rows[0][0]\r\n-            except QueryInterrupted:\r\n-                pass\r\n-\r\n-        # Faceting\r\n-        if not self.ds.setting(\"allow_facet\") and any(\r\n-            arg.startswith(\"_facet\") for arg in request.args\r\n-        ):\r\n-            raise BadRequest(\"_facet= is not allowed\")\r\n+        async def execute_count():\r\n+            # Calculate the total count for this query\r\n+            filtered_table_rows_count = None\r\n+            if (\r\n+                not db.is_mutable\r\n+                and self.ds.inspect_data\r\n+                and count_sql == f\"select count(*) from {table} \"\r\n+            ):\r\n+                # We can use a previously cached table row count\r\n+                try:\r\n+                    filtered_table_rows_count = self.ds.inspect_data[database][\r\n+                        \"tables\"\r\n+                    ][table][\"count\"]\r\n+                except KeyError:\r\n+                    pass\r\n+\r\n+            if count_sql and filtered_table_rows_count is None and not nocount:\r\n+                try:\r\n+                    count_rows = list(await db.execute(count_sql, from_sql_params))\r\n+                    filtered_table_rows_count = count_rows[0][0]\r\n+                except QueryInterrupted:\r\n+                    pass\r\n+\r\n+            return filtered_table_rows_count\r\n+\r\n+        filtered_table_rows_count = await execute_count()\r\n \r\n         # pylint: disable=no-member\r\n         facet_classes = list(\r\n             itertools.chain.from_iterable(pm.hook.register_facet_classes())\r\n         )\r\n-        facet_results = {}\r\n-        facets_timed_out = []\r\n         facet_instances = []\r\n         for klass in facet_classes:\r\n           
  facet_instances.append(\r\n@@ -668,33 +662,58 @@ class TableView(RowTableShared):\r\n                 )\r\n             )\r\n \r\n-        if not nofacet:\r\n-            for facet in facet_instances:\r\n-                (\r\n-                    instance_facet_results,\r\n-                    instance_facets_timed_out,\r\n-                ) = await facet.facet_results()\r\n-                for facet_info in instance_facet_results:\r\n-                    base_key = facet_info[\"name\"]\r\n-                    key = base_key\r\n-                    i = 1\r\n-                    while key in facet_results:\r\n-                        i += 1\r\n-                        key = f\"{base_key}_{i}\"\r\n-                    facet_results[key] = facet_info\r\n-                facets_timed_out.extend(instance_facets_timed_out)\r\n-\r\n-        # Calculate suggested facets\r\n-        suggested_facets = []\r\n-        if (\r\n-            self.ds.setting(\"suggest_facets\")\r\n-            and self.ds.setting(\"allow_facet\")\r\n-            and not _next\r\n-            and not nofacet\r\n-            and not nosuggest\r\n-        ):\r\n-            for facet in facet_instances:\r\n-                suggested_facets.extend(await facet.suggest())\r\n+        async def execute_suggested_facets():\r\n+            # Calculate suggested facets\r\n+            suggested_facets = []\r\n+            if (\r\n+                self.ds.setting(\"suggest_facets\")\r\n+                and self.ds.setting(\"allow_facet\")\r\n+                and not _next\r\n+                and not nofacet\r\n+                and not nosuggest\r\n+            ):\r\n+                for facet in facet_instances:\r\n+                    suggested_facets.extend(await facet.suggest())\r\n+            return suggested_facets\r\n+\r\n+        async def execute_facets():\r\n+            facet_results = {}\r\n+            facets_timed_out = []\r\n+            if not self.ds.setting(\"allow_facet\") and any(\r\n+ 
               arg.startswith(\"_facet\") for arg in request.args\r\n+            ):\r\n+                raise BadRequest(\"_facet= is not allowed\")\r\n+\r\n+            if not nofacet:\r\n+                for facet in facet_instances:\r\n+                    (\r\n+                        instance_facet_results,\r\n+                        instance_facets_timed_out,\r\n+                    ) = await facet.facet_results()\r\n+                    for facet_info in instance_facet_results:\r\n+                        base_key = facet_info[\"name\"]\r\n+                        key = base_key\r\n+                        i = 1\r\n+                        while key in facet_results:\r\n+                            i += 1\r\n+                            key = f\"{base_key}_{i}\"\r\n+                        facet_results[key] = facet_info\r\n+                    facets_timed_out.extend(instance_facets_timed_out)\r\n+\r\n+            return facet_results, facets_timed_out\r\n+\r\n+        # Execute the main query, facets and facet suggestions in parallel:\r\n+        (\r\n+            results,\r\n+            suggested_facets,\r\n+            (facet_results, facets_timed_out),\r\n+        ) = await asyncio.gather(\r\n+            db.execute(sql, params, truncate=True, **extra_args),\r\n+            execute_suggested_facets(),\r\n+            execute_facets(),\r\n+        )\r\n+\r\n+        results = await db.execute(sql, params, truncate=True, **extra_args)\r\n \r\n         # Figure out columns and rows for the query\r\n         columns = [r[0] for r in results.description]\r\n```\r\nHere's the trace for `http://127.0.0.1:4422/fixtures/compound_three_primary_keys?_trace=1&_facet=pk1&_facet=pk2` with the missing facet and facet suggestion queries:\r\n\r\n<img width=\"1447\" alt=\"image\" src=\"https://user-images.githubusercontent.com/9599/147153051-62cdb9a5-de5e-4cc3-9215-b779f92a81c8.png\">\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, 
\"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058072543, "label": "Complete refactor of TableView and table.html template"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1518#issuecomment-999863269", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1518", "id": 999863269, "node_id": "IC_kwDOBm6k_c47mLPl", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-22T20:35:41Z", "updated_at": "2021-12-22T20:37:13Z", "author_association": "OWNER", "body": "It looks like the count has to be executed before facets can be, because the facet_class constructor needs that total count figure: https://github.com/simonw/datasette/blob/6b1384b2f529134998fb507e63307609a5b7f5c0/datasette/views/table.py#L660-L671\r\n\r\nIt's used in facet suggestion logic here: https://github.com/simonw/datasette/blob/ace86566b28280091b3844cf5fbecd20158e9004/datasette/facets.py#L172-L178", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058072543, "label": "Complete refactor of TableView and table.html template"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1518#issuecomment-999850191", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1518", "id": 999850191, "node_id": "IC_kwDOBm6k_c47mIDP", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-22T20:29:38Z", "updated_at": "2021-12-22T20:29:38Z", "author_association": "OWNER", "body": "New short-term goal: get facets and suggested facets to execute in parallel with the main query. Generate a trace graph that proves that is happening using `datasette-pretty-traces`.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058072543, "label": "Complete refactor of TableView and table.html template"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1518#issuecomment-999837569", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1518", "id": 999837569, "node_id": "IC_kwDOBm6k_c47mE-B", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-22T20:15:45Z", "updated_at": "2021-12-22T20:15:45Z", "author_association": "OWNER", "body": "Also, the whole `special_args` vs. `request.args` thing is pretty confusing; I think that might be an older code pattern from back when I was using Sanic.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058072543, "label": "Complete refactor of TableView and table.html template"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1518#issuecomment-999837220", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1518", "id": 999837220, "node_id": "IC_kwDOBm6k_c47mE4k", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-22T20:15:04Z", "updated_at": "2021-12-22T20:15:04Z", "author_association": "OWNER", "body": "I think I can move this much higher up in the method, it's a bit confusing having it half way through: https://github.com/simonw/datasette/blob/6b1384b2f529134998fb507e63307609a5b7f5c0/datasette/views/table.py#L414-L436", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058072543, "label": "Complete refactor of TableView and table.html template"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1518#issuecomment-999831967", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1518", "id": 999831967, "node_id": "IC_kwDOBm6k_c47mDmf", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-22T20:04:47Z", "updated_at": "2021-12-22T20:10:11Z", "author_association": "OWNER", "body": "I think I might be able to clean up a lot of the stuff in here using the `render_cell` plugin hook: https://github.com/simonw/datasette/blob/6b1384b2f529134998fb507e63307609a5b7f5c0/datasette/views/table.py#L87-L89\r\n\r\nThe catch with that hook - https://docs.datasette.io/en/stable/plugin_hooks.html#render-cell-value-column-table-database-datasette - is that it gets called for every single cell. I don't want the overhead of looking up the foreign key relationships etc once for every value in a specific column.\r\n\r\nBut maybe I could extend the hook to include a shared cache that gets used for all of the cells in a specific table? Something like this:\r\n```python\r\nrender_cell(value, column, table, database, datasette, cache)\r\n```\r\n`cache` is a dictionary - and the same dictionary is passed to every call to that hook while rendering a specific page.\r\n\r\nIt's a bit of a gross hack though, and would it ever be useful for plugins outside of the default plugin in Datasette which does the foreign key stuff?\r\n\r\nIf I can think of one other potential application for this `cache` then I might implement it.\r\n\r\nNo, this optimization doesn't make sense: the most complex cell enrichment logic is the stuff that does a `select * from categories where id in (2, 5, 6)` query, using just the distinct set of IDs that are rendered on the current page. 
That's not going to fit in the `render_cell` hook no matter how hard I try to warp it into the right shape, because it needs full visibility of all of the results that are being rendered in order to collect those unique ID values.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058072543, "label": "Complete refactor of TableView and table.html template"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1181#issuecomment-998999230", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1181", "id": 998999230, "node_id": "IC_kwDOBm6k_c47i4S-", "user": {"value": 9308268, "label": "rayvoelker"}, "created_at": "2021-12-21T18:25:15Z", "updated_at": "2021-12-21T18:25:15Z", "author_association": "NONE", "body": "I wonder if I'm encountering the same bug (or something related). I had previously been using the .csv feature to run queries and then fetch results for the pandas `read_csv()` function, but it seems to have stopped working recently.\r\n\r\nhttps://ilsweb.cincinnatilibrary.org/collection-analysis/collection-analysis/current_collection-3d56dbf.csv?sql=select%0D%0A++*%0D%0Afrom%0D%0A++bib%0D%0Alimit%0D%0A++100&_size=max\r\n\r\nDatasette v0.59.4\r\n![image](https://user-images.githubusercontent.com/9308268/146979957-66911877-2cd9-4022-bc76-fd54e4a3a6f7.png)\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 781262510, "label": "Certain database names results in 404: \"Database not found: None\""}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/pull/1554#issuecomment-998354538", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1554", "id": 998354538, "node_id": "IC_kwDOBm6k_c47ga5q", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-20T23:52:04Z", "updated_at": "2021-12-20T23:52:04Z", "author_association": "OWNER", "body": "Abandoning this since it didn't work how I wanted.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1079129258, "label": "TableView refactor"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1547#issuecomment-997519202", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1547", "id": 997519202, "node_id": "IC_kwDOBm6k_c47dO9i", "user": {"value": 127565, "label": "wragge"}, "created_at": "2021-12-20T01:36:58Z", "updated_at": "2021-12-20T01:36:58Z", "author_association": "CONTRIBUTOR", "body": "Yep, that works -- thanks!", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1076388044, "label": "Writable canned queries fail to load custom templates"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1547#issuecomment-997514220", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1547", "id": 997514220, "node_id": "IC_kwDOBm6k_c47dNvs", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-20T01:26:25Z", "updated_at": "2021-12-20T01:26:25Z", "author_association": "OWNER", "body": "OK, this should hopefully fix that for you:\r\n\r\n    pip install https://github.com/simonw/datasette/archive/f36e010b3b69ada104b79d83c7685caf9359049e.zip", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1076388044, "label": "Writable canned queries fail to load custom templates"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1547#issuecomment-997513369", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1547", "id": 997513369, "node_id": "IC_kwDOBm6k_c47dNiZ", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-20T01:24:43Z", "updated_at": "2021-12-20T01:24:43Z", "author_association": "OWNER", "body": "@wragge thanks, that's a bug! Working on that in #1575.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1076388044, "label": "Writable canned queries fail to load custom templates"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1575#issuecomment-997513177", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1575", "id": 997513177, "node_id": "IC_kwDOBm6k_c47dNfZ", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-20T01:24:25Z", "updated_at": "2021-12-20T01:24:25Z", "author_association": "OWNER", "body": "Looks like `specname` is new in Pluggy 1.0: https://github.com/pytest-dev/pluggy/blob/main/CHANGELOG.rst#pluggy-100-2021-08-25", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1084257842, "label": "__call__() got an unexpected keyword argument 'specname'"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1547#issuecomment-997511968", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1547", "id": 997511968, "node_id": "IC_kwDOBm6k_c47dNMg", "user": {"value": 127565, "label": "wragge"}, "created_at": "2021-12-20T01:21:59Z", "updated_at": "2021-12-20T01:21:59Z", "author_association": "CONTRIBUTOR", "body": "I've installed the alpha version but get an error when starting up Datasette:\r\n\r\n```\r\nTraceback (most recent call last):\r\n  File \"/Users/tim/.pyenv/versions/stock-exchange/bin/datasette\", line 5, in <module>\r\n    from datasette.cli import cli\r\n  File \"/Users/tim/.pyenv/versions/3.8.5/envs/stock-exchange/lib/python3.8/site-packages/datasette/cli.py\", line 15, in <module>\r\n    from .app import Datasette, DEFAULT_SETTINGS, SETTINGS, SQLITE_LIMIT_ATTACHED, pm\r\n  File \"/Users/tim/.pyenv/versions/3.8.5/envs/stock-exchange/lib/python3.8/site-packages/datasette/app.py\", line 31, in <module>\r\n    from .views.database import DatabaseDownload, DatabaseView\r\n  File \"/Users/tim/.pyenv/versions/3.8.5/envs/stock-exchange/lib/python3.8/site-packages/datasette/views/database.py\", line 25, in <module>\r\n    from datasette.plugins import pm\r\n  File \"/Users/tim/.pyenv/versions/3.8.5/envs/stock-exchange/lib/python3.8/site-packages/datasette/plugins.py\", line 29, in <module>\r\n    mod = importlib.import_module(plugin)\r\n  File \"/Users/tim/.pyenv/versions/3.8.5/lib/python3.8/importlib/__init__.py\", line 127, in import_module\r\n    return _bootstrap._gcd_import(name[level:], package, level)\r\n  File \"/Users/tim/.pyenv/versions/3.8.5/envs/stock-exchange/lib/python3.8/site-packages/datasette/filters.py\", line 9, in <module>\r\n    @hookimpl(specname=\"filters_from_request\")\r\nTypeError: __call__() got an unexpected keyword argument 'specname'\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, 
\"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1076388044, "label": "Writable canned queries fail to load custom templates"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/356#issuecomment-997507074", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/356", "id": 997507074, "node_id": "IC_kwDOCGYnMM47dMAC", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-20T01:10:06Z", "updated_at": "2021-12-20T01:16:11Z", "author_association": "OWNER", "body": "Work-in-progress improved help:\r\n```\r\nUsage: sqlite-utils insert [OPTIONS] PATH TABLE FILE\r\n\r\n  Insert records from FILE into a table, creating the table if it does not\r\n  already exist.\r\n\r\n  By default the input is expected to be a JSON array of objects. Or:\r\n\r\n  - Use --nl for newline-delimited JSON objects\r\n  - Use --csv or --tsv for comma-separated or tab-separated input\r\n  - Use --lines to write each incoming line to a column called \"line\"\r\n  - Use --all to write the entire input to a column called \"all\"\r\n\r\n  You can also use --convert to pass a fragment of Python code that will be\r\n  used to convert each input.\r\n\r\n  Your Python code will be passed a \"row\" variable representing the imported\r\n  row, and can return a modified row.\r\n\r\n  If you are using --lines your code will be passed a \"line\" variable, and for\r\n  --all an \"all\" variable.\r\n\r\nOptions:\r\n  --pk TEXT                 Columns to use as the primary key, e.g. 
id\r\n  --flatten                 Flatten nested JSON objects, so {\"a\": {\"b\": 1}}\r\n                            becomes {\"a_b\": 1}\r\n  --nl                      Expect newline-delimited JSON\r\n  -c, --csv                 Expect CSV input\r\n  --tsv                     Expect TSV input\r\n  --lines                   Treat each line as a single value called 'line'\r\n  --all                     Treat input as a single value called 'all'\r\n  --convert TEXT            Python code to convert each item\r\n  --import TEXT             Python modules to import\r\n  --delimiter TEXT          Delimiter to use for CSV files\r\n  --quotechar TEXT          Quote character to use for CSV/TSV\r\n  --sniff                   Detect delimiter and quote character\r\n  --no-headers              CSV file has no header row\r\n  --batch-size INTEGER      Commit every X records\r\n  --alter                   Alter existing table to add any missing columns\r\n  --not-null TEXT           Columns that should be created as NOT NULL\r\n  --default <TEXT TEXT>...  Default value that should be set for a column\r\n  --encoding TEXT           Character encoding for input, defaults to utf-8\r\n  -d, --detect-types        Detect types for columns in CSV/TSV data\r\n  --load-extension TEXT     SQLite extensions to load\r\n  --silent                  Do not show progress bar\r\n  --ignore                  Ignore records if pk already exists\r\n  --replace                 Replace records if pk already exists\r\n  --truncate                Truncate table before inserting records, if table\r\n                            already exists\r\n  -h, --help                Show this message and exit.\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1077431957, "label": "`sqlite-utils insert --convert` option"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/356#issuecomment-997508728", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/356", "id": 997508728, "node_id": "IC_kwDOCGYnMM47dMZ4", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-20T01:14:43Z", "updated_at": "2021-12-20T01:14:43Z", "author_association": "OWNER", "body": "(This makes me want `--extract` from #352 even more.)", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1077431957, "label": "`sqlite-utils insert --convert` option"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/163#issuecomment-997502242", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/163", "id": 997502242, "node_id": "IC_kwDOCGYnMM47dK0i", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-20T00:56:45Z", "updated_at": "2021-12-20T00:56:52Z", "author_association": "OWNER", "body": "> Maybe `sqlite-utils` should absorb all of the functionality from `sqlite-transform` - having two separate tools doesn't necessarily make sense.\r\n\r\nI implemented that in:\r\n- #251", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 706001517, "label": "Idea: conversions= could take Python functions"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/356#issuecomment-997497262", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/356", "id": 997497262, "node_id": "IC_kwDOCGYnMM47dJmu", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-20T00:40:15Z", "updated_at": "2021-12-20T00:40:15Z", "author_association": "OWNER", "body": "`--flatten` could do with a better description too.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1077431957, "label": "`sqlite-utils insert --convert` option"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/356#issuecomment-997496931", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/356", "id": 997496931, "node_id": "IC_kwDOCGYnMM47dJhj", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-20T00:39:14Z", "updated_at": "2021-12-20T00:39:52Z", "author_association": "OWNER", "body": "```\r\n% sqlite-utils insert --help\r\nUsage: sqlite-utils insert [OPTIONS] PATH TABLE JSON_FILE\r\n\r\n  Insert records from JSON file into a table, creating the table if it does\r\n  not already exist.\r\n\r\n  Input should be a JSON array of objects, unless --nl or --csv is used.\r\n\r\nOptions:\r\n  --pk TEXT                 Columns to use as the primary key, e.g. id\r\n  --nl                      Expect newline-delimited JSON\r\n  --flatten                 Flatten nested JSON objects\r\n  -c, --csv                 Expect CSV\r\n  --tsv                     Expect TSV\r\n  --convert TEXT            Python code to convert each item\r\n  --import TEXT             Python modules to import\r\n  --delimiter TEXT          Delimiter to use for CSV files\r\n  --quotechar TEXT          Quote character to use for CSV/TSV\r\n  --sniff                   Detect delimiter and quote character\r\n  --no-headers              CSV file has no header row\r\n  --batch-size INTEGER      Commit every X records\r\n  --alter                   Alter existing table to add any missing columns\r\n  --not-null TEXT           Columns that should be created as NOT NULL\r\n  --default <TEXT TEXT>...  
Default value that should be set for a column\r\n  --encoding TEXT           Character encoding for input, defaults to utf-8\r\n  -d, --detect-types        Detect types for columns in CSV/TSV data\r\n  --load-extension TEXT     SQLite extensions to load\r\n  --silent                  Do not show progress bar\r\n  --ignore                  Ignore records if pk already exists\r\n  --replace                 Replace records if pk already exists\r\n  --truncate                Truncate table before inserting records, if table\r\n                            already exists\r\n  -h, --help                Show this message and exit.\r\n```\r\nI can add a bunch of extra help at the top there to explain all of this stuff. That \"Input should be a JSON array of objects\" bit could be expanded to several paragraphs.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1077431957, "label": "`sqlite-utils insert --convert` option"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/356#issuecomment-997492872", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/356", "id": 997492872, "node_id": "IC_kwDOCGYnMM47dIiI", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-20T00:23:31Z", "updated_at": "2021-12-20T00:23:31Z", "author_association": "OWNER", "body": "I think this should work on JSON, or CSV, or individual lines, or the entire content at once.\r\n\r\nSo I'll require `--lines --convert ...` to import individual lines, or `--all --convert` to run the conversion against the entire input at once.\r\n\r\nWhat would `--lines` or `--all` do without `--convert`? Maybe insert records as `{\"line\": \"line of text\"}` or `{\"all\": \"whole input\"}`.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1077431957, "label": "`sqlite-utils insert --convert` option"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/356#issuecomment-997486156", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/356", "id": 997486156, "node_id": "IC_kwDOCGYnMM47dG5M", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-19T23:51:02Z", "updated_at": "2021-12-19T23:51:02Z", "author_association": "OWNER", "body": "This is going to need a `--import` multi option too.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1077431957, "label": "`sqlite-utils insert --convert` option"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/sqlite-utils/issues/356#issuecomment-997485361", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/356", "id": 997485361, "node_id": "IC_kwDOCGYnMM47dGsx", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-19T23:45:30Z", "updated_at": "2021-12-19T23:45:30Z", "author_association": "OWNER", "body": "Really interesting example input for this: https://blog.timac.org/2021/1219-state-of-swift-and-swiftui-ios15/iOS13.txt - see https://blog.timac.org/2021/1219-state-of-swift-and-swiftui-ios15/", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1077431957, "label": "`sqlite-utils insert --convert` option"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1565#issuecomment-997474022", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1565", "id": 997474022, "node_id": "IC_kwDOBm6k_c47dD7m", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-19T22:36:49Z", "updated_at": "2021-12-19T22:37:29Z", "author_association": "OWNER", "body": "No way with a tagged template literal to pass an extra database name argument, so instead I need a method that returns a callable that can be used for the tagged template literal for a specific database - or the default database.\r\n\r\nThis could work (bit weird looking though):\r\n```javascript\r\nvar rows = await datasette.query(\"fixtures\")`select * from foo`;\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1083657868, "label": "Documented JavaScript variables on different templates made available for plugins"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1565#issuecomment-997473856", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1565", "id": 997473856, "node_id": "IC_kwDOBm6k_c47dD5A", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-19T22:35:20Z", "updated_at": "2021-12-19T22:35:20Z", "author_association": "OWNER", "body": "Quick prototype of that tagged template `query` function:\r\n\r\n```javascript\r\nfunction query(pieces, ...parameters) {\r\n  var qs = new URLSearchParams();\r\n  var sql = pieces[0];\r\n  parameters.forEach((param, i) => {\r\n    sql += `:p${i}${pieces[i + 1]}`;\r\n    qs.append(`p${i}`, param);\r\n  });\r\n  qs.append(\"sql\", sql);\r\n  return qs.toString();\r\n}\r\n\r\nvar id = 4;\r\nconsole.log(query`select * from ids where id > ${id}`);\r\n```\r\nOutputs:\r\n```\r\np0=4&sql=select+*+from+ids+where+id+%3E+%3Ap0\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1083657868, "label": "Documented JavaScript variables on different templates made available for plugins"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1565#issuecomment-997472639", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1565", "id": 997472639, "node_id": "IC_kwDOBm6k_c47dDl_", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-19T22:25:50Z", "updated_at": "2021-12-19T22:25:50Z", "author_association": "OWNER", "body": "Or...\r\n```javascript\r\nrows = await datasette.query`select * from searchable where id > ${id}`;\r\n```\r\nAnd it knows how to turn that into a parameterized call using tagged template literals.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1083657868, "label": "Documented JavaScript variables on different templates made available for plugins"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1565#issuecomment-997472509", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1565", "id": 997472509, "node_id": "IC_kwDOBm6k_c47dDj9", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-19T22:24:50Z", "updated_at": "2021-12-19T22:24:50Z", "author_association": "OWNER", "body": "... huh, it could even expose a JavaScript function that can be called to execute a SQL query.\r\n\r\n```javascript\r\ndatasette.query(\"select * from blah\").then(...)\r\n```\r\nMaybe it takes an optional second argument that specifies the database - defaulting to the one for the current page.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1083657868, "label": "Documented JavaScript variables on different templates made available for plugins"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1565#issuecomment-997472370", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1565", "id": 997472370, "node_id": "IC_kwDOBm6k_c47dDhy", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-19T22:23:36Z", "updated_at": "2021-12-19T22:23:36Z", "author_association": "OWNER", "body": "This should also expose the JSON API endpoints used to execute SQL against this database.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1083657868, "label": "Documented JavaScript variables on different templates made available for plugins"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1518#issuecomment-997472214", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1518", "id": 997472214, "node_id": "IC_kwDOBm6k_c47dDfW", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-19T22:22:08Z", "updated_at": "2021-12-19T22:22:08Z", "author_association": "OWNER", "body": "I sketched out a chained SQL builder pattern that might be useful for further tidying up this code - though with the new plugin hook I'm less excited about it than I was:\r\n\r\n```python\r\nclass TableQuery:\r\n    def __init__(self, table, columns, pks, is_view=False, prev=None):\r\n        self.table = table\r\n        self.columns = columns\r\n        self.pks = pks\r\n        self.is_view = is_view\r\n        self.prev = prev\r\n        \r\n        # These can be changed for different instances in the chain:\r\n        self._where_clauses = None\r\n        self._order_by = None\r\n        self._page_size = None\r\n        self._offset = None\r\n        self._select_columns = None\r\n\r\n        self.select_all_columns = '*'\r\n        self.select_specified_columns = '*'\r\n\r\n    @property\r\n    def where_clauses(self):\r\n        wheres = []\r\n        current = self\r\n        while current:\r\n            if current._where_clauses is not None:\r\n                wheres.extend(current._where_clauses)\r\n            current = current.prev\r\n        return list(reversed(wheres))\r\n\r\n    def where(self, where):\r\n        new_cls = TableQuery(self.table, self.columns, self.pks, self.is_view, self)\r\n        new_cls._where_clauses = [where]\r\n        return new_cls\r\n        \r\n    @classmethod\r\n    async def introspect(cls, db, table):\r\n        return cls(\r\n            table,\r\n            columns = await db.table_columns(table),\r\n            pks = await db.primary_keys(table),\r\n            is_view = bool(await db.get_view_definition(table))\r\n        )\r\n        \r\n    
@property\r\n    def sql_from(self):\r\n        return f\"from {self.table}{self.sql_where}\"\r\n\r\n    @property\r\n    def sql_where(self):\r\n        if not self.where_clauses:\r\n            return \"\"\r\n        else:\r\n            return f\" where {' and '.join(self.where_clauses)}\"\r\n\r\n    @property\r\n    def sql_no_order_no_limit(self):\r\n        return f\"select {self.select_all_columns} from {self.table}{self.sql_where}\"\r\n\r\n    @property\r\n    def sql(self):\r\n        return f\"select {self.select_specified_columns} from {self.table} {self.sql_where}{self._order_by} limit {self._page_size}{self._offset}\"\r\n\r\n    @property\r\n    def sql_count(self):\r\n        return f\"select count(*) {self.sql_from}\"\r\n\r\n\r\n    def __repr__(self):\r\n        return f\"<TableQuery sql={self.sql}>\"\r\n```\r\nUsage:\r\n```python\r\nfrom datasette.app import Datasette\r\nds = Datasette(memory=True, files=[\"/Users/simon/Dropbox/Development/datasette/fixtures.db\"])\r\ndb = ds.get_database(\"fixtures\")\r\nquery = await TableQuery.introspect(db, \"facetable\")\r\nprint(query.where(\"foo = bar\").where(\"baz = 1\").sql_count)\r\n# 'select count(*) from facetable where foo = bar and baz = 1'\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1058072543, "label": "Complete refactor of TableView and table.html template"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1547#issuecomment-997471672", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1547", "id": 997471672, "node_id": "IC_kwDOBm6k_c47dDW4", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-19T22:18:26Z", "updated_at": "2021-12-19T22:18:26Z", "author_association": "OWNER", "body": "I released this [in an alpha](https://github.com/simonw/datasette/releases/tag/0.60a1), so you can try out this fix using:\r\n\r\n    pip install datasette==0.60a1", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1076388044, "label": "Writable canned queries fail to load custom templates"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1566#issuecomment-997470633", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1566", "id": 997470633, "node_id": "IC_kwDOBm6k_c47dDGp", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-19T22:12:00Z", "updated_at": "2021-12-19T22:12:00Z", "author_association": "OWNER", "body": "Released another alpha, 0.60a1: https://github.com/simonw/datasette/releases/tag/0.60a1", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1083669410, "label": "Release Datasette 0.60"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1545#issuecomment-997462604", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1545", "id": 997462604, "node_id": "IC_kwDOBm6k_c47dBJM", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-19T21:17:08Z", "updated_at": "2021-12-19T21:17:08Z", "author_association": "OWNER", "body": "Here's the relevant code: https://github.com/simonw/datasette/blob/4094741c2881c2ada3f3f878b532fdaec7914953/datasette/app.py#L1204-L1219\r\n\r\nIt's using `route_path.split(\"/\")` which should be OK because that's the incoming `request.path`, which I would expect to use `/` even on Windows. Then it uses `os.path.join` which should do the right thing.\r\n\r\nI need to get myself a proper Windows development environment set up to investigate this one.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1075893249, "label": "Custom pages don't work on windows"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1573#issuecomment-997462117", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1573", "id": 997462117, "node_id": "IC_kwDOBm6k_c47dBBl", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-19T21:13:13Z", "updated_at": "2021-12-19T21:13:13Z", "author_association": "OWNER", "body": "This might also be the impetus I need to bring the https://datasette.io/plugins/datasette-pretty-traces plugin into Datasette core itself.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1084185188, "label": "Make trace() a documented internal API"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1547#issuecomment-997460731", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1547", "id": 997460731, "node_id": "IC_kwDOBm6k_c47dAr7", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-19T21:02:15Z", "updated_at": "2021-12-19T21:02:15Z", "author_association": "OWNER", "body": "Yes, this is a bug. It looks like the problem is that the `if write:` branch in this code here: https://github.com/simonw/datasette/blob/5fac26aa221a111d7633f2dd92014641f7c0ade9/datasette/views/database.py#L252-L327\r\n\r\nis missing this bit of code:\r\n\r\nhttps://github.com/simonw/datasette/blob/5fac26aa221a111d7633f2dd92014641f7c0ade9/datasette/views/database.py#L343-L347", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1076388044, "label": "Writable canned queries fail to load custom templates"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1570#issuecomment-997460061", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1570", "id": 997460061, "node_id": "IC_kwDOBm6k_c47dAhd", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-19T20:56:54Z", "updated_at": "2021-12-19T20:56:54Z", "author_association": "OWNER", "body": "Documentation: https://docs.datasette.io/en/latest/internals.html#await-db-execute-write-sql-params-none-block-false", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1083921371, "label": "Separate db.execute_write() into three methods"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1555#issuecomment-997459958", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1555", "id": 997459958, "node_id": "IC_kwDOBm6k_c47dAf2", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-19T20:55:59Z", "updated_at": "2021-12-19T20:55:59Z", "author_association": "OWNER", "body": "Closing this issue because I've optimized this a whole bunch, and it's definitely good enough for the moment.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1079149656, "label": "Optimize all those calls to index_list and foreign_key_list"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1555#issuecomment-997325189", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1555", "id": 997325189, "node_id": "IC_kwDOBm6k_c47cfmF", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-19T03:55:01Z", "updated_at": "2021-12-19T20:54:51Z", "author_association": "OWNER", "body": "It's a bit annoying that the queries no longer show up in the trace at all now, thanks to running in `.execute_fn()`. I wonder if there's something smart I can do about that - maybe have `trace()` record that function with a traceback even though it doesn't have the executed SQL string?\r\n\r\n5fac26aa221a111d7633f2dd92014641f7c0ade9 has the same problem.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1079149656, "label": "Optimize all those calls to index_list and foreign_key_list"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1555#issuecomment-997459637", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1555", "id": 997459637, "node_id": "IC_kwDOBm6k_c47dAa1", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-19T20:53:46Z", "updated_at": "2021-12-19T20:53:46Z", "author_association": "OWNER", "body": "Using #1571 showed me that the `DELETE FROM columns/foreign_keys/indexes WHERE database_name = ? and table_name = ?` queries were running way more times than I expected. I came up with a new optimization that just does `DELETE FROM columns/foreign_keys/indexes WHERE database_name = ?` instead.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1079149656, "label": "Optimize all those calls to index_list and foreign_key_list"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1566#issuecomment-997457790", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1566", "id": 997457790, "node_id": "IC_kwDOBm6k_c47c_9-", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-19T20:40:50Z", "updated_at": "2021-12-19T20:40:57Z", "author_association": "OWNER", "body": "Also release new version of `datasette-pretty-traces` with this feature:\r\n- https://github.com/simonw/datasette-pretty-traces/issues/7", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1083669410, "label": "Release Datasette 0.60"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1555#issuecomment-997342494", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1555", "id": 997342494, "node_id": "IC_kwDOBm6k_c47cj0e", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-19T07:22:04Z", "updated_at": "2021-12-19T07:22:04Z", "author_association": "OWNER", "body": "Another option would be to provide an abstraction that makes it easier to run a group of SQL queries in the same thread at the same time, and have them traced correctly.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1079149656, "label": "Optimize all those calls to index_list and foreign_key_list"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1555#issuecomment-997324666", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1555", "id": 997324666, "node_id": "IC_kwDOBm6k_c47cfd6", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-19T03:47:51Z", "updated_at": "2021-12-19T03:48:09Z", "author_association": "OWNER", "body": "Here's a hacked together prototype of running all of that stuff inside a single function passed to `.execute_fn()`:\r\n\r\n```diff\r\ndiff --git a/datasette/utils/internal_db.py b/datasette/utils/internal_db.py\r\nindex 95055d8..58f9982 100644\r\n--- a/datasette/utils/internal_db.py\r\n+++ b/datasette/utils/internal_db.py\r\n@@ -1,4 +1,5 @@\r\n import textwrap\r\n+from datasette.utils import table_column_details\r\n \r\n \r\n async def init_internal_db(db):\r\n@@ -70,49 +71,70 @@ async def populate_schema_tables(internal_db, db):\r\n         \"DELETE FROM tables WHERE database_name = ?\", [database_name], block=True\r\n     )\r\n     tables = (await db.execute(\"select * from sqlite_master WHERE type = 'table'\")).rows\r\n-    tables_to_insert = []\r\n-    columns_to_delete = []\r\n-    columns_to_insert = []\r\n-    foreign_keys_to_delete = []\r\n-    foreign_keys_to_insert = []\r\n-    indexes_to_delete = []\r\n-    indexes_to_insert = []\r\n \r\n-    for table in tables:\r\n-        table_name = table[\"name\"]\r\n-        tables_to_insert.append(\r\n-            (database_name, table_name, table[\"rootpage\"], table[\"sql\"])\r\n-        )\r\n-        columns_to_delete.append((database_name, table_name))\r\n-        columns = await db.table_column_details(table_name)\r\n-        columns_to_insert.extend(\r\n-            {\r\n-                **{\"database_name\": database_name, \"table_name\": table_name},\r\n-                **column._asdict(),\r\n-            }\r\n-            for column in columns\r\n-        )\r\n-        foreign_keys_to_delete.append((database_name, table_name))\r\n-        
foreign_keys = (\r\n-            await db.execute(f\"PRAGMA foreign_key_list([{table_name}])\")\r\n-        ).rows\r\n-        foreign_keys_to_insert.extend(\r\n-            {\r\n-                **{\"database_name\": database_name, \"table_name\": table_name},\r\n-                **dict(foreign_key),\r\n-            }\r\n-            for foreign_key in foreign_keys\r\n-        )\r\n-        indexes_to_delete.append((database_name, table_name))\r\n-        indexes = (await db.execute(f\"PRAGMA index_list([{table_name}])\")).rows\r\n-        indexes_to_insert.extend(\r\n-            {\r\n-                **{\"database_name\": database_name, \"table_name\": table_name},\r\n-                **dict(index),\r\n-            }\r\n-            for index in indexes\r\n+    def collect_info(conn):\r\n+        tables_to_insert = []\r\n+        columns_to_delete = []\r\n+        columns_to_insert = []\r\n+        foreign_keys_to_delete = []\r\n+        foreign_keys_to_insert = []\r\n+        indexes_to_delete = []\r\n+        indexes_to_insert = []\r\n+\r\n+        for table in tables:\r\n+            table_name = table[\"name\"]\r\n+            tables_to_insert.append(\r\n+                (database_name, table_name, table[\"rootpage\"], table[\"sql\"])\r\n+            )\r\n+            columns_to_delete.append((database_name, table_name))\r\n+            columns = table_column_details(conn, table_name)\r\n+            columns_to_insert.extend(\r\n+                {\r\n+                    **{\"database_name\": database_name, \"table_name\": table_name},\r\n+                    **column._asdict(),\r\n+                }\r\n+                for column in columns\r\n+            )\r\n+            foreign_keys_to_delete.append((database_name, table_name))\r\n+            foreign_keys = conn.execute(\r\n+                f\"PRAGMA foreign_key_list([{table_name}])\"\r\n+            ).fetchall()\r\n+            foreign_keys_to_insert.extend(\r\n+                {\r\n+                  
  **{\"database_name\": database_name, \"table_name\": table_name},\r\n+                    **dict(foreign_key),\r\n+                }\r\n+                for foreign_key in foreign_keys\r\n+            )\r\n+            indexes_to_delete.append((database_name, table_name))\r\n+            indexes = conn.execute(f\"PRAGMA index_list([{table_name}])\").fetchall()\r\n+            indexes_to_insert.extend(\r\n+                {\r\n+                    **{\"database_name\": database_name, \"table_name\": table_name},\r\n+                    **dict(index),\r\n+                }\r\n+                for index in indexes\r\n+            )\r\n+        return (\r\n+            tables_to_insert,\r\n+            columns_to_delete,\r\n+            columns_to_insert,\r\n+            foreign_keys_to_delete,\r\n+            foreign_keys_to_insert,\r\n+            indexes_to_delete,\r\n+            indexes_to_insert,\r\n         )\r\n \r\n+    (\r\n+        tables_to_insert,\r\n+        columns_to_delete,\r\n+        columns_to_insert,\r\n+        foreign_keys_to_delete,\r\n+        foreign_keys_to_insert,\r\n+        indexes_to_delete,\r\n+        indexes_to_insert,\r\n+    ) = await db.execute_fn(collect_info)\r\n+\r\n     await internal_db.execute_write_many(\r\n         \"\"\"\r\n         INSERT INTO tables (database_name, table_name, rootpage, sql)\r\n```\r\nFirst impressions: it looks like this helps **a lot** - as far as I can tell this is now taking around 21ms to get to the point at which all of those internal databases have been populated, where previously it took more than 180ms.\r\n\r\n![CleanShot 2021-12-18 at 19 47 22@2x](https://user-images.githubusercontent.com/9599/146663192-bba098d5-e7bd-4e2e-b525-2270867888a0.png)\r\n\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1079149656, "label": "Optimize all those calls to index_list and 
foreign_key_list"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1555#issuecomment-997324156", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1555", "id": 997324156, "node_id": "IC_kwDOBm6k_c47cfV8", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-19T03:40:05Z", "updated_at": "2021-12-19T03:40:05Z", "author_association": "OWNER", "body": "Using the prototype of this:\r\n- https://github.com/simonw/datasette-pretty-traces/issues/5\r\n\r\nI'm seeing about 180ms spent running all of these queries on startup!\r\n\r\n![CleanShot 2021-12-18 at 19 38 37@2x](https://user-images.githubusercontent.com/9599/146663045-46bda669-90de-474f-8870-345182725dc1.png)\r\n\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1079149656, "label": "Optimize all those calls to index_list and foreign_key_list"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1555#issuecomment-997321767", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1555", "id": 997321767, "node_id": "IC_kwDOBm6k_c47cewn", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-19T03:10:58Z", "updated_at": "2021-12-19T03:10:58Z", "author_association": "OWNER", "body": "I wonder how much overhead there is switching between the `async` event loop main code and the thread that runs the SQL queries.\r\n\r\nWould there be a performance boost if I gathered all of the column/index information in a single function run on the thread using `db.execute_fn()` I wonder? It would eliminate a bunch of switching between threads.\r\n\r\nWould be great to understand how much of an impact that would have.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1079149656, "label": "Optimize all those calls to index_list and foreign_key_list"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1555#issuecomment-997321653", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1555", "id": 997321653, "node_id": "IC_kwDOBm6k_c47ceu1", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-19T03:09:43Z", "updated_at": "2021-12-19T03:09:43Z", "author_association": "OWNER", "body": "On that same documentation page I just spotted this:\r\n\r\n>  This feature is experimental and is subject to change. Further documentation will become available if and when the table-valued functions for PRAGMAs feature becomes officially supported. \r\n\r\nThis makes me nervous to rely on pragma function optimizations in Datasette itself.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1079149656, "label": "Optimize all those calls to index_list and foreign_key_list"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1555#issuecomment-997321477", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1555", "id": 997321477, "node_id": "IC_kwDOBm6k_c47cesF", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-19T03:07:33Z", "updated_at": "2021-12-19T03:07:33Z", "author_association": "OWNER", "body": "If I want to continue supporting SQLite prior to 3.16.0 (2017-01-02) I'll need this optimization to only kick in with versions that support table-valued PRAGMA functions, while keeping the old `PRAGMA foreign_key_list(table)` stuff working for those older versions.\r\n\r\nThat's feasible, but it's a bit more work - and I need to make sure I have robust testing in place for SQLite 3.15.0.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1079149656, "label": "Optimize all those calls to index_list and foreign_key_list"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1555#issuecomment-997321327", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1555", "id": 997321327, "node_id": "IC_kwDOBm6k_c47cepv", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-19T03:05:39Z", "updated_at": "2021-12-19T03:05:44Z", "author_association": "OWNER", "body": "This caught me out once before in:\r\n- https://github.com/simonw/datasette/issues/1276\r\n\r\nTurns out Glitch was running SQLite 3.11.0 from 2016-02-15.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1079149656, "label": "Optimize all those calls to index_list and foreign_key_list"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1555#issuecomment-997321217", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1555", "id": 997321217, "node_id": "IC_kwDOBm6k_c47ceoB", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-19T03:04:16Z", "updated_at": "2021-12-19T03:04:16Z", "author_association": "OWNER", "body": "One thing to watch out for though, from https://sqlite.org/pragma.html#pragfunc\r\n\r\n>  The table-valued functions for PRAGMA feature was added in SQLite version 3.16.0 (2017-01-02). Prior versions of SQLite cannot use this feature. ", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1079149656, "label": "Optimize all those calls to index_list and foreign_key_list"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1555#issuecomment-997321115", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1555", "id": 997321115, "node_id": "IC_kwDOBm6k_c47cemb", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-19T03:03:12Z", "updated_at": "2021-12-19T03:03:12Z", "author_association": "OWNER", "body": "Table columns is a bit harder, because `table_xinfo` is only in SQLite 3.26.0 or higher: https://github.com/simonw/datasette/blob/d637ed46762fdbbd8e32b86f258cd9a53c1cfdc7/datasette/utils/__init__.py#L565-L581\r\n\r\nSo if that function is available: https://latest.datasette.io/fixtures?sql=SELECT%0D%0A++sqlite_master.name%2C%0D%0A++table_xinfo.*%0D%0AFROM%0D%0A++sqlite_master%2C%0D%0A++pragma_table_xinfo%28sqlite_master.name%29+AS+table_xinfo%0D%0AWHERE%0D%0A++sqlite_master.type+%3D+%27table%27\r\n\r\n```sql\r\nSELECT\r\n  sqlite_master.name,\r\n  table_xinfo.*\r\nFROM\r\n  sqlite_master,\r\n  pragma_table_xinfo(sqlite_master.name) AS table_xinfo\r\nWHERE\r\n  sqlite_master.type = 'table'\r\n```\r\nAnd otherwise, using `table_info`: https://latest.datasette.io/fixtures?sql=SELECT%0D%0A++sqlite_master.name%2C%0D%0A++table_info.*%2C%0D%0A++0+as+hidden%0D%0AFROM%0D%0A++sqlite_master%2C%0D%0A++pragma_table_info%28sqlite_master.name%29+AS+table_info%0D%0AWHERE%0D%0A++sqlite_master.type+%3D+%27table%27\r\n\r\n```sql\r\nSELECT\r\n  sqlite_master.name,\r\n  table_info.*,\r\n  0 as hidden\r\nFROM\r\n  sqlite_master,\r\n  pragma_table_info(sqlite_master.name) AS table_info\r\nWHERE\r\n  sqlite_master.type = 'table'\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1079149656, "label": "Optimize all those calls to index_list and foreign_key_list"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1555#issuecomment-997320824", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1555", "id": 997320824, "node_id": "IC_kwDOBm6k_c47ceh4", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-19T02:59:57Z", "updated_at": "2021-12-19T03:00:44Z", "author_association": "OWNER", "body": "To list all indexes: https://latest.datasette.io/fixtures?sql=SELECT%0D%0A++sqlite_master.name%2C%0D%0A++index_list.*%0D%0AFROM%0D%0A++sqlite_master%2C%0D%0A++pragma_index_list%28sqlite_master.name%29+AS+index_list%0D%0AWHERE%0D%0A++sqlite_master.type+%3D+%27table%27\r\n\r\n```sql\r\nSELECT\r\n  sqlite_master.name,\r\n  index_list.*\r\nFROM\r\n  sqlite_master,\r\n  pragma_index_list(sqlite_master.name) AS index_list\r\nWHERE\r\n  sqlite_master.type = 'table'\r\n```\r\n\r\nForeign keys: https://latest.datasette.io/fixtures?sql=SELECT%0D%0A++sqlite_master.name%2C%0D%0A++foreign_key_list.*%0D%0AFROM%0D%0A++sqlite_master%2C%0D%0A++pragma_foreign_key_list%28sqlite_master.name%29+AS+foreign_key_list%0D%0AWHERE%0D%0A++sqlite_master.type+%3D+%27table%27\r\n\r\n```sql\r\nSELECT\r\n  sqlite_master.name,\r\n  foreign_key_list.*\r\nFROM\r\n  sqlite_master,\r\n  pragma_foreign_key_list(sqlite_master.name) AS foreign_key_list\r\nWHERE\r\n  sqlite_master.type = 'table'\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1079149656, "label": "Optimize all those calls to index_list and foreign_key_list"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1566#issuecomment-997272328", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1566", "id": 997272328, "node_id": "IC_kwDOBm6k_c47cSsI", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-18T19:18:01Z", "updated_at": "2021-12-18T19:18:01Z", "author_association": "OWNER", "body": "Added some useful new documented internal methods in:\r\n- #1570", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1083669410, "label": "Release Datasette 0.60"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1555#issuecomment-997272223", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1555", "id": 997272223, "node_id": "IC_kwDOBm6k_c47cSqf", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-18T19:17:13Z", "updated_at": "2021-12-18T19:17:13Z", "author_association": "OWNER", "body": "That's a good optimization. Still need to deal with the huge flurry of `PRAGMA` queries though before I can consider this done.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1079149656, "label": "Optimize all those calls to index_list and foreign_key_list"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1570#issuecomment-997267583", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1570", "id": 997267583, "node_id": "IC_kwDOBm6k_c47cRh_", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-18T18:46:05Z", "updated_at": "2021-12-18T18:46:12Z", "author_association": "OWNER", "body": "This will replace the work done in #1569.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1083921371, "label": "Separate db.execute_write() into three methods"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1555#issuecomment-997267416", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1555", "id": 997267416, "node_id": "IC_kwDOBm6k_c47cRfY", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-18T18:44:53Z", "updated_at": "2021-12-18T18:45:28Z", "author_association": "OWNER", "body": "Rather than adding an `executemany=True` parameter, I'm now thinking a better design might be to have three methods:\r\n\r\n- `db.execute_write(sql, params=None, block=False)`\r\n- `db.execute_writescript(sql, block=False)`\r\n- `db.execute_writemany(sql, params_seq, block=False)`", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1079149656, "label": "Optimize all those calls to index_list and foreign_key_list"}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1569#issuecomment-997266687", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1569", "id": 997266687, "node_id": "IC_kwDOBm6k_c47cRT_", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-18T18:41:40Z", "updated_at": "2021-12-18T18:41:40Z", "author_association": "OWNER", "body": "Updated documentation: https://docs.datasette.io/en/latest/internals.html#await-db-execute-write-sql-params-none-executescript-false-block-false", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1083895395, "label": "db.execute_write(..., executescript=True) parameter"}, "performed_via_github_app": null}