github
html_url | issue_url | id | node_id | user | created_at | updated_at | author_association | body | reactions | issue | performed_via_github_app |
---|---|---|---|---|---|---|---|---|---|---|---|
https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006311742 | https://api.github.com/repos/simonw/sqlite-utils/issues/361 | 1006311742 | IC_kwDOCGYnMM47-xk- | 9599 | 2022-01-06T06:12:19Z | 2022-01-06T06:12:19Z | OWNER | Got that working: ``` % echo 'This is cool' | sqlite-utils insert words.db words - --text --convert '({"word": w} for w in text.split())' % sqlite-utils dump words.db BEGIN TRANSACTION; CREATE TABLE [words] ( [word] TEXT ); INSERT INTO "words" VALUES('This'); INSERT INTO "words" VALUES('is'); INSERT INTO "words" VALUES('cool'); COMMIT; ``` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1094890366 | |
https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006309834 | https://api.github.com/repos/simonw/sqlite-utils/issues/361 | 1006309834 | IC_kwDOCGYnMM47-xHK | 9599 | 2022-01-06T06:08:01Z | 2022-01-06T06:08:01Z | OWNER | For `--text` the conversion function should be allowed to return an iterable instead of a dictionary, in which case it will be treated as the full list of records to be inserted. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1094890366 | |
https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006301546 | https://api.github.com/repos/simonw/sqlite-utils/issues/361 | 1006301546 | IC_kwDOCGYnMM47-vFq | 9599 | 2022-01-06T05:44:47Z | 2022-01-06T05:44:47Z | OWNER | Just need documentation for `--convert` now against the various different types of input. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1094890366 | |
https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006300280 | https://api.github.com/repos/simonw/sqlite-utils/issues/361 | 1006300280 | IC_kwDOCGYnMM47-ux4 | 9599 | 2022-01-06T05:40:45Z | 2022-01-06T05:40:45Z | OWNER | I'm going to rename `--all` to `--text`: > - Use `--text` to write the entire input to a column called "text" To avoid that clash with Python's `all()` function. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1094890366 | |
https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006299778 | https://api.github.com/repos/simonw/sqlite-utils/issues/361 | 1006299778 | IC_kwDOCGYnMM47-uqC | 9599 | 2022-01-06T05:39:10Z | 2022-01-06T05:39:10Z | OWNER | `all` is a bad variable name because it clashes with the Python `all()` built-in function. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1094890366 | |
https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006295276 | https://api.github.com/repos/simonw/sqlite-utils/issues/361 | 1006295276 | IC_kwDOCGYnMM47-tjs | 9599 | 2022-01-06T05:26:11Z | 2022-01-06T05:26:11Z | OWNER | Here's the traceback if your `--convert` function doesn't return a dict right now: ``` % sqlite-utils insert /tmp/all.db blah /tmp/log.log --convert 'all.upper()' --all Traceback (most recent call last): File "/Users/simon/.local/share/virtualenvs/sqlite-utils-C4Ilevlm/bin/sqlite-utils", line 33, in <module> sys.exit(load_entry_point('sqlite-utils', 'console_scripts', 'sqlite-utils')()) File "/Users/simon/.local/share/virtualenvs/sqlite-utils-C4Ilevlm/lib/python3.8/site-packages/click/core.py", line 1137, in __call__ return self.main(*args, **kwargs) File "/Users/simon/.local/share/virtualenvs/sqlite-utils-C4Ilevlm/lib/python3.8/site-packages/click/core.py", line 1062, in main rv = self.invoke(ctx) File "/Users/simon/.local/share/virtualenvs/sqlite-utils-C4Ilevlm/lib/python3.8/site-packages/click/core.py", line 1668, in invoke return _process_result(sub_ctx.command.invoke(sub_ctx)) File "/Users/simon/.local/share/virtualenvs/sqlite-utils-C4Ilevlm/lib/python3.8/site-packages/click/core.py", line 1404, in invoke return ctx.invoke(self.callback, **ctx.params) File "/Users/simon/.local/share/virtualenvs/sqlite-utils-C4Ilevlm/lib/python3.8/site-packages/click/core.py", line 763, in invoke return __callback(*args, **kwargs) File "/Users/simon/Dropbox/Development/sqlite-utils/sqlite_utils/cli.py", line 949, in insert insert_upsert_implementation( File "/Users/simon/Dropbox/Development/sqlite-utils/sqlite_utils/cli.py", line 834, in insert_upsert_implementation db[table].insert_all( File "/Users/simon/Dropbox/Development/sqlite-utils/sqlite_utils/db.py", line 2602, in insert_all first_record = next(records) File "/Users/simon/Dropbox/Development/sqlite-utils/sqlite_utils/db.py", line 3044, in fix_square_braces for record in records: File 
"/Users/simon/Dropbox/Development/sqlite-utils/sqlite_utils/cli.py", line 831, in <genexpr> docs = (decode_base64_values(doc) for doc in docs) File "/Users/simon/Dropbox/Development/s… | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1094890366 | |
https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006294777 | https://api.github.com/repos/simonw/sqlite-utils/issues/361 | 1006294777 | IC_kwDOCGYnMM47-tb5 | 9599 | 2022-01-06T05:24:54Z | 2022-01-06T05:24:54Z | OWNER | > I added a custom error message for if the user's `--convert` code doesn't return a dict. That turned out to be a bad idea because it meant exhausting the iterator early for the check - before we got to the `.insert_all()` code that breaks the iterator up into chunks. I tried fixing that with `itertools.tee()` to run the generator twice but that's grossly memory-inefficient for large imports. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1094890366 | |
https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006288444 | https://api.github.com/repos/simonw/sqlite-utils/issues/361 | 1006288444 | IC_kwDOCGYnMM47-r48 | 9599 | 2022-01-06T05:07:10Z | 2022-01-06T05:07:10Z | OWNER | And here's a demo of `--convert` used with `--all` - I added a custom error message for if the user's `--convert` code doesn't return a dict. ``` % sqlite-utils insert /tmp/all.db blah /tmp/log.log --convert 'all.upper()' --all Error: Records returned by your --convert function must be dicts % sqlite-utils insert /tmp/all.db blah /tmp/log.log --convert '{"all": all.upper()}' --all % sqlite-utils dump /tmp/all.db BEGIN TRANSACTION; CREATE TABLE [blah] ( [all] TEXT ); INSERT INTO "blah" VALUES('INFO: 127.0.0.1:60581 - "GET / HTTP/1.1" 200 OK INFO: 127.0.0.1:60581 - "GET /FOO/-/STATIC/APP.CSS?CEAD5A HTTP/1.1" 200 OK INFO: 127.0.0.1:60581 - "GET /FAVICON.ICO HTTP/1.1" 200 OK INFO: 127.0.0.1:60581 - "GET /FOO/TIDDLYWIKI HTTP/1.1" 200 OK INFO: 127.0.0.1:60581 - "GET /FOO/-/STATIC/APP.CSS?CEAD5A HTTP/1.1" 200 OK INFO: 127.0.0.1:60584 - "GET /FOO/-/STATIC/SQL-FORMATTER-2.3.3.MIN.JS HTTP/1.1" 200 OK INFO: 127.0.0.1:60586 - "GET /FOO/-/STATIC/CODEMIRROR-5.57.0.MIN.JS HTTP/1.1" 200 OK INFO: 127.0.0.1:60585 - "GET /FOO/-/STATIC/CODEMIRROR-5.57.0.MIN.CSS HTTP/1.1" 200 OK INFO: 127.0.0.1:60588 - "GET /FOO/-/STATIC/CODEMIRROR-5.57.0-SQL.MIN.JS HTTP/1.1" 200 OK INFO: 127.0.0.1:60587 - "GET /FOO/-/STATIC/CM-RESIZE-1.0.1.MIN.JS HTTP/1.1" 200 OK INFO: 127.0.0.1:60586 - "GET /FOO/TIDDLYWIKI/TIDDLERS HTTP/1.1" 200 OK INFO: 127.0.0.1:60586 - "GET /FOO/-/STATIC/APP.CSS?CEAD5A HTTP/1.1" 200 OK INFO: 127.0.0.1:60584 - "GET /FOO/-/STATIC/TABLE.JS HTTP/1.1" 200 OK '); COMMIT; ``` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1094890366 | |
https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006284673 | https://api.github.com/repos/simonw/sqlite-utils/issues/361 | 1006284673 | IC_kwDOCGYnMM47-q-B | 9599 | 2022-01-06T04:55:52Z | 2022-01-06T04:55:52Z | OWNER | Test code that just worked for me: ``` sqlite-utils insert /tmp/blah.db blah /tmp/log.log --convert ' bits = line.split() return dict([("b_{}".format(i), bit) for i, bit in enumerate(bits)])' --lines ``` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1094890366 | |
https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006232013 | https://api.github.com/repos/simonw/sqlite-utils/issues/361 | 1006232013 | IC_kwDOCGYnMM47-eHN | 9599 | 2022-01-06T02:21:35Z | 2022-01-06T02:21:35Z | OWNER | I'm having second thoughts about this bit: > Your Python code will be passed a "row" variable representing the imported row, and can return a modified row. > > If you are using `--lines` your code will be passed a "line" variable, and for `--all` an "all" variable. The code in question is this: https://github.com/simonw/sqlite-utils/blob/500a35ad4d91c8a6232134ce9406efec11bedff8/sqlite_utils/utils.py#L296-L303 Do I really want to add the complexity of supporting different variable names there? I think always using `value` might be better. Except... `value` made sense for the existing `sqlite-utils convert` command where you are running a conversion function against the value for the column in the current row - is it confusing if applied to lines or documents or `all`? | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1094890366 | |
https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006230411 | https://api.github.com/repos/simonw/sqlite-utils/issues/361 | 1006230411 | IC_kwDOCGYnMM47-duL | 9599 | 2022-01-06T02:17:35Z | 2022-01-06T02:17:35Z | OWNER | Documentation: https://github.com/simonw/sqlite-utils/blob/33223856ff7fe746b7b77750fbe5b218531d0545/docs/cli.rst#inserting-unstructured-data-with---lines-and---all - I went with a single section titled "Inserting unstructured data with --lines and --all" | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1094890366 | |
https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006220129 | https://api.github.com/repos/simonw/sqlite-utils/issues/361 | 1006220129 | IC_kwDOCGYnMM47-bNh | 9599 | 2022-01-06T01:52:26Z | 2022-01-06T01:52:26Z | OWNER | I'm going to refactor all of the tests for `sqlite-utils insert` into a new `test_cli_insert.py` module. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1094890366 | |
https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006219848 | https://api.github.com/repos/simonw/sqlite-utils/issues/361 | 1006219848 | IC_kwDOCGYnMM47-bJI | 9599 | 2022-01-06T01:51:36Z | 2022-01-06T01:51:36Z | OWNER | So far I've just implemented the new help: ``` % sqlite-utils insert --help Usage: sqlite-utils insert [OPTIONS] PATH TABLE FILE Insert records from FILE into a table, creating the table if it does not already exist. By default the input is expected to be a JSON array of objects. Or: - Use --nl for newline-delimited JSON objects - Use --csv or --tsv for comma-separated or tab-separated input - Use --lines to write each incoming line to a column called "line" - Use --all to write the entire input to a column called "all" You can also use --convert to pass a fragment of Python code that will be used to convert each input. Your Python code will be passed a "row" variable representing the imported row, and can return a modified row. If you are using --lines your code will be passed a "line" variable, and for --all an "all" variable. Options: --pk TEXT Columns to use as the primary key, e.g. id --flatten Flatten nested JSON objects, so {"a": {"b": 1}} becomes {"a_b": 1} --nl Expect newline-delimited JSON -c, --csv Expect CSV input --tsv Expect TSV input --lines Treat each line as a single value called 'line' --all Treat input as a single value called 'all' --convert TEXT Python code to convert each item --import TEXT Python modules to import --delimiter TEXT Delimiter to use for CSV files --quotechar TEXT Quote character to use for CSV/TSV --sniff Detect delimiter and quote character --no-headers CSV file has no header row --batch-size INTEGER Commit every X records --alter Alter existing table to add any missing columns --not-null TEXT Columns that should be created as NOT NULL --default <TEXT TEXT>... 
Default value that should be set for a column --e… | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1094890366 | |
https://github.com/simonw/sqlite-utils/issues/356#issuecomment-997496626 | https://api.github.com/repos/simonw/sqlite-utils/issues/356 | 997496626 | IC_kwDOCGYnMM47dJcy | 9599 | 2021-12-20T00:38:15Z | 2022-01-06T01:29:03Z | OWNER | The implementation of this gets a tiny bit complicated. Ignoring `--convert`, the `--lines` option can internally produce `{"line": ...}` records and the `--all` option can produce `{"all": ...}` records. But... when `--convert` is used, what should the code run against? It could run against those already-converted records but that's a little bit strange, since you'd have to do this: sqlite-utils insert blah.db blah myfile.txt --all --convert '({"item": s} for s in value["all"].split("-"))' Having to use `value["all"]` there is unintuitive. It would be nicer to have an `all` variable to work against. But then for `--lines` should the local variable be called `line`? And how best to summarize these different names for local variables in the inline help for the feature? | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1077431957 | |
https://github.com/simonw/sqlite-utils/issues/360#issuecomment-1006211113 | https://api.github.com/repos/simonw/sqlite-utils/issues/360 | 1006211113 | IC_kwDOCGYnMM47-ZAp | 9599 | 2022-01-06T01:27:53Z | 2022-01-06T01:27:53Z | OWNER | It looks like you were using `sqlite-utils memory` - that works by loading the entire file into an in-memory database, so 170GB is very likely to run out of RAM. The line of code there exhibits another problem: it's reading the entire JSON file into a Python string, so it looks like it's going to run out of RAM even before it gets to the SQLite in-memory database section. To handle a file of this size you'd need to write it to a SQLite database on-disk first. The `sqlite-utils insert` command can do this, and it should be able to "stream" records in from a file without loading the entire thing into memory - but only for JSON-NL and CSV/TSV formats, not for JSON arrays. The code in question is here: https://github.com/simonw/sqlite-utils/blob/f3fd8613113d21d44238a6ec54b375f5aa72c4e0/sqlite_utils/cli.py#L738-L773 That's using Python generators for the CSV/TSV/JSON-NL variants... but it's doing this for regular JSON which requires reading the entire thing into memory: https://github.com/simonw/sqlite-utils/blob/f3fd8613113d21d44238a6ec54b375f5aa72c4e0/sqlite_utils/cli.py#L767 If you have the ability to control how your 170GB file is generated you may have more luck converting it to CSV or TSV or newline-delimited JSON, then using `sqlite-utils insert` to insert it into a database file. To be honest though I've never tested this tooling with anything nearly that big, so it's possible you'll still run into problems. If you do I'd love to hear about them! I would be tempted to tackle this size of job by writing a custom Python script, either using the `sqlite_utils` Python library or even calling `sqlite3` directly. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1091819089 | |
https://github.com/simonw/datasette/issues/1534#issuecomment-1005975080 | https://api.github.com/repos/simonw/datasette/issues/1534 | 1005975080 | IC_kwDOBm6k_c479fYo | 9599 | 2022-01-05T18:29:06Z | 2022-01-05T18:29:06Z | OWNER | A really big downside to this is that it turns out many CDNs - apparently including Cloudflare - don't support the Vary header at all! More in this thread: https://twitter.com/simonw/status/1478470282931163137 | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1065432388 | |
https://github.com/simonw/datasette/issues/1585#issuecomment-1003575286 | https://api.github.com/repos/simonw/datasette/issues/1585 | 1003575286 | IC_kwDOBm6k_c470Vf2 | 9599 | 2022-01-01T15:40:38Z | 2022-01-01T15:40:38Z | OWNER | API tutorial: https://firebase.google.com/docs/hosting/api-deploy | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1091838742 | |
https://github.com/dogsheep/google-takeout-to-sqlite/pull/8#issuecomment-1003437288 | https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/8 | 1003437288 | IC_kwDODFE5qs47zzzo | 28565 | 2021-12-31T19:06:20Z | 2021-12-31T19:06:20Z | NONE | > @maxhawkins how hard would it be to add an entry to the table that includes the HTML version of the email, if it exists? I just tried the PR branch on a very small mbox file, and it worked great. My use case is a research project and I need to access more than just the body plain text. Shouldn't be hard. The easiest way is probably to remove the `if body.content_type == "text/html"` clause from [utils.py:254](https://github.com/dogsheep/google-takeout-to-sqlite/pull/8/commits/8e6d487b697ce2e8ad885acf613a157bfba84c59#diff-25ad9dd1ced1b8bfc37fda8444819c803232c08891e4af3d4064aa205d8174eaR254) and just return content directly without parsing. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
954546309 | |
https://github.com/simonw/datasette/issues/1583#issuecomment-1002825217 | https://api.github.com/repos/simonw/datasette/issues/1583 | 1002825217 | IC_kwDOBm6k_c47xeYB | 536941 | 2021-12-30T00:34:16Z | 2021-12-30T00:34:16Z | CONTRIBUTOR | if that is not desirable, it might be good to document that users might want to set up a lifecycle rule to automatically delete these build artifacts. something like https://stackoverflow.com/questions/59937542/can-i-delete-container-images-from-google-cloud-storage-artifacts-bucket | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1090810196 | |
https://github.com/dogsheep/google-takeout-to-sqlite/pull/8#issuecomment-1002735370 | https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/8 | 1002735370 | IC_kwDODFE5qs47xIcK | 203343 | 2021-12-29T18:58:23Z | 2021-12-29T18:58:23Z | NONE | @maxhawkins how hard would it be to add an entry to the table that includes the HTML version of the email, if it exists? I just tried the PR branch on a very small mbox file, and it worked great. My use case is a research project and I need to access more than just the body plain text. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
954546309 | |
https://github.com/simonw/datasette/issues/1152#issuecomment-1001791592 | https://api.github.com/repos/simonw/datasette/issues/1152 | 1001791592 | IC_kwDOBm6k_c47tiBo | 9599 | 2021-12-27T23:04:31Z | 2021-12-27T23:04:31Z | OWNER | Another option: rethink permissions to always work in terms of WHERE clauses used as part of a SQL query that returns the overall allowed set of databases or tables. This would require rethinking existing permissions, but it might be worthwhile prior to 1.0. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
770598024 | |
https://github.com/simonw/datasette/issues/878#issuecomment-1001699559 | https://api.github.com/repos/simonw/datasette/issues/878 | 1001699559 | IC_kwDOBm6k_c47tLjn | 9599 | 2021-12-27T18:53:04Z | 2021-12-27T18:53:04Z | OWNER | I'm going to see if I can come up with the simplest possible version of this pattern for the `/-/metadata` and `/-/metadata.json` page, then try it for the database query page, before tackling the much more complex table page. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
648435885 | |
https://github.com/dogsheep/twitter-to-sqlite/issues/62#issuecomment-1001222213 | https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/62 | 1001222213 | IC_kwDODEm0Qs47rXBF | 6764957 | 2021-12-26T17:59:25Z | 2021-12-26T17:59:25Z | NONE | just confirmed that this error does not occur when i use my public main account. gets more interesting! | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1088816961 | |
https://github.com/simonw/sqlite-utils/issues/228#issuecomment-1001115286 | https://api.github.com/repos/simonw/sqlite-utils/issues/228 | 1001115286 | IC_kwDOCGYnMM47q86W | 1206106 | 2021-12-26T07:01:31Z | 2021-12-26T07:01:31Z | NONE | `--no-headers` does not work? ``` $ echo 'a,1\nb,2' | sqlite-utils memory --no-headers -t - 'select * from stdin' a 1 --- --- b 2 ``` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
807437089 | |
https://github.com/simonw/datasette/issues/1576#issuecomment-1000935523 | https://api.github.com/repos/simonw/datasette/issues/1576 | 1000935523 | IC_kwDOBm6k_c47qRBj | 9599 | 2021-12-24T21:33:05Z | 2021-12-24T21:33:05Z | OWNER | Another option would be to attempt to import `contextvars` and, if the import fails (for Python 3.6) continue using the current mechanism - then let Python 3.6 users know in the documentation that under Python 3.6 they will miss out on nested traces. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1087181951 | |
https://github.com/simonw/datasette/issues/1577#issuecomment-1000673444 | https://api.github.com/repos/simonw/datasette/issues/1577 | 1000673444 | IC_kwDOBm6k_c47pRCk | 9599 | 2021-12-24T06:08:58Z | 2021-12-24T06:08:58Z | OWNER | https://pypistats.org/packages/datasette shows a breakdown of downloads by Python version: <img width="986" alt="image" src="https://user-images.githubusercontent.com/9599/147323253-1ee22d93-3be2-472b-8ead-495d925958e5.png"> It looks like on a recent day I had 4,071 downloads from Python 3.7... and just 2 downloads from Python 3.6! | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1087913724 | |
https://github.com/simonw/datasette/issues/1534#issuecomment-1000535904 | https://api.github.com/repos/simonw/datasette/issues/1534 | 1000535904 | IC_kwDOBm6k_c47ovdg | 9599 | 2021-12-23T21:44:31Z | 2021-12-23T21:44:31Z | OWNER | A big downside to this is that I would need to use `Vary: Accept` for when Datasette is running behind a cache such as Cloudflare - would that greatly reduce overall cache efficiency due to subtle variations in the accept headers sent by common browsers? | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1065432388 | |
https://github.com/simonw/datasette/issues/1579#issuecomment-1000485719 | https://api.github.com/repos/simonw/datasette/issues/1579 | 1000485719 | IC_kwDOBm6k_c47ojNX | 9599 | 2021-12-23T19:19:45Z | 2021-12-23T19:19:45Z | OWNER | All of those removed `block=True` lines in 8c401ee0f054de2f568c3a8302c9223555146407 really help confirm to me that this was a good decision. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1087931918 | |
https://github.com/simonw/datasette/issues/1579#issuecomment-1000485505 | https://api.github.com/repos/simonw/datasette/issues/1579 | 1000485505 | IC_kwDOBm6k_c47ojKB | 9599 | 2021-12-23T19:19:13Z | 2021-12-23T19:19:13Z | OWNER | Updated docs for `execute_write_fn()`: https://github.com/simonw/datasette/blob/75153ea9b94d09ec3d61f7c6ebdf378e0c0c7a0b/docs/internals.rst#await-dbexecute_write_fnfn-blocktrue | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1087931918 | |
https://github.com/simonw/datasette/issues/1579#issuecomment-1000481686 | https://api.github.com/repos/simonw/datasette/issues/1579 | 1000481686 | IC_kwDOBm6k_c47oiOW | 9599 | 2021-12-23T19:09:23Z | 2021-12-23T19:09:23Z | OWNER | Re-opening this because I missed updating some of the docs, and I also need to update Datasette's own code to not use `block=True` in a bunch of places. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1087931918 | |
https://github.com/simonw/datasette/issues/1579#issuecomment-1000479737 | https://api.github.com/repos/simonw/datasette/issues/1579 | 1000479737 | IC_kwDOBm6k_c47ohv5 | 9599 | 2021-12-23T19:04:23Z | 2021-12-23T19:04:23Z | OWNER | Updated documentation: https://github.com/simonw/datasette/blob/00a2895cd2dc42c63846216b36b2dc9f41170129/docs/internals.rst#await-dbexecute_writesql-paramsnone-blocktrue | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1087931918 | |
https://github.com/simonw/datasette/issues/1579#issuecomment-1000477813 | https://api.github.com/repos/simonw/datasette/issues/1579 | 1000477813 | IC_kwDOBm6k_c47ohR1 | 9599 | 2021-12-23T18:59:41Z | 2021-12-23T18:59:41Z | OWNER | I'm going to go with `execute_write(..., block=False)` as the mechanism for fire-and-forget write queries. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1087931918 | |
https://github.com/simonw/datasette/issues/1579#issuecomment-1000477621 | https://api.github.com/repos/simonw/datasette/issues/1579 | 1000477621 | IC_kwDOBm6k_c47ohO1 | 9599 | 2021-12-23T18:59:12Z | 2021-12-23T18:59:12Z | OWNER | The easiest way to change this would be to default to `block=True` such that you need to pass `block=False` to the APIs to have them do fire-and-forget. An alternative would be to add new, separately named methods which do the fire-and-forget thing. If I hadn't recently added `execute_write_script` and `execute_write_many` in #1570 I'd be more into this idea, but I don't want to end up with eight methods - `execute_write`, `execute_write_queue`, `execute_write_many`, `execute_write_many_queue`, `execute_write_script`, `execute_write_script_queue`, `execute_write_fn`, `execute_write_fn_queue`. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1087931918 | |
https://github.com/simonw/datasette/issues/1579#issuecomment-1000476413 | https://api.github.com/repos/simonw/datasette/issues/1579 | 1000476413 | IC_kwDOBm6k_c47og79 | 9599 | 2021-12-23T18:56:06Z | 2021-12-23T18:56:06Z | OWNER | This is technically a breaking change, but a GitHub code search at https://cs.github.com/?scopeName=All+repos&scope=&q=execute_write%20datasette%20-owner%3Asimonw shows only one repo not-owned-by-me using this, and they're using `block=True`: https://github.com/mfa/datasette-webhook-write/blob/e82440f372a2f2e3ed27d1bd34c9fa3a53b49b94/datasette_webhook_write/__init__.py#L88-L89 | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1087931918 | |
https://github.com/simonw/datasette/issues/1578#issuecomment-1000471782 | https://api.github.com/repos/simonw/datasette/issues/1578 | 1000471782 | IC_kwDOBm6k_c47ofzm | 9599 | 2021-12-23T18:44:01Z | 2021-12-23T18:44:01Z | OWNER | The example nginx config on https://docs.datasette.io/en/stable/deploying.html#nginx-proxy-configuration is currently: ``` daemon off; events { worker_connections 1024; } http { server { listen 80; location /my-datasette { proxy_pass http://127.0.0.1:8009/my-datasette; proxy_set_header Host $host; } } } ``` This looks to me like it might exhibit the bug. Need to confirm that and figure out an alternative. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1087919372 | |
https://github.com/simonw/datasette/issues/1578#issuecomment-1000471371 | https://api.github.com/repos/simonw/datasette/issues/1578 | 1000471371 | IC_kwDOBm6k_c47oftL | 9599 | 2021-12-23T18:42:50Z | 2021-12-23T18:42:50Z | OWNER | Confirmed, that fixed the bug for me on my server. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1087919372 | |
https://github.com/simonw/datasette/issues/1578#issuecomment-1000470652 | https://api.github.com/repos/simonw/datasette/issues/1578 | 1000470652 | IC_kwDOBm6k_c47ofh8 | 9599 | 2021-12-23T18:40:46Z | 2021-12-23T18:40:46Z | OWNER | [This StackOverflow answer](https://serverfault.com/a/463932) suggests that the fix is to change this: proxy_pass http://127.0.0.1:8000/; To this: proxy_pass http://127.0.0.1:8000; Quoting the nginx documentation: http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_pass > A request URI is passed to the server as follows: > > - If the `proxy_pass` directive is specified with a URI, then when a request is passed to the server, the part of a [normalized](http://nginx.org/en/docs/http/ngx_http_core_module.html#location) request URI matching the location is replaced by a URI specified in the directive: > > location /name/ { > proxy_pass http://127.0.0.1/remote/; > } > > - If `proxy_pass` is specified without a URI, the request URI is passed to the server in the same form as sent by a client when the original request is processed, or the full normalized request URI is passed when processing the changed URI: > > location /some/path/ { > proxy_pass http://127.0.0.1; > } | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1087919372 | |
https://github.com/simonw/datasette/issues/1578#issuecomment-1000469107 | https://api.github.com/repos/simonw/datasette/issues/1578 | 1000469107 | IC_kwDOBm6k_c47ofJz | 9599 | 2021-12-23T18:36:38Z | 2021-12-23T18:36:38Z | OWNER | This problem doesn't occur on my `localhost` running Uvicorn directly - but I'm seeing it in my production environment that runs Datasette behind an nginx proxy: ``` location / { proxy_pass http://127.0.0.1:8000/; proxy_set_header Host $host; } ``` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1087919372 | |
https://github.com/simonw/datasette/issues/1577#issuecomment-1000462309 | https://api.github.com/repos/simonw/datasette/issues/1577 | 1000462309 | IC_kwDOBm6k_c47odfl | 9599 | 2021-12-23T18:20:46Z | 2021-12-23T18:20:46Z | OWNER | There are a lot of improvements to `asyncio` in 3.7: https://docs.python.org/3/whatsnew/3.7.html#whatsnew37-asyncio | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1087913724 | |
https://github.com/simonw/datasette/issues/1577#issuecomment-1000461900 | https://api.github.com/repos/simonw/datasette/issues/1577 | 1000461900 | IC_kwDOBm6k_c47odZM | 9599 | 2021-12-23T18:19:44Z | 2021-12-23T18:19:44Z | OWNER | The 3.7 feature I want to use today is [contextvars](https://docs.python.org/3/library/contextvars.html) - but I have a workaround for the moment, see https://github.com/simonw/datasette/issues/1576#issuecomment-999987418 So I'm going to hold off on dropping 3.6 for a little bit longer. I imagine I'll drop it before Datasette 1.0 though. Leaving this issue open to gather thoughts and feedback on this issue from Datasette users and potential users. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1087913724 | |
https://github.com/simonw/datasette/issues/1577#issuecomment-1000461275 | https://api.github.com/repos/simonw/datasette/issues/1577 | 1000461275 | IC_kwDOBm6k_c47odPb | 9599 | 2021-12-23T18:18:11Z | 2021-12-23T18:18:11Z | OWNER | From the Twitter thread, there are still a decent number of LTS Linux releases out there that are stuck on pre-3.7 Python. Though many of those are 3.5 and Datasette dropped support for 3.5 in November 2019: cf7776d36fbacefa874cbd6e5fcdc9fff7661203 | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1087913724 | |
https://github.com/simonw/datasette/issues/1576#issuecomment-999990414 | https://api.github.com/repos/simonw/datasette/issues/1576 | 999990414 | IC_kwDOBm6k_c47mqSO | 9599 | 2021-12-23T02:08:39Z | 2021-12-23T18:16:35Z | OWNER | It's tiny: I'm tempted to vendor it. https://github.com/Skyscanner/aiotask-context/blob/master/aiotask_context/__init__.py No, I'll add it as a pinned dependency, which I can then drop when I drop 3.6 support. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1087181951 | |
https://github.com/simonw/datasette/issues/1576#issuecomment-999987418 | https://api.github.com/repos/simonw/datasette/issues/1576 | 999987418 | IC_kwDOBm6k_c47mpja | 9599 | 2021-12-23T01:59:58Z | 2021-12-23T02:02:12Z | OWNER | Another option: https://github.com/Skyscanner/aiotask-context - looks like it might be better as it's been updated for Python 3.7 in this commit https://github.com/Skyscanner/aiotask-context/commit/67108c91d2abb445655cc2af446fdb52ca7890c4 The Skyscanner one doesn't attempt to wrap any existing factories, but that's OK for my purposes since I don't need to handle arbitrary `asyncio` code written by other people. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1087181951 | |
https://github.com/simonw/datasette/issues/1576#issuecomment-999876666 | https://api.github.com/repos/simonw/datasette/issues/1576 | 999876666 | IC_kwDOBm6k_c47mOg6 | 9599 | 2021-12-22T20:59:22Z | 2021-12-22T21:18:09Z | OWNER | This article is relevant: [Context information storage for asyncio](https://blog.sqreen.com/asyncio/) - in particular the section https://blog.sqreen.com/asyncio/#context-inheritance-between-tasks which describes exactly the problem I have and their solution, which involves this trickery: ```python def request_task_factory(loop, coro): child_task = asyncio.tasks.Task(coro, loop=loop) parent_task = asyncio.Task.current_task(loop=loop) current_request = getattr(parent_task, 'current_request', None) setattr(child_task, 'current_request', current_request) return child_task loop = asyncio.get_event_loop() loop.set_task_factory(request_task_factory) ``` They released their solution as a library: https://pypi.org/project/aiocontext/ and https://github.com/sqreen/AioContext - but that company was acquired by Datadog back in April and doesn't seem to be actively maintaining their open source stuff any more: https://twitter.com/SqreenIO/status/1384906075506364417 | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1087181951 | |
https://github.com/simonw/datasette/issues/1576#issuecomment-999878907 | https://api.github.com/repos/simonw/datasette/issues/1576 | 999878907 | IC_kwDOBm6k_c47mPD7 | 9599 | 2021-12-22T21:03:49Z | 2021-12-22T21:10:46Z | OWNER | `contextvars` can solve this but they were introduced in Python 3.7: https://www.python.org/dev/peps/pep-0567/ Python 3.6 support ends in a few days' time, and it looks like Glitch has updated to 3.7 now - so maybe I can get away with Datasette needing 3.7 these days? Tweeted about that here: https://twitter.com/simonw/status/1473761478155010048 | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1087181951 | |
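The `contextvars` approach mentioned above can be sketched with stdlib code alone: every task created by `asyncio.gather()` receives a copy of the parent task's context, so a value set in the request handler is visible in child tasks without any task-factory trickery (the variable and function names here are illustrative, not Datasette's):

```python
import asyncio
from contextvars import ContextVar

# Illustrative stand-in for Datasette's per-request state
current_request: ContextVar[str] = ContextVar("current_request", default="none")

async def child(name: str) -> str:
    # Each gathered task inherits a snapshot of the parent's context,
    # so current_request still holds the value set in handle_request()
    return f"{name}:{current_request.get()}"

async def handle_request(request_id: str) -> list:
    current_request.set(request_id)
    return await asyncio.gather(child("facets"), child("count"))

print(asyncio.run(handle_request("req-1")))  # ['facets:req-1', 'count:req-1']
```

This is exactly the 3.7+ feature that makes the version bump tempting: no wrapping of task factories is needed at all.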
https://github.com/simonw/datasette/issues/1576#issuecomment-999874886 | https://api.github.com/repos/simonw/datasette/issues/1576 | 999874886 | IC_kwDOBm6k_c47mOFG | 9599 | 2021-12-22T20:55:42Z | 2021-12-22T20:57:28Z | OWNER | One way to solve this would be to introduce a `set_task_id()` method, which sets an ID which will be returned by `get_task_id()` instead of using `id(current_task(loop=loop))`. It would be really nice if I could solve this using `with` syntax somehow. Something like: ```python with trace_child_tasks(): ( suggested_facets, (facet_results, facets_timed_out), ) = await asyncio.gather( execute_suggested_facets(), execute_facets(), ) ``` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1087181951 | |
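The `with trace_child_tasks():` syntax wished for above could be prototyped on top of `contextvars` (3.7+): pin the current task's ID in a context variable, and have `get_task_id()` prefer the inherited override. A hypothetical sketch, not Datasette's actual implementation:

```python
import asyncio
import contextlib
from contextvars import ContextVar

_task_id_override: ContextVar = ContextVar("_task_id_override", default=None)

def get_task_id():
    # Prefer an inherited override so child tasks report the parent's ID
    override = _task_id_override.get()
    return override if override is not None else id(asyncio.current_task())

@contextlib.contextmanager
def trace_child_tasks():
    # Tasks created inside this block copy the context, including the override
    token = _task_id_override.set(id(asyncio.current_task()))
    try:
        yield
    finally:
        _task_id_override.reset(token)

async def child():
    return get_task_id()

async def parent():
    with trace_child_tasks():
        a, b = await asyncio.gather(child(), child())
    # Both children reported the parent task's ID
    return a == b == id(asyncio.current_task())

print(asyncio.run(parent()))  # True
```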
https://github.com/simonw/datasette/issues/1576#issuecomment-999874484 | https://api.github.com/repos/simonw/datasette/issues/1576 | 999874484 | IC_kwDOBm6k_c47mN-0 | 9599 | 2021-12-22T20:54:52Z | 2021-12-22T20:54:52Z | OWNER | Here's the full current relevant code from `tracer.py`: https://github.com/simonw/datasette/blob/ace86566b28280091b3844cf5fbecd20158e9004/datasette/tracer.py#L8-L64 | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1087181951 | |
https://github.com/simonw/datasette/issues/1518#issuecomment-999870993 | https://api.github.com/repos/simonw/datasette/issues/1518 | 999870993 | IC_kwDOBm6k_c47mNIR | 9599 | 2021-12-22T20:47:18Z | 2021-12-22T20:50:24Z | OWNER | The reason they aren't showing up in the traces is that traces are stored just for the currently executing `asyncio` task ID: https://github.com/simonw/datasette/blob/ace86566b28280091b3844cf5fbecd20158e9004/datasette/tracer.py#L13-L25 This is so traces for other incoming requests don't end up mixed together. But there's no current mechanism to track async tasks that are effectively "child tasks" of the current request, and hence should be tracked the same. https://stackoverflow.com/a/69349501/6083 suggests that you pass the task ID as an argument to the child tasks that are executed using `asyncio.gather()` to work around this kind of problem. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058072543 | |
https://github.com/simonw/datasette/issues/1518#issuecomment-999870282 | https://api.github.com/repos/simonw/datasette/issues/1518 | 999870282 | IC_kwDOBm6k_c47mM9K | 9599 | 2021-12-22T20:45:56Z | 2021-12-22T20:46:08Z | OWNER | > New short-term goal: get facets and suggested facets to execute in parallel with the main query. Generate a trace graph that proves that is happening using `datasette-pretty-traces`. I wrote code to execute those in parallel using `asyncio.gather()` - which seems to work but causes the SQL run inside the parallel `async def` functions not to show up in the trace graph at all. ```diff diff --git a/datasette/views/table.py b/datasette/views/table.py index 9808fd2..ec9db64 100644 --- a/datasette/views/table.py +++ b/datasette/views/table.py @@ -1,3 +1,4 @@ +import asyncio import urllib import itertools import json @@ -615,44 +616,37 @@ class TableView(RowTableShared): if request.args.get("_timelimit"): extra_args["custom_time_limit"] = int(request.args.get("_timelimit")) - # Execute the main query! - results = await db.execute(sql, params, truncate=True, **extra_args) - - # Calculate the total count for this query - filtered_table_rows_count = None - if ( - not db.is_mutable - and self.ds.inspect_data - and count_sql == f"select count(*) from {table} " - ): - # We can use a previously cached table row count - try: - filtered_table_rows_count = self.ds.inspect_data[database]["tables"][ - table - ]["count"] - except KeyError: - pass - - # Otherwise run a select count(*) ... 
- if count_sql and filtered_table_rows_count is None and not nocount: - try: - count_rows = list(await db.execute(count_sql, from_sql_params)) - filtered_table_rows_count = count_rows[0][0] - except QueryInterrupted: - pass - - # Faceting - if not self.ds.setting("allow_facet") and any( - arg.startswith("_facet") for arg in request.args - ): - raise BadRequest("_facet= is not allo… | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058072543 | |
https://github.com/simonw/datasette/issues/1518#issuecomment-999863269 | https://api.github.com/repos/simonw/datasette/issues/1518 | 999863269 | IC_kwDOBm6k_c47mLPl | 9599 | 2021-12-22T20:35:41Z | 2021-12-22T20:37:13Z | OWNER | It looks like the count has to be executed before facets can be, because the facet_class constructor needs that total count figure: https://github.com/simonw/datasette/blob/6b1384b2f529134998fb507e63307609a5b7f5c0/datasette/views/table.py#L660-L671 It's used in facet suggestion logic here: https://github.com/simonw/datasette/blob/ace86566b28280091b3844cf5fbecd20158e9004/datasette/facets.py#L172-L178 | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058072543 | |
https://github.com/simonw/datasette/issues/1518#issuecomment-999850191 | https://api.github.com/repos/simonw/datasette/issues/1518 | 999850191 | IC_kwDOBm6k_c47mIDP | 9599 | 2021-12-22T20:29:38Z | 2021-12-22T20:29:38Z | OWNER | New short-term goal: get facets and suggested facets to execute in parallel with the main query. Generate a trace graph that proves that is happening using `datasette-pretty-traces`. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058072543 | |
https://github.com/simonw/datasette/issues/1518#issuecomment-999837569 | https://api.github.com/repos/simonw/datasette/issues/1518 | 999837569 | IC_kwDOBm6k_c47mE-B | 9599 | 2021-12-22T20:15:45Z | 2021-12-22T20:15:45Z | OWNER | Also the whole `special_args` v.s. `request.args` thing is pretty confusing, I think that might be an older code pattern back from when I was using Sanic. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058072543 | |
https://github.com/simonw/datasette/issues/1518#issuecomment-999837220 | https://api.github.com/repos/simonw/datasette/issues/1518 | 999837220 | IC_kwDOBm6k_c47mE4k | 9599 | 2021-12-22T20:15:04Z | 2021-12-22T20:15:04Z | OWNER | I think I can move this much higher up in the method, it's a bit confusing having it half way through: https://github.com/simonw/datasette/blob/6b1384b2f529134998fb507e63307609a5b7f5c0/datasette/views/table.py#L414-L436 | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058072543 | |
https://github.com/simonw/datasette/issues/1518#issuecomment-999831967 | https://api.github.com/repos/simonw/datasette/issues/1518 | 999831967 | IC_kwDOBm6k_c47mDmf | 9599 | 2021-12-22T20:04:47Z | 2021-12-22T20:10:11Z | OWNER | I think I might be able to clean up a lot of the stuff in here using the `render_cell` plugin hook: https://github.com/simonw/datasette/blob/6b1384b2f529134998fb507e63307609a5b7f5c0/datasette/views/table.py#L87-L89 The catch with that hook - https://docs.datasette.io/en/stable/plugin_hooks.html#render-cell-value-column-table-database-datasette - is that it gets called for every single cell. I don't want the overhead of looking up the foreign key relationships etc once for every value in a specific column. But maybe I could extend the hook to include a shared cache that gets used for all of the cells in a specific table? Something like this: ```python render_cell(value, column, table, database, datasette, cache) ``` `cache` is a dictionary - and the same dictionary is passed to every call to that hook while rendering a specific page. It's a bit of a gross hack though, and would it ever be useful for plugins outside of the default plugin in Datasette which does the foreign key stuff? If I can think of one other potential application for this `cache` then I might implement it. No, this optimization doesn't make sense: the most complex cell enrichment logic is the stuff that does a `select * from categories where id in (2, 5, 6)` query, using just the distinct set of IDs that are rendered on the current page. That's not going to fit in the `render_cell` hook no matter how hard I try to warp it into the right shape, because it needs full visibility of all of the results that are being rendered in order to collect those unique ID values. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058072543 | |
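The shared-cache variant of `render_cell` floated above — a hypothetical extra `cache` argument, not the hook's real signature — boils down to memoizing per-table introspection across all the cells of one page render:

```python
introspection_calls = 0

def introspect_foreign_keys(table):
    # Stand-in for the expensive per-table foreign key lookup
    global introspection_calls
    introspection_calls += 1
    return {"category_id": "categories"}

def render_cell(value, column, table, cache):
    # Hypothetical `cache` argument: one dict shared per page render,
    # so introspection runs once per table rather than once per cell
    if table not in cache:
        cache[table] = introspect_foreign_keys(table)
    fks = cache[table]
    return f"{value} (fk)" if column in fks else str(value)

cache = {}  # created once per page
rows = [{"id": i, "category_id": i % 3} for i in range(100)]
rendered = [
    render_cell(v, c, "products", cache) for row in rows for c, v in row.items()
]
print(introspection_calls)  # 1
```

As the comment concludes, though, this still can't express the batched `select * from categories where id in (...)` enrichment, which needs to see the whole result set at once.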
https://github.com/simonw/datasette/issues/1181#issuecomment-998999230 | https://api.github.com/repos/simonw/datasette/issues/1181 | 998999230 | IC_kwDOBm6k_c47i4S- | 9308268 | 2021-12-21T18:25:15Z | 2021-12-21T18:25:15Z | NONE | I wonder if I'm encountering the same bug (or something related). I had previously been using the .csv feature to run queries and then fetch results for the pandas `read_csv()` function, but it seems to have stopped working recently. https://ilsweb.cincinnatilibrary.org/collection-analysis/collection-analysis/current_collection-3d56dbf.csv?sql=select%0D%0A++*%0D%0Afrom%0D%0A++bib%0D%0Alimit%0D%0A++100&_size=max Datasette v0.59.4 ![image](https://user-images.githubusercontent.com/9308268/146979957-66911877-2cd9-4022-bc76-fd54e4a3a6f7.png) | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
781262510 | |
https://github.com/simonw/datasette/pull/1554#issuecomment-998354538 | https://api.github.com/repos/simonw/datasette/issues/1554 | 998354538 | IC_kwDOBm6k_c47ga5q | 9599 | 2021-12-20T23:52:04Z | 2021-12-20T23:52:04Z | OWNER | Abandoning this since it didn't work how I wanted. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1079129258 | |
https://github.com/simonw/datasette/issues/1547#issuecomment-997519202 | https://api.github.com/repos/simonw/datasette/issues/1547 | 997519202 | IC_kwDOBm6k_c47dO9i | 127565 | 2021-12-20T01:36:58Z | 2021-12-20T01:36:58Z | CONTRIBUTOR | Yep, that works -- thanks! | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1076388044 | |
https://github.com/simonw/datasette/issues/1547#issuecomment-997514220 | https://api.github.com/repos/simonw/datasette/issues/1547 | 997514220 | IC_kwDOBm6k_c47dNvs | 9599 | 2021-12-20T01:26:25Z | 2021-12-20T01:26:25Z | OWNER | OK, this should hopefully fix that for you: pip install https://github.com/simonw/datasette/archive/f36e010b3b69ada104b79d83c7685caf9359049e.zip | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1076388044 | |
https://github.com/simonw/datasette/issues/1547#issuecomment-997513369 | https://api.github.com/repos/simonw/datasette/issues/1547 | 997513369 | IC_kwDOBm6k_c47dNiZ | 9599 | 2021-12-20T01:24:43Z | 2021-12-20T01:24:43Z | OWNER | @wragge thanks, that's a bug! Working on that in #1575. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1076388044 | |
https://github.com/simonw/datasette/issues/1575#issuecomment-997513177 | https://api.github.com/repos/simonw/datasette/issues/1575 | 997513177 | IC_kwDOBm6k_c47dNfZ | 9599 | 2021-12-20T01:24:25Z | 2021-12-20T01:24:25Z | OWNER | Looks like `specname` is new in Pluggy 1.0: https://github.com/pytest-dev/pluggy/blob/main/CHANGELOG.rst#pluggy-100-2021-08-25 | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1084257842 | |
https://github.com/simonw/datasette/issues/1547#issuecomment-997511968 | https://api.github.com/repos/simonw/datasette/issues/1547 | 997511968 | IC_kwDOBm6k_c47dNMg | 127565 | 2021-12-20T01:21:59Z | 2021-12-20T01:21:59Z | CONTRIBUTOR | I've installed the alpha version but get an error when starting up Datasette: ``` Traceback (most recent call last): File "/Users/tim/.pyenv/versions/stock-exchange/bin/datasette", line 5, in <module> from datasette.cli import cli File "/Users/tim/.pyenv/versions/3.8.5/envs/stock-exchange/lib/python3.8/site-packages/datasette/cli.py", line 15, in <module> from .app import Datasette, DEFAULT_SETTINGS, SETTINGS, SQLITE_LIMIT_ATTACHED, pm File "/Users/tim/.pyenv/versions/3.8.5/envs/stock-exchange/lib/python3.8/site-packages/datasette/app.py", line 31, in <module> from .views.database import DatabaseDownload, DatabaseView File "/Users/tim/.pyenv/versions/3.8.5/envs/stock-exchange/lib/python3.8/site-packages/datasette/views/database.py", line 25, in <module> from datasette.plugins import pm File "/Users/tim/.pyenv/versions/3.8.5/envs/stock-exchange/lib/python3.8/site-packages/datasette/plugins.py", line 29, in <module> mod = importlib.import_module(plugin) File "/Users/tim/.pyenv/versions/3.8.5/lib/python3.8/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "/Users/tim/.pyenv/versions/3.8.5/envs/stock-exchange/lib/python3.8/site-packages/datasette/filters.py", line 9, in <module> @hookimpl(specname="filters_from_request") TypeError: __call__() got an unexpected keyword argument 'specname' ``` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1076388044 | |
https://github.com/simonw/sqlite-utils/issues/356#issuecomment-997507074 | https://api.github.com/repos/simonw/sqlite-utils/issues/356 | 997507074 | IC_kwDOCGYnMM47dMAC | 9599 | 2021-12-20T01:10:06Z | 2021-12-20T01:16:11Z | OWNER | Work-in-progress improved help: ``` Usage: sqlite-utils insert [OPTIONS] PATH TABLE FILE Insert records from FILE into a table, creating the table if it does not already exist. By default the input is expected to be a JSON array of objects. Or: - Use --nl for newline-delimited JSON objects - Use --csv or --tsv for comma-separated or tab-separated input - Use --lines to write each incoming line to a column called "line" - Use --all to write the entire input to a column called "all" You can also use --convert to pass a fragment of Python code that will be used to convert each input. Your Python code will be passed a "row" variable representing the imported row, and can return a modified row. If you are using --lines your code will be passed a "line" variable, and for --all an "all" variable. Options: --pk TEXT Columns to use as the primary key, e.g. id --flatten Flatten nested JSON objects, so {"a": {"b": 1}} becomes {"a_b": 1} --nl Expect newline-delimited JSON -c, --csv Expect CSV input --tsv Expect TSV input --lines Treat each line as a single value called 'line' --all Treat input as a single value called 'all' --convert TEXT Python code to convert each item --import TEXT Python modules to import --delimiter TEXT Delimiter to use for CSV files --quotechar TEXT Quote character to use for CSV/TSV --sniff Detect delimiter and quote character --no-headers CSV file has no header row --batch-size INTEGER Commit every X records --alter Alter existing table to add any missing columns --not-null TEXT Columns that should be created as NOT NULL --default <TEXT TEXT>... 
Default value that should be set for a column --encoding TEXT Character encoding… | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1077431957 | |
https://github.com/simonw/sqlite-utils/issues/356#issuecomment-997508728 | https://api.github.com/repos/simonw/sqlite-utils/issues/356 | 997508728 | IC_kwDOCGYnMM47dMZ4 | 9599 | 2021-12-20T01:14:43Z | 2021-12-20T01:14:43Z | OWNER | (This makes me want `--extract` from #352 even more.) | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1077431957 | |
https://github.com/simonw/sqlite-utils/issues/163#issuecomment-997502242 | https://api.github.com/repos/simonw/sqlite-utils/issues/163 | 997502242 | IC_kwDOCGYnMM47dK0i | 9599 | 2021-12-20T00:56:45Z | 2021-12-20T00:56:52Z | OWNER | > Maybe `sqlite-utils` should absorb all of the functionality from `sqlite-transform` - having two separate tools doesn't necessarily make sense. I implemented that in: - #251 | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
706001517 | |
https://github.com/simonw/sqlite-utils/issues/356#issuecomment-997497262 | https://api.github.com/repos/simonw/sqlite-utils/issues/356 | 997497262 | IC_kwDOCGYnMM47dJmu | 9599 | 2021-12-20T00:40:15Z | 2021-12-20T00:40:15Z | OWNER | `--flatten` could do with a better description too. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1077431957 | |
https://github.com/simonw/sqlite-utils/issues/356#issuecomment-997496931 | https://api.github.com/repos/simonw/sqlite-utils/issues/356 | 997496931 | IC_kwDOCGYnMM47dJhj | 9599 | 2021-12-20T00:39:14Z | 2021-12-20T00:39:52Z | OWNER | ``` % sqlite-utils insert --help Usage: sqlite-utils insert [OPTIONS] PATH TABLE JSON_FILE Insert records from JSON file into a table, creating the table if it does not already exist. Input should be a JSON array of objects, unless --nl or --csv is used. Options: --pk TEXT Columns to use as the primary key, e.g. id --nl Expect newline-delimited JSON --flatten Flatten nested JSON objects -c, --csv Expect CSV --tsv Expect TSV --convert TEXT Python code to convert each item --import TEXT Python modules to import --delimiter TEXT Delimiter to use for CSV files --quotechar TEXT Quote character to use for CSV/TSV --sniff Detect delimiter and quote character --no-headers CSV file has no header row --batch-size INTEGER Commit every X records --alter Alter existing table to add any missing columns --not-null TEXT Columns that should be created as NOT NULL --default <TEXT TEXT>... Default value that should be set for a column --encoding TEXT Character encoding for input, defaults to utf-8 -d, --detect-types Detect types for columns in CSV/TSV data --load-extension TEXT SQLite extensions to load --silent Do not show progress bar --ignore Ignore records if pk already exists --replace Replace records if pk already exists --truncate Truncate table before inserting records, if table already exists -h, --help Show this message and exit. ``` I can add a bunch of extra help at the top there to explain all of this stuff. That "Input should be a JSON array of objects" bit could be expanded to several paragraphs. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1077431957 | |
https://github.com/simonw/sqlite-utils/issues/356#issuecomment-997492872 | https://api.github.com/repos/simonw/sqlite-utils/issues/356 | 997492872 | IC_kwDOCGYnMM47dIiI | 9599 | 2021-12-20T00:23:31Z | 2021-12-20T00:23:31Z | OWNER | I think this should work on JSON, or CSV, or individual lines, or the entire content at once. So I'll require `--lines --convert ...` to import individual lines, or `--all --convert` to run the conversion against the entire input at once. What would `--lines` or `--all` do without `--convert`? Maybe insert records as `{"line": "line of text"}` or `{"all": "whole input"}`. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1077431957 | |
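The proposed default behaviors reduce to simple inserts; a stdlib `sqlite3` sketch of what `--lines` and `--all` would produce without `--convert` (table names are arbitrary):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
text = "first line\nsecond line\nthird line"

# Equivalent of --lines: one {"line": ...} record per input line
conn.execute("create table lines (line text)")
conn.executemany(
    "insert into lines values (?)", ((line,) for line in text.splitlines())
)

# Equivalent of --all: the entire input as a single record
# ("all" is a SQL keyword, hence the brackets)
conn.execute("create table everything ([all] text)")
conn.execute("insert into everything values (?)", (text,))

print(conn.execute("select count(*) from lines").fetchone()[0])  # 3
```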
https://github.com/simonw/sqlite-utils/issues/356#issuecomment-997486156 | https://api.github.com/repos/simonw/sqlite-utils/issues/356 | 997486156 | IC_kwDOCGYnMM47dG5M | 9599 | 2021-12-19T23:51:02Z | 2021-12-19T23:51:02Z | OWNER | This is going to need a `--import` multi option too. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1077431957 | |
https://github.com/simonw/sqlite-utils/issues/356#issuecomment-997485361 | https://api.github.com/repos/simonw/sqlite-utils/issues/356 | 997485361 | IC_kwDOCGYnMM47dGsx | 9599 | 2021-12-19T23:45:30Z | 2021-12-19T23:45:30Z | OWNER | Really interesting example input for this: https://blog.timac.org/2021/1219-state-of-swift-and-swiftui-ios15/iOS13.txt - see https://blog.timac.org/2021/1219-state-of-swift-and-swiftui-ios15/ | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1077431957 | |
https://github.com/simonw/datasette/issues/1565#issuecomment-997474022 | https://api.github.com/repos/simonw/datasette/issues/1565 | 997474022 | IC_kwDOBm6k_c47dD7m | 9599 | 2021-12-19T22:36:49Z | 2021-12-19T22:37:29Z | OWNER | There's no way to pass an extra database name argument through a tagged template literal, so instead I need a method that returns a callable that can be used for the tagged template literal for a specific database - or the default database. This could work (bit weird looking though): ```javascript var rows = await datasette.query("fixtures")`select * from foo`; ``` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1083657868 | |
https://github.com/simonw/datasette/issues/1565#issuecomment-997473856 | https://api.github.com/repos/simonw/datasette/issues/1565 | 997473856 | IC_kwDOBm6k_c47dD5A | 9599 | 2021-12-19T22:35:20Z | 2021-12-19T22:35:20Z | OWNER | Quick prototype of that tagged template `query` function: ```javascript function query(pieces, ...parameters) { var qs = new URLSearchParams(); var sql = pieces[0]; parameters.forEach((param, i) => { sql += `:p${i}${pieces[i + 1]}`; qs.append(`p${i}`, param); }); qs.append("sql", sql); return qs.toString(); } var id = 4; console.log(query`select * from ids where id > ${id}`); ``` Outputs: ``` p0=4&sql=select+*+from+ids+where+id+%3E+%3Ap0 ``` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1083657868 | |
https://github.com/simonw/datasette/issues/1565#issuecomment-997472639 | https://api.github.com/repos/simonw/datasette/issues/1565 | 997472639 | IC_kwDOBm6k_c47dDl_ | 9599 | 2021-12-19T22:25:50Z | 2021-12-19T22:25:50Z | OWNER | Or... ```javascript rows = await datasette.query`select * from searchable where id > ${id}`; ``` And it knows how to turn that into a parameterized call using tagged template literals. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1083657868 | |
https://github.com/simonw/datasette/issues/1565#issuecomment-997472509 | https://api.github.com/repos/simonw/datasette/issues/1565 | 997472509 | IC_kwDOBm6k_c47dDj9 | 9599 | 2021-12-19T22:24:50Z | 2021-12-19T22:24:50Z | OWNER | ... huh, it could even expose a JavaScript function that can be called to execute a SQL query. ```javascript datasette.query("select * from blah").then(...) ``` Maybe it takes an optional second argument that specifies the database - defaulting to the one for the current page. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1083657868 | |
https://github.com/simonw/datasette/issues/1565#issuecomment-997472370 | https://api.github.com/repos/simonw/datasette/issues/1565 | 997472370 | IC_kwDOBm6k_c47dDhy | 9599 | 2021-12-19T22:23:36Z | 2021-12-19T22:23:36Z | OWNER | This should also expose the JSON API endpoints used to execute SQL against this database. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1083657868 | |
https://github.com/simonw/datasette/issues/1518#issuecomment-997472214 | https://api.github.com/repos/simonw/datasette/issues/1518 | 997472214 | IC_kwDOBm6k_c47dDfW | 9599 | 2021-12-19T22:22:08Z | 2021-12-19T22:22:08Z | OWNER | I sketched out a chained SQL builder pattern that might be useful for further tidying up this code - though with the new plugin hook I'm less excited about it than I was: ```python class TableQuery: def __init__(self, table, columns, pks, is_view=False, prev=None): self.table = table self.columns = columns self.pks = pks self.is_view = is_view self.prev = prev # These can be changed for different instances in the chain: self._where_clauses = None self._order_by = None self._page_size = None self._offset = None self._select_columns = None self.select_all_columns = '*' self.select_specified_columns = '*' @property def where_clauses(self): wheres = [] current = self while current: if current._where_clauses is not None: wheres.extend(current._where_clauses) current = current.prev return list(reversed(wheres)) def where(self, where): new_cls = TableQuery(self.table, self.columns, self.pks, self.is_view, self) new_cls._where_clauses = [where] return new_cls @classmethod async def introspect(cls, db, table): return cls( table, columns = await db.table_columns(table), pks = await db.primary_keys(table), is_view = bool(await db.get_view_definition(table)) ) @property def sql_from(self): return f"from {self.table}{self.sql_where}" @property def sql_where(self): if not self.where_clauses: return "" else: return f" where {' and '.join(self.where_clauses)}" @property def sql_no_order_no_limit(self): return f"select {self.select_all_columns} from {self.table}{self.sql_where}" @property def sql(self): return f"select {self.select_specified_columns} from {se… | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1058072543 | |
https://github.com/simonw/datasette/issues/1547#issuecomment-997471672 | https://api.github.com/repos/simonw/datasette/issues/1547 | 997471672 | IC_kwDOBm6k_c47dDW4 | 9599 | 2021-12-19T22:18:26Z | 2021-12-19T22:18:26Z | OWNER | I released this [in an alpha](https://github.com/simonw/datasette/releases/tag/0.60a1), so you can try out this fix using: pip install datasette==0.60a1 | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1076388044 | |
https://github.com/simonw/datasette/issues/1566#issuecomment-997470633 | https://api.github.com/repos/simonw/datasette/issues/1566 | 997470633 | IC_kwDOBm6k_c47dDGp | 9599 | 2021-12-19T22:12:00Z | 2021-12-19T22:12:00Z | OWNER | Released another alpha, 0.60a1: https://github.com/simonw/datasette/releases/tag/0.60a1 | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1083669410 | |
https://github.com/simonw/datasette/issues/1545#issuecomment-997462604 | https://api.github.com/repos/simonw/datasette/issues/1545 | 997462604 | IC_kwDOBm6k_c47dBJM | 9599 | 2021-12-19T21:17:08Z | 2021-12-19T21:17:08Z | OWNER | Here's the relevant code: https://github.com/simonw/datasette/blob/4094741c2881c2ada3f3f878b532fdaec7914953/datasette/app.py#L1204-L1219 It's using `route_path.split("/")` which should be OK because that's the incoming `request.path` path - which I would expect to use `/` even on Windows. Then it uses `os.path.join` which should do the right thing. I need to get myself a proper Windows development environment set up to investigate this one. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1075893249 | |
https://github.com/simonw/datasette/issues/1573#issuecomment-997462117 | https://api.github.com/repos/simonw/datasette/issues/1573 | 997462117 | IC_kwDOBm6k_c47dBBl | 9599 | 2021-12-19T21:13:13Z | 2021-12-19T21:13:13Z | OWNER | This might also be the impetus I need to bring the https://datasette.io/plugins/datasette-pretty-traces plugin into Datasette core itself. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1084185188 | |
https://github.com/simonw/datasette/issues/1547#issuecomment-997460731 | https://api.github.com/repos/simonw/datasette/issues/1547 | 997460731 | IC_kwDOBm6k_c47dAr7 | 9599 | 2021-12-19T21:02:15Z | 2021-12-19T21:02:15Z | OWNER | Yes, this is a bug. It looks like the problem is with the `if write:` branch in this code here: https://github.com/simonw/datasette/blob/5fac26aa221a111d7633f2dd92014641f7c0ade9/datasette/views/database.py#L252-L327 Is missing this bit of code: https://github.com/simonw/datasette/blob/5fac26aa221a111d7633f2dd92014641f7c0ade9/datasette/views/database.py#L343-L347 | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1076388044 | |
https://github.com/simonw/datasette/issues/1570#issuecomment-997460061 | https://api.github.com/repos/simonw/datasette/issues/1570 | 997460061 | IC_kwDOBm6k_c47dAhd | 9599 | 2021-12-19T20:56:54Z | 2021-12-19T20:56:54Z | OWNER | Documentation: https://docs.datasette.io/en/latest/internals.html#await-db-execute-write-sql-params-none-block-false | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1083921371 | |
https://github.com/simonw/datasette/issues/1555#issuecomment-997459958 | https://api.github.com/repos/simonw/datasette/issues/1555 | 997459958 | IC_kwDOBm6k_c47dAf2 | 9599 | 2021-12-19T20:55:59Z | 2021-12-19T20:55:59Z | OWNER | Closing this issue because I've optimized this a whole bunch, and it's definitely good enough for the moment. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1079149656 | |
https://github.com/simonw/datasette/issues/1555#issuecomment-997325189 | https://api.github.com/repos/simonw/datasette/issues/1555 | 997325189 | IC_kwDOBm6k_c47cfmF | 9599 | 2021-12-19T03:55:01Z | 2021-12-19T20:54:51Z | OWNER | It's a bit annoying that the queries no longer show up in the trace at all now, thanks to running in `.execute_fn()`. I wonder if there's something smart I can do about that - maybe have `trace()` record that function with a traceback even though it doesn't have the executed SQL string? 5fac26aa221a111d7633f2dd92014641f7c0ade9 has the same problem. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1079149656 | |
https://github.com/simonw/datasette/issues/1555#issuecomment-997459637 | https://api.github.com/repos/simonw/datasette/issues/1555 | 997459637 | IC_kwDOBm6k_c47dAa1 | 9599 | 2021-12-19T20:53:46Z | 2021-12-19T20:53:46Z | OWNER | Using #1571 showed me that the `DELETE FROM columns/foreign_keys/indexes WHERE database_name = ? and table_name = ?` queries were running way more times than I expected. I came up with a new optimization that just does `DELETE FROM columns/foreign_keys/indexes WHERE database_name = ?` instead. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1079149656 | |
https://github.com/simonw/datasette/issues/1566#issuecomment-997457790 | https://api.github.com/repos/simonw/datasette/issues/1566 | 997457790 | IC_kwDOBm6k_c47c_9- | 9599 | 2021-12-19T20:40:50Z | 2021-12-19T20:40:57Z | OWNER | Also release new version of `datasette-pretty-traces` with this feature: - https://github.com/simonw/datasette-pretty-traces/issues/7 | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1083669410 | |
https://github.com/simonw/datasette/issues/1555#issuecomment-997342494 | https://api.github.com/repos/simonw/datasette/issues/1555 | 997342494 | IC_kwDOBm6k_c47cj0e | 9599 | 2021-12-19T07:22:04Z | 2021-12-19T07:22:04Z | OWNER | Another option would be to provide an abstraction that makes it easier to run a group of SQL queries in the same thread at the same time, and have them traced correctly. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1079149656 | |
https://github.com/simonw/datasette/issues/1555#issuecomment-997324666 | https://api.github.com/repos/simonw/datasette/issues/1555 | 997324666 | IC_kwDOBm6k_c47cfd6 | 9599 | 2021-12-19T03:47:51Z | 2021-12-19T03:48:09Z | OWNER | Here's a hacked together prototype of running all of that stuff inside a single function passed to `.execute_fn()`: ```diff diff --git a/datasette/utils/internal_db.py b/datasette/utils/internal_db.py index 95055d8..58f9982 100644 --- a/datasette/utils/internal_db.py +++ b/datasette/utils/internal_db.py @@ -1,4 +1,5 @@ import textwrap +from datasette.utils import table_column_details async def init_internal_db(db): @@ -70,49 +71,70 @@ async def populate_schema_tables(internal_db, db): "DELETE FROM tables WHERE database_name = ?", [database_name], block=True ) tables = (await db.execute("select * from sqlite_master WHERE type = 'table'")).rows - tables_to_insert = [] - columns_to_delete = [] - columns_to_insert = [] - foreign_keys_to_delete = [] - foreign_keys_to_insert = [] - indexes_to_delete = [] - indexes_to_insert = [] - for table in tables: - table_name = table["name"] - tables_to_insert.append( - (database_name, table_name, table["rootpage"], table["sql"]) - ) - columns_to_delete.append((database_name, table_name)) - columns = await db.table_column_details(table_name) - columns_to_insert.extend( - { - **{"database_name": database_name, "table_name": table_name}, - **column._asdict(), - } - for column in columns - ) - foreign_keys_to_delete.append((database_name, table_name)) - foreign_keys = ( - await db.execute(f"PRAGMA foreign_key_list([{table_name}])") - ).rows - foreign_keys_to_insert.extend( - { - **{"database_name": database_name, "table_name": table_name}, - **dict(foreign_key), - } - for foreign_key in foreign_keys - ) - indexes_to_delete.append((database_name, table_name)) - indexes = (await db.execute(f"PRAGMA index_list([{table_name}])")).rows - … | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1079149656 | |
https://github.com/simonw/datasette/issues/1555#issuecomment-997324156 | https://api.github.com/repos/simonw/datasette/issues/1555 | 997324156 | IC_kwDOBm6k_c47cfV8 | 9599 | 2021-12-19T03:40:05Z | 2021-12-19T03:40:05Z | OWNER | Using the prototype of this: - https://github.com/simonw/datasette-pretty-traces/issues/5 I'm seeing about 180ms spent running all of these queries on startup! ![CleanShot 2021-12-18 at 19 38 37@2x](https://user-images.githubusercontent.com/9599/146663045-46bda669-90de-474f-8870-345182725dc1.png) | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1079149656 | |
https://github.com/simonw/datasette/issues/1555#issuecomment-997321767 | https://api.github.com/repos/simonw/datasette/issues/1555 | 997321767 | IC_kwDOBm6k_c47cewn | 9599 | 2021-12-19T03:10:58Z | 2021-12-19T03:10:58Z | OWNER | I wonder how much overhead there is switching between the `async` event loop main code and the thread that runs the SQL queries. Would there be a performance boost if I gathered all of the column/index information in a single function run on the thread using `db.execute_fn()` I wonder? It would eliminate a bunch of switching between threads. Would be great to understand how much of an impact that would have. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1079149656 | |
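The batching idea in the comment above can be sketched with a plain `sqlite3` connection. The `gather_metadata` function here is hypothetical (Datasette's `db.execute_fn(fn)` would call `fn(conn)` on the database thread); the point is that all the per-table PRAGMA queries run in one function call, with a single hop between the event loop and the thread instead of one per query.

```python
import sqlite3

# Gather all per-table metadata inside ONE function executed on the
# database thread, instead of awaiting a separate query per table
# from the event loop. gather_metadata is a hypothetical sketch.
def gather_metadata(conn):
    tables = [
        row[0]
        for row in conn.execute(
            "select name from sqlite_master where type = 'table'"
        )
    ]
    info = {}
    for table in tables:
        info[table] = {
            "columns": conn.execute(f"PRAGMA table_info([{table}])").fetchall(),
            "foreign_keys": conn.execute(
                f"PRAGMA foreign_key_list([{table}])"
            ).fetchall(),
            "indexes": conn.execute(f"PRAGMA index_list([{table}])").fetchall(),
        }
    return info

conn = sqlite3.connect(":memory:")
conn.execute("create table docs (id integer primary key, body text)")
metadata = gather_metadata(conn)
```

In Datasette terms the call site would be something like `await db.execute_fn(gather_metadata)`, replacing many awaited queries with one thread round-trip.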
https://github.com/simonw/datasette/issues/1555#issuecomment-997321653 | https://api.github.com/repos/simonw/datasette/issues/1555 | 997321653 | IC_kwDOBm6k_c47ceu1 | 9599 | 2021-12-19T03:09:43Z | 2021-12-19T03:09:43Z | OWNER | On that same documentation page I just spotted this: > This feature is experimental and is subject to change. Further documentation will become available if and when the table-valued functions for PRAGMAs feature becomes officially supported. This makes me nervous to rely on pragma function optimizations in Datasette itself. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1079149656 | |
https://github.com/simonw/datasette/issues/1555#issuecomment-997321477 | https://api.github.com/repos/simonw/datasette/issues/1555 | 997321477 | IC_kwDOBm6k_c47cesF | 9599 | 2021-12-19T03:07:33Z | 2021-12-19T03:07:33Z | OWNER | If I want to continue supporting SQLite prior to 3.16.0 (2017-01-02) I'll need this optimization to only kick in with versions that support table-valued PRAGMA functions, while keeping the old `PRAGMA foreign_key_list(table)` stuff working for those older versions. That's feasible, but it's a bit more work - and I need to make sure I have robust testing in place for SQLite 3.15.0. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1079149656 | |
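The version gate described above could look something like this sketch (names are hypothetical, not Datasette's actual implementation): check `sqlite3.sqlite_version_info` once, use the single table-valued PRAGMA query on 3.16.0+, and fall back to one `PRAGMA foreign_key_list(table)` call per table otherwise.

```python
import sqlite3

def supports_pragma_functions():
    # Table-valued PRAGMA functions need SQLite 3.16.0 (2017-01-02)
    return sqlite3.sqlite_version_info >= (3, 16, 0)

def foreign_key_details(conn, table_names):
    if supports_pragma_functions():
        # One query covering every table at once
        return conn.execute(
            "select sqlite_master.name, fk.* from sqlite_master, "
            "pragma_foreign_key_list(sqlite_master.name) as fk "
            "where sqlite_master.type = 'table'"
        ).fetchall()
    # Older SQLite: one PRAGMA query per table
    results = []
    for name in table_names:
        for row in conn.execute(f"PRAGMA foreign_key_list([{name}])"):
            results.append((name,) + tuple(row))
    return results
```

Both branches return rows that lead with the table name, so callers would not need to know which path ran.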
https://github.com/simonw/datasette/issues/1555#issuecomment-997321327 | https://api.github.com/repos/simonw/datasette/issues/1555 | 997321327 | IC_kwDOBm6k_c47cepv | 9599 | 2021-12-19T03:05:39Z | 2021-12-19T03:05:44Z | OWNER | This caught me out once before in: - https://github.com/simonw/datasette/issues/1276 Turns out Glitch was running SQLite 3.11.0 from 2016-02-15. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1079149656 | |
https://github.com/simonw/datasette/issues/1555#issuecomment-997321217 | https://api.github.com/repos/simonw/datasette/issues/1555 | 997321217 | IC_kwDOBm6k_c47ceoB | 9599 | 2021-12-19T03:04:16Z | 2021-12-19T03:04:16Z | OWNER | One thing to watch out for though, from https://sqlite.org/pragma.html#pragfunc > The table-valued functions for PRAGMA feature was added in SQLite version 3.16.0 (2017-01-02). Prior versions of SQLite cannot use this feature. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1079149656 | |
https://github.com/simonw/datasette/issues/1555#issuecomment-997321115 | https://api.github.com/repos/simonw/datasette/issues/1555 | 997321115 | IC_kwDOBm6k_c47cemb | 9599 | 2021-12-19T03:03:12Z | 2021-12-19T03:03:12Z | OWNER | Table columns is a bit harder, because `table_xinfo` is only in SQLite 3.26.0 or higher: https://github.com/simonw/datasette/blob/d637ed46762fdbbd8e32b86f258cd9a53c1cfdc7/datasette/utils/__init__.py#L565-L581 So if that function is available: https://latest.datasette.io/fixtures?sql=SELECT%0D%0A++sqlite_master.name%2C%0D%0A++table_xinfo.*%0D%0AFROM%0D%0A++sqlite_master%2C%0D%0A++pragma_table_xinfo%28sqlite_master.name%29+AS+table_xinfo%0D%0AWHERE%0D%0A++sqlite_master.type+%3D+%27table%27 ```sql SELECT sqlite_master.name, table_xinfo.* FROM sqlite_master, pragma_table_xinfo(sqlite_master.name) AS table_xinfo WHERE sqlite_master.type = 'table' ``` And otherwise, using `table_info`: https://latest.datasette.io/fixtures?sql=SELECT%0D%0A++sqlite_master.name%2C%0D%0A++table_info.*%2C%0D%0A++0+as+hidden%0D%0AFROM%0D%0A++sqlite_master%2C%0D%0A++pragma_table_info%28sqlite_master.name%29+AS+table_info%0D%0AWHERE%0D%0A++sqlite_master.type+%3D+%27table%27 ```sql SELECT sqlite_master.name, table_info.*, 0 as hidden FROM sqlite_master, pragma_table_info(sqlite_master.name) AS table_info WHERE sqlite_master.type = 'table' ``` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1079149656 | |
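A sketch of that `table_xinfo`/`table_info` fallback for a single table (the function name is hypothetical): prefer `PRAGMA table_xinfo`, which adds the trailing `hidden` column in SQLite 3.26.0+, and emulate it with `table_info` plus `hidden = 0` on older versions. Note that older SQLite silently ignores an unrecognized PRAGMA rather than raising an error, so the check is for an empty result, not an exception.

```python
import sqlite3

def table_columns_with_hidden(conn, table):
    # table_xinfo rows: (cid, name, type, notnull, dflt_value, pk, hidden)
    rows = conn.execute(f"PRAGMA table_xinfo([{table}])").fetchall()
    if rows:
        return rows
    # Pre-3.26 SQLite ignores the unknown pragma and returns no rows;
    # fall back to table_info and append hidden = 0 to each row
    return [
        tuple(row) + (0,)
        for row in conn.execute(f"PRAGMA table_info([{table}])")
    ]

conn = sqlite3.connect(":memory:")
conn.execute("create table t (id integer primary key, body text)")
columns = table_columns_with_hidden(conn, "t")
```

Either branch yields one row per column with `hidden` as the final field.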
https://github.com/simonw/datasette/issues/1555#issuecomment-997320824 | https://api.github.com/repos/simonw/datasette/issues/1555 | 997320824 | IC_kwDOBm6k_c47ceh4 | 9599 | 2021-12-19T02:59:57Z | 2021-12-19T03:00:44Z | OWNER | To list all indexes: https://latest.datasette.io/fixtures?sql=SELECT%0D%0A++sqlite_master.name%2C%0D%0A++index_list.*%0D%0AFROM%0D%0A++sqlite_master%2C%0D%0A++pragma_index_list%28sqlite_master.name%29+AS+index_list%0D%0AWHERE%0D%0A++sqlite_master.type+%3D+%27table%27 ```sql SELECT sqlite_master.name, index_list.* FROM sqlite_master, pragma_index_list(sqlite_master.name) AS index_list WHERE sqlite_master.type = 'table' ``` Foreign keys: https://latest.datasette.io/fixtures?sql=SELECT%0D%0A++sqlite_master.name%2C%0D%0A++foreign_key_list.*%0D%0AFROM%0D%0A++sqlite_master%2C%0D%0A++pragma_foreign_key_list%28sqlite_master.name%29+AS+foreign_key_list%0D%0AWHERE%0D%0A++sqlite_master.type+%3D+%27table%27 ```sql SELECT sqlite_master.name, foreign_key_list.* FROM sqlite_master, pragma_foreign_key_list(sqlite_master.name) AS foreign_key_list WHERE sqlite_master.type = 'table' ``` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1079149656 | |
https://github.com/simonw/datasette/issues/1566#issuecomment-997272328 | https://api.github.com/repos/simonw/datasette/issues/1566 | 997272328 | IC_kwDOBm6k_c47cSsI | 9599 | 2021-12-18T19:18:01Z | 2021-12-18T19:18:01Z | OWNER | Added some useful new documented internal methods in: - #1570 | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1083669410 | |
https://github.com/simonw/datasette/issues/1555#issuecomment-997272223 | https://api.github.com/repos/simonw/datasette/issues/1555 | 997272223 | IC_kwDOBm6k_c47cSqf | 9599 | 2021-12-18T19:17:13Z | 2021-12-18T19:17:13Z | OWNER | That's a good optimization. Still need to deal with the huge flurry of `PRAGMA` queries though before I can consider this done. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1079149656 | |
https://github.com/simonw/datasette/issues/1570#issuecomment-997267583 | https://api.github.com/repos/simonw/datasette/issues/1570 | 997267583 | IC_kwDOBm6k_c47cRh_ | 9599 | 2021-12-18T18:46:05Z | 2021-12-18T18:46:12Z | OWNER | This will replace the work done in #1569. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1083921371 | |
https://github.com/simonw/datasette/issues/1555#issuecomment-997267416 | https://api.github.com/repos/simonw/datasette/issues/1555 | 997267416 | IC_kwDOBm6k_c47cRfY | 9599 | 2021-12-18T18:44:53Z | 2021-12-18T18:45:28Z | OWNER | Rather than adding an `executemany=True` parameter, I'm now thinking a better design might be to have three methods: - `db.execute_write(sql, params=None, block=False)` - `db.execute_writescript(sql, block=False)` - `db.execute_writemany(sql, params_seq, block=False)` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1079149656 | |
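The three-method split proposed above maps cleanly onto the three execution styles `sqlite3` already exposes. A minimal synchronous sketch (ignoring Datasette's threaded write queue and the `block=` flag, so this is not the actual implementation):

```python
import sqlite3

class WriteMethods:
    # Hypothetical sketch: each proposed method wraps one of the three
    # sqlite3 execution primitives in its own committed transaction
    def __init__(self, conn):
        self.conn = conn

    def execute_write(self, sql, params=None):
        with self.conn:  # commits on success, rolls back on error
            return self.conn.execute(sql, params or [])

    def execute_writescript(self, sql):
        with self.conn:
            return self.conn.executescript(sql)

    def execute_writemany(self, sql, params_seq):
        with self.conn:
            return self.conn.executemany(sql, params_seq)
```

The appeal of this design is that each method keeps a single, unambiguous contract instead of one method changing behavior based on a boolean flag.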
https://github.com/simonw/datasette/issues/1569#issuecomment-997266687 | https://api.github.com/repos/simonw/datasette/issues/1569 | 997266687 | IC_kwDOBm6k_c47cRT_ | 9599 | 2021-12-18T18:41:40Z | 2021-12-18T18:41:40Z | OWNER | Updated documentation: https://docs.datasette.io/en/latest/internals.html#await-db-execute-write-sql-params-none-executescript-false-block-false | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1083895395 | |
https://github.com/simonw/datasette/issues/1555#issuecomment-997266100 | https://api.github.com/repos/simonw/datasette/issues/1555 | 997266100 | IC_kwDOBm6k_c47cRK0 | 9599 | 2021-12-18T18:40:02Z | 2021-12-18T18:40:02Z | OWNER | The implementation of `cursor.executemany()` looks very efficient - it turns into a call to this C function with `multiple` set to `1`: https://github.com/python/cpython/blob/e002bbc6cce637171fb2b1391ffeca8643a13843/Modules/_sqlite/cursor.c#L468-L469 | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1079149656 |
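For reference, the usage contrast that motivates the efficiency point above: `executemany()` prepares the statement once and steps it per parameter row inside the C extension, versus a Python-level loop of individual `execute()` calls. A minimal sketch with made-up data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table words (word text)")
rows = [("a",), ("b",), ("c",)]

# One prepared statement, stepped once per row in C - rather than
# parsing and dispatching a separate execute() call per row
conn.executemany("insert into words values (?)", rows)
conn.commit()

count = conn.execute("select count(*) from words").fetchone()[0]
print(count)  # 3
```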