github
html_url | issue_url | id | node_id | user | created_at | updated_at | author_association | body | reactions | issue | performed_via_github_app |
---|---|---|---|---|---|---|---|---|---|---|---|
https://github.com/simonw/sqlite-utils/issues/381#issuecomment-1010462035 | https://api.github.com/repos/simonw/sqlite-utils/issues/381 | 1010462035 | IC_kwDOCGYnMM48Om1T | 9599 | 2022-01-11T23:33:37Z | 2022-01-11T23:33:37Z | OWNER | Documentation: https://sqlite-utils.datasette.io/en/latest/cli.html#returning-all-rows-in-a-table | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1099584685 | |
https://github.com/simonw/sqlite-utils/issues/382#issuecomment-1010461844 | https://api.github.com/repos/simonw/sqlite-utils/issues/382 | 1010461844 | IC_kwDOCGYnMM48OmyU | 9599 | 2022-01-11T23:33:14Z | 2022-01-11T23:33:14Z | OWNER | Documentation: https://sqlite-utils.datasette.io/en/latest/cli.html#returning-all-rows-in-a-table | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1099585611 | |
https://github.com/simonw/sqlite-utils/issues/381#issuecomment-1010441118 | https://api.github.com/repos/simonw/sqlite-utils/issues/381 | 1010441118 | IC_kwDOCGYnMM48Ohue | 9599 | 2022-01-11T22:56:53Z | 2022-01-11T22:57:09Z | OWNER | `sqlite-utils search` has `--limit` already: https://sqlite-utils.datasette.io/en/latest/cli-reference.html#search ``` --limit INTEGER Number of rows to return - defaults to everything ``` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1099584685 | |
https://github.com/simonw/sqlite-utils/issues/383#issuecomment-1010440166 | https://api.github.com/repos/simonw/sqlite-utils/issues/383 | 1010440166 | IC_kwDOCGYnMM48Ohfm | 9599 | 2022-01-11T22:55:05Z | 2022-01-11T22:55:05Z | OWNER | Twitter thread about this: https://twitter.com/simonw/status/1481020195074293761 | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1099586786 | |
https://github.com/simonw/sqlite-utils/issues/383#issuecomment-1010387223 | https://api.github.com/repos/simonw/sqlite-utils/issues/383 | 1010387223 | IC_kwDOCGYnMM48OUkX | 9599 | 2022-01-11T21:45:32Z | 2022-01-11T21:45:32Z | OWNER | The new page of documentation: https://sqlite-utils.datasette.io/en/latest/cli-reference.html | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1099586786 | |
https://github.com/simonw/sqlite-utils/issues/383#issuecomment-1010386802 | https://api.github.com/repos/simonw/sqlite-utils/issues/383 | 1010386802 | IC_kwDOCGYnMM48OUdy | 9599 | 2022-01-11T21:44:53Z | 2022-01-11T21:44:53Z | OWNER | Here's the `cog` code I used: https://github.com/simonw/sqlite-utils/blob/1d44b0cc2784c94aed1bcf350225cd86ee1aa7e5/docs/cli-reference.rst#L11-L76 | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1099586786 | |
https://github.com/simonw/sqlite-utils/issues/383#issuecomment-1010333511 | https://api.github.com/repos/simonw/sqlite-utils/issues/383 | 1010333511 | IC_kwDOCGYnMM48OHdH | 9599 | 2022-01-11T20:27:08Z | 2022-01-11T20:27:08Z | OWNER | I'll call the new page "CLI reference", for consistency with the API reference page here: https://sqlite-utils.datasette.io/en/stable/reference.html | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1099586786 | |
https://github.com/simonw/sqlite-utils/issues/380#issuecomment-1009544785 | https://api.github.com/repos/simonw/sqlite-utils/issues/380 | 1009544785 | IC_kwDOCGYnMM48LG5R | 9599 | 2022-01-11T02:32:56Z | 2022-01-11T02:32:56Z | OWNER | CLI and Python library improvements to help run [ANALYZE](https://www.sqlite.org/lang_analyze.html) after creating indexes or inserting rows, to gain better performance from the SQLite query planner when it runs against indexes. Three new CLI commands: `create-database`, `analyze` and `bulk`. - New `sqlite-utils create-database` command for creating new empty database files. ([#348](https://github.com/simonw/sqlite-utils/issues/348)) - New Python methods for running `ANALYZE` against a database, table or index: `db.analyze()` and `table.analyze()`, see [Optimizing index usage with ANALYZE](https://sqlite-utils.datasette.io/en/stable/python-api.html#python-api-analyze). ([#366](https://github.com/simonw/sqlite-utils/issues/366)) - New [sqlite-utils analyze command](https://sqlite-utils.datasette.io/en/stable/cli.html#cli-analyze) for running `ANALYZE` using the CLI. ([#379](https://github.com/simonw/sqlite-utils/issues/379)) - The `create-index`, `insert` and `update` commands now have a new `--analyze` option for running `ANALYZE` after the command has completed. ([#379](https://github.com/simonw/sqlite-utils/issues/379)) - New [sqlite-utils bulk command](https://sqlite-utils.datasette.io/en/stable/cli.html#cli-bulk) which can import records in the same way as `sqlite-utils insert` (from JSON, CSV or TSV) and use them to bulk execute a parametrized SQL query. ([#375](https://github.com/simonw/sqlite-utils/issues/375)) - The CLI tool can now also be run using `python -m sqlite_utils`. ([#368](https://github.com/simonw/sqlite-utils/issues/368)) - Using `--fmt` now implies `--table`, so you don't need to pass both options. 
([#374](https://github.com/simonw/sqlite-utils/issues/374)) - The `--convert` function applied to rows can now modify the row in place. ([#371](https://github.com/simonw/sqlite-utils/issues/371)) - The [insert-files command](https://sqlite-utils.datasette.io/en/stable/cli.html#cli-insert-files) supports two new columns: `stem` and `suffix`. ([#372](https://github.co… | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1098574572 | |
https://github.com/simonw/sqlite-utils/issues/375#issuecomment-1009536276 | https://api.github.com/repos/simonw/sqlite-utils/issues/375 | 1009536276 | IC_kwDOCGYnMM48LE0U | 9599 | 2022-01-11T02:12:58Z | 2022-01-11T02:12:58Z | OWNER | Documentation: https://sqlite-utils.datasette.io/en/latest/cli.html#executing-sql-in-bulk | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1097251014 | |
https://github.com/simonw/sqlite-utils/pull/377#issuecomment-1009534817 | https://api.github.com/repos/simonw/sqlite-utils/issues/377 | 1009534817 | IC_kwDOCGYnMM48LEdh | 9599 | 2022-01-11T02:09:38Z | 2022-01-11T02:09:38Z | OWNER | I tested this like so: ``` % wget 'https://raw.githubusercontent.com/wri/global-power-plant-database/master/output_database/global_power_plant_database.csv' % sqlite-utils create-database test.db % sqlite-utils create-table test.db power_plants url text owner text % sqlite-utils schema test.db CREATE TABLE [power_plants] ( [url] TEXT, [owner] TEXT ); % sqlite-utils bulk test.db 'insert into power_plants (url, owner) values (:url, :owner)' global_power_plant_database.csv --csv [------------------------------------] 0% [###################################-] 99% % sqlite-utils tables --counts test.db -t table count ------------ ------- power_plants 33643 ``` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1097477582 | |
https://github.com/simonw/sqlite-utils/pull/377#issuecomment-1009532125 | https://api.github.com/repos/simonw/sqlite-utils/issues/377 | 1009532125 | IC_kwDOCGYnMM48LDzd | 9599 | 2022-01-11T02:03:35Z | 2022-01-11T02:03:35Z | OWNER | Documentation: https://github.com/simonw/sqlite-utils/blob/f4ea0d32c0543373eefaa9b9f3911eb07549eecb/docs/cli.rst#executing-sql-in-bulk | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1097477582 | |
https://github.com/simonw/sqlite-utils/issues/365#issuecomment-1009521921 | https://api.github.com/repos/simonw/sqlite-utils/issues/365 | 1009521921 | IC_kwDOCGYnMM48LBUB | 9599 | 2022-01-11T01:37:53Z | 2022-01-11T01:37:53Z | OWNER | I decided to go with making this opt-in, mainly for consistency with the other places where I added this feature - see: - #379 - #366 You can now run the following: sqlite-utils create-index mydb.db mytable mycolumn --analyze And ``ANALYZE`` will be run on the index once it has been created. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1096558279 | |
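The comment above describes running `ANALYZE` against a freshly created index. A minimal sketch of what that opt-in behavior amounts to, using only the stdlib `sqlite3` module rather than the sqlite-utils API (table and index names are illustrative):

```python
import sqlite3

# Sketch of the opt-in --analyze behavior: create an index, then run ANALYZE
# against it so the query planner has statistics for that index.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (mycolumn TEXT)")
conn.executemany("INSERT INTO mytable VALUES (?)", [("a",), ("b",), ("a",)])
conn.execute("CREATE INDEX idx_mytable_mycolumn ON mytable (mycolumn)")
conn.execute("ANALYZE [idx_mytable_mycolumn]")  # analyze just this index

# ANALYZE populates sqlite_stat1 with one row per analyzed index
rows = conn.execute(
    "SELECT idx FROM sqlite_stat1 WHERE tbl = 'mytable'"
).fetchall()
```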
https://github.com/simonw/sqlite-utils/issues/366#issuecomment-1009508865 | https://api.github.com/repos/simonw/sqlite-utils/issues/366 | 1009508865 | IC_kwDOCGYnMM48K-IB | 9599 | 2022-01-11T01:08:51Z | 2022-01-11T01:08:51Z | OWNER | The Python methods are all done now, next step is the CLI options. I'll do those in a separate issue. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1096563265 | |
https://github.com/simonw/sqlite-utils/issues/366#issuecomment-1009288898 | https://api.github.com/repos/simonw/sqlite-utils/issues/366 | 1009288898 | IC_kwDOCGYnMM48KIbC | 9599 | 2022-01-10T19:54:04Z | 2022-01-10T19:54:04Z | OWNER | Having browsed the API reference I think the methods that would benefit from an `analyze=True` parameter are: - `db.create_index` - `table.insert_all` - `table.upsert_all` - `table.delete_where` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1096563265 | |
https://github.com/simonw/sqlite-utils/issues/366#issuecomment-1009285627 | https://api.github.com/repos/simonw/sqlite-utils/issues/366 | 1009285627 | IC_kwDOCGYnMM48KHn7 | 9599 | 2022-01-10T19:49:19Z | 2022-01-10T19:51:25Z | OWNER | Documentation for those two new methods: https://sqlite-utils.datasette.io/en/latest/python-api.html#optimizing-index-usage-with-analyze | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1096563265 | |
https://github.com/simonw/sqlite-utils/issues/366#issuecomment-1009286373 | https://api.github.com/repos/simonw/sqlite-utils/issues/366 | 1009286373 | IC_kwDOCGYnMM48KHzl | 9599 | 2022-01-10T19:50:22Z | 2022-01-10T19:50:22Z | OWNER | With respect to #365, I'm now thinking that having the ability to say "... and then run ANALYZE" could be useful for a bunch of Python methods. For example: ```python db["dogs"].insert_all(list_of_dogs, analyze=True) db["dogs"].create_index(["name"], analyze=True) ``` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1096563265 | |
https://github.com/simonw/sqlite-utils/issues/366#issuecomment-1009273525 | https://api.github.com/repos/simonw/sqlite-utils/issues/366 | 1009273525 | IC_kwDOCGYnMM48KEq1 | 9599 | 2022-01-10T19:32:39Z | 2022-01-10T19:32:39Z | OWNER | I'm going to implement the Python library methods based on the prototype: ```diff commit 650f97a08f29a688c530e5f6c9eedc9269ed7bdc Author: Simon Willison <swillison@gmail.com> Date: Sat Jan 8 13:34:01 2022 -0800 Initial prototype of .analyze(), refs #366 diff --git a/sqlite_utils/db.py b/sqlite_utils/db.py index dfc4723..1348b4a 100644 --- a/sqlite_utils/db.py +++ b/sqlite_utils/db.py @@ -923,6 +923,13 @@ class Database: "Run a SQLite ``VACUUM`` against the database." self.execute("VACUUM;") + def analyze(self, name=None): + "Run ``ANALYZE`` against the entire database or a named table or index." + sql = "ANALYZE" + if name is not None: + sql += " [{}]".format(name) + self.execute(sql) + class Queryable: def exists(self) -> bool: @@ -2902,6 +2909,10 @@ class Table(Queryable): ) return self + def analyze(self): + "Run ANALYZE against this table" + self.db.analyze(self.name) + def analyze_column( self, column: str, common_limit: int = 10, value_truncate=None, total_rows=None ) -> "ColumnDetails": ``` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1096563265 | |
https://github.com/simonw/sqlite-utils/pull/367#issuecomment-1009272446 | https://api.github.com/repos/simonw/sqlite-utils/issues/367 | 1009272446 | IC_kwDOCGYnMM48KEZ- | 9599 | 2022-01-10T19:31:08Z | 2022-01-10T19:31:08Z | OWNER | I'm going to implement this in a separate commit from this PR. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1097041471 | |
https://github.com/simonw/sqlite-utils/issues/364#issuecomment-1008557414 | https://api.github.com/repos/simonw/sqlite-utils/issues/364 | 1008557414 | IC_kwDOCGYnMM48HV1m | 9599 | 2022-01-10T05:36:19Z | 2022-01-10T05:36:19Z | OWNER | That did the trick. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1095570074 | |
https://github.com/simonw/sqlite-utils/issues/375#issuecomment-1008556706 | https://api.github.com/repos/simonw/sqlite-utils/issues/375 | 1008556706 | IC_kwDOCGYnMM48HVqi | 9599 | 2022-01-10T05:33:41Z | 2022-01-10T05:33:41Z | OWNER | I tested the prototype like this: sqlite-utils blah.db 'create table blah (id integer primary key, name text)' echo 'id,name 1,Cleo 2,Chicken' > blah.csv sqlite-utils bulk blah.db 'insert into blah (id, name) values (:id, :name)' blah.csv --csv | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1097251014 | |
https://github.com/simonw/sqlite-utils/issues/364#issuecomment-1008546573 | https://api.github.com/repos/simonw/sqlite-utils/issues/364 | 1008546573 | IC_kwDOCGYnMM48HTMN | 9599 | 2022-01-10T05:05:15Z | 2022-01-10T05:05:15Z | OWNER | Bit nasty but it might work: ```python def try_until(expected): tries = 0 while True: rows = list(Database(db_path)["rows"].rows) if rows == expected: return tries += 1 if tries > 10: assert False, "Expected {}, got {}".format(expected, rows) time.sleep(tries * 0.1) try_until([{"name": "Azi"}]) proc.stdin.write(b'{"name": "Suna"}\n') proc.stdin.flush() try_until([{"name": "Azi"}, {"name": "Suna"}]) ``` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1095570074 | |
https://github.com/simonw/sqlite-utils/issues/364#issuecomment-1008545140 | https://api.github.com/repos/simonw/sqlite-utils/issues/364 | 1008545140 | IC_kwDOCGYnMM48HS10 | 9599 | 2022-01-10T05:01:34Z | 2022-01-10T05:01:34Z | OWNER | Urgh, tests are still failing intermittently - for example: ``` time.sleep(0.4) > assert list(Database(db_path)["rows"].rows) == [{"name": "Azi"}] E AssertionError: assert [] == [{'name': 'Azi'}] E Right contains one more item: {'name': 'Azi'} E Full diff: E - [{'name': 'Azi'}] E + [] ``` I'm going to change this code to keep on trying up to 10 seconds - that should get the tests to pass faster on most machines. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1095570074 | |
https://github.com/simonw/sqlite-utils/issues/364#issuecomment-1008537194 | https://api.github.com/repos/simonw/sqlite-utils/issues/364 | 1008537194 | IC_kwDOCGYnMM48HQ5q | 9599 | 2022-01-10T04:29:53Z | 2022-01-10T04:31:29Z | OWNER | After a bunch of debugging with `print()` statements it's clear that the problem isn't with when things are committed or the size of the batches - it's that the data sent to standard input is all being processed in one go, not a line at a time. I think that's because it is being buffered by this: https://github.com/simonw/sqlite-utils/blob/d2a79d200f9071a86027365fa2a576865b71064f/sqlite_utils/cli.py#L759-L770 The buffering is there so that we can sniff the first few bytes to detect if it's a CSV file - added in 99ff0a288c08ec2071139c6031eb880fa9c95310 for #230. So maybe for non-CSV inputs we should disable buffering? | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1095570074 | |
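The buffering trade-off described above (look at the first bytes to sniff for CSV, at the cost of no longer processing stdin a line at a time) can be sketched with `io.BufferedReader.peek`, which inspects upcoming bytes without consuming them. This is a hypothetical illustration, not the actual sqlite-utils implementation:

```python
import io

# Wrap the raw stream so the first bytes can be inspected (e.g. to guess
# CSV vs newline-delimited JSON) without advancing the stream position.
raw = io.BytesIO(b'{"name": "Azi"}\n{"name": "Suna"}\n')
buffered = io.BufferedReader(raw)

first_bytes = buffered.peek(16)[:16]  # peek may return more than asked for
looks_like_json = first_bytes.lstrip().startswith((b"{", b"["))

# The stream still reads from the beginning - nothing was consumed by peek()
first_line = buffered.readline()
```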
https://github.com/simonw/sqlite-utils/issues/364#issuecomment-1008526736 | https://api.github.com/repos/simonw/sqlite-utils/issues/364 | 1008526736 | IC_kwDOCGYnMM48HOWQ | 9599 | 2022-01-10T04:07:29Z | 2022-01-10T04:07:29Z | OWNER | I think this test is right: ```python def test_insert_streaming_batch_size_1(db_path): # https://github.com/simonw/sqlite-utils/issues/364 # Streaming with --batch-size 1 should commit on each record # Can't use CliRunner().invoke() here because we need to # run assertions in between writing to process stdin proc = subprocess.Popen( [ sys.executable, "-m", "sqlite_utils", "insert", db_path, "rows", "-", "--nl", "--batch-size", "1", ], stdin=subprocess.PIPE, ) proc.stdin.write(b'{"name": "Azi"}') proc.stdin.flush() assert list(Database(db_path)["rows"].rows) == [{"name": "Azi"}] proc.stdin.write(b'{"name": "Suna"}') proc.stdin.flush() assert list(Database(db_path)["rows"].rows) == [{"name": "Azi"}, {"name": "Suna"}] proc.stdin.close() proc.wait() ``` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1095570074 | |
https://github.com/simonw/sqlite-utils/issues/348#issuecomment-1008383293 | https://api.github.com/repos/simonw/sqlite-utils/issues/348 | 1008383293 | IC_kwDOCGYnMM48GrU9 | 9599 | 2022-01-09T20:38:17Z | 2022-01-09T20:38:17Z | OWNER | Documentation: https://sqlite-utils.datasette.io/en/latest/cli.html#creating-an-empty-database | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1067771698 | |
https://github.com/simonw/sqlite-utils/issues/348#issuecomment-1008367607 | https://api.github.com/repos/simonw/sqlite-utils/issues/348 | 1008367607 | IC_kwDOCGYnMM48Gnf3 | 9599 | 2022-01-09T20:22:43Z | 2022-01-09T20:22:43Z | OWNER | I'm not going to implement `--page-size` unless someone specifically requests it - I don't like having features that I've never needed to use myself. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1067771698 | |
https://github.com/simonw/sqlite-utils/issues/371#issuecomment-1008364701 | https://api.github.com/repos/simonw/sqlite-utils/issues/371 | 1008364701 | IC_kwDOCGYnMM48Gmyd | 9599 | 2022-01-09T20:04:35Z | 2022-01-09T20:04:35Z | OWNER | The previous code for highlighting errors in syntax (which was already a bit confused thanks to the added `return`, see https://github.com/simonw/sqlite-utils/issues/355#issuecomment-991393684) isn't compatible with this approach at all. I'm going to ditch it and just show a generic `Error: Could not compile code` message. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1097128334 | |
https://github.com/simonw/sqlite-utils/issues/371#issuecomment-1008354207 | https://api.github.com/repos/simonw/sqlite-utils/issues/371 | 1008354207 | IC_kwDOCGYnMM48GkOf | 9599 | 2022-01-09T18:54:54Z | 2022-01-09T18:54:54Z | OWNER | This seems to work: ```python def _compile_code(code, imports, variable="value"): locals = {} globals = {"r": recipes, "recipes": recipes} # If user defined a convert() function, return that try: exec(code, globals, locals) return locals["convert"] except (AttributeError, SyntaxError, NameError, KeyError, TypeError): pass # Try compiling their code as a function instead body_variants = [code] # If single line and no 'return', try adding the return if "\n" not in code and not code.strip().startswith("return "): body_variants.insert(0, "return {}".format(code)) for variant in body_variants: new_code = ["def fn({}):".format(variable)] for line in variant.split("\n"): new_code.append(" {}".format(line)) try: code_o = compile("\n".join(new_code), "<string>", "exec") break except SyntaxError: # Try another variant, e.g. for 'return row["column"] = 1' continue for import_ in imports: globals[import_.split(".")[0]] = __import__(import_) exec(code_o, globals, locals) return locals["fn"] ``` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1097128334 | |
https://github.com/simonw/sqlite-utils/issues/371#issuecomment-1008348032 | https://api.github.com/repos/simonw/sqlite-utils/issues/371 | 1008348032 | IC_kwDOCGYnMM48GiuA | 9599 | 2022-01-09T18:14:02Z | 2022-01-09T18:14:02Z | OWNER | Here's the code in question: https://github.com/simonw/sqlite-utils/blob/b8c134059e89f0fa040b84fb7d0bda25b9a52759/sqlite_utils/utils.py#L288-L299 | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1097128334 | |
https://github.com/simonw/sqlite-utils/issues/371#issuecomment-1008347768 | https://api.github.com/repos/simonw/sqlite-utils/issues/371 | 1008347768 | IC_kwDOCGYnMM48Gip4 | 9599 | 2022-01-09T18:12:30Z | 2022-01-09T18:12:30Z | OWNER | Tried this test: ```python result = CliRunner().invoke( cli.cli, [ "insert", db_path, "rows", "-", "--convert", 'row["is_chicken"] = True', ], input='{"name": "Azi"}', ) ``` And got this error: > `E + where 1 = <Result SyntaxError('invalid syntax', ('<string>', 2, 30, ' return row["is_chicken"] = True\n'))>.exit_code` The code snippet compilation isn't currently compatible with this. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1097128334 | |
https://github.com/simonw/sqlite-utils/issues/374#issuecomment-1008346841 | https://api.github.com/repos/simonw/sqlite-utils/issues/374 | 1008346841 | IC_kwDOCGYnMM48GibZ | 9599 | 2022-01-09T18:06:50Z | 2022-01-09T18:06:50Z | OWNER | In addition to a unit test I manually tested all of the above, e.g. ``` % sqlite-utils indexes global-power-plants.db sqlite_master --fmt rst ======= ============ ======= ===== ====== ====== ====== ===== table index_name seqno cid name desc coll key ======= ============ ======= ===== ====== ====== ====== ===== ======= ============ ======= ===== ====== ====== ====== ===== ``` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1097135860 | |
https://github.com/simonw/sqlite-utils/issues/374#issuecomment-1008346338 | https://api.github.com/repos/simonw/sqlite-utils/issues/374 | 1008346338 | IC_kwDOCGYnMM48GiTi | 9599 | 2022-01-09T18:03:22Z | 2022-01-09T18:03:22Z | OWNER | Commands that support `--fmt` (via the `@output_options` decorator) are: - `tables` - `views` - `query` - `memory` - `search` - `rows` - `triggers` - `indexes` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1097135860 | |
https://github.com/simonw/sqlite-utils/issues/374#issuecomment-1008345267 | https://api.github.com/repos/simonw/sqlite-utils/issues/374 | 1008345267 | IC_kwDOCGYnMM48GiCz | 9599 | 2022-01-09T17:56:37Z | 2022-01-09T17:56:37Z | OWNER | Better: ```python if fmt: table = True ``` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1097135860 | |
https://github.com/simonw/sqlite-utils/issues/373#issuecomment-1008344980 | https://api.github.com/repos/simonw/sqlite-utils/issues/373 | 1008344980 | IC_kwDOCGYnMM48Gh-U | 9599 | 2022-01-09T17:54:53Z | 2022-01-09T17:54:53Z | OWNER | Updated TIL: https://til.simonwillison.net/python/cog-to-update-help-in-readme#user-content-cog-for-restructuredtext | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1097135732 | |
https://github.com/simonw/sqlite-utils/issues/373#issuecomment-1008344525 | https://api.github.com/repos/simonw/sqlite-utils/issues/373 | 1008344525 | IC_kwDOCGYnMM48Gh3N | 9599 | 2022-01-09T17:52:22Z | 2022-01-09T17:52:22Z | OWNER | Updated docs: https://sqlite-utils.datasette.io/en/latest/cli.html#table-formatted-output | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1097135732 | |
https://github.com/simonw/sqlite-utils/issues/373#issuecomment-1008341078 | https://api.github.com/repos/simonw/sqlite-utils/issues/373 | 1008341078 | IC_kwDOCGYnMM48GhBW | 9599 | 2022-01-09T17:31:12Z | 2022-01-09T17:31:12Z | OWNER | Found an example of using `cog` in a rST file here: https://github.com/nedbat/coveragepy/blob/f3238eea7e403d13a217b30579b1a1c2cbff62e3/doc/dbschema.rst#L21 ``` .. [[[cog from coverage.sqldata import SCHEMA_VERSION print(".. code::") print() print(f" SCHEMA_VERSION = {SCHEMA_VERSION}") print() .. ]]] .. code:: SCHEMA_VERSION = 7 .. [[[end]]] ``` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1097135732 | |
https://github.com/simonw/sqlite-utils/issues/375#issuecomment-1008338186 | https://api.github.com/repos/simonw/sqlite-utils/issues/375 | 1008338186 | IC_kwDOCGYnMM48GgUK | 9599 | 2022-01-09T17:13:33Z | 2022-01-09T17:13:54Z | OWNER | cat blah.csv | sqlite-utils bulk blah.db - \ "insert into blah (:foo, :bar)" --csv | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1097251014 | |
https://github.com/simonw/sqlite-utils/issues/374#issuecomment-1008252732 | https://api.github.com/repos/simonw/sqlite-utils/issues/374 | 1008252732 | IC_kwDOCGYnMM48GLc8 | 9599 | 2022-01-09T08:25:30Z | 2022-01-09T08:25:30Z | OWNER | Need to change `if table:` to `if table or fmt:` in a few places. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1097135860 | |
https://github.com/simonw/sqlite-utils/issues/372#issuecomment-1008247370 | https://api.github.com/repos/simonw/sqlite-utils/issues/372 | 1008247370 | IC_kwDOCGYnMM48GKJK | 9599 | 2022-01-09T07:51:18Z | 2022-01-09T07:51:18Z | OWNER | Pathlib says the stem of that would be `dogs.and.cats.jpg` - best stick with that for consistency. https://docs.python.org/3/library/pathlib.html#pathlib.PurePath.suffix It calls the last bit `suffix` - maybe I should use that instead of `ext`. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1097129710 | |
https://github.com/simonw/sqlite-utils/issues/371#issuecomment-1008246366 | https://api.github.com/repos/simonw/sqlite-utils/issues/371 | 1008246366 | IC_kwDOCGYnMM48GJ5e | 9599 | 2022-01-09T07:42:14Z | 2022-01-09T07:42:14Z | OWNER | Also need to update relevant docs for that example. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1097128334 | |
https://github.com/simonw/sqlite-utils/issues/371#issuecomment-1008246239 | https://api.github.com/repos/simonw/sqlite-utils/issues/371 | 1008246239 | IC_kwDOCGYnMM48GJ3f | 9599 | 2022-01-09T07:41:24Z | 2022-01-09T07:41:24Z | OWNER | Might be a case of modifying this line: https://github.com/simonw/sqlite-utils/blob/e0c476bc380744680c8b7675c24fb0e9f5ec6dcd/sqlite_utils/cli.py#L828 To: ```python docs = (fn(doc) or doc for doc in docs) ``` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1097128334 | |
https://github.com/simonw/sqlite-utils/issues/364#issuecomment-1008234293 | https://api.github.com/repos/simonw/sqlite-utils/issues/364 | 1008234293 | IC_kwDOCGYnMM48GG81 | 9599 | 2022-01-09T05:37:02Z | 2022-01-09T05:37:02Z | OWNER | Calling `p.stdin.close()` and then `p.wait()` terminates the subprocess. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1095570074 | |
https://github.com/simonw/sqlite-utils/issues/364#issuecomment-1008233910 | https://api.github.com/repos/simonw/sqlite-utils/issues/364 | 1008233910 | IC_kwDOCGYnMM48GG22 | 9599 | 2022-01-09T05:32:53Z | 2022-01-09T05:35:45Z | OWNER | This is strange. The following: ```pycon >>> import subprocess >>> p = subprocess.Popen(["sqlite-utils", "insert", "/tmp/stream.db", "stream", "-", "--nl"], stdin=subprocess.PIPE) >>> p.stdin.write(b'\n'.join(b'{"id": %s}' % str(i).encode("utf-8") for i in range(1000))) 11889 >>> # At this point /tmp/stream.db is still 0 bytes - but if I then run this: >>> p.stdin.close() >>> # /tmp/stream.db is now 20K and contains the written data ``` No wait, mystery solved - I can add `p.stdin.flush()` instead of `p.stdin.close()` and the file suddenly jumps up in size. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1095570074 | |
https://github.com/simonw/sqlite-utils/issues/369#issuecomment-1008232075 | https://api.github.com/repos/simonw/sqlite-utils/issues/369 | 1008232075 | IC_kwDOCGYnMM48GGaL | 9599 | 2022-01-09T05:13:15Z | 2022-01-09T05:13:56Z | OWNER | I think the query that will help solve this is: `explain query plan select * from ny_times_us_counties where state = 1 and county = 2` In this case, the query planner needs to decide if it should use the index for the `state` column or the index for the `county` column. That's where the statistics come into play. In particular: | tbl | idx | stat | |----------------------|---------------------------------|---------------| | ny_times_us_counties | idx_ny_times_us_counties_date | 2092871 2915 | | ny_times_us_counties | idx_ny_times_us_counties_fips | 2092871 651 | | ny_times_us_counties | idx_ny_times_us_counties_county | 2092871 1085 | | ny_times_us_counties | idx_ny_times_us_counties_state | 2092871 37373 | Those numbers are explained by this comment in the SQLite C code: https://github.com/sqlite/sqlite/blob/5622c7f97106314719740098cf0854e7eaa81802/src/analyze.c#L41-L55 ``` ** There is normally one row per index, with the index identified by the ** name in the idx column. The tbl column is the name of the table to ** which the index belongs. In each such row, the stat column will be ** a string consisting of a list of integers. The first integer in this ** list is the number of rows in the index. (This is the same as the ** number of rows in the table, except for partial indices.) The second ** integer is the average number of rows in the index that have the same ** value in the first column of the index. ``` So that table is telling us that using a value in the `county` column will filter down to an average of 1,085 rows, whereas filtering on the `state` column will filter down to an average of 37,373 - so clearly the `county` index is the better index to use here! 
Just one catch: against both my `covid.db` and my `covid-analyzed.db` databases the `county` index is picked for both of them - so SQLite is somehow guessing that `county` is a better index even though it doesn't have statistics for that. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1097091527 | |
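The `sqlite_stat1` numbers discussed above can be reproduced with a small stdlib-only example: the second integer in the `stat` column is the average number of rows sharing a value in the index's first column, so a lower number means a more selective index. Table and column names here are made up for illustration:

```python
import sqlite3

# 200 rows: 2 distinct states (100 rows each), 100 distinct county names
# (2 rows each) - so the county index is far more selective.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE counties (state TEXT, county TEXT)")
rows = [("CA", "County %d" % i) for i in range(100)] + [
    ("NY", "County %d" % i) for i in range(100)
]
conn.executemany("INSERT INTO counties VALUES (?, ?)", rows)
conn.execute("CREATE INDEX idx_state ON counties (state)")
conn.execute("CREATE INDEX idx_county ON counties (county)")
conn.execute("ANALYZE")

# stat is "<total rows> <average rows per value in the first indexed column>"
stats = dict(
    conn.execute("SELECT idx, stat FROM sqlite_stat1 WHERE tbl = 'counties'")
)
```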
https://github.com/simonw/sqlite-utils/issues/365#issuecomment-1008229839 | https://api.github.com/repos/simonw/sqlite-utils/issues/365 | 1008229839 | IC_kwDOCGYnMM48GF3P | 9599 | 2022-01-09T04:51:44Z | 2022-01-09T04:51:44Z | OWNER | Found one report on Stack Overflow from 9 years ago of someone seeing broken performance after running `ANALYZE`, hard to say that's a trend and not a single weird edge-case though! https://stackoverflow.com/q/12947214/6083 | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1096558279 | |
https://github.com/simonw/sqlite-utils/issues/369#issuecomment-1008229341 | https://api.github.com/repos/simonw/sqlite-utils/issues/369 | 1008229341 | IC_kwDOCGYnMM48GFvd | 9599 | 2022-01-09T04:45:38Z | 2022-01-09T04:47:11Z | OWNER | This is probably too fancy. I think maybe the way to do this is with `select * from [global-power-plants] where "country_long" = 'United Kingdom'` - then mess around with stats to see if I can get it to use the index or not based on them. Here's the explain for that: https://global-power-plants.datasettes.com/global-power-plants?sql=EXPLAIN+QUERY+PLAN+select+*+from+[global-power-plants]+where+%22country_long%22+%3D+%27United+Kingdom%27 | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1097091527 | |
https://github.com/simonw/sqlite-utils/issues/369#issuecomment-1008227625 | https://api.github.com/repos/simonw/sqlite-utils/issues/369 | 1008227625 | IC_kwDOCGYnMM48GFUp | 9599 | 2022-01-09T04:25:38Z | 2022-01-09T04:25:38Z | OWNER | ```sql EXPLAIN QUERY PLAN select country_long, count(*) from [global-power-plants] group by country_long ``` https://global-power-plants.datasettes.com/global-power-plants?sql=EXPLAIN+QUERY+PLAN+select+country_long%2C+count%28*%29+from+%5Bglobal-power-plants%5D+group+by+country_long > SCAN TABLE global-power-plants USING COVERING INDEX "global-power-plants_country_long" | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1097091527 | |
https://github.com/simonw/datasette/issues/1588#issuecomment-1008227436 | https://api.github.com/repos/simonw/datasette/issues/1588 | 1008227436 | IC_kwDOBm6k_c48GFRs | 9599 | 2022-01-09T04:23:37Z | 2022-01-09T04:25:04Z | OWNER | Relevant code: https://github.com/simonw/datasette/blob/85849935292e500ab7a99f8fe0f9546e903baad3/datasette/utils/__init__.py#L163-L170 https://github.com/simonw/datasette/blob/85849935292e500ab7a99f8fe0f9546e903baad3/datasette/utils/__init__.py#L195-L204 | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1097101917 | |
https://github.com/simonw/datasette/issues/1588#issuecomment-1008227491 | https://api.github.com/repos/simonw/datasette/issues/1588 | 1008227491 | IC_kwDOBm6k_c48GFSj | 9599 | 2022-01-09T04:24:09Z | 2022-01-09T04:24:09Z | OWNER | I think this is the fix: ```python re.compile(r"^explain\s+query\s+plan\s+select\b"), ``` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1097101917 | |
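The proposed pattern can be sanity-checked quickly - a sketch assuming the SQL has already been lowercased before matching (an assumption about the surrounding validation code, not confirmed here):

```python
import re

# \b ends the match at a word boundary, so no trailing whitespace is
# required after "select" and "selection" does not slip through
pattern = re.compile(r"^explain\s+query\s+plan\s+select\b")

assert pattern.match("explain query plan select * from t")
assert pattern.match("explain query\nplan select 1")  # \s+ spans newlines
assert not pattern.match("explain query plan selection")
print("matches as expected")
```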
https://github.com/simonw/sqlite-utils/issues/369#issuecomment-1008226862 | https://api.github.com/repos/simonw/sqlite-utils/issues/369 | 1008226862 | IC_kwDOCGYnMM48GFIu | 9599 | 2022-01-09T04:17:55Z | 2022-01-09T04:17:55Z | OWNER | There are some clues as to what effect ANALYZE has in https://www.sqlite.org/optoverview.html Some quotes: > SQLite might use a skip-scan on an index if it knows that the first one or more columns contain many duplication values. If there are too few duplicates in the left-most columns of the index, then it would be faster to simply step ahead to the next value, and thus do a full table scan, than to do a binary search on an index to locate the next left-column value. > > The only way that SQLite can know that there are many duplicates in the left-most columns of an index is if the ANALYZE command has been run on the database. Without the results of ANALYZE, SQLite has to guess at the "shape" of the data in the table, and the default guess is that there are an average of 10 duplicates for every value in the left-most column of the index. Skip-scan only becomes profitable (it only gets to be faster than a full table scan) when the number of duplicates is about 18 or more. Hence, a skip-scan is never used on a database that has not been analyzed. And > Join reordering is automatic and usually works well enough that programmers do not have to think about it, especially if ANALYZE has been used to gather statistics about the available indexes, though occasionally some hints from the programmer are needed. And > The various sqlite_statN tables contain information on how selective the various indexes are. For example, the sqlite_stat1 table might indicate that an equality constraint on column x reduces the search space to 10 rows on average, whereas an equality constraint on column y reduces the search space to 3 rows on average. In that case, SQLite would prefer to use index ex2i2 since that index is more selective. 
| { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1097091527 | |
https://github.com/simonw/sqlite-utils/issues/369#issuecomment-1008226487 | https://api.github.com/repos/simonw/sqlite-utils/issues/369 | 1008226487 | IC_kwDOCGYnMM48GFC3 | 9599 | 2022-01-09T04:14:05Z | 2022-01-09T04:14:05Z | OWNER | Didn't manage to spot a meaningful difference with that database either: ``` analyze % python3 -m timeit '__import__("sqlite3").connect("covid.db").execute("select fips, count(*) from [ny_times_us_counties] group by fips").fetchall()' 2 loops, best of 5: 101 msec per loop analyze % python3 -m timeit '__import__("sqlite3").connect("covid-analyzed.db").execute("select fips, count(*) from [ny_times_us_counties] group by fips").fetchall()' 2 loops, best of 5: 103 msec per loop ``` Maybe `select fips, count(*) from [ny_times_us_counties] group by fips` isn't a good query for testing this? | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1097091527 | |
https://github.com/simonw/sqlite-utils/issues/369#issuecomment-1008220270 | https://api.github.com/repos/simonw/sqlite-utils/issues/369 | 1008220270 | IC_kwDOCGYnMM48GDhu | 9599 | 2022-01-09T03:12:38Z | 2022-01-09T03:13:15Z | OWNER | Basically no difference using this very basic benchmark: ``` analyze % python3 -m timeit '__import__("sqlite3").connect("global-power-plants.db").execute("select country_long, count(*) from [global-power-plants] group by country_long").fetchall()' 100 loops, best of 5: 2.39 msec per loop analyze % python3 -m timeit '__import__("sqlite3").connect("global-power-plants-analyzed.db").execute("select country_long, count(*) from [global-power-plants] group by country_long").fetchall()' 100 loops, best of 5: 2.38 msec per loop ``` I should try this against a much larger database. https://covid-19.datasettes.com/covid.db is 879MB. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1097091527 | |
https://github.com/simonw/sqlite-utils/issues/369#issuecomment-1008219844 | https://api.github.com/repos/simonw/sqlite-utils/issues/369 | 1008219844 | IC_kwDOCGYnMM48GDbE | 9599 | 2022-01-09T03:08:09Z | 2022-01-09T03:08:09Z | OWNER | ``` analyze % sqlite-utils global-power-plants-analyzed.db 'analyze' [{"rows_affected": -1}] analyze % sqlite-utils tables global-power-plants-analyzed.db [{"table": "global-power-plants"}, {"table": "global-power-plants_fts"}, {"table": "global-power-plants_fts_data"}, {"table": "global-power-plants_fts_idx"}, {"table": "global-power-plants_fts_docsize"}, {"table": "global-power-plants_fts_config"}, {"table": "sqlite_stat1"}] analyze % sqlite-utils rows global-power-plants-analyzed.db sqlite_stat1 -t tbl idx stat ------------------------------- ---------------------------------- --------- global-power-plants_fts_config global-power-plants_fts_config 1 1 global-power-plants_fts_docsize 33643 global-power-plants_fts_idx global-power-plants_fts_idx 199 40 1 global-power-plants_fts_data 136 global-power-plants "global-power-plants_owner" 33643 4 global-power-plants "global-power-plants_country_long" 33643 202 ``` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1097091527 | |
https://github.com/simonw/sqlite-utils/issues/369#issuecomment-1008219588 | https://api.github.com/repos/simonw/sqlite-utils/issues/369 | 1008219588 | IC_kwDOCGYnMM48GDXE | 9599 | 2022-01-09T03:06:42Z | 2022-01-09T03:06:42Z | OWNER | ``` analyze % sqlite-utils indexes global-power-plants.db -t table index_name seqno cid name desc coll key ------------------------------ ------------------------------------------------- ------- ----- ------------ ------ ------ ----- global-power-plants "global-power-plants_owner" 0 12 owner 0 BINARY 1 global-power-plants "global-power-plants_country_long" 0 1 country_long 0 BINARY 1 global-power-plants_fts_idx sqlite_autoindex_global-power-plants_fts_idx_1 0 0 segid 0 BINARY 1 global-power-plants_fts_idx sqlite_autoindex_global-power-plants_fts_idx_1 1 1 term 0 BINARY 1 global-power-plants_fts_config sqlite_autoindex_global-power-plants_fts_config_1 0 0 k 0 BINARY 1 ``` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1097091527 | |
https://github.com/simonw/sqlite-utils/issues/369#issuecomment-1008219484 | https://api.github.com/repos/simonw/sqlite-utils/issues/369 | 1008219484 | IC_kwDOCGYnMM48GDVc | 9599 | 2022-01-09T03:05:44Z | 2022-01-09T03:05:44Z | OWNER | I'll start by running some experiments against the 11MB database file from https://global-power-plants.datasettes.com/global-power-plants.db | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1097091527 | |
https://github.com/simonw/sqlite-utils/issues/369#issuecomment-1008219191 | https://api.github.com/repos/simonw/sqlite-utils/issues/369 | 1008219191 | IC_kwDOCGYnMM48GDQ3 | 9599 | 2022-01-09T03:03:53Z | 2022-01-09T03:03:53Z | OWNER | Refs: - #366 - #365 | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1097091527 | |
https://github.com/simonw/sqlite-utils/issues/365#issuecomment-1008163585 | https://api.github.com/repos/simonw/sqlite-utils/issues/365 | 1008163585 | IC_kwDOCGYnMM48F1sB | 9599 | 2022-01-08T22:14:39Z | 2022-01-09T03:03:07Z | OWNER | The reason I'm hesitating on this is that I've not actually used ANALYZE at all in nearly five years of messing around with SQLite! So I'm nervous that there are surprise downsides I haven't thought of. My hunch is that ANALYZE is only worth worrying about on much larger databases, in which case I'm OK supporting it as a thoroughly documented power-user feature rather than a default. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1096558279 | |
https://github.com/simonw/sqlite-utils/issues/368#issuecomment-1008216371 | https://api.github.com/repos/simonw/sqlite-utils/issues/368 | 1008216371 | IC_kwDOCGYnMM48GCkz | 9599 | 2022-01-09T02:36:22Z | 2022-01-09T02:36:22Z | OWNER | In Python 3.6: https://docs.python.org/3.6/library/subprocess.html > This does not capture stdout or stderr by default. To do so, pass [`PIPE`](https://docs.python.org/3.6/library/subprocess.html#subprocess.PIPE "subprocess.PIPE") for the *stdout* and/or *stderr* arguments. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1097087280 | |
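A sketch of the 3.6-compatible spelling - passing `stdout=PIPE, stderr=PIPE` explicitly is what `capture_output=True` expands to on Python 3.7+:

```python
import subprocess
import sys

# Works on Python 3.6: PIPE arguments instead of capture_output=True
result = subprocess.run(
    [sys.executable, "-c", "print('hello')"],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
)
print(result.stdout.decode().strip())  # -> hello
```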
https://github.com/simonw/sqlite-utils/issues/368#issuecomment-1008216271 | https://api.github.com/repos/simonw/sqlite-utils/issues/368 | 1008216271 | IC_kwDOCGYnMM48GCjP | 9599 | 2022-01-09T02:35:09Z | 2022-01-09T02:35:09Z | OWNER | Test failure on Python 3.6: > `E TypeError: __init__() got an unexpected keyword argument 'capture_output'` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1097087280 | |
https://github.com/simonw/sqlite-utils/issues/364#issuecomment-1008216201 | https://api.github.com/repos/simonw/sqlite-utils/issues/364 | 1008216201 | IC_kwDOCGYnMM48GCiJ | 9599 | 2022-01-09T02:34:12Z | 2022-01-09T02:34:12Z | OWNER | I can now write tests that look like this: https://github.com/simonw/sqlite-utils/blob/539f5ccd90371fa87f946018f8b77d55929e06db/tests/test_cli.py#L2024-L2030 Which means I can write a test that exercises this bug. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1095570074 | |
https://github.com/simonw/sqlite-utils/issues/368#issuecomment-1008215912 | https://api.github.com/repos/simonw/sqlite-utils/issues/368 | 1008215912 | IC_kwDOCGYnMM48GCdo | 9599 | 2022-01-09T02:30:59Z | 2022-01-09T02:30:59Z | OWNER | Even better, inspired by `rich`, support `python -m sqlite_utils`. https://github.com/Textualize/rich/blob/master/rich/__main__.py | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1097087280 | |
https://github.com/simonw/sqlite-utils/issues/364#issuecomment-1008214998 | https://api.github.com/repos/simonw/sqlite-utils/issues/364 | 1008214998 | IC_kwDOCGYnMM48GCPW | 9599 | 2022-01-09T02:23:20Z | 2022-01-09T02:23:20Z | OWNER | Possible way of running the test: add this to `sqlite_utils/cli.py`: ```python if __name__ == "__main__": cli() ``` Now the tool can be run using `python -m sqlite_utils.cli --help` Then in the test use `subprocess` to call `sys.executable` (the path to the current Python interpreter) and pass it `-m sqlite_utils.cli` to run the script! | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1095570074 | |
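The subprocess testing pattern described above, sketched with the stdlib's `json.tool` standing in for `sqlite_utils.cli` so the example is self-contained:

```python
import subprocess
import sys

# sys.executable guarantees the subprocess uses the same interpreter
# (and virtualenv) as the test run itself
result = subprocess.run(
    [sys.executable, "-m", "json.tool"],
    input=b'{"id": 1}',
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
)
print(result.stdout.decode())
```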
https://github.com/simonw/sqlite-utils/issues/364#issuecomment-1008214406 | https://api.github.com/repos/simonw/sqlite-utils/issues/364 | 1008214406 | IC_kwDOCGYnMM48GCGG | 9599 | 2022-01-09T02:18:21Z | 2022-01-09T02:18:21Z | OWNER | I'm having trouble figuring out the best way to write a unit test for this. Filed a relevant feature request for Click here: - https://github.com/pallets/click/issues/2171 | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1095570074 | |
https://github.com/simonw/sqlite-utils/issues/365#issuecomment-1008163050 | https://api.github.com/repos/simonw/sqlite-utils/issues/365 | 1008163050 | IC_kwDOCGYnMM48F1jq | 9599 | 2022-01-08T22:10:51Z | 2022-01-08T22:10:51Z | OWNER | Is there a downside to having a `sqlite_stat1` table if it has wildly incorrect statistics in it? Imagine the following sequence of events: - User imports a few records, creating the table, using `sqlite-utils insert` - User runs `sqlite-utils create-index ...` which also creates and populates the `sqlite_stat1` table - User runs `insert` again to populate several million new records The user now has a database file with several million records and a statistics table that is wildly out of date, having been populated when they only had a few. Will this result in surprisingly bad query performance compared to if that statistics table did not exist at all? If so, I lean much harder towards `ANALYZE` as a strictly opt-in optimization, maybe with the `--analyze` option added to `sqlite-utils insert` to help users opt in to updating their statistics after running big inserts. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1096558279 | |
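The stale-statistics sequence described above is easy to reproduce - a sketch (with a hypothetical table name) showing `sqlite_stat1` going wildly out of date after a large insert:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE records (value TEXT)")
db.execute("INSERT INTO records VALUES ('first')")
db.execute("CREATE INDEX idx_value ON records (value)")
db.execute("ANALYZE")  # statistics captured while the table holds 1 row

# Simulate the later bulk insert of many more records
db.executemany(
    "INSERT INTO records VALUES (?)", [(str(i),) for i in range(100000)]
)

(stat,) = db.execute(
    "SELECT stat FROM sqlite_stat1 WHERE idx = 'idx_value'"
).fetchone()
(count,) = db.execute("SELECT count(*) FROM records").fetchone()
print(stat, count)  # stats still claim 1 row; the table now has 100001
```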
https://github.com/simonw/sqlite-utils/issues/366#issuecomment-1008158616 | https://api.github.com/repos/simonw/sqlite-utils/issues/366 | 1008158616 | IC_kwDOCGYnMM48F0eY | 9599 | 2022-01-08T21:35:32Z | 2022-01-08T21:35:32Z | OWNER | Built a prototype in a branch, see #367. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1096563265 | |
https://github.com/simonw/sqlite-utils/issues/365#issuecomment-1008158357 | https://api.github.com/repos/simonw/sqlite-utils/issues/365 | 1008158357 | IC_kwDOCGYnMM48F0aV | 9599 | 2022-01-08T21:33:07Z | 2022-01-08T21:33:07Z | OWNER | The one thing that worries me a little bit about doing this by default is that it adds a surprising new table to the database - it may be confusing to users if they run `create-index` and their database suddenly has a new `sqlite_stat1` table, see https://github.com/simonw/sqlite-utils/issues/366#issuecomment-1008157132 Options here are: - Do it anyway. People can tolerate a surprise table appearing when they create an index. - Only run `ANALYZE` if the user says `sqlite-utils create-index ... --analyze` - Use the `--analyze` option, but also automatically run `ANALYZE` if they create an index and the database they are working with already has a `sqlite_stat1` table I'm currently leaning towards that third option - @fgregg any thoughts? | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1096558279 | |
https://github.com/simonw/datasette/issues/1587#issuecomment-1008157998 | https://api.github.com/repos/simonw/datasette/issues/1587 | 1008157998 | IC_kwDOBm6k_c48F0Uu | 9599 | 2022-01-08T21:29:54Z | 2022-01-08T21:29:54Z | OWNER | Relevant code: https://github.com/simonw/datasette/blob/00a2895cd2dc42c63846216b36b2dc9f41170129/datasette/database.py#L339-L354 | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1097040427 | |
https://github.com/simonw/datasette/issues/1587#issuecomment-1008157908 | https://api.github.com/repos/simonw/datasette/issues/1587 | 1008157908 | IC_kwDOBm6k_c48F0TU | 9599 | 2022-01-08T21:29:06Z | 2022-01-08T21:29:06Z | OWNER | Depending on the SQLite version (and compile options) that ran `ANALYZE` these can be called: - `sqlite_stat1` - `sqlite_stat2` - `sqlite_stat3` - `sqlite_stat4` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1097040427 | |
https://github.com/simonw/sqlite-utils/issues/366#issuecomment-1008157132 | https://api.github.com/repos/simonw/sqlite-utils/issues/366 | 1008157132 | IC_kwDOCGYnMM48F0HM | 9599 | 2022-01-08T21:23:08Z | 2022-01-08T21:25:05Z | OWNER | Running `ANALYZE` creates a new visible table called `sqlite_stat1`: https://www.sqlite.org/fileformat.html#the_sqlite_stat1_table This should be added to the default list of hidden tables in Datasette. It looks something like this: | tbl | idx | stat | |---------------------------------|------------------------------------|-----------| | _counts | sqlite_autoindex__counts_1 | 5 1 | | global-power-plants_fts_config | global-power-plants_fts_config | 1 1 | | global-power-plants_fts_docsize | | 33643 | | global-power-plants_fts_idx | global-power-plants_fts_idx | 199 40 1 | | global-power-plants_fts_data | | 136 | | global-power-plants | "global-power-plants_owner" | 33643 4 | | global-power-plants | "global-power-plants_country_long" | 33643 202 | > In each such row, the sqlite_stat.stat column will be a string consisting of a list of integers followed by zero or more arguments. The first integer in this list is the approximate number of rows in the index. (The number of rows in the index is the same as the number of rows in the table, except for partial indexes.) The second integer is the approximate number of rows in the index that have the same value in the first column of the index. The third integer is the number of rows in the index that have the same value for the first two columns. The N-th integer (for N>1) is the estimated average number of rows in the index which have the same value for the first N-1 columns. For a K-column index, there will be K+1 integers in the stat column. If the index is unique, then the last integer will be 1. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1096563265 | |
https://github.com/simonw/sqlite-utils/issues/364#issuecomment-1008155916 | https://api.github.com/repos/simonw/sqlite-utils/issues/364 | 1008155916 | IC_kwDOCGYnMM48Fz0M | 9599 | 2022-01-08T21:16:46Z | 2022-01-08T21:16:46Z | OWNER | No, `chunks()` seems to work OK in the test I just added. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1095570074 | |
https://github.com/simonw/sqlite-utils/issues/364#issuecomment-1008154873 | https://api.github.com/repos/simonw/sqlite-utils/issues/364 | 1008154873 | IC_kwDOCGYnMM48Fzj5 | 9599 | 2022-01-08T21:11:55Z | 2022-01-08T21:11:55Z | OWNER | I'm suspicious that the `chunks()` utility function may not be working correctly: ```pycon In [10]: [list(d) for d in list(chunks('abc', 5))] Out[10]: [['a'], ['b'], ['c']] In [11]: [list(d) for d in list(chunks('abcdefghi', 5))] Out[11]: [['a'], ['b'], ['c'], ['d'], ['e'], ['f'], ['g'], ['h'], ['i']] In [12]: [list(d) for d in list(chunks('abcdefghi', 3))] Out[12]: [['a'], ['b'], ['c'], ['d'], ['e'], ['f'], ['g'], ['h'], ['i']] ``` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1095570074 | |
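The single-element output above is reproducible with an implementation along these lines (the shape of `chunks()` in `sqlite_utils/utils.py`): each chunk is a lazy `chain` sharing one underlying iterator, so materializing the outer generator with `list()` before draining each chunk skips the `islice` consumption entirely.

```python
import itertools

def chunks(sequence, size):
    # Each chunk is a lazy chain that shares one underlying iterator
    iterator = iter(sequence)
    for item in iterator:
        yield itertools.chain([item], itertools.islice(iterator, size - 1))

# Draining each chunk before advancing gives the expected grouping:
print([list(c) for c in chunks("abcdefghi", 3)])
# -> [['a', 'b', 'c'], ['d', 'e', 'f'], ['g', 'h', 'i']]

# list()-ing the outer generator first advances it without consuming the
# islices, reproducing the one-element chunks shown above:
print([list(c) for c in list(chunks("abcdefghi", 3))])
# -> [['a'], ['b'], ['c'], ['d'], ['e'], ['f'], ['g'], ['h'], ['i']]
```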
https://github.com/simonw/sqlite-utils/issues/364#issuecomment-1008153586 | https://api.github.com/repos/simonw/sqlite-utils/issues/364 | 1008153586 | IC_kwDOCGYnMM48FzPy | 9599 | 2022-01-08T21:06:15Z | 2022-01-08T21:06:15Z | OWNER | I added a print statement after `for query, params in queries_and_params` and confirmed that something in the code is waiting until 16 records are available to be inserted and then executing the inserts, even with `--batch-size 1`. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1095570074 | |
https://github.com/simonw/sqlite-utils/issues/364#issuecomment-1008151884 | https://api.github.com/repos/simonw/sqlite-utils/issues/364 | 1008151884 | IC_kwDOCGYnMM48Fy1M | 9599 | 2022-01-08T20:59:21Z | 2022-01-08T20:59:21Z | OWNER | (That Heroku example doesn't record the timestamp, which limits its usefulness) | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1095570074 | |
https://github.com/simonw/sqlite-utils/issues/364#issuecomment-1008143248 | https://api.github.com/repos/simonw/sqlite-utils/issues/364 | 1008143248 | IC_kwDOCGYnMM48FwuQ | 9599 | 2022-01-08T20:34:12Z | 2022-01-08T20:34:12Z | OWNER | Built that tool: https://github.com/simonw/stream-delay and https://pypi.org/project/stream-delay/ | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1095570074 | |
https://github.com/simonw/sqlite-utils/issues/364#issuecomment-1008129841 | https://api.github.com/repos/simonw/sqlite-utils/issues/364 | 1008129841 | IC_kwDOCGYnMM48Ftcx | 9599 | 2022-01-08T20:04:42Z | 2022-01-08T20:04:42Z | OWNER | It would be easier to test this if I had a utility for streaming out a file one line at a time. A few recipes for this in https://superuser.com/questions/526242/cat-file-to-terminal-at-particular-speed-of-lines-per-second - I'm going to build a quick `stream-delay` tool though. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1095570074 | |
https://github.com/simonw/sqlite-utils/issues/365#issuecomment-1007643254 | https://api.github.com/repos/simonw/sqlite-utils/issues/365 | 1007643254 | IC_kwDOCGYnMM48D2p2 | 9599 | 2022-01-07T18:37:56Z | 2022-01-07T18:37:56Z | OWNER | Or I could leave off `--no-analyze` and tell people that if they want to add an index without running analyze they can execute the `CREATE INDEX` themselves. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1096558279 | |
https://github.com/simonw/sqlite-utils/issues/365#issuecomment-1007642831 | https://api.github.com/repos/simonw/sqlite-utils/issues/365 | 1007642831 | IC_kwDOCGYnMM48D2jP | 9599 | 2022-01-07T18:37:18Z | 2022-01-07T18:37:18Z | OWNER | After implementing #366 I can make it so `sqlite-utils create-index` automatically runs `db.analyze(index_name)` afterwards, maybe with a `--no-analyze` option in case anyone wants to opt out of that for specific performance reasons. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1096558279 | |
https://github.com/simonw/sqlite-utils/issues/366#issuecomment-1007641634 | https://api.github.com/repos/simonw/sqlite-utils/issues/366 | 1007641634 | IC_kwDOCGYnMM48D2Qi | 9599 | 2022-01-07T18:35:35Z | 2022-01-07T18:35:35Z | OWNER | Since the existing CLI feature is this: $ sqlite-utils analyze-tables github.db tags I can add `sqlite-utils analyze` to reflect the Python library method. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1096563265 | |
https://github.com/simonw/sqlite-utils/issues/366#issuecomment-1007639860 | https://api.github.com/repos/simonw/sqlite-utils/issues/366 | 1007639860 | IC_kwDOCGYnMM48D100 | 9599 | 2022-01-07T18:32:59Z | 2022-01-07T18:33:07Z | OWNER | From the SQLite docs: > If no arguments are given, all attached databases are analyzed. If a schema name is given as the argument, then all tables and indices in that one database are analyzed. If the argument is a table name, then only that table and the indices associated with that table are analyzed. If the argument is an index name, then only that one index is analyzed. So I think this becomes two methods: - `db.analyze()` calls analyze on the whole database - `db.analyze(name_of_table_or_index)` for a specific named table or index - `table.analyze()` is a shortcut for `db.analyze(table.name)` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1096563265 | |
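A sketch of how those three methods could hang together - hypothetical minimal classes for illustration, not the eventual sqlite-utils implementation:

```python
import sqlite3

class Database:
    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)

    def analyze(self, name=None):
        # No argument: analyze the whole database; otherwise a table or
        # index name (assumed trusted here, hence simple interpolation)
        if name is None:
            self.conn.execute("ANALYZE")
        else:
            self.conn.execute("ANALYZE [{}]".format(name))
        self.conn.commit()

class Table:
    def __init__(self, db, name):
        self.db = db
        self.name = name

    def analyze(self):
        # Shortcut for db.analyze(table.name)
        return self.db.analyze(self.name)

db = Database()
db.conn.execute("CREATE TABLE dogs (name TEXT)")
db.conn.execute("INSERT INTO dogs VALUES ('Cleo')")
db.conn.execute("CREATE INDEX idx_dogs_name ON dogs (name)")
Table(db, "dogs").analyze()
rows = db.conn.execute("SELECT tbl, idx FROM sqlite_stat1").fetchall()
print(rows)  # sqlite_stat1 now has a row for idx_dogs_name
```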
https://github.com/simonw/sqlite-utils/issues/366#issuecomment-1007637963 | https://api.github.com/repos/simonw/sqlite-utils/issues/366 | 1007637963 | IC_kwDOCGYnMM48D1XL | 9599 | 2022-01-07T18:30:13Z | 2022-01-07T18:30:13Z | OWNER | Annoyingly I use the word "analyze" to mean something else in the CLI - for these features: - #207 - #320 there's only one method with a similar name in the Python library though and that's this one: https://github.com/simonw/sqlite-utils/blob/6e46b9913411682f3a3ec66f4d58886c1db8654b/sqlite_utils/db.py#L2904-L2906 | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1096563265 | |
https://github.com/simonw/sqlite-utils/issues/365#issuecomment-1007634999 | https://api.github.com/repos/simonw/sqlite-utils/issues/365 | 1007634999 | IC_kwDOCGYnMM48D0o3 | 9599 | 2022-01-07T18:26:22Z | 2022-01-07T18:26:22Z | OWNER | I've not used the `ANALYZE` feature in SQLite at all before. Should probably add Python library methods for it. Annoyingly I use the word "analyze" to mean something else in the CLI - for these features: - #207 - #320 | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1096558279 | |
https://github.com/simonw/sqlite-utils/issues/365#issuecomment-1007633376 | https://api.github.com/repos/simonw/sqlite-utils/issues/365 | 1007633376 | IC_kwDOCGYnMM48D0Pg | 9599 | 2022-01-07T18:24:07Z | 2022-01-07T18:24:07Z | OWNER | Relevant documentation: https://www.sqlite.org/lang_analyze.html | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1096558279 | |
https://github.com/simonw/sqlite-utils/issues/363#issuecomment-1006344080 | https://api.github.com/repos/simonw/sqlite-utils/issues/363 | 1006344080 | IC_kwDOCGYnMM47-5eQ | 9599 | 2022-01-06T07:32:05Z | 2022-01-06T07:32:05Z | OWNER | As part of this work I should add test coverage of this error message too: https://github.com/simonw/sqlite-utils/blob/413f8ed754e38d7b190de888c85fe8438336cb11/sqlite_utils/cli.py#L826 | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1094981339 | |
https://github.com/simonw/sqlite-utils/issues/363#issuecomment-1006343303 | https://api.github.com/repos/simonw/sqlite-utils/issues/363 | 1006343303 | IC_kwDOCGYnMM47-5SH | 9599 | 2022-01-06T07:30:20Z | 2022-01-06T07:30:20Z | OWNER | This check should run inside the `.insert_all()` method. It should raise a custom exception which the CLI code can then catch and turn into a click error. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1094981339 | |
https://github.com/simonw/sqlite-utils/issues/356#issuecomment-1006318443 | https://api.github.com/repos/simonw/sqlite-utils/issues/356 | 1006318443 | IC_kwDOCGYnMM47-zNr | 9599 | 2022-01-06T06:30:13Z | 2022-01-06T06:30:13Z | OWNER | Documentation: - https://sqlite-utils.datasette.io/en/latest/cli.html#inserting-unstructured-data-with-lines-and-text - https://sqlite-utils.datasette.io/en/latest/cli.html#applying-conversions-while-inserting-data | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1077431957 | |
https://github.com/simonw/sqlite-utils/issues/356#issuecomment-1006318007 | https://api.github.com/repos/simonw/sqlite-utils/issues/356 | 1006318007 | IC_kwDOCGYnMM47-zG3 | 9599 | 2022-01-06T06:28:53Z | 2022-01-06T06:28:53Z | OWNER | Implemented in #361. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1077431957 | |
https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006315145 | https://api.github.com/repos/simonw/sqlite-utils/issues/361 | 1006315145 | IC_kwDOCGYnMM47-yaJ | 9599 | 2022-01-06T06:20:51Z | 2022-01-06T06:20:51Z | OWNER | This is all documented. I'm going to rebase-merge it to keep the individual commits. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1094890366 | |
https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006311742 | https://api.github.com/repos/simonw/sqlite-utils/issues/361 | 1006311742 | IC_kwDOCGYnMM47-xk- | 9599 | 2022-01-06T06:12:19Z | 2022-01-06T06:12:19Z | OWNER | Got that working: ``` % echo 'This is cool' | sqlite-utils insert words.db words - --text --convert '({"word": w} for w in text.split())' % sqlite-utils dump words.db BEGIN TRANSACTION; CREATE TABLE [words] ( [word] TEXT ); INSERT INTO "words" VALUES('This'); INSERT INTO "words" VALUES('is'); INSERT INTO "words" VALUES('cool'); COMMIT; ``` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1094890366 | |
https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006309834 | https://api.github.com/repos/simonw/sqlite-utils/issues/361 | 1006309834 | IC_kwDOCGYnMM47-xHK | 9599 | 2022-01-06T06:08:01Z | 2022-01-06T06:08:01Z | OWNER | For `--text` the conversion function should be allowed to return an iterable instead of a dictionary, in which case it will be treated as the full list of records to be inserted. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1094890366 | |
https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006301546 | https://api.github.com/repos/simonw/sqlite-utils/issues/361 | 1006301546 | IC_kwDOCGYnMM47-vFq | 9599 | 2022-01-06T05:44:47Z | 2022-01-06T05:44:47Z | OWNER | Just need documentation for `--convert` now against the various different types of input. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1094890366 | |
https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006300280 | https://api.github.com/repos/simonw/sqlite-utils/issues/361 | 1006300280 | IC_kwDOCGYnMM47-ux4 | 9599 | 2022-01-06T05:40:45Z | 2022-01-06T05:40:45Z | OWNER | I'm going to rename `--all` to `--text`: > - Use `--text` to write the entire input to a column called "text" To avoid that clash with Python's `all()` function. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1094890366 | |
https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006299778 | https://api.github.com/repos/simonw/sqlite-utils/issues/361 | 1006299778 | IC_kwDOCGYnMM47-uqC | 9599 | 2022-01-06T05:39:10Z | 2022-01-06T05:39:10Z | OWNER | `all` is a bad variable name because it clashes with the Python `all()` built-in function. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1094890366 | |
https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006295276 | https://api.github.com/repos/simonw/sqlite-utils/issues/361 | 1006295276 | IC_kwDOCGYnMM47-tjs | 9599 | 2022-01-06T05:26:11Z | 2022-01-06T05:26:11Z | OWNER | Here's the traceback if your `--convert` function doesn't return a dict right now: ``` % sqlite-utils insert /tmp/all.db blah /tmp/log.log --convert 'all.upper()' --all Traceback (most recent call last): File "/Users/simon/.local/share/virtualenvs/sqlite-utils-C4Ilevlm/bin/sqlite-utils", line 33, in <module> sys.exit(load_entry_point('sqlite-utils', 'console_scripts', 'sqlite-utils')()) File "/Users/simon/.local/share/virtualenvs/sqlite-utils-C4Ilevlm/lib/python3.8/site-packages/click/core.py", line 1137, in __call__ return self.main(*args, **kwargs) File "/Users/simon/.local/share/virtualenvs/sqlite-utils-C4Ilevlm/lib/python3.8/site-packages/click/core.py", line 1062, in main rv = self.invoke(ctx) File "/Users/simon/.local/share/virtualenvs/sqlite-utils-C4Ilevlm/lib/python3.8/site-packages/click/core.py", line 1668, in invoke return _process_result(sub_ctx.command.invoke(sub_ctx)) File "/Users/simon/.local/share/virtualenvs/sqlite-utils-C4Ilevlm/lib/python3.8/site-packages/click/core.py", line 1404, in invoke return ctx.invoke(self.callback, **ctx.params) File "/Users/simon/.local/share/virtualenvs/sqlite-utils-C4Ilevlm/lib/python3.8/site-packages/click/core.py", line 763, in invoke return __callback(*args, **kwargs) File "/Users/simon/Dropbox/Development/sqlite-utils/sqlite_utils/cli.py", line 949, in insert insert_upsert_implementation( File "/Users/simon/Dropbox/Development/sqlite-utils/sqlite_utils/cli.py", line 834, in insert_upsert_implementation db[table].insert_all( File "/Users/simon/Dropbox/Development/sqlite-utils/sqlite_utils/db.py", line 2602, in insert_all first_record = next(records) File "/Users/simon/Dropbox/Development/sqlite-utils/sqlite_utils/db.py", line 3044, in fix_square_braces for record in records: File 
"/Users/simon/Dropbox/Development/sqlite-utils/sqlite_utils/cli.py", line 831, in <genexpr> docs = (decode_base64_values(doc) for doc in docs) File "/Users/simon/Dropbox/Development/s… | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1094890366 | |
https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006294777 | https://api.github.com/repos/simonw/sqlite-utils/issues/361 | 1006294777 | IC_kwDOCGYnMM47-tb5 | 9599 | 2022-01-06T05:24:54Z | 2022-01-06T05:24:54Z | OWNER | > I added a custom error message for if the user's `--convert` code doesn't return a dict. That turned out to be a bad idea because it meant exhausting the iterator early for the check - before we got to the `.insert_all()` code that breaks the iterator up into chunks. I tried fixing that with `itertools.tee()` to run the generator twice but that's grossly memory-inefficient for large imports. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1094890366 | |
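The comment above notes that `itertools.tee()` is memory-inefficient for checking the first record of a large import. A common alternative, sketched below, is to pull just the first item, validate it, and chain it back in front of the remaining iterator. This is an illustrative sketch of the technique, not the actual sqlite-utils implementation.

```python
import itertools


def validate_first_record(records):
    # Peek at the first record without exhausting the generator: pull one
    # item, check it, then chain it back in front of the rest. Unlike
    # itertools.tee(), this never buffers the full stream in memory.
    iterator = iter(records)
    try:
        first = next(iterator)
    except StopIteration:
        return iter([])
    if not isinstance(first, dict):
        raise TypeError(
            "Records must be dicts, got {!r}".format(type(first).__name__)
        )
    return itertools.chain([first], iterator)


fixed = validate_first_record({"n": i} for i in range(3))
print(list(fixed))  # all three records are preserved
```

The validation cost is one `next()` call up front; the chunking code downstream sees an iterator that behaves exactly like the original.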
https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006288444 | https://api.github.com/repos/simonw/sqlite-utils/issues/361 | 1006288444 | IC_kwDOCGYnMM47-r48 | 9599 | 2022-01-06T05:07:10Z | 2022-01-06T05:07:10Z | OWNER | And here's a demo of `--convert` used with `--all` - I added a custom error message for if the user's `--convert` code doesn't return a dict. ``` % sqlite-utils insert /tmp/all.db blah /tmp/log.log --convert 'all.upper()' --all Error: Records returned by your --convert function must be dicts % sqlite-utils insert /tmp/all.db blah /tmp/log.log --convert '{"all": all.upper()}' --all % sqlite-utils dump /tmp/all.db BEGIN TRANSACTION; CREATE TABLE [blah] ( [all] TEXT ); INSERT INTO "blah" VALUES('INFO: 127.0.0.1:60581 - "GET / HTTP/1.1" 200 OK INFO: 127.0.0.1:60581 - "GET /FOO/-/STATIC/APP.CSS?CEAD5A HTTP/1.1" 200 OK INFO: 127.0.0.1:60581 - "GET /FAVICON.ICO HTTP/1.1" 200 OK INFO: 127.0.0.1:60581 - "GET /FOO/TIDDLYWIKI HTTP/1.1" 200 OK INFO: 127.0.0.1:60581 - "GET /FOO/-/STATIC/APP.CSS?CEAD5A HTTP/1.1" 200 OK INFO: 127.0.0.1:60584 - "GET /FOO/-/STATIC/SQL-FORMATTER-2.3.3.MIN.JS HTTP/1.1" 200 OK INFO: 127.0.0.1:60586 - "GET /FOO/-/STATIC/CODEMIRROR-5.57.0.MIN.JS HTTP/1.1" 200 OK INFO: 127.0.0.1:60585 - "GET /FOO/-/STATIC/CODEMIRROR-5.57.0.MIN.CSS HTTP/1.1" 200 OK INFO: 127.0.0.1:60588 - "GET /FOO/-/STATIC/CODEMIRROR-5.57.0-SQL.MIN.JS HTTP/1.1" 200 OK INFO: 127.0.0.1:60587 - "GET /FOO/-/STATIC/CM-RESIZE-1.0.1.MIN.JS HTTP/1.1" 200 OK INFO: 127.0.0.1:60586 - "GET /FOO/TIDDLYWIKI/TIDDLERS HTTP/1.1" 200 OK INFO: 127.0.0.1:60586 - "GET /FOO/-/STATIC/APP.CSS?CEAD5A HTTP/1.1" 200 OK INFO: 127.0.0.1:60584 - "GET /FOO/-/STATIC/TABLE.JS HTTP/1.1" 200 OK '); COMMIT; ``` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1094890366 | |
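The demo above shows the error case first: `'all.upper()'` evaluates to a string, while `'{"all": all.upper()}'` evaluates to a one-column dict, which is what `--convert` requires. A minimal plain-Python sketch of the distinction (the `apply_convert` helper is hypothetical; the real CLI compiles the fragment differently, and the `all` variable name shadows the builtin here just as it does in the CLI fragment):

```python
def apply_convert(expression, all):
    # Hypothetical helper: evaluate a --convert style expression with the
    # input exposed as a local variable named "all".
    return eval(expression, {}, {"all": all})


bad = apply_convert("all.upper()", "info: ok")
good = apply_convert('{"all": all.upper()}', "info: ok")
print(type(bad).__name__)   # str - this is the case that triggers the error
print(good)                 # {'all': 'INFO: OK'} - a valid record
```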
https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006284673 | https://api.github.com/repos/simonw/sqlite-utils/issues/361 | 1006284673 | IC_kwDOCGYnMM47-q-B | 9599 | 2022-01-06T04:55:52Z | 2022-01-06T04:55:52Z | OWNER | Test code that just worked for me: ``` sqlite-utils insert /tmp/blah.db blah /tmp/log.log --convert ' bits = line.split() return dict([("b_{}".format(i), bit) for i, bit in enumerate(bits)])' --lines ``` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1094890366 | |
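The `--convert` fragment in the test command above can be read as a plain function: split the line on whitespace and number the resulting columns. A sketch, using a made-up sample line rather than the actual log file:

```python
def convert(line):
    # Same logic as the --convert fragment: one column per whitespace-
    # separated token, named b_0, b_1, ...
    bits = line.split()
    return dict([("b_{}".format(i), bit) for i, bit in enumerate(bits)])


row = convert("INFO: GET / 200")
print(row)  # {'b_0': 'INFO:', 'b_1': 'GET', 'b_2': '/', 'b_3': '200'}
```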
https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006232013 | https://api.github.com/repos/simonw/sqlite-utils/issues/361 | 1006232013 | IC_kwDOCGYnMM47-eHN | 9599 | 2022-01-06T02:21:35Z | 2022-01-06T02:21:35Z | OWNER | I'm having second thoughts about this bit: > Your Python code will be passed a "row" variable representing the imported row, and can return a modified row. > > If you are using `--lines` your code will be passed a "line" variable, and for `--all` an "all" variable. The code in question is this: https://github.com/simonw/sqlite-utils/blob/500a35ad4d91c8a6232134ce9406efec11bedff8/sqlite_utils/utils.py#L296-L303 Do I really want to add the complexity of supporting different variable names there? I think always using `value` might be better. Except... `value` made sense for the existing `sqlite-utils convert` command where you are running a conversion function against the value for the column in the current row - is it confusing if applied to lines or documents or `all`? | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1094890366 | |
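The design question above - whether the user's code fragment should see `row`, `line`, `all`, or always `value` - comes down to what name the compiled function's single parameter gets. A sketch of that mechanism, with a hypothetical helper name and simpler wrapping than the linked `utils.py` code:

```python
def make_converter(code, variable="value"):
    # Wrap the user-supplied body in a function whose one parameter name
    # is configurable - the crux of the row/line/all/value question.
    wrapped = "def fn({}):\n".format(variable) + "".join(
        "    {}\n".format(line) for line in code.splitlines()
    )
    namespace = {}
    exec(wrapped, namespace)
    return namespace["fn"]


upper = make_converter("return line.upper()", variable="line")
print(upper("get /"))  # GET /
```

Supporting per-mode names costs only this one parameter, which is the complexity being weighed against the consistency of always using `value`.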
https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006230411 | https://api.github.com/repos/simonw/sqlite-utils/issues/361 | 1006230411 | IC_kwDOCGYnMM47-duL | 9599 | 2022-01-06T02:17:35Z | 2022-01-06T02:17:35Z | OWNER | Documentation: https://github.com/simonw/sqlite-utils/blob/33223856ff7fe746b7b77750fbe5b218531d0545/docs/cli.rst#inserting-unstructured-data-with---lines-and---all - I went with a single section titled "Inserting unstructured data with --lines and --all" | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1094890366 | |
https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006220129 | https://api.github.com/repos/simonw/sqlite-utils/issues/361 | 1006220129 | IC_kwDOCGYnMM47-bNh | 9599 | 2022-01-06T01:52:26Z | 2022-01-06T01:52:26Z | OWNER | I'm going to refactor all of the tests for `sqlite-utils insert` into a new `test_cli_insert.py` module. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1094890366 | |
https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006219848 | https://api.github.com/repos/simonw/sqlite-utils/issues/361 | 1006219848 | IC_kwDOCGYnMM47-bJI | 9599 | 2022-01-06T01:51:36Z | 2022-01-06T01:51:36Z | OWNER | So far I've just implemented the new help: ``` % sqlite-utils insert --help Usage: sqlite-utils insert [OPTIONS] PATH TABLE FILE Insert records from FILE into a table, creating the table if it does not already exist. By default the input is expected to be a JSON array of objects. Or: - Use --nl for newline-delimited JSON objects - Use --csv or --tsv for comma-separated or tab-separated input - Use --lines to write each incoming line to a column called "line" - Use --all to write the entire input to a column called "all" You can also use --convert to pass a fragment of Python code that will be used to convert each input. Your Python code will be passed a "row" variable representing the imported row, and can return a modified row. If you are using --lines your code will be passed a "line" variable, and for --all an "all" variable. Options: --pk TEXT Columns to use as the primary key, e.g. id --flatten Flatten nested JSON objects, so {"a": {"b": 1}} becomes {"a_b": 1} --nl Expect newline-delimited JSON -c, --csv Expect CSV input --tsv Expect TSV input --lines Treat each line as a single value called 'line' --all Treat input as a single value called 'all' --convert TEXT Python code to convert each item --import TEXT Python modules to import --delimiter TEXT Delimiter to use for CSV files --quotechar TEXT Quote character to use for CSV/TSV --sniff Detect delimiter and quote character --no-headers CSV file has no header row --batch-size INTEGER Commit every X records --alter Alter existing table to add any missing columns --not-null TEXT Columns that should be created as NOT NULL --default <TEXT TEXT>... 
Default value that should be set for a column --e… | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1094890366 | |
https://github.com/simonw/sqlite-utils/issues/356#issuecomment-997496626 | https://api.github.com/repos/simonw/sqlite-utils/issues/356 | 997496626 | IC_kwDOCGYnMM47dJcy | 9599 | 2021-12-20T00:38:15Z | 2022-01-06T01:29:03Z | OWNER | The implementation of this gets a tiny bit complicated. Ignoring `--convert`, the `--lines` option can internally produce `{"line": ...}` records and the `--all` option can produce `{"all": ...}` records. But... when `--convert` is used, what should the code run against? It could run against those already-converted records but that's a little bit strange, since you'd have to do this: sqlite-utils insert blah.db blah myfile.txt --all --convert '{"item": s for s in value["all"].split("-")}' Having to use `value["all"]` there is unintuitive. It would be nicer to have a `all` variable to work against. But then for `--lines` should the local variable be called `line`? And how best to summarize these different names for local variables in the inline help for the feature? | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1077431957 |
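The internal record shapes described in the comment above - `--lines` yielding `{"line": ...}` per input line and `--all` yielding a single `{"all": ...}` record - can be sketched as two small generators. Function names here are illustrative, not the actual sqlite-utils internals:

```python
import io


def lines_records(fp):
    # --lines: one record per input line, in a column called "line".
    for line in fp:
        yield {"line": line.rstrip("\n")}


def all_record(fp):
    # --all: the entire input as a single record, in a column called "all".
    yield {"all": fp.read()}


fp = io.StringIO("first\nsecond\n")
print(list(lines_records(fp)))  # [{'line': 'first'}, {'line': 'second'}]
```

A `--convert` function then runs against these records (or their raw values, depending on which variable-naming design wins out).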