html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,issue,performed_via_github_app
https://github.com/simonw/sqlite-utils/issues/446#issuecomment-1162234441,https://api.github.com/repos/simonw/sqlite-utils/issues/446,1162234441,IC_kwDOCGYnMM5FRkpJ,9599,2022-06-21T19:28:35Z,2022-06-21T19:28:35Z,OWNER,"`just -l` now does this:
```
% just -l
Available recipes:
black # Apply Black
cog # Rebuild docs with cog
default # Run tests and linters
lint # Run linters: black, flake8, mypy, cog
test *options # Run pytest with supplied options
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1277328147,
https://github.com/simonw/sqlite-utils/issues/297#issuecomment-1162231111,https://api.github.com/repos/simonw/sqlite-utils/issues/297,1162231111,IC_kwDOCGYnMM5FRj1H,9599,2022-06-21T19:25:44Z,2022-06-21T19:25:44Z,OWNER,Pushed that prototype to a branch.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",944846776,
https://github.com/simonw/sqlite-utils/issues/297#issuecomment-1162223668,https://api.github.com/repos/simonw/sqlite-utils/issues/297,1162223668,IC_kwDOCGYnMM5FRiA0,9599,2022-06-21T19:19:22Z,2022-06-21T19:22:15Z,OWNER,"Built a prototype of `--fast` for the `sqlite-utils memory` command:
```
% time sqlite-utils memory taxi.csv 'SELECT passenger_count, COUNT(*), AVG(total_amount) FROM taxi GROUP BY passenger_count' --fast
passenger_count COUNT(*) AVG(total_amount)
--------------- -------- -----------------
128020 32.2371511482553
0 42228 17.0214016766151
1 1533197 17.6418833067999
2 286461 18.0975870711456
3 72852 17.9153958710923
4 25510 18.452774990196
5 50291 17.2709248175672
6 32623 17.6002964166367
7 2 87.17
8 2 95.705
9 1 113.6
sqlite-utils memory taxi.csv --fast 12.71s user 0.48s system 104% cpu 12.627 total
```
Takes 13s - about the same time as calling `sqlite3 :memory: ...` directly as seen in https://til.simonwillison.net/sqlite/one-line-csv-operations
Without the `--fast` option that takes several minutes (262s = 4m20s)!
Here's the prototype so far:
```diff
diff --git a/sqlite_utils/cli.py b/sqlite_utils/cli.py
index 86eddfb..1c83ef6 100644
--- a/sqlite_utils/cli.py
+++ b/sqlite_utils/cli.py
@@ -14,6 +14,8 @@ import io
import itertools
import json
import os
+import shutil
+import subprocess
import sys
import csv as csv_std
import tabulate
@@ -1669,6 +1671,7 @@ def query(
is_flag=True,
help=""Analyze resulting tables and output results"",
)
+@click.option(""--fast"", is_flag=True, help=""Fast mode, only works with CSV and TSV"")
@load_extension_option
def memory(
paths,
@@ -1692,6 +1695,7 @@ def memory(
save,
analyze,
load_extension,
+ fast,
):
""""""Execute SQL query against an in-memory database, optionally populated by imported data
@@ -1719,6 +1723,22 @@ def memory(
\b
sqlite-utils memory animals.csv --schema
""""""
+ if fast:
+ if (
+ attach
+ or flatten
+ or param
+ or encoding
+ or no_detect_types
+ or analyze
+ or load_extension
+ ):
+ raise click.ClickException(
+ ""--fast mode does not support any of the following options: --attach, --flatten, --param, --encoding, --no-detect-types, --analyze, --load-extension""
+ )
+ # TODO: Figure out and pass other supported options
+ memory_fast(paths, sql)
+ return
db = sqlite_utils.Database(memory=True)
# If --dump or --save or --analyze used but no paths detected, assume SQL query is a path:
if (dump or save or schema or analyze) and not paths:
@@ -1791,6 +1811,33 @@ def memory(
)
+def memory_fast(paths, sql):
+ if not shutil.which(""sqlite3""):
+ raise click.ClickException(""sqlite3 not found in PATH"")
+ args = [""sqlite3"", "":memory:"", ""-cmd"", "".mode csv""]
+ table_names = []
+
+ def name(path):
+ base_name = pathlib.Path(path).stem or ""t""
+ table_name = base_name
+ prefix = 1
+ while table_name in table_names:
+ prefix += 1
+ table_name = ""{}_{}"".format(base_name, prefix)
+ return table_name
+
+ for path in paths:
+ table_name = name(path)
+ table_names.append(table_name)
+ args.extend(
+ [""-cmd"", "".import {} {}"".format(pathlib.Path(path).resolve(), table_name)]
+ )
+
+ args.extend([""-cmd"", "".mode column""])
+ args.append(sql)
+ subprocess.run(args)
+
+
def _execute_query(
db, sql, param, raw, table, csv, tsv, no_headers, fmt, nl, arrays, json_cols
):
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",944846776,
https://github.com/simonw/sqlite-utils/issues/447#issuecomment-1162186856,https://api.github.com/repos/simonw/sqlite-utils/issues/447,1162186856,IC_kwDOCGYnMM5FRZBo,9599,2022-06-21T18:48:46Z,2022-06-21T18:48:46Z,OWNER,"That fixed it:
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1278571700,
https://github.com/simonw/sqlite-utils/issues/297#issuecomment-1162179354,https://api.github.com/repos/simonw/sqlite-utils/issues/297,1162179354,IC_kwDOCGYnMM5FRXMa,9599,2022-06-21T18:44:03Z,2022-06-21T18:44:03Z,OWNER,The thing I like about that `--fast` option is that it could selectively use this alternative mechanism just for the files for which it can work (CSV and TSV files). I could also add a `--fast` option to `sqlite-utils memory` which could then kick in only for operations that involve just TSV and CSV files.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",944846776,
https://github.com/simonw/sqlite-utils/issues/447#issuecomment-1161869859,https://api.github.com/repos/simonw/sqlite-utils/issues/447,1161869859,IC_kwDOCGYnMM5FQLoj,9599,2022-06-21T15:00:42Z,2022-06-21T15:00:42Z,OWNER,Deploying that to https://sqlite-utils.datasette.io/en/latest/cli-reference.html#insert,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1278571700,
https://github.com/simonw/sqlite-utils/issues/447#issuecomment-1161857806,https://api.github.com/repos/simonw/sqlite-utils/issues/447,1161857806,IC_kwDOCGYnMM5FQIsO,9599,2022-06-21T14:55:51Z,2022-06-21T14:58:14Z,OWNER,"https://stackoverflow.com/a/44379513 suggests that the fix is:
.. code-block:: text
Or set this in `conf.py`:
highlight_language = ""none""
I like that better - I don't like that all `::` blocks default to being treated as Python code.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1278571700,
https://github.com/simonw/sqlite-utils/issues/297#issuecomment-1161849874,https://api.github.com/repos/simonw/sqlite-utils/issues/297,1161849874,IC_kwDOCGYnMM5FQGwS,9599,2022-06-21T14:49:12Z,2022-06-21T14:49:12Z,OWNER,"Since there are all sorts of existing options for `sqlite-utils insert` that won't work with this, maybe it would be better to have an entirely separate command - this for example:
sqlite-utils fast-insert data.db mytable data.csv ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",944846776,