html_url,issue_url,id,node_id,user,user_label,created_at,updated_at,author_association,body,reactions,issue,issue_label,performed_via_github_app https://github.com/simonw/sqlite-utils/issues/297#issuecomment-1732018273,https://api.github.com/repos/simonw/sqlite-utils/issues/297,1732018273,IC_kwDOCGYnMM5nPIBh,1108600,radusuciu,2023-09-22T20:49:51Z,2023-09-22T20:49:51Z,NONE,This would be awesome to have for multi-gig tsv and csv files! I'm currently looking at a 10 hour countdown for one such import. Not a problem because I'm lazy and happy to let it run and check on it tomorrow.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",944846776,Option for importing CSV data using the SQLite .import mechanism, https://github.com/simonw/sqlite-utils/issues/297#issuecomment-1262920929,https://api.github.com/repos/simonw/sqlite-utils/issues/297,1262920929,IC_kwDOCGYnMM5LRqTh,9599,simonw,2022-09-29T23:06:44Z,2022-09-29T23:06:44Z,OWNER,"Currently the only other use of `-t` is for this: ``` -t, --table Output as a formatted table ``` So I think it's OK to use it to mean something slightly different for this command, since `sqlite-utils insert` doesn't do any output of data in any format.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",944846776,Option for importing CSV data using the SQLite .import mechanism, https://github.com/simonw/sqlite-utils/issues/297#issuecomment-1262918833,https://api.github.com/repos/simonw/sqlite-utils/issues/297,1262918833,IC_kwDOCGYnMM5LRpyx,9599,simonw,2022-09-29T23:02:52Z,2022-09-29T23:02:52Z,OWNER,"The other nice thing about having this as a separate command is that I can implement a tiny subset of the overall `sqlite-utils insert` features at first, and then add additional features in subsequent releases.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 
0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",944846776,Option for importing CSV data using the SQLite .import mechanism, https://github.com/simonw/sqlite-utils/issues/297#issuecomment-1262917059,https://api.github.com/repos/simonw/sqlite-utils/issues/297,1262917059,IC_kwDOCGYnMM5LRpXD,9599,simonw,2022-09-29T22:59:28Z,2022-09-29T22:59:28Z,OWNER,"I quite like `sqlite-utils fast-csv` - I think it's clear enough what it does, and running `--help` can clarify if needed.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",944846776,Option for importing CSV data using the SQLite .import mechanism, https://github.com/simonw/sqlite-utils/issues/297#issuecomment-1262915322,https://api.github.com/repos/simonw/sqlite-utils/issues/297,1262915322,IC_kwDOCGYnMM5LRo76,9599,simonw,2022-09-29T22:57:31Z,2022-09-29T22:57:42Z,OWNER,Maybe `sqlite-utils fast-csv` is right? Not entirely clear that's an insert though as opposed to a faster version of in-memory querying in the style of `sqlite-utils memory`.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",944846776,Option for importing CSV data using the SQLite .import mechanism, https://github.com/simonw/sqlite-utils/issues/297#issuecomment-1262914416,https://api.github.com/repos/simonw/sqlite-utils/issues/297,1262914416,IC_kwDOCGYnMM5LRotw,9599,simonw,2022-09-29T22:56:53Z,2022-09-29T22:56:53Z,OWNER,"Potential names/designs: - `sqlite-utils fast data.db rows rows.csv` - `sqlite-utils insert-fast data.db rows rows.csv` - `sqlite-utils fast-csv data.db rows rows.csv` Or more interestingly... what if it could accept multiple CSV files to create multiple tables? - `sqlite-utils fast data.db rows.csv other.csv` Would still need to support creating tables with different names though. 
Maybe like this: - `sqlite-utils fast data.db -t mytable rows.csv -t othertable other.csv` I seem to be leaning towards `fast` as the command name, but as a standalone command name it's a bit meaningless - how do we know that's about CSV import and not about fast querying or similar?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",944846776,Option for importing CSV data using the SQLite .import mechanism, https://github.com/simonw/sqlite-utils/issues/297#issuecomment-1262913145,https://api.github.com/repos/simonw/sqlite-utils/issues/297,1262913145,IC_kwDOCGYnMM5LRoZ5,9599,simonw,2022-09-29T22:54:13Z,2022-09-29T22:54:13Z,OWNER,"After reviewing `sqlite-utils insert --help` I'm confident that MOST of these options wouldn't make sense for a ""fast"" mode that just supports CSV and works by piping directly to the `sqlite3` binary: https://github.com/simonw/sqlite-utils/blob/d792dad1cf5f16525da81b1e162fb71d469995f3/docs/cli-reference.rst#L251-L279 I'm going to implement a separate command instead.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",944846776,Option for importing CSV data using the SQLite .import mechanism, https://github.com/simonw/sqlite-utils/issues/297#issuecomment-1247161510,https://api.github.com/repos/simonw/sqlite-utils/issues/297,1247161510,IC_kwDOCGYnMM5KViym,9599,simonw,2022-09-14T18:39:50Z,2022-09-14T18:39:50Z,OWNER,Wrote that up as a TIL: https://til.simonwillison.net/python/pypy-macos,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",944846776,Option for importing CSV data using the SQLite .import mechanism, 
https://github.com/simonw/sqlite-utils/issues/297#issuecomment-1247149969,https://api.github.com/repos/simonw/sqlite-utils/issues/297,1247149969,IC_kwDOCGYnMM5KVf-R,9599,simonw,2022-09-14T18:28:53Z,2022-09-14T18:29:34Z,OWNER,"As an aside, https://avi.im/blag/2021/fast-sqlite-inserts/ inspired me to try pypy since that article claimed to get a 2.5x speedup using pypy compared to regular Python for a CSV import script. Setup: ``` brew install pypy3 cd /tmp pypy3 -m venv venv source venv/bin/activate pip install sqlite-utils ``` I grabbed the first 760M of that `https://static.openfoodfacts.org/data/en.openfoodfacts.org.products.csv` file (didn't wait for the whole thing to download). Then: ``` time sqlite-utils insert pypy.db t en.openfoodfacts.org.products.csv --csv [------------------------------------] 0% [###################################-] 99% 11.76s user 2.26s system 93% cpu 14.981 total ``` Compared to regular Python `sqlite-utils` doing the same thing: ``` time sqlite-utils insert py.db t en.openfoodfacts.org.products.csv --csv [------------------------------------] 0% [###################################-] 99% 11.36s user 2.06s system 93% cpu 14.341 total ``` So no perceivable performance difference.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",944846776,Option for importing CSV data using the SQLite .import mechanism, https://github.com/simonw/sqlite-utils/issues/297#issuecomment-1246978641,https://api.github.com/repos/simonw/sqlite-utils/issues/297,1246978641,IC_kwDOCGYnMM5KU2JR,9599,simonw,2022-09-14T15:57:41Z,2022-09-14T15:57:41Z,OWNER,"One solution suggested on Discord: ``` wget https://static.openfoodfacts.org/data/en.openfoodfacts.org.products.csv CREATE=`curl -s -L https://gist.githubusercontent.com/CharlesNepote/80fb813a416ad445fdd6e4738b4c8156/raw/032af70de631ff1c4dd09d55360f242949dcc24f/create.sql` INDEX=`curl -s -L 
https://gist.githubusercontent.com/CharlesNepote/80fb813a416ad445fdd6e4738b4c8156/raw/032af70de631ff1c4dd09d55360f242949dcc24f/index.sql` time sqlite3 products_new.db < The source argument is the name of a file to be read or, if it begins with a ""|"" character, specifies a command which will be run to produce the input CSV data.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",944846776,Option for importing CSV data using the SQLite .import mechanism, https://github.com/simonw/sqlite-utils/issues/297#issuecomment-880256058,https://api.github.com/repos/simonw/sqlite-utils/issues/297,880256058,MDEyOklzc3VlQ29tbWVudDg4MDI1NjA1OA==,9599,simonw,2021-07-14T22:40:01Z,2021-07-14T22:40:47Z,OWNER,"Full docs here: https://www.sqlite.org/draft/cli.html#csv One catch: how this works has changed in recent SQLite versions: https://www.sqlite.org/changes.html - 2020-12-01 (3.34.0) - ""Table name quoting works correctly for the .import dot-command"" - 2020-05-22 (3.32.0) - ""Add options to the .import command: --csv, --ascii, --skip"" - 2017-08-01 (3.20.0) - "" The "".import"" command ignores an initial UTF-8 BOM."" The ""skip"" feature is particularly important to understand. https://www.sqlite.org/draft/cli.html#csv says: > There are two cases to consider: (1) Table ""tab1"" does not previously exist and (2) table ""tab1"" does already exist. > > In the first case, when the table does not previously exist, the table is automatically created and the content of the first row of the input CSV file is used to determine the name of all the columns in the table. In other words, if the table does not previously exist, the first row of the CSV file is interpreted to be column names and the actual data starts on the second row of the CSV file. > > For the second case, when the table already exists, every row of the CSV file, including the first row, is assumed to be actual content. 
If the CSV file contains an initial row of column labels, you can cause the .import command to skip that initial row using the ""--skip 1"" option. But the `--skip 1` option is only available in 3.32.0 and higher.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",944846776,Option for importing CSV data using the SQLite .import mechanism,