issue_comments
22 rows where author_association = "OWNER" and issue = 944846776 sorted by updated_at descending
This data as json, CSV (advanced)
Suggested facets: reactions, created_at (date), updated_at (date)
issue 1
- Option for importing CSV data using the SQLite .import mechanism · 22 ✖
id | html_url | issue_url | node_id | user | created_at | updated_at ▲ | author_association | body | reactions | issue | performed_via_github_app |
---|---|---|---|---|---|---|---|---|---|---|---|
1262920929 | https://github.com/simonw/sqlite-utils/issues/297#issuecomment-1262920929 | https://api.github.com/repos/simonw/sqlite-utils/issues/297 | IC_kwDOCGYnMM5LRqTh | simonw 9599 | 2022-09-29T23:06:44Z | 2022-09-29T23:06:44Z | OWNER | Currently the only other use of |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Option for importing CSV data using the SQLite .import mechanism 944846776 | |
1262918833 | https://github.com/simonw/sqlite-utils/issues/297#issuecomment-1262918833 | https://api.github.com/repos/simonw/sqlite-utils/issues/297 | IC_kwDOCGYnMM5LRpyx | simonw 9599 | 2022-09-29T23:02:52Z | 2022-09-29T23:02:52Z | OWNER | The other nice thing about having this as a separate command is that I can implement a tiny subset of the overall |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Option for importing CSV data using the SQLite .import mechanism 944846776 | |
1262917059 | https://github.com/simonw/sqlite-utils/issues/297#issuecomment-1262917059 | https://api.github.com/repos/simonw/sqlite-utils/issues/297 | IC_kwDOCGYnMM5LRpXD | simonw 9599 | 2022-09-29T22:59:28Z | 2022-09-29T22:59:28Z | OWNER | I quite like |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Option for importing CSV data using the SQLite .import mechanism 944846776 | |
1262915322 | https://github.com/simonw/sqlite-utils/issues/297#issuecomment-1262915322 | https://api.github.com/repos/simonw/sqlite-utils/issues/297 | IC_kwDOCGYnMM5LRo76 | simonw 9599 | 2022-09-29T22:57:31Z | 2022-09-29T22:57:42Z | OWNER | Maybe |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Option for importing CSV data using the SQLite .import mechanism 944846776 | |
1262914416 | https://github.com/simonw/sqlite-utils/issues/297#issuecomment-1262914416 | https://api.github.com/repos/simonw/sqlite-utils/issues/297 | IC_kwDOCGYnMM5LRotw | simonw 9599 | 2022-09-29T22:56:53Z | 2022-09-29T22:56:53Z | OWNER | Potential names/designs:
Or more interestingly... what if it could accept multiple CSV files to create multiple tables?
Would still need to support creating tables with different names though. Maybe like this:
I seem to be leaning towards |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Option for importing CSV data using the SQLite .import mechanism 944846776 | |
1262913145 | https://github.com/simonw/sqlite-utils/issues/297#issuecomment-1262913145 | https://api.github.com/repos/simonw/sqlite-utils/issues/297 | IC_kwDOCGYnMM5LRoZ5 | simonw 9599 | 2022-09-29T22:54:13Z | 2022-09-29T22:54:13Z | OWNER | After reviewing I'm going to implement a separate command instead. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Option for importing CSV data using the SQLite .import mechanism 944846776 | |
1247161510 | https://github.com/simonw/sqlite-utils/issues/297#issuecomment-1247161510 | https://api.github.com/repos/simonw/sqlite-utils/issues/297 | IC_kwDOCGYnMM5KViym | simonw 9599 | 2022-09-14T18:39:50Z | 2022-09-14T18:39:50Z | OWNER | Wrote that up as a TIL: https://til.simonwillison.net/python/pypy-macos |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Option for importing CSV data using the SQLite .import mechanism 944846776 | |
1247149969 | https://github.com/simonw/sqlite-utils/issues/297#issuecomment-1247149969 | https://api.github.com/repos/simonw/sqlite-utils/issues/297 | IC_kwDOCGYnMM5KVf-R | simonw 9599 | 2022-09-14T18:28:53Z | 2022-09-14T18:29:34Z | OWNER | As an aside, https://avi.im/blag/2021/fast-sqlite-inserts/ inspired my to try pypy since that article claimed to get a 2.5x speedup using pypy compared to regular Python for a CSV import script. Setup:
Then:
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Option for importing CSV data using the SQLite .import mechanism 944846776 | |
1246978641 | https://github.com/simonw/sqlite-utils/issues/297#issuecomment-1246978641 | https://api.github.com/repos/simonw/sqlite-utils/issues/297 | IC_kwDOCGYnMM5KU2JR | simonw 9599 | 2022-09-14T15:57:41Z | 2022-09-14T15:57:41Z | OWNER | One solution suggested on Discord:
Converting empty to NULL for columns which are either FLOAT or INTEGERtime sqlite3 products.db ".schema all" | sed -nr 's/.[(.)] (INTEGER|FLOAT).*/\1/gp' | xargs -I % sqlite3 products.db -cmd "PRAGMA journal_mode=OFF;" "UPDATE [all] SET [%] = NULL WHERE [%] = '';" ``` |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Option for importing CSV data using the SQLite .import mechanism 944846776 | |
1246977989 | https://github.com/simonw/sqlite-utils/issues/297#issuecomment-1246977989 | https://api.github.com/repos/simonw/sqlite-utils/issues/297 | IC_kwDOCGYnMM5KU1_F | simonw 9599 | 2022-09-14T15:57:09Z | 2022-09-14T15:57:09Z | OWNER | Should consider how this could best handle creating columns that are integer and float as opposed to just text. https://discord.com/channels/823971286308356157/823971286941302908/1019630014544748584 is a relevant discussion on Discord. Even if you create the schema in advance with the correct column types, this import mechanism can put empty strings in blank float/integer columns when ideally you would want to have nulls. Related feature idea for Not sure how best to handle this for |
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Option for importing CSV data using the SQLite .import mechanism 944846776 | |
1240882245 | https://github.com/simonw/sqlite-utils/issues/297#issuecomment-1240882245 | https://api.github.com/repos/simonw/sqlite-utils/issues/297 | IC_kwDOCGYnMM5J9lxF | simonw 9599 | 2022-09-08T15:33:11Z | 2022-09-08T15:33:11Z | OWNER | The more I think about this the more I like it - particularly for I used a variant of this trick with parquet files here: https://simonwillison.net/2022/Sep/5/laion-aesthetics-weeknotes/ |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Option for importing CSV data using the SQLite .import mechanism 944846776 | |
1162231111 | https://github.com/simonw/sqlite-utils/issues/297#issuecomment-1162231111 | https://api.github.com/repos/simonw/sqlite-utils/issues/297 | IC_kwDOCGYnMM5FRj1H | simonw 9599 | 2022-06-21T19:25:44Z | 2022-06-21T19:25:44Z | OWNER | Pushed that prototype to a branch. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Option for importing CSV data using the SQLite .import mechanism 944846776 | |
1162223668 | https://github.com/simonw/sqlite-utils/issues/297#issuecomment-1162223668 | https://api.github.com/repos/simonw/sqlite-utils/issues/297 | IC_kwDOCGYnMM5FRiA0 | simonw 9599 | 2022-06-21T19:19:22Z | 2022-06-21T19:22:15Z | OWNER | Built a prototype of ``` % time sqlite-utils memory taxi.csv 'SELECT passenger_count, COUNT(), AVG(total_amount) FROM taxi GROUP BY passenger_count' --fast passenger_count COUNT() AVG(total_amount)
0 42228 17.0214016766151
1 1533197 17.6418833067999
2 286461 18.0975870711456
3 72852 17.9153958710923
4 25510 18.452774990196 Without the Here's the prototype so far: ```diff diff --git a/sqlite_utils/cli.py b/sqlite_utils/cli.py index 86eddfb..1c83ef6 100644 --- a/sqlite_utils/cli.py +++ b/sqlite_utils/cli.py @@ -14,6 +14,8 @@ import io import itertools import json import os +import shutil +import subprocess import sys import csv as csv_std import tabulate @@ -1669,6 +1671,7 @@ def query( is_flag=True, help="Analyze resulting tables and output results", ) +@click.option("--fast", is_flag=True, help="Fast mode, only works with CSV and TSV") @load_extension_option def memory( paths, @@ -1692,6 +1695,7 @@ def memory( save, analyze, load_extension, + fast, ): """Execute SQL query against an in-memory database, optionally populated by imported data @@ -1719,6 +1723,22 @@ def memory( \b sqlite-utils memory animals.csv --schema """ + if fast: + if ( + attach + or flatten + or param + or encoding + or no_detect_types + or analyze + or load_extension + ): + raise click.ClickException( + "--fast mode does not support any of the following options: --attach, --flatten, --param, --encoding, --no-detect-types, --analyze, --load-extension" + ) + # TODO: Figure out and pass other supported options + memory_fast(paths, sql) + return db = sqlite_utils.Database(memory=True) # If --dump or --save or --analyze used but no paths detected, assume SQL query is a path: if (dump or save or schema or analyze) and not paths: @@ -1791,6 +1811,33 @@ def memory( ) +def memory_fast(paths, sql): + if not shutil.which("sqlite3"): + raise click.ClickException("sqlite3 not found in PATH") + args = ["sqlite3", ":memory:", "-cmd", ".mode csv"] + table_names = [] + + def name(path): + base_name = pathlib.Path(path).stem or "t" + table_name = base_name + prefix = 1 + while table_name in table_names: + prefix += 1 + table_name = "{}_{}".format(base_name, prefix) + return table_name + + for path in paths: + table_name = name(path) + table_names.append(table_name) + args.extend( + ["-cmd", ".import {} {}".format(pathlib.Path(path).resolve(), table_name)] + ) + + args.extend(["-cmd", ".mode column"]) + args.append(sql) + subprocess.run(args) + + def _execute_query( db, sql, param, raw, table, csv, tsv, no_headers, fmt, nl, arrays, json_cols ): ``` |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Option for importing CSV data using the SQLite .import mechanism 944846776 | |
1162179354 | https://github.com/simonw/sqlite-utils/issues/297#issuecomment-1162179354 | https://api.github.com/repos/simonw/sqlite-utils/issues/297 | IC_kwDOCGYnMM5FRXMa | simonw 9599 | 2022-06-21T18:44:03Z | 2022-06-21T18:44:03Z | OWNER | The thing I like about that |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Option for importing CSV data using the SQLite .import mechanism 944846776 | |
1161849874 | https://github.com/simonw/sqlite-utils/issues/297#issuecomment-1161849874 | https://api.github.com/repos/simonw/sqlite-utils/issues/297 | IC_kwDOCGYnMM5FQGwS | simonw 9599 | 2022-06-21T14:49:12Z | 2022-06-21T14:49:12Z | OWNER | Since there are all sorts of existing options for
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Option for importing CSV data using the SQLite .import mechanism 944846776 | |
882052693 | https://github.com/simonw/sqlite-utils/issues/297#issuecomment-882052693 | https://api.github.com/repos/simonw/sqlite-utils/issues/297 | IC_kwDOCGYnMM40kw5V | simonw 9599 | 2021-07-18T12:57:54Z | 2022-06-21T13:17:15Z | OWNER | Another implementation option would be to use the CSV virtual table mechanism. This could avoid shelling out to the (Would be neat to produce a Python wheel of this, see https://simonwillison.net/2022/May/23/bundling-binary-tools-in-python-wheels/) This would also help solve the challenge of making this optimization available to the |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Option for importing CSV data using the SQLite .import mechanism 944846776 | |
1160991031 | https://github.com/simonw/sqlite-utils/issues/297#issuecomment-1160991031 | https://api.github.com/repos/simonw/sqlite-utils/issues/297 | IC_kwDOCGYnMM5FM1E3 | simonw 9599 | 2022-06-21T00:35:20Z | 2022-06-21T00:35:20Z | OWNER | { "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Option for importing CSV data using the SQLite .import mechanism 944846776 | ||
882052852 | https://github.com/simonw/sqlite-utils/issues/297#issuecomment-882052852 | https://api.github.com/repos/simonw/sqlite-utils/issues/297 | IC_kwDOCGYnMM40kw70 | simonw 9599 | 2021-07-18T12:59:20Z | 2021-07-18T12:59:20Z | OWNER | I'm not too worried about |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Option for importing CSV data using the SQLite .import mechanism 944846776 | |
880259255 | https://github.com/simonw/sqlite-utils/issues/297#issuecomment-880259255 | https://api.github.com/repos/simonw/sqlite-utils/issues/297 | MDEyOklzc3VlQ29tbWVudDg4MDI1OTI1NQ== | simonw 9599 | 2021-07-14T22:48:41Z | 2021-07-14T22:48:41Z | OWNER | Should also take advantage of |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Option for importing CSV data using the SQLite .import mechanism 944846776 | |
880257587 | https://github.com/simonw/sqlite-utils/issues/297#issuecomment-880257587 | https://api.github.com/repos/simonw/sqlite-utils/issues/297 | MDEyOklzc3VlQ29tbWVudDg4MDI1NzU4Nw== | simonw 9599 | 2021-07-14T22:44:05Z | 2021-07-14T22:44:05Z | OWNER | https://unix.stackexchange.com/a/642364 suggests you can also use this to import from stdin, like so:
Here the |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Option for importing CSV data using the SQLite .import mechanism 944846776 | |
880256865 | https://github.com/simonw/sqlite-utils/issues/297#issuecomment-880256865 | https://api.github.com/repos/simonw/sqlite-utils/issues/297 | MDEyOklzc3VlQ29tbWVudDg4MDI1Njg2NQ== | simonw 9599 | 2021-07-14T22:42:11Z | 2021-07-14T22:42:11Z | OWNER | Potential workaround for missing
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Option for importing CSV data using the SQLite .import mechanism 944846776 | |
880256058 | https://github.com/simonw/sqlite-utils/issues/297#issuecomment-880256058 | https://api.github.com/repos/simonw/sqlite-utils/issues/297 | MDEyOklzc3VlQ29tbWVudDg4MDI1NjA1OA== | simonw 9599 | 2021-07-14T22:40:01Z | 2021-07-14T22:40:47Z | OWNER | Full docs here: https://www.sqlite.org/draft/cli.html#csv One catch: how this works has changed in recent SQLite versions: https://www.sqlite.org/changes.html
The "skip" feature is particularly important to understand. https://www.sqlite.org/draft/cli.html#csv says:
But the |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Option for importing CSV data using the SQLite .import mechanism 944846776 |
Advanced export
JSON shape: default, array, newline-delimited, object
CREATE TABLE [issue_comments] ( [html_url] TEXT, [issue_url] TEXT, [id] INTEGER PRIMARY KEY, [node_id] TEXT, [user] INTEGER REFERENCES [users]([id]), [created_at] TEXT, [updated_at] TEXT, [author_association] TEXT, [body] TEXT, [reactions] TEXT, [issue] INTEGER REFERENCES [issues]([id]) , [performed_via_github_app] TEXT); CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]); CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);
user 1