
issue_comments


7 rows where issue = 688670158 and user = 9599 sorted by updated_at descending

id html_url issue_url node_id user created_at updated_at author_association body reactions issue performed_via_github_app
779416619 https://github.com/simonw/sqlite-utils/issues/147#issuecomment-779416619 https://api.github.com/repos/simonw/sqlite-utils/issues/147 MDEyOklzc3VlQ29tbWVudDc3OTQxNjYxOQ== simonw 9599 2021-02-15T19:40:57Z 2021-02-15T21:27:55Z OWNER

Tried this experiment (not a proper binary search, it only searches downwards):

```python
import sqlite3

db = sqlite3.connect(":memory:")

def tryit(n):
    sql = "select 1 where 1 in ({})".format(", ".join("?" for i in range(n)))
    db.execute(sql, [0 for i in range(n)])

def find_limit(min=0, max=5_000_000):
    value = max
    while True:
        print('Trying', value)
        try:
            tryit(value)
            return value
        except:
            value = value // 2
```

Running `find_limit()` with those default parameters takes about 1.47s on my laptop:

```
In [9]: %timeit find_limit()
Trying 5000000
Trying 2500000
...
1.47 s ± 28 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
```

Interestingly the value it suggested was 156250 - suggesting that the macOS `sqlite3` binary with a 500,000 limit isn't the same as whatever my Python is using here.
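A true bisection converges on the exact limit instead of stopping at the nearest max/2ᵏ. A minimal sketch of that variant (the `find_limit_bisect` name and the `sqlite3.OperationalError` catch are illustrative assumptions, not code from this thread):

```python
import sqlite3

db = sqlite3.connect(":memory:")

def tryit(n):
    sql = "select 1 where 1 in ({})".format(", ".join("?" for i in range(n)))
    db.execute(sql, [0 for i in range(n)])

def find_limit_bisect(lo=0, hi=5_000_000):
    # Narrow [lo, hi] to the largest n for which tryit(n) succeeds.
    while lo < hi:
        mid = (lo + hi + 1) // 2  # round up so the loop always makes progress
        try:
            tryit(mid)
            lo = mid          # mid variables worked: the limit is >= mid
        except sqlite3.OperationalError:
            hi = mid - 1      # "too many SQL variables": the limit is < mid
    return lo
```

This needs at most ~23 probes against a 5,000,000 upper bound, but returns the true maximum rather than a power-of-two fraction of it.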

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
SQLITE_MAX_VARS maybe hard-coded too low 688670158  
779448912 https://github.com/simonw/sqlite-utils/issues/147#issuecomment-779448912 https://api.github.com/repos/simonw/sqlite-utils/issues/147 MDEyOklzc3VlQ29tbWVudDc3OTQ0ODkxMg== simonw 9599 2021-02-15T21:09:50Z 2021-02-15T21:09:50Z OWNER

I fiddled around and replaced that line with `batch_size = SQLITE_MAX_VARS // num_columns` - which evaluated to 10416 for this particular file. That got me this:

   40.71s user 1.81s system 98% cpu 43.081 total

43s is definitely better than 56s, but it's still not as big as the ~26.5s to ~3.5s improvement described by @simonwiles at the top of this issue. I wonder what I'm missing here.
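For concreteness, here is the arithmetic behind that 10416 (the 156_250 limit comes from the timing experiment below in this thread; the 15-column count is an inference, since 156_250 // 15 == 10_416):

```python
SQLITE_MAX_VARS = 156_250   # the raised limit being tested in this thread
num_columns = 15            # inferred column count for this CSV
batch_size = SQLITE_MAX_VARS // num_columns
print(batch_size)           # -> 10416 rows per INSERT batch
```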

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
SQLITE_MAX_VARS maybe hard-coded too low 688670158  
779446652 https://github.com/simonw/sqlite-utils/issues/147#issuecomment-779446652 https://api.github.com/repos/simonw/sqlite-utils/issues/147 MDEyOklzc3VlQ29tbWVudDc3OTQ0NjY1Mg== simonw 9599 2021-02-15T21:04:19Z 2021-02-15T21:04:19Z OWNER

... but it looks like `batch_size` is hard-coded to 100, rather than `None` - which means it's not being calculated using that value:

https://github.com/simonw/sqlite-utils/blob/1f49f32814a942fa076cfe5f504d1621188097ed/sqlite_utils/db.py#L704

And

https://github.com/simonw/sqlite-utils/blob/1f49f32814a942fa076cfe5f504d1621188097ed/sqlite_utils/db.py#L1877
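The implied fix is to default `batch_size` to `None` and compute it per table. A hypothetical sketch of that logic (the function shape is illustrative, not the actual sqlite-utils signature; 999 is SQLite's historical compile-time default for SQLITE_MAX_VARIABLE_NUMBER):

```python
SQLITE_MAX_VARS = 999  # SQLite's historical compile-time default

def effective_batch_size(num_columns, batch_size=None):
    # Only compute from SQLITE_MAX_VARS when the caller didn't pass an
    # explicit value - a hard-coded batch_size=100 short-circuits this.
    if batch_size is None:
        batch_size = SQLITE_MAX_VARS // num_columns
    return batch_size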

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
SQLITE_MAX_VARS maybe hard-coded too low 688670158  
779445423 https://github.com/simonw/sqlite-utils/issues/147#issuecomment-779445423 https://api.github.com/repos/simonw/sqlite-utils/issues/147 MDEyOklzc3VlQ29tbWVudDc3OTQ0NTQyMw== simonw 9599 2021-02-15T21:00:44Z 2021-02-15T21:01:09Z OWNER

I tried changing the hard-coded value from 999 to 156_250 and running `sqlite-utils insert` against a 500MB CSV file, with these results:

```
(sqlite-utils) sqlite-utils % time sqlite-utils insert slow-ethos.db ethos ../ethos-datasette/ethos.csv --no-headers
 [###################################-] 99% 00:00:00
sqlite-utils insert slow-ethos.db ethos ../ethos-datasette/ethos.csv  44.74s user 7.61s system 92% cpu 56.601 total

# Increased the setting here

(sqlite-utils) sqlite-utils % time sqlite-utils insert fast-ethos.db ethos ../ethos-datasette/ethos.csv --no-headers
 [###################################-] 99% 00:00:00
sqlite-utils insert fast-ethos.db ethos ../ethos-datasette/ethos.csv  39.40s user 5.15s system 96% cpu 46.320 total
```

Not as big a difference as I was expecting.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
SQLITE_MAX_VARS maybe hard-coded too low 688670158  
779417723 https://github.com/simonw/sqlite-utils/issues/147#issuecomment-779417723 https://api.github.com/repos/simonw/sqlite-utils/issues/147 MDEyOklzc3VlQ29tbWVudDc3OTQxNzcyMw== simonw 9599 2021-02-15T19:44:02Z 2021-02-15T19:47:00Z OWNER

`%timeit find_limit(max=1_000_000)` took 378ms on my laptop

`%timeit find_limit(max=500_000)` took 197ms

`%timeit find_limit(max=200_000)` reported 53ms per loop

`%timeit find_limit(max=100_000)` reported 26.8ms per loop.

All of these are still slow enough that I'm not comfortable running this search every time the library is imported. Allowing users to opt in to this as a performance enhancement might be better; a hypothetical API sketch follows.
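An opt-in API along those lines might look like this (both keyword arguments here are hypothetical illustrations, not the shipped sqlite-utils interface):

```python
from sqlite_utils import Database

# Hypothetical opt-in: pay the probe cost only when asked to,
# or skip probing entirely by supplying the limit up front.
db = Database("data.db", detect_max_vars=True)   # runs the probe once
db = Database("data.db", max_vars=156_250)       # explicit, no probing
```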

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
SQLITE_MAX_VARS maybe hard-coded too low 688670158  
779409770 https://github.com/simonw/sqlite-utils/issues/147#issuecomment-779409770 https://api.github.com/repos/simonw/sqlite-utils/issues/147 MDEyOklzc3VlQ29tbWVudDc3OTQwOTc3MA== simonw 9599 2021-02-15T19:23:11Z 2021-02-15T19:23:11Z OWNER

On my Mac right now I'm seeing a limit of 500,000:

```
% sqlite3 -cmd ".limits variable_number"
      variable_number 500000
```
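On Python 3.11+ the same number can be read directly from the `sqlite3` module, which is a quick way to check whether the interpreter's bundled SQLite matches the CLI binary (a sketch; `Connection.getlimit()` only exists from 3.11 onward):

```python
import sqlite3

db = sqlite3.connect(":memory:")
# Ask SQLite for the per-connection limit on bound variables
# (getlimit() and the SQLITE_LIMIT_* constants were added in Python 3.11).
print(db.getlimit(sqlite3.SQLITE_LIMIT_VARIABLE_NUMBER))
```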

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
SQLITE_MAX_VARS maybe hard-coded too low 688670158  
683528149 https://github.com/simonw/sqlite-utils/issues/147#issuecomment-683528149 https://api.github.com/repos/simonw/sqlite-utils/issues/147 MDEyOklzc3VlQ29tbWVudDY4MzUyODE0OQ== simonw 9599 2020-08-31T03:17:26Z 2020-08-31T03:17:26Z OWNER

+1 to making this something that users can customize. An optional argument to the Database constructor would be a neat way to do this.

I think there's a terrifying way that we could find this value... we could perform a binary search for it! Open up a memory connection and try running different bulk inserts against it and catch the exceptions - then adjust and try again.

My hunch is that we could perform just 2 or 3 probes (maybe against carefully selected values) to find the highest value that works. If this process took less than a few ms to run I'd be happy to do it automatically when the class is instantiated (and let users disable that automatic probing by setting a value using the constructor argument).
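A "2 or 3 carefully selected probes" version could try the limits that real SQLite builds actually ship with, highest first. A sketch (the candidate list and the `detect_max_vars` helper are illustrative assumptions; 32,766 is SQLite's default since 3.32.0 and 999 the older default, while 500,000 is the macOS CLI value seen later in this thread):

```python
import sqlite3

# Limits seen in the wild: macOS's bundled CLI (500,000), the
# SQLite 3.32.0+ default (32,766), and the pre-3.32 default (999).
CANDIDATES = [500_000, 32_766, 999]

def detect_max_vars(db):
    for n in CANDIDATES:
        try:
            db.execute(
                "select 1 where 1 in ({})".format(", ".join("?" * n)),
                [0] * n,
            )
            return n  # first candidate that binds successfully
        except sqlite3.OperationalError:
            continue
    return 999  # conservative floor if every probe failed

print(detect_max_vars(sqlite3.connect(":memory:")))
```

Underestimating is safe here: if the real limit sits between two candidates, the probe settles on the lower one and inserts simply use smaller batches.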

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
SQLITE_MAX_VARS maybe hard-coded too low 688670158  

Table schema:

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
, [performed_via_github_app] TEXT);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);
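The page's filter (issue = 688670158, user = 9599, newest first) maps directly onto this schema. A sketch of the equivalent query (assuming the database file is the github.db that github-to-sqlite produces):

```python
import sqlite3

db = sqlite3.connect("github.db")  # assumed filename
rows = db.execute(
    """
    select id, created_at, updated_at, author_association, body
    from issue_comments
    where issue = 688670158 and [user] = 9599
    order by updated_at desc
    """
).fetchall()
print(len(rows))  # 7, matching the row count at the top of this page
```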