
issue_comments


3 rows where created_at is on date 2020-09-07 and issue = 688668680 (Handle case where subsequent records (after first batch) include extra columns), sorted by updated_at descending

Comment 688508510 · simonw (OWNER) · created 2020-09-07T20:56:03Z · updated 2020-09-07T20:56:24Z
https://github.com/simonw/sqlite-utils/pull/146#issuecomment-688508510

The problem with this approach is that it requires us to consume the entire iterator before we can start inserting rows into the table - here on line 1052:

https://github.com/simonw/sqlite-utils/blob/bb131793feac16bc7181ab997568f941b0220ef2/sqlite_utils/db.py#L1047-L1054

I designed .insert_all() to avoid doing this, because I want to be able to pass it an iterator (or more likely a generator) that could produce potentially millions of records. Doing things one batch of 100 records at a time means that the Python process doesn't need to pull millions of records into memory at once.
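
(For illustration: a minimal sketch of that streaming pattern. The chunks() helper below is hypothetical, not sqlite-utils' actual implementation.)

import itertools

def chunks(iterable, size):
    # Yield successive lists of up to `size` items without ever
    # materializing the whole iterable in memory.
    iterator = iter(iterable)
    while True:
        batch = list(itertools.islice(iterator, size))
        if not batch:
            return
        yield batch

# A generator of millions of records stays lazy; only one batch of
# 100 records is held in memory at a time.
records = ({"id": i, "value": i * 2} for i in range(10_000_000))
for batch in chunks(records, 100):
    pass  # insert this batch into the table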

db-to-sqlite is one example of a tool that relies on that characteristic: https://github.com/simonw/db-to-sqlite/blob/63e4ee972f292de13bb11767c0fb64b35339d954/db_to_sqlite/cli.py#L94-L106

So we need to solve this issue without consuming the entire iterator with a records = list(records) call.

I think one way to do this is to execute the chunks one at a time and watch out for an exception that indicates we sent too many parameters - then adjust the chunk size down and try again.
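
(A rough sketch of that fallback, assuming each row is bound with one ? placeholder per column; exceeding SQLite's limit raises sqlite3.OperationalError with the message "too many SQL variables". The insert_chunked helper below is illustrative, not the actual sqlite-utils code.)

import sqlite3

def insert_chunked(conn, table, columns, rows, chunk_size):
    # Insert rows in chunks. If SQLite complains that the statement
    # binds too many parameters, halve the chunk size and retry the
    # same slice instead of giving up.
    i = 0
    while i < len(rows):
        chunk = rows[i : i + chunk_size]
        row_sql = "(" + ", ".join("?" * len(columns)) + ")"
        sql = "INSERT INTO [{}] ({}) VALUES {}".format(
            table,
            ", ".join("[{}]".format(c) for c in columns),
            ", ".join([row_sql] * len(chunk)),
        )
        params = [row[col] for row in chunk for col in columns]
        try:
            conn.execute(sql, params)
            i += len(chunk)
        except sqlite3.OperationalError as e:
            # SQLite's message here is "too many SQL variables"
            if "too many" not in str(e) or chunk_size == 1:
                raise
            chunk_size //= 2  # adjust the chunk size down and retry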

Comment 688481317 · simonwiles (CONTRIBUTOR) · created 2020-09-07T19:18:55Z
https://github.com/simonw/sqlite-utils/pull/146#issuecomment-688481317

Just force-pushed to update d042f9c with more formatting changes to satisfy black==20.8b1 and pass the GitHub Actions "Test" workflow.

Comment 688479163 · simonwiles (CONTRIBUTOR) · created 2020-09-07T19:10:33Z · updated 2020-09-07T19:11:57Z
https://github.com/simonw/sqlite-utils/pull/146#issuecomment-688479163

@simonw -- I've gone ahead and updated the documentation to reflect the changes introduced in this PR. IMO it's ready to merge now.

In writing the documentation changes, I began to wonder about the value and role of batch_size at all, tbh. May I assume it was originally intended to prevent using the entire row set to determine columns and column types, and that this was a performance consideration? If so, this PR entirely undermines its purpose. I've been passing in excess of 500,000 rows at a time to insert_all() with these changes, and although I'm sure the performance difference is measurable, it's not really noticeable; given #145, I don't know that any performance advantage outweighs the problems this approach removes.

What do you think about just dropping the argument and defaulting to the maximum batch_size permissible given SQLITE_MAX_VARS? Are there other reasons one might want to restrict batch_size that I've overlooked? I could open a new issue to discuss/implement this.
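
(For reference, a sketch of how such a default could be derived, assuming SQLITE_MAX_VARS names SQLite's per-statement limit on bound parameters: 999 by default in older SQLite builds, 32766 since SQLite 3.32.0.)

SQLITE_MAX_VARS = 999  # conservative default bound-parameter limit

def default_batch_size(num_columns, batch_size=None):
    # Each row consumes one bound parameter per column, so the
    # largest batch that fits in a single INSERT statement is the
    # floor division of the limit by the column count.
    limit = SQLITE_MAX_VARS // num_columns
    return limit if batch_size is None else min(batch_size, limit)

# e.g. the 12-column issue_comments table: 999 // 12 == 83 rows
print(default_batch_size(12))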

Of course the documentation will need to change again too if/when something is done about #147.



CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id]),
   [performed_via_github_app] TEXT
);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);