github

html_url	issue_url	id	node_id	user	created_at	updated_at	author_association	body	reactions	issue	performed_via_github_app
https://github.com/simonw/sqlite-utils/pull/146#issuecomment-688481317	https://api.github.com/repos/simonw/sqlite-utils/issues/146	688481317	MDEyOklzc3VlQ29tbWVudDY4ODQ4MTMxNw==	96218	2020-09-07T19:18:55Z	2020-09-07T19:18:55Z	CONTRIBUTOR	Just force-pushed to update d042f9c with more formatting changes to satisfy `black==20.8b1` and pass the GitHub Actions "Test" workflow.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	688668680
https://github.com/simonw/sqlite-utils/pull/146#issuecomment-688479163	https://api.github.com/repos/simonw/sqlite-utils/issues/146	688479163	MDEyOklzc3VlQ29tbWVudDY4ODQ3OTE2Mw==	96218	2020-09-07T19:10:33Z	2020-09-07T19:11:57Z	CONTRIBUTOR	@simonw -- I've gone ahead and updated the documentation to reflect the changes introduced in this PR. IMO it's ready to merge now. In writing the documentation changes, I began to wonder about the value and role of `batch_size` at all, tbh. May I assume it was originally intended to prevent using the entire row set to determine columns and column types, and that this was a performance consideration? If so, this PR entirely undermines its purpose. I've been passing in excess of 500,000 rows at a time to `insert_all()` with these changes and although I'm sure the performance difference is measurable it's not really noticeable; given #145, I don't know that any performance advantages outweigh the problems doing it this way removes. What do you think about just dropping the argument and defaulting to the maximum `batch_size` permissible given `SQLITE_MAX_VARS`? Are there other reasons one might want to restrict `batch_size` that I've overlooked? I could open a new issue to discuss/implement this. Of course the documentation will need to change again too if/when something is done about #147.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	688668680
https://github.com/simonw/sqlite-utils/issues/145#issuecomment-683382252	https://api.github.com/repos/simonw/sqlite-utils/issues/145	683382252	MDEyOklzc3VlQ29tbWVudDY4MzM4MjI1Mg==	96218	2020-08-30T06:27:25Z	2020-08-30T06:27:52Z	CONTRIBUTOR	Note: had to adjust the test above because trying to exhaust a `SQLITE_MAX_VARIABLE_NUMBER` of 250000 in 99 records requires 2526 columns, and trips the ` "Rows can have a maximum of {} columns".format(SQLITE_MAX_VARS)` check even before it trips the default `SQLITE_MAX_COLUMN` value (2000).	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	688659182
https://github.com/simonw/sqlite-utils/issues/139#issuecomment-682815377	https://api.github.com/repos/simonw/sqlite-utils/issues/139	682815377	MDEyOklzc3VlQ29tbWVudDY4MjgxNTM3Nw==	96218	2020-08-28T16:14:58Z	2020-08-28T16:14:58Z	CONTRIBUTOR	Thanks! And yeah, I had updating the docs on my list too :) Will try to get to it this afternoon (budgeting time is fraught with uncertainty at the moment!).	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	686978131
https://github.com/simonw/sqlite-utils/issues/139#issuecomment-682182178	https://api.github.com/repos/simonw/sqlite-utils/issues/139	682182178	MDEyOklzc3VlQ29tbWVudDY4MjE4MjE3OA==	96218	2020-08-27T20:46:18Z	2020-08-27T20:46:18Z	CONTRIBUTOR	> I tried changing the batch_size argument to the total number of records, but it seems only to affect the number of rows that are committed at a time, and has no influence on this problem. So the reason for this is that the `batch_size` for import is limited (of necessity) here: https://github.com/simonw/sqlite-utils/blob/main/sqlite_utils/db.py#L1048 With regard to the issue of ignoring columns, however, I made a fork and hacked a temporary fix that looks like this: https://github.com/simonwiles/sqlite-utils/commit/3901f43c6a712a1a3efc340b5b8d8fd0cbe8ee63 It doesn't seem to affect performance enormously (but I've not tested it thoroughly), and it now does what I need (and would expect, tbh), but it now fails the test here: https://github.com/simonw/sqlite-utils/blob/main/tests/test_create.py#L710-L716 The existence of this test suggests that `insert_all()` is behaving as intended, of course. It seems odd to me that this would be a desirable default behaviour (let alone the only behaviour), and it's not very prominently flagged up, either. @simonw is this something you'd be willing to look at a PR for? I assume you wouldn't want to change the default behaviour at this point, but perhaps an option could be provided, or at least a bit more of a warning in the docs. Are there oversights in the implementation that I've made? Would be grateful for your thoughts! Thanks!	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	686978131
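
The comments above turn on how `batch_size` interacts with the SQLite bound-variable limit, and on the arithmetic behind the "2526 columns" figure. The following is a minimal sketch of that reasoning, not the code in `sqlite_utils/db.py`. The helper names `max_batch_size` and `columns_needed_to_exceed` are hypothetical, and the default of 999 for `SQLITE_MAX_VARS` is an assumption that does not appear in the comments.

```python
# Assumption: 999 is used here as an illustrative default for the
# bound-variable limit; the comments do not state the library's value.
SQLITE_MAX_VARS = 999


def max_batch_size(num_columns, requested_batch_size, max_vars=SQLITE_MAX_VARS):
    """Hypothetical helper: cap the rows per INSERT so that
    rows * columns never exceeds the bound-variable limit."""
    if num_columns > max_vars:
        # Same message format as the check quoted in the comment on issue 145.
        raise ValueError("Rows can have a maximum of {} columns".format(max_vars))
    return max(1, min(requested_batch_size, max_vars // num_columns))


def columns_needed_to_exceed(max_vars, num_records):
    """Hypothetical helper: smallest column count at which num_records
    rows bind more than max_vars variables."""
    return max_vars // num_records + 1


# With a limit of 250000 and 99 records, 2525 columns bind 249975
# variables (still within the limit), so 2526 columns are needed.
print(columns_needed_to_exceed(250000, 99))

# A 10-column table with a requested batch of 100 rows is capped at
# 999 // 10 rows per statement.
print(max_batch_size(10, 100))
```

Under these assumptions, a caller who raises `batch_size` beyond the cap sees no change in how many rows go into each statement, which is consistent with the observation quoted in the comment on issue 139.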