html_url,issue_url,id,node_id,user,user_label,created_at,updated_at,author_association,body,reactions,issue,issue_label,performed_via_github_app
https://github.com/simonw/sqlite-utils/pull/146#issuecomment-688481317,https://api.github.com/repos/simonw/sqlite-utils/issues/146,688481317,MDEyOklzc3VlQ29tbWVudDY4ODQ4MTMxNw==,96218,simonwiles,2020-09-07T19:18:55Z,2020-09-07T19:18:55Z,CONTRIBUTOR,"Just force-pushed to update d042f9c with more formatting changes to satisfy `black==20.8b1` and pass the GitHub Actions ""Test"" workflow.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",688668680,Handle case where subsequent records (after first batch) include extra columns,
https://github.com/simonw/sqlite-utils/pull/146#issuecomment-688479163,https://api.github.com/repos/simonw/sqlite-utils/issues/146,688479163,MDEyOklzc3VlQ29tbWVudDY4ODQ3OTE2Mw==,96218,simonwiles,2020-09-07T19:10:33Z,2020-09-07T19:11:57Z,CONTRIBUTOR,"@simonw -- I've gone ahead updated the documentation to reflect the changes introduced in this PR.  IMO it's ready to merge now.

In writing the documentation changes, I begin to wonder about the value and role of `batch_size` at all, tbh.  May I assume it was originally intended to prevent using the entire row set to determine columns and column types, and that this was a performance consideration?  If so, this PR entirely undermines its purpose.  I've been passing in excess of 500,000 rows at a time to `insert_all()` with these changes and although I'm sure the performance difference is measurable it's not really noticeable; given #145, I don't know that any performance advantages outweigh the problems doing it this way removes.  What do you think about just dropping the argument and defaulting to the maximum `batch_size` permissible given `SQLITE_MAX_VARS`?  Are there other reasons one might want to restrict `batch_size` that I've overlooked?  I could open a new issue to discuss/implement this.

Of course the documentation will need to change again too if/when something is done about #147.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",688668680,Handle case where subsequent records (after first batch) include extra columns,
https://github.com/simonw/sqlite-utils/issues/145#issuecomment-683382252,https://api.github.com/repos/simonw/sqlite-utils/issues/145,683382252,MDEyOklzc3VlQ29tbWVudDY4MzM4MjI1Mg==,96218,simonwiles,2020-08-30T06:27:25Z,2020-08-30T06:27:52Z,CONTRIBUTOR,"Note: had to adjust the test above because trying to exhaust a `SQLITE_MAX_VARIABLE_NUMBER` of 250000 in 99 records requires 2526 columns, and trips the ` ""Rows can have a maximum of {} columns"".format(SQLITE_MAX_VARS)` check even before it trips the default `SQLITE_MAX_COLUMN` value (2000).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",688659182,Bug when first record contains fewer columns than subsequent records,
https://github.com/simonw/sqlite-utils/issues/139#issuecomment-682815377,https://api.github.com/repos/simonw/sqlite-utils/issues/139,682815377,MDEyOklzc3VlQ29tbWVudDY4MjgxNTM3Nw==,96218,simonwiles,2020-08-28T16:14:58Z,2020-08-28T16:14:58Z,CONTRIBUTOR,"Thanks!  And yeah, I had updating the docs on my list too :)  Will try to get to it this afternoon (budgeting time is fraught with uncertainty at the moment!).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",686978131,"insert_all(..., alter=True) should work for new columns introduced after the first 100 records",
https://github.com/simonw/sqlite-utils/issues/139#issuecomment-682182178,https://api.github.com/repos/simonw/sqlite-utils/issues/139,682182178,MDEyOklzc3VlQ29tbWVudDY4MjE4MjE3OA==,96218,simonwiles,2020-08-27T20:46:18Z,2020-08-27T20:46:18Z,CONTRIBUTOR,"> I tried changing the batch_size argument to the total number of records, but it seems only to effect the number of rows that are committed at a time, and has no influence on this problem.

So the reason for this is that the `batch_size` for import is limited (of necessity) here: https://github.com/simonw/sqlite-utils/blob/main/sqlite_utils/db.py#L1048

With regard to the issue of ignoring columns, however, I made a fork and hacked a temporary fix that looks like this:
https://github.com/simonwiles/sqlite-utils/commit/3901f43c6a712a1a3efc340b5b8d8fd0cbe8ee63

It doesn't seem to affect performance enormously (but I've not tested it thoroughly), and it now does what I need (and would expect, tbh), but it now fails the test here:
https://github.com/simonw/sqlite-utils/blob/main/tests/test_create.py#L710-L716

The existence of this test suggests that `insert_all()` is behaving as intended, of course.  It seems odd to me that this would be a desirable default behaviour (let alone the only behaviour), and its not very prominently flagged-up, either.

@simonw is this something you'd be willing to look at a PR for?  I assume you wouldn't want to change the default behaviour at this point, but perhaps an option could be provided, or at least a bit more of a warning in the docs.  Are there oversights in the implementation that I've made?

Would be grateful for your thoughts!  Thanks!
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",686978131,"insert_all(..., alter=True) should work for new columns introduced after the first 100 records",