github

This data as json, CSV

id	node_id	number	title	user	state	locked	assignee	milestone	comments	created_at	updated_at	closed_at	author_association	pull_request	body	repo	type	active_lock_reason	performed_via_github_app	reactions	draft	state_reason
688670158	MDU6SXNzdWU2ODg2NzAxNTg=	147	SQLITE_MAX_VARS maybe hard-coded too low	96218	open	0			7	2020-08-30T07:26:45Z	2021-02-15T21:27:55Z		CONTRIBUTOR		I came across this while about to open an issue and PR against the documentation for `batch_size`, which is a bit incomplete. As mentioned in #145, while: > [`SQLITE_MAX_VARIABLE_NUMBER`](https://www.sqlite.org/limits.html#max_variable_number) ... defaults to 999 for SQLite versions prior to 3.32.0 (2020-05-22) or 32766 for SQLite versions after 3.32.0. it is common that it is increased at compile time. Debian-based systems, for example, seem to ship with a version of sqlite compiled with SQLITE_MAX_VARIABLE_NUMBER set to 250,000, and I believe this is the case for homebrew installations too. In working to understand what `batch_size` was actually doing and why, I realized that by setting `SQLITE_MAX_VARS` in `db.py` to match the value my sqlite was compiled with (I'm on Debian), I was able to decrease the time to `insert_all()` my test data set (~128k records across 7 tables) from ~26.5s to ~3.5s. Given that this about .05% of my total dataset, this is time I am keen to save... Unfortunately, it seems that `sqlite3` in the python standard library doesn't expose the `get_limit()` C API (even though `pysqlite` used to), so it's hard to know what value sqlite has been compiled with (note that this could mean, I suppose, that it's less than 999, and even hardcoding `SQLITE_MAX_VARS` to the conservative default might not be adequate. It can also be lowered -- but not raised -- at runtime). The best I could come up with is `echo "" \| sqlite3 -cmd ".limits variable_number"` (only available in `sqlite >= 2015-05-07 (3.8.10)`). Obviously this couldn't be relied upon in `sqlite_utils`, but I wonder what your opinion would be about exposing `SQLITE_MAX_VARS` as a user-configurable parameter (with suitable "here be dragons" warnings)? I'm going to go ahead and monkey-patch it for my purposes in any event, but it seems like it might be worth considering.	140912432	issue			{ "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/147/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
686978131	MDU6SXNzdWU2ODY5NzgxMzE=	139	insert_all(..., alter=True) should work for new columns introduced after the first 100 records	96218	closed	0			7	2020-08-27T06:25:25Z	2020-08-28T22:48:51Z	2020-08-28T22:30:14Z	CONTRIBUTOR		Is there a way to make `.insert_all()` work properly when new columns are introduced outside the first 100 records (with or without the `alter=True` argument)? I'm using `.insert_all()` to bulk insert ~3-4k records at a time and it is common for records to need to introduce new columns. However, if new columns are introduced after the first 100 records, `sqlite_utils` doesn't even raise the `OperationalError: table ... has no column named ...` exception; it just silently drops the extra data and moves on. It took me a while to find this little snippet in the [documentation for `.insert_all()`](https://sqlite-utils.readthedocs.io/en/stable/python-api.html#bulk-inserts) (it's not mentioned under [Adding columns automatically on insert/update](https://sqlite-utils.readthedocs.io/en/stable/python-api.html#bulk-inserts)): > The column types used in the CREATE TABLE statement are automatically derived from the types of data in that first batch of rows. _Any additional or missing columns in subsequent batches will be ignored._ I tried changing the `batch_size` argument to the total number of records, but it seems only to effect the number of rows that are committed at a time, and has no influence on this problem. Is there a way around this that you would suggest? It seems like it should raise an exception at least.	140912432	issue			{ "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/139/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		completed