html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,issue,performed_via_github_app https://github.com/simonw/sqlite-utils/issues/114#issuecomment-655786374,https://api.github.com/repos/simonw/sqlite-utils/issues/114,655786374,MDEyOklzc3VlQ29tbWVudDY1NTc4NjM3NA==,9599,2020-07-08T22:16:54Z,2020-07-08T22:16:54Z,OWNER,"According to https://www.sqlite.org/lang_altertable.html#making_other_kinds_of_table_schema_changes the hardest bits to consider are how to deal with existing foreign key relationships, triggers and views. I'm OK leaving views as an exercise for the caller - many of these transformations may not need any view changes at all. Foreign key relationships are important: it should handle these automatically as effectively as possible. Likewise trigger changes: need to think about what this means.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",621989740, https://github.com/simonw/sqlite-utils/issues/114#issuecomment-655785396,https://api.github.com/repos/simonw/sqlite-utils/issues/114,655785396,MDEyOklzc3VlQ29tbWVudDY1NTc4NTM5Ng==,9599,2020-07-08T22:14:10Z,2020-07-08T22:14:10Z,OWNER,"Work in progress: not quite right yet, I need smarter logic for how renamed columns are reflected in the generated `INSERT INTO ... SELECT ...` query: ```python def transform_table( self, columns=None, rename=None, change_type=None, pk=None, foreign_keys=None, column_order=None, not_null=None, defaults=None, hash_id=None, extracts=None, ): assert self.exists(), ""Cannot transform a table that doesn't exist yet"" columns = columns or self.columns_dict if rename is not None or change_type is not None: columns = {rename.get(key, key): change_type.get(key, value) for key, value in columns.items()} new_table_name = ""{}_new_{}"".format(self.name, os.urandom(6).hex()) previous_columns = set(self.columns_dict.keys()) with self.db.conn: columns = {name: value for (name, value) in columns.items()} new_table = self.db.create_table( new_table_name, columns, pk=pk, foreign_keys=foreign_keys, column_order=column_order, not_null=not_null, defaults=defaults, hash_id=hash_id, extracts=extracts, ) # Copy across data - but only for columns that exist in both new_columns = set(columns.keys()) columns_to_copy = new_columns.intersection(previous_columns) copy_sql = ""INSERT INTO [{new_table}] ({new_cols}) SELECT {old_cols} FROM [{old_table}]"".format( new_table=new_table_name, old_table=self.name, old_cols="", "".join(""[{}]"".format(col) for col in columns_to_copy), new_cols="", "".join(""[{}]"".format(rename.get(col, col)) for col in columns_to_copy), ) self.db.conn.execute(copy_sql) # Drop the old table self.db.conn.execute(""DROP TABLE [{}]"".format(self.name)) # Rename the new one self.db.conn.execute( ""ALTER TABLE [{}] RENAME TO [{}]"".format(new_table_name, self.name) ) return self ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",621989740, https://github.com/simonw/sqlite-utils/issues/114#issuecomment-655783875,https://api.github.com/repos/simonw/sqlite-utils/issues/114,655783875,MDEyOklzc3VlQ29tbWVudDY1NTc4Mzg3NQ==,9599,2020-07-08T22:09:51Z,2020-07-08T22:10:16Z,OWNER,I can have a convenient `change_type={...}` parameter for changing column types too.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",621989740, https://github.com/simonw/sqlite-utils/issues/114#issuecomment-655782477,https://api.github.com/repos/simonw/sqlite-utils/issues/114,655782477,MDEyOklzc3VlQ29tbWVudDY1NTc4MjQ3Nw==,9599,2020-07-08T22:06:23Z,2020-07-08T22:06:23Z,OWNER,"Thinking about the method signature: ```python def transform_table( self, columns, pk=None, foreign_keys=None, column_order=None, not_null=None, defaults=None, hash_id=None, extracts=None, ): ``` This requires the caller to provide the exact set of columns for the new table. It would be useful if this was optional - if you could omit the columns and have it automatically use the previous columns. This would let you change things like the primary key or the column order using the other arguments. Even better: allow column renaming using an optional `rename={...}` argument: ```python db[""dogs""].transform_table(rename={""name"": ""dog_name""}) ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",621989740, https://github.com/simonw/sqlite-utils/issues/114#issuecomment-655778058,https://api.github.com/repos/simonw/sqlite-utils/issues/114,655778058,MDEyOklzc3VlQ29tbWVudDY1NTc3ODA1OA==,9599,2020-07-08T21:54:30Z,2020-07-08T21:54:30Z,OWNER,"Don't forget this step: > If foreign key constraints are enabled, disable them using PRAGMA foreign_keys=OFF. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",621989740, https://github.com/simonw/sqlite-utils/issues/114#issuecomment-655677909,https://api.github.com/repos/simonw/sqlite-utils/issues/114,655677909,MDEyOklzc3VlQ29tbWVudDY1NTY3NzkwOQ==,9599,2020-07-08T18:16:39Z,2020-07-08T18:16:39Z,OWNER,"Since neither the term ""transform"" or ""migrate"" are used in the codebase at the moment, I think I'll go with `.transform_table()` - that leaves the term ""migrate"" available for any future database migrations system (similar to Django's).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",621989740, https://github.com/simonw/sqlite-utils/issues/114#issuecomment-655677396,https://api.github.com/repos/simonw/sqlite-utils/issues/114,655677396,MDEyOklzc3VlQ29tbWVudDY1NTY3NzM5Ng==,9599,2020-07-08T18:15:39Z,2020-07-08T18:15:39Z,OWNER,"Alternative possible names: - `.transform_table()` - `.migrate()` - `.transform()` I'm torn between `.migrate_table()` and `.transform_table()`.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",621989740, https://github.com/simonw/sqlite-utils/issues/114#issuecomment-655677099,https://api.github.com/repos/simonw/sqlite-utils/issues/114,655677099,MDEyOklzc3VlQ29tbWVudDY1NTY3NzA5OQ==,9599,2020-07-08T18:15:02Z,2020-07-08T18:15:02Z,OWNER,"I'm not so keen on that chained API - it's pretty complicated. Here's an idea for a much simpler interface. Essentially it lets you say ""take table X and migrate its contents to a new table with this structure - then atomically rename the tables to switch them"": ```python db[""mytable""].migrate_table({""id"": int, ""name"": str""}, pk=""id"") ``` The `migrate_table()` method would take the same exact signature as the `table.create()` method: https://github.com/simonw/sqlite-utils/blob/a236a6bc771a5a6a9d7e814f1986d461afc422d2/sqlite_utils/db.py#L615-L625","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",621989740, https://github.com/simonw/sqlite-utils/issues/119#issuecomment-655674910,https://api.github.com/repos/simonw/sqlite-utils/issues/119,655674910,MDEyOklzc3VlQ29tbWVudDY1NTY3NDkxMA==,9599,2020-07-08T18:10:18Z,2020-07-08T18:10:18Z,OWNER,"This will work similar to how `.add_foreign_keys()` works: turn on `writable_schema` and rewrite the `sql` for that table in the `sqlite_master` table. Here's that code today - it could be adapted to include removal of foreign keys that we no longer want: https://github.com/simonw/sqlite-utils/blob/a236a6bc771a5a6a9d7e814f1986d461afc422d2/sqlite_utils/db.py#L391-L401","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",652700770, https://github.com/simonw/sqlite-utils/pull/118#issuecomment-655653292,https://api.github.com/repos/simonw/sqlite-utils/issues/118,655653292,MDEyOklzc3VlQ29tbWVudDY1NTY1MzI5Mg==,9599,2020-07-08T17:26:02Z,2020-07-08T17:26:02Z,OWNER,"Awesome, thank you very much.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",651844316, https://github.com/simonw/sqlite-utils/pull/118#issuecomment-655643078,https://api.github.com/repos/simonw/sqlite-utils/issues/118,655643078,MDEyOklzc3VlQ29tbWVudDY1NTY0MzA3OA==,79913,2020-07-08T17:05:59Z,2020-07-08T17:05:59Z,CONTRIBUTOR,"> The only thing missing from this PR is updates to the documentation. Ah, yes, thanks for this reminder! I've repushed with doc bits added.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",651844316, https://github.com/simonw/sqlite-utils/issues/114#issuecomment-655290625,https://api.github.com/repos/simonw/sqlite-utils/issues/114,655290625,MDEyOklzc3VlQ29tbWVudDY1NTI5MDYyNQ==,9599,2020-07-08T05:15:45Z,2020-07-08T05:15:45Z,OWNER,"Ideally this would all happen in a single transaction, such that other processes talking to the database would not see any inconsistent state while the table copy was taking place. Need to confirm that this is possible. Also refs transactions thoughts in #121.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",621989740, https://github.com/simonw/sqlite-utils/pull/120#issuecomment-655289686,https://api.github.com/repos/simonw/sqlite-utils/issues/120,655289686,MDEyOklzc3VlQ29tbWVudDY1NTI4OTY4Ng==,9599,2020-07-08T05:13:11Z,2020-07-08T05:13:11Z,OWNER,"This is an excellent fix, thanks!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",652816158, https://github.com/simonw/sqlite-utils/pull/118#issuecomment-655286864,https://api.github.com/repos/simonw/sqlite-utils/issues/118,655286864,MDEyOklzc3VlQ29tbWVudDY1NTI4Njg2NA==,9599,2020-07-08T05:05:27Z,2020-07-08T05:05:36Z,OWNER,"The only thing missing from this PR is updates to the documentation. Those need to go in two places: - In the Python API docs. I suggest adding a note to this section about bulk inserts: https://github.com/simonw/sqlite-utils/blob/d0cdaaaf00249230e847be3a3b393ee2689fbfe4/docs/python-api.rst#bulk-inserts - In the CLI docs, in this section: https://github.com/simonw/sqlite-utils/blob/d0cdaaaf00249230e847be3a3b393ee2689fbfe4/docs/cli.rst#inserting-json-data Here's an example of a previous commit that includes updates to both CLI and API documentation: https://github.com/simonw/sqlite-utils/commit/f9473ace14878212c1fa968b7bd2f51e4f064dba#diff-e3e2a9bfd88566b05001b02a3f51d286","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",651844316, https://github.com/simonw/sqlite-utils/pull/118#issuecomment-655284168,https://api.github.com/repos/simonw/sqlite-utils/issues/118,655284168,MDEyOklzc3VlQ29tbWVudDY1NTI4NDE2OA==,9599,2020-07-08T04:58:00Z,2020-07-08T04:58:00Z,OWNER,"Oops didn't mean to click ""close"" there.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",651844316, https://github.com/simonw/sqlite-utils/pull/118#issuecomment-655284054,https://api.github.com/repos/simonw/sqlite-utils/issues/118,655284054,MDEyOklzc3VlQ29tbWVudDY1NTI4NDA1NA==,9599,2020-07-08T04:57:38Z,2020-07-08T04:57:38Z,OWNER,Thoughts on transactions would be much appreciated in #121 ,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",651844316, https://github.com/simonw/sqlite-utils/pull/118#issuecomment-655283393,https://api.github.com/repos/simonw/sqlite-utils/issues/118,655283393,MDEyOklzc3VlQ29tbWVudDY1NTI4MzM5Mw==,9599,2020-07-08T04:55:18Z,2020-07-08T04:55:18Z,OWNER,"This is a really good idea - and thank you for the detailed discussion in the pull request. I'm keen to discuss how transactions can work better. I tend to use this pattern in my own code: with db.conn: db[""table""].insert(...) But it's not documented and I've not though very hard about it! I like having inserts that handle 10,000+ rows commit on every chunk so I can watch their progress from another process, but the library should absolutely support people who want to commit all of the rows in a single transaction - or combine changes with DML. Lots to discuss here. I'll start a new issue.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",651844316, https://github.com/simonw/sqlite-utils/pull/118#issuecomment-655239728,https://api.github.com/repos/simonw/sqlite-utils/issues/118,655239728,MDEyOklzc3VlQ29tbWVudDY1NTIzOTcyOA==,79913,2020-07-08T02:16:42Z,2020-07-08T02:16:42Z,CONTRIBUTOR,"I fixed my original oops by moving the `DELETE FROM $table` out of the chunking loop and repushed. I think this change can be considered in isolation from issues around transactions, which I discuss next. I wanted to make the DELETE + INSERT happen all in the same transaction so it was robust, but that was more complicated than I expected. The transaction handling in the Database/Table classes isn't systematic, and this poses big hurdles to making `Table.insert_all` (or other operations) consistent and robust in the face of errors. For example, I wanted to do this (whitespace ignored in diff, so indentation change not highlighted): ```diff diff --git a/sqlite_utils/db.py b/sqlite_utils/db.py index d6b9ecf..4107ceb 100644 --- a/sqlite_utils/db.py +++ b/sqlite_utils/db.py @@ -1028,6 +1028,11 @@ class Table(Queryable): batch_size = max(1, min(batch_size, SQLITE_MAX_VARS // num_columns)) self.last_rowid = None self.last_pk = None + with self.db.conn: + # Explicit BEGIN is necessary because Python's sqlite3 doesn't + # issue implicit BEGINs for DDL, only DML. We mix DDL and DML + # below and might execute DDL first, e.g. for table creation. + self.db.conn.execute(""BEGIN"") if truncate and self.exists(): self.db.conn.execute(""DELETE FROM [{}];"".format(self.name)) for chunk in chunks(itertools.chain([first_record], records), batch_size): @@ -1038,7 +1043,11 @@ class Table(Queryable): # Use the first batch to derive the table names column_types = suggest_column_types(chunk) column_types.update(columns or {}) - self.create( + # Not self.create() because that is wrapped in its own + # transaction and Python's sqlite3 doesn't support + # nested transactions. + self.db.create_table( + self.name, column_types, pk, foreign_keys, @@ -1139,7 +1148,6 @@ class Table(Queryable): flat_values = list(itertools.chain(*values)) queries_and_params = [(sql, flat_values)] - with self.db.conn: for query, params in queries_and_params: try: result = self.db.conn.execute(query, params) ``` but that fails in tests because other methods call `insert/upsert/insert_all/upsert_all` in the middle of their transactions, so the BEGIN statement throws an error (no nested transactions allowed). Stepping back, it would be nice to make the transaction handling systematic and predictable. One way to do this is to make the `sqlite_utils/db.py` code generally not begin or commit any transactions, and require the caller to do that instead. This lets the caller mix and match the Python API calls into transactions as appropriate (which is impossible for the API methods themselves to fully determine). Then, make `sqlite_utils/cli.py` begin and commit a transaction in each `@cli.command` function, making each command robust and consistent in the face of errors. The big change here, and why I didn't just submit a patch, is that it dramatically changes the Python API to _require_ callers to begin a transaction rather than just immediately calling methods. There is also the caveat that for each transaction, an explicit `BEGIN` is also necessary so that DDL as well as DML (as well as `SELECT`s) are consistent and rolled back on error. There are several bugs.python.org discussions around this particular problem of DDL and some plans to make it better and consistent with DBAPI2, eventually. In the meantime, the sqlite-utils Database class could be a context manager which supports the incantations necessary to do proper transactions. This would still be a Python API change for callers but wouldn't expose them to the weirdness of the sqlite3's default transaction handling.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",651844316,