github
html_url | issue_url | id | node_id | user | created_at | updated_at | author_association | body | reactions | issue | performed_via_github_app |
---|---|---|---|---|---|---|---|---|---|---|---|
https://github.com/simonw/sqlite-utils/pull/118#issuecomment-655653292 | https://api.github.com/repos/simonw/sqlite-utils/issues/118 | 655653292 | MDEyOklzc3VlQ29tbWVudDY1NTY1MzI5Mg== | 9599 | 2020-07-08T17:26:02Z | 2020-07-08T17:26:02Z | OWNER | Awesome, thank you very much. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
651844316 | |
https://github.com/simonw/sqlite-utils/pull/118#issuecomment-655643078 | https://api.github.com/repos/simonw/sqlite-utils/issues/118 | 655643078 | MDEyOklzc3VlQ29tbWVudDY1NTY0MzA3OA== | 79913 | 2020-07-08T17:05:59Z | 2020-07-08T17:05:59Z | CONTRIBUTOR | > The only thing missing from this PR is updates to the documentation. Ah, yes, thanks for this reminder! I've repushed with doc bits added. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
651844316 | |
https://github.com/simonw/sqlite-utils/pull/118#issuecomment-655286864 | https://api.github.com/repos/simonw/sqlite-utils/issues/118 | 655286864 | MDEyOklzc3VlQ29tbWVudDY1NTI4Njg2NA== | 9599 | 2020-07-08T05:05:27Z | 2020-07-08T05:05:36Z | OWNER | The only thing missing from this PR is updates to the documentation. Those need to go in two places: - In the Python API docs. I suggest adding a note to this section about bulk inserts: https://github.com/simonw/sqlite-utils/blob/d0cdaaaf00249230e847be3a3b393ee2689fbfe4/docs/python-api.rst#bulk-inserts - In the CLI docs, in this section: https://github.com/simonw/sqlite-utils/blob/d0cdaaaf00249230e847be3a3b393ee2689fbfe4/docs/cli.rst#inserting-json-data Here's an example of a previous commit that includes updates to both CLI and API documentation: https://github.com/simonw/sqlite-utils/commit/f9473ace14878212c1fa968b7bd2f51e4f064dba#diff-e3e2a9bfd88566b05001b02a3f51d286 | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
651844316 | |
https://github.com/simonw/sqlite-utils/pull/118#issuecomment-655284168 | https://api.github.com/repos/simonw/sqlite-utils/issues/118 | 655284168 | MDEyOklzc3VlQ29tbWVudDY1NTI4NDE2OA== | 9599 | 2020-07-08T04:58:00Z | 2020-07-08T04:58:00Z | OWNER | Oops didn't mean to click "close" there. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
651844316 | |
https://github.com/simonw/sqlite-utils/pull/118#issuecomment-655284054 | https://api.github.com/repos/simonw/sqlite-utils/issues/118 | 655284054 | MDEyOklzc3VlQ29tbWVudDY1NTI4NDA1NA== | 9599 | 2020-07-08T04:57:38Z | 2020-07-08T04:57:38Z | OWNER | Thoughts on transactions would be much appreciated in #121 | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
651844316 | |
https://github.com/simonw/sqlite-utils/pull/118#issuecomment-655283393 | https://api.github.com/repos/simonw/sqlite-utils/issues/118 | 655283393 | MDEyOklzc3VlQ29tbWVudDY1NTI4MzM5Mw== | 9599 | 2020-07-08T04:55:18Z | 2020-07-08T04:55:18Z | OWNER | This is a really good idea - and thank you for the detailed discussion in the pull request. I'm keen to discuss how transactions can work better. I tend to use this pattern in my own code: with db.conn: db["table"].insert(...) But it's not documented and I've not though very hard about it! I like having inserts that handle 10,000+ rows commit on every chunk so I can watch their progress from another process, but the library should absolutely support people who want to commit all of the rows in a single transaction - or combine changes with DML. Lots to discuss here. I'll start a new issue. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
651844316 | |
https://github.com/simonw/sqlite-utils/pull/118#issuecomment-655239728 | https://api.github.com/repos/simonw/sqlite-utils/issues/118 | 655239728 | MDEyOklzc3VlQ29tbWVudDY1NTIzOTcyOA== | 79913 | 2020-07-08T02:16:42Z | 2020-07-08T02:16:42Z | CONTRIBUTOR | I fixed my original oops by moving the `DELETE FROM $table` out of the chunking loop and repushed. I think this change can be considered in isolation from issues around transactions, which I discuss next. I wanted to make the DELETE + INSERT happen all in the same transaction so it was robust, but that was more complicated than I expected. The transaction handling in the Database/Table classes isn't systematic, and this poses big hurdles to making `Table.insert_all` (or other operations) consistent and robust in the face of errors. For example, I wanted to do this (whitespace ignored in diff, so indentation change not highlighted): ```diff diff --git a/sqlite_utils/db.py b/sqlite_utils/db.py index d6b9ecf..4107ceb 100644 --- a/sqlite_utils/db.py +++ b/sqlite_utils/db.py @@ -1028,6 +1028,11 @@ class Table(Queryable): batch_size = max(1, min(batch_size, SQLITE_MAX_VARS // num_columns)) self.last_rowid = None self.last_pk = None + with self.db.conn: + # Explicit BEGIN is necessary because Python's sqlite3 doesn't + # issue implicit BEGINs for DDL, only DML. We mix DDL and DML + # below and might execute DDL first, e.g. for table creation. + self.db.conn.execute("BEGIN") if truncate and self.exists(): self.db.conn.execute("DELETE FROM [{}];".format(self.name)) for chunk in chunks(itertools.chain([first_record], records), batch_size): @@ -1038,7 +1043,11 @@ class Table(Queryable): # Use the first batch to derive the table names column_types = suggest_column_types(chunk) column_types.update(columns or {}) - self.create( + # Not self.create() because that is wrapped in its own + # transaction and Python's sqlite3 doesn't support + # nested transactions. + self.db.create_table( + … | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
651844316 | |
https://github.com/simonw/sqlite-utils/pull/118#issuecomment-655052451 | https://api.github.com/repos/simonw/sqlite-utils/issues/118 | 655052451 | MDEyOklzc3VlQ29tbWVudDY1NTA1MjQ1MQ== | 79913 | 2020-07-07T18:45:23Z | 2020-07-07T18:45:23Z | CONTRIBUTOR | Ah, I see the problem. The truncate is inside a loop I didn't realize was there. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
651844316 | |
https://github.com/simonw/sqlite-utils/pull/118#issuecomment-655018966 | https://api.github.com/repos/simonw/sqlite-utils/issues/118 | 655018966 | MDEyOklzc3VlQ29tbWVudDY1NTAxODk2Ng== | 79913 | 2020-07-07T17:41:06Z | 2020-07-07T17:41:06Z | CONTRIBUTOR | Hmm, while tests pass, this may not work as intended on larger datasets. Looking into it. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
651844316 |