issue_comments

9 rows where issue = 651844316 sorted by updated_at descending

View and edit SQL

Suggested facets: created_at (date), updated_at (date)

user

author_association

issue

  • Add insert --truncate option · 9
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions issue performed_via_github_app
655653292 https://github.com/simonw/sqlite-utils/pull/118#issuecomment-655653292 https://api.github.com/repos/simonw/sqlite-utils/issues/118 MDEyOklzc3VlQ29tbWVudDY1NTY1MzI5Mg== simonw 9599 2020-07-08T17:26:02Z 2020-07-08T17:26:02Z OWNER

Awesome, thank you very much.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add insert --truncate option 651844316  
655643078 https://github.com/simonw/sqlite-utils/pull/118#issuecomment-655643078 https://api.github.com/repos/simonw/sqlite-utils/issues/118 MDEyOklzc3VlQ29tbWVudDY1NTY0MzA3OA== tsibley 79913 2020-07-08T17:05:59Z 2020-07-08T17:05:59Z CONTRIBUTOR

The only thing missing from this PR is updates to the documentation.

Ah, yes, thanks for this reminder! I've repushed with doc bits added.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add insert --truncate option 651844316  
655286864 https://github.com/simonw/sqlite-utils/pull/118#issuecomment-655286864 https://api.github.com/repos/simonw/sqlite-utils/issues/118 MDEyOklzc3VlQ29tbWVudDY1NTI4Njg2NA== simonw 9599 2020-07-08T05:05:27Z 2020-07-08T05:05:36Z OWNER

The only thing missing from this PR is updates to the documentation. Those need to go in two places:

Here's an example of a previous commit that includes updates to both CLI and API documentation: https://github.com/simonw/sqlite-utils/commit/f9473ace14878212c1fa968b7bd2f51e4f064dba#diff-e3e2a9bfd88566b05001b02a3f51d286

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add insert --truncate option 651844316  
655284168 https://github.com/simonw/sqlite-utils/pull/118#issuecomment-655284168 https://api.github.com/repos/simonw/sqlite-utils/issues/118 MDEyOklzc3VlQ29tbWVudDY1NTI4NDE2OA== simonw 9599 2020-07-08T04:58:00Z 2020-07-08T04:58:00Z OWNER

Oops didn't mean to click "close" there.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add insert --truncate option 651844316  
655284054 https://github.com/simonw/sqlite-utils/pull/118#issuecomment-655284054 https://api.github.com/repos/simonw/sqlite-utils/issues/118 MDEyOklzc3VlQ29tbWVudDY1NTI4NDA1NA== simonw 9599 2020-07-08T04:57:38Z 2020-07-08T04:57:38Z OWNER

Thoughts on transactions would be much appreciated in #121

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add insert --truncate option 651844316  
655283393 https://github.com/simonw/sqlite-utils/pull/118#issuecomment-655283393 https://api.github.com/repos/simonw/sqlite-utils/issues/118 MDEyOklzc3VlQ29tbWVudDY1NTI4MzM5Mw== simonw 9599 2020-07-08T04:55:18Z 2020-07-08T04:55:18Z OWNER

This is a really good idea - and thank you for the detailed discussion in the pull request.

I'm keen to discuss how transactions can work better. I tend to use this pattern in my own code:

with db.conn:
    db["table"].insert(...)

But it's not documented and I've not though very hard about it!

I like having inserts that handle 10,000+ rows commit on every chunk so I can watch their progress from another process, but the library should absolutely support people who want to commit all of the rows in a single transaction - or combine changes with DML.

Lots to discuss here. I'll start a new issue.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add insert --truncate option 651844316  
655239728 https://github.com/simonw/sqlite-utils/pull/118#issuecomment-655239728 https://api.github.com/repos/simonw/sqlite-utils/issues/118 MDEyOklzc3VlQ29tbWVudDY1NTIzOTcyOA== tsibley 79913 2020-07-08T02:16:42Z 2020-07-08T02:16:42Z CONTRIBUTOR

I fixed my original oops by moving the DELETE FROM $table out of the chunking loop and repushed. I think this change can be considered in isolation from issues around transactions, which I discuss next.

I wanted to make the DELETE + INSERT happen all in the same transaction so it was robust, but that was more complicated than I expected. The transaction handling in the Database/Table classes isn't systematic, and this poses big hurdles to making Table.insert_all (or other operations) consistent and robust in the face of errors.

For example, I wanted to do this (whitespace ignored in diff, so indentation change not highlighted):

diff --git a/sqlite_utils/db.py b/sqlite_utils/db.py
index d6b9ecf..4107ceb 100644
--- a/sqlite_utils/db.py
+++ b/sqlite_utils/db.py
@@ -1028,6 +1028,11 @@ class Table(Queryable):
         batch_size = max(1, min(batch_size, SQLITE_MAX_VARS // num_columns))
         self.last_rowid = None
         self.last_pk = None
+        with self.db.conn:
+            # Explicit BEGIN is necessary because Python's sqlite3 doesn't
+            # issue implicit BEGINs for DDL, only DML.  We mix DDL and DML
+            # below and might execute DDL first, e.g. for table creation.
+            self.db.conn.execute("BEGIN")
             if truncate and self.exists():
                 self.db.conn.execute("DELETE FROM [{}];".format(self.name))
             for chunk in chunks(itertools.chain([first_record], records), batch_size):
@@ -1038,7 +1043,11 @@ class Table(Queryable):
                         # Use the first batch to derive the table names
                         column_types = suggest_column_types(chunk)
                         column_types.update(columns or {})
-                    self.create(
+                        # Not self.create() because that is wrapped in its own
+                        # transaction and Python's sqlite3 doesn't support
+                        # nested transactions.
+                        self.db.create_table(
+                            self.name,
                             column_types,
                             pk,
                             foreign_keys,
@@ -1139,7 +1148,6 @@ class Table(Queryable):
                     flat_values = list(itertools.chain(*values))
                     queries_and_params = [(sql, flat_values)]

-            with self.db.conn:
                 for query, params in queries_and_params:
                     try:
                         result = self.db.conn.execute(query, params)

but that fails in tests because other methods call insert/upsert/insert_all/upsert_all in the middle of their transactions, so the BEGIN statement throws an error (no nested transactions allowed).

Stepping back, it would be nice to make the transaction handling systematic and predictable. One way to do this is to make the sqlite_utils/db.py code generally not begin or commit any transactions, and require the caller to do that instead. This lets the caller mix and match the Python API calls into transactions as appropriate (which is impossible for the API methods themselves to fully determine). Then, make sqlite_utils/cli.py begin and commit a transaction in each @cli.command function, making each command robust and consistent in the face of errors. The big change here, and why I didn't just submit a patch, is that it dramatically changes the Python API to require callers to begin a transaction rather than just immediately calling methods.

There is also the caveat that for each transaction, an explicit BEGIN is also necessary so that DDL as well as DML (as well as SELECTs) are consistent and rolled back on error. There are several bugs.python.org discussions around this particular problem of DDL and some plans to make it better and consistent with DBAPI2, eventually. In the meantime, the sqlite-utils Database class could be a context manager which supports the incantations necessary to do proper transactions. This would still be a Python API change for callers but wouldn't expose them to the weirdness of the sqlite3's default transaction handling.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add insert --truncate option 651844316  
655052451 https://github.com/simonw/sqlite-utils/pull/118#issuecomment-655052451 https://api.github.com/repos/simonw/sqlite-utils/issues/118 MDEyOklzc3VlQ29tbWVudDY1NTA1MjQ1MQ== tsibley 79913 2020-07-07T18:45:23Z 2020-07-07T18:45:23Z CONTRIBUTOR

Ah, I see the problem. The truncate is inside a loop I didn't realize was there.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add insert --truncate option 651844316  
655018966 https://github.com/simonw/sqlite-utils/pull/118#issuecomment-655018966 https://api.github.com/repos/simonw/sqlite-utils/issues/118 MDEyOklzc3VlQ29tbWVudDY1NTAxODk2Ng== tsibley 79913 2020-07-07T17:41:06Z 2020-07-07T17:41:06Z CONTRIBUTOR

Hmm, while tests pass, this may not work as intended on larger datasets. Looking into it.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add insert --truncate option 651844316  

Advanced export

JSON shape: default, array, newline-delimited, object

CSV options:

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
, [performed_via_github_app] TEXT);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);
Powered by Datasette · Query took 35.179ms · About: github-to-sqlite