issue_comments

7 rows where issue = 610517472 (sqlite3.OperationalError: too many SQL variables in insert_all when using rows with varying numbers of columns) and user = 9599 (simonw), sorted by updated_at descending


Comments (newest first):
622587177 · simonw (OWNER) · 2020-05-01T22:07:51Z · https://github.com/simonw/sqlite-utils/issues/103#issuecomment-622587177

This is my failed attempt to recreate the bug (plus some extra debugging output):

```diff
% git diff
diff --git a/sqlite_utils/db.py b/sqlite_utils/db.py
index dd49d5c..ea42aea 100644
--- a/sqlite_utils/db.py
+++ b/sqlite_utils/db.py
@@ -1013,7 +1013,11 @@ class Table(Queryable):
         assert (
             num_columns <= SQLITE_MAX_VARS
         ), "Rows can have a maximum of {} columns".format(SQLITE_MAX_VARS)
+        print("default batch_size = ", batch_size)
         batch_size = max(1, min(batch_size, SQLITE_MAX_VARS // num_columns))
+        print("new batch_size = {},num_columns = {}, MAX_VARS // num_columns = {}".format(
+            batch_size, num_columns, SQLITE_MAX_VARS // num_columns
+        ))
         self.last_rowid = None
         self.last_pk = None
         for chunk in chunks(itertools.chain([first_record], records), batch_size):
@@ -1124,6 +1128,9 @@ class Table(Queryable):
             )
             flat_values = list(itertools.chain(*values))
             queries_and_params = [(sql, flat_values)]
+            print(sql.count("?"), len(flat_values))
+
+            # print(json.dumps(queries_and_params, indent=4))

         with self.db.conn:
             for query, params in queries_and_params:

diff --git a/tests/test_create.py b/tests/test_create.py
index 5290cd8..52940df 100644
--- a/tests/test_create.py
+++ b/tests/test_create.py
@@ -853,3 +853,33 @@ def test_create_with_nested_bytes(fresh_db):
     record = {"id": 1, "data": {"foo": b"bytes"}}
     fresh_db["t"].insert(record)
     assert [{"id": 1, "data": '{"foo": "b\'bytes\'"}'}] == list(fresh_db["t"].rows)
+
+
+def test_create_throws_useful_error_with_increasing_number_of_columns(fresh_db):
+    # https://github.com/simonw/sqlite-utils/issues/103
+    def rows():
+        yield {"name": 0}
+        for i in range(1, 1001):
+            yield {
+                "name": i,
+                "age": i,
+                "size": i,
+                "name2": i,
+                "age2": i,
+                "size2": i,
+                "name3": i,
+                "age3": i,
+                "size3": i,
+                "name4": i,
+                "age4": i,
+                "size4": i,
+                "name5": i,
+                "age5": i,
+                "size5": i,
+                "name6": i,
+                "age6": i,
+                "size6": i,
+            }
+
+    fresh_db["t"].insert_all(rows())
+    assert 1001 == fresh_db["t"].count
```
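The arithmetic behind the error can be checked without touching a database. A minimal sketch in pure Python (`vars_needed` and the 999 limit are illustrative, not part of sqlite-utils):

```python
SQLITE_MAX_VARS = 999  # illustrative: SQLite's historical default bound-parameter limit

def vars_needed(batch_size, num_columns):
    """Bound parameters a single multi-row INSERT consumes."""
    return batch_size * num_columns

# A batch size of 100 chosen from a 1-column first row, applied to a
# batch that turns out to contain 18-column rows:
print(vars_needed(100, 18))  # 1800, far beyond a 999-variable limit
```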

622584433 · simonw (OWNER) · 2020-05-01T21:57:52Z · https://github.com/simonw/sqlite-utils/issues/103#issuecomment-622584433

@b0b5h4rp13 I'm having trouble creating a test that triggers this bug. Could you share a chunk of code that replicates what you're seeing here?

622565276 · simonw (OWNER) · 2020-05-01T20:57:16Z · https://github.com/simonw/sqlite-utils/issues/103#issuecomment-622565276

I'm reconsidering this: I think this is going to happen ANY time someone has at least one row that is wider than the first row. So at the very least I should show a more understandable error message.
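That condition can be sketched as a predicate over a buffered list of rows. A pure-Python illustration (`will_overflow` is a hypothetical helper, not sqlite-utils code; 999 is assumed as the variable limit):

```python
SQLITE_MAX_VARS = 999  # illustrative default limit on bound parameters

def will_overflow(rows, default_batch=100, limit=SQLITE_MAX_VARS):
    """True when a batch sized from the FIRST row's width cannot hold
    the widest row without exceeding the variable limit."""
    batch = max(1, min(default_batch, limit // len(rows[0])))
    return any(batch * len(row) > limit for row in rows)

# Narrow first row, wider rows later: overflow.
print(will_overflow([{"n": 0}] + [{f"c{i}": i for i in range(18)}] * 5))  # True
# Widest row first: the batch is sized correctly, no overflow.
print(will_overflow([{f"c{i}": i for i in range(18)}] * 5))               # False
```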

622563188 · simonw (OWNER) · 2020-05-01T20:51:24Z · https://github.com/simonw/sqlite-utils/issues/103#issuecomment-622563188

Hopefully anyone who runs into this problem in the future will search for and find this issue thread!

622563059 · simonw (OWNER) · 2020-05-01T20:51:01Z · https://github.com/simonw/sqlite-utils/issues/103#issuecomment-622563059

I'm not sure what to do about this.

I was thinking the solution would be to look at ALL of the rows in a batch before deciding on the maximum number of columns, but that doesn't work because we calculate batch size based on the number of columns!

I think my recommendation here is to manually pass a `batch_size=` argument to `.insert_all()` if you run into this error.
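Sizing that batch_size by hand amounts to dividing the variable limit by the widest row you expect. A sketch, assuming the 999-variable default (`safe_batch_size` is a hypothetical helper, not part of the library):

```python
SQLITE_MAX_VARS = 999  # illustrative default limit

def safe_batch_size(max_columns, limit=SQLITE_MAX_VARS):
    """Largest batch size whose widest possible batch stays within the limit."""
    return max(1, limit // max_columns)

# For data whose widest row has 18 columns, you would pass e.g.
#     db["t"].insert_all(rows, batch_size=safe_batch_size(18))
print(safe_batch_size(18))  # 55
print(safe_batch_size(1))   # 999
```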

622561944 · simonw (OWNER) · 2020-05-01T20:47:51Z · https://github.com/simonw/sqlite-utils/issues/103#issuecomment-622561944

Yup, we only take the number of columns in the first record into account at the moment: https://github.com/simonw/sqlite-utils/blob/d56029549acae0b0ea94c5a0f783e3b3895d9218/sqlite_utils/db.py#L1007-L1016
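Those linked lines boil down to a calculation driven entirely by the first record. A pure-Python paraphrase (assuming a default batch_size of 100 and a SQLITE_MAX_VARS of 999; `effective_batch_size` is an illustrative name, not the library's):

```python
SQLITE_MAX_VARS = 999  # assumed value of the constant the linked code uses

def effective_batch_size(first_record, batch_size=100):
    # num_columns comes from the FIRST record only -- the root of the bug
    num_columns = len(first_record)
    return max(1, min(batch_size, SQLITE_MAX_VARS // num_columns))

# A 1-column first record leaves the batch at 100 rows; if later rows have
# 18 columns, a single INSERT then needs 100 * 18 = 1800 > 999 variables.
print(effective_batch_size({"name": 0}))  # 100
print(effective_batch_size({f"c{i}": i for i in range(18)}))  # 55
```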

622561585 · simonw (OWNER) · 2020-05-01T20:46:50Z · https://github.com/simonw/sqlite-utils/issues/103#issuecomment-622561585

The varying number of columns thing is interesting - I don't think the tests cover that case much if at all.


The underlying table schema:

```sql
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id]),
   [performed_via_github_app] TEXT
);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);
```
Powered by Datasette · Queries took 801.579ms · About: github-to-sqlite