home / github

Menu
  • Search all tables
  • GraphQL API

issues

Table actions
  • GraphQL API for issues

5 rows where comments = 7, repo = 140912432 and state = "open" sorted by updated_at descending

✖
✖
✖
✖

✎ View and edit SQL

This data as json, CSV (advanced)

Suggested facets: author_association, created_at (date), updated_at (date)

type 2

  • issue 4
  • pull 1

state 1

  • open · 5 ✖

repo 1

  • sqlite-utils · 5 ✖
id node_id number title user state locked assignee milestone comments created_at updated_at ▲ closed_at author_association pull_request body repo type active_lock_reason performed_via_github_app reactions draft state_reason
1383646615 I_kwDOCGYnMM5SeMWX 491 Ability to merge databases and tables sgraaf 8904453 open 0     7 2022-09-23T11:10:55Z 2023-06-14T22:14:24Z   NONE  

Hi! Let me firstly say that I am a big fan of your work -- I follow your tweets and blog posts with great interest 😄.

Now onto the matter at hand: I think it would be great if sqlite-utils included a merge or combine command, with the purpose of combining different SQLite databases into a single SQLite database. This way, the newly "merged" database would contain all differently named tables contained in the databases to be merged as-is, as well a concatenation of all tables of the same name.

This could look something like this:

bash sqlite-utils merge cats.db dogs.db > animals.db

I imagine this is rather straightforward if all databases involved in the merge contain differently named tables (i.e. no chance of conflicts), but things get slightly more complicated if two or more of the databases to be merged contain tables with the same name. Not only do you have to "do something" with the primary key(s), but these tables could also simply have different schemas (and therefore be incompatible for concatenation to begin with).

Anyhow, I would love your thoughts on this, and, if you are open to it, work together on the design and implementation!

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/491/reactions",
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
743384829 MDExOlB1bGxSZXF1ZXN0NTIxMjg3OTk0 203 changes to allow for compound foreign keys drkane 1049910 open 0     7 2020-11-16T00:30:10Z 2023-01-25T18:47:18Z   FIRST_TIME_CONTRIBUTOR simonw/sqlite-utils/pulls/203

Add support for compound foreign keys, as per issue #117

Not sure if this is the right approach. In particular I'm unsure about:

  • the new ForeignKey class, which replaces the namedtuple in order to ensure that column and other_column are forced into tuples. The class does the job, but doesn't feel very elegant.
  • I haven't rewritten guess_foreign_table to take account of multiple columns, so it just checks for the first column in the foreign key definition. This isn't ideal.
  • I haven't added any ability to the CLI to add compound foreign keys, it's only in the python API at the moment.

The PR also contains a minor related change that columns and tables are always quoted in foreign key definitions.

sqlite-utils 140912432 pull    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/203/reactions",
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 1,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
0  
1082651698 I_kwDOCGYnMM5Ah_Qy 358 Support for CHECK constraints luxint 11597658 open 0     7 2021-12-16T21:19:45Z 2022-09-25T07:15:59Z   NONE  

Hi,

I noticed the transform.table() method doesn't have an option to add/change or drop a check constraint (see https://sqlite.org/lang_createtable.html -> 3.7 Check Constraints. would be great to have this as an option!

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/358/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
722816436 MDU6SXNzdWU3MjI4MTY0MzY= 186 .extract() shouldn't extract null values simonw 9599 open 0     7 2020-10-16T02:41:08Z 2021-08-12T12:32:14Z   OWNER  

This almost works, but it creates a rogue type record with a value of None. In [1]: import sqlite_utils In [2]: db = sqlite_utils.Database(memory=True) In [5]: db["creatures"].insert_all([ {"id": 1, "name": "Simon", "type": None}, {"id": 2, "name": "Natalie", "type": None}, {"id": 3, "name": "Cleo", "type": "dog"}], pk="id") Out[5]: <Table creatures (id, name, type)> In [7]: db["creatures"].extract("type") Out[7]: <Table creatures (id, name, type_id)> In [8]: list(db["creatures"].rows) Out[8]: [{'id': 1, 'name': 'Simon', 'type_id': None}, {'id': 2, 'name': 'Natalie', 'type_id': None}, {'id': 3, 'name': 'Cleo', 'type_id': 2}] In [9]: db["type"] Out[9]: <Table type (id, type)> In [10]: list(db["type"].rows) Out[10]: [{'id': 1, 'type': None}, {'id': 2, 'type': 'dog'}]

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/186/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
688670158 MDU6SXNzdWU2ODg2NzAxNTg= 147 SQLITE_MAX_VARS maybe hard-coded too low simonwiles 96218 open 0     7 2020-08-30T07:26:45Z 2021-02-15T21:27:55Z   CONTRIBUTOR  

I came across this while about to open an issue and PR against the documentation for batch_size, which is a bit incomplete.

As mentioned in #145, while:

SQLITE_MAX_VARIABLE_NUMBER ... defaults to 999 for SQLite versions prior to 3.32.0 (2020-05-22) or 32766 for SQLite versions after 3.32.0.

it is common that it is increased at compile time. Debian-based systems, for example, seem to ship with a version of sqlite compiled with SQLITE_MAX_VARIABLE_NUMBER set to 250,000, and I believe this is the case for homebrew installations too.

In working to understand what batch_size was actually doing and why, I realized that by setting SQLITE_MAX_VARS in db.py to match the value my sqlite was compiled with (I'm on Debian), I was able to decrease the time to insert_all() my test data set (~128k records across 7 tables) from ~26.5s to ~3.5s. Given that this about .05% of my total dataset, this is time I am keen to save...

Unfortunately, it seems that sqlite3 in the python standard library doesn't expose the get_limit() C API (even though pysqlite used to), so it's hard to know what value sqlite has been compiled with (note that this could mean, I suppose, that it's less than 999, and even hardcoding SQLITE_MAX_VARS to the conservative default might not be adequate. It can also be lowered -- but not raised -- at runtime). The best I could come up with is echo "" | sqlite3 -cmd ".limits variable_number" (only available in sqlite >= 2015-05-07 (3.8.10)).

Obviously this couldn't be relied upon in sqlite_utils, but I wonder what your opinion would be about exposing SQLITE_MAX_VARS as a user-configurable parameter (with suitable "here be dragons" warnings)? I'm going to go ahead and monkey-patch it for my purposes in any event, but it seems like it might be worth considering.

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/147/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   

Advanced export

JSON shape: default, array, newline-delimited, object

CSV options:

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [pull_request] TEXT,
   [body] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
, [active_lock_reason] TEXT, [performed_via_github_app] TEXT, [reactions] TEXT, [draft] INTEGER, [state_reason] TEXT);
CREATE INDEX [idx_issues_repo]
                ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
                ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
                ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
                ON [issues] ([user]);
Powered by Datasette · Queries took 71.711ms · About: github-to-sqlite
  • Sort ascending
  • Sort descending
  • Facet by this
  • Hide this column
  • Show all columns
  • Show not-blank rows