
issue_comments


17 rows where author_association = "OWNER" and "updated_at" is on date 2019-06-13 sorted by updated_at descending



issue 3

  • Mechanism for turning nested JSON into foreign keys / many-to-many 8
  • Additional Column Constraints? 5
  • Allow .insert(..., foreign_keys=()) to auto-detect table and primary key 4

user 1

  • simonw 17

author_association 1

  • OWNER · 17 ✖
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions issue performed_via_github_app
501541902 https://github.com/simonw/sqlite-utils/issues/26#issuecomment-501541902 https://api.github.com/repos/simonw/sqlite-utils/issues/26 MDEyOklzc3VlQ29tbWVudDUwMTU0MTkwMg== simonw 9599 2019-06-13T04:15:22Z 2019-06-13T16:55:42Z OWNER

So maybe something like this:

    curl https://api.github.com/repos/simonw/datasette/pulls?state=all | \
      sqlite-utils insert git.db pulls - \
        --flatten=base \
        --flatten=head \
        --extract=user:users:id \
        --extract=head_repo.license:licenses:key \
        --extract=head_repo.owner:users \
        --extract=head_repo \
        --extract=base_repo.license:licenses:key \
        --extract=base_repo.owner:users \
        --extract=base_repo

Is the order of those nested --extract lines significant I wonder? It would be nice if the order didn't matter and the code figured out the right execution plan on its own.

{
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 1,
    "rocket": 0,
    "eyes": 0
}
Mechanism for turning nested JSON into foreign keys / many-to-many 455486286  
501572149 https://github.com/simonw/sqlite-utils/issues/24#issuecomment-501572149 https://api.github.com/repos/simonw/sqlite-utils/issues/24 MDEyOklzc3VlQ29tbWVudDUwMTU3MjE0OQ== simonw 9599 2019-06-13T06:47:17Z 2019-06-13T06:47:17Z OWNER

@IgnoredAmbience this is now shipped in sqlite-utils 1.2 - documentation here:

  • https://sqlite-utils.readthedocs.io/en/latest/python-api.html#python-api-defaults-not-null
  • https://sqlite-utils.readthedocs.io/en/latest/cli.html#cli-defaults-not-null
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Additional Column Constraints? 449818897  
501548676 https://github.com/simonw/sqlite-utils/issues/25#issuecomment-501548676 https://api.github.com/repos/simonw/sqlite-utils/issues/25 MDEyOklzc3VlQ29tbWVudDUwMTU0ODY3Ng== simonw 9599 2019-06-13T04:58:12Z 2019-06-13T04:58:12Z OWNER

I'm going to reuse the ForeignKey named tuple here:

https://github.com/simonw/sqlite-utils/blob/d645032cfa4edbccd0542eecdddca29edf9f7b07/sqlite_utils/db.py#L17-L19

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Allow .insert(..., foreign_keys=()) to auto-detect table and primary key 449848803  
501548097 https://github.com/simonw/sqlite-utils/issues/25#issuecomment-501548097 https://api.github.com/repos/simonw/sqlite-utils/issues/25 MDEyOklzc3VlQ29tbWVudDUwMTU0ODA5Nw== simonw 9599 2019-06-13T04:54:33Z 2019-06-13T04:54:33Z OWNER

Still need to add this mechanism to .create_table() - this code here is all that needs to be modified - it needs to learn to deal with the alternative syntax for foreign keys and guess the missing data if necessary:

https://github.com/simonw/sqlite-utils/blob/d645032cfa4edbccd0542eecdddca29edf9f7b07/sqlite_utils/db.py#L115-L119

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Allow .insert(..., foreign_keys=()) to auto-detect table and primary key 449848803  
501543688 https://github.com/simonw/sqlite-utils/issues/26#issuecomment-501543688 https://api.github.com/repos/simonw/sqlite-utils/issues/26 MDEyOklzc3VlQ29tbWVudDUwMTU0MzY4OA== simonw 9599 2019-06-13T04:26:15Z 2019-06-13T04:26:15Z OWNER

I may ignore --flatten for the moment - users can do their own flattening using jq if they need that.

    curl https://api.github.com/repos/simonw/datasette/pulls?state=all | jq "
        [.[] | . + {
            base_label: .base.label,
            base_ref: .base.ref,
            base_sha: .base.sha,
            base_user: .base.user,
            base_repo: .base.repo,
            head_label: .head.label,
            head_ref: .head.ref,
            head_sha: .head.sha,
            head_user: .head.user,
            head_repo: .head.repo
        } | del(.base, .head, ._links)]
    "

Output: https://gist.github.com/simonw/2703ed43fcfe96eb8cfeee7b558b61e1

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Mechanism for turning nested JSON into foreign keys / many-to-many 455486286  
501542025 https://github.com/simonw/sqlite-utils/issues/26#issuecomment-501542025 https://api.github.com/repos/simonw/sqlite-utils/issues/26 MDEyOklzc3VlQ29tbWVudDUwMTU0MjAyNQ== simonw 9599 2019-06-13T04:16:10Z 2019-06-13T04:16:42Z OWNER

So for --extract the format is path-to-property:table-to-extract-to:primary-key

If we find an array (as opposed to a direct nested object) at the end of the dotted path we do a m2m table.

And if primary-key is omitted maybe we do the rowid thing with a foreign key back to ourselves.
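A minimal sketch of how that --extract value could be parsed, assuming the table and primary key parts are both optional. The names ExtractSpec and parse_extract are hypothetical and are not part of sqlite-utils.

```python
from typing import NamedTuple, Optional


class ExtractSpec(NamedTuple):
    path: list            # dotted path split into keys, e.g. ["head_repo", "owner"]
    table: Optional[str]  # table to extract to, or None if omitted
    pk: Optional[str]     # primary key column, or None if omitted


def parse_extract(spec: str) -> ExtractSpec:
    """Parse 'path-to-property:table-to-extract-to:primary-key' (hypothetical helper)."""
    parts = spec.split(":")
    if not parts[0] or len(parts) > 3:
        raise ValueError(f"Invalid --extract value: {spec!r}")
    # Pad with None so that a missing table or primary key is explicit
    path, table, pk = (parts + [None, None])[:3]
    return ExtractSpec(path.split("."), table or None, pk or None)


full = parse_extract("user:users:id")
no_pk = parse_extract("head_repo.owner:users")
path_only = parse_extract("head_repo")
```

A spec with pk set to None is the case where the rowid approach described above would apply.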

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Mechanism for turning nested JSON into foreign keys / many-to-many 455486286  
501539452 https://github.com/simonw/sqlite-utils/issues/26#issuecomment-501539452 https://api.github.com/repos/simonw/sqlite-utils/issues/26 MDEyOklzc3VlQ29tbWVudDUwMTUzOTQ1Mg== simonw 9599 2019-06-13T03:59:32Z 2019-06-13T03:59:32Z OWNER

Another complexity from the https://api.github.com/repos/simonw/datasette/pulls example:

We don't actually want head and base to be pulled out into a separate table. Our ideal table design would probably look something like this:

  • url: ...
  • id: 285698310
  • ...
  • user_id: 9599 -> refs users
  • head_label: simonw:travis-38dev
  • head_ref: travis-38dev
  • head_sha: f274f9004302c5ca75ce89d0abfd648457957e31
  • head_user_id: 9599 -> refs users
  • head_repo_id: 107914493 -> refs repos
  • base_label: simonw:master
  • base_ref: master
  • base_sha: 5e8fbf7f6fbc0b63d0479da3806dd9ccd6aaa945
  • base_user_id: 9599 -> refs users
  • base_repo_id: 107914493 -> refs repos

So the nested head and base sections here, instead of being extracted into another table, were flattened into their own columns.

So perhaps we need a flatten-nested-into-columns mechanism which can be used in conjunction with an extract-to-tables mechanism.
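Here is a rough sketch of what flatten-nested-into-columns could look like for a single record. The flatten function and its <key>_<subkey> naming are assumptions for illustration; the sample values are taken from the table design above.

```python
def flatten(record: dict, keys: list) -> dict:
    """Return a copy of record where each named nested dict is merged
    into top-level columns called <key>_<subkey> (hypothetical helper)."""
    flat = {}
    for name, value in record.items():
        if name in keys and isinstance(value, dict):
            for sub_name, sub_value in value.items():
                flat[f"{name}_{sub_name}"] = sub_value
        else:
            flat[name] = value
    return flat


pull = {
    "id": 285698310,
    "head": {"label": "simonw:travis-38dev", "ref": "travis-38dev"},
    "base": {"label": "simonw:master", "ref": "master"},
}
row = flatten(pull, ["head", "base"])
```

This only flattens one level. Nested objects such as head.user would still need the extract mechanism to become foreign keys.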

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Mechanism for turning nested JSON into foreign keys / many-to-many 455486286  
501538100 https://github.com/simonw/sqlite-utils/issues/26#issuecomment-501538100 https://api.github.com/repos/simonw/sqlite-utils/issues/26 MDEyOklzc3VlQ29tbWVudDUwMTUzODEwMA== simonw 9599 2019-06-13T03:51:27Z 2019-06-13T03:51:27Z OWNER

I like the term "extract" for what we are doing here, partly because that's the terminology I used in csvs-to-sqlite.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Mechanism for turning nested JSON into foreign keys / many-to-many 455486286  
501537812 https://github.com/simonw/sqlite-utils/issues/26#issuecomment-501537812 https://api.github.com/repos/simonw/sqlite-utils/issues/26 MDEyOklzc3VlQ29tbWVudDUwMTUzNzgxMg== simonw 9599 2019-06-13T03:49:37Z 2019-06-13T03:50:39Z OWNER

There's an interesting difference here between nested objects with a primary-key style ID and nested objects without.

If a nested object does not have a primary key, we could still shift it out to another table but it would need to be in a context where it has an automatic foreign key back to our current record.

A good example of something where that would be useful is the outageDevices key in https://github.com/simonw/pge-outages/blob/d890d09ff6e2997948028528e06c82e1efe30365/pge-outages.json#L13-L25

    {
        "outageNumber": "407367",
        "outageStartTime": "1560355216",
        "crewCurrentStatus": "PG&E repair crew is on-site working to restore power.",
        "currentEtor": "1560376800",
        "cause": "Our preliminary determination is that your outage was caused by scheduled maintenance work.",
        "estCustAffected": "3",
        "lastUpdateTime": "1560355709",
        "hazardFlag": "0",
        "latitude": "37.35629",
        "longitude": "-119.70469",
        "outageDevices": [
            {
                "latitude": "37.35409",
                "longitude": "-119.70575"
            },
            {
                "latitude": "37.35463",
                "longitude": "-119.70525"
            },
            {
                "latitude": "37.35562",
                "longitude": "-119.70467"
            }
        ],
        "regionName": "Ahwahnee"
    }

These could either be inserted into an outageDevices table that uses rowid... or we could have a mechanism where we automatically derive a primary key for them based on a hash of their data, hence avoiding creating duplicates even though we don't have a provided primary key.
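One possible way to derive such a primary key, sketched with the standard library. The choice of SHA-1 over a sorted-key JSON serialization is an assumption, and hash_id is a hypothetical name.

```python
import hashlib
import json


def hash_id(record: dict) -> str:
    """Derive a stable identifier from the content of a record."""
    # sort_keys makes the serialization independent of key order,
    # so the same data always produces the same hash
    serialized = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha1(serialized.encode("utf-8")).hexdigest()


device_a = {"latitude": "37.35409", "longitude": "-119.70575"}
device_a_reordered = {"longitude": "-119.70575", "latitude": "37.35409"}
device_b = {"latitude": "37.35463", "longitude": "-119.70525"}

id_a = hash_id(device_a)
id_a_again = hash_id(device_a_reordered)
id_b = hash_id(device_b)
```

Inserting with this value as the primary key means a device seen twice maps to the same row instead of creating a duplicate.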

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Mechanism for turning nested JSON into foreign keys / many-to-many 455486286  
501536495 https://github.com/simonw/sqlite-utils/issues/26#issuecomment-501536495 https://api.github.com/repos/simonw/sqlite-utils/issues/26 MDEyOklzc3VlQ29tbWVudDUwMTUzNjQ5NQ== simonw 9599 2019-06-13T03:40:21Z 2019-06-13T03:40:21Z OWNER

I think I can do something here with a very simple head.repo.owner path syntax. Normally this kind of syntax would have to take the difference between dictionaries and lists into account but I don't think that matters here.
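A small sketch of resolving that kind of dotted path against a nested record, assuming only dictionaries are traversed. resolve_path is a hypothetical helper, and returning None for a missing key is a design choice made for this example.

```python
def resolve_path(record: dict, path: str):
    """Follow a dotted path such as 'head.repo.owner' through nested dicts.
    Returns None if any step of the path is missing."""
    value = record
    for key in path.split("."):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value


pull = {"head": {"repo": {"owner": {"id": 9599, "login": "simonw"}}}}
owner = resolve_path(pull, "head.repo.owner")
missing = resolve_path(pull, "head.repo.license")
```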

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Mechanism for turning nested JSON into foreign keys / many-to-many 455486286  
501517535 https://github.com/simonw/sqlite-utils/issues/25#issuecomment-501517535 https://api.github.com/repos/simonw/sqlite-utils/issues/25 MDEyOklzc3VlQ29tbWVudDUwMTUxNzUzNQ== simonw 9599 2019-06-13T01:50:34Z 2019-06-13T01:50:34Z OWNER

If I'm going to do this then I should make the other_table and other_column arguments optional here too:

https://github.com/simonw/sqlite-utils/blob/2fed87da6ea990d295672e4db2c8ae97b787913e/sqlite_utils/cli.py#L201-L215

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Allow .insert(..., foreign_keys=()) to auto-detect table and primary key 449848803  
501516797 https://github.com/simonw/sqlite-utils/issues/25#issuecomment-501516797 https://api.github.com/repos/simonw/sqlite-utils/issues/25 MDEyOklzc3VlQ29tbWVudDUwMTUxNjc5Nw== simonw 9599 2019-06-13T01:46:36Z 2019-06-13T01:47:35Z OWNER

Maybe foreign_keys could even optionally just be a list of columns - it could then attempt to detect the related tables based on some rules-of-thumb and raise an error if there's no obvious candidate.

Rules:

  • If the column name ends in _id, remove that suffix and look for a matching table.
  • Try for a table which is the column name without the _id suffix with an s appended to it.
  • Try for a table that's the exact match for the column name.

If none of these rules match, raise an error.

So the above example could be further simplified to:

    db["usages"].insert_all(
        usages_to_insert,
        foreign_keys=["line_id", "definition_id"]
    )
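The rules of thumb could be sketched roughly like this. guess_foreign_table is a hypothetical function written for illustration and does not show how sqlite-utils implements the detection.

```python
def guess_foreign_table(column: str, table_names: list) -> str:
    """Guess which table a foreign key column refers to, trying the
    rules in order and raising an error if no candidate matches."""
    candidates = []
    if column.endswith("_id"):
        stem = column[: -len("_id")]
        candidates.append(stem)        # line_id -> line
        candidates.append(stem + "s")  # line_id -> lines
    candidates.append(column)          # exact match for the column name
    for candidate in candidates:
        if candidate in table_names:
            return candidate
    raise ValueError(f"No obvious table for foreign key column {column!r}")


tables = ["lines", "definitions", "author"]
line_table = guess_foreign_table("line_id", tables)
definition_table = guess_foreign_table("definition_id", tables)
author_table = guess_foreign_table("author", tables)
```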

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Allow .insert(..., foreign_keys=()) to auto-detect table and primary key 449848803  
501516028 https://github.com/simonw/sqlite-utils/issues/24#issuecomment-501516028 https://api.github.com/repos/simonw/sqlite-utils/issues/24 MDEyOklzc3VlQ29tbWVudDUwMTUxNjAyOA== simonw 9599 2019-06-13T01:42:36Z 2019-06-13T01:42:36Z OWNER

Maybe it's time to create a sqlite-utils create-table command here too, rather than forcing people to create tables only by inserting example data.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Additional Column Constraints? 449818897  
501515609 https://github.com/simonw/sqlite-utils/issues/24#issuecomment-501515609 https://api.github.com/repos/simonw/sqlite-utils/issues/24 MDEyOklzc3VlQ29tbWVudDUwMTUxNTYwOQ== simonw 9599 2019-06-13T01:40:12Z 2019-06-13T01:40:47Z OWNER

But what to do for creating a table?

For the Python function I could do this:

    db["cats"].create({
        "id": int,
        "name": str,
        "score": int,
        "weight": float,
    }, pk="id", not_null={"weight"}, defaults={"score": 1})

The CLI tool only ever creates tables as a side-effect of a sqlite-utils insert or sqlite-utils upsert. I can have them accept optional arguments, --not-null colname and --default colname value:

    echo '{"name": "Cleo", "age": 4, "score": 2}' | \
      sqlite-utils insert dogs.db dogs - \
        --not-null age \
        --not-null name \
        --default score 1

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Additional Column Constraints? 449818897  
501514575 https://github.com/simonw/sqlite-utils/issues/24#issuecomment-501514575 https://api.github.com/repos/simonw/sqlite-utils/issues/24 MDEyOklzc3VlQ29tbWVudDUwMTUxNDU3NQ== simonw 9599 2019-06-13T01:34:55Z 2019-06-13T01:34:55Z OWNER

Since you can't have one without the other, I'm going with --not-null-default= and not_null_default= for the add column versions of this.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Additional Column Constraints? 449818897  
501509642 https://github.com/simonw/sqlite-utils/issues/24#issuecomment-501509642 https://api.github.com/repos/simonw/sqlite-utils/issues/24 MDEyOklzc3VlQ29tbWVudDUwMTUwOTY0Mg== simonw 9599 2019-06-13T01:06:09Z 2019-06-13T01:06:09Z OWNER

Hmm... we need the ability to pass --not-null when we are creating a table as well.

If you attempt to add NOT NULL to a column after a table has first been created you get this error:

sqlite3.OperationalError: Cannot add a NOT NULL column with default value NULL
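The error can be reproduced with the standard library sqlite3 module. This is a sketch with made-up table and column names. It inserts a row first because, as far as I know, some SQLite versions only raise the error when the table already contains data. It also shows that supplying a default makes the same kind of ALTER TABLE succeed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dogs (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO dogs (name) VALUES ('Cleo')")

error = None
try:
    # NOT NULL without a default: existing rows would have to hold NULL
    conn.execute("ALTER TABLE dogs ADD COLUMN age INTEGER NOT NULL")
except sqlite3.DatabaseError as e:
    # Reported above as sqlite3.OperationalError; the parent class is
    # caught here so the sketch does not depend on the exact subclass
    error = str(e)

# With a default value the column can be added, and existing rows pick it up
conn.execute("ALTER TABLE dogs ADD COLUMN score INTEGER NOT NULL DEFAULT 1")
score = conn.execute("SELECT score FROM dogs").fetchone()[0]
```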

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Additional Column Constraints? 449818897  
501508302 https://github.com/simonw/sqlite-utils/issues/26#issuecomment-501508302 https://api.github.com/repos/simonw/sqlite-utils/issues/26 MDEyOklzc3VlQ29tbWVudDUwMTUwODMwMg== simonw 9599 2019-06-13T00:57:52Z 2019-06-13T00:57:52Z OWNER

Two challenges here:

  1. We need a way to specify which tables should be used - e.g. "put records from the "user" key in a users table, put multiple records from the "labels" key in a table called labels" (we can pick an automatic name for the m2m table, though it might be nice to have an option to customize it)

  2. Should we deal with nested objects? Consider https://api.github.com/repos/simonw/datasette/pulls for example:

Here we have head.user as a user, head.repo as a repo, and head.repo.owner as another user.

Ideally our mechanism for specifying which table things should be pulled out into would handle this, but it's getting a bit complicated.
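To make the simplest case concrete, here is a sketch using the standard library sqlite3 module that pulls the "user" key out into a users table and stores a user_id reference instead. The schema, the insert_pull helper and the sample records are all made up for illustration, and nothing here handles the deeper head.repo.owner nesting.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, login TEXT);
    CREATE TABLE pulls (
        id INTEGER PRIMARY KEY,
        title TEXT,
        user_id INTEGER REFERENCES users(id)
    );
""")


def insert_pull(conn, pull):
    """Insert one pull request, extracting its nested user object."""
    user = pull["user"]
    # INSERT OR IGNORE so a user seen on several pulls is stored once
    conn.execute(
        "INSERT OR IGNORE INTO users (id, login) VALUES (?, ?)",
        (user["id"], user["login"]),
    )
    conn.execute(
        "INSERT INTO pulls (id, title, user_id) VALUES (?, ?, ?)",
        (pull["id"], pull["title"], user["id"]),
    )


# Made-up sample records shaped like the GitHub API response
sample_pulls = [
    {"id": 1, "title": "First", "user": {"id": 9599, "login": "simonw"}},
    {"id": 2, "title": "Second", "user": {"id": 9599, "login": "simonw"}},
]
for pull in sample_pulls:
    insert_pull(conn, pull)

user_count = conn.execute("SELECT count(*) FROM users").fetchone()[0]
pull_count = conn.execute("SELECT count(*) FROM pulls").fetchone()[0]
```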

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Mechanism for turning nested JSON into foreign keys / many-to-many 455486286  


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
, [performed_via_github_app] TEXT);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);
Powered by Datasette · Queries took 360.254ms · About: github-to-sqlite