issue_comments

2 rows where author_association = "OWNER" and issue = 1039037439 sorted by updated_at descending


id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions issue performed_via_github_app
979442854 https://github.com/simonw/sqlite-utils/pull/333#issuecomment-979442854 https://api.github.com/repos/simonw/sqlite-utils/issues/333 IC_kwDOCGYnMM46YRym simonw 9599 2021-11-25T19:47:26Z 2021-11-25T19:47:26Z OWNER

I just remembered that there's one other place that this could fit: as a Datasette "insert" plugin.

This is vaporware at the moment, but the idea is that Datasette itself could grow a mechanism for importing data, that's driven by plugins.

Out of the box, Datasette would be able to import CSV files, similar to sqlite-utils insert ... --csv, but plugins would then be able to add support for additional formats such as GeoJSON or, in this case, Parquet.

The neat thing about having it as a Datasette plugin is that one plugin would enable three different ways of importing data:

  1. Via a new datasette insert ... CLI option (similar to sqlite-utils)
  2. Via a web form upload interface, where authenticated Datasette users would be able to upload files
  3. Via an API interface, where files could be programmatically submitted to a running Datasette server
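
As a rough illustration of why one plugin could serve all three entry points, here is a minimal sketch of a format registry. Everything in it (`IMPORTERS`, `register_importer`, `parse_upload`) is hypothetical and not part of Datasette; the mechanism described above does not exist yet. The point is only that a CLI command, a web form handler and an API endpoint could all hand raw bytes to the same function.

```python
import csv
import io
import json

# Hypothetical registry: format name -> function turning raw bytes into row dicts.
# None of these names belong to Datasette; they only illustrate the idea.
IMPORTERS = {}


def register_importer(name):
    """Decorator a (hypothetical) plugin would use to add support for a format."""
    def decorator(fn):
        IMPORTERS[name] = fn
        return fn
    return decorator


@register_importer("csv")
def import_csv(data):
    # csv.DictReader yields one dict per row, keyed by the header line.
    return list(csv.DictReader(io.StringIO(data.decode("utf-8"))))


@register_importer("json")
def import_json(data):
    rows = json.loads(data)
    # Accept either a list of objects or a single object.
    return rows if isinstance(rows, list) else [rows]


def parse_upload(format_name, data):
    """Single entry point that a CLI, a web form and an API could all share."""
    try:
        importer = IMPORTERS[format_name]
    except KeyError:
        raise ValueError(f"No importer registered for format: {format_name}")
    return importer(data)
```

A Parquet plugin would then just register one more function under its own format name, and all three interfaces would pick it up.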

I started fleshing out this idea quite a while ago but didn't make much concrete progress. Maybe I should revisit it:

  • https://github.com/simonw/datasette/issues/1160
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add functionality to read Parquet files. 1039037439  
974754412 https://github.com/simonw/sqlite-utils/pull/333#issuecomment-974754412 https://api.github.com/repos/simonw/sqlite-utils/issues/333 IC_kwDOCGYnMM46GZJs simonw 9599 2021-11-21T04:35:32Z 2021-11-21T04:35:32Z OWNER

Some other recent projects (like trying to get this library to work in JupyterLite) have made me much more cautious about adding new dependencies, especially dependencies like pyarrow which require custom C/Rust extensions.

There are a few ways this could work though:

  • Have this as an optional dependency feature - so it only works if the user installs pyarrow as well
  • Implement this as a separate tool, parquet-to-sqlite - which could itself depend on sqlite-utils
  • Add a concept of "plugins" to sqlite-utils, similar to how those work in Datasette: https://docs.datasette.io/en/stable/plugins.html
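
The first option, an optional dependency, usually comes down to importing the package lazily and failing with a readable message when it is missing. A minimal sketch, assuming hypothetical helper names (`load_optional`, `read_parquet_rows`) that are not part of sqlite-utils:

```python
import importlib


def load_optional(module_name, package, feature):
    """Import module_name, or raise a readable error naming the package to install."""
    try:
        return importlib.import_module(module_name)
    except ImportError as ex:
        raise RuntimeError(
            f"{feature} requires the optional '{package}' package. "
            f"Install it with: pip install {package}"
        ) from ex


def read_parquet_rows(path):
    """Return the rows of a Parquet file as a list of dicts (hypothetical helper)."""
    # pyarrow is imported only when Parquet support is actually used,
    # so everything else keeps working without it.
    pq = load_optional("pyarrow.parquet", "pyarrow", "Parquet support")
    columns = pq.read_table(path).to_pydict()  # {column_name: [values, ...]}
    names = list(columns)
    return [dict(zip(names, values)) for values in zip(*columns.values())]
```

Users who never touch Parquet pay no install cost, and environments where pyarrow cannot be installed are unaffected.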

My favourite option is parquet-to-sqlite because that can be built without any additional changes to sqlite-utils at all!
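
To make that concrete, here is a sketch of what such a standalone tool might look like. No such tool is described in detail here, so the command-line shape and the helper names (`build_parser`, `resolve_table_name`) are assumptions. It relies on `pyarrow.parquet.ParquetFile.iter_batches()`, `RecordBatch.to_pylist()` (which needs a reasonably recent pyarrow) and `insert_all()` from the sqlite-utils Python library.

```python
import argparse
from pathlib import Path


def build_parser():
    # Hypothetical CLI: parquet-to-sqlite data.db trips.parquet [--table trips]
    parser = argparse.ArgumentParser(prog="parquet-to-sqlite")
    parser.add_argument("db_path", help="SQLite database file to write to")
    parser.add_argument("parquet_path", help="Parquet file to read")
    parser.add_argument("--table", help="Table name (defaults to the file's stem)")
    return parser


def resolve_table_name(parquet_path, table=None):
    """Use the explicit table name if given, else the Parquet file name without suffix."""
    return table or Path(parquet_path).stem


def main(argv=None):
    args = build_parser().parse_args(argv)
    # Imported here so the heavy dependency lives in this tool, not in sqlite-utils.
    import pyarrow.parquet as pq
    import sqlite_utils

    db = sqlite_utils.Database(args.db_path)
    table = db[resolve_table_name(args.parquet_path, args.table)]
    # Insert one record batch at a time rather than loading the whole file.
    for batch in pq.ParquetFile(args.parquet_path).iter_batches():
        table.insert_all(batch.to_pylist())
```

Calling `main(["data.db", "trips.parquet"])` would write the file's rows to a `trips` table, with sqlite-utils itself left unchanged.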

I find the concept of plugins for sqlite-utils interesting. I've so far not had quite enough potential use-cases to convince me this is worthwhile (especially since it should be very easy to build out separate tools entirely), but I'm ready to be convinced that a plugin mechanism would be worthwhile.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add functionality to read Parquet files. 1039037439  

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id]),
   [performed_via_github_app] TEXT
);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);
Powered by Datasette · Queries took 20.565ms · About: github-to-sqlite