issue_comments

5 rows where author_association = "NONE" and "updated_at" is on date 2021-07-22 sorted by updated_at descending

user 3

  • maxhawkins 3
  • UtahDave 1
  • aaronyih1 1

issue 2

  • WIP: Add Gmail takeout mbox import 4
  • KeyError: 'Contents' on running upload 1

author_association 1

  • NONE 5
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions issue performed_via_github_app
885098025 https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-885098025 https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5 IC_kwDODFE5qs40wYYp UtahDave 306240 2021-07-22T17:47:50Z 2021-07-22T17:47:50Z NONE

Hi @maxhawkins, I'm sorry, I haven't had any time to work on this. I'll have some time tomorrow to test your commits. I think they look great. I'm fine with your commits superseding my initial attempt here.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
WIP: Add Gmail takeout mbox import 813880401  
885094284 https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-885094284 https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5 IC_kwDODFE5qs40wXeM maxhawkins 28565 2021-07-22T17:41:32Z 2021-07-22T17:41:32Z NONE

I added a follow-up commit that deals with emails that don't have a Date header: https://github.com/maxhawkins/google-takeout-to-sqlite/commit/4bc70103582c10802c85a523ef1e99a8a2154aa9

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
WIP: Add Gmail takeout mbox import 813880401  
885022230 https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-885022230 https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5 IC_kwDODFE5qs40wF4W maxhawkins 28565 2021-07-22T15:51:46Z 2021-07-22T15:51:46Z NONE

One thing I noticed is this importer doesn't save attachments along with the body of the emails. It would be nice if those got stored as blobs in a separate attachments table so attachments can be included while fetching search results.
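
A minimal sketch of what the suggested attachments table could look like, using only the Python standard library. The table name, its columns, and the `save_attachments` helper are hypothetical and do not come from the importer itself; the sketch only illustrates storing each attachment as a BLOB keyed by a message identifier.

```python
import sqlite3
from email.message import Message


def save_attachments(conn: sqlite3.Connection, message_id: str, msg: Message) -> int:
    """Store every attachment part of msg as a BLOB row; return how many were saved.

    Hypothetical helper: the schema below is an assumption, not the project's.
    """
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS attachments (
            id INTEGER PRIMARY KEY,
            message_id TEXT,
            filename TEXT,
            content_type TEXT,
            content BLOB
        )
        """
    )
    saved = 0
    for part in msg.walk():
        # Only parts explicitly marked "Content-Disposition: attachment" are kept.
        if part.get_content_disposition() != "attachment":
            continue
        conn.execute(
            "INSERT INTO attachments (message_id, filename, content_type, content) "
            "VALUES (?, ?, ?, ?)",
            (
                message_id,
                part.get_filename(),
                part.get_content_type(),
                part.get_payload(decode=True),  # decoded bytes, stored as a BLOB
            ),
        )
        saved += 1
    return saved
```

Keeping attachments in their own table means the main messages table stays small, and a search result can be joined to its attachments on `message_id` only when they are needed.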

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
WIP: Add Gmail takeout mbox import 813880401  
884672647 https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-884672647 https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5 IC_kwDODFE5qs40uwiH maxhawkins 28565 2021-07-22T05:56:31Z 2021-07-22T14:03:08Z NONE

How does this commit look? https://github.com/maxhawkins/google-takeout-to-sqlite/commit/72802a83fee282eb5d02d388567731ba4301050d

It seems that Takeout's mbox format is pretty simple, so we can get away with just splitting the file on lines beginning with From. My commit splits the file every time a line starts with From and uses email.message_from_bytes to parse each chunk.

I was able to load a 12GB takeout mbox without the program using more than a couple hundred MB of memory during the import process. It does make us lose the progress bar, but maybe I can add that back in a later commit.
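
The splitting approach described in this comment can be sketched roughly as follows. This is not the code from the linked commit: the function name `iter_mbox_messages` is hypothetical, and the sketch assumes the separator is a line starting with `From ` (with a trailing space) and that no `>From ` un-escaping is needed. Because it reads the file line by line and only holds one message in memory at a time, memory use stays roughly proportional to the largest single message rather than to the whole file.

```python
import email
import io
from email.message import Message
from typing import BinaryIO, Iterator


def iter_mbox_messages(fp: BinaryIO) -> Iterator[Message]:
    """Yield one parsed message per chunk, splitting on lines that start with b"From "."""
    chunk = []
    for line in fp:
        if line.startswith(b"From "):
            # A separator line ends the previous message (if any) and is not kept.
            if chunk:
                yield email.message_from_bytes(b"".join(chunk))
            chunk = []
            continue
        chunk.append(line)
    # The last message has no separator after it.
    if chunk:
        yield email.message_from_bytes(b"".join(chunk))


# Tiny in-memory sample standing in for a real Takeout mbox file.
SAMPLE = (
    b"From a@example.com Thu Jul 22 05:56:31 2021\n"
    b"Subject: one\n\nbody one\n"
    b"From b@example.com Thu Jul 22 14:03:08 2021\n"
    b"Subject: two\n\nbody two\n"
)

for message in iter_mbox_messages(io.BytesIO(SAMPLE)):
    print(message["Subject"])
```

For a real file, the same generator could be fed an object opened with `open(path, "rb")`.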

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
WIP: Add Gmail takeout mbox import 813880401  
884688833 https://github.com/dogsheep/dogsheep-photos/issues/32#issuecomment-884688833 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/32 IC_kwDOD079W840u0fB aaronyih1 10793464 2021-07-22T06:40:25Z 2021-07-22T06:40:25Z NONE

The workaround here is to upload an image to the bucket first. The error occurs because the upload does not properly handle the case where there are no images in the bucket.
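
A minimal sketch of how the empty-bucket case could be handled defensively, assuming the listing response is a dictionary shaped like an S3 object listing, which omits the `Contents` key entirely when the bucket has no objects. The function name `object_keys` is hypothetical and this is not the code from dogsheep-photos.

```python
from typing import Any, Dict, List


def object_keys(response: Dict[str, Any]) -> List[str]:
    """Return the object keys in a listing response, or [] for an empty bucket."""
    # response["Contents"] would raise KeyError when the bucket is empty,
    # so fall back to an empty list instead.
    return [obj["Key"] for obj in response.get("Contents", [])]
```

With this guard, the first upload to a fresh bucket sees an empty list of existing keys instead of failing.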

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
KeyError: 'Contents' on running upload 803333769  

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id]),
   [performed_via_github_app] TEXT
);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);
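
The filter shown at the top of this page (author_association = "NONE", updated_at on date 2021-07-22, sorted by updated_at descending) can be approximated against this schema with a query along these lines. This is a sketch using Python's built-in `sqlite3` module; the helper name `comments_on_date` is hypothetical, and the exact SQL that Datasette generates may differ.

```python
import sqlite3
from typing import List, Tuple


def comments_on_date(
    conn: sqlite3.Connection, association: str, day: str
) -> List[Tuple[int, str]]:
    """Return (id, updated_at) for comments with the given author_association
    whose updated_at falls on the given YYYY-MM-DD day, newest first."""
    return conn.execute(
        """
        SELECT id, updated_at
        FROM issue_comments
        WHERE author_association = ?
          AND date(updated_at) = ?
        ORDER BY updated_at DESC
        """,
        (association, day),
    ).fetchall()
```

SQLite's `date()` function accepts ISO 8601 timestamps with a trailing `Z`, such as the ones stored in `updated_at`, so no extra parsing is needed.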
Powered by Datasette · Queries took 450.961ms · About: github-to-sqlite