issue_comments

518 rows where author_association = "MEMBER" sorted by updated_at descending

issue >30

  • Upload all my photos to a secure S3 bucket 14
  • WIP: Add Gmail takeout mbox import 13
  • Import machine-learning detected labels (dog, llama etc) from Apple Photos 12
  • --since feature can be confused by retweets 11
  • Set up a live demo Datasette instance 9
  • Ability to serve thumbnailed Apple Photo from its place on disk 9
  • Command to fetch stargazers for one or more repos 8
  • Commits in GitHub API can have null author 8
  • Import photo metadata from Apple Photos into SQLite 8
  • Rename project to dogsheep-photos 8
  • Mechanism for defining custom display of results 8
  • the JSON object must be str, bytes or bytearray, not 'Undefined' 8
  • Demo is failing to deploy 7
  • Commands for making authenticated API calls 7
  • Pagination 7
  • First working version 7
  • Command for running a search and saving tweets for that search 6
  • Command for retrieving dependents for a repo 6
  • bpylist.archiver.CircularReference: archive has a cycle with uid(13) 6
  • Mechanism for differentiating between "by me" and "liked by me" 6
  • Figure out how to display images from <en-media> tags inline in Datasette 6
  • export.xml file name varies with different language settings 6
  • Folder support 6
  • Rethink progress bars for various commands 5
  • stargazers command, refs #4 5
  • Add this view for seeing new releases 5
  • twitter-to-sqlite user-timeline [screen_names] --sql / --attach 5
  • Feature: record history of follower counts 5
  • Repos have a big blob of JSON in the organization column 5
  • Annotate photos using the Google Cloud Vision API 5
  • …

user 1

  • simonw 518

author_association 1

  • MEMBER · 518
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions issue performed_via_github_app
1462968053 https://github.com/dogsheep/apple-notes-to-sqlite/issues/11#issuecomment-1462968053 https://api.github.com/repos/dogsheep/apple-notes-to-sqlite/issues/11 IC_kwDOJHON9s5XMx71 simonw 9599 2023-03-09T23:24:01Z 2023-03-09T23:24:01Z MEMBER

I improved the readability by removing some unnecessary table aliases:

```sql
with recursive nested_folders(folder_id, descendant_folder_id) as (
  -- base case: select all immediate children of the root folder
  select id, id from folders where parent is null
  union all
  -- recursive case: select all children of the previous level of nested folders
  select nested_folders.folder_id, folders.id
  from nested_folders
  join folders on nested_folders.descendant_folder_id = folders.parent
)
-- Find notes within all descendants of folder 1
select * from notes where folder in (
  select descendant_folder_id from nested_folders where folder_id = 1
);
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Implement a SQL view to make it easier to query files in a nested folder 1618130434  
1462962682 https://github.com/dogsheep/apple-notes-to-sqlite/issues/11#issuecomment-1462962682 https://api.github.com/repos/dogsheep/apple-notes-to-sqlite/issues/11 IC_kwDOJHON9s5XMwn6 simonw 9599 2023-03-09T23:20:35Z 2023-03-09T23:22:41Z MEMBER

Here's a query that returns all notes in folder 1, including notes in descendant folders:

```sql
with recursive nested_folders(folder_id, descendant_folder_id) as (
  -- base case: select all immediate children of the root folder
  select id, id from folders where parent is null
  union all
  -- recursive case: select all children of the previous level of nested folders
  select nf.folder_id, f.id
  from nested_folders nf
  join folders f on nf.descendant_folder_id = f.parent
)
-- Find notes within all descendants of folder 1
select * from notes where folder in (
  select descendant_folder_id from nested_folders where folder_id = 1
);
```

With assistance from ChatGPT. Prompts were:

```
SQLite schema:

CREATE TABLE [folders] (
   [id] INTEGER PRIMARY KEY,
   [long_id] TEXT,
   [name] TEXT,
   [parent] INTEGER,
   FOREIGN KEY([parent]) REFERENCES folders
);

Write a recursive CTE that returns the following:

folder_id | descendant_folder_id

With a row for every nested child of every folder - so the top level folder has lots of rows
```

Then I tweaked it a bit, then ran this:

```sql
WITH RECURSIVE nested_folders(folder_id, descendant_folder_id) AS (
  -- base case: select all immediate children of the root folder
  SELECT id, id FROM folders WHERE parent IS NULL
  UNION ALL
  -- recursive case: select all children of the previous level of nested folders
  SELECT nf.folder_id, f.id
  FROM nested_folders nf
  JOIN folders f ON nf.descendant_folder_id = f.parent
)
-- select all rows from the recursive CTE
SELECT * from notes where folder in (select descendant_folder_id FROM nested_folders where folder_id = 1)
```

Convert all SQL keywords to lower case, and re-indent

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Implement a SQL view to make it easier to query files in a nested folder 1618130434  
1462965256 https://github.com/dogsheep/apple-notes-to-sqlite/issues/11#issuecomment-1462965256 https://api.github.com/repos/dogsheep/apple-notes-to-sqlite/issues/11 IC_kwDOJHON9s5XMxQI simonw 9599 2023-03-09T23:22:12Z 2023-03-09T23:22:12Z MEMBER

Here's what the CTE from that looks like:

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Implement a SQL view to make it easier to query files in a nested folder 1618130434  
1462693867 https://github.com/dogsheep/apple-notes-to-sqlite/issues/7#issuecomment-1462693867 https://api.github.com/repos/dogsheep/apple-notes-to-sqlite/issues/7 IC_kwDOJHON9s5XLu_r simonw 9599 2023-03-09T20:01:39Z 2023-03-09T20:02:11Z MEMBER

My folders table will have:

  • id - rowid
  • long_id - that long unique string ID
  • name - the name
  • parent - foreign key to id
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Folder support 1617769847  
1462691466 https://github.com/dogsheep/apple-notes-to-sqlite/issues/7#issuecomment-1462691466 https://api.github.com/repos/dogsheep/apple-notes-to-sqlite/issues/7 IC_kwDOJHON9s5XLuaK simonw 9599 2023-03-09T19:59:52Z 2023-03-09T19:59:52Z MEMBER

Improved script:

```zsh
osascript -e 'tell application "Notes"
    set allFolders to folders
    repeat with aFolder in allFolders
        set folderId to id of aFolder
        set folderName to name of aFolder
        set folderContainer to container of aFolder
        if class of folderContainer is folder then
            set folderContainerId to id of folderContainer
        else
            set folderContainerId to ""
        end if
        log "ID: " & folderId
        log "Name: " & folderName
        log "Container: " & folderContainerId
        log " "
    end repeat
end tell
'
```

```
ID: x-coredata://D2D50498-BBD1-4097-B122-D15ABD32BDEC/ICFolder/p6113
Name: Blog posts
Container:

ID: x-coredata://D2D50498-BBD1-4097-B122-D15ABD32BDEC/ICFolder/p698
Name: JSK
Container:

ID: x-coredata://D2D50498-BBD1-4097-B122-D15ABD32BDEC/ICFolder/p7995
Name: Nested inside blog posts
Container: x-coredata://D2D50498-BBD1-4097-B122-D15ABD32BDEC/ICFolder/p6113

ID: x-coredata://D2D50498-BBD1-4097-B122-D15ABD32BDEC/ICFolder/p3526
Name: New Folder
Container:

ID: x-coredata://D2D50498-BBD1-4097-B122-D15ABD32BDEC/ICFolder/p3839
Name: New Folder 1
Container:

ID: x-coredata://D2D50498-BBD1-4097-B122-D15ABD32BDEC/ICFolder/p2
Name: Notes
Container:

ID: x-coredata://D2D50498-BBD1-4097-B122-D15ABD32BDEC/ICFolder/p6059
Name: Quick Notes
Container:

ID: x-coredata://D2D50498-BBD1-4097-B122-D15ABD32BDEC/ICFolder/p7283
Name: UK Christmas 2022
Container:
```

I filtered out things where the parent was an account and not a folder using `if class of folderContainer is folder then`.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Folder support 1617769847  
1462682795 https://github.com/dogsheep/apple-notes-to-sqlite/issues/7#issuecomment-1462682795 https://api.github.com/repos/dogsheep/apple-notes-to-sqlite/issues/7 IC_kwDOJHON9s5XLsSr simonw 9599 2023-03-09T19:52:20Z 2023-03-09T19:52:44Z MEMBER

Created through several rounds with ChatGPT (including hints like "rewrite that using setdefault()"):

```python
def topological_sort(nodes):
    children = {}
    for node in nodes:
        parent_id = node["parent"]
        if parent_id is not None:
            children.setdefault(parent_id, []).append(node)

    def traverse(node, result):
        result.append(node)
        if node["id"] in children:
            for child in children[node["id"]]:
                traverse(child, result)

    sorted_data = []

    for node in nodes:
        if node["parent"] is None:
            traverse(node, sorted_data)

    return sorted_data
```
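
For illustration, here's how that sorts a tiny hypothetical folder list, parents before children (the data here is invented for the example):

```python
folders = [
    {"id": 3, "name": "Nested inside blog posts", "parent": 1},
    {"id": 1, "name": "Blog posts", "parent": None},
    {"id": 2, "name": "JSK", "parent": None},
]

for folder in topological_sort(folders):
    print(folder["id"], folder["name"])
# 1 Blog posts
# 3 Nested inside blog posts
# 2 JSK
```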

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Folder support 1617769847  
1462570187 https://github.com/dogsheep/apple-notes-to-sqlite/issues/7#issuecomment-1462570187 https://api.github.com/repos/dogsheep/apple-notes-to-sqlite/issues/7 IC_kwDOJHON9s5XLQzL simonw 9599 2023-03-09T18:30:24Z 2023-03-09T18:30:24Z MEMBER

I used ChatGPT to write this:

```zsh
osascript -e 'tell application "Notes"
    set allFolders to folders
    repeat with aFolder in allFolders
        set folderId to id of aFolder
        set folderName to name of aFolder
        set folderContainer to container of aFolder
        set folderContainerName to name of folderContainer
        log "Folder ID: " & folderId
        log "Folder Name: " & folderName
        log "Folder Container: " & folderContainerName
        log " "
        -- check for nested folders
        if count of folders of aFolder > 0 then
            set nestedFolders to folders of aFolder
            repeat with aNestedFolder in nestedFolders
                set nestedFolderId to id of aNestedFolder
                set nestedFolderName to name of aNestedFolder
                set nestedFolderContainer to container of aNestedFolder
                set nestedFolderContainerName to name of nestedFolderContainer
                log "  Nested Folder ID: " & nestedFolderId
                log "  Nested Folder Name: " & nestedFolderName
                log "  Nested Folder Container: " & nestedFolderContainerName
                log " "
            end repeat
        end if
    end repeat
end tell
'
```

Which for my account output this:

```
Folder ID: x-coredata://D2D50498-BBD1-4097-B122-D15ABD32BDEC/ICFolder/p6113
Folder Name: Blog posts
Folder Container: iCloud

  Nested Folder ID: x-coredata://D2D50498-BBD1-4097-B122-D15ABD32BDEC/ICFolder/p7995
  Nested Folder Name: Nested inside blog posts
  Nested Folder Container: Blog posts

Folder ID: x-coredata://D2D50498-BBD1-4097-B122-D15ABD32BDEC/ICFolder/p698
Folder Name: JSK
Folder Container: iCloud

Folder ID: x-coredata://D2D50498-BBD1-4097-B122-D15ABD32BDEC/ICFolder/p7995
Folder Name: Nested inside blog posts
Folder Container: Blog posts

Folder ID: x-coredata://D2D50498-BBD1-4097-B122-D15ABD32BDEC/ICFolder/p3526
Folder Name: New Folder
Folder Container: iCloud

Folder ID: x-coredata://D2D50498-BBD1-4097-B122-D15ABD32BDEC/ICFolder/p3839
Folder Name: New Folder 1
Folder Container: iCloud

Folder ID: x-coredata://D2D50498-BBD1-4097-B122-D15ABD32BDEC/ICFolder/p2
Folder Name: Notes
Folder Container: iCloud

Folder ID: x-coredata://D2D50498-BBD1-4097-B122-D15ABD32BDEC/ICFolder/p6059
Folder Name: Quick Notes
Folder Container: iCloud

Folder ID: x-coredata://D2D50498-BBD1-4097-B122-D15ABD32BDEC/ICFolder/p7283
Folder Name: UK Christmas 2022
Folder Container: iCloud
```

So I think the correct approach here is to run code at the start to list all of the folders (no need to do fancy recursion though, just a flat list with the parent containers is enough) and create a model of that hierarchy in SQLite.

Then when I import notes I can foreign key reference them back to their containing folder.

I'm tempted to use rowid for the foreign keys because the official IDs are pretty long.
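
A minimal sketch of that table layout using sqlite-utils - the column names follow the schema described in this thread, but the rows and the truncated long_id values are invented for illustration:

```python
import sqlite_utils

db = sqlite_utils.Database("notes.db")

# folders use a short integer primary key, with parent as a self-referencing foreign key
db["folders"].insert_all(
    [
        {"id": 1, "long_id": "x-coredata://.../ICFolder/p6113", "name": "Blog posts", "parent": None},
        {"id": 2, "long_id": "x-coredata://.../ICFolder/p7995", "name": "Nested inside blog posts", "parent": 1},
    ],
    pk="id",
)
db["folders"].add_foreign_key("parent", "folders", "id")

# notes point at their containing folder by that short integer id
db["notes"].insert(
    {"id": 1, "title": "Example note", "folder": 2},
    pk="id",
    foreign_keys=[("folder", "folders", "id")],
)
```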

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Folder support 1617769847  
1462564717 https://github.com/dogsheep/apple-notes-to-sqlite/issues/7#issuecomment-1462564717 https://api.github.com/repos/dogsheep/apple-notes-to-sqlite/issues/7 IC_kwDOJHON9s5XLPdt simonw 9599 2023-03-09T18:25:39Z 2023-03-09T18:25:39Z MEMBER

So it looks like folders can be hierarchical?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Folder support 1617769847  
1462562735 https://github.com/dogsheep/apple-notes-to-sqlite/issues/7#issuecomment-1462562735 https://api.github.com/repos/dogsheep/apple-notes-to-sqlite/issues/7 IC_kwDOJHON9s5XLO-v simonw 9599 2023-03-09T18:23:56Z 2023-03-09T18:25:22Z MEMBER

From the Script Editor library docs:

A note has a:

  • container (folder), r/o) : the folder of the note

Here's what a folder looks like:

folder n : a folder containing notes elements:

  • contains folders, notes; contained by application, accounts, folders.

properties:

  • name (text) : the name of the folder
  • id (text, r/o) : the unique identifier of the folder
  • shared (boolean, r/o) : Is the folder shared?
  • container (account or folder, r/o) : the container of the folder
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Folder support 1617769847  
1462556829 https://github.com/dogsheep/apple-notes-to-sqlite/issues/4#issuecomment-1462556829 https://api.github.com/repos/dogsheep/apple-notes-to-sqlite/issues/4 IC_kwDOJHON9s5XLNid simonw 9599 2023-03-09T18:20:56Z 2023-03-09T18:20:56Z MEMBER

In terms of the UI: I'm tempted to say that the default behaviour is for it to run until it sees a note that it already knows about AND that has matching update/created dates, and then stop.

You can do a full import again ignoring that logic with `apple-notes-to-sqlite notes.db --full`.
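
A rough sketch of that stopping rule using sqlite-utils - the `import_notes` helper and the `created`/`updated` column names are assumptions for illustration, not the tool's actual code:

```python
import sqlite_utils
from sqlite_utils.db import NotFoundError


def import_notes(db, notes, full=False):
    # notes are assumed to arrive most-recently-modified first
    for note in notes:
        try:
            existing = db["notes"].get(note["id"]) if db["notes"].exists() else None
        except NotFoundError:
            existing = None
        if (
            not full
            and existing
            and existing["created"] == note["created"]
            and existing["updated"] == note["updated"]
        ):
            break  # everything older than this should already be in the database
        db["notes"].insert(note, pk="id", replace=True)


# usage: import_notes(sqlite_utils.Database("notes.db"), notes, full=False)
```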

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Support incremental updates 1616429236  
1462554175 https://github.com/dogsheep/apple-notes-to-sqlite/issues/4#issuecomment-1462554175 https://api.github.com/repos/dogsheep/apple-notes-to-sqlite/issues/4 IC_kwDOJHON9s5XLM4_ simonw 9599 2023-03-09T18:19:34Z 2023-03-09T18:19:34Z MEMBER

It looks like the iteration order is most-recently-modified-first - I tried editing a note a bit further back in my notes app and it was the first one output by apple-notes-to-sqlite --dump.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Support incremental updates 1616429236  
1461285545 https://github.com/dogsheep/apple-notes-to-sqlite/issues/2#issuecomment-1461285545 https://api.github.com/repos/dogsheep/apple-notes-to-sqlite/issues/2 IC_kwDOJHON9s5XGXKp simonw 9599 2023-03-09T05:06:24Z 2023-03-09T05:06:24Z MEMBER

OK, this works!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
First working version 1616354999  
1461262577 https://github.com/dogsheep/apple-notes-to-sqlite/issues/2#issuecomment-1461262577 https://api.github.com/repos/dogsheep/apple-notes-to-sqlite/issues/2 IC_kwDOJHON9s5XGRjx simonw 9599 2023-03-09T04:30:00Z 2023-03-09T04:30:00Z MEMBER

It doesn't have tests yet. I guess I'll need to mock subprocess to test this.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
First working version 1616354999  
1461260978 https://github.com/dogsheep/apple-notes-to-sqlite/issues/2#issuecomment-1461260978 https://api.github.com/repos/dogsheep/apple-notes-to-sqlite/issues/2 IC_kwDOJHON9s5XGRKy simonw 9599 2023-03-09T04:27:18Z 2023-03-09T04:27:18Z MEMBER

Before that conversion:

Monday, March 6, 2023 at 11:55:15 AM

After:

2023-03-06T11:55:15
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
First working version 1616354999  
1461259490 https://github.com/dogsheep/apple-notes-to-sqlite/issues/2#issuecomment-1461259490 https://api.github.com/repos/dogsheep/apple-notes-to-sqlite/issues/2 IC_kwDOJHON9s5XGQzi simonw 9599 2023-03-09T04:24:27Z 2023-03-09T04:24:27Z MEMBER

Converting AppleScript date strings to ISO format is hard!

https://forum.latenightsw.com/t/formatting-dates/841 has a recipe I'll try:

set todayISO to (todayDate as «class isot» as string)

Not clear to me how timezones work here. I'm going to ignore them for the moment.
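
For comparison, the equivalent conversion is straightforward in Python - a quick sketch using the before/after strings from the comment above (this is just a reference point, not part of the tool):

```python
from datetime import datetime

raw = "Monday, March 6, 2023 at 11:55:15 AM"
parsed = datetime.strptime(raw, "%A, %B %d, %Y at %I:%M:%S %p")
print(parsed.isoformat())  # 2023-03-06T11:55:15
```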

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
First working version 1616354999  
1461234591 https://github.com/dogsheep/apple-notes-to-sqlite/issues/2#issuecomment-1461234591 https://api.github.com/repos/dogsheep/apple-notes-to-sqlite/issues/2 IC_kwDOJHON9s5XGKuf simonw 9599 2023-03-09T03:56:45Z 2023-03-09T03:56:45Z MEMBER

My prototype showed that images embedded in notes come out in the HTML export as base64 image URLs, which is neat.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
First working version 1616354999  
1461234311 https://github.com/dogsheep/apple-notes-to-sqlite/issues/2#issuecomment-1461234311 https://api.github.com/repos/dogsheep/apple-notes-to-sqlite/issues/2 IC_kwDOJHON9s5XGKqH simonw 9599 2023-03-09T03:56:24Z 2023-03-09T03:56:24Z MEMBER

I opened the "Script Editor" app on my computer, used Window -> Library to open the Library panel, then clicked on the Notes app there. I got this:

So the notes object has these properties:

  • name (text) : the name of the note (normally the first line of the body)
  • id (text, r/o) : the unique identifier of the note
  • container (folder, r/o) : the folder of the note
  • body (text) : the HTML content of the note
  • plaintext (text, r/o) : the plaintext content of the note
  • creation date (date, r/o) : the creation date of the note
  • modification date (date, r/o) : the modification date of the note
  • password protected (boolean, r/o) : Is the note password protected?
  • shared (boolean, r/o) : Is the note shared?

I'm going to ignore the concept of attachments for the moment.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
First working version 1616354999  
1461232709 https://github.com/dogsheep/apple-notes-to-sqlite/issues/2#issuecomment-1461232709 https://api.github.com/repos/dogsheep/apple-notes-to-sqlite/issues/2 IC_kwDOJHON9s5XGKRF simonw 9599 2023-03-09T03:54:28Z 2023-03-09T03:54:28Z MEMBER

I think the AppleScript I want to pass to osascript looks like this:

```applescript
tell application "Notes"
    repeat with eachNote in every note
        set noteId to the id of eachNote
        set noteTitle to the name of eachNote
        set noteBody to the body of eachNote
        log "------------------------" & "\n"
        log noteId & "\n"
        log noteTitle & "\n\n"
        log noteBody & "\n"
    end repeat
end tell
```

But there are a few more properties I'd like to get - created and updated date for example.
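
A minimal sketch of running a script like that from Python with subprocess - this is an illustration of the mechanism, not necessarily how apple-notes-to-sqlite invokes it (capturing stderr because that is where osascript usually sends log output):

```python
import subprocess

APPLESCRIPT = """
tell application "Notes"
    repeat with eachNote in every note
        set noteId to the id of eachNote
        set noteTitle to the name of eachNote
        log "------------------------"
        log noteId
        log noteTitle
    end repeat
end tell
"""

# osascript writes log output to stderr, so capture and read that stream
result = subprocess.run(
    ["osascript", "-e", APPLESCRIPT], capture_output=True, text=True
)
print(result.stderr)
```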

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
First working version 1616354999  
1461230436 https://github.com/dogsheep/apple-notes-to-sqlite/issues/1#issuecomment-1461230436 https://api.github.com/repos/dogsheep/apple-notes-to-sqlite/issues/1 IC_kwDOJHON9s5XGJtk simonw 9599 2023-03-09T03:51:52Z 2023-03-09T03:51:52Z MEMBER

This did the job! Next step is to turn that into a Python script.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Initial proof of concept with ChatGPT 1616347574  
1461230197 https://github.com/dogsheep/apple-notes-to-sqlite/issues/1#issuecomment-1461230197 https://api.github.com/repos/dogsheep/apple-notes-to-sqlite/issues/1 IC_kwDOJHON9s5XGJp1 simonw 9599 2023-03-09T03:51:36Z 2023-03-09T03:51:36Z MEMBER

After a few more rounds I got to this script, which outputs them to a /tmp/notes.txt file:

```zsh
#!/bin/zsh
osascript -e '
set notesFile to "/tmp/notes.txt"
set fileRef to open for access notesFile with write permission
tell application "Notes"
    repeat with eachNote in every note
        set noteId to the id of eachNote
        set noteTitle to the name of eachNote
        set noteBody to the body of eachNote
        write "------------------------" & "\n" to fileRef
        write noteId & "\n" to fileRef
        write noteTitle & "\n\n" to fileRef
        write noteBody & "\n" to fileRef
    end repeat
end tell
close access fileRef'
```

Then I wrote this little Python script to load them into a database:

```python
import sqlite_utils

split = b"------------------------\n"
s = open("/tmp/notes.txt", "rb").read()
notes = [n.decode("mac_roman") for n in s.split(split) if n]

cleaned_notes = [{
    "id": n.split("\n")[0],
    "title": n.split("\n")[1],
    "body": "\n".join(n.split("\n")[2:]).strip()
} for n in notes]

db = sqlite_utils.Database("/tmp/notes.db")
db["notes"].insert_all(cleaned_notes)
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Initial proof of concept with ChatGPT 1616347574  
1461226402 https://github.com/dogsheep/apple-notes-to-sqlite/issues/1#issuecomment-1461226402 https://api.github.com/repos/dogsheep/apple-notes-to-sqlite/issues/1 IC_kwDOJHON9s5XGIui simonw 9599 2023-03-09T03:47:11Z 2023-03-09T03:47:11Z MEMBER

Me:

Write applescript to loop through all of my Apple Notes and output their contents

ChatGPT:

```applescript
tell application "Notes"
    repeat with eachNote in every note
        set noteTitle to the name of eachNote
        set noteBody to the body of eachNote
        display dialog noteTitle & return & return & noteBody
    end repeat
end tell
```

Me:

How can I turn that into a zsh script which will output the results to my terminal

ChatGPT:

```zsh
#!/usr/bin/zsh

osascript -e 'tell application "Notes"
    repeat with eachNote in every note
        set noteTitle to the name of eachNote
        set noteBody to the body of eachNote
        set output to noteTitle & "\n" & noteBody & "\n"
        display dialog output
        log output
    end repeat
end tell'
```

I had to change that first line to `#!/bin/zsh` - then when I ran it, it spammed my screen with dialog boxes.

Me:

I want to log the notes contents to the terminal, not display it in a dialog

ChatGPT:

```zsh
#!/bin/zsh

osascript -e 'tell application "Notes"
    repeat with eachNote in every note
        set noteTitle to the name of eachNote
        set noteBody to the body of eachNote
        set output to noteTitle & "\n" & noteBody & "\n"
        log output
    end repeat
end tell'
```

This really helps!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Initial proof of concept with ChatGPT 1616347574  
1221623052 https://github.com/dogsheep/pocket-to-sqlite/issues/10#issuecomment-1221623052 https://api.github.com/repos/dogsheep/pocket-to-sqlite/issues/10 IC_kwDODLZ_YM5I0H0M simonw 9599 2022-08-21T21:20:33Z 2022-08-21T21:20:33Z MEMBER

That was clearly the intention from the description of this issue: - #4

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
When running `auth` command, don't overwrite an existing auth.json file 1246826792  
1221622873 https://github.com/dogsheep/pocket-to-sqlite/issues/10#issuecomment-1221622873 https://api.github.com/repos/dogsheep/pocket-to-sqlite/issues/10 IC_kwDODLZ_YM5I0HxZ simonw 9599 2022-08-21T21:19:25Z 2022-08-21T21:19:25Z MEMBER

Agreed, that would be a much better implementation.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
When running `auth` command, don't overwrite an existing auth.json file 1246826792  
1221621529 https://github.com/dogsheep/pocket-to-sqlite/issues/11#issuecomment-1221621529 https://api.github.com/repos/dogsheep/pocket-to-sqlite/issues/11 IC_kwDODLZ_YM5I0HcZ simonw 9599 2022-08-21T21:10:15Z 2022-08-21T21:11:26Z MEMBER

Just saw that's what's implemented here already! - #7

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
-a option is used for "--auth" and for "--all" 1345452427  
1221621700 https://github.com/dogsheep/pocket-to-sqlite/pull/7#issuecomment-1221621700 https://api.github.com/repos/dogsheep/pocket-to-sqlite/issues/7 IC_kwDODLZ_YM5I0HfE simonw 9599 2022-08-21T21:11:12Z 2022-08-21T21:11:12Z MEMBER

I thought this might need a documentation update but --all is already covered: https://github.com/dogsheep/pocket-to-sqlite/blob/0.2.1/README.md

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Fixed conflicting CLI flags 750141615  
1221621466 https://github.com/dogsheep/pocket-to-sqlite/issues/11#issuecomment-1221621466 https://api.github.com/repos/dogsheep/pocket-to-sqlite/issues/11 IC_kwDODLZ_YM5I0Hba simonw 9599 2022-08-21T21:09:47Z 2022-08-21T21:09:47Z MEMBER

Great catch, thanks.

I'm going to use `-a` to mean `--auth`, since other tools in the Dogsheep family have the same convention.

`--all` will be the only way to specify all.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
-a option is used for "--auth" and for "--all" 1345452427  
1188317682 https://github.com/dogsheep/github-to-sqlite/issues/74#issuecomment-1188317682 https://api.github.com/repos/dogsheep/github-to-sqlite/issues/74 IC_kwDODFdgUs5G1Eny simonw 9599 2022-07-18T21:14:22Z 2022-07-18T21:14:22Z MEMBER

That fixed it.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
500 error in github-to-sqlite demo 1308461063  
1188233729 https://github.com/dogsheep/github-to-sqlite/issues/74#issuecomment-1188233729 https://api.github.com/repos/dogsheep/github-to-sqlite/issues/74 IC_kwDODFdgUs5G0wIB simonw 9599 2022-07-18T19:51:02Z 2022-07-18T19:51:02Z MEMBER

Takes 30m to deploy the demo!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
500 error in github-to-sqlite demo 1308461063  
1188228964 https://github.com/dogsheep/github-to-sqlite/issues/74#issuecomment-1188228964 https://api.github.com/repos/dogsheep/github-to-sqlite/issues/74 IC_kwDODFdgUs5G0u9k simonw 9599 2022-07-18T19:45:30Z 2022-07-18T19:47:35Z MEMBER

pycmarkgfm doesn't implement the Markdown plugin extension I was using.

I'm going to drop the GFM rendering from the demo, and just treat it as regular markdown.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
500 error in github-to-sqlite demo 1308461063  
1188223933 https://github.com/dogsheep/github-to-sqlite/issues/74#issuecomment-1188223933 https://api.github.com/repos/dogsheep/github-to-sqlite/issues/74 IC_kwDODFdgUs5G0tu9 simonw 9599 2022-07-18T19:40:50Z 2022-07-18T19:42:41Z MEMBER

Here's how the demo is deployed: https://github.com/dogsheep/github-to-sqlite/blob/dbac2e5dd8a562b45d8255a265859cf8020ca22a/.github/workflows/deploy-demo.yml#L103-L119

I'm suspicious of py-gfm, which is used like this:

https://github.com/dogsheep/github-to-sqlite/blob/dbac2e5dd8a562b45d8255a265859cf8020ca22a/demo-metadata.json#L49-L51

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
500 error in github-to-sqlite demo 1308461063  
1188225625 https://github.com/dogsheep/github-to-sqlite/issues/74#issuecomment-1188225625 https://api.github.com/repos/dogsheep/github-to-sqlite/issues/74 IC_kwDODFdgUs5G0uJZ simonw 9599 2022-07-18T19:41:52Z 2022-07-18T19:41:52Z MEMBER

https://github.com/Zopieux/py-gfm says that library is no longer maintained, and suggests https://github.com/Zopieux/pycmarkgfm as an alternative.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
500 error in github-to-sqlite demo 1308461063  
1188223299 https://github.com/dogsheep/github-to-sqlite/pull/73#issuecomment-1188223299 https://api.github.com/repos/dogsheep/github-to-sqlite/issues/73 IC_kwDODFdgUs5G0tlD simonw 9599 2022-07-18T19:40:06Z 2022-07-18T19:40:06Z MEMBER

Thanks!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Fixing 'NoneType' object has no attribute 'items' 1261884917  
1105474232 https://github.com/dogsheep/github-to-sqlite/issues/72#issuecomment-1105474232 https://api.github.com/repos/dogsheep/github-to-sqlite/issues/72 IC_kwDODFdgUs5B5DK4 simonw 9599 2022-04-21T17:02:15Z 2022-04-21T17:02:15Z MEMBER

That's interesting - yeah it looks like the number of pages can be derived from the Link header, which is enough information to show a progress bar, probably using Click just to avoid adding another dependency.

https://docs.github.com/en/rest/guides/traversing-with-pagination
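
A sketch of pulling the page count out of that Link header - the parsing below is a simplification for illustration, not github-to-sqlite's actual code:

```python
import re


def last_page_from_link_header(link_header):
    # The header looks like: <...&page=2>; rel="next", <...&page=14>; rel="last"
    for part in link_header.split(","):
        if 'rel="last"' in part:
            match = re.search(r"[?&]page=(\d+)", part)
            if match:
                return int(match.group(1))
    return None


print(last_page_from_link_header(
    '<https://api.github.com/user/repos?page=2>; rel="next", '
    '<https://api.github.com/user/repos?page=14>; rel="last"'
))  # 14
```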

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
feature: display progress bar when downloading multi-page responses 1211283427  
985928838 https://github.com/dogsheep/github-to-sqlite/issues/69#issuecomment-985928838 https://api.github.com/repos/dogsheep/github-to-sqlite/issues/69 IC_kwDODFdgUs46xBSG simonw 9599 2021-12-04T00:34:52Z 2021-12-04T00:34:52Z MEMBER

First attempt at this:

```sql
select 'issues' as "table", id, node_id, title, user, created_at, body, repo
from issues
union all
select 'issue_comments' as "table", issue_comments.id, issue_comments.node_id, '' as title,
       issue_comments.user, issue_comments.created_at, issue_comments.body, issues.repo
from issue_comments join issues on issues.id = issue_comments.issue
order by created_at desc
```

https://github-to-sqlite.dogsheep.net/github?sql=select+%27issues%27+as+%22table%22%2C+id%2C+node_id%2C+title%2C+user%2C+created_at%2C+body%2C+repo%0D%0Afrom+issues%0D%0Aunion+all%0D%0Aselect+%27issue_comments%27+as+%22table%22%2C+issue_comments.id%2C+issue_comments.node_id%2C+%27%27+as+title%2C+issue_comments.user%2C+issue_comments.created_at%2C+issue_comments.body%2C+issues.repo%0D%0Afrom+issue_comments+join+issues+on+issues.id+%3D+issue_comments.issue%0D%0Aorder+by+created_at+desc

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
View that combines issues and issue comments 1071071397  
924209583 https://github.com/dogsheep/twitter-to-sqlite/pull/59#issuecomment-924209583 https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/59 IC_kwDODEm0Qs43FlGv simonw 9599 2021-09-21T17:37:34Z 2021-09-21T17:37:34Z MEMBER

Thanks for this!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Fix for since_id bug, closes #58 984942782  
906646452 https://github.com/dogsheep/evernote-to-sqlite/issues/13#issuecomment-906646452 https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/13 IC_kwDOEhK-wc42ClO0 simonw 9599 2021-08-26T18:34:34Z 2021-08-26T18:35:20Z MEMBER

I tried this ampersand fix: https://regex101.com/r/ojU2H9/1

```python
import re

# https://regex101.com/r/ojU2H9/1
_invalid_ampersand_re = re.compile(r'&(?![a-z0-9]+;)')


def fix_bad_xml(xml):
    # More fixes for things like '&' not as part of an entity
    return _invalid_ampersand_re.sub('&amp;', xml)
```

Even with that I'm still getting total garbage in the <en-note> content - it's just HTML, not even trying to be XML.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
xml.etree.ElementTree.ParseError: not well-formed (invalid token) 978743426  
906635938 https://github.com/dogsheep/evernote-to-sqlite/issues/13#issuecomment-906635938 https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/13 IC_kwDOEhK-wc42Ciqi simonw 9599 2021-08-26T18:18:27Z 2021-08-26T18:18:27Z MEMBER

It looks like I was using the round-trip to dump the <?xml version="1.0" encoding="UTF-8" standalone="no"?> and <!DOCTYPE prefixes.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
xml.etree.ElementTree.ParseError: not well-formed (invalid token) 978743426  
905206234 https://github.com/dogsheep/evernote-to-sqlite/issues/13#issuecomment-905206234 https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/13 IC_kwDOEhK-wc419Fna simonw 9599 2021-08-25T05:58:42Z 2021-08-25T05:58:42Z MEMBER

https://github.com/dogsheep/evernote-to-sqlite/blob/36a466f142e5bad52719851c2fbda0c05cd35b99/evernote_to_sqlite/utils.py#L34-L42

Not sure why I was round-tripping the content_xml like that - I will try not doing that.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
xml.etree.ElementTree.ParseError: not well-formed (invalid token) 978743426  
905203570 https://github.com/dogsheep/evernote-to-sqlite/issues/13#issuecomment-905203570 https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/13 IC_kwDOEhK-wc419E9y simonw 9599 2021-08-25T05:51:22Z 2021-08-25T05:53:27Z MEMBER

The debugger showed me that it broke on a string that looked like this:

```xml
<en-note>

Q3 2018 Reflection & Development

...
```

Yeah that is not valid XML!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
xml.etree.ElementTree.ParseError: not well-formed (invalid token) 978743426  
902356871 https://github.com/dogsheep/healthkit-to-sqlite/issues/20#issuecomment-902356871 https://api.github.com/repos/dogsheep/healthkit-to-sqlite/issues/20 IC_kwDOC8tyDs41yN-H simonw 9599 2021-08-20T01:12:48Z 2021-08-20T01:12:48Z MEMBER

Also on workout_points.workout_id to speed up queries to show all points in a specific workout.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add index on workout_points.date 975166271  
902355471 https://github.com/dogsheep/healthkit-to-sqlite/issues/20#issuecomment-902355471 https://api.github.com/repos/dogsheep/healthkit-to-sqlite/issues/20 IC_kwDOC8tyDs41yNoP simonw 9599 2021-08-20T01:09:07Z 2021-08-20T01:09:07Z MEMBER

Workaround:

sqlite-utils create-index healthkit.db workout_points -- -date

See https://sqlite-utils.datasette.io/en/stable/cli.html#creating-indexes

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add index on workout_points.date 975166271  
902330301 https://github.com/dogsheep/twitter-to-sqlite/pull/49#issuecomment-902330301 https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/49 IC_kwDODEm0Qs41yHe9 simonw 9599 2021-08-20T00:01:56Z 2021-08-20T00:01:56Z MEMBER

Thanks!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Document the use of --stop_after with favorites, refs #20 681575714  
902329884 https://github.com/dogsheep/twitter-to-sqlite/issues/57#issuecomment-902329884 https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/57 IC_kwDODEm0Qs41yHYc simonw 9599 2021-08-20T00:01:05Z 2021-08-20T00:01:05Z MEMBER

Maybe Click changed something, which meant that this broke when it didn't use to?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Error: Use either --since or --since_id, not both 907645813  
902329455 https://github.com/dogsheep/twitter-to-sqlite/issues/57#issuecomment-902329455 https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/57 IC_kwDODEm0Qs41yHRv simonw 9599 2021-08-19T23:59:56Z 2021-08-19T23:59:56Z MEMBER

This looks like the bug to me:

https://github.com/dogsheep/twitter-to-sqlite/blob/197e69cec40052c423a5ed071feb5f7cccea41b9/twitter_to_sqlite/cli.py#L239-L241

`type=str, default=False`

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Error: Use either --since or --since_id, not both 907645813  
902328760 https://github.com/dogsheep/twitter-to-sqlite/issues/57#issuecomment-902328760 https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/57 IC_kwDODEm0Qs41yHG4 simonw 9599 2021-08-19T23:57:41Z 2021-08-19T23:57:41Z MEMBER

Weird, added debug code and got this: {'screen_name': 'simonw', 'count': 200, 'since_id': 'False', 'tweet_mode': 'extended'} - so maybe it's a twitter-to-sqlite bug where somehow the string False is being passed somewhere.
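
A tiny illustration of how a default of False combined with a string option can produce exactly that dictionary (hypothetical reproduction, not the project's code):

```python
# str(False) turns the "no value" sentinel into a truthy string
since_id = str(False)
print(bool(since_id))  # True - it no longer reads as "not provided"
print({"screen_name": "simonw", "count": 200, "since_id": since_id, "tweet_mode": "extended"})
# {'screen_name': 'simonw', 'count': 200, 'since_id': 'False', 'tweet_mode': 'extended'}
```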

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Error: Use either --since or --since_id, not both 907645813  
902328369 https://github.com/dogsheep/twitter-to-sqlite/issues/57#issuecomment-902328369 https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/57 IC_kwDODEm0Qs41yHAx simonw 9599 2021-08-19T23:56:26Z 2021-08-19T23:56:26Z MEMBER

https://developer.twitter.com/en/docs/twitter-api/v1/tweets/timelines/api-reference/get-statuses-user_timeline says the API has been replaced by the new v2 one, but it should still work - and the since_id parameter is still documented on that page.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Error: Use either --since or --since_id, not both 907645813  
902327457 https://github.com/dogsheep/twitter-to-sqlite/issues/57#issuecomment-902327457 https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/57 IC_kwDODEm0Qs41yGyh simonw 9599 2021-08-19T23:53:25Z 2021-08-19T23:53:25Z MEMBER

I'm getting this too. Looking into it now.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Error: Use either --since or --since_id, not both 907645813  
886241674 https://github.com/dogsheep/hacker-news-to-sqlite/issues/3#issuecomment-886241674 https://api.github.com/repos/dogsheep/hacker-news-to-sqlite/issues/3 IC_kwDODtX3eM400vmK simonw 9599 2021-07-25T18:41:17Z 2021-07-25T18:41:17Z MEMBER

Got a TIL out of this: https://til.simonwillison.net/jq/extracting-objects-recursively

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Use HN algolia endpoint to retrieve trees 952189173  
886237834 https://github.com/dogsheep/hacker-news-to-sqlite/issues/3#issuecomment-886237834 https://api.github.com/repos/dogsheep/hacker-news-to-sqlite/issues/3 IC_kwDODtX3eM400uqK simonw 9599 2021-07-25T18:05:32Z 2021-07-25T18:05:32Z MEMBER

If you hit the endpoint for a comment that's part of a thread you get that comment and its recursive children: https://hn.algolia.com/api/v1/items/27941552

You can tell that it's not the top-level because the parent_id isn't null. You can use story_id to figure out what the top-level item is.

json { "id": 27941552, "created_at": "2021-07-24T15:08:39.000Z", "created_at_i": 1627139319, "type": "comment", "author": "nine_k", "title": null, "url": null, "text": "<p>I wish ...", "points": null, "parent_id": 27941108, "story_id": 27941108 }

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Use HN algolia endpoint to retrieve trees 952189173  
886142671 https://github.com/dogsheep/hacker-news-to-sqlite/issues/3#issuecomment-886142671 https://api.github.com/repos/dogsheep/hacker-news-to-sqlite/issues/3 IC_kwDODtX3eM400XbP simonw 9599 2021-07-25T03:51:05Z 2021-07-25T03:51:05Z MEMBER

Prototype:

curl 'https://hn.algolia.com/api/v1/items/27941108' \
  | jq '[recurse(.children[]) | del(.children)]' \
  | sqlite-utils insert hn.db items - --pk id
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Use HN algolia endpoint to retrieve trees 952189173  
886140431 https://github.com/dogsheep/hacker-news-to-sqlite/issues/2#issuecomment-886140431 https://api.github.com/repos/dogsheep/hacker-news-to-sqlite/issues/2 IC_kwDODtX3eM400W4P simonw 9599 2021-07-25T03:12:57Z 2021-07-25T03:12:57Z MEMBER

I'm going to build a general-purpose hacker-news-to-sqlite search ... command, where one of the options is to search within the URL.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Command for fetching Hacker News threads from the search API 952179830  
886136224 https://github.com/dogsheep/hacker-news-to-sqlite/issues/2#issuecomment-886136224 https://api.github.com/repos/dogsheep/hacker-news-to-sqlite/issues/2 IC_kwDODtX3eM400V2g simonw 9599 2021-07-25T02:08:29Z 2021-07-25T02:08:29Z MEMBER

Prototype:

curl "https://hn.algolia.com/api/v1/search_by_date?query=simonwillison.net&restrictSearchableAttributes=url&hitsPerPage=1000" | \
  jq .hits | sqlite-utils insert hn.db items - --pk objectID --alter
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Command for fetching Hacker News threads from the search API 952179830  
886135922 https://github.com/dogsheep/hacker-news-to-sqlite/issues/2#issuecomment-886135922 https://api.github.com/repos/dogsheep/hacker-news-to-sqlite/issues/2 IC_kwDODtX3eM400Vxy simonw 9599 2021-07-25T02:06:20Z 2021-07-25T02:06:20Z MEMBER

https://hn.algolia.com/api/v1/search_by_date?query=simonwillison.net&restrictSearchableAttributes=url looks like it does what I want.

https://hn.algolia.com/api/v1/search_by_date?query=simonwillison.net&restrictSearchableAttributes=url&hitsPerPage=1000 - returns 1000 at once.

Otherwise you have to paginate using &page=2 etc - up to nbPages pages.

https://www.algolia.com/doc/api-reference/api-parameters/hitsPerPage/ says 1000 is the maximum.
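
A rough sketch of that pagination loop against the Algolia API - using requests here for brevity; the parameter and response field names (hitsPerPage, page, nbPages, hits) come from the docs linked above:

```python
import requests

url = "https://hn.algolia.com/api/v1/search_by_date"
params = {
    "query": "simonwillison.net",
    "restrictSearchableAttributes": "url",
    "hitsPerPage": 1000,  # documented maximum
    "page": 0,
}
hits = []
while True:
    data = requests.get(url, params=params).json()
    hits.extend(data["hits"])
    if params["page"] >= data["nbPages"] - 1:
        break
    params["page"] += 1

print(len(hits))
```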

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Command for fetching Hacker News threads from the search API 952179830  
886135562 https://github.com/dogsheep/hacker-news-to-sqlite/issues/2#issuecomment-886135562 https://api.github.com/repos/dogsheep/hacker-news-to-sqlite/issues/2 IC_kwDODtX3eM400VsK simonw 9599 2021-07-25T02:01:11Z 2021-07-25T02:01:11Z MEMBER

That page doesn't have an API but does look easy to scrape.

The other option here is the HN Search API powered by Algolia, documented at https://hn.algolia.com/api

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Command for fetching Hacker News threads from the search API 952179830  
879477586 https://github.com/dogsheep/healthkit-to-sqlite/issues/12#issuecomment-879477586 https://api.github.com/repos/dogsheep/healthkit-to-sqlite/issues/12 MDEyOklzc3VlQ29tbWVudDg3OTQ3NzU4Ng== simonw 9599 2021-07-13T23:50:06Z 2021-07-13T23:50:06Z MEMBER

Unfortunately I don't think updating the database is practical, because the export doesn't include unique identifiers which can be used to update existing records and create new ones. Recreating from scratch works around that limitation.

I've not explored workouts with SpatiaLite but that's a really good idea.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Some workout columns should be float, not text 727848625  
861042050 https://github.com/dogsheep/github-to-sqlite/issues/64#issuecomment-861042050 https://api.github.com/repos/dogsheep/github-to-sqlite/issues/64 MDEyOklzc3VlQ29tbWVudDg2MTA0MjA1MA== simonw 9599 2021-06-14T22:45:42Z 2021-06-14T22:45:42Z MEMBER

I'm definitely interested in supporting events in this tool - see #14.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
feature: support "events" 920636216  
861041597 https://github.com/dogsheep/github-to-sqlite/issues/64#issuecomment-861041597 https://api.github.com/repos/dogsheep/github-to-sqlite/issues/64 MDEyOklzc3VlQ29tbWVudDg2MTA0MTU5Nw== simonw 9599 2021-06-14T22:44:54Z 2021-06-14T22:44:54Z MEMBER

Have you found a way to access events in GraphQL? I can only see way to access a timeline of events for a single issue or a single pull request. See also https://github.community/t/get-event-equivalent-for-v4/13600/2

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
feature: support "events" 920636216  
844250232 https://github.com/dogsheep/github-to-sqlite/pull/59#issuecomment-844250232 https://api.github.com/repos/dogsheep/github-to-sqlite/issues/59 MDEyOklzc3VlQ29tbWVudDg0NDI1MDIzMg== simonw 9599 2021-05-19T16:08:10Z 2021-05-19T16:08:10Z MEMBER

Thanks for catching this.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Remove unneeded exists=True for -a/--auth flag. 771872303  
844249385 https://github.com/dogsheep/github-to-sqlite/pull/61#issuecomment-844249385 https://api.github.com/repos/dogsheep/github-to-sqlite/issues/61 MDEyOklzc3VlQ29tbWVudDg0NDI0OTM4NQ== simonw 9599 2021-05-19T16:07:06Z 2021-05-19T16:07:06Z MEMBER

Thanks!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
fixing typo in get cli help text 797108702  
790695126 https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790695126 https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5 MDEyOklzc3VlQ29tbWVudDc5MDY5NTEyNg== simonw 9599 2021-03-04T15:20:42Z 2021-03-04T15:20:42Z MEMBER

I'm not sure why but my most recent import, when displayed in Datasette, looks like this:

Sorting by id in the opposite order gives me the data I would expect - so it looks like a bunch of null/blank messages are being imported at some point and showing up first due to ID ordering.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
WIP: Add Gmail takeout mbox import 813880401  
790693674 https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790693674 https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5 MDEyOklzc3VlQ29tbWVudDc5MDY5MzY3NA== simonw 9599 2021-03-04T15:18:36Z 2021-03-04T15:18:36Z MEMBER

I imported my 10GB mbox with 750,000 emails in it, ran this tool (with a hacked fix for the blob column problem) - and now a search that returns 92 results takes 25.37ms! This is fantastic.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
WIP: Add Gmail takeout mbox import 813880401  
790669767 https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790669767 https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5 MDEyOklzc3VlQ29tbWVudDc5MDY2OTc2Nw== simonw 9599 2021-03-04T14:46:06Z 2021-03-04T14:46:06Z MEMBER

Solution could be to pre-process that string by splitting on `(` and dropping everything afterwards, assuming that the `(...)` bit isn't necessary for correctly parsing the date.
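
A quick sketch of that pre-processing step (the sample date string is adapted from the one quoted above):

```python
import email.utils

mail_date = "Mon, 28 Aug 2006 08:14:08 +0200 (Westeuropäische Sommerzeit)"
cleaned = mail_date.split("(")[0].strip()
print(cleaned)                            # Mon, 28 Aug 2006 08:14:08 +0200
print(email.utils.parsedate_tz(cleaned))  # (2006, 8, 28, 8, 14, 8, 0, 1, -1, 7200)
```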

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
WIP: Add Gmail takeout mbox import 813880401  
790668263 https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790668263 https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5 MDEyOklzc3VlQ29tbWVudDc5MDY2ODI2Mw== simonw 9599 2021-03-04T14:43:58Z 2021-03-04T14:43:58Z MEMBER

I added this code to output a message ID on errors:

```diff
             print("Errors: {}".format(num_errors))
             print(traceback.format_exc())
+            print("Message-Id: {}".format(email.get("Message-Id", "None")))
             continue
```

Having found a message ID that had an error, I ran this command to see the context:

    rg --text --context 20 '44F289B0.000001.02100@SCHWARZE-DWFXMI' ~/gmail.mbox

This was for the following error:

```
  File "/Users/simon/Dropbox/Development/google-takeout-to-sqlite/google_takeout_to_sqlite/utils.py", line 102, in get_mbox
    message["date"] = get_message_date(email.get("Date"), email.get_from())
  File "/Users/simon/Dropbox/Development/google-takeout-to-sqlite/google_takeout_to_sqlite/utils.py", line 178, in get_message_date
    datetime_tuple = email.utils.parsedate_tz(mail_date)
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.7/lib/python3.7/email/_parseaddr.py", line 50, in parsedate_tz
    res = _parsedate_tz(data)
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.7/lib/python3.7/email/_parseaddr.py", line 69, in _parsedate_tz
    data = data.split()
AttributeError: 'Header' object has no attribute 'split'
```

Here's what I spotted in the ripgrep output:

```
177133570:Message-Id: <44F289B0.000001.02100@SCHWARZE-DWFXMI>
177133571-Date: Mon, 28 Aug 2006 08:14:08 +0200 (Westeurop�ische Sommerzeit)
177133572-X-Mailer: IncrediMail (5002253)
```

So it could be that _parsedate_tz is having trouble with that `Mon, 28 Aug 2006 08:14:08 +0200 (Westeurop�ische Sommerzeit)` string.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
WIP: Add Gmail takeout mbox import 813880401  
790384087 https://github.com/dogsheep/google-takeout-to-sqlite/issues/6#issuecomment-790384087 https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/6 MDEyOklzc3VlQ29tbWVudDc5MDM4NDA4Nw== simonw 9599 2021-03-04T07:22:51Z 2021-03-04T07:22:51Z MEMBER

#3 also mentions the conflicting version with other tools.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Upgrade to latest sqlite-utils 821841046  
790380839 https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790380839 https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5 MDEyOklzc3VlQ29tbWVudDc5MDM4MDgzOQ== simonw 9599 2021-03-04T07:17:05Z 2021-03-04T07:17:05Z MEMBER

Looks like you're doing this:

```python
elif message.get_content_type() == "text/plain":
    body = message.get_payload(decode=True)
```

So presumably that decodes to a unicode string?

I imagine the reason the column is a BLOB for me is that sqlite-utils determines the column type based on the first batch of items - https://github.com/simonw/sqlite-utils/blob/09c3386f55f766b135b6a1c00295646c4ae29bec/sqlite_utils/db.py#L1927-L1928 - and I got unlucky and had something in my first batch that wasn't a unicode string.
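
One way to sidestep that first-batch type detection is to pass an explicit column type to sqlite-utils - a sketch with an invented table name and rows, not what this PR does:

```python
import sqlite_utils

db = sqlite_utils.Database("gmail.db")

rows = [
    {"id": 1, "body": b"bytes in the first batch would normally force a BLOB column"},
    {"id": 2, "body": "a unicode string"},
]

db["messages"].insert_all(
    (
        {
            "id": row["id"],
            "body": row["body"].decode("utf-8", "replace")
            if isinstance(row["body"], bytes)
            else row["body"],
        }
        for row in rows
    ),
    pk="id",
    columns={"body": str},  # force TEXT no matter what the first batch contains
)
```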

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
WIP: Add Gmail takeout mbox import 813880401  
790379629 https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790379629 https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5 MDEyOklzc3VlQ29tbWVudDc5MDM3OTYyOQ== simonw 9599 2021-03-04T07:14:41Z 2021-03-04T07:14:41Z MEMBER

Confirmed: removing the len() call does not speed things up, so it's reading through the entire file for some other purpose too.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
WIP: Add Gmail takeout mbox import 813880401  
790378658 https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790378658 https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5 MDEyOklzc3VlQ29tbWVudDc5MDM3ODY1OA== simonw 9599 2021-03-04T07:12:48Z 2021-03-04T07:12:48Z MEMBER

It looks like the body is being loaded into a BLOB column - so in Datasette default it looks like this:

If I datasette install datasette-render-binary and then try again I get this:

It would be great if we could store the body as unicode text instead. May have to do something clever to decode it based on some kind of charset header?
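
One possible shape for that charset-aware decode, using the message's own header as a hint - a sketch, not what the PR ended up doing:

```python
def decode_body(message):
    payload = message.get_payload(decode=True)  # bytes (or None for multipart messages)
    if payload is None:
        return None
    charset = message.get_content_charset() or "utf-8"
    try:
        return payload.decode(charset, "replace")
    except LookupError:
        # unknown charset name in the header - fall back to utf-8
        return payload.decode("utf-8", "replace")
```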

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
WIP: Add Gmail takeout mbox import 813880401  
790373024 https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790373024 https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5 MDEyOklzc3VlQ29tbWVudDc5MDM3MzAyNA== simonw 9599 2021-03-04T07:01:58Z 2021-03-04T07:04:06Z MEMBER

I got 9 warnings that look like this:

```
Errors: 1
Traceback (most recent call last):
  File "/Users/simon/Dropbox/Development/google-takeout-to-sqlite/google_takeout_to_sqlite/utils.py", line 103, in get_mbox
    message["date"] = get_message_date(email.get("Date"), email.get_from())
  File "/Users/simon/Dropbox/Development/google-takeout-to-sqlite/google_takeout_to_sqlite/utils.py", line 167, in get_message_date
    datetime_tuple = email.utils.parsedate_tz(mail_date)
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.7/lib/python3.7/email/_parseaddr.py", line 50, in parsedate_tz
    res = _parsedate_tz(data)
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.7/lib/python3.7/email/_parseaddr.py", line 69, in _parsedate_tz
    data = data.split()
AttributeError: 'Header' object has no attribute 'split'
```

It would be useful if those warnings told me the message ID (or similar) of the affected message so I could grep for it in the mbox and see what was going on.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
WIP: Add Gmail takeout mbox import 813880401  
790372621 https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790372621 https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5 MDEyOklzc3VlQ29tbWVudDc5MDM3MjYyMQ== simonw 9599 2021-03-04T07:01:18Z 2021-03-04T07:01:18Z MEMBER

I'm not sure if it would work, but there is an alternative pattern for showing a progress bar against a really large file that I've used in healthkit-to-sqlite - you set the progress bar size to the size of the file in bytes, then update a counter as you read the file.

https://github.com/dogsheep/healthkit-to-sqlite/blob/3eb2b06bfe3b4faaf10e9cf9dfcb28e3d16c14ff/healthkit_to_sqlite/cli.py#L24-L57 and https://github.com/dogsheep/healthkit-to-sqlite/blob/3eb2b06bfe3b4faaf10e9cf9dfcb28e3d16c14ff/healthkit_to_sqlite/utils.py#L4-L19 (the progress_callback() bit) is where that happens.

It can be a bit of a convoluted pattern, and I'm not at all sure it would work for mbox files since it looks like that library has other reasons it needs to do a file scan rather than streaming it through one chunk of bytes at a time. So I imagine this would not work here.
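
A condensed sketch of that byte-counting pattern with click's progress bar (chunked reading shown purely for illustration - as noted, mbox parsing may not be able to consume the file this way):

```python
import os

import click


def process_file(path, chunk_size=1024 * 1024):
    total_bytes = os.path.getsize(path)
    with click.progressbar(length=total_bytes, label="Importing") as bar:
        with open(path, "rb") as fp:
            while True:
                chunk = fp.read(chunk_size)
                if not chunk:
                    break
                # ... hand the chunk to whatever is parsing the file ...
                bar.update(len(chunk))
```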

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
WIP: Add Gmail takeout mbox import 813880401  
790370485 https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790370485 https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5 MDEyOklzc3VlQ29tbWVudDc5MDM3MDQ4NQ== simonw 9599 2021-03-04T06:57:25Z 2021-03-04T06:57:48Z MEMBER

The command takes quite a while to start running, presumably because this line causes it to have to scan the WHOLE file in order to generate a count:

https://github.com/dogsheep/google-takeout-to-sqlite/blob/a3de045eba0fae4b309da21aa3119102b0efc576/google_takeout_to_sqlite/utils.py#L66-L67

I'm fine with waiting though. It's not like this is a command people run every day - and without that count we can't show a progress bar, which seems pretty important for a process that takes this long.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
WIP: Add Gmail takeout mbox import 813880401  
790369076 https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790369076 https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5 MDEyOklzc3VlQ29tbWVudDc5MDM2OTA3Ng== simonw 9599 2021-03-04T06:54:46Z 2021-03-04T06:54:46Z MEMBER

The Rich-powered progress bar is pretty:

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
WIP: Add Gmail takeout mbox import 813880401  
790312268 https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790312268 https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5 MDEyOklzc3VlQ29tbWVudDc5MDMxMjI2OA== simonw 9599 2021-03-04T05:48:16Z 2021-03-04T05:48:16Z MEMBER

Wow, my mbox is a 10.35 GB download!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
WIP: Add Gmail takeout mbox import 813880401  
786925280 https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-786925280 https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5 MDEyOklzc3VlQ29tbWVudDc4NjkyNTI4MA== simonw 9599 2021-02-26T22:23:10Z 2021-02-26T22:23:10Z MEMBER

Thanks!

I requested my Gmail export from takeout - once that arrives I'll test it against this and then merge the PR.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
WIP: Add Gmail takeout mbox import 813880401  
777839351 https://github.com/dogsheep/evernote-to-sqlite/pull/10#issuecomment-777839351 https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/10 MDEyOklzc3VlQ29tbWVudDc3NzgzOTM1MQ== simonw 9599 2021-02-11T22:37:55Z 2021-02-11T22:37:55Z MEMBER

I've merged these changes by hand now, thanks!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
BugFix for encoding and not update info. 770712149  
777827396 https://github.com/dogsheep/evernote-to-sqlite/issues/7#issuecomment-777827396 https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/7 MDEyOklzc3VlQ29tbWVudDc3NzgyNzM5Ng== simonw 9599 2021-02-11T22:13:14Z 2021-02-11T22:13:14Z MEMBER

My best guess is that you have an older version of sqlite-utils installed here - the replace=True argument was added in version 2.0. I've bumped the dependency in setup.py.
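For reference, the call shape that needs sqlite-utils 2.0 or later looks roughly like this (database and table names invented for illustration):

```python
import sqlite_utils

db = sqlite_utils.Database("evernote.db")
# replace=True requires sqlite-utils >= 2.0; older versions raise
# TypeError: insert() got an unexpected keyword argument 'replace'
db["notes"].insert({"id": "abc123", "title": "Example note"}, pk="id", replace=True)
```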

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
evernote-to-sqlite on windows 10 give this error: TypeError: insert() got an unexpected keyword argument 'replace' 743297582  
777821383 https://github.com/dogsheep/evernote-to-sqlite/issues/9#issuecomment-777821383 https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/9 MDEyOklzc3VlQ29tbWVudDc3NzgyMTM4Mw== simonw 9599 2021-02-11T22:01:28Z 2021-02-11T22:01:28Z MEMBER

Aha! I think I've figured out what's going on here.

The CData blocks containing the notes look like this:

<![CDATA[<!DOCTYPE en-note SYSTEM "http://xml.evernote.com/pub/enml2.dtd"><en-note><div>This note includes two images.</div><div><br /></div>...

The DTD at http://xml.evernote.com/pub/enml2.dtd includes some entities:

```

%HTMLlat1;

%HTMLsymbol;

%HTMLspecial;
```

So I need to be able to handle all of those different entities. I think I can do that using `html.entities.entitydefs` from the Python standard library, which looks a bit like this:

```python
{'Aacute': 'Á',
 'aacute': 'á',
 'Aacute;': 'Á',
 'aacute;': 'á',
 'Abreve;': 'Ă',
 'abreve;': 'ă',
 'ac;': '∾',
 'acd;': '∿',
 ...
}
```
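A rough sketch of that pre-expansion idea. The dictionary excerpt above matches the HTML5 entity table, so this sketch uses `html.entities.html5` (which also covers non-Latin-1 entities like &scaron;) - the regex and function name are illustrative, not the library's actual fix:

```python
# Expand named entities to their unicode characters before XML parsing,
# leaving the five XML built-ins for the parser itself to handle.
import re
from html.entities import html5
from xml.etree import ElementTree

XML_BUILTINS = {"amp", "lt", "gt", "quot", "apos"}

def expand_entities(xml_text):
    def replace(match):
        name = match.group(1)
        if name in XML_BUILTINS:
            return match.group(0)
        return html5.get(name + ";") or html5.get(name) or match.group(0)
    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);", replace, xml_text)

content = expand_entities("<en-note>&Scaron;koda &amp; &scaron;</en-note>")
tree = ElementTree.fromstring(content)
print(tree.text)  # Škoda & š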

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
ParseError: undefined entity &scaron; 748372469  
777798330 https://github.com/dogsheep/evernote-to-sqlite/issues/11#issuecomment-777798330 https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/11 MDEyOklzc3VlQ29tbWVudDc3Nzc5ODMzMA== simonw 9599 2021-02-11T21:18:58Z 2021-02-11T21:18:58Z MEMBER

Thanks for the fix!

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
XML parse error 792851444  
770071568 https://github.com/dogsheep/github-to-sqlite/issues/60#issuecomment-770071568 https://api.github.com/repos/dogsheep/github-to-sqlite/issues/60 MDEyOklzc3VlQ29tbWVudDc3MDA3MTU2OA== simonw 9599 2021-01-29T21:56:15Z 2021-01-29T21:56:15Z MEMBER

I really like the way you're using pipes here - really smart. It's similar to how I build the demo database in this GitHub Actions workflow:

https://github.com/dogsheep/github-to-sqlite/blob/62dfd3bc4014b108200001ef4bc746feb6f33b45/.github/workflows/deploy-demo.yml#L52-L82

twitter-to-sqlite actually has a mechanism for doing this kind of thing, documented at https://github.com/dogsheep/twitter-to-sqlite#providing-input-from-a-sql-query-with---sql-and---attach

It lets you do things like:

    $ twitter-to-sqlite users-lookup my.db --sql="select follower_id from following" --ids

Maybe I should add something similar to github-to-sqlite? Feels like it could be really useful.
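The underlying mechanism is straightforward - roughly this, with invented function and option names rather than the actual twitter-to-sqlite implementation:

```python
# Sketch: run a SQL query against the database (optionally attaching extra
# database files) and feed the first column of results to another command.
import sqlite3

def ids_from_sql(db_path, sql, attach=None):
    conn = sqlite3.connect(db_path)
    for alias, path in (attach or {}).items():
        conn.execute("ATTACH DATABASE ? AS {}".format(alias), [path])
    return [row[0] for row in conn.execute(sql)]

follower_ids = ids_from_sql("my.db", "select follower_id from following")
```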

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Use Data from SQLite in other commands 797097140  
769957751 https://github.com/dogsheep/twitter-to-sqlite/issues/56#issuecomment-769957751 https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/56 MDEyOklzc3VlQ29tbWVudDc2OTk1Nzc1MQ== simonw 9599 2021-01-29T17:59:40Z 2021-01-29T17:59:40Z MEMBER

This is interesting - how did you create that initial table? Was this using the twitter-to-sqlite import archive.db ~/Downloads/twitter-2019-06-25-b31f2.zip command, or something else?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Not all quoted statuses get fetched? 796736607  
761967094 https://github.com/dogsheep/swarm-to-sqlite/issues/11#issuecomment-761967094 https://api.github.com/repos/dogsheep/swarm-to-sqlite/issues/11 MDEyOklzc3VlQ29tbWVudDc2MTk2NzA5NA== simonw 9599 2021-01-18T04:11:13Z 2021-01-18T04:11:13Z MEMBER

I just got a similar error:

``` File "/home/dogsheep/datasette-venv/lib/python3.8/site-packages/swarm_to_sqlite/utils.py", line 79, in save_checkin checkins_table.m2m("users", user, m2m_table="with", pk="id") File "/home/dogsheep/datasette-venv/lib/python3.8/site-packages/sqlite_utils/db.py", line 2048, in m2m id = other_table.insert(record, pk=pk, replace=True).last_pk File "/home/dogsheep/datasette-venv/lib/python3.8/site-packages/sqlite_utils/db.py", line 1781, in insert return self.insert_all( File "/home/dogsheep/datasette-venv/lib/python3.8/site-packages/sqlite_utils/db.py", line 1899, in insert_all self.insert_chunk( File "/home/dogsheep/datasette-venv/lib/python3.8/site-packages/sqlite_utils/db.py", line 1709, in insert_chunk result = self.db.execute(query, params) File "/home/dogsheep/datasette-venv/lib/python3.8/site-packages/sqlite_utils/db.py", line 226, in execute return self.conn.execute(sql, parameters) pysqlite3.dbapi2.OperationalError: table users has no column named countryCode

```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Error thrown: sqlite3.OperationalError: table users has no column named lastName 743400216  
748426877 https://github.com/dogsheep/dogsheep-beta/issues/31#issuecomment-748426877 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/31 MDEyOklzc3VlQ29tbWVudDc0ODQyNjg3Nw== simonw 9599 2020-12-19T06:16:11Z 2020-12-19T06:16:11Z MEMBER

Here's why:

if "fts5" in str(e):

But the error being raised here is:

sqlite3.OperationalError: no such column: to

I'm going to attempt the escaped query on every error.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Searching for "github-to-sqlite" throws an error 771316301  
748426663 https://github.com/dogsheep/dogsheep-beta/issues/31#issuecomment-748426663 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/31 MDEyOklzc3VlQ29tbWVudDc0ODQyNjY2Mw== simonw 9599 2020-12-19T06:14:06Z 2020-12-19T06:14:06Z MEMBER

Looks like I already do that here: https://github.com/dogsheep/dogsheep-beta/blob/9ba4401017ac24ffa3bc1db38e0910ea49de7616/dogsheep_beta/init.py#L141-L146

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Searching for "github-to-sqlite" throws an error 771316301  
748426581 https://github.com/dogsheep/dogsheep-beta/issues/31#issuecomment-748426581 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/31 MDEyOklzc3VlQ29tbWVudDc0ODQyNjU4MQ== simonw 9599 2020-12-19T06:13:17Z 2020-12-19T06:13:17Z MEMBER

One fix for this could be to try running the raw query, but if it throws an error run it again with the query escaped.
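A hedged sketch of that retry pattern, using sqlite3 directly and a simple quote-each-term escaper - the table name and escaping rule are illustrative, not dogsheep-beta's actual code:

```python
import sqlite3

def escape_fts(query):
    # Quote each term so FTS5 treats strings like "github-to-sqlite" literally
    return " ".join('"{}"'.format(term.replace('"', '""')) for term in query.split())

def search(conn, query):
    sql = "select rowid from search_index_fts where search_index_fts match ?"
    try:
        return conn.execute(sql, [query]).fetchall()
    except sqlite3.OperationalError:
        # Raw query wasn't valid FTS syntax - retry with the escaped version
        return conn.execute(sql, [escape_fts(query)]).fetchall()
```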

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Searching for "github-to-sqlite" throws an error 771316301  
748426501 https://github.com/dogsheep/dogsheep-beta/issues/31#issuecomment-748426501 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/31 MDEyOklzc3VlQ29tbWVudDc0ODQyNjUwMQ== simonw 9599 2020-12-19T06:12:22Z 2020-12-19T06:12:22Z MEMBER

I deliberately added support for advanced FTS in https://github.com/dogsheep/dogsheep-beta/commit/cbb2491b85d7ff416d6d429b60109e6c2d6d50b9 for #13 but that's the cause of this bug.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Searching for "github-to-sqlite" throws an error 771316301  
747126777 https://github.com/dogsheep/google-takeout-to-sqlite/issues/2#issuecomment-747126777 https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/2 MDEyOklzc3VlQ29tbWVudDc0NzEyNjc3Nw== simonw 9599 2020-12-17T00:36:52Z 2020-12-17T00:36:52Z MEMBER

The memory profiler tricks I used in https://github.com/dogsheep/healthkit-to-sqlite/issues/7 could help figure out what's going on here.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
killed by oomkiller on large location-history 769376447  
747034481 https://github.com/dogsheep/dogsheep-beta/issues/29#issuecomment-747034481 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/29 MDEyOklzc3VlQ29tbWVudDc0NzAzNDQ4MQ== simonw 9599 2020-12-16T21:17:05Z 2020-12-16T21:17:05Z MEMBER

I'm just going to add q for the moment.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add search highlighting snippets 724759588  
747031608 https://github.com/dogsheep/dogsheep-beta/issues/29#issuecomment-747031608 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/29 MDEyOklzc3VlQ29tbWVudDc0NzAzMTYwOA== simonw 9599 2020-12-16T21:15:18Z 2020-12-16T21:15:18Z MEMBER

Should I pass any other details to the display_sql here as well?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add search highlighting snippets 724759588  
747030964 https://github.com/dogsheep/dogsheep-beta/issues/29#issuecomment-747030964 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/29 MDEyOklzc3VlQ29tbWVudDc0NzAzMDk2NA== simonw 9599 2020-12-16T21:14:54Z 2020-12-16T21:14:54Z MEMBER

To do this I'll need the search term to be passed to the display_sql SQL query: https://github.com/dogsheep/dogsheep-beta/blob/4890ec87b5e2ec48940f32c9ad1f5aae25c75a4d/dogsheep_beta/init.py#L164-L171

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add search highlighting snippets 724759588  
747029636 https://github.com/dogsheep/dogsheep-beta/issues/29#issuecomment-747029636 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/29 MDEyOklzc3VlQ29tbWVudDc0NzAyOTYzNg== simonw 9599 2020-12-16T21:14:03Z 2020-12-16T21:14:03Z MEMBER

I think I can do this as a cunning trick in display_sql. Consider this example query: https://til.simonwillison.net/tils?sql=select%0D%0A++path%2C%0D%0A++snippet%28til_fts%2C+-1%2C+%27b4de2a49c8%27%2C+%278c94a2ed4b%27%2C+%27...%27%2C+60%29+as+snippet%0D%0Afrom%0D%0A++til%0D%0A++join+til_fts+on+til.rowid+%3D+til_fts.rowid%0D%0Awhere%0D%0A++til_fts+match+escape_fts%28%3Aq%29%0D%0A++and+path+%3D+%27asgi_lifespan-test-httpx.md%27%0D%0A&q=pytest

select
  path,
  snippet(til_fts, -1, 'b4de2a49c8', '8c94a2ed4b', '...', 60) as snippet
from
  til
  join til_fts on til.rowid = til_fts.rowid
where
  til_fts match escape_fts(:q)
  and path = 'asgi_lifespan-test-httpx.md'

The and path = 'asgi_lifespan-test-httpx.md' bit means we only get back a specific document - but the snippet highlighting is applied to it.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add search highlighting snippets 724759588  
746735889 https://github.com/dogsheep/github-to-sqlite/issues/58#issuecomment-746735889 https://api.github.com/repos/dogsheep/github-to-sqlite/issues/58 MDEyOklzc3VlQ29tbWVudDc0NjczNTg4OQ== simonw 9599 2020-12-16T17:59:50Z 2020-12-16T17:59:50Z MEMBER

I don't want to add a full HTML parser (like BeautifulSoup) as a dependency for this feature. Since the HTML comes from a single, trusted source (GitHub) I could probably handle this using regular expressions.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Readme HTML has broken internal links 769150394  
746734412 https://github.com/dogsheep/github-to-sqlite/issues/58#issuecomment-746734412 https://api.github.com/repos/dogsheep/github-to-sqlite/issues/58 MDEyOklzc3VlQ29tbWVudDc0NjczNDQxMg== simonw 9599 2020-12-16T17:58:56Z 2020-12-16T17:58:56Z MEMBER

I'm going to rewrite those <a href="#filtering-tables"> links to <a href="#user-content-filtering-tables"> - but only if a corresponding id="user-content-filtering-tables" element exists.
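A rough regex-based sketch of that rewrite - collect the user-content- ids that exist, then rewrite only the anchors with a matching target (patterns and function name are illustrative, not the project's implementation):

```python
import re

def rewrite_readme_anchors(html):
    # Collect ids like id="user-content-filtering-tables" that GitHub adds
    existing_ids = set(re.findall(r'id="user-content-([^"]+)"', html))

    def replace(match):
        fragment = match.group(1)
        if fragment in existing_ids:
            return 'href="#user-content-{}"'.format(fragment)
        return match.group(0)  # leave links with no matching target alone

    # Negative lookahead avoids double-prefixing already-rewritten links
    return re.sub(r'href="#(?!user-content-)([^"]+)"', replace, html)
```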

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Readme HTML has broken internal links 769150394  
739058820 https://github.com/dogsheep/dogsheep-photos/pull/29#issuecomment-739058820 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/29 MDEyOklzc3VlQ29tbWVudDczOTA1ODgyMA== simonw 9599 2020-12-04T22:32:35Z 2020-12-04T22:32:35Z MEMBER

Thanks for this!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Fixed bug in SQL query for photo scores 638375985  
735485677 https://github.com/dogsheep/github-to-sqlite/issues/53#issuecomment-735485677 https://api.github.com/repos/dogsheep/github-to-sqlite/issues/53 MDEyOklzc3VlQ29tbWVudDczNTQ4NTY3Nw== simonw 9599 2020-11-30T00:36:09Z 2020-11-30T00:36:09Z MEMBER

Given rate limits (see #51) this command might be better implemented by running a git clone into a temporary directory - doing so would retrieve all of the files in one go.
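A hedged sketch of the clone-into-a-temporary-directory approach, using subprocess and tempfile - the command shape is assumed, not an implemented github-to-sqlite feature:

```python
# Shallow-clone the repo once instead of fetching every file through the API,
# then yield (relative path, contents) pairs for each file in the checkout.
import pathlib
import subprocess
import tempfile

def fetch_repo_files(full_name):
    with tempfile.TemporaryDirectory() as tmp:
        subprocess.run(
            ["git", "clone", "--depth", "1",
             "https://github.com/{}.git".format(full_name), tmp],
            check=True,
        )
        for path in pathlib.Path(tmp).rglob("*"):
            if path.is_file() and ".git" not in path.parts:
                yield str(path.relative_to(tmp)), path.read_bytes()
```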

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Command for fetching file contents 753000405  
735484186 https://github.com/dogsheep/github-to-sqlite/issues/51#issuecomment-735484186 https://api.github.com/repos/dogsheep/github-to-sqlite/issues/51 MDEyOklzc3VlQ29tbWVudDczNTQ4NDE4Ng== simonw 9599 2020-11-30T00:29:31Z 2020-11-30T00:29:31Z MEMBER

This just caused a failure in deploying the demo: https://github.com/dogsheep/github-to-sqlite/runs/1471304407?check_suite_focus=true

File "/opt/hostedtoolcache/Python/3.8.6/x64/bin/github-to-sqlite", line 33, in <module> sys.exit(load_entry_point('github-to-sqlite', 'console_scripts', 'github-to-sqlite')()) File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/click/core.py", line 829, in __call__ return self.main(*args, **kwargs) File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/click/core.py", line 782, in main rv = self.invoke(ctx) File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/click/core.py", line 1259, in invoke return _process_result(sub_ctx.command.invoke(sub_ctx)) File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/click/core.py", line 1066, in invoke return ctx.invoke(self.callback, **ctx.params) File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/click/core.py", line 610, in invoke return callback(*args, **kwargs) File "/home/runner/work/github-to-sqlite/github-to-sqlite/github_to_sqlite/cli.py", line 142, in issue_comments for comment in utils.fetch_issue_comments(repo, token, issue): File "/home/runner/work/github-to-sqlite/github-to-sqlite/github_to_sqlite/utils.py", line 380, in fetch_issue_comments for comments in paginate(url, headers): File "/home/runner/work/github-to-sqlite/github-to-sqlite/github_to_sqlite/utils.py", line 472, in paginate raise GitHubError.from_response(response) github_to_sqlite.utils.GitHubError: ('API rate limit exceeded for user ID 9599.', 403) Error: Process completed with exit code 1.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
github-to-sqlite should handle rate limits better 703246031  
735483820 https://github.com/dogsheep/github-to-sqlite/issues/46#issuecomment-735483820 https://api.github.com/repos/dogsheep/github-to-sqlite/issues/46 MDEyOklzc3VlQ29tbWVudDczNTQ4MzgyMA== simonw 9599 2020-11-30T00:27:47Z 2020-11-30T00:27:47Z MEMBER

So it looks like anything that pulls reviews needs to pull each review, then for each one pull the comments.

I'm going to consider this blocked on smarter rate limit handling in #51.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Feature: pull request reviews and comments 664485022  
735483604 https://github.com/dogsheep/github-to-sqlite/issues/46#issuecomment-735483604 https://api.github.com/repos/dogsheep/github-to-sqlite/issues/46 MDEyOklzc3VlQ29tbWVudDczNTQ4MzYwNA== simonw 9599 2020-11-30T00:26:50Z 2020-11-30T00:26:50Z MEMBER

It seems like there's a lot missing from that - those aren't particularly interesting given the data that is returned.

From the docs at https://docs.github.com/en/free-pro-team@latest/rest/reference/pulls#reviews it looks like each review consists of multiple comments, and the comments are where the useful material is - https://docs.github.com/en/free-pro-team@latest/rest/reference/pulls#list-comments-for-a-pull-request-review

github-to-sqlite get https://api.github.com/repos/dogsheep/github-to-sqlite/pulls/48/reviews/503368921/comments --accept 'application/vnd.github.v3+json'

json [ { "id": 500603838, "node_id": "MDI0OlB1bGxSZXF1ZXN0UmV2aWV3Q29tbWVudDUwMDYwMzgzOA==", "url": "https://api.github.com/repos/dogsheep/github-to-sqlite/pulls/comments/500603838", "pull_request_review_id": 503368921, "diff_hunk": "@@ -0,0 +1,370 @@\n+[\n+ {\n+ \"url\": \"https://api.github.com/repos/simonw/datasette/pulls/571\",\n+ \"id\": 313384926,\n+ \"node_id\": \"MDExOlB1bGxSZXF1ZXN0MzEzMzg0OTI2\",\n+ \"html_url\": \"https://github.com/simonw/datasette/pull/571\",\n+ \"diff_url\": \"https://github.com/simonw/datasette/pull/571.diff\",\n+ \"patch_url\": \"https://github.com/simonw/datasette/pull/571.patch\",\n+ \"issue_url\": \"https://api.github.com/repos/simonw/datasette/issues/571\",\n+ \"number\": 571,\n+ \"state\": \"closed\",\n+ \"locked\": false,\n+ \"title\": \"detect_fts now works with alternative table escaping\",\n+ \"user\": {\n+ \"login\": \"simonw\",\n+ \"id\": 9599,\n+ \"node_id\": \"MDQ6VXNlcjk1OTk=\",\n+ \"avatar_url\": \"https://avatars0.githubusercontent.com/u/9599?v=4\",\n+ \"gravatar_id\": \"\",\n+ \"url\": \"https://api.github.com/users/simonw\",\n+ \"html_url\": \"https://github.com/simonw\",\n+ \"followers_url\": \"https://api.github.com/users/simonw/followers\",\n+ \"following_url\": \"https://api.github.com/users/simonw/following{/other_user}\",\n+ \"gists_url\": \"https://api.github.com/users/simonw/gists{/gist_id}\",\n+ \"starred_url\": \"https://api.github.com/users/simonw/starred{/owner}{/repo}\",\n+ \"subscriptions_url\": \"https://api.github.com/users/simonw/subscriptions\",\n+ \"organizations_url\": \"https://api.github.com/users/simonw/orgs\",\n+ \"repos_url\": \"https://api.github.com/users/simonw/repos\",\n+ \"events_url\": \"https://api.github.com/users/simonw/events{/privacy}\",\n+ \"received_events_url\": \"https://api.github.com/users/simonw/received_events\",\n+ \"type\": \"User\",\n+ \"site_admin\": false\n+ },\n+ \"body\": \"Fixes #570\",\n+ \"created_at\": \"2019-09-03T00:23:39Z\",\n+ \"updated_at\": \"2019-09-03T00:32:28Z\",\n+ \"closed_at\": \"2019-09-03T00:32:28Z\",\n+ \"merged_at\": \"2019-09-03T00:32:28Z\",\n+ \"merge_commit_sha\": \"2dc5c8dc259a0606162673d394ba8cc1c6f54428\",\n+ \"assignee\": null,\n+ \"assignees\": [\n+\n+ ],\n+ \"requested_reviewers\": [\n+\n+ ],\n+ \"requested_teams\": [\n+\n+ ],\n+ \"labels\": [\n+\n+ ],\n+ \"milestone\": null,\n+ \"draft\": false,\n+ \"commits_url\": \"https://api.github.com/repos/simonw/datasette/pulls/571/commits\",\n+ \"review_comments_url\": \"https://api.github.com/repos/simonw/datasette/pulls/571/comments\",\n+ \"review_comment_url\": \"https://api.github.com/repos/simonw/datasette/pulls/comments{/number}\",\n+ \"comments_url\": \"https://api.github.com/repos/simonw/datasette/issues/571/comments\",\n+ \"statuses_url\": \"https://api.github.com/repos/simonw/datasette/statuses/a85239f69261c10f1a9f90514c8b5d113cb94585\",\n+ \"head\": {\n+ \"label\": \"simonw:detect-fts\",\n+ \"ref\": \"detect-fts\",\n+ \"sha\": \"a85239f69261c10f1a9f90514c8b5d113cb94585\",\n+ \"user\": {\n+ \"login\": \"simonw\",\n+ \"id\": 9599,\n+ \"node_id\": \"MDQ6VXNlcjk1OTk=\",\n+ \"avatar_url\": \"https://avatars0.githubusercontent.com/u/9599?v=4\",\n+ \"gravatar_id\": \"\",\n+ \"url\": \"https://api.github.com/users/simonw\",\n+ \"html_url\": \"https://github.com/simonw\",\n+ \"followers_url\": \"https://api.github.com/users/simonw/followers\",\n+ \"following_url\": \"https://api.github.com/users/simonw/following{/other_user}\",\n+ \"gists_url\": \"https://api.github.com/users/simonw/gists{/gist_id}\",\n+ 
\"starred_url\": \"https://api.github.com/users/simonw/starred{/owner}{/repo}\",\n+ \"subscriptions_url\": \"https://api.github.com/users/simonw/subscriptions\",\n+ \"organizations_url\": \"https://api.github.com/users/simonw/orgs\",\n+ \"repos_url\": \"https://api.github.com/users/simonw/repos\",\n+ \"events_url\": \"https://api.github.com/users/simonw/events{/privacy}\",\n+ \"received_events_url\": \"https://api.github.com/users/simonw/received_events\",\n+ \"type\": \"User\",\n+ \"site_admin\": false\n+ },\n+ \"repo\": {\n+ \"id\": 107914493,\n+ \"node_id\": \"MDEwOlJlcG9zaXRvcnkxMDc5MTQ0OTM=\",\n+ \"name\": \"datasette\",\n+ \"full_name\": \"simonw/datasette\",\n+ \"private\": false,\n+ \"owner\": {\n+ \"login\": \"simonw\",\n+ \"id\": 9599,\n+ \"node_id\": \"MDQ6VXNlcjk1OTk=\",\n+ \"avatar_url\": \"https://avatars0.githubusercontent.com/u/9599?v=4\",\n+ \"gravatar_id\": \"\",\n+ \"url\": \"https://api.github.com/users/simonw\",\n+ \"html_url\": \"https://github.com/simonw\",\n+ \"followers_url\": \"https://api.github.com/users/simonw/followers\",\n+ \"following_url\": \"https://api.github.com/users/simonw/following{/other_user}\",\n+ \"gists_url\": \"https://api.github.com/users/simonw/gists{/gist_id}\",\n+ \"starred_url\": \"https://api.github.com/users/simonw/starred{/owner}{/repo}\",\n+ \"subscriptions_url\": \"https://api.github.com/users/simonw/subscriptions\",\n+ \"organizations_url\": \"https://api.github.com/users/simonw/orgs\",\n+ \"repos_url\": \"https://api.github.com/users/simonw/repos\",\n+ \"events_url\": \"https://api.github.com/users/simonw/events{/privacy}\",\n+ \"received_events_url\": \"https://api.github.com/users/simonw/received_events\",\n+ \"type\": \"User\",\n+ \"site_admin\": false\n+ },\n+ \"html_url\": \"https://github.com/simonw/datasette\",\n+ \"description\": \"An open source multi-tool for exploring and publishing data\",\n+ \"fork\": false,\n+ \"url\": \"https://api.github.com/repos/simonw/datasette\",\n+ \"forks_url\": \"https://api.github.com/repos/simonw/datasette/forks\",\n+ \"keys_url\": \"https://api.github.com/repos/simonw/datasette/keys{/key_id}\",\n+ \"collaborators_url\": \"https://api.github.com/repos/simonw/datasette/collaborators{/collaborator}\",\n+ \"teams_url\": \"https://api.github.com/repos/simonw/datasette/teams\",\n+ \"hooks_url\": \"https://api.github.com/repos/simonw/datasette/hooks\",\n+ \"issue_events_url\": \"https://api.github.com/repos/simonw/datasette/issues/events{/number}\",\n+ \"events_url\": \"https://api.github.com/repos/simonw/datasette/events\",\n+ \"assignees_url\": \"https://api.github.com/repos/simonw/datasette/assignees{/user}\",\n+ \"branches_url\": \"https://api.github.com/repos/simonw/datasette/branches{/branch}\",\n+ \"tags_url\": \"https://api.github.com/repos/simonw/datasette/tags\",\n+ \"blobs_url\": \"https://api.github.com/repos/simonw/datasette/git/blobs{/sha}\",\n+ \"git_tags_url\": \"https://api.github.com/repos/simonw/datasette/git/tags{/sha}\",\n+ \"git_refs_url\": \"https://api.github.com/repos/simonw/datasette/git/refs{/sha}\",\n+ \"trees_url\": \"https://api.github.com/repos/simonw/datasette/git/trees{/sha}\",\n+ \"statuses_url\": \"https://api.github.com/repos/simonw/datasette/statuses/{sha}\",\n+ \"languages_url\": \"https://api.github.com/repos/simonw/datasette/languages\",\n+ \"stargazers_url\": \"https://api.github.com/repos/simonw/datasette/stargazers\",\n+ \"contributors_url\": \"https://api.github.com/repos/simonw/datasette/contributors\",\n+ \"subscribers_url\": 
\"https://api.github.com/repos/simonw/datasette/subscribers\",\n+ \"subscription_url\": \"https://api.github.com/repos/simonw/datasette/subscription\",\n+ \"commits_url\": \"https://api.github.com/repos/simonw/datasette/commits{/sha}\",\n+ \"git_commits_url\": \"https://api.github.com/repos/simonw/datasette/git/commits{/sha}\",\n+ \"comments_url\": \"https://api.github.com/repos/simonw/datasette/comments{/number}\",\n+ \"issue_comment_url\": \"https://api.github.com/repos/simonw/datasette/issues/comments{/number}\",\n+ \"contents_url\": \"https://api.github.com/repos/simonw/datasette/contents/{+path}\",\n+ \"compare_url\": \"https://api.github.com/repos/simonw/datasette/compare/{base}...{head}\",\n+ \"merges_url\": \"https://api.github.com/repos/simonw/datasette/merges\",\n+ \"archive_url\": \"https://api.github.com/repos/simonw/datasette/{archive_format}{/ref}\",\n+ \"downloads_url\": \"https://api.github.com/repos/simonw/datasette/downloads\",\n+ \"issues_url\": \"https://api.github.com/repos/simonw/datasette/issues{/number}\",\n+ \"pulls_url\": \"https://api.github.com/repos/simonw/datasette/pulls{/number}\",\n+ \"milestones_url\": \"https://api.github.com/repos/simonw/datasette/milestones{/number}\",\n+ \"notifications_url\": \"https://api.github.com/repos/simonw/datasette/notifications{?since,all,participating}\",\n+ \"labels_url\": \"https://api.github.com/repos/simonw/datasette/labels{/name}\",\n+ \"releases_url\": \"https://api.github.com/repos/simonw/datasette/releases{/id}\",\n+ \"deployments_url\": \"https://api.github.com/repos/simonw/datasette/deployments\",\n+ \"created_at\": \"2017-10-23T00:39:03Z\",\n+ \"updated_at\": \"2020-07-27T20:42:15Z\",\n+ \"pushed_at\": \"2020-07-26T01:21:05Z\",\n+ \"git_url\": \"git://github.com/simonw/datasette.git\",\n+ \"ssh_url\": \"git@github.com:simonw/datasette.git\",\n+ \"clone_url\": \"https://github.com/simonw/datasette.git\",\n+ \"svn_url\": \"https://github.com/simonw/datasette\",\n+ \"homepage\": \"http://datasette.readthedocs.io/\",\n+ \"size\": 3487,\n+ \"stargazers_count\": 3642,\n+ \"watchers_count\": 3642,\n+ \"language\": \"Python\",\n+ \"has_issues\": true,\n+ \"has_projects\": false,\n+ \"has_downloads\": true,\n+ \"has_wiki\": true,\n+ \"has_pages\": false,\n+ \"forks_count\": 206,\n+ \"mirror_url\": null,\n+ \"archived\": false,\n+ \"disabled\": false,\n+ \"open_issues_count\": 190,\n+ \"license\": {\n+ \"key\": \"apache-2.0\",\n+ \"name\": \"Apache License 2.0\",\n+ \"spdx_id\": \"Apache-2.0\",\n+ \"url\": \"https://api.github.com/licenses/apache-2.0\",\n+ \"node_id\": \"MDc6TGljZW5zZTI=\"\n+ },\n+ \"forks\": 206,\n+ \"open_issues\": 190,\n+ \"watchers\": 3642,\n+ \"default_branch\": \"master\"\n+ }\n+ },\n+ \"base\": {\n+ \"label\": \"simonw:master\",\n+ \"ref\": \"master\",\n+ \"sha\": \"f04deebec4f3842f7bd610cd5859de529f77d50e\",\n+ \"user\": {\n+ \"login\": \"simonw\",\n+ \"id\": 9599,\n+ \"node_id\": \"MDQ6VXNlcjk1OTk=\",\n+ \"avatar_url\": \"https://avatars0.githubusercontent.com/u/9599?v=4\",\n+ \"gravatar_id\": \"\",\n+ \"url\": \"https://api.github.com/users/simonw\",\n+ \"html_url\": \"https://github.com/simonw\",\n+ \"followers_url\": \"https://api.github.com/users/simonw/followers\",\n+ \"following_url\": \"https://api.github.com/users/simonw/following{/other_user}\",\n+ \"gists_url\": \"https://api.github.com/users/simonw/gists{/gist_id}\",\n+ \"starred_url\": \"https://api.github.com/users/simonw/starred{/owner}{/repo}\",\n+ \"subscriptions_url\": \"https://api.github.com/users/simonw/subscriptions\",\n+ 
\"organizations_url\": \"https://api.github.com/users/simonw/orgs\",\n+ \"repos_url\": \"https://api.github.com/users/simonw/repos\",\n+ \"events_url\": \"https://api.github.com/users/simonw/events{/privacy}\",\n+ \"received_events_url\": \"https://api.github.com/users/simonw/received_events\",\n+ \"type\": \"User\",\n+ \"site_admin\": false\n+ },\n+ \"repo\": {\n+ \"id\": 107914493,\n+ \"node_id\": \"MDEwOlJlcG9zaXRvcnkxMDc5MTQ0OTM=\",\n+ \"name\": \"datasette\",\n+ \"full_name\": \"simonw/datasette\",\n+ \"private\": false,\n+ \"owner\": {\n+ \"login\": \"simonw\",\n+ \"id\": 9599,\n+ \"node_id\": \"MDQ6VXNlcjk1OTk=\",\n+ \"avatar_url\": \"https://avatars0.githubusercontent.com/u/9599?v=4\",\n+ \"gravatar_id\": \"\",\n+ \"url\": \"https://api.github.com/users/simonw\",\n+ \"html_url\": \"https://github.com/simonw\",\n+ \"followers_url\": \"https://api.github.com/users/simonw/followers\",\n+ \"following_url\": \"https://api.github.com/users/simonw/following{/other_user}\",\n+ \"gists_url\": \"https://api.github.com/users/simonw/gists{/gist_id}\",\n+ \"starred_url\": \"https://api.github.com/users/simonw/starred{/owner}{/repo}\",\n+ \"subscriptions_url\": \"https://api.github.com/users/simonw/subscriptions\",\n+ \"organizations_url\": \"https://api.github.com/users/simonw/orgs\",\n+ \"repos_url\": \"https://api.github.com/users/simonw/repos\",\n+ \"events_url\": \"https://api.github.com/users/simonw/events{/privacy}\",\n+ \"received_events_url\": \"https://api.github.com/users/simonw/received_events\",\n+ \"type\": \"User\",\n+ \"site_admin\": false\n+ },\n+ \"html_url\": \"https://github.com/simonw/datasette\",\n+ \"description\": \"An open source multi-tool for exploring and publishing data\",\n+ \"fork\": false,\n+ \"url\": \"https://api.github.com/repos/simonw/datasette\",\n+ \"forks_url\": \"https://api.github.com/repos/simonw/datasette/forks\",\n+ \"keys_url\": \"https://api.github.com/repos/simonw/datasette/keys{/key_id}\",\n+ \"collaborators_url\": \"https://api.github.com/repos/simonw/datasette/collaborators{/collaborator}\",\n+ \"teams_url\": \"https://api.github.com/repos/simonw/datasette/teams\",\n+ \"hooks_url\": \"https://api.github.com/repos/simonw/datasette/hooks\",\n+ \"issue_events_url\": \"https://api.github.com/repos/simonw/datasette/issues/events{/number}\",\n+ \"events_url\": \"https://api.github.com/repos/simonw/datasette/events\",\n+ \"assignees_url\": \"https://api.github.com/repos/simonw/datasette/assignees{/user}\",\n+ \"branches_url\": \"https://api.github.com/repos/simonw/datasette/branches{/branch}\",\n+ \"tags_url\": \"https://api.github.com/repos/simonw/datasette/tags\",\n+ \"blobs_url\": \"https://api.github.com/repos/simonw/datasette/git/blobs{/sha}\",\n+ \"git_tags_url\": \"https://api.github.com/repos/simonw/datasette/git/tags{/sha}\",\n+ \"git_refs_url\": \"https://api.github.com/repos/simonw/datasette/git/refs{/sha}\",\n+ \"trees_url\": \"https://api.github.com/repos/simonw/datasette/git/trees{/sha}\",\n+ \"statuses_url\": \"https://api.github.com/repos/simonw/datasette/statuses/{sha}\",\n+ \"languages_url\": \"https://api.github.com/repos/simonw/datasette/languages\",\n+ \"stargazers_url\": \"https://api.github.com/repos/simonw/datasette/stargazers\",\n+ \"contributors_url\": \"https://api.github.com/repos/simonw/datasette/contributors\",\n+ \"subscribers_url\": \"https://api.github.com/repos/simonw/datasette/subscribers\",\n+ \"subscription_url\": \"https://api.github.com/repos/simonw/datasette/subscription\",\n+ \"commits_url\": 
\"https://api.github.com/repos/simonw/datasette/commits{/sha}\",\n+ \"git_commits_url\": \"https://api.github.com/repos/simonw/datasette/git/commits{/sha}\",\n+ \"comments_url\": \"https://api.github.com/repos/simonw/datasette/comments{/number}\",\n+ \"issue_comment_url\": \"https://api.github.com/repos/simonw/datasette/issues/comments{/number}\",\n+ \"contents_url\": \"https://api.github.com/repos/simonw/datasette/contents/{+path}\",\n+ \"compare_url\": \"https://api.github.com/repos/simonw/datasette/compare/{base}...{head}\",\n+ \"merges_url\": \"https://api.github.com/repos/simonw/datasette/merges\",\n+ \"archive_url\": \"https://api.github.com/repos/simonw/datasette/{archive_format}{/ref}\",\n+ \"downloads_url\": \"https://api.github.com/repos/simonw/datasette/downloads\",\n+ \"issues_url\": \"https://api.github.com/repos/simonw/datasette/issues{/number}\",\n+ \"pulls_url\": \"https://api.github.com/repos/simonw/datasette/pulls{/number}\",\n+ \"milestones_url\": \"https://api.github.com/repos/simonw/datasette/milestones{/number}\",\n+ \"notifications_url\": \"https://api.github.com/repos/simonw/datasette/notifications{?since,all,participating}\",\n+ \"labels_url\": \"https://api.github.com/repos/simonw/datasette/labels{/name}\",\n+ \"releases_url\": \"https://api.github.com/repos/simonw/datasette/releases{/id}\",\n+ \"deployments_url\": \"https://api.github.com/repos/simonw/datasette/deployments\",\n+ \"created_at\": \"2017-10-23T00:39:03Z\",\n+ \"updated_at\": \"2020-07-27T20:42:15Z\",\n+ \"pushed_at\": \"2020-07-26T01:21:05Z\",\n+ \"git_url\": \"git://github.com/simonw/datasette.git\",\n+ \"ssh_url\": \"git@github.com:simonw/datasette.git\",\n+ \"clone_url\": \"https://github.com/simonw/datasette.git\",\n+ \"svn_url\": \"https://github.com/simonw/datasette\",\n+ \"homepage\": \"http://datasette.readthedocs.io/\",\n+ \"size\": 3487,\n+ \"stargazers_count\": 3642,\n+ \"watchers_count\": 3642,\n+ \"language\": \"Python\",\n+ \"has_issues\": true,\n+ \"has_projects\": false,\n+ \"has_downloads\": true,\n+ \"has_wiki\": true,\n+ \"has_pages\": false,\n+ \"forks_count\": 206,\n+ \"mirror_url\": null,\n+ \"archived\": false,\n+ \"disabled\": false,\n+ \"open_issues_count\": 190,\n+ \"license\": {\n+ \"key\": \"apache-2.0\",\n+ \"name\": \"Apache License 2.0\",\n+ \"spdx_id\": \"Apache-2.0\",\n+ \"url\": \"https://api.github.com/licenses/apache-2.0\",\n+ \"node_id\": \"MDc6TGljZW5zZTI=\"\n+ },\n+ \"forks\": 206,\n+ \"open_issues\": 190,\n+ \"watchers\": 3642,\n+ \"default_branch\": \"master\"\n+ }\n+ },\n+ \"_links\": {\n+ \"self\": {\n+ \"href\": \"https://api.github.com/repos/simonw/datasette/pulls/571\"\n+ },\n+ \"html\": {\n+ \"href\": \"https://github.com/simonw/datasette/pull/571\"\n+ },\n+ \"issue\": {\n+ \"href\": \"https://api.github.com/repos/simonw/datasette/issues/571\"\n+ },\n+ \"comments\": {\n+ \"href\": \"https://api.github.com/repos/simonw/datasette/issues/571/comments\"\n+ },\n+ \"review_comments\": {\n+ \"href\": \"https://api.github.com/repos/simonw/datasette/pulls/571/comments\"\n+ },\n+ \"review_comment\": {\n+ \"href\": \"https://api.github.com/repos/simonw/datasette/pulls/comments{/number}\"\n+ },\n+ \"commits\": {\n+ \"href\": \"https://api.github.com/repos/simonw/datasette/pulls/571/commits\"\n+ },\n+ \"statuses\": {\n+ \"href\": \"https://api.github.com/repos/simonw/datasette/statuses/a85239f69261c10f1a9f90514c8b5d113cb94585\"\n+ }\n+ },\n+ \"author_association\": \"OWNER\",\n+ \"active_lock_reason\": null,\n+ \"merged\": true,\n+ \"mergeable\": null,\n+ 
\"rebaseable\": null,\n+ \"mergeable_state\": \"unknown\",\n+ \"merged_by\": {", "path": "tests/pull_requests.json", "position": 342, "original_position": 342, "commit_id": "3a0d5c498f9faae4e40aab204cd01b965a4f61f3", "user": { "login": "simonw", "id": 9599, "node_id": "MDQ6VXNlcjk1OTk=", "avatar_url": "https://avatars0.githubusercontent.com/u/9599?u=5968723deb1a55b82620e106f5ca58e9b11a0942&v=4", "gravatar_id": "", "url": "https://api.github.com/users/simonw", "html_url": "https://github.com/simonw", "followers_url": "https://api.github.com/users/simonw/followers", "following_url": "https://api.github.com/users/simonw/following{/other_user}", "gists_url": "https://api.github.com/users/simonw/gists{/gist_id}", "starred_url": "https://api.github.com/users/simonw/starred{/owner}{/repo}", "subscriptions_url": "https://api.github.com/users/simonw/subscriptions", "organizations_url": "https://api.github.com/users/simonw/orgs", "repos_url": "https://api.github.com/users/simonw/repos", "events_url": "https://api.github.com/users/simonw/events{/privacy}", "received_events_url": "https://api.github.com/users/simonw/received_events", "type": "User", "site_admin": false }, "body": "Running this should create a `merged_by` column on the `pull_requests` table which is a foreign key to the `users` table.", "created_at": "2020-10-06T21:22:47Z", "updated_at": "2020-10-20T20:56:33Z", "html_url": "https://github.com/dogsheep/github-to-sqlite/pull/48#discussion_r500603838", "pull_request_url": "https://api.github.com/repos/dogsheep/github-to-sqlite/pulls/48", "author_association": "MEMBER", "_links": { "self": { "href": "https://api.github.com/repos/dogsheep/github-to-sqlite/pulls/comments/500603838" }, "html": { "href": "https://github.com/dogsheep/github-to-sqlite/pull/48#discussion_r500603838" }, "pull_request": { "href": "https://api.github.com/repos/dogsheep/github-to-sqlite/pulls/48" } }, "original_commit_id": "4f33b850bd37829262dd29e1c520afffebedc19c" }, { "id": 500606198, "node_id": "MDI0OlB1bGxSZXF1ZXN0UmV2aWV3Q29tbWVudDUwMDYwNjE5OA==", "url": "https://api.github.com/repos/dogsheep/github-to-sqlite/pulls/comments/500606198", "pull_request_review_id": 503368921, "diff_hunk": "@@ -0,0 +1,124 @@\n+from github_to_sqlite import utils\n+import pytest\n+import pathlib\n+import sqlite_utils\n+from sqlite_utils.db import ForeignKey\n+import json\n+\n+\n+@pytest.fixture\n+def pull_requests():\n+ return json.load(open(pathlib.Path(__file__).parent / \"pull_requests.json\"))\n+\n+\n+@pytest.fixture\n+def db(pull_requests):\n+ db = sqlite_utils.Database(memory=True)\n+ db[\"repos\"].insert(\n+ {\"id\": 1},\n+ pk=\"id\",\n+ columns={\"organization\": int, \"topics\": str, \"name\": str, \"description\": str},\n+ )\n+ utils.save_pull_requests(db, pull_requests, {\"id\": 1})\n+ return db\n+\n+\n+def test_tables(db):\n+ assert {\"pull_requests\", \"users\", \"repos\", \"milestones\"} == set(\n+ db.table_names()\n+ )\n+ assert {\n+ ForeignKey(\n+ table=\"pull_requests\", column=\"repo\", other_table=\"repos\", other_column=\"id\"\n+ ),\n+ ForeignKey(\n+ table=\"pull_requests\",\n+ column=\"milestone\",\n+ other_table=\"milestones\",\n+ other_column=\"id\",\n+ ),\n+ ForeignKey(\n+ table=\"pull_requests\", column=\"assignee\", other_table=\"users\", other_column=\"id\"\n+ ),\n+ ForeignKey(\n+ table=\"pull_requests\", column=\"user\", other_table=\"users\", other_column=\"id\"\n+ ),\n+ } == set(db[\"pull_requests\"].foreign_keys)\n+\n+\n+def test_pull_requests(db):\n+ pull_request_rows = 
list(db[\"pull_requests\"].rows)\n+ assert [\n+ {\n+ 'id': 313384926,\n+ 'node_id': 'MDExOlB1bGxSZXF1ZXN0MzEzMzg0OTI2',\n+ 'number': 571,\n+ 'state': 'closed',\n+ 'locked': 0,\n+ 'title': 'detect_fts now works with alternative table escaping',\n+ 'user': 9599,\n+ 'body': 'Fixes #570',\n+ 'created_at': '2019-09-03T00:23:39Z',\n+ 'updated_at': '2019-09-03T00:32:28Z',\n+ 'closed_at': '2019-09-03T00:32:28Z',\n+ 'merged_at': '2019-09-03T00:32:28Z',\n+ 'merge_commit_sha': '2dc5c8dc259a0606162673d394ba8cc1c6f54428',\n+ 'assignee': None,\n+ 'milestone': None,\n+ 'draft': 0,\n+ 'head': 'a85239f69261c10f1a9f90514c8b5d113cb94585',\n+ 'base': 'f04deebec4f3842f7bd610cd5859de529f77d50e',\n+ 'author_association': 'OWNER',\n+ 'merged': 1,\n+ 'mergeable': None,\n+ 'rebaseable': None,\n+ 'mergeable_state': 'unknown',\n+ 'merged_by': '{\"login\": \"simonw\", \"id\": 9599, \"node_id\": \"MDQ6VXNlcjk1OTk=\", \"avatar_url\": \"https://avatars0.githubusercontent.com/u/9599?v=4\", \"gravatar_id\": \"\", \"url\": \"https://api.github.com/users/simonw\", \"html_url\": \"https://github.com/simonw\", \"followers_url\": \"https://api.github.com/users/simonw/followers\", \"following_url\": \"https://api.github.com/users/simonw/following{/other_user}\", \"gists_url\": \"https://api.github.com/users/simonw/gists{/gist_id}\", \"starred_url\": \"https://api.github.com/users/simonw/starred{/owner}{/repo}\", \"subscriptions_url\": \"https://api.github.com/users/simonw/subscriptions\", \"organizations_url\": \"https://api.github.com/users/simonw/orgs\", \"repos_url\": \"https://api.github.com/users/simonw/repos\", \"events_url\": \"https://api.github.com/users/simonw/events{/privacy}\", \"received_events_url\": \"https://api.github.com/users/simonw/received_events\", \"type\": \"User\", \"site_admin\": false}',", "path": "tests/test_pull_requests.py", "position": null, "original_position": 76, "commit_id": "3a0d5c498f9faae4e40aab204cd01b965a4f61f3", "user": { "login": "simonw", "id": 9599, "node_id": "MDQ6VXNlcjk1OTk=", "avatar_url": "https://avatars0.githubusercontent.com/u/9599?u=5968723deb1a55b82620e106f5ca58e9b11a0942&v=4", "gravatar_id": "", "url": "https://api.github.com/users/simonw", "html_url": "https://github.com/simonw", "followers_url": "https://api.github.com/users/simonw/followers", "following_url": "https://api.github.com/users/simonw/following{/other_user}", "gists_url": "https://api.github.com/users/simonw/gists{/gist_id}", "starred_url": "https://api.github.com/users/simonw/starred{/owner}{/repo}", "subscriptions_url": "https://api.github.com/users/simonw/subscriptions", "organizations_url": "https://api.github.com/users/simonw/orgs", "repos_url": "https://api.github.com/users/simonw/repos", "events_url": "https://api.github.com/users/simonw/events{/privacy}", "received_events_url": "https://api.github.com/users/simonw/received_events", "type": "User", "site_admin": false }, "body": "See above - this should be 9599, an integer reference to the row in the users table.", "created_at": "2020-10-06T21:27:43Z", "updated_at": "2020-10-20T20:56:33Z", "html_url": "https://github.com/dogsheep/github-to-sqlite/pull/48#discussion_r500606198", "pull_request_url": "https://api.github.com/repos/dogsheep/github-to-sqlite/pulls/48", "author_association": "MEMBER", "_links": { "self": { "href": "https://api.github.com/repos/dogsheep/github-to-sqlite/pulls/comments/500606198" }, "html": { "href": "https://github.com/dogsheep/github-to-sqlite/pull/48#discussion_r500606198" }, "pull_request": { "href": 
"https://api.github.com/repos/dogsheep/github-to-sqlite/pulls/48" } }, "original_commit_id": "4f33b850bd37829262dd29e1c520afffebedc19c" }, { "id": 500606665, "node_id": "MDI0OlB1bGxSZXF1ZXN0UmV2aWV3Q29tbWVudDUwMDYwNjY2NQ==", "url": "https://api.github.com/repos/dogsheep/github-to-sqlite/pulls/comments/500606665", "pull_request_review_id": 503368921, "diff_hunk": "@@ -0,0 +1,124 @@\n+from github_to_sqlite import utils\n+import pytest\n+import pathlib\n+import sqlite_utils\n+from sqlite_utils.db import ForeignKey\n+import json\n+\n+\n+@pytest.fixture\n+def pull_requests():\n+ return json.load(open(pathlib.Path(__file__).parent / \"pull_requests.json\"))\n+\n+\n+@pytest.fixture\n+def db(pull_requests):\n+ db = sqlite_utils.Database(memory=True)\n+ db[\"repos\"].insert(\n+ {\"id\": 1},\n+ pk=\"id\",\n+ columns={\"organization\": int, \"topics\": str, \"name\": str, \"description\": str},\n+ )\n+ utils.save_pull_requests(db, pull_requests, {\"id\": 1})\n+ return db\n+\n+\n+def test_tables(db):\n+ assert {\"pull_requests\", \"users\", \"repos\", \"milestones\"} == set(\n+ db.table_names()\n+ )\n+ assert {\n+ ForeignKey(\n+ table=\"pull_requests\", column=\"repo\", other_table=\"repos\", other_column=\"id\"\n+ ),\n+ ForeignKey(\n+ table=\"pull_requests\",\n+ column=\"milestone\",\n+ other_table=\"milestones\",\n+ other_column=\"id\",\n+ ),\n+ ForeignKey(\n+ table=\"pull_requests\", column=\"assignee\", other_table=\"users\", other_column=\"id\"\n+ ),\n+ ForeignKey(\n+ table=\"pull_requests\", column=\"user\", other_table=\"users\", other_column=\"id\"\n+ ),\n+ } == set(db[\"pull_requests\"].foreign_keys)\n+\n+\n+def test_pull_requests(db):\n+ pull_request_rows = list(db[\"pull_requests\"].rows)\n+ assert [\n+ {\n+ 'id': 313384926,", "path": "tests/test_pull_requests.py", "position": null, "original_position": 53, "commit_id": "3a0d5c498f9faae4e40aab204cd01b965a4f61f3", "user": { "login": "simonw", "id": 9599, "node_id": "MDQ6VXNlcjk1OTk=", "avatar_url": "https://avatars0.githubusercontent.com/u/9599?u=5968723deb1a55b82620e106f5ca58e9b11a0942&v=4", "gravatar_id": "", "url": "https://api.github.com/users/simonw", "html_url": "https://github.com/simonw", "followers_url": "https://api.github.com/users/simonw/followers", "following_url": "https://api.github.com/users/simonw/following{/other_user}", "gists_url": "https://api.github.com/users/simonw/gists{/gist_id}", "starred_url": "https://api.github.com/users/simonw/starred{/owner}{/repo}", "subscriptions_url": "https://api.github.com/users/simonw/subscriptions", "organizations_url": "https://api.github.com/users/simonw/orgs", "repos_url": "https://api.github.com/users/simonw/repos", "events_url": "https://api.github.com/users/simonw/events{/privacy}", "received_events_url": "https://api.github.com/users/simonw/received_events", "type": "User", "site_admin": false }, "body": "Minor detail: I use Black for this repo, which requires double quotes - running \"black .\" in the root directory (with the latest version of Black) should handle this for you.", "created_at": "2020-10-06T21:28:31Z", "updated_at": "2020-10-20T20:56:33Z", "html_url": "https://github.com/dogsheep/github-to-sqlite/pull/48#discussion_r500606665", "pull_request_url": "https://api.github.com/repos/dogsheep/github-to-sqlite/pulls/48", "author_association": "MEMBER", "_links": { "self": { "href": "https://api.github.com/repos/dogsheep/github-to-sqlite/pulls/comments/500606665" }, "html": { "href": "https://github.com/dogsheep/github-to-sqlite/pull/48#discussion_r500606665" }, 
"pull_request": { "href": "https://api.github.com/repos/dogsheep/github-to-sqlite/pulls/48" } }, "original_commit_id": "4f33b850bd37829262dd29e1c520afffebedc19c" } ] That's a lot more interesting.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Feature: pull request reviews and comments 664485022  
735482546 https://github.com/dogsheep/github-to-sqlite/issues/46#issuecomment-735482546 https://api.github.com/repos/dogsheep/github-to-sqlite/issues/46 MDEyOklzc3VlQ29tbWVudDczNTQ4MjU0Ng== simonw 9599 2020-11-30T00:22:02Z 2020-11-30T00:22:02Z MEMBER

As for reviews... here's the output of github-to-sqlite get https://api.github.com/repos/dogsheep/github-to-sqlite/pulls/48/reviews --accept 'application/vnd.github.v3+json'

json [ { "id": 503368921, "node_id": "MDE3OlB1bGxSZXF1ZXN0UmV2aWV3NTAzMzY4OTIx", "user": { "login": "simonw", "id": 9599, "node_id": "MDQ6VXNlcjk1OTk=", "avatar_url": "https://avatars0.githubusercontent.com/u/9599?u=5968723deb1a55b82620e106f5ca58e9b11a0942&v=4", "gravatar_id": "", "url": "https://api.github.com/users/simonw", "html_url": "https://github.com/simonw", "followers_url": "https://api.github.com/users/simonw/followers", "following_url": "https://api.github.com/users/simonw/following{/other_user}", "gists_url": "https://api.github.com/users/simonw/gists{/gist_id}", "starred_url": "https://api.github.com/users/simonw/starred{/owner}{/repo}", "subscriptions_url": "https://api.github.com/users/simonw/subscriptions", "organizations_url": "https://api.github.com/users/simonw/orgs", "repos_url": "https://api.github.com/users/simonw/repos", "events_url": "https://api.github.com/users/simonw/events{/privacy}", "received_events_url": "https://api.github.com/users/simonw/received_events", "type": "User", "site_admin": false }, "body": "", "state": "CHANGES_REQUESTED", "html_url": "https://github.com/dogsheep/github-to-sqlite/pull/48#pullrequestreview-503368921", "pull_request_url": "https://api.github.com/repos/dogsheep/github-to-sqlite/pulls/48", "author_association": "MEMBER", "_links": { "html": { "href": "https://github.com/dogsheep/github-to-sqlite/pull/48#pullrequestreview-503368921" }, "pull_request": { "href": "https://api.github.com/repos/dogsheep/github-to-sqlite/pulls/48" } }, "submitted_at": "2020-10-06T21:28:40Z", "commit_id": "4f33b850bd37829262dd29e1c520afffebedc19c" }, { "id": 513118561, "node_id": "MDE3OlB1bGxSZXF1ZXN0UmV2aWV3NTEzMTE4NTYx", "user": { "login": "adamjonas", "id": 755825, "node_id": "MDQ6VXNlcjc1NTgyNQ==", "avatar_url": "https://avatars1.githubusercontent.com/u/755825?v=4", "gravatar_id": "", "url": "https://api.github.com/users/adamjonas", "html_url": "https://github.com/adamjonas", "followers_url": "https://api.github.com/users/adamjonas/followers", "following_url": "https://api.github.com/users/adamjonas/following{/other_user}", "gists_url": "https://api.github.com/users/adamjonas/gists{/gist_id}", "starred_url": "https://api.github.com/users/adamjonas/starred{/owner}{/repo}", "subscriptions_url": "https://api.github.com/users/adamjonas/subscriptions", "organizations_url": "https://api.github.com/users/adamjonas/orgs", "repos_url": "https://api.github.com/users/adamjonas/repos", "events_url": "https://api.github.com/users/adamjonas/events{/privacy}", "received_events_url": "https://api.github.com/users/adamjonas/received_events", "type": "User", "site_admin": false }, "body": "", "state": "COMMENTED", "html_url": "https://github.com/dogsheep/github-to-sqlite/pull/48#pullrequestreview-513118561", "pull_request_url": "https://api.github.com/repos/dogsheep/github-to-sqlite/pulls/48", "author_association": "CONTRIBUTOR", "_links": { "html": { "href": "https://github.com/dogsheep/github-to-sqlite/pull/48#pullrequestreview-513118561" }, "pull_request": { "href": "https://api.github.com/repos/dogsheep/github-to-sqlite/pulls/48" } }, "submitted_at": "2020-10-20T20:45:05Z", "commit_id": "4f33b850bd37829262dd29e1c520afffebedc19c" }, { "id": 513127529, "node_id": "MDE3OlB1bGxSZXF1ZXN0UmV2aWV3NTEzMTI3NTI5", "user": { "login": "adamjonas", "id": 755825, "node_id": "MDQ6VXNlcjc1NTgyNQ==", "avatar_url": "https://avatars1.githubusercontent.com/u/755825?v=4", "gravatar_id": "", "url": "https://api.github.com/users/adamjonas", "html_url": "https://github.com/adamjonas", 
"followers_url": "https://api.github.com/users/adamjonas/followers", "following_url": "https://api.github.com/users/adamjonas/following{/other_user}", "gists_url": "https://api.github.com/users/adamjonas/gists{/gist_id}", "starred_url": "https://api.github.com/users/adamjonas/starred{/owner}{/repo}", "subscriptions_url": "https://api.github.com/users/adamjonas/subscriptions", "organizations_url": "https://api.github.com/users/adamjonas/orgs", "repos_url": "https://api.github.com/users/adamjonas/repos", "events_url": "https://api.github.com/users/adamjonas/events{/privacy}", "received_events_url": "https://api.github.com/users/adamjonas/received_events", "type": "User", "site_admin": false }, "body": "", "state": "COMMENTED", "html_url": "https://github.com/dogsheep/github-to-sqlite/pull/48#pullrequestreview-513127529", "pull_request_url": "https://api.github.com/repos/dogsheep/github-to-sqlite/pulls/48", "author_association": "CONTRIBUTOR", "_links": { "html": { "href": "https://github.com/dogsheep/github-to-sqlite/pull/48#pullrequestreview-513127529" }, "pull_request": { "href": "https://api.github.com/repos/dogsheep/github-to-sqlite/pulls/48" } }, "submitted_at": "2020-10-20T20:57:33Z", "commit_id": "3a0d5c498f9faae4e40aab204cd01b965a4f61f3" } ]

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Feature: pull request reviews and comments 664485022  
735482187 https://github.com/dogsheep/github-to-sqlite/issues/46#issuecomment-735482187 https://api.github.com/repos/dogsheep/github-to-sqlite/issues/46 MDEyOklzc3VlQ29tbWVudDczNTQ4MjE4Nw== simonw 9599 2020-11-30T00:20:11Z 2020-11-30T00:20:11Z MEMBER

Pull requests are now added, thanks to @adamjonas.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Feature: pull request reviews and comments 664485022  
735465708 https://github.com/dogsheep/github-to-sqlite/issues/54#issuecomment-735465708 https://api.github.com/repos/dogsheep/github-to-sqlite/issues/54 MDEyOklzc3VlQ29tbWVudDczNTQ2NTcwOA== simonw 9599 2020-11-29T22:08:46Z 2020-11-29T22:08:46Z MEMBER

Demo:

- https://github-to-sqlite.dogsheep.net/github/steps?_facet=repo
- https://github-to-sqlite.dogsheep.net/github/workflows
- https://github-to-sqlite.dogsheep.net/github/jobs

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
github-to-sqlite workflows command 753026003  
735464493 https://github.com/dogsheep/github-to-sqlite/issues/54#issuecomment-735464493 https://api.github.com/repos/dogsheep/github-to-sqlite/issues/54 MDEyOklzc3VlQ29tbWVudDczNTQ2NDQ5Mw== simonw 9599 2020-11-29T21:57:32Z 2020-11-29T21:57:32Z MEMBER

$ github-to-sqlite workflows github.db simonw/datasette dogsheep/github-to-sqlite

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
github-to-sqlite workflows command 753026003  


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
, [performed_via_github_app] TEXT);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);