home / github

Menu
  • Search all tables
  • GraphQL API

issue_comments

Table actions
  • GraphQL API for issue_comments

12 rows where issue = 612287234 and user = 9599 sorted by updated_at descending

✎ View and edit SQL

This data as json, CSV (advanced)

Suggested facets: reactions, updated_at (date)

user 1

  • simonw · 12 ✖

issue 1

  • Import machine-learning detected labels (dog, llama etc) from Apple Photos · 12 ✖

created_at (date) 1 ✖

  • 2020-05-05 12

author_association 1

  • MEMBER 12
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions issue performed_via_github_app
623865250 https://github.com/dogsheep/dogsheep-photos/issues/16#issuecomment-623865250 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/16 MDEyOklzc3VlQ29tbWVudDYyMzg2NTI1MA== simonw 9599 2020-05-05T05:38:16Z 2020-05-05T05:38:16Z MEMBER

It looks like groups.content_string often has a null byte in it. I should clean this up as part of the import.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Import machine-learning detected labels (dog, llama etc) from Apple Photos 612287234  
623863902 https://github.com/dogsheep/dogsheep-photos/issues/16#issuecomment-623863902 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/16 MDEyOklzc3VlQ29tbWVudDYyMzg2MzkwMg== simonw 9599 2020-05-05T05:31:53Z 2020-05-05T05:31:53Z MEMBER

Yes! Turning those rowid values into id with this script did the job: ```python import sqlite3 import sqlite_utils

conn = sqlite3.connect( "/Users/simon/Pictures/Photos Library.photoslibrary/database/search/psi.sqlite" )

def all_rows(table): result = conn.execute("select rowid as id, * from {}".format(table)) cols = [c[0] for c in result.description] for row in result.fetchall(): yield dict(zip(cols, row))

if name == "main": db = sqlite_utils.Database("psi_copy.db") for table in ("assets", "collections", "ga", "gc", "groups"): db[table].upsert_all(all_rows(table), pk="id", alter=True) Then I ran this query:sql select json_object('img_src', 'https://photos.simonwillison.net/i/' || photos.sha256 || '.' || photos.ext || '?w=400') as photo, group_concat(strip_null_chars(groups.content_string), ' ') as words, assets.uuid_0, assets.uuid_1, to_uuid(assets.uuid_0, assets.uuid_1) as uuid from assets join ga on assets.id = ga.assetid join groups on ga.groupid = groups.id join photos on photos.uuid = to_uuid(assets.uuid_0, assets.uuid_1) where groups.category = 2024 group by assets.id order by random() limit 10 ``` And got these results!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Import machine-learning detected labels (dog, llama etc) from Apple Photos 612287234  
623857417 https://github.com/dogsheep/dogsheep-photos/issues/16#issuecomment-623857417 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/16 MDEyOklzc3VlQ29tbWVudDYyMzg1NzQxNw== simonw 9599 2020-05-05T05:01:47Z 2020-05-05T05:01:47Z MEMBER

Even that didn't work - it didn't copy across the rowid values. I'm pretty sure that's what's wrong here: sqlite3 /Users/simon/Pictures/Photos\ Library.photoslibrary/database/search/psi.sqlite 'select rowid, uuid_0, uuid_1 from assets limit 10' 1619605|-9205353363298198838|4814875488794983828 1641378|-9205348195631362269|390804289838822030 1634974|-9205331524553603243|-3834026796261633148 1619083|-9205326176986145401|7563404215614709654 22131|-9205315724827218763|8370531509591906734 1645633|-9205247376092758131|-1311540150497601346

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Import machine-learning detected labels (dog, llama etc) from Apple Photos 612287234  
623855885 https://github.com/dogsheep/dogsheep-photos/issues/16#issuecomment-623855885 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/16 MDEyOklzc3VlQ29tbWVudDYyMzg1NTg4NQ== simonw 9599 2020-05-05T04:54:39Z 2020-05-05T04:54:53Z MEMBER

Trying this import mechanism instead: sqlite3 /Users/simon/Pictures/Photos\ Library.photoslibrary/database/search/psi.sqlite .dump | grep -v 'CREATE INDEX' | grep -v 'CREATE TRIGGER' | grep -v 'CREATE VIRTUAL TABLE' | sqlite3 search.db

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Import machine-learning detected labels (dog, llama etc) from Apple Photos 612287234  
623855841 https://github.com/dogsheep/dogsheep-photos/issues/16#issuecomment-623855841 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/16 MDEyOklzc3VlQ29tbWVudDYyMzg1NTg0MQ== simonw 9599 2020-05-05T04:54:28Z 2020-05-05T04:54:28Z MEMBER

Things were not matching up for me correctly:

I think that's because my import script didn't correctly import the existing rowid values.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Import machine-learning detected labels (dog, llama etc) from Apple Photos 612287234  
623846880 https://github.com/dogsheep/dogsheep-photos/issues/16#issuecomment-623846880 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/16 MDEyOklzc3VlQ29tbWVudDYyMzg0Njg4MA== simonw 9599 2020-05-05T04:06:08Z 2020-05-05T04:06:08Z MEMBER

This function seems to convert them into UUIDs that match my photos: python def to_uuid(uuid_0, uuid_1): b = uuid_0.to_bytes(8, 'little', signed=True) + uuid_1.to_bytes(8, 'little', signed=True) return str(uuid.UUID(bytes=b)).upper()

{
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 1,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Import machine-learning detected labels (dog, llama etc) from Apple Photos 612287234  
623811131 https://github.com/dogsheep/dogsheep-photos/issues/16#issuecomment-623811131 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/16 MDEyOklzc3VlQ29tbWVudDYyMzgxMTEzMQ== simonw 9599 2020-05-05T03:16:18Z 2020-05-05T03:16:18Z MEMBER

Here's how to convert two integers unto a UUID using Java. Not sure if it's the solution I need though (or how to do the same thing in Python):

https://repl.it/repls/EuphoricSomberClasslibrary

```java import java.util.UUID;

class Main { public static void main(String[] args) { java.util.UUID uuid = new java.util.UUID( 2544182952487526660L, -3640314103732024685L ); System.out.println( uuid ); } } ```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Import machine-learning detected labels (dog, llama etc) from Apple Photos 612287234  
623807568 https://github.com/dogsheep/dogsheep-photos/issues/16#issuecomment-623807568 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/16 MDEyOklzc3VlQ29tbWVudDYyMzgwNzU2OA== simonw 9599 2020-05-05T02:56:06Z 2020-05-05T02:56:06Z MEMBER

I'm pretty sure this is what I'm after. The groups table has what looks like identified labels in the rows with category = 2025:

Then there's a ga table that maps groups to assets:

And an assets table which looks like it has one row for every one of my photos:

One major challenge: these UUIDs are split into two integer numbers, uuid_0 and uuid_1 - but the main photos database uses regular UUIDs like this:

I need to figure out how to match up these two different UUID representations. I asked on Twitter if anyone has any ideas: https://twitter.com/simonw/status/1257500689019703296

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Import machine-learning detected labels (dog, llama etc) from Apple Photos 612287234  
623806687 https://github.com/dogsheep/dogsheep-photos/issues/16#issuecomment-623806687 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/16 MDEyOklzc3VlQ29tbWVudDYyMzgwNjY4Nw== simonw 9599 2020-05-05T02:51:16Z 2020-05-05T02:51:16Z MEMBER

Running datasette against it directly doesn't work: ``` simon@Simons-MacBook-Pro search % datasette psi.sqlite Serve! files=('psi.sqlite',) (immutables=()) on port 8001 Usage: datasette serve [OPTIONS] [FILES]...

Error: Connection to psi.sqlite failed check: no such tokenizer: PSITokenizer Instead, I created a new SQLite database with a copy of some of the key tables, like this: sqlite-utils rows psi.sqlite groups | sqlite-utils insert /tmp/search.db groups - sqlite-utils rows psi.sqlite assets | sqlite-utils insert /tmp/search.db assets - sqlite-utils rows psi.sqlite ga | sqlite-utils insert /tmp/search.db ga - sqlite-utils rows psi.sqlite collections | sqlite-utils insert /tmp/search.db collections - sqlite-utils rows psi.sqlite gc | sqlite-utils insert /tmp/search.db gc - sqlite-utils rows psi.sqlite lookup | sqlite-utils insert /tmp/search.db lookup - ```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Import machine-learning detected labels (dog, llama etc) from Apple Photos 612287234  
623806533 https://github.com/dogsheep/dogsheep-photos/issues/16#issuecomment-623806533 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/16 MDEyOklzc3VlQ29tbWVudDYyMzgwNjUzMw== simonw 9599 2020-05-05T02:50:16Z 2020-05-05T02:50:16Z MEMBER

I figured there must be a separate database that Photos uses to store the text of the identified labels.

I used "Open Files and Ports" in Activity Monitor against the Photos app to try and spot candidates... and found /Users/simon/Pictures/Photos Library.photoslibrary/database/search/psi.sqlite - a 53MB SQLite database file.

Here's the schema of that file: $ sqlite3 psi.sqlite .schema CREATE TABLE word_embedding(word TEXT, extended_word TEXT, score DOUBLE); CREATE INDEX word_embedding_index ON word_embedding(word); CREATE VIRTUAL TABLE word_embedding_prefix USING fts5(extended_word) /* word_embedding_prefix(extended_word) */; CREATE TABLE IF NOT EXISTS 'word_embedding_prefix_data'(id INTEGER PRIMARY KEY, block BLOB); CREATE TABLE IF NOT EXISTS 'word_embedding_prefix_idx'(segid, term, pgno, PRIMARY KEY(segid, term)) WITHOUT ROWID; CREATE TABLE IF NOT EXISTS 'word_embedding_prefix_content'(id INTEGER PRIMARY KEY, c0); CREATE TABLE IF NOT EXISTS 'word_embedding_prefix_docsize'(id INTEGER PRIMARY KEY, sz BLOB); CREATE TABLE IF NOT EXISTS 'word_embedding_prefix_config'(k PRIMARY KEY, v) WITHOUT ROWID; CREATE TABLE groups(category INT2, owning_groupid INT, content_string TEXT, normalized_string TEXT, lookup_identifier TEXT, token_ranges_0 INT8, token_ranges_1 INT8, UNIQUE(category, owning_groupid, content_string, lookup_identifier, token_ranges_0, token_ranges_1)); CREATE TABLE assets(uuid_0 INT, uuid_1 INT, creationDate INT, UNIQUE(uuid_0, uuid_1)); CREATE TABLE ga(groupid INT, assetid INT, PRIMARY KEY(groupid, assetid)); CREATE TABLE collections(uuid_0 INT, uuid_1 INT, startDate INT, endDate INT, title TEXT, subtitle TEXT, keyAssetUUID_0 INT, keyAssetUUID_1 INT, typeAndNumberOfAssets INT32, sortDate DOUBLE, UNIQUE(uuid_0, uuid_1)); CREATE TABLE gc(groupid INT, collectionid INT, PRIMARY KEY(groupid, collectionid)); CREATE VIRTUAL TABLE prefix USING fts5(content='groups', normalized_string, category UNINDEXED, tokenize = 'PSITokenizer'); CREATE TABLE IF NOT EXISTS 'prefix_data'(id INTEGER PRIMARY KEY, block BLOB); CREATE TABLE IF NOT EXISTS 'prefix_idx'(segid, term, pgno, PRIMARY KEY(segid, term)) WITHOUT ROWID; CREATE TABLE IF NOT EXISTS 'prefix_docsize'(id INTEGER PRIMARY KEY, sz BLOB); CREATE TABLE IF NOT EXISTS 'prefix_config'(k PRIMARY KEY, v) WITHOUT ROWID; CREATE TABLE lookup(identifier TEXT PRIMARY KEY, category INT2); CREATE TRIGGER trigger_groups_insert AFTER INSERT ON groups BEGIN INSERT INTO prefix(rowid, normalized_string, category) VALUES (new.rowid, new.normalized_string, new.category); END; CREATE TRIGGER trigger_groups_delete AFTER DELETE ON groups BEGIN INSERT INTO prefix(prefix, rowid, normalized_string, category) VALUES('delete', old.rowid, old.normalized_string, old.category); END; CREATE INDEX group_pk ON groups(category, content_string, normalized_string, lookup_identifier); CREATE INDEX asset_pk ON assets(uuid_0, uuid_1); CREATE INDEX ga_assetid ON ga(assetid, groupid); CREATE INDEX collection_pk ON collections(uuid_0, uuid_1); CREATE INDEX gc_collectionid ON gc(collectionid);

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Import machine-learning detected labels (dog, llama etc) from Apple Photos 612287234  
623806085 https://github.com/dogsheep/dogsheep-photos/issues/16#issuecomment-623806085 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/16 MDEyOklzc3VlQ29tbWVudDYyMzgwNjA4NQ== simonw 9599 2020-05-05T02:47:18Z 2020-05-05T02:47:18Z MEMBER

In https://github.com/RhetTbull/osxphotos/issues/121#issuecomment-623249263 Rhet Turnbull spotted a table called ZSCENEIDENTIFIER which looked like it might have the right data, but the columns in it aren't particularly helpful: Z_PK,Z_ENT,Z_OPT,ZSCENEIDENTIFIER,ZASSETATTRIBUTES,ZCONFIDENCE 8,49,1,731,5,0.11834716796875 9,49,1,684,6,0.0233648251742125 10,49,1,1702,1,0.026153564453125 I love the look of those confidence scores, but what do the numbers mean?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Import machine-learning detected labels (dog, llama etc) from Apple Photos 612287234  
623805823 https://github.com/dogsheep/dogsheep-photos/issues/16#issuecomment-623805823 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/16 MDEyOklzc3VlQ29tbWVudDYyMzgwNTgyMw== simonw 9599 2020-05-05T02:45:56Z 2020-05-05T02:45:56Z MEMBER

I filed an issue with osxphotos about this here: https://github.com/RhetTbull/osxphotos/issues/121

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Import machine-learning detected labels (dog, llama etc) from Apple Photos 612287234  

Advanced export

JSON shape: default, array, newline-delimited, object

CSV options:

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
, [performed_via_github_app] TEXT);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);
Powered by Datasette · Queries took 455.384ms · About: github-to-sqlite