5,578 rows sorted by html_url

View and edit SQL

Suggested facets: created_at (date), updated_at (date)

author_association

id html_url ▼ issue_url node_id user created_at updated_at author_association body reactions issue performed_via_github_app
686238498 https://github.com/dogsheep/dogsheep-beta/issues/10#issuecomment-686238498 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/10 MDEyOklzc3VlQ29tbWVudDY4NjIzODQ5OA== simonw 9599 2020-09-03T04:05:05Z 2020-09-03T04:05:05Z MEMBER

Since the first two categories are created and saved this one should be called received.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Category 3: received 691557547  
686618669 https://github.com/dogsheep/dogsheep-beta/issues/11#issuecomment-686618669 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/11 MDEyOklzc3VlQ29tbWVudDY4NjYxODY2OQ== simonw 9599 2020-09-03T16:47:34Z 2020-09-03T16:53:25Z MEMBER

I think a is_public integer column which defaults to 0 would be good here.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Public / Private mechanism 692125110  
686774592 https://github.com/dogsheep/dogsheep-beta/issues/13#issuecomment-686774592 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/13 MDEyOklzc3VlQ29tbWVudDY4Njc3NDU5Mg== simonw 9599 2020-09-03T21:30:21Z 2020-09-03T21:30:21Z MEMBER

This is partially supported: the custom search SQL we run doesn't escape them, but the ?_search used to calculate facet counts does. So this is a bug.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Support advanced FTS queries 692386625  
695124698 https://github.com/dogsheep/dogsheep-beta/issues/15#issuecomment-695124698 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/15 MDEyOklzc3VlQ29tbWVudDY5NTEyNDY5OA== simonw 9599 2020-09-18T23:17:38Z 2020-09-18T23:17:38Z MEMBER

This can be part of the demo instance in #6.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add a bunch of config examples 694136490  
694548909 https://github.com/dogsheep/dogsheep-beta/issues/16#issuecomment-694548909 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/16 MDEyOklzc3VlQ29tbWVudDY5NDU0ODkwOQ== simonw 9599 2020-09-17T23:15:09Z 2020-09-17T23:15:09Z MEMBER

I have sort by date now, #21.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Timeline view 694493566  
695851036 https://github.com/dogsheep/dogsheep-beta/issues/16#issuecomment-695851036 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/16 MDEyOklzc3VlQ29tbWVudDY5NTg1MTAzNg== simonw 9599 2020-09-20T23:34:57Z 2020-09-20T23:34:57Z MEMBER

Really basic starting point is to add facet by date.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Timeline view 694493566  
695877627 https://github.com/dogsheep/dogsheep-beta/issues/16#issuecomment-695877627 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/16 MDEyOklzc3VlQ29tbWVudDY5NTg3NzYyNw== simonw 9599 2020-09-21T02:42:29Z 2020-09-21T02:42:29Z MEMBER

Fun twist: assuming timestamp is always stored as UTC, I need the interface to be timezone aware so I can see e.g. everything from 4th July 2020 in the San Francisco timezone definition of 4th July 2020.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Timeline view 694493566  
687880459 https://github.com/dogsheep/dogsheep-beta/issues/17#issuecomment-687880459 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/17 MDEyOklzc3VlQ29tbWVudDY4Nzg4MDQ1OQ== simonw 9599 2020-09-06T19:36:32Z 2020-09-06T19:36:32Z MEMBER

At some point I may even want to support search types which are indexed from (and inflated from) more than one database file. I'm going to ignore that for the moment though.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Rename "table" to "type" 694500679  
689226390 https://github.com/dogsheep/dogsheep-beta/issues/17#issuecomment-689226390 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/17 MDEyOklzc3VlQ29tbWVudDY4OTIyNjM5MA== simonw 9599 2020-09-09T00:36:07Z 2020-09-09T00:36:07Z MEMBER

Alternative names:

  • type
  • record_type
  • doctype

I think type is right. It matches what Elasticsearch used to call their equivalent of this (before they removed the feature!). https://www.elastic.co/guide/en/elasticsearch/reference/current/removal-of-types.html

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Rename "table" to "type" 694500679  
688622995 https://github.com/dogsheep/dogsheep-beta/issues/18#issuecomment-688622995 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/18 MDEyOklzc3VlQ29tbWVudDY4ODYyMjk5NQ== simonw 9599 2020-09-08T05:15:21Z 2020-09-08T05:15:21Z MEMBER

Alternatively it could run as it does now but add a DELETE FROM index1.search_index WHERE key not in (select key from ...).

I'm not sure which would be more efficient.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Deleted records stay in the search index 695553522  
688623097 https://github.com/dogsheep/dogsheep-beta/issues/18#issuecomment-688623097 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/18 MDEyOklzc3VlQ29tbWVudDY4ODYyMzA5Nw== simonw 9599 2020-09-08T05:15:51Z 2020-09-08T05:15:51Z MEMBER

I'm inclined to go with the first, simpler option. I have longer term plans for efficient incremental index updates based on clever trickery with triggers.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Deleted records stay in the search index 695553522  
688625430 https://github.com/dogsheep/dogsheep-beta/issues/19#issuecomment-688625430 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/19 MDEyOklzc3VlQ29tbWVudDY4ODYyNTQzMA== simonw 9599 2020-09-08T05:24:50Z 2020-09-08T05:24:50Z MEMBER

I thought about allowing tables to define a incremental indexing SQL query - maybe something that can return just records touched in the past hour, or records since a recorded "last indexed record" value.

The problem with this is deletes - if you delete a record, how does the indexer know to remove it? See #18 - that's already caused problems.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Figure out incremental re-indexing 695556681  
688626037 https://github.com/dogsheep/dogsheep-beta/issues/19#issuecomment-688626037 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/19 MDEyOklzc3VlQ29tbWVudDY4ODYyNjAzNw== simonw 9599 2020-09-08T05:27:07Z 2020-09-08T05:27:07Z MEMBER

A really clever way to do this would be with triggers. The indexer script would add triggers to each of the database tables that it is indexing - each in their own database.

Those triggers would then maintain a _index_queue_ table. This table would record the primary key of rows that are added, modified or deleted. The indexer could then work by reading through the _index_queue_ table, re-indexing (or deleting) just the primary keys listed there, and then emptying the queue once it has finished.

This would add a small amount of overhead to insert/update/delete queries run against the table. My hunch is that the overhead would be miniscule, but I could still allow people to opt-out for tables that are so high traffic that this would matter.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Figure out incremental re-indexing 695556681  
685115519 https://github.com/dogsheep/dogsheep-beta/issues/2#issuecomment-685115519 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/2 MDEyOklzc3VlQ29tbWVudDY4NTExNTUxOQ== simonw 9599 2020-09-01T20:31:57Z 2020-09-01T20:31:57Z MEMBER

Actually this doesn't work: you can't turn on stemming for specific tables, because all of the content goes into a single search_index table which is configured the same way.

So stemming needs to be a global option.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Apply porter stemming 689809225  
685121074 https://github.com/dogsheep/dogsheep-beta/issues/2#issuecomment-685121074 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/2 MDEyOklzc3VlQ29tbWVudDY4NTEyMTA3NA== simonw 9599 2020-09-01T20:42:00Z 2020-09-01T20:42:00Z MEMBER

Documentation at the bottom of the Usage section here: https://github.com/dogsheep/dogsheep-beta/blob/0.2/README.md#usage

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Apply porter stemming 689809225  
694551406 https://github.com/dogsheep/dogsheep-beta/issues/24#issuecomment-694551406 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/24 MDEyOklzc3VlQ29tbWVudDY5NDU1MTQwNg== simonw 9599 2020-09-17T23:22:07Z 2020-09-17T23:22:07Z MEMBER

Neat, I can debug this with the new --pdb option:

datasette . --get '/-/beta?q=pycon&sort=oldest' --pdb
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
the JSON object must be str, bytes or bytearray, not 'Undefined' 703970814  
694551646 https://github.com/dogsheep/dogsheep-beta/issues/24#issuecomment-694551646 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/24 MDEyOklzc3VlQ29tbWVudDY5NDU1MTY0Ng== simonw 9599 2020-09-17T23:22:48Z 2020-09-17T23:22:48Z MEMBER

Looks like its happening in a Jinja fragment template for one of the results:

  /Users/simon/Dropbox/Development/dogsheep-beta/dogsheep_beta/__init__.py(169)process_results()
-> output = compiled.render({**result, **{"json": json}})
  /Users/simon/.local/share/virtualenvs/dogsheep-beta-u_po4Rpj/lib/python3.8/site-packages/jinja2/asyncsupport.py(71)render()
-> return original_render(self, *args, **kwargs)
  /Users/simon/.local/share/virtualenvs/dogsheep-beta-u_po4Rpj/lib/python3.8/site-packages/jinja2/environment.py(1090)render()
-> self.environment.handle_exception()
  /Users/simon/.local/share/virtualenvs/dogsheep-beta-u_po4Rpj/lib/python3.8/site-packages/jinja2/environment.py(832)handle_exception()
-> reraise(*rewrite_traceback_stack(source=source))
  /Users/simon/.local/share/virtualenvs/dogsheep-beta-u_po4Rpj/lib/python3.8/site-packages/jinja2/_compat.py(28)reraise()
-> raise value.with_traceback(tb)
  <template>(5)top-level template code()
> /usr/local/opt/python@3.8/Frameworks/Python.framework/Versions/3.8/lib/python3.8/json/__init__.py(341)loads()
-> raise TypeError(f'the JSON object must be str, bytes or bytearray, '
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
the JSON object must be str, bytes or bytearray, not 'Undefined' 703970814  
694552393 https://github.com/dogsheep/dogsheep-beta/issues/24#issuecomment-694552393 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/24 MDEyOklzc3VlQ29tbWVudDY5NDU1MjM5Mw== simonw 9599 2020-09-17T23:25:01Z 2020-09-17T23:25:17Z MEMBER

Ran locals() In the debugger:
{'range': <class 'range'>, 'dict': <class 'dict'>, 'lipsum': <function generate_lorem_ipsum at 0x10aeff430>, 'cycler': <class 'jinja2.utils.Cycler'>, 'joiner': <class 'jinja2.utils.Joiner'>, 'namespace': <class 'jinja2.utils.Namespace'>, 'rank': -9.383801886431414, 'rowid': 14297, 'type': 'twitter.db/tweets', 'key': '312658917933076480', 'title': 'Tweet by @chrisstreeter', 'category': 2, 'timestamp': '2013-03-15T20:17:49+00:00', 'search_1': '@simonw are you at pycon? Would love to meet you.', 'display': {'avatar_url': 'https://pbs.twimg.com/profile_images/806275088597204993/38yLHfJi_normal.jpg', 'user_name': 'Chris Streeter', 'screen_name': 'chrisstreeter', 'followers_count': 280, 'tweet_id': 312658917933076480, 'created_at': '2013-03-15T20:17:49+00:00', 'full_text': '@simonw are you at pycon? Would love to meet you.', 'media_urls_2': '[]', 'media_urls': '[]'}, 'json': <module 'json' from '/usr/local/opt/python@3.8/Frameworks/Python.framework/Versions/3.8/lib/python3.8/json/__init__.py'>}

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
the JSON object must be str, bytes or bytearray, not 'Undefined' 703970814  
694552681 https://github.com/dogsheep/dogsheep-beta/issues/24#issuecomment-694552681 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/24 MDEyOklzc3VlQ29tbWVudDY5NDU1MjY4MQ== simonw 9599 2020-09-17T23:25:54Z 2020-09-17T23:25:54Z MEMBER

This is the template fragment it's rendering:

            <div style="overflow: hidden;">
              <p>Tweet by <a href="https://twitter.com/{{ display.screen_name }}">@{{ display.screen_name }}</a> ({{ display.user_name }}, {{ "{:,}".format(display.followers_count or 0) }} followers)
                on <a href="https://twitter.com/{{ display.screen_name }}/status/{{ display.tweet_id }}">{{ display.created_at }}</a></p>
              </p>
              <blockquote>{{ display.full_text }}</blockquote>
              {% if display.media_urls and json.loads(display.media_urls) %}
                {% for url in json.loads(display.media_urls) %}
                  <img src="{{ url }}" style="height: 200px;">
                {% endfor %}
              {% endif %}
            </div>
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
the JSON object must be str, bytes or bytearray, not 'Undefined' 703970814  
694553579 https://github.com/dogsheep/dogsheep-beta/issues/24#issuecomment-694553579 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/24 MDEyOklzc3VlQ29tbWVudDY5NDU1MzU3OQ== simonw 9599 2020-09-17T23:28:37Z 2020-09-17T23:28:37Z MEMBER

More investigation in pdb:

(dogsheep-beta) dogsheep-beta % datasette . --get '/-/beta?q=pycon&sort=oldest' --pdb
> /usr/local/opt/python@3.8/Frameworks/Python.framework/Versions/3.8/lib/python3.8/json/__init__.py(341)loads()
-> raise TypeError(f'the JSON object must be str, bytes or bytearray, '
(Pdb) list
336             if s.startswith('\ufeff'):
337                 raise JSONDecodeError("Unexpected UTF-8 BOM (decode using utf-8-sig)",
338                                       s, 0)
339         else:
340             if not isinstance(s, (bytes, bytearray)):
341  ->             raise TypeError(f'the JSON object must be str, bytes or bytearray, '
342                                 f'not {s.__class__.__name__}')
343             s = s.decode(detect_encoding(s), 'surrogatepass')
344     
345         if "encoding" in kw:
346             import warnings
(Pdb) bytes
<class 'bytes'>
(Pdb) locals()['s']
Undefined
(Pdb) type(locals()['s'])
<class 'jinja2.runtime.Undefined'>
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
the JSON object must be str, bytes or bytearray, not 'Undefined' 703970814  
694554584 https://github.com/dogsheep/dogsheep-beta/issues/24#issuecomment-694554584 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/24 MDEyOklzc3VlQ29tbWVudDY5NDU1NDU4NA== simonw 9599 2020-09-17T23:31:25Z 2020-09-17T23:31:25Z MEMBER

I'd prefer it if errors in these template fragments were displayed as errors inline where the fragment should have been inserted, rather than 500ing the whole page - especially since the template fragments are user-provided and could have all kinds of odd errors in them which should be as easy to debug as possible.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
the JSON object must be str, bytes or bytearray, not 'Undefined' 703970814  
694557425 https://github.com/dogsheep/dogsheep-beta/issues/24#issuecomment-694557425 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/24 MDEyOklzc3VlQ29tbWVudDY5NDU1NzQyNQ== simonw 9599 2020-09-17T23:41:01Z 2020-09-17T23:41:01Z MEMBER

I removed all of the json.loads() calls and I'm still getting that Undefined error.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
the JSON object must be str, bytes or bytearray, not 'Undefined' 703970814  
695113871 https://github.com/dogsheep/dogsheep-beta/issues/24#issuecomment-695113871 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/24 MDEyOklzc3VlQ29tbWVudDY5NTExMzg3MQ== simonw 9599 2020-09-18T22:30:17Z 2020-09-18T22:30:17Z MEMBER

I think I know what's going on here:

https://github.com/dogsheep/dogsheep-beta/blob/0f1b951c5131d16f3c8559a8e4d79ed5c559e3cb/dogsheep_beta/__init__.py#L166-L171

This is a logic bug - the compiled variable could be the template from the previous loop!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
the JSON object must be str, bytes or bytearray, not 'Undefined' 703970814  
695108895 https://github.com/dogsheep/dogsheep-beta/issues/25#issuecomment-695108895 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/25 MDEyOklzc3VlQ29tbWVudDY5NTEwODg5NQ== simonw 9599 2020-09-18T22:11:32Z 2020-09-18T22:11:32Z MEMBER

I'm going to make this a new plugin configuration setting, template_debug.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
template_debug mechanism 704685890  
695109140 https://github.com/dogsheep/dogsheep-beta/issues/25#issuecomment-695109140 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/25 MDEyOklzc3VlQ29tbWVudDY5NTEwOTE0MA== simonw 9599 2020-09-18T22:12:20Z 2020-09-18T22:12:20Z MEMBER
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
template_debug mechanism 704685890  
695855646 https://github.com/dogsheep/dogsheep-beta/issues/26#issuecomment-695855646 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/26 MDEyOklzc3VlQ29tbWVudDY5NTg1NTY0Ng== simonw 9599 2020-09-21T00:16:11Z 2020-09-21T00:16:11Z MEMBER

Should I do this with offset/limit or should I do proper keyset pagination?

I think keyset because then it will work well for the full search interface with no filters or search string.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Pagination 705215230  
695855723 https://github.com/dogsheep/dogsheep-beta/issues/26#issuecomment-695855723 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/26 MDEyOklzc3VlQ29tbWVudDY5NTg1NTcyMw== simonw 9599 2020-09-21T00:16:52Z 2020-09-21T00:17:53Z MEMBER

It feels a bit weird to implement keyset pagination against results sorted by rank because the ranks could change substantially if the search index gets updated while the user is paginating.

I may just ignore that though. If you want reliable pagination you can get it by sorting by date. Maybe it doesn't even make sense to offer pagination if you sort by relevance?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Pagination 705215230  
695856398 https://github.com/dogsheep/dogsheep-beta/issues/26#issuecomment-695856398 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/26 MDEyOklzc3VlQ29tbWVudDY5NTg1NjM5OA== simonw 9599 2020-09-21T00:22:20Z 2020-09-21T00:22:20Z MEMBER

I'm going to try for keyset pagination sorted by relevance just as a learning exercise.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Pagination 705215230  
695856967 https://github.com/dogsheep/dogsheep-beta/issues/26#issuecomment-695856967 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/26 MDEyOklzc3VlQ29tbWVudDY5NTg1Njk2Nw== simonw 9599 2020-09-21T00:26:59Z 2020-09-21T00:26:59Z MEMBER

It's a shame Datasette doesn't currently have an easy way to implement sorted-by-rank keyset-paginated using a TableView or QueryView. I'll have to do this using the custom SQL query constructed in the plugin: https://github.com/dogsheep/dogsheep-beta/blob/bed9df2b3ef68189e2e445427721a28f4e9b4887/dogsheep_beta/__init__.py#L8-L43

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Pagination 705215230  
695875274 https://github.com/dogsheep/dogsheep-beta/issues/26#issuecomment-695875274 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/26 MDEyOklzc3VlQ29tbWVudDY5NTg3NTI3NA== simonw 9599 2020-09-21T02:28:58Z 2020-09-21T02:28:58Z MEMBER

Datasette's implementation is complex because it has to support compound primary keys: https://github.com/simonw/datasette/blob/a258339a935d8d29a95940ef1db01e98bb85ae63/datasette/utils/__init__.py#L88-L114 - but that's not something that's needed for dogsheep-beta.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Pagination 705215230  
695879237 https://github.com/dogsheep/dogsheep-beta/issues/26#issuecomment-695879237 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/26 MDEyOklzc3VlQ29tbWVudDY5NTg3OTIzNw== simonw 9599 2020-09-21T02:53:29Z 2020-09-21T02:53:29Z MEMBER

If previous page ended at 2018-02-11T16:32:53+00:00:

select
  search_index.rowid,
  search_index.type,
  search_index.key,
  search_index.title,
  search_index.category,
  search_index.timestamp,
  search_index.search_1
from
  search_index
 where 
  date("timestamp") = '2018-02-11'
 and timestamp < '2018-02-11T16:32:53+00:00'
order by
  search_index.timestamp desc, rowid
limit 41
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Pagination 705215230  
695879531 https://github.com/dogsheep/dogsheep-beta/issues/26#issuecomment-695879531 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/26 MDEyOklzc3VlQ29tbWVudDY5NTg3OTUzMQ== simonw 9599 2020-09-21T02:55:28Z 2020-09-21T02:55:54Z MEMBER
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Pagination 705215230  
711089647 https://github.com/dogsheep/dogsheep-beta/issues/28#issuecomment-711089647 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/28 MDEyOklzc3VlQ29tbWVudDcxMTA4OTY0Nw== simonw 9599 2020-10-17T22:43:13Z 2020-10-17T22:43:13Z MEMBER

Since my personal Dogsheep uses Datasette authentication, I'm going to need to pass through cookies. https://github.com/simonw/datasette/issues/1020 will solve that in the future but for now I need to solve it explicitly.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Switch to using datasette.client 723861683  
712266834 https://github.com/dogsheep/dogsheep-beta/issues/29#issuecomment-712266834 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/29 MDEyOklzc3VlQ29tbWVudDcxMjI2NjgzNA== simonw 9599 2020-10-19T16:01:23Z 2020-10-19T16:01:23Z MEMBER

Might just be a documented pattern for how to configure this in YAML templates.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add search highlighting snippets 724759588  
747029636 https://github.com/dogsheep/dogsheep-beta/issues/29#issuecomment-747029636 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/29 MDEyOklzc3VlQ29tbWVudDc0NzAyOTYzNg== simonw 9599 2020-12-16T21:14:03Z 2020-12-16T21:14:03Z MEMBER

I think I can do this as a cunning trick in display_sql. Consider this example query: https://til.simonwillison.net/tils?sql=select%0D%0A++path%2C%0D%0A++snippet%28til_fts%2C+-1%2C+%27b4de2a49c8%27%2C+%278c94a2ed4b%27%2C+%27...%27%2C+60%29+as+snippet%0D%0Afrom%0D%0A++til%0D%0A++join+til_fts+on+til.rowid+%3D+til_fts.rowid%0D%0Awhere%0D%0A++til_fts+match+escape_fts%28%3Aq%29%0D%0A++and+path+%3D+%27asgi_lifespan-test-httpx.md%27%0D%0A&q=pytest

select
  path,
  snippet(til_fts, -1, 'b4de2a49c8', '8c94a2ed4b', '...', 60) as snippet
from
  til
  join til_fts on til.rowid = til_fts.rowid
where
  til_fts match escape_fts(:q)
  and path = 'asgi_lifespan-test-httpx.md'

The and path = 'asgi_lifespan-test-httpx.md' bit means we only get back a specific document - but the snippet highlighting is applied to it.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add search highlighting snippets 724759588  
747030964 https://github.com/dogsheep/dogsheep-beta/issues/29#issuecomment-747030964 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/29 MDEyOklzc3VlQ29tbWVudDc0NzAzMDk2NA== simonw 9599 2020-12-16T21:14:54Z 2020-12-16T21:14:54Z MEMBER
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add search highlighting snippets 724759588  
747031608 https://github.com/dogsheep/dogsheep-beta/issues/29#issuecomment-747031608 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/29 MDEyOklzc3VlQ29tbWVudDc0NzAzMTYwOA== simonw 9599 2020-12-16T21:15:18Z 2020-12-16T21:15:18Z MEMBER

Should I pass any other details to the display_sql here as well?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add search highlighting snippets 724759588  
747034481 https://github.com/dogsheep/dogsheep-beta/issues/29#issuecomment-747034481 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/29 MDEyOklzc3VlQ29tbWVudDc0NzAzNDQ4MQ== simonw 9599 2020-12-16T21:17:05Z 2020-12-16T21:17:05Z MEMBER

I'm just going to add q for the moment.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add search highlighting snippets 724759588  
684250044 https://github.com/dogsheep/dogsheep-beta/issues/3#issuecomment-684250044 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/3 MDEyOklzc3VlQ29tbWVudDY4NDI1MDA0NA== simonw 9599 2020-09-01T05:01:09Z 2020-09-01T05:01:23Z MEMBER

Maybe this starts out as a custom templated canned query.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Datasette plugin to provide custom page for running faceted, ranked searches 689810340  
685961809 https://github.com/dogsheep/dogsheep-beta/issues/3#issuecomment-685961809 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/3 MDEyOklzc3VlQ29tbWVudDY4NTk2MTgwOQ== simonw 9599 2020-09-02T19:54:24Z 2020-09-02T19:54:24Z MEMBER
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Datasette plugin to provide custom page for running faceted, ranked searches 689810340  
686689612 https://github.com/dogsheep/dogsheep-beta/issues/3#issuecomment-686689612 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/3 MDEyOklzc3VlQ29tbWVudDY4NjY4OTYxMg== simonw 9599 2020-09-03T18:44:20Z 2020-09-03T18:44:20Z MEMBER

Facets are now displayed but selecting them doesn't work yet.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Datasette plugin to provide custom page for running faceted, ranked searches 689810340  
748426501 https://github.com/dogsheep/dogsheep-beta/issues/31#issuecomment-748426501 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/31 MDEyOklzc3VlQ29tbWVudDc0ODQyNjUwMQ== simonw 9599 2020-12-19T06:12:22Z 2020-12-19T06:12:22Z MEMBER

I deliberately added support for advanced FTS in https://github.com/dogsheep/dogsheep-beta/commit/cbb2491b85d7ff416d6d429b60109e6c2d6d50b9 for #13 but that's the cause of this bug.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Searching for "github-to-sqlite" throws an error 771316301  
748426581 https://github.com/dogsheep/dogsheep-beta/issues/31#issuecomment-748426581 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/31 MDEyOklzc3VlQ29tbWVudDc0ODQyNjU4MQ== simonw 9599 2020-12-19T06:13:17Z 2020-12-19T06:13:17Z MEMBER

One fix for this could be to try running the raw query, but if it throws an error run it again with the query escaped.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Searching for "github-to-sqlite" throws an error 771316301  
748426663 https://github.com/dogsheep/dogsheep-beta/issues/31#issuecomment-748426663 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/31 MDEyOklzc3VlQ29tbWVudDc0ODQyNjY2Mw== simonw 9599 2020-12-19T06:14:06Z 2020-12-19T06:14:06Z MEMBER
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Searching for "github-to-sqlite" throws an error 771316301  
748426877 https://github.com/dogsheep/dogsheep-beta/issues/31#issuecomment-748426877 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/31 MDEyOklzc3VlQ29tbWVudDc0ODQyNjg3Nw== simonw 9599 2020-12-19T06:16:11Z 2020-12-19T06:16:11Z MEMBER

Here's why:

if "fts5" in str(e):

But the error being raised here is:

sqlite3.OperationalError: no such column: to

I'm going to attempt the escaped on on every error.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Searching for "github-to-sqlite" throws an error 771316301  
684395444 https://github.com/dogsheep/dogsheep-beta/issues/4#issuecomment-684395444 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/4 MDEyOklzc3VlQ29tbWVudDY4NDM5NTQ0NA== simonw 9599 2020-09-01T06:00:03Z 2020-09-01T06:00:03Z MEMBER

I ran sqlite-utils optimize beta.db against my test DB and the size reduced from 183M to 176M - and a 450ms search ran in 359ms. So not a huge improvement but still worthwhile.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Optimize the FTS table 689839399  
686689366 https://github.com/dogsheep/dogsheep-beta/issues/5#issuecomment-686689366 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/5 MDEyOklzc3VlQ29tbWVudDY4NjY4OTM2Ng== simonw 9599 2020-09-03T18:43:50Z 2020-09-03T18:43:50Z MEMBER

No longer needed thanks to #9

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add a context column that's not searchable 689847361  
685895540 https://github.com/dogsheep/dogsheep-beta/issues/7#issuecomment-685895540 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/7 MDEyOklzc3VlQ29tbWVudDY4NTg5NTU0MA== simonw 9599 2020-09-02T17:46:44Z 2020-09-02T17:46:44Z MEMBER

Some opet questions about this:

  • Should I restrict to two exclusive categories here, or should I have a generic category mechanism that can be expanded to more than two?
  • Should an item be able to exist in more than one category? Do I want to be able to mark an indexed item as both by-me and liked-by-me for example? This question is more interesting if the number of categories is greater than two.
  • How should this be modeled? Single column, multiple boolean columns, JSON array, m2m against separate table?
  • What's the best way to make this performant
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Mechanism for differentiating between "by me" and "liked by me" 691265198  
685962280 https://github.com/dogsheep/dogsheep-beta/issues/7#issuecomment-685962280 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/7 MDEyOklzc3VlQ29tbWVudDY4NTk2MjI4MA== simonw 9599 2020-09-02T19:55:26Z 2020-09-02T19:59:58Z MEMBER

Relevant: https://charlesleifer.com/blog/a-tour-of-tagging-schemas-many-to-many-bitmaps-and-more/

SQLite supports bitwise operators Binary AND (&) and Binary OR (|) - I could try those. Not sure how they interact with indexes though.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Mechanism for differentiating between "by me" and "liked by me" 691265198  
685965516 https://github.com/dogsheep/dogsheep-beta/issues/7#issuecomment-685965516 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/7 MDEyOklzc3VlQ29tbWVudDY4NTk2NTUxNg== simonw 9599 2020-09-02T20:01:54Z 2020-09-02T20:01:54Z MEMBER

Relevant post: https://sqlite.org/forum/forumpost/9f06fedaa5 - drh says:

Indexes are one-to-one. There is one entry in the index for each row in the table.

You are asking for an index that is many-to-one - multiple index entries for each table row.

A Full-Text Index is basically a many-to-one index. So if all of your array entries really are words, you could probably get this to work using a Full-Text Index.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Mechanism for differentiating between "by me" and "liked by me" 691265198  
685966361 https://github.com/dogsheep/dogsheep-beta/issues/7#issuecomment-685966361 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/7 MDEyOklzc3VlQ29tbWVudDY4NTk2NjM2MQ== simonw 9599 2020-09-02T20:03:29Z 2020-09-02T20:03:41Z MEMBER

I'm going to implement the first version of this as an indexed integer category column which has 1 for "about me" and 2 for "liked by me" - and space for other category numerals in the future, albeit a row can only belong to one category.

I'll think about a full tagging system separately.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Mechanism for differentiating between "by me" and "liked by me" 691265198  
685966707 https://github.com/dogsheep/dogsheep-beta/issues/7#issuecomment-685966707 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/7 MDEyOklzc3VlQ29tbWVudDY4NTk2NjcwNw== simonw 9599 2020-09-02T20:04:08Z 2020-09-02T20:04:08Z MEMBER

I'll make category a foreign key to a categories table so Datasette can automatically show the name column.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Mechanism for differentiating between "by me" and "liked by me" 691265198  
685970384 https://github.com/dogsheep/dogsheep-beta/issues/7#issuecomment-685970384 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/7 MDEyOklzc3VlQ29tbWVudDY4NTk3MDM4NA== simonw 9599 2020-09-02T20:11:41Z 2020-09-02T20:11:59Z MEMBER

Default categories:

  • 1 = created
  • 2 = saved
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Mechanism for differentiating between "by me" and "liked by me" 691265198  
685960072 https://github.com/dogsheep/dogsheep-beta/issues/8#issuecomment-685960072 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/8 MDEyOklzc3VlQ29tbWVudDY4NTk2MDA3Mg== simonw 9599 2020-09-02T19:50:47Z 2020-09-02T19:50:47Z MEMBER

This doesn't actually help, because the Datasette table view page doesn't then support adding the where search_index_fts match :query bit.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Create a view for running faceted searches 691369691  
686153967 https://github.com/dogsheep/dogsheep-beta/issues/9#issuecomment-686153967 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/9 MDEyOklzc3VlQ29tbWVudDY4NjE1Mzk2Nw== simonw 9599 2020-09-03T00:17:16Z 2020-09-03T00:17:55Z MEMBER

Maybe I can take advantage of https://sqlite.org/np1queryprob.html here - I could define a SQL query for fetching the "display" version of each item, and include a Jinja template fragment in the configuration as well. Maybe something like this:

photos.db:
    photos_with_apple_metadata:
        sql: |-
            select
              sha256 as key,
              'Photo in ' || coalesce(place_city, 'unknown') as title,
              (
                select
                  group_concat(normalized_string, ' ')
                from
                  labels
                where
                  labels.uuid = photos_with_apple_metadata.uuid
              ) as search_1,
              date as timestamp,
              1 as category
            from
              photos_with_apple_metadata
        display_sql: |-
            select
              sha256, place_city, date
            from photos_with_apple_metadata
            where sha256 = :key
        display: |-
          <img src="https://photos.simonwillison.net/i/{{ display.sha256 }}.jpeg?w=600">
          <p>Taken in {{ display.place_city }} on {{ display.date }}</p>
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Mechanism for defining custom display of results 691521965  
686154486 https://github.com/dogsheep/dogsheep-beta/issues/9#issuecomment-686154486 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/9 MDEyOklzc3VlQ29tbWVudDY4NjE1NDQ4Ng== simonw 9599 2020-09-03T00:18:54Z 2020-09-03T00:18:54Z MEMBER

display_sql could be optional. If it's not defined, a row object is passed to the template which is the row that's stored in search_index. If display_sql IS defined then it's executed and the result is made available as a display object in addition to the row object.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Mechanism for defining custom display of results 691521965  
686154627 https://github.com/dogsheep/dogsheep-beta/issues/9#issuecomment-686154627 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/9 MDEyOklzc3VlQ29tbWVudDY4NjE1NDYyNw== simonw 9599 2020-09-03T00:19:22Z 2020-09-03T00:19:22Z MEMBER

If this performs well enough (100 displayed items will be 100 extra display_sql calls) then I'll go with this as the design for the feature.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Mechanism for defining custom display of results 691521965  
686158454 https://github.com/dogsheep/dogsheep-beta/issues/9#issuecomment-686158454 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/9 MDEyOklzc3VlQ29tbWVudDY4NjE1ODQ1NA== simonw 9599 2020-09-03T00:32:42Z 2020-09-03T00:32:42Z MEMBER

If this turns out to be too inefficient I could add a display text column to the search_index table which is designed to be populated with arbitrary JSON by the indexing query, which can then be used to render the template fragment.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Mechanism for defining custom display of results 691521965  
686163754 https://github.com/dogsheep/dogsheep-beta/issues/9#issuecomment-686163754 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/9 MDEyOklzc3VlQ29tbWVudDY4NjE2Mzc1NA== simonw 9599 2020-09-03T00:46:21Z 2020-09-03T00:46:21Z MEMBER

Challenge: the dogsheep-beta.yml configuration file that is passed to the dogsheep-beta index command needs to also be made available to Datasette itself, so that it can read the configuration.

Let's say it can either be duplicated in the plugins configuration block of the metadata.yml OR you can do this in metadata.yml:

plugins:
  dogsheep-beta:
    config_file: dogsheep-beta.yml
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Mechanism for defining custom display of results 691521965  
686688963 https://github.com/dogsheep/dogsheep-beta/issues/9#issuecomment-686688963 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/9 MDEyOklzc3VlQ29tbWVudDY4NjY4ODk2Mw== simonw 9599 2020-09-03T18:42:59Z 2020-09-03T18:42:59Z MEMBER

I'm pleased with how this works now.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Mechanism for defining custom display of results 691521965  
686689122 https://github.com/dogsheep/dogsheep-beta/issues/9#issuecomment-686689122 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/9 MDEyOklzc3VlQ29tbWVudDY4NjY4OTEyMg== simonw 9599 2020-09-03T18:43:20Z 2020-09-03T18:43:20Z MEMBER

Needs documentation.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Mechanism for defining custom display of results 691521965  
686767208 https://github.com/dogsheep/dogsheep-beta/issues/9#issuecomment-686767208 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/9 MDEyOklzc3VlQ29tbWVudDY4Njc2NzIwOA== simonw 9599 2020-09-03T21:12:14Z 2020-09-03T21:12:14Z MEMBER
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Mechanism for defining custom display of results 691521965  
623193947 https://github.com/dogsheep/dogsheep-photos/issues/1#issuecomment-623193947 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/1 MDEyOklzc3VlQ29tbWVudDYyMzE5Mzk0Nw== simonw 9599 2020-05-03T22:36:17Z 2020-05-03T22:36:17Z MEMBER

I'm going to use osxphotos for this.

Since I've already got code to upload photos and insert them into a table based on their sha256 hash, my first go at this will be to import data using the tool and foreign-key it to the sha256 hash in the existing table.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Import photo metadata from Apple Photos into SQLite 602533300  
623195197 https://github.com/dogsheep/dogsheep-photos/issues/1#issuecomment-623195197 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/1 MDEyOklzc3VlQ29tbWVudDYyMzE5NTE5Nw== simonw 9599 2020-05-03T22:44:33Z 2020-05-03T22:44:33Z MEMBER

Command will be this:

$ photos-to-sqlite apple-photos photos.db

This will populate a apple_photos table with the data imported by the osxphotos library, plus the calculated sha256.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Import photo metadata from Apple Photos into SQLite 602533300  
623198653 https://github.com/dogsheep/dogsheep-photos/issues/1#issuecomment-623198653 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/1 MDEyOklzc3VlQ29tbWVudDYyMzE5ODY1Mw== simonw 9599 2020-05-03T23:09:57Z 2020-05-03T23:09:57Z MEMBER

For locations: I'll add place_x columns for all of these:

(Pdb) photo.place.address._asdict()
{'street': None, 'sub_locality': None, 'city': 'Loreto', 'sub_administrative_area': 'Loreto', 'state_province': 'BCS', 'postal_code': None, 'country': 'Mexico', 'iso_country_code': 'MX'}
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Import photo metadata from Apple Photos into SQLite 602533300  
623198986 https://github.com/dogsheep/dogsheep-photos/issues/1#issuecomment-623198986 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/1 MDEyOklzc3VlQ29tbWVudDYyMzE5ODk4Ng== simonw 9599 2020-05-03T23:12:31Z 2020-05-03T23:12:46Z MEMBER

To get the taken date in UTC:

from datetime import timezone
(Pdb) photo.date.astimezone(timezone.utc).isoformat()
'2018-02-13T20:21:31.620000+00:00'
(Pdb) photo.date.astimezone(timezone.utc).isoformat().split(".")
['2018-02-13T20:21:31', '620000+00:00']
(Pdb) photo.date.astimezone(timezone.utc).isoformat().split(".")[0]
'2018-02-13T20:21:31'
(Pdb) photo.date.astimezone(timezone.utc).isoformat().split(".")[0] + "+00:00"
'2018-02-13T20:21:31+00:00'
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Import photo metadata from Apple Photos into SQLite 602533300  
623199214 https://github.com/dogsheep/dogsheep-photos/issues/1#issuecomment-623199214 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/1 MDEyOklzc3VlQ29tbWVudDYyMzE5OTIxNA== simonw 9599 2020-05-03T23:14:08Z 2020-05-03T23:14:08Z MEMBER

Albums have UUIDs:

(Pdb) photo.album_info[0].__dict__
{'_uuid': '17816791-ABF3-447B-942C-9FA8065EEBBA', '_db': osxphotos.PhotosDB(dbfile='/Users/simon/Pictures/Photos Library.photoslibrary/database/photos.db'), '_title': 'Geotaggable Photos geotagged'}
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Import photo metadata from Apple Photos into SQLite 602533300  
623199701 https://github.com/dogsheep/dogsheep-photos/issues/1#issuecomment-623199701 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/1 MDEyOklzc3VlQ29tbWVudDYyMzE5OTcwMQ== simonw 9599 2020-05-03T23:17:38Z 2020-05-03T23:17:38Z MEMBER

Record burst_uuid as a column:

(Pdb) with_bursts[0]._info["burstUUID"]
'703FAA23-57BF-40B4-8A33-D9CEB143391B'
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Import photo metadata from Apple Photos into SQLite 602533300  
623199750 https://github.com/dogsheep/dogsheep-photos/issues/1#issuecomment-623199750 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/1 MDEyOklzc3VlQ29tbWVudDYyMzE5OTc1MA== simonw 9599 2020-05-03T23:17:58Z 2020-05-03T23:17:58Z MEMBER

Reading this source code is really useful for figuring out how to store a photo in a DB table: https://github.com/RhetTbull/osxphotos/blob/7444b6d173918a3ad2a07aefce5ecf054786c787/osxphotos/photoinfo.py

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Import photo metadata from Apple Photos into SQLite 602533300  
623232984 https://github.com/dogsheep/dogsheep-photos/issues/1#issuecomment-623232984 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/1 MDEyOklzc3VlQ29tbWVudDYyMzIzMjk4NA== simonw 9599 2020-05-04T02:41:32Z 2020-05-04T02:41:32Z MEMBER

Needs documentation.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Import photo metadata from Apple Photos into SQLite 602533300  
618796564 https://github.com/dogsheep/dogsheep-photos/issues/12#issuecomment-618796564 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/12 MDEyOklzc3VlQ29tbWVudDYxODc5NjU2NA== simonw 9599 2020-04-24T04:35:25Z 2020-04-24T04:35:25Z MEMBER
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
If less than 500MB, show size in MB not GB 606033104  
620273692 https://github.com/dogsheep/dogsheep-photos/issues/13#issuecomment-620273692 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/13 MDEyOklzc3VlQ29tbWVudDYyMDI3MzY5Mg== simonw 9599 2020-04-27T22:42:50Z 2020-04-27T22:42:50Z MEMBER
>>> def ext_counts(directory):
...     counts = {}
...     for path in pathlib.Path(directory).glob("**/*"):
...         ext = path.suffix
...         counts[ext] = counts.get(ext, 0) + 1
...     return counts
... 
>>> 
>>> ext_counts("/Users/simon/Pictures/Photos Library.photoslibrary/originals")
{'': 16, '.heic': 15478, '.jpeg': 21691, '.mov': 946, '.png': 2262, '.gif': 38, '.mp4': 116, '.aae': 2}
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Also upload movie files 607888367  
620309185 https://github.com/dogsheep/dogsheep-photos/issues/13#issuecomment-620309185 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/13 MDEyOklzc3VlQ29tbWVudDYyMDMwOTE4NQ== simonw 9599 2020-04-28T00:39:45Z 2020-04-28T00:39:45Z MEMBER

I'm going to leave this until I have the mechanism for associating a live photo video with the photo.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Also upload movie files 607888367  
620769348 https://github.com/dogsheep/dogsheep-photos/issues/14#issuecomment-620769348 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/14 MDEyOklzc3VlQ29tbWVudDYyMDc2OTM0OA== simonw 9599 2020-04-28T18:09:21Z 2020-04-28T18:09:21Z MEMBER

Pricing is pretty good: free for first 1,000 calls per month, then $1.50 per thousand after that.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Annotate photos using the Google Cloud Vision API 608512747  
620771067 https://github.com/dogsheep/dogsheep-photos/issues/14#issuecomment-620771067 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/14 MDEyOklzc3VlQ29tbWVudDYyMDc3MTA2Nw== simonw 9599 2020-04-28T18:12:34Z 2020-04-28T18:15:38Z MEMBER

Python library docs: https://googleapis.dev/python/vision/latest/index.html

I'm creating a new project for this called simonwillison-photos: https://console.cloud.google.com/projectcreate

https://console.cloud.google.com/home/dashboard?project=simonwillison-photos

Then I enabled the Vision API. The direct link to https://console.cloud.google.com/flows/enableapi?apiid=vision-json.googleapis.com which they provided in the docs didn't work - it gave me a "You don't have sufficient permissions to use the requested API" error - but starting at the "Enable APIs" page and searching for it worked fine.

I created a new service account as an "owner" of that project: https://console.cloud.google.com/apis/credentials/serviceaccountkey (and complained about it on Twitter and through their feedback form)

pip install google-cloud-vision

from google.cloud import vision
client = vision.ImageAnnotatorClient.from_service_account_file("simonwillison-photos-18c570b301fe.json")
# Photo of a lemur
response = client.annotate_image(
    {
        "image": {
            "source": {
                "image_uri": "https://photos.simonwillison.net/i/1b3414ee9ade67ce04ade9042e6d4b433d1e523c9a16af17f490e2c0a619755b.jpeg"
            }
        },
        "features": [
            {"type": vision.enums.Feature.Type.IMAGE_PROPERTIES},
            {"type": vision.enums.Feature.Type.OBJECT_LOCALIZATION},
            {"type": vision.enums.Feature.Type.LABEL_DETECTION},
        ],
    }
)
response

Output is:

label_annotations {
  mid: "/m/09686"
  description: "Vertebrate"
  score: 0.9851104021072388
  topicality: 0.9851104021072388
}
label_annotations {
  mid: "/m/04rky"
  description: "Mammal"
  score: 0.975814163684845
  topicality: 0.975814163684845
}
label_annotations {
  mid: "/m/01280g"
  description: "Wildlife"
  score: 0.8973650336265564
  topicality: 0.8973650336265564
}
label_annotations {
  mid: "/m/02f9pk"
  description: "Lemur"
  score: 0.8270352482795715
  topicality: 0.8270352482795715
}
label_annotations {
  mid: "/m/0fbf1m"
  description: "Terrestrial animal"
  score: 0.7443860769271851
  topicality: 0.7443860769271851
}
label_annotations {
  mid: "/m/06z_nw"
  description: "Tail"
  score: 0.6934166550636292
  topicality: 0.6934166550636292
}
label_annotations {
  mid: "/m/0b5gs"
  description: "Branch"
  score: 0.6203985214233398
  topicality: 0.6203985214233398
}
label_annotations {
  mid: "/m/05s2s"
  description: "Plant"
  score: 0.585474967956543
  topicality: 0.585474967956543
}
label_annotations {
  mid: "/m/089v3"
  description: "Zoo"
  score: 0.5488107800483704
  topicality: 0.5488107800483704
}
label_annotations {
  mid: "/m/02tcwp"
  description: "Trunk"
  score: 0.5200017690658569
  topicality: 0.5200017690658569
}
image_properties_annotation {
  dominant_colors {
    colors {
      color {
        red: 172.0
        green: 146.0
        blue: 116.0
      }
      score: 0.24523821473121643
      pixel_fraction: 0.027533333748579025
    }
    colors {
      color {
        red: 54.0
        green: 50.0
        blue: 42.0
      }
      score: 0.10449723154306412
      pixel_fraction: 0.12893334031105042
    }
    colors {
      color {
        red: 141.0
        green: 121.0
        blue: 97.0
      }
      score: 0.1391485631465912
      pixel_fraction: 0.039133332669734955
    }
    colors {
      color {
        red: 28.0
        green: 25.0
        blue: 20.0
      }
      score: 0.08589499443769455
      pixel_fraction: 0.11506666988134384
    }
    colors {
      color {
        red: 87.0
        green: 82.0
        blue: 74.0
      }
      score: 0.0845794677734375
      pixel_fraction: 0.16113333404064178
    }
    colors {
      color {
        red: 121.0
        green: 117.0
        blue: 108.0
      }
      score: 0.05901569500565529
      pixel_fraction: 0.13379999995231628
    }
    colors {
      color {
        red: 94.0
        green: 83.0
        blue: 66.0
      }
      score: 0.049011144787073135
      pixel_fraction: 0.03946666792035103
    }
    colors {
      color {
        red: 155.0
        green: 117.0
        blue: 90.0
      }
      score: 0.04164913296699524
      pixel_fraction: 0.0023333332501351833
    }
    colors {
      color {
        red: 178.0
        green: 143.0
        blue: 102.0
      }
      score: 0.02993861958384514
      pixel_fraction: 0.0012666666880249977
    }
    colors {
      color {
        red: 61.0
        green: 51.0
        blue: 35.0
      }
      score: 0.027391711249947548
      pixel_fraction: 0.01953333243727684
    }
  }
}
crop_hints_annotation {
  crop_hints {
    bounding_poly {
      vertices {
        x: 2073
      }
      vertices {
        x: 4008
      }
      vertices {
        x: 4008
        y: 3455
      }
      vertices {
        x: 2073
        y: 3455
      }
    }
    confidence: 0.65625
    importance_fraction: 0.746666669845581
  }
}
localized_object_annotations {
  mid: "/m/0jbk"
  name: "Animal"
  score: 0.7008256912231445
  bounding_poly {
    normalized_vertices {
      x: 0.0390297956764698
      y: 0.26235100626945496
    }
    normalized_vertices {
      x: 0.8466796875
      y: 0.26235100626945496
    }
    normalized_vertices {
      x: 0.8466796875
      y: 0.9386426210403442
    }
    normalized_vertices {
      x: 0.0390297956764698
      y: 0.9386426210403442
    }
  }
}
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Annotate photos using the Google Cloud Vision API 608512747  
620771698 https://github.com/dogsheep/dogsheep-photos/issues/14#issuecomment-620771698 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/14 MDEyOklzc3VlQ29tbWVudDYyMDc3MTY5OA== simonw 9599 2020-04-28T18:13:48Z 2020-04-28T18:13:48Z MEMBER

For face detection:

    {"type": vision.enums.Feature.Type.Type.FACE_DETECTION}

For OCR:

    {"type": vision.enums.Feature.Type.DOCUMENT_TEXT_DETECTION}
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Annotate photos using the Google Cloud Vision API 608512747  
620772190 https://github.com/dogsheep/dogsheep-photos/issues/14#issuecomment-620772190 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/14 MDEyOklzc3VlQ29tbWVudDYyMDc3MjE5MA== simonw 9599 2020-04-28T18:14:43Z 2020-04-28T18:14:43Z MEMBER

Database schema for this will require some thought. Just dumping the output into a JSON column isn't going to be flexible enough - I want to be able to FTS against labels and OCR text, and potentially query against other characteristics too.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Annotate photos using the Google Cloud Vision API 608512747  
620774507 https://github.com/dogsheep/dogsheep-photos/issues/14#issuecomment-620774507 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/14 MDEyOklzc3VlQ29tbWVudDYyMDc3NDUwNw== simonw 9599 2020-04-28T18:19:06Z 2020-04-28T18:19:06Z MEMBER

The default timeout is a bit aggressive and sometimes failed for me if my resizing proxy took too long to fetch and resize the image.

client.annotate_image(..., timeout=3.0) may be worth trying.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Annotate photos using the Google Cloud Vision API 608512747  
623723026 https://github.com/dogsheep/dogsheep-photos/issues/15#issuecomment-623723026 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/15 MDEyOklzc3VlQ29tbWVudDYyMzcyMzAyNg== simonw 9599 2020-05-04T21:41:30Z 2020-05-04T21:41:30Z MEMBER

I'm going to put these in a table called apple_photos_scores - I'll also pull in the following columns from the ZGENERICASSET table:

  • ZOVERALLAESTHETICSCORE
  • ZCURATIONSCORE
  • ZHIGHLIGHTVISIBILITYSCORE
  • ZPROMOTIONSCORE
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Expose scores from ZCOMPUTEDASSETATTRIBUTES 612151767  
623723687 https://github.com/dogsheep/dogsheep-photos/issues/15#issuecomment-623723687 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/15 MDEyOklzc3VlQ29tbWVudDYyMzcyMzY4Nw== simonw 9599 2020-05-04T21:43:06Z 2020-05-04T21:43:06Z MEMBER

It looks like I can map the photos I'm importing to these tables using the ZUUID column on ZGENERICASSET to get a Z_PK which then maps to the rows in ZGENERICASSET.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Expose scores from ZCOMPUTEDASSETATTRIBUTES 612151767  
623730934 https://github.com/dogsheep/dogsheep-photos/issues/15#issuecomment-623730934 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/15 MDEyOklzc3VlQ29tbWVudDYyMzczMDkzNA== simonw 9599 2020-05-04T22:00:38Z 2020-05-04T22:00:48Z MEMBER

Here's the query to create the new table:

create table apple_photos_scores as select
    ZGENERICASSET.ZUUID,
    ZGENERICASSET.ZOVERALLAESTHETICSCORE,
    ZGENERICASSET.ZCURATIONSCORE,
    ZGENERICASSET.ZPROMOTIONSCORE,
    ZGENERICASSET.ZHIGHLIGHTVISIBILITYSCORE,
    ZCOMPUTEDASSETATTRIBUTES.ZBEHAVIORALSCORE,
    ZCOMPUTEDASSETATTRIBUTES.ZFAILURESCORE,
    ZCOMPUTEDASSETATTRIBUTES.ZHARMONIOUSCOLORSCORE,
    ZCOMPUTEDASSETATTRIBUTES.ZIMMERSIVENESSSCORE,
    ZCOMPUTEDASSETATTRIBUTES.ZINTERACTIONSCORE,
    ZCOMPUTEDASSETATTRIBUTES.ZINTERESTINGSUBJECTSCORE,
    ZCOMPUTEDASSETATTRIBUTES.ZINTRUSIVEOBJECTPRESENCESCORE,
    ZCOMPUTEDASSETATTRIBUTES.ZLIVELYCOLORSCORE,
    ZCOMPUTEDASSETATTRIBUTES.ZLOWLIGHT,
    ZCOMPUTEDASSETATTRIBUTES.ZNOISESCORE,
    ZCOMPUTEDASSETATTRIBUTES.ZPLEASANTCAMERATILTSCORE,
    ZCOMPUTEDASSETATTRIBUTES.ZPLEASANTCOMPOSITIONSCORE,
    ZCOMPUTEDASSETATTRIBUTES.ZPLEASANTLIGHTINGSCORE,
    ZCOMPUTEDASSETATTRIBUTES.ZPLEASANTPATTERNSCORE,
    ZCOMPUTEDASSETATTRIBUTES.ZPLEASANTPERSPECTIVESCORE,
    ZCOMPUTEDASSETATTRIBUTES.ZPLEASANTPOSTPROCESSINGSCORE,
    ZCOMPUTEDASSETATTRIBUTES.ZPLEASANTREFLECTIONSSCORE,
    ZCOMPUTEDASSETATTRIBUTES.ZPLEASANTSYMMETRYSCORE,
    ZCOMPUTEDASSETATTRIBUTES.ZSHARPLYFOCUSEDSUBJECTSCORE,
    ZCOMPUTEDASSETATTRIBUTES.ZTASTEFULLYBLURREDSCORE,
    ZCOMPUTEDASSETATTRIBUTES.ZWELLCHOSENSUBJECTSCORE,
    ZCOMPUTEDASSETATTRIBUTES.ZWELLFRAMEDSUBJECTSCORE,
    ZCOMPUTEDASSETATTRIBUTES.ZWELLTIMEDSHOTSCORE
from
    attached.ZGENERICASSET
      join attached.ZCOMPUTEDASSETATTRIBUTES on
          attached.ZGENERICASSET.Z_PK = attached.ZCOMPUTEDASSETATTRIBUTES.Z_PK;
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Expose scores from ZCOMPUTEDASSETATTRIBUTES 612151767  
623739934 https://github.com/dogsheep/dogsheep-photos/issues/15#issuecomment-623739934 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/15 MDEyOklzc3VlQ29tbWVudDYyMzczOTkzNA== simonw 9599 2020-05-04T22:24:26Z 2020-05-04T22:24:26Z MEMBER

Twitter thread with some examples of photos that are coming up from queries against these scores: https://twitter.com/simonw/status/1257434670750408705

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Expose scores from ZCOMPUTEDASSETATTRIBUTES 612151767  
748436115 https://github.com/dogsheep/dogsheep-photos/issues/15#issuecomment-748436115 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/15 MDEyOklzc3VlQ29tbWVudDc0ODQzNjExNQ== nickvazz 8573886 2020-12-19T07:43:38Z 2020-12-19T07:47:36Z NONE

Hey Simon! I really enjoy datasette so far, just started trying it out today following your iPhone photos example.

I am not sure if you had run into this or not, but it seems like they might have changed one of the column names from
ZGENERICASSET to ZASSET. Should I open a PR?

Would change:
- here
- here
- here

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Expose scores from ZCOMPUTEDASSETATTRIBUTES 612151767  
748436779 https://github.com/dogsheep/dogsheep-photos/issues/15#issuecomment-748436779 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/15 MDEyOklzc3VlQ29tbWVudDc0ODQzNjc3OQ== RhetTbull 41546558 2020-12-19T07:49:00Z 2020-12-19T07:49:00Z CONTRIBUTOR

@nickvazz ZGENERICASSET changed to ZASSET in Big Sur. Here's a list of other changes to the schema in Big Sur: https://github.com/RhetTbull/osxphotos/wiki/Changes-in-Photos-6---Big-Sur

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Expose scores from ZCOMPUTEDASSETATTRIBUTES 612151767  
748562288 https://github.com/dogsheep/dogsheep-photos/issues/15#issuecomment-748562288 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/15 MDEyOklzc3VlQ29tbWVudDc0ODU2MjI4OA== RhetTbull 41546558 2020-12-20T04:44:22Z 2020-12-20T04:44:22Z CONTRIBUTOR

@nickvazz @simonw I opened a PR that replaces the SQL for ZCOMPUTEDASSETATTRIBUTES to use osxphotos which now exposes all this data and has been updated for Big Sur. I did regression tests to confirm the extracted data is identical, with one exception which should not affect operation: the old code pulled data from ZCOMPUTEDASSETATTRIBUTES for missing photos while the main loop ignores missing photos and does not add them to apple_photos. The new code does not add rows to the apple_photos_scores table for missing photos.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Expose scores from ZCOMPUTEDASSETATTRIBUTES 612151767  
623805823 https://github.com/dogsheep/dogsheep-photos/issues/16#issuecomment-623805823 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/16 MDEyOklzc3VlQ29tbWVudDYyMzgwNTgyMw== simonw 9599 2020-05-05T02:45:56Z 2020-05-05T02:45:56Z MEMBER

I filed an issue with osxphotos about this here: https://github.com/RhetTbull/osxphotos/issues/121

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Import machine-learning detected labels (dog, llama etc) from Apple Photos 612287234  
623806085 https://github.com/dogsheep/dogsheep-photos/issues/16#issuecomment-623806085 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/16 MDEyOklzc3VlQ29tbWVudDYyMzgwNjA4NQ== simonw 9599 2020-05-05T02:47:18Z 2020-05-05T02:47:18Z MEMBER

In https://github.com/RhetTbull/osxphotos/issues/121#issuecomment-623249263 Rhet Turnbull spotted a table called ZSCENEIDENTIFIER which looked like it might have the right data, but the columns in it aren't particularly helpful:

Z_PK,Z_ENT,Z_OPT,ZSCENEIDENTIFIER,ZASSETATTRIBUTES,ZCONFIDENCE
8,49,1,731,5,0.11834716796875
9,49,1,684,6,0.0233648251742125
10,49,1,1702,1,0.026153564453125

I love the look of those confidence scores, but what do the numbers mean?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Import machine-learning detected labels (dog, llama etc) from Apple Photos 612287234  
623806533 https://github.com/dogsheep/dogsheep-photos/issues/16#issuecomment-623806533 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/16 MDEyOklzc3VlQ29tbWVudDYyMzgwNjUzMw== simonw 9599 2020-05-05T02:50:16Z 2020-05-05T02:50:16Z MEMBER

I figured there must be a separate database that Photos uses to store the text of the identified labels.

I used "Open Files and Ports" in Activity Monitor against the Photos app to try and spot candidates... and found /Users/simon/Pictures/Photos Library.photoslibrary/database/search/psi.sqlite - a 53MB SQLite database file.

https://user-images.githubusercontent.com/9599/81031213-61ea0800-8e40-11ea-8237-cfce4a5128e0.png">

Here's the schema of that file:

$ sqlite3 psi.sqlite .schema
CREATE TABLE word_embedding(word TEXT, extended_word TEXT, score DOUBLE);
CREATE INDEX word_embedding_index ON word_embedding(word);
CREATE VIRTUAL TABLE word_embedding_prefix USING fts5(extended_word)
/* word_embedding_prefix(extended_word) */;
CREATE TABLE IF NOT EXISTS 'word_embedding_prefix_data'(id INTEGER PRIMARY KEY, block BLOB);
CREATE TABLE IF NOT EXISTS 'word_embedding_prefix_idx'(segid, term, pgno, PRIMARY KEY(segid, term)) WITHOUT ROWID;
CREATE TABLE IF NOT EXISTS 'word_embedding_prefix_content'(id INTEGER PRIMARY KEY, c0);
CREATE TABLE IF NOT EXISTS 'word_embedding_prefix_docsize'(id INTEGER PRIMARY KEY, sz BLOB);
CREATE TABLE IF NOT EXISTS 'word_embedding_prefix_config'(k PRIMARY KEY, v) WITHOUT ROWID;
CREATE TABLE groups(category INT2, owning_groupid INT, content_string TEXT, normalized_string TEXT, lookup_identifier TEXT, token_ranges_0 INT8, token_ranges_1 INT8, UNIQUE(category, owning_groupid, content_string, lookup_identifier, token_ranges_0, token_ranges_1));
CREATE TABLE assets(uuid_0 INT, uuid_1 INT, creationDate INT, UNIQUE(uuid_0, uuid_1));
CREATE TABLE ga(groupid INT, assetid INT, PRIMARY KEY(groupid, assetid));
CREATE TABLE collections(uuid_0 INT, uuid_1 INT, startDate INT, endDate INT, title TEXT, subtitle TEXT, keyAssetUUID_0 INT, keyAssetUUID_1 INT, typeAndNumberOfAssets INT32, sortDate DOUBLE, UNIQUE(uuid_0, uuid_1));
CREATE TABLE gc(groupid INT, collectionid INT, PRIMARY KEY(groupid, collectionid));
CREATE VIRTUAL TABLE prefix USING fts5(content='groups', normalized_string, category UNINDEXED, tokenize = 'PSITokenizer');
CREATE TABLE IF NOT EXISTS 'prefix_data'(id INTEGER PRIMARY KEY, block BLOB);
CREATE TABLE IF NOT EXISTS 'prefix_idx'(segid, term, pgno, PRIMARY KEY(segid, term)) WITHOUT ROWID;
CREATE TABLE IF NOT EXISTS 'prefix_docsize'(id INTEGER PRIMARY KEY, sz BLOB);
CREATE TABLE IF NOT EXISTS 'prefix_config'(k PRIMARY KEY, v) WITHOUT ROWID;
CREATE TABLE lookup(identifier TEXT PRIMARY KEY, category INT2);
CREATE TRIGGER trigger_groups_insert AFTER INSERT ON groups BEGIN INSERT INTO prefix(rowid, normalized_string, category) VALUES (new.rowid, new.normalized_string, new.category); END;
CREATE TRIGGER trigger_groups_delete AFTER DELETE ON groups BEGIN INSERT INTO prefix(prefix, rowid, normalized_string, category) VALUES('delete', old.rowid, old.normalized_string, old.category); END;
CREATE INDEX group_pk ON groups(category, content_string, normalized_string, lookup_identifier);
CREATE INDEX asset_pk ON assets(uuid_0, uuid_1);
CREATE INDEX ga_assetid ON ga(assetid, groupid);
CREATE INDEX collection_pk ON collections(uuid_0, uuid_1);
CREATE INDEX gc_collectionid ON gc(collectionid);
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Import machine-learning detected labels (dog, llama etc) from Apple Photos 612287234  
623806687 https://github.com/dogsheep/dogsheep-photos/issues/16#issuecomment-623806687 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/16 MDEyOklzc3VlQ29tbWVudDYyMzgwNjY4Nw== simonw 9599 2020-05-05T02:51:16Z 2020-05-05T02:51:16Z MEMBER

Running datasette against it directly doesn't work:

simon@Simons-MacBook-Pro search % datasette psi.sqlite
Serve! files=('psi.sqlite',) (immutables=()) on port 8001
Usage: datasette serve [OPTIONS] [FILES]...

Error: Connection to psi.sqlite failed check: no such tokenizer: PSITokenizer

Instead, I created a new SQLite database with a copy of some of the key tables, like this:

sqlite-utils rows psi.sqlite groups | sqlite-utils insert /tmp/search.db groups -
sqlite-utils rows psi.sqlite assets | sqlite-utils insert /tmp/search.db assets -
sqlite-utils rows psi.sqlite ga | sqlite-utils insert /tmp/search.db ga -
sqlite-utils rows psi.sqlite collections | sqlite-utils insert /tmp/search.db collections -
sqlite-utils rows psi.sqlite gc | sqlite-utils insert /tmp/search.db gc -
sqlite-utils rows psi.sqlite lookup | sqlite-utils insert /tmp/search.db lookup -
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Import machine-learning detected labels (dog, llama etc) from Apple Photos 612287234  
623807568 https://github.com/dogsheep/dogsheep-photos/issues/16#issuecomment-623807568 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/16 MDEyOklzc3VlQ29tbWVudDYyMzgwNzU2OA== simonw 9599 2020-05-05T02:56:06Z 2020-05-05T02:56:06Z MEMBER

I'm pretty sure this is what I'm after. The groups table has what looks like identified labels in the rows with category = 2025:

https://user-images.githubusercontent.com/9599/81031361-e0df4080-8e40-11ea-9060-6d850aa52140.png">

Then there's a ga table that maps groups to assets:

https://user-images.githubusercontent.com/9599/81031387-f48aa700-8e40-11ea-9a3d-da23903be928.png">

And an assets table which looks like it has one row for every one of my photos:

https://user-images.githubusercontent.com/9599/81031402-04a28680-8e41-11ea-8047-e9199d068563.png">

One major challenge: these UUIDs are split into two integer numbers, uuid_0 and uuid_1 - but the main photos database uses regular UUIDs like this:

I need to figure out how to match up these two different UUID representations. I asked on Twitter if anyone has any ideas: https://twitter.com/simonw/status/1257500689019703296

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Import machine-learning detected labels (dog, llama etc) from Apple Photos 612287234  
623811131 https://github.com/dogsheep/dogsheep-photos/issues/16#issuecomment-623811131 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/16 MDEyOklzc3VlQ29tbWVudDYyMzgxMTEzMQ== simonw 9599 2020-05-05T03:16:18Z 2020-05-05T03:16:18Z MEMBER

Here's how to convert two integers unto a UUID using Java. Not sure if it's the solution I need though (or how to do the same thing in Python):

https://repl.it/repls/EuphoricSomberClasslibrary

https://user-images.githubusercontent.com/9599/81032267-0d488c00-8e44-11ea-9be7-680eaccd1611.png">

import java.util.UUID;

class Main {
  public static void main(String[] args) {
    java.util.UUID uuid = new java.util.UUID(
      2544182952487526660L,
      -3640314103732024685L
    );
    System.out.println(
      uuid
    );
  }
}
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Import machine-learning detected labels (dog, llama etc) from Apple Photos 612287234  
623845014 https://github.com/dogsheep/dogsheep-photos/issues/16#issuecomment-623845014 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/16 MDEyOklzc3VlQ29tbWVudDYyMzg0NTAxNA== RhetTbull 41546558 2020-05-05T03:55:14Z 2020-05-05T03:56:24Z CONTRIBUTOR

I'm traveling w/o access to my Mac so can't help with any code right now. I suspected ZSCENEIDENTIFIER was a foreign key into one of these psi.sqlite tables. But looks like you're on to something connecting groups to assets. As for the UUID, I think there's two ints because each is 64-bits but UUIDs are 128-bits. Thus they need to be combined to get the 128 bit UUID. You might be able to use Apple's NSUUID, for example, by wrapping with pyObjC. Here's one example of using this in PyObjC's test suite. Interesting it's stored this way instead of a UUIDString as in Photos.sqlite. Perhaps it for faster indexing.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Import machine-learning detected labels (dog, llama etc) from Apple Photos 612287234  
623846880 https://github.com/dogsheep/dogsheep-photos/issues/16#issuecomment-623846880 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/16 MDEyOklzc3VlQ29tbWVudDYyMzg0Njg4MA== simonw 9599 2020-05-05T04:06:08Z 2020-05-05T04:06:08Z MEMBER

This function seems to convert them into UUIDs that match my photos:

def to_uuid(uuid_0, uuid_1):
    b = uuid_0.to_bytes(8, 'little', signed=True) + uuid_1.to_bytes(8, 'little', signed=True)
    return str(uuid.UUID(bytes=b)).upper()
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Import machine-learning detected labels (dog, llama etc) from Apple Photos 612287234  
623855841 https://github.com/dogsheep/dogsheep-photos/issues/16#issuecomment-623855841 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/16 MDEyOklzc3VlQ29tbWVudDYyMzg1NTg0MQ== simonw 9599 2020-05-05T04:54:28Z 2020-05-05T04:54:28Z MEMBER

Things were not matching up for me correctly:

https://user-images.githubusercontent.com/9599/81035923-ca8db080-8e51-11ea-95a7-6ee60bae7502.png">

I think that's because my import script didn't correctly import the existing rowid values.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Import machine-learning detected labels (dog, llama etc) from Apple Photos 612287234  
623855885 https://github.com/dogsheep/dogsheep-photos/issues/16#issuecomment-623855885 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/16 MDEyOklzc3VlQ29tbWVudDYyMzg1NTg4NQ== simonw 9599 2020-05-05T04:54:39Z 2020-05-05T04:54:53Z MEMBER

Trying this import mechanism instead:
sqlite3 /Users/simon/Pictures/Photos\ Library.photoslibrary/database/search/psi.sqlite .dump | grep -v 'CREATE INDEX' | grep -v 'CREATE TRIGGER' | grep -v 'CREATE VIRTUAL TABLE' | sqlite3 search.db

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Import machine-learning detected labels (dog, llama etc) from Apple Photos 612287234  
623857417 https://github.com/dogsheep/dogsheep-photos/issues/16#issuecomment-623857417 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/16 MDEyOklzc3VlQ29tbWVudDYyMzg1NzQxNw== simonw 9599 2020-05-05T05:01:47Z 2020-05-05T05:01:47Z MEMBER

Even that didn't work - it didn't copy across the rowid values. I'm pretty sure that's what's wrong here:

sqlite3 /Users/simon/Pictures/Photos\ Library.photoslibrary/database/search/psi.sqlite 'select rowid, uuid_0, uuid_1 from assets limit 10'                                              
1619605|-9205353363298198838|4814875488794983828
1641378|-9205348195631362269|390804289838822030
1634974|-9205331524553603243|-3834026796261633148
1619083|-9205326176986145401|7563404215614709654
22131|-9205315724827218763|8370531509591906734
1645633|-9205247376092758131|-1311540150497601346
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Import machine-learning detected labels (dog, llama etc) from Apple Photos 612287234  
623863902 https://github.com/dogsheep/dogsheep-photos/issues/16#issuecomment-623863902 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/16 MDEyOklzc3VlQ29tbWVudDYyMzg2MzkwMg== simonw 9599 2020-05-05T05:31:53Z 2020-05-05T05:31:53Z MEMBER

Yes! Turning those rowid values into id with this script did the job:

import sqlite3
import sqlite_utils

conn = sqlite3.connect(
    "/Users/simon/Pictures/Photos Library.photoslibrary/database/search/psi.sqlite"
)


def all_rows(table):
    result = conn.execute("select rowid as id, * from {}".format(table))
    cols = [c[0] for c in result.description]
    for row in result.fetchall():
        yield dict(zip(cols, row))


if __name__ == "__main__":
    db = sqlite_utils.Database("psi_copy.db")
    for table in ("assets", "collections", "ga", "gc", "groups"):
        db[table].upsert_all(all_rows(table), pk="id", alter=True)

Then I ran this query:

select 
  json_object('img_src', 'https://photos.simonwillison.net/i/' || photos.sha256 || '.' || photos.ext || '?w=400') as photo,
   group_concat(strip_null_chars(groups.content_string), ' ') as words, assets.uuid_0, assets.uuid_1, to_uuid(assets.uuid_0, assets.uuid_1) as uuid
from assets join ga on assets.id = ga.assetid
join groups on ga.groupid = groups.id
join photos on photos.uuid = to_uuid(assets.uuid_0, assets.uuid_1)
where groups.category = 2024
group by assets.id
order by random() limit 10

And got these results!
https://user-images.githubusercontent.com/9599/81037264-f1021a80-8e56-11ea-9924-6f9f55a0fb4b.png">

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Import machine-learning detected labels (dog, llama etc) from Apple Photos 612287234  
623865250 https://github.com/dogsheep/dogsheep-photos/issues/16#issuecomment-623865250 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/16 MDEyOklzc3VlQ29tbWVudDYyMzg2NTI1MA== simonw 9599 2020-05-05T05:38:16Z 2020-05-05T05:38:16Z MEMBER

It looks like groups.content_string often has a null byte in it. I should clean this up as part of the import.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Import machine-learning detected labels (dog, llama etc) from Apple Photos 612287234  
624278090 https://github.com/dogsheep/dogsheep-photos/issues/17#issuecomment-624278090 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/17 MDEyOklzc3VlQ29tbWVudDYyNDI3ODA5MA== simonw 9599 2020-05-05T20:06:01Z 2020-05-05T20:06:01Z MEMBER
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Only install osxphotos if running on macOS 612860531  
624278714 https://github.com/dogsheep/dogsheep-photos/issues/17#issuecomment-624278714 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/17 MDEyOklzc3VlQ29tbWVudDYyNDI3ODcxNA== simonw 9599 2020-05-05T20:07:19Z 2020-05-05T20:07:19Z MEMBER

From https://hynek.me/articles/conditional-python-dependencies/ I think this will look like:

setup(
    # ...
    install_requires=[
        # ...
        "osxphotos>=0.28.13 ; sys_platform=='darwin'",
    ]
)
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Only install osxphotos if running on macOS 612860531  

Next page

Advanced export

JSON shape: default, array, newline-delimited, object

CSV options:

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
, [performed_via_github_app] TEXT);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);