issue_comments


20 rows where "created_at" is on date 2020-10-11 sorted by updated_at descending


Suggested facets: issue_url, created_at (date), updated_at (date)

issue 8

  • Redesign default .json format 7
  • Figure out how to display images from <en-media> tags inline in Datasette 5
  • Configure FTS + add an index on the date columns 2
  • Better handling of OCR data 2
  • Add Link: pagination HTTP headers 1
  • Research: could Datasette install its own plugins? 1
  • Documentation on how to use this with Datasette 1
  • Add a "delete" icon next to filters (in addition to "remove filter") 1

author_association 2

  • MEMBER 10
  • OWNER 10

user 1

  • simonw 20
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions issue performed_via_github_app
706788010 https://github.com/simonw/datasette/issues/1016#issuecomment-706788010 https://api.github.com/repos/simonw/datasette/issues/1016 MDEyOklzc3VlQ29tbWVudDcwNjc4ODAxMA== simonw 9599 2020-10-11T23:50:39Z 2020-10-11T23:50:39Z OWNER

For consistency can reuse the icon used on selected facets:

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add a "delete" icon next to filters (in addition to "remove filter") 718953669  
706786548 https://github.com/dogsheep/evernote-to-sqlite/issues/4#issuecomment-706786548 https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/4 MDEyOklzc3VlQ29tbWVudDcwNjc4NjU0OA== simonw 9599 2020-10-11T23:39:46Z 2020-10-11T23:39:46Z MEMBER

Should have used porter stemming for this.
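A minimal sketch of what Porter stemming changes for full-text search, using Python's built-in sqlite3 module and assuming the SQLite build includes FTS5. The `notes_fts` table and its `content` column are hypothetical names, not taken from evernote-to-sqlite.

```python
import sqlite3


def search_with_porter(documents, query):
    """Index documents in an in-memory FTS5 table that uses the porter
    tokenizer, then return the rowids of the documents matching query."""
    conn = sqlite3.connect(":memory:")
    # tokenize='porter' wraps the default tokenizer with the Porter stemmer
    conn.execute(
        "CREATE VIRTUAL TABLE notes_fts USING fts5(content, tokenize='porter')"
    )
    conn.executemany(
        "INSERT INTO notes_fts (rowid, content) VALUES (?, ?)",
        list(enumerate(documents, start=1)),
    )
    rows = conn.execute(
        "SELECT rowid FROM notes_fts WHERE notes_fts MATCH ? ORDER BY rowid",
        (query,),
    ).fetchall()
    conn.close()
    return [rowid for (rowid,) in rows]
```

With stemming enabled, a search for "wonder" should also match a note that only contains "wonders".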

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Configure FTS + add an index on the date columns 718938508  
706785201 https://github.com/dogsheep/evernote-to-sqlite/issues/6#issuecomment-706785201 https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/6 MDEyOklzc3VlQ29tbWVudDcwNjc4NTIwMQ== simonw 9599 2020-10-11T23:29:39Z 2020-10-11T23:29:39Z MEMBER

It looks to me like each of those <item> blocks has a number of guesses in order of confidence:

    <item x="215" y="190" w="187" h="39">
      <t w="57">wonders,</t>
      <t w="55">wanders,</t>
      <t w="52">wonders ?</t>
      <t w="45">wonders</t>
      <t w="42">wonders.</t>
    </item>

So maybe the best approach here is to just take the first t element within each item.
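A rough sketch of that "first t element" approach, using only the standard library. It assumes the candidates inside each item really are ordered by confidence; the function name `best_guess_text` is made up for illustration.

```python
import xml.etree.ElementTree as ET


def best_guess_text(reco_index_xml):
    """Join the first <t> candidate of every <item> into a single string."""
    root = ET.fromstring(reco_index_xml)
    words = []
    for item in root.iter("item"):
        first = item.find("t")  # first <t> child, assumed to be the top guess
        if first is not None and first.text:
            words.append(first.text)
    return " ".join(words)
```

Items with low-confidence noise (the single-candidate `w="10"` style entries) would still come through, so a minimum threshold on the `w` attribute might be worth considering as well.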

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Better handling of OCR data 718949182  
706785086 https://github.com/dogsheep/evernote-to-sqlite/issues/6#issuecomment-706785086 https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/6 MDEyOklzc3VlQ29tbWVudDcwNjc4NTA4Ng== simonw 9599 2020-10-11T23:28:50Z 2020-10-11T23:28:50Z MEMBER

The XML for the OCR stuff is a bit weird. Currently I'm doing this to it:

https://github.com/dogsheep/evernote-to-sqlite/blob/c33d7b043a45eb3e88676e5fa3ce31755199d9f8/evernote_to_sqlite/utils.py#L70-L78

This can produce some odd results, for example:

Sure 'Sure, 'Sure. Sure, Sure. sure sure. sure ? If you If Yau [you live jive In m 1n an area devoid of natural wonders, wanders, wonders ? wonders wonders. your mind will be blown, blown' blown. blown ? -e i ? ,1 IL it ? at ? KY ? fl ft bat at

Which came from this image:

The XML for that is:

    <recoIndex docType="unknown" objType="image" objID="05ffb72b307bf495f064243c7099d94f" engineVersion="6.5.17.7" recoType="service" lang="en" objWidth="1000" objHeight="1504">
      <item x="68" y="75" w="104" h="37"> <t w="60">Sure</t> <t w="52">'Sure,</t> <t w="47">'Sure.</t> <t w="33">Sure,</t> <t w="26">Sure.</t> </item>
      <item x="182" y="83" w="92" h="26"> <t w="62">sure</t> <t w="58">sure.</t> <t w="46">sure ?</t> </item>
      <item x="69" y="132" w="107" h="45"> <t w="81">If you</t> <t w="64">If Yau</t> <t w="31">[you</t> </item>
      <item x="186" y="132" w="67" h="35"> <t w="85">live</t> <t w="51">jive</t> </item>
      <item x="263" y="140" w="36" h="27"> <t w="82">In</t> <t w="56">m</t> <t w="53">1n</t> </item>
      <item x="309" y="140" w="53" h="27"> <t w="82">an</t> </item>
      <item x="372" y="141" w="90" h="26"> <t w="94">area</t> </item>
      <item x="472" y="132" w="138" h="35"> <t w="85">devoid</t> </item>
      <item x="620" y="132" w="43" h="35"> <t w="82">of</t> </item>
      <item x="68" y="190" w="137" h="35"> <t w="87">natural</t> </item>
      <item x="215" y="190" w="187" h="39"> <t w="57">wonders,</t> <t w="55">wanders,</t> <t w="52">wonders ?</t> <t w="45">wonders</t> <t w="42">wonders.</t> </item>
      <item x="410" y="198" w="98" h="36"> <t w="88">your</t> </item>
      <item x="518" y="190" w="102" h="35"> <t w="86">mind</t> </item>
      <item x="630" y="190" w="69" h="34"> <t w="87">will</t> </item>
      <item x="709" y="190" w="55" h="35"> <t w="82">be</t> </item>
      <item x="774" y="190" w="137" h="34"> <t w="56">blown,</t> <t w="55">blown'</t> <t w="48">blown.</t> <t w="48">blown ?</t> </item>
      <item x="166" y="736" w="8" h="6"> <t w="66">-e</t> </item>
      <item x="273" y="966" w="29" h="21"> <t w="11">i ?</t> </item>
      <item x="281" y="1004" w="28" h="11"> <t w="11">,1</t> </item>
      <item x="512" y="1083" w="10" h="7"> <t w="10">IL</t> </item>
      <item x="29" y="1447" w="7" h="23"> <t w="17">it ?</t> <t w="15">at ?</t> <t w="13">KY ?</t> </item>
      <item x="414" y="841" w="8" h="16"> <t w="22">fl</t> <t w="20">ft</t> <t w="20">bat</t> <t w="19">at</t> </item>
    </recoIndex>

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Better handling of OCR data 718949182  
706784028 https://github.com/dogsheep/evernote-to-sqlite/issues/4#issuecomment-706784028 https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/4 MDEyOklzc3VlQ29tbWVudDcwNjc4NDAyOA== simonw 9599 2020-10-11T23:20:32Z 2020-10-11T23:20:32Z MEMBER

I haven't done the FTS on OCR yet. I'm going to move that to another ticket because it requires more thought.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Configure FTS + add an index on the date columns 718938508  
706776808 https://github.com/dogsheep/evernote-to-sqlite/issues/5#issuecomment-706776808 https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/5 MDEyOklzc3VlQ29tbWVudDcwNjc3NjgwOA== simonw 9599 2020-10-11T22:23:14Z 2020-10-11T22:23:14Z MEMBER

... but it's still important to be able to get to the rendered note directly from the browse notes /evernote/notes page. Maybe use a simple render_cell() hook that just knows how to generate the link to the rendered note page?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Figure out how to display images from <en-media> tags inline in Datasette 718938889  
706776680 https://github.com/dogsheep/evernote-to-sqlite/issues/5#issuecomment-706776680 https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/5 MDEyOklzc3VlQ29tbWVudDcwNjc3NjY4MA== simonw 9599 2020-10-11T22:22:16Z 2020-10-11T22:22:16Z MEMBER

Maybe the best way to do this is with a custom route, /-/evernote/note-id - that way I can clean the HTML and resolve the other things in the <en-note> structure without using render_cell() and the like. My concern about using render_cell() is that it could lead to weird security problems when combined with ?sql= queries.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Figure out how to display images from <en-media> tags inline in Datasette 718938889  
706776447 https://github.com/dogsheep/evernote-to-sqlite/issues/5#issuecomment-706776447 https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/5 MDEyOklzc3VlQ29tbWVudDcwNjc3NjQ0Nw== simonw 9599 2020-10-11T22:20:32Z 2020-10-11T22:20:32Z MEMBER

Or... I could do this client-side. JavaScript that looks for <en-media> tags and fetches the data using fetch() wouldn't be too hard to write.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Figure out how to display images from <en-media> tags inline in Datasette 718938889  
706776242 https://github.com/dogsheep/evernote-to-sqlite/issues/5#issuecomment-706776242 https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/5 MDEyOklzc3VlQ29tbWVudDcwNjc3NjI0Mg== simonw 9599 2020-10-11T22:18:30Z 2020-10-11T22:19:48Z MEMBER

Alternatively, rather than relying on datasette-media this could base64-embed the images. evernote-to-sqlite could register itself as a Datasette plugin that knows how to do this.

Maybe rename the column to evernote_content and register a render cell hook that knows how to rewrite those note bodies so that they are visible?

Might need to feed them through Bleach too, just in case any nasty code can get into them.
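A sketch of the base64-embedding idea, using only the standard library. The `resources` mapping (en-media hash to a MIME type and raw bytes) is a hypothetical stand-in for however the stored resources would be looked up, and the function name is made up. Sanitizing the surrounding HTML, for example with Bleach, is a separate step that is not shown here.

```python
import base64
import html
import re

# Matches self-closing or explicitly closed <en-media ...> tags
EN_MEDIA_RE = re.compile(r"<en-media\b([^>]*?)/?>(?:\s*</en-media>)?")
ATTR_RE = re.compile(r'([\w-]+)="([^"]*)"')


def embed_images(note_html, resources):
    """Replace <en-media> tags with <img> tags carrying base64 data URIs.

    resources maps an en-media hash to a (mime_type, bytes) tuple.
    """

    def replace(match):
        attrs = dict(ATTR_RE.findall(match.group(1)))
        resource = resources.get(attrs.get("hash"))
        if resource is None or not resource[0].startswith("image/"):
            return match.group(0)  # leave unknown or non-image media untouched
        mime_type, data = resource
        encoded = base64.b64encode(data).decode("ascii")
        return '<img src="data:{};base64,{}">'.format(
            html.escape(mime_type), encoded
        )

    return EN_MEDIA_RE.sub(replace, note_html)
```

Embedding this way trades larger pages for not needing a separate image endpoint.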

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Figure out how to display images from <en-media> tags inline in Datasette 718938889  
706776180 https://github.com/dogsheep/evernote-to-sqlite/issues/5#issuecomment-706776180 https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/5 MDEyOklzc3VlQ29tbWVudDcwNjc3NjE4MA== simonw 9599 2020-10-11T22:17:55Z 2020-10-11T22:17:55Z MEMBER

We could even do server-side thumbnailing for some of these images, but I'm inclined to serve up the full size ones and set a width on the image element based on the width attribute on <en-media>.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Figure out how to display images from <en-media> tags inline in Datasette 718938889  
706775706 https://github.com/dogsheep/evernote-to-sqlite/issues/1#issuecomment-706775706 https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/1 MDEyOklzc3VlQ29tbWVudDcwNjc3NTcwNg== simonw 9599 2020-10-11T22:14:00Z 2020-10-11T22:14:00Z MEMBER

A live demo would be good too.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Documentation on how to use this with Datasette 718934942  
706756879 https://github.com/simonw/datasette/issues/1015#issuecomment-706756879 https://api.github.com/repos/simonw/datasette/issues/1015 MDEyOklzc3VlQ29tbWVudDcwNjc1Njg3OQ== simonw 9599 2020-10-11T19:35:03Z 2020-10-11T19:35:03Z OWNER

Since plugins are installed via pip this would require Datasette to be restarted. This StackOverflow thread looks relevant to that: https://stackoverflow.com/questions/11329917/restart-python-script-from-within-itself

This recipe looks promising:

```python
import os
import sys
import psutil
import logging


def restart_program():
    """Restarts the current program, with file objects and descriptors cleanup"""
    try:
        p = psutil.Process(os.getpid())
        # open_files() is the current psutil name for the older get_open_files()
        for handler in p.open_files() + p.connections():
            os.close(handler.fd)
    except Exception as e:
        logging.error(e)

    python = sys.executable
    os.execl(python, python, *sys.argv)
```

https://docs.python.org/3/library/os.html#os.execl says about `os.execl`:

These functions all execute a new program, replacing the current process; they do not return. On Unix, the new executable is loaded into the current process, and will have the same process id as the caller

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Research: could Datasette install its own plugins? 718910318  
706745236 https://github.com/simonw/datasette/issues/782#issuecomment-706745236 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDcwNjc0NTIzNg== simonw 9599 2020-10-11T18:16:05Z 2020-10-11T18:16:05Z OWNER

Here's the datasette-json-preview plugin I'll be using to experiment with different formats: https://github.com/simonw/datasette-json-preview

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default .json format 627794879  
706740250 https://github.com/simonw/datasette/issues/782#issuecomment-706740250 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDcwNjc0MDI1MA== simonw 9599 2020-10-11T17:40:48Z 2020-10-11T17:43:07Z OWNER

Building this plugin reminded me of an oddity of the register_output_renderer() plugin hook: one of the arguments that can be passed to it is data, which is the default internal data structure created by Datasette - but I deliberately avoided documenting that on https://docs.datasette.io/en/stable/plugin_hooks.html#register-output-renderer-datasette because it's not a stable interface.

That's not ideal. I'd like custom renderers to be able to access this data to get at things like suggested facets, on an opt-in basis.

So maybe that kind of stuff is re-implemented as "extras" which are awaitable callables - then renderer plugins can call the extras that they need to as part of their execution.

To illustrate the problem (in this case the need to access next_url) here's my first prototype of the plugin:

```python
from datasette import hookimpl
from datasette.utils.asgi import Response


@hookimpl
def register_output_renderer(datasette):
    return {
        "extension": "json-preview",
        "render": json_preview,
    }


def json_preview(data, columns, rows):
    next_url = data.get("next_url")
    headers = {}
    if next_url:
        headers["link"] = '<{}>; rel="next"'.format(next_url)
    return Response.json([dict(zip(columns, row)) for row in rows], headers=headers)
```
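One way the "extras as awaitable callables" idea could look, as a standalone sketch that does not use any Datasette API. The `EXTRAS` registry, the `context` dictionary and the function names are all hypothetical.

```python
import asyncio


async def extra_count(context):
    # Stand-in for a potentially expensive "select count(*)" query
    return len(context["all_rows"])


async def extra_next_url(context):
    return context.get("next_url")


# Hypothetical registry mapping an extra's name to an awaitable callable
EXTRAS = {"count": extra_count, "next_url": extra_next_url}


async def render(rows, context, wanted):
    """Build a response body, awaiting only the extras that were requested."""
    body = {"rows": rows}
    for name in wanted:
        body[name] = await EXTRAS[name](context)
    return body
```

A renderer that only needs next_url would call `render(rows, context, ["next_url"])` and never pay for the count.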

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default .json format 627794879  
706738020 https://github.com/simonw/datasette/issues/782#issuecomment-706738020 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDcwNjczODAyMA== simonw 9599 2020-10-11T17:23:18Z 2020-10-11T17:23:48Z OWNER

I'm going to prototype what it would look like if the default shape was a list of objects and ?_extra= turns that into an object with a rows key, in a plugin. As a separate extension (maybe .json-preview).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default .json format 627794879  
706735341 https://github.com/simonw/datasette/issues/782#issuecomment-706735341 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDcwNjczNTM0MQ== simonw 9599 2020-10-11T17:03:29Z 2020-10-11T17:15:34Z OWNER

Maybe .jsonfull becomes a new renderer that returns ALL of the defined ?_extra= blocks.

Or... ?_extra=all turns on ALL of the available information blocks (some of which can come from plugins).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default .json format 627794879  
706735200 https://github.com/simonw/datasette/issues/782#issuecomment-706735200 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDcwNjczNTIwMA== simonw 9599 2020-10-11T17:02:11Z 2020-10-11T17:14:51Z OWNER

Since the total count can be expensive to calculate, I'm inclined to make that an opt-in extra - maybe ?_extra=count.

Based on that, the default JSON shape could look something like this:

    {
        "rows": [{"id": 1}, {"id": 2}],
        "next": "2",
        "next_url": "/db/table?_next=2"
    }

And with ?_extra=count:

    {
        "rows": [{"id": 1}, {"id": 2}],
        "next": "2",
        "next_url": "/db/table?_next=2",
        "count": 31
    }

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default .json format 627794879  
706736541 https://github.com/simonw/datasette/issues/782#issuecomment-706736541 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDcwNjczNjU0MQ== simonw 9599 2020-10-11T17:12:27Z 2020-10-11T17:12:27Z OWNER

The core issue that I keep reconsidering is whether the default .json representation should be an object or a list.

Arguments in favour of a list:

  • It's what I always want. Almost all of the code that I've written against the API myself uses ?_shape=array.
  • It's really easy to use. You can pipe it to e.g. sqlite-utils insert, you can load it into JavaScript without thinking about it.

Arguments against:

  • Nowhere to put pagination or total counts. I added pagination to the link: HTTP header in #1014 (inspired by the WordPress and GitHub APIs) but I haven't solved for total count, and there's other stuff that's useful like "truncated": true to indicate that more than 1000 results were returned and they were truncated.
  • An array is inherently non-extensible: if the root item is an object it's easy to add new features to it in a backwards-compatible way in the future. An array is a fixed format.

But maybe that last point is a positive? It ensures the default .json format remains completely predictable forever.

If .json DID default to an array of objects, the ?_shape= argument could still be used to get back alternative formats.

Maybe .json?_extra=total changes the shape of that default to be this instead:

    {
        "rows": [{"id": 1}, {"id": 2}],
        "total": 104
    }

The thing I care about most though is next_url. That could be provided like so:

.json?_extra=total&_extra=next - alternative syntax .json?_extra=total,next:

    {
        "rows": [{"id": 1}, {"id": 2}],
        "total": 104,
        "next": "2",
        "next_url": "/db/table.json?_extra=total&_extra=next&_next=2"
    }

This is feeling a bit verbose for a common combination though.
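A small sketch of how both ?_extra= syntaxes could be parsed, and how the response could switch from a bare list to an object once any extra is requested. This is illustrative only: `parse_extras` and `shape_response` are hypothetical names, not Datasette functions.

```python
from urllib.parse import parse_qs


def parse_extras(query_string):
    """Collect ?_extra= values, accepting repeated and comma-separated forms."""
    extras = []
    for value in parse_qs(query_string).get("_extra", []):
        for name in value.split(","):
            name = name.strip()
            if name and name not in extras:
                extras.append(name)
    return extras


def shape_response(rows, query_string, total=None, next_token=None):
    """Return a bare list by default, or an object once any extra is requested."""
    extras = parse_extras(query_string)
    if not extras:
        return rows
    response = {"rows": rows}
    if "total" in extras:
        response["total"] = total
    if "next" in extras:
        response["next"] = next_token
    return response
```

Both `_extra=total&_extra=next` and `_extra=total,next` produce the same list of extras here, so the shorter form costs nothing to support.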

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default .json format 627794879  
706735280 https://github.com/simonw/datasette/issues/782#issuecomment-706735280 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDcwNjczNTI4MA== simonw 9599 2020-10-11T17:03:01Z 2020-10-11T17:03:01Z OWNER

Should that default also include "columns" as a list of strings? That would be duplicate data of the keys in the "rows" list of objects, and I've never found myself wanting it in my own code - so I'm going to say no.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default .json format 627794879  
706631006 https://github.com/simonw/datasette/issues/1014#issuecomment-706631006 https://api.github.com/repos/simonw/datasette/issues/1014 MDEyOklzc3VlQ29tbWVudDcwNjYzMTAwNg== simonw 9599 2020-10-11T00:36:43Z 2020-10-11T00:36:43Z OWNER

Demo using paginate-json:

    % paginate-json 'https://latest.datasette.io/fixtures/compound_three_primary_keys.json?_shape=array' | jq '. | length'
    https://latest.datasette.io/fixtures/compound_three_primary_keys.json?_shape=array
    http://latest.datasette.io/fixtures/compound_three_primary_keys.json?_shape=array&_next=a%2Cd%2Cv
    http://latest.datasette.io/fixtures/compound_three_primary_keys.json?_shape=array&_next=a%2Ch%2Cr
    http://latest.datasette.io/fixtures/compound_three_primary_keys.json?_shape=array&_next=a%2Cl%2Cn
    http://latest.datasette.io/fixtures/compound_three_primary_keys.json?_shape=array&_next=a%2Cp%2Cj
    http://latest.datasette.io/fixtures/compound_three_primary_keys.json?_shape=array&_next=a%2Ct%2Cf
    http://latest.datasette.io/fixtures/compound_three_primary_keys.json?_shape=array&_next=a%2Cx%2Cb
    http://latest.datasette.io/fixtures/compound_three_primary_keys.json?_shape=array&_next=b%2Ca%2Cx
    http://latest.datasette.io/fixtures/compound_three_primary_keys.json?_shape=array&_next=b%2Ce%2Ct
    http://latest.datasette.io/fixtures/compound_three_primary_keys.json?_shape=array&_next=b%2Ci%2Cp
    http://latest.datasette.io/fixtures/compound_three_primary_keys.json?_shape=array&_next=b%2Cm%2Cl
    1001

New documentation: https://docs.datasette.io/en/latest/json_api.html#pagination
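For anyone who wants to follow those Link headers without paginate-json, here is a rough Python sketch of the same loop. The `fetch` argument is an assumption of this sketch: a caller-supplied function that takes a URL and returns the page's rows plus the raw Link header value, so the HTTP client is left open.

```python
import re

# Matches entries of the form <url>; rel="name" inside a Link header value
LINK_RE = re.compile(r'<([^>]+)>\s*;\s*rel="?([^";,]+)"?')


def next_link(link_header):
    """Return the rel="next" URL from a Link header value, or None."""
    for url, rel in LINK_RE.findall(link_header or ""):
        if rel == "next":
            return url
    return None


def paginate(url, fetch):
    """Yield every row across all pages.

    fetch takes a URL and returns (rows, link_header_value).
    """
    while url:
        rows, link_header = fetch(url)
        yield from rows
        url = next_link(link_header)
```

The loop stops as soon as a response arrives without a rel="next" entry.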

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add Link: pagination HTTP headers 718723543  


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
, [performed_via_github_app] TEXT);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);
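A sketch of how the filter described at the top of this page (comments created on a given date, sorted by updated_at descending) could be run against this schema with Python's sqlite3 module. The SQL is an assumption about an equivalent query, not a copy of what Datasette generates, and `comments_on_date` is a made-up helper name.

```python
import sqlite3


def comments_on_date(conn, day):
    """Return (id, updated_at) for comments created on day, newest update first."""
    return conn.execute(
        "SELECT id, updated_at FROM issue_comments "
        "WHERE date(created_at) = ? ORDER BY updated_at DESC",
        (day,),
    ).fetchall()
```

Calling `comments_on_date(conn, "2020-10-11")` on a connection to this database would be expected to return the rows listed above.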
Powered by Datasette · Queries took 524.524ms · About: github-to-sqlite