issue_comments

9 rows where issue = 725184645 and "updated_at" is on date 2020-10-20 sorted by updated_at descending


user 1

  • simonw 9

issue 1

  • Better way of representing binary data in .csv output · 9

author_association 1

  • OWNER 9
id html_url issue_url node_id user created_at updated_at author_association body reactions issue performed_via_github_app
713191819 https://github.com/simonw/datasette/issues/1034#issuecomment-713191819 https://api.github.com/repos/simonw/datasette/issues/1034 MDEyOklzc3VlQ29tbWVudDcxMzE5MTgxOQ== simonw 9599 2020-10-20T23:12:58Z 2020-10-20T23:12:58Z OWNER

Enzo has a great solution here: https://twitter.com/enzo_mdd/status/1318685442976436226

"Or maybe an option for a url. This keeps the CSV small but allows scripts to download binary data as needed."

In #1036 I'm planning on adding a way for users to access BLOB data. I can include that URL in the CSV output.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Better way of representing binary data in .csv output 725184645  
713176082 https://github.com/simonw/datasette/issues/1034#issuecomment-713176082 https://api.github.com/repos/simonw/datasette/issues/1034 MDEyOklzc3VlQ29tbWVudDcxMzE3NjA4Mg== simonw 9599 2020-10-20T22:27:33Z 2020-10-20T22:27:33Z OWNER

This feels good to me - it's consistent with how other features in Datasette work, and it means users who need the binary data in CSV (for whatever reason) can get it if they want to.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Better way of representing binary data in .csv output 725184645  
713175741 https://github.com/simonw/datasette/issues/1034#issuecomment-713175741 https://api.github.com/repos/simonw/datasette/issues/1034 MDEyOklzc3VlQ29tbWVudDcxMzE3NTc0MQ== simonw 9599 2020-10-20T22:26:45Z 2020-10-20T22:26:45Z OWNER

New idea: since binary in CSV doesn't make sense anyway, emulate Datasette's HTML UI default and output this:

id,title,data
1,Some title,<Binary data: 14 bytes>
2,Other title,<Binary data: 57 bytes>

Then allow users to add ?_base64=1 to the URL to get base64 instead https://twitter.com/simonw/status/1318679950635888641

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Better way of representing binary data in .csv output 725184645  
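
The placeholder-by-default, base64-on-request idea from the comment above can be sketched roughly as follows. This is a minimal illustration and not Datasette's implementation: the function names `render_cell` and `rows_to_csv` and the `use_base64` flag are hypothetical stand-ins for whatever a `?_base64=1` query-string option would toggle.

```python
import base64
import csv
import io


def render_cell(value, use_base64=False):
    """Return a CSV-friendly representation of a single cell.

    Hypothetical helper: bytes become a size placeholder by default,
    or base64 text when use_base64 is set. Other values pass through.
    """
    if isinstance(value, bytes):
        if use_base64:
            return base64.b64encode(value).decode("ascii")
        return "<Binary data: {} bytes>".format(len(value))
    return value


def rows_to_csv(headers, rows, use_base64=False):
    """Serialize rows to CSV text, rendering binary cells via render_cell."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(headers)
    for row in rows:
        writer.writerow([render_cell(cell, use_base64) for cell in row])
    return buffer.getvalue()


if __name__ == "__main__":
    rows = [(1, "Some title", b"\x00" * 14), (2, "Other title", b"\x01" * 57)]
    print(rows_to_csv(["id", "title", "data"], rows))
    print(rows_to_csv(["id", "title", "data"], rows, use_base64=True))
```

With the default settings the CSV stays small and readable; the base64 variant is lossless, so a consumer can recover the original bytes with `base64.b64decode`.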
713174690 https://github.com/simonw/datasette/issues/1034#issuecomment-713174690 https://api.github.com/repos/simonw/datasette/issues/1034 MDEyOklzc3VlQ29tbWVudDcxMzE3NDY5MA== simonw 9599 2020-10-20T22:23:50Z 2020-10-20T22:23:50Z OWNER

Or... default to <Binary data: 7 bytes> and support a ?_base64=1 option which outputs in base64 instead.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Better way of representing binary data in .csv output 725184645  
713174341 https://github.com/simonw/datasette/issues/1034#issuecomment-713174341 https://api.github.com/repos/simonw/datasette/issues/1034 MDEyOklzc3VlQ29tbWVudDcxMzE3NDM0MQ== simonw 9599 2020-10-20T22:22:53Z 2020-10-20T22:23:14Z OWNER

An even easier option: do what the Datasette UI does and output <Binary data: 7 bytes> for that CSV cell, as seen on https://latest.datasette.io/fixtures/binary_data

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Better way of representing binary data in .csv output 725184645  
713172901 https://github.com/simonw/datasette/issues/1034#issuecomment-713172901 https://api.github.com/repos/simonw/datasette/issues/1034 MDEyOklzc3VlQ29tbWVudDcxMzE3MjkwMQ== simonw 9599 2020-10-20T22:19:10Z 2020-10-20T22:20:28Z OWNER

I could go with the same format as datasette-render-binary but using 0x00 as the format for the hex bytes.

0x15 0x1C 0x02 0xC7 JFIF 0x00 0x01

Problem with this is that it's ambiguous: if the ASCII characters 0x15 occur in the text they will be indistinguishable from those hex bytes.

But since representing binary data in CSV fundamentally doesn't make sense I'm not sure if that really matters.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Better way of representing binary data in .csv output 725184645  
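
The ambiguity described in the comment above is easy to reproduce. The sketch below is an assumption about how such a mixed format could work, not the datasette-render-binary plugin's actual code: the hypothetical function `render_hex_mixed` keeps runs of printable ASCII as text and writes every other byte as `0xNN`.

```python
def render_hex_mixed(data):
    """Render bytes as space-separated tokens.

    Runs of printable, non-space ASCII are kept as text; every other
    byte becomes a 0xNN token. Hypothetical helper for illustration.
    """
    tokens = []
    run = []
    for byte in data:
        if 0x21 <= byte <= 0x7E:  # printable ASCII, excluding space
            run.append(chr(byte))
        else:
            if run:
                tokens.append("".join(run))
                run = []
            tokens.append("0x{:02X}".format(byte))
    if run:
        tokens.append("".join(run))
    return " ".join(tokens)


if __name__ == "__main__":
    print(render_hex_mixed(b"\x15\x1c\x02\xc7JFIF\x00\x01"))
    # The single byte 0x15 and the four ASCII characters "0x15"
    # produce the same output, so the format cannot be decoded reliably.
    print(render_hex_mixed(b"\x15") == render_hex_mixed(b"0x15"))
```

Because two different inputs map to the same string, this format only works as a human-readable preview, which matches the goal stated earlier in the thread.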
712582699 https://github.com/simonw/datasette/issues/1034#issuecomment-712582699 https://api.github.com/repos/simonw/datasette/issues/1034 MDEyOklzc3VlQ29tbWVudDcxMjU4MjY5OQ== simonw 9599 2020-10-20T04:36:04Z 2020-10-20T04:36:14Z OWNER

Asked for ideas on Twitter: https://twitter.com/simonw/status/1318409558805467136

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Better way of representing binary data in .csv output 725184645  
712581994 https://github.com/simonw/datasette/issues/1034#issuecomment-712581994 https://api.github.com/repos/simonw/datasette/issues/1034 MDEyOklzc3VlQ29tbWVudDcxMjU4MTk5NA== simonw 9599 2020-10-20T04:33:28Z 2020-10-20T04:33:28Z OWNER

The datasette-render-binary plugin does this, which I really like - but without the different coloured fonts I'm not sure how readable it would be as just plain text:

Really the goal here is to find the most human-friendly option, so that people looking at the output have a vague idea what's going on. That's why I'm not leaping at the chance to use base64.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Better way of representing binary data in .csv output 725184645  
712580976 https://github.com/simonw/datasette/issues/1034#issuecomment-712580976 https://api.github.com/repos/simonw/datasette/issues/1034 MDEyOklzc3VlQ29tbWVudDcxMjU4MDk3Ng== simonw 9599 2020-10-20T04:29:23Z 2020-10-20T04:29:23Z OWNER

Most obvious option is base64. Any other potential solutions I'm missing?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Better way of representing binary data in .csv output 725184645  

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
, [performed_via_github_app] TEXT);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);
Powered by Datasette · Queries took 27.352ms · About: github-to-sqlite