{"html_url": "https://github.com/simonw/datasette/issues/1034#issuecomment-719094027", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1034", "id": 719094027, "node_id": "MDEyOklzc3VlQ29tbWVudDcxOTA5NDAyNw==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-10-30T00:11:17Z", "updated_at": "2020-10-30T00:11:17Z", "author_association": "OWNER", "body": "Demos:\r\n\r\nhttps://latest.datasette.io/fixtures/binary_data.csv?_size=max\r\n\r\n```csv\r\nrowid,data\r\n1,http://latest.datasette.io/fixtures/binary_data/1.blob?_blob_column=data\r\n2,http://latest.datasette.io/fixtures/binary_data/2.blob?_blob_column=data\r\n3,\r\n```\r\n\r\nhttps://latest.datasette.io/fixtures.csv?sql=select+rowid%2C+data+from+binary_data+order+by+rowid+limit+1001&_size=max\r\n\r\n```csv\r\nrowid,data\r\n1,http://latest.datasette.io/fixtures.blob?sql=select+rowid%2C+data+from+binary_data+order+by+rowid+limit+1001&_size=max&_blob_column=data&_blob_hash=f3088978da8f9aea479ffc7f631370b968d2e855eeb172bea7f6c7a04262bb6d\r\n2,http://latest.datasette.io/fixtures.blob?sql=select+rowid%2C+data+from+binary_data+order+by+rowid+limit+1001&_size=max&_blob_column=data&_blob_hash=b835b0483cedb86130b9a2c280880bf5fadc5318ddf8c18d0df5204d40df1724\r\n3,\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 725184645, "label": "Better way of representing binary data in .csv output"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1034#issuecomment-719050754", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1034", "id": 719050754, "node_id": "MDEyOklzc3VlQ29tbWVudDcxOTA1MDc1NA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-10-29T22:04:52Z", "updated_at": "2020-10-29T22:04:52Z", "author_association": "OWNER", "body": "I'm going to link to
the new `.blob` representation using the new `?_blob_hash=xxx` argument to ensure that the content served is the expected binary blob.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 725184645, "label": "Better way of representing binary data in .csv output"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1034#issuecomment-716078777", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1034", "id": 716078777, "node_id": "MDEyOklzc3VlQ29tbWVudDcxNjA3ODc3Nw==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-10-25T01:25:11Z", "updated_at": "2020-10-25T01:25:11Z", "author_association": "OWNER", "body": "SQLite actually has APIs that could help here: https://www.sqlite.org/c3ref/column_database_name.html - for any given SQL query they identify the origin/table/column that is the source of each resulting column.\r\n\r\nThose aren't exposed in the Python `sqlite3` module though, so using them could be extremely tricky.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 725184645, "label": "Better way of representing binary data in .csv output"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1034#issuecomment-716078605", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1034", "id": 716078605, "node_id": "MDEyOklzc3VlQ29tbWVudDcxNjA3ODYwNQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-10-25T01:22:22Z", "updated_at": "2020-10-25T01:22:22Z", "author_association": "OWNER", "body": "For arbitrary CSV the only solution I can think of is to embed the base64 value.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, 
\"rocket\": 0, \"eyes\": 0}", "issue": {"value": 725184645, "label": "Better way of representing binary data in .csv output"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1034#issuecomment-716078512", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1034", "id": 716078512, "node_id": "MDEyOklzc3VlQ29tbWVudDcxNjA3ODUxMg==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-10-25T01:21:11Z", "updated_at": "2020-10-25T01:21:11Z", "author_association": "OWNER", "body": "What should happen for CSV export of arbitrary SQL queries, where there's no obvious BLOB to link to?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 725184645, "label": "Better way of representing binary data in .csv output"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1034#issuecomment-716078420", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1034", "id": 716078420, "node_id": "MDEyOklzc3VlQ29tbWVudDcxNjA3ODQyMA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-10-25T01:20:00Z", "updated_at": "2020-10-25T01:20:00Z", "author_association": "OWNER", "body": "That documentation: https://docs.datasette.io/en/latest/internals.html#absolute-url-request-path", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 725184645, "label": "Better way of representing binary data in .csv output"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1034#issuecomment-716077541", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1034", "id": 716077541, "node_id": "MDEyOklzc3VlQ29tbWVudDcxNjA3NzU0MQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": 
"2020-10-25T01:09:38Z", "updated_at": "2020-10-25T01:10:04Z", "author_association": "OWNER", "body": "I should turn `datasette.absolute_url(...)` into a documented internal API on https://docs.datasette.io/en/stable/internals.html#datasette-class", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 725184645, "label": "Better way of representing binary data in .csv output"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1034#issuecomment-716077508", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1034", "id": 716077508, "node_id": "MDEyOklzc3VlQ29tbWVudDcxNjA3NzUwOA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-10-25T01:09:17Z", "updated_at": "2020-10-25T01:09:17Z", "author_association": "OWNER", "body": "Here's how those absolute `next_url` values are generated: https://github.com/simonw/datasette/blob/5db7ae3ce165ded57c7fb1cfbdb3258b1cf06c10/datasette/views/table.py#L774-L776", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 725184645, "label": "Better way of representing binary data in .csv output"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1034#issuecomment-716077436", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1034", "id": 716077436, "node_id": "MDEyOklzc3VlQ29tbWVudDcxNjA3NzQzNg==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-10-25T01:08:35Z", "updated_at": "2020-10-25T01:08:42Z", "author_association": "OWNER", "body": "This is actually a bit tricky to implement, for a few reasons:\r\n\r\n- Need to generate a full URL, including the `https://host/` bit. 
I've done this for `next_url` in the JSON output before, thankfully.\r\n- This only makes sense for CSV output for tables. If it's the CSV output of an arbitrary query there's no `/db/table/-/blob/pk/column.blob` page for me to link to.\r\n- Need to generate those `/.../-/blob/...` URLs for the data that is being output as CSV.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 725184645, "label": "Better way of representing binary data in .csv output"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1034#issuecomment-713277810", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1034", "id": 713277810, "node_id": "MDEyOklzc3VlQ29tbWVudDcxMzI3NzgxMA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-10-21T03:40:50Z", "updated_at": "2020-10-25T01:01:23Z", "author_association": "OWNER", "body": "Blocked awaiting #1036 (update: now unblocked)", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 725184645, "label": "Better way of representing binary data in .csv output"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1034#issuecomment-713191819", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1034", "id": 713191819, "node_id": "MDEyOklzc3VlQ29tbWVudDcxMzE5MTgxOQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-10-20T23:12:58Z", "updated_at": "2020-10-20T23:12:58Z", "author_association": "OWNER", "body": "Enzo has a great solution here: https://twitter.com/enzo_mdd/status/1318685442976436226\r\n\r\n> Or maybe an option for a url. This keeps the CSV small but allows scripts to download binary data as needed.\r\n\r\nIn #1036 I'm planning on adding a way for users to access BLOB data. 
I can include that URL in the CSV output.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 725184645, "label": "Better way of representing binary data in .csv output"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1034#issuecomment-713176082", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1034", "id": 713176082, "node_id": "MDEyOklzc3VlQ29tbWVudDcxMzE3NjA4Mg==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-10-20T22:27:33Z", "updated_at": "2020-10-20T22:27:33Z", "author_association": "OWNER", "body": "This feels good to me - it's consistent with how other features in Datasette work, and it means users who need the binary data in CSV (for whatever reason) can get it if they want to.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 725184645, "label": "Better way of representing binary data in .csv output"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1034#issuecomment-713175741", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1034", "id": 713175741, "node_id": "MDEyOklzc3VlQ29tbWVudDcxMzE3NTc0MQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-10-20T22:26:45Z", "updated_at": "2020-10-20T22:26:45Z", "author_association": "OWNER", "body": "> New idea: since binary in CSV doesn't make sense anyway, emulate Datasette's HTML UI default and output this:\r\n> \r\n> id,title,data\r\n> 1,Some title,\r\n> 2,Other title,\r\n> \r\n> Then allow users to add ?_base64=1 to the URL to get base64 instead\r\n> https://twitter.com/simonw/status/1318679950635888641", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, 
\"rocket\": 0, \"eyes\": 0}", "issue": {"value": 725184645, "label": "Better way of representing binary data in .csv output"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1034#issuecomment-713174690", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1034", "id": 713174690, "node_id": "MDEyOklzc3VlQ29tbWVudDcxMzE3NDY5MA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-10-20T22:23:50Z", "updated_at": "2020-10-20T22:23:50Z", "author_association": "OWNER", "body": "Or... default to `` and support a `?_base64=1` option which outputs in base64 instead.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 725184645, "label": "Better way of representing binary data in .csv output"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1034#issuecomment-713174341", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1034", "id": 713174341, "node_id": "MDEyOklzc3VlQ29tbWVudDcxMzE3NDM0MQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-10-20T22:22:53Z", "updated_at": "2020-10-20T22:23:14Z", "author_association": "OWNER", "body": "An even easier option: do what the Datasette UI does and output `` for that CSV cell, as seen on https://latest.datasette.io/fixtures/binary_data", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 725184645, "label": "Better way of representing binary data in .csv output"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1034#issuecomment-713172901", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1034", "id": 713172901, "node_id": "MDEyOklzc3VlQ29tbWVudDcxMzE3MjkwMQ==", "user": {"value": 9599, "label": 
"simonw"}, "created_at": "2020-10-20T22:19:10Z", "updated_at": "2020-10-20T22:20:28Z", "author_association": "OWNER", "body": "I could go with the same format as `datasette-render-binary` but using `0x00` as the format for the hex bytes.\r\n\r\n 0x15 0x1C 0x02 0xC7 JFIF 0x00 0x01\r\n\r\nProblem with this is that it's ambiguous: if the ASCII characters `0x15` occur in the text they will be indistinguishable from those hex bytes.\r\n\r\nBut since representing binary data in CSV fundamentally doesn't make sense I'm not sure if that really matters.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 725184645, "label": "Better way of representing binary data in .csv output"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1034#issuecomment-712582699", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1034", "id": 712582699, "node_id": "MDEyOklzc3VlQ29tbWVudDcxMjU4MjY5OQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-10-20T04:36:04Z", "updated_at": "2020-10-20T04:36:14Z", "author_association": "OWNER", "body": "Asked for ideas on Twitter: https://twitter.com/simonw/status/1318409558805467136", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 725184645, "label": "Better way of representing binary data in .csv output"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1034#issuecomment-712581994", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1034", "id": 712581994, "node_id": "MDEyOklzc3VlQ29tbWVudDcxMjU4MTk5NA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-10-20T04:33:28Z", "updated_at": "2020-10-20T04:33:28Z", "author_association": "OWNER", "body": "The 
[datasette-render-binary](https://github.com/simonw/datasette-render-binary) plugin does this, which I really like - but without the different coloured fonts I'm not sure how readable it would be as just plain text:\r\n\r\n![image](https://user-images.githubusercontent.com/9599/96540435-9c125f00-1252-11eb-85aa-5fc8d0e63728.png)\r\n\r\nReally the goal here is to find the most human-friendly option, so that people looking at the output have a vague idea what's going on. That's why I'm not leaping at the chance to use base64.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 725184645, "label": "Better way of representing binary data in .csv output"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1034#issuecomment-712580976", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1034", "id": 712580976, "node_id": "MDEyOklzc3VlQ29tbWVudDcxMjU4MDk3Ng==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-10-20T04:29:23Z", "updated_at": "2020-10-20T04:29:23Z", "author_association": "OWNER", "body": "Most obvious option is base64. Any other potential solutions I'm missing?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 725184645, "label": "Better way of representing binary data in .csv output"}, "performed_via_github_app": null}