{"html_url": "https://github.com/simonw/datasette/issues/266#issuecomment-389626715", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/266", "id": 389626715, "node_id": "MDEyOklzc3VlQ29tbWVudDM4OTYyNjcxNQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2018-05-16T18:50:46Z", "updated_at": "2018-05-16T18:50:46Z", "author_association": "OWNER", "body": "> I\u2019d recommend using the Windows-1252 encoding for maximum compatibility, unless you have any characters not in that set, in which case use UTF8 with a byte order mark. Bit of a pain, but some progams (eg various versions of Excel) don\u2019t read UTF8.\r\n**frankieroberto** https://twitter.com/frankieroberto/status/996823071947460616\r\n\r\n> There is software that consumes CSV and doesn't speak UTF8!? Huh. Well I can't just use Windows-1252 because I need to support the full UTF8 range of potential data - maybe I should support an optional ?_encoding=windows-1252 argument\r\n**simonw** https://twitter.com/simonw/status/996824677245857793", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 323681589, "label": "Export to CSV"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/266#issuecomment-389608473", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/266", "id": 389608473, "node_id": "MDEyOklzc3VlQ29tbWVudDM4OTYwODQ3Mw==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2018-05-16T17:52:35Z", "updated_at": "2018-05-16T17:54:11Z", "author_association": "OWNER", "body": "There are some code examples in this issue which should help with the streaming part: https://github.com/channelcat/sanic/issues/1067\r\n\r\nAlso https://github.com/channelcat/sanic/blob/master/docs/sanic/streaming.md#response-streaming", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, 
\"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 323681589, "label": "Export to CSV"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/266#issuecomment-389592566", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/266", "id": 389592566, "node_id": "MDEyOklzc3VlQ29tbWVudDM4OTU5MjU2Ng==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2018-05-16T17:01:29Z", "updated_at": "2018-05-16T17:02:21Z", "author_association": "OWNER", "body": "Let's provide a CSV Dialect definition too: https://frictionlessdata.io/specs/csv-dialect/ - via https://twitter.com/drewdaraabrams/status/996794915680997382", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 323681589, "label": "Export to CSV"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/266#issuecomment-389579762", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/266", "id": 389579762, "node_id": "MDEyOklzc3VlQ29tbWVudDM4OTU3OTc2Mg==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2018-05-16T16:21:12Z", "updated_at": "2018-05-16T16:21:12Z", "author_association": "OWNER", "body": "> I basically want someone to tell me which arguments I can pass to Python's csv.writer() function that will result in the least complaints from people who try to parse the results :)\r\nhttps://twitter.com/simonw/status/996786815938977792", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 323681589, "label": "Export to CSV"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/266#issuecomment-389579363", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/266", "id": 389579363, "node_id": 
"MDEyOklzc3VlQ29tbWVudDM4OTU3OTM2Mw==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2018-05-16T16:20:06Z", "updated_at": "2018-05-16T16:20:06Z", "author_association": "OWNER", "body": "I started a thread on Twitter discussing various CSV output dialects: https://twitter.com/simonw/status/996783395504979968 - I want to pick defaults which will work as well as possible for whatever tools people might be using to consume the data.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 323681589, "label": "Export to CSV"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/266#issuecomment-389572201", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/266", "id": 389572201, "node_id": "MDEyOklzc3VlQ29tbWVudDM4OTU3MjIwMQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2018-05-16T15:58:43Z", "updated_at": "2018-05-16T16:00:47Z", "author_association": "OWNER", "body": "This will likely be implemented in the `BaseView` class, which needs to know how to spot the `.csv` extension, call the underlying JSON generating function and then return the `columns` and `rows` as correctly formatted CSV.\r\n\r\nhttps://github.com/simonw/datasette/blob/9959a9e4deec8e3e178f919e8b494214d5faa7fd/datasette/views/base.py#L201-L207\r\n\r\nThis means it will take ALL arguments that are available to the `.json` view. It may ignore some (e.g. 
`_facet=` makes no sense since CSV tables don't have space to show the facet results).\r\n\r\nIn streaming mode, things will behave a little bit differently - in particular, if `_stream=1` then `_next=` will be forbidden.\r\n\r\nIt can't include a `Content-Length` header because we don't know how many bytes the response will be.\r\n\r\nCSV output will throw an error if the endpoint doesn't have `rows` and `columns` keys, e.g. `/-/inspect.json`.\r\n\r\nSo the implementation...\r\n\r\n- looks for the `.csv` extension\r\n- internally fetches the `.json` data instead\r\n- if no `_stream`, it just transposes that JSON to CSV with the correct content type header\r\n- if `_stream=1`, checks for `_next=` and throws an error if it was provided\r\n- otherwise... fetches the first page and emits the CSV header and first set of rows\r\n- then starts async looping, emitting more CSV rows and following the `_next=` internal reference until done\r\n\r\nI like that this takes advantage of efficient pagination. It may not work so well for views which use offset/limit though.\r\n\r\nIt won't work at all for custom SQL, because custom SQL doesn't support `_next=` pagination. That's fine.\r\n\r\nFor views... the easiest fix is to cut off after the first X000 records. That seems OK. View JSON would need to include a property that the mechanism can identify.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 323681589, "label": "Export to CSV"}, "performed_via_github_app": null}
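The encoding thread above (`?_encoding=windows-1252` vs. UTF-8 with a byte order mark) plus the no-length-header streaming constraint can be sketched with Python's stdlib alone. This is illustrative, not Datasette's actual code; `csv_chunks` and its signature are hypothetical. An incremental encoder is used so that `utf-8-sig` emits its BOM exactly once, at the start of the stream, rather than once per chunk.

```python
import codecs
import csv
import io


def csv_chunks(columns, rows, encoding="utf-8-sig"):
    """Yield encoded CSV chunks, one per row, for a streaming response.

    "utf-8-sig" prepends a byte order mark, which helps Excel detect
    UTF-8; "windows-1252" matches the proposed ?_encoding=windows-1252
    option (values outside that codepage raise UnicodeEncodeError).
    """
    encoder = codecs.getincrementalencoder(encoding)()
    buffer = io.StringIO()
    writer = csv.writer(buffer)  # csv.writer's default "excel" dialect

    def drain():
        # Encode whatever the writer produced since the last drain,
        # then reset the buffer so memory use stays constant.
        chunk = encoder.encode(buffer.getvalue())
        buffer.seek(0)
        buffer.truncate(0)
        return chunk

    writer.writerow(columns)
    yield drain()
    for row in rows:
        writer.writerow(row)
        yield drain()
```

Because chunks are produced lazily, the total byte count is unknown up front, which is exactly why the response can't carry a `Content-Length` header and would rely on chunked transfer encoding instead.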
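The streaming steps in the last comment (emit the header with the first page, then keep following the `_next=` internal reference until done) map naturally onto an async generator. A minimal sketch, where `fetch_json` is a hypothetical coroutine standing in for the internal `.json` call:

```python
import asyncio


async def stream_all_rows(fetch_json):
    """Follow `_next=` pagination internally until the data runs out.

    fetch_json is a stand-in coroutine: it takes a _next token (None for
    the first page) and returns the .json view's shape, assumed here as
    {"columns": [...], "rows": [...], "next": token_or_None}.
    """
    next_token = None
    first_page = True
    while True:
        page = await fetch_json(next_token)
        if first_page:
            yield page["columns"]  # emit the CSV header row exactly once
            first_page = False
        for row in page["rows"]:
            yield row
        next_token = page.get("next")
        if not next_token:
            break  # no more pages to follow
```

Each yielded row would be fed through the CSV writer and flushed to the client as it arrives, so memory use stays flat regardless of table size - the keyset pagination does the heavy lifting.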
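The CSV Dialect suggestion (comment 389592566) could be served by describing Python's own `csv.excel` dialect in the frictionlessdata vocabulary. A sketch, not Datasette's implementation; the key names follow https://frictionlessdata.io/specs/csv-dialect/, and treating 1.2 as the spec version is an assumption:

```python
import csv
import json


def dialect_description(dialect=csv.excel):
    """Render a Python csv dialect as a CSV Dialect JSON document,
    so consumers know exactly how the export is formatted.

    csvddfVersion 1.2 is assumed to be the current spec version.
    """
    return json.dumps({
        "csvddfVersion": 1.2,
        "delimiter": dialect.delimiter,          # ","
        "lineTerminator": dialect.lineterminator,  # "\r\n"
        "quoteChar": dialect.quotechar,          # '"'
        "doubleQuote": bool(dialect.doublequote),
        "header": True,
    })
```

Publishing this alongside the `.csv` endpoint would answer the "which arguments to `csv.writer()`" question in machine-readable form: whatever defaults are picked, the dialect document states them explicitly.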