Comment by simonw (OWNER), 2020-12-29: https://github.com/simonw/datasette/issues/1160#issuecomment-752257666

### Figuring out the API design

I want to be able to support different formats, and to parse them into tables either streaming or in one go, depending on whether the format supports that.

Ideally I want to be able to pull the first 1,024 bytes for the purpose of detecting the format, then replay those bytes again later. I'm considering this a stretch goal though.

CSV is easy to parse as a stream - here’s [how sqlite-utils does it](https://github.com/simonw/sqlite-utils/blob/f1277f638f3a54a821db6e03cb980adad2f2fa35/sqlite_utils/cli.py#L630):

    dialect = "excel-tab" if tsv else "excel"
    with file_progress(json_file, silent=silent) as json_file:
        reader = csv_std.reader(json_file, dialect=dialect)
        headers = next(reader)
        docs = (dict(zip(headers, row)) for row in reader)

(`csv_std` here is the standard library `csv` module imported under another name, and `docs` ends up as a lazy generator of dictionaries, one per row.)

Problem: using `db.insert_all()` could block the event loop for a long time on a big set of rows. Probably easiest to batch the records before calling `insert_all()`, then run one batch at a time using a `db.execute_write_fn()` call.
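A minimal sketch of that batching approach, assuming the write side is a `datasette.database.Database` (the `batched()` helper, the function names and the batch size of 100 are illustrative, not existing API):

    import itertools

    import sqlite_utils

    def batched(iterable, size):
        # Lazily yield lists of up to `size` items from any iterable
        it = iter(iterable)
        while True:
            batch = list(itertools.islice(it, size))
            if not batch:
                return
            yield batch

    async def insert_in_batches(datasette_db, table, docs, batch_size=100):
        # Each batch is written on Datasette's dedicated write thread via
        # execute_write_fn(), so the event loop only waits one batch at a time
        for batch in batched(docs, batch_size):
            def write(conn, batch=batch):
                sqlite_utils.Database(conn)[table].insert_all(batch, alter=True)
            await datasette_db.execute_write_fn(write, block=True)

`block=True` means each batch commits before the next one is queued, which also gives a natural point to report progress back to the client.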
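And one way to sketch the stretch goal of sniffing the first 1,024 bytes and then replaying them, assuming the upload arrives as a non-seekable binary file-like object (`ReplayableStream` and `sniff()` are hypothetical helpers, not anything that exists yet):

    import io

    class ReplayableStream(io.RawIOBase):
        # Wraps a non-seekable binary stream so that bytes already consumed
        # for format detection get replayed before the rest of the stream
        def __init__(self, head, fp):
            self.head = head
            self.fp = fp

        def readable(self):
            return True

        def readinto(self, b):
            if self.head:
                n = min(len(b), len(self.head))
                b[:n] = self.head[:n]
                self.head = self.head[n:]
                return n
            data = self.fp.read(len(b))
            b[:len(data)] = data
            return len(data)

    def sniff(fp, size=1024):
        # Returns the first `size` bytes plus a stream that replays them
        head = fp.read(size)
        return head, io.BufferedReader(ReplayableStream(head, fp))

The `head` bytes can be fed to format detection - for CSV that could be `csv.Sniffer().sniff()` on the decoded sample - while the replayed stream is handed to whichever parser is chosen.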