id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,pull_request,body,repo,type,active_lock_reason,performed_via_github_app,reactions,draft,state_reason 836829560,MDU6SXNzdWU4MzY4Mjk1NjA=,248,support for Apache Arrow / parquet files I/O,649467,open,0,,,1,2021-03-20T14:59:30Z,2021-10-28T23:46:48Z,,NONE,,"I just started looking at Apache Arrow using pyarrow for import and export of tabular datasets, and it looks quite compelling. It might be worth looking at for sqlite-utils and/or datasette. As a test, I took a random jsonl data dump of a dataset I have with floats, strings, and ints and converted it to arrow's parquet format using the naive `pyarrow.parquet.write_file()` command, which has automatic type inferrence. It compressed down to 7% of the original size. Conversion of a 26MB JSON file and serializing it to parquet was eyeblink instantaneous. Parquet files are portable and can be directly imported into pandas and other analytics software. The only hangup is the automatic type inference of the naive reader. It's great for general laziness and for parsing JSON columns (it correctly interpreted a table of mine with a JSON array). However, I did get an exception for a string column where most entries looked integer-like but had a couple values that weren't -- the reader tried to coerce all of them for some reason, even though the JSON type is string. Since the writer optionally takes a schema, it shouldn't be too hard to grab the sqlite header types. With some additional hinting, you might get datetime columns and JSON, which are native Arrow types. Somewhat tangentially, someone even wrote an sqlite vfs extension for Parquet: https://cldellow.com/2018/06/22/sqlite-parquet-vtable.html ",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/248/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 718395987,MDExOlB1bGxSZXF1ZXN0NTAwNzk4MDkx,1008,Add json_loads and json_dumps jinja2 filters,649467,open,0,,,1,2020-10-09T20:11:34Z,2020-12-15T02:30:28Z,,FIRST_TIME_CONTRIBUTOR,simonw/datasette/pulls/1008,,107914493,pull,,,"{""url"": ""https://api.github.com/repos/simonw/datasette/issues/1008/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",0, 699947574,MDU6SXNzdWU2OTk5NDc1NzQ=,963,Currently selected array facets are not correctly persisted through hidden form fields,649467,closed,0,,5818042,1,2020-09-12T01:49:17Z,2020-09-12T21:54:29Z,2020-09-12T21:54:09Z,NONE,,"Faceted search uses JSON array elements as facets rather than the arrays. However, if a search is ""Apply""ed (using the Apply button), the array itself rather than its elements used. To reproduce: https://latest.datasette.io/fixtures/facetable?_sort=pk&_facet=created&_facet=tags&_facet_array=tags Press ""Apply"", which might be done when removing a filter. Notice that the ""tags"" facet values are now arrays, not array elements. It appears the ""&_facet_array=tags"" element of the query string is dropped.",107914493,issue,,,"{""url"": ""https://api.github.com/repos/simonw/datasette/issues/963/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed 432727685,MDU6SXNzdWU0MzI3Mjc2ODU=,20,JSON column values get extraneously quoted ,649467,closed,0,,4348046,1,2019-04-12T20:15:30Z,2019-05-25T00:57:19Z,2019-05-25T00:57:19Z,NONE,,"If the input to `sqlite-utils insert` includes a column that is a JSON array or object, `sqlite-utils query` will introduce an extra level of quoting on output: ``` # echo '[{""key"": [""one"", ""two"", ""three""]}]' | sqlite-utils insert t.db t - # sqlite-utils t.db 'select * from t' [{""key"": ""[\""one\"", \""two\"", \""three\""]""}] # sqlite3 t.db 'select * from t' [""one"", ""two"", ""three""] ``` This might require an imperfect solution, since sqlite3 doesn't have a JSON type. Perhaps fields that start with `[""` or `{""` and end with `""]` or `""}` could be detected, with a flag to turn off that behavior for weird text fields (or vice versa).",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/20/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed