
issue_comments


69 rows where user = 25778 sorted by updated_at descending




issue >30

  • Add new spatialite helper methods 7
  • Helper methods for working with SpatiaLite 6
  • Scripted exports 5
  • Make it easier to insert geometries, with documentation and maybe code 4
  • Advanced class-based `conversions=` mechanism 4
  • Proposal: datasette query 4
  • register_output_renderer() should support streaming data 3
  • API to insert a single record into an existing table 3
  • Exclude virtual tables from datasette inspect 3
  • Option to automatically configure based on directory layout 2
  • Async support 2
  • Add SpatiaLite helpers to CLI 2
  • Add SpatiaLite helpers to CLI 2
  • ?_trace=1 fails with datasette-geojson for some reason 2
  • Ability to merge databases and tables 2
  • "datasette publish cloudrun" command to publish to Google Cloud Run 1
  • Replace "datasette publish --extra-options" with "--setting" 1
  • sqlite-utils insert: options for column types 1
  • Research: syntactic sugar for using --get with SQL queries, maybe "datasette query" 1
  • Ensure db.path is a string before trying to insert into internal database 1
  • Idea: import CSV to memory, run SQL, export in a single command 1
  • Add new `"sql_file"` key to Canned Queries in metadata? 1
  • Command for creating an empty database 1
  • Add KNN and data_licenses to hidden tables list 1
  • Support for generated columns 1
  • Optional Pandas integration 1
  • Datasette feature for publishing snapshots of query results 1
  • Call for birthday presents: if you're using Datasette, let us know how you're using it here 1
  • Document datasette.urls.row and row_blob 1
  • Make CustomJSONEncoder a documented public API 1
  • …

user 1

  • eyeseast · 69

author_association 1

  • CONTRIBUTOR 69
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions issue performed_via_github_app
1465315726 https://github.com/simonw/sqlite-utils/pull/531#issuecomment-1465315726 https://api.github.com/repos/simonw/sqlite-utils/issues/531 IC_kwDOCGYnMM5XVvGO eyeseast 25778 2023-03-12T22:21:56Z 2023-03-12T22:21:56Z CONTRIBUTOR

Exactly, that's what I was running into. On my M2 MacBook, SpatiaLite ends up in what is -- for the moment -- a non-standard location, so even when I passed in the location with --load-extension, I still hit an error on create-spatial-index.

What I learned doing this originally is that SQLite needs to load the extension for each connection, even if all the SpatiaLite stuff is already in the database. So that's why init_spatialite() gets called again.

Here's the code where I hit the error: https://github.com/eyeseast/boston-parcels/blob/main/Makefile#L30 It works using this branch.

I'm not attached to this solution if you can think of something better. And I'm not sure, TBH, my test would actually catch what I'm after here.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add paths for homebrew on Apple silicon 1620164673  
1457172180 https://github.com/simonw/datasette/issues/2033#issuecomment-1457172180 https://api.github.com/repos/simonw/datasette/issues/2033 IC_kwDOBm6k_c5W2q7U eyeseast 25778 2023-03-06T22:54:52Z 2023-03-06T22:54:52Z CONTRIBUTOR

This would be a nice feature to have with datasette publish too.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
`datasette install -r requirements.txt` 1612296210  
1419357290 https://github.com/simonw/sqlite-utils/issues/524#issuecomment-1419357290 https://api.github.com/repos/simonw/sqlite-utils/issues/524 IC_kwDOCGYnMM5Umaxq eyeseast 25778 2023-02-06T16:21:44Z 2023-02-06T16:21:44Z CONTRIBUTOR

SQLite doesn't have a native DATETIME type. It stores dates internally as strings and then has functions to work with date-like strings. Yes it's weird.
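To illustrate with Python's built-in sqlite3 module (a minimal sketch, not from the original comment):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The literal looks like a date, but SQLite stores it as TEXT;
# the built-in date functions just operate on ISO-8601 strings.
storage_type, next_day = conn.execute(
    "select typeof('2023-02-06'), date('2023-02-06', '+1 day')"
).fetchone()
print(storage_type, next_day)  # text 2023-02-07
```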

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Transformation type `--type DATETIME` 1572766460  
1375810027 https://github.com/simonw/datasette/issues/1983#issuecomment-1375810027 https://api.github.com/repos/simonw/datasette/issues/1983 IC_kwDOBm6k_c5SATHr eyeseast 25778 2023-01-09T15:35:58Z 2023-01-09T15:35:58Z CONTRIBUTOR

Yes please, and thank you. I realized I was maybe getting myself in trouble using that, but I think it's a good way to standardize JSON handling.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Make CustomJSONEncoder a documented public API 1525815985  
1375708725 https://github.com/simonw/datasette/issues/1978#issuecomment-1375708725 https://api.github.com/repos/simonw/datasette/issues/1978 IC_kwDOBm6k_c5R_6Y1 eyeseast 25778 2023-01-09T14:30:00Z 2023-01-09T14:30:00Z CONTRIBUTOR

Totally missed that issue. I can close this as a duplicate.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Document datasette.urls.row and row_blob 1522778923  
1332310772 https://github.com/simonw/datasette/issues/1605#issuecomment-1332310772 https://api.github.com/repos/simonw/datasette/issues/1605 IC_kwDOBm6k_c5PaXL0 eyeseast 25778 2022-11-30T15:06:37Z 2022-11-30T15:06:37Z CONTRIBUTOR

I'll add issues for both and do a documentation PR.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Scripted exports 1108671952  
1331187551 https://github.com/simonw/datasette/issues/1605#issuecomment-1331187551 https://api.github.com/repos/simonw/datasette/issues/1605 IC_kwDOBm6k_c5PWE9f eyeseast 25778 2022-11-29T19:29:42Z 2022-11-29T19:29:42Z CONTRIBUTOR

Interesting. I started a version using metadata like I outlined up top, but I realized that there's no documented way for a plugin to access either metadata or canned queries. Or at least, I couldn't find a way.

There is this method: https://github.com/simonw/datasette/blob/main/datasette/app.py#L472 but I don't want to rely on it if it's not documented. Same with this: https://github.com/simonw/datasette/blob/main/datasette/app.py#L544

If those are safe, I'll build on them. I'm also happy to document them, if that greases the wheels.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Scripted exports 1108671952  
1314241058 https://github.com/simonw/datasette/issues/1886#issuecomment-1314241058 https://api.github.com/repos/simonw/datasette/issues/1886 IC_kwDOBm6k_c5OVboi eyeseast 25778 2022-11-14T19:06:35Z 2022-11-14T19:06:35Z CONTRIBUTOR

This probably counts as a case study: https://github.com/eyeseast/spatial-data-cooking-show. Even has video.

Seriously, though, this workflow has become integral to my work with reporters and editors across USA TODAY Network. Very often, I get sent a folder of data in mixed formats, with a vague ask of how we should communicate some part of it to users. Datasette and its constellation of tools makes it easy to get a quick look at that data, run exploratory queries, map it and ask questions to figure out what's important to show. And then I export a version of the data that's exactly what I need for display.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Call for birthday presents: if you're using Datasette, let us know how you're using it here 1447050738  
1314066229 https://github.com/simonw/datasette/issues/1884#issuecomment-1314066229 https://api.github.com/repos/simonw/datasette/issues/1884 IC_kwDOBm6k_c5OUw81 eyeseast 25778 2022-11-14T16:48:35Z 2022-11-14T16:48:35Z CONTRIBUTOR

I'm realizing I don't know if a virtual table will ever return a count. Maybe it depends on the implementation. For these three, just checking now, it'll always return zero.

That said, I'm not sure there's any downside to having them return zero and caching that. (They're hidden, too.)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Exclude virtual tables from datasette inspect 1439009231  
1313962183 https://github.com/simonw/datasette/issues/1884#issuecomment-1313962183 https://api.github.com/repos/simonw/datasette/issues/1884 IC_kwDOBm6k_c5OUXjH eyeseast 25778 2022-11-14T15:46:32Z 2022-11-14T15:46:32Z CONTRIBUTOR

It does work, though I think it's probably still worth excluding virtual tables that will always be zero. Here's the same inspection as before, now with --load-extension spatialite:

```json
{
    "alltheplaces": {
        "hash": "0843cfe414439ab903c22d1121b7ddbc643418c35c7f0edbcec82ef1452411df",
        "size": 963375104,
        "file": "alltheplaces.db",
        "tables": {
            "spatial_ref_sys": {"count": 6215},
            "spatialite_history": {"count": 18},
            "sqlite_sequence": {"count": 2},
            "geometry_columns": {"count": 3},
            "spatial_ref_sys_aux": {"count": 6164},
            "views_geometry_columns": {"count": 0},
            "virts_geometry_columns": {"count": 0},
            "geometry_columns_statistics": {"count": 3},
            "views_geometry_columns_statistics": {"count": 0},
            "virts_geometry_columns_statistics": {"count": 0},
            "geometry_columns_field_infos": {"count": 0},
            "views_geometry_columns_field_infos": {"count": 0},
            "virts_geometry_columns_field_infos": {"count": 0},
            "geometry_columns_time": {"count": 3},
            "geometry_columns_auth": {"count": 3},
            "views_geometry_columns_auth": {"count": 0},
            "virts_geometry_columns_auth": {"count": 0},
            "data_licenses": {"count": 10},
            "sql_statements_log": {"count": 0},
            "states": {"count": 56},
            "counties": {"count": 3234},
            "idx_states_geometry_rowid": {"count": 56},
            "idx_states_geometry_node": {"count": 3},
            "idx_states_geometry_parent": {"count": 2},
            "idx_counties_geometry_rowid": {"count": 3234},
            "idx_counties_geometry_node": {"count": 98},
            "idx_counties_geometry_parent": {"count": 97},
            "idx_places_geometry_rowid": {"count": 1236796},
            "idx_places_geometry_node": {"count": 38163},
            "idx_places_geometry_parent": {"count": 38162},
            "places": {"count": 1332609},
            "SpatialIndex": {"count": 0},
            "ElementaryGeometries": {"count": 0},
            "KNN": {"count": 0},
            "idx_states_geometry": {"count": 56},
            "idx_counties_geometry": {"count": 3234},
            "idx_places_geometry": {"count": 1236796}
        }
    }
}
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Exclude virtual tables from datasette inspect 1439009231  
1309735529 https://github.com/simonw/datasette/issues/1884#issuecomment-1309735529 https://api.github.com/repos/simonw/datasette/issues/1884 IC_kwDOBm6k_c5OEPpp eyeseast 25778 2022-11-10T03:57:23Z 2022-11-10T03:57:23Z CONTRIBUTOR

Here's how to get a list of virtual tables: https://stackoverflow.com/questions/46617118/how-to-fetch-names-of-virtual-tables
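Following the approach in that answer, virtual tables can be spotted in sqlite_master because they keep their `CREATE VIRTUAL TABLE` DDL. A quick sketch, assuming the stdlib sqlite3 build includes FTS4:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create virtual table docs using fts4(body)")
# Match on the DDL prefix; LIKE is case-insensitive for ASCII,
# and the shadow tables (docs_content etc.) are plain CREATE TABLEs.
virtual = [
    name
    for (name,) in conn.execute(
        "select name from sqlite_master where sql like 'CREATE VIRTUAL TABLE%'"
    )
]
print(virtual)  # ['docs']
```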

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Exclude virtual tables from datasette inspect 1439009231  
1292592210 https://github.com/simonw/datasette/issues/1851#issuecomment-1292592210 https://api.github.com/repos/simonw/datasette/issues/1851 IC_kwDOBm6k_c5NC2RS eyeseast 25778 2022-10-26T20:03:46Z 2022-10-26T20:03:46Z CONTRIBUTOR

Yeah, every time I see something cool done with triggers, I remember that I need to start using triggers.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
API to insert a single record into an existing table 1421544654  
1291228502 https://github.com/simonw/datasette/issues/1851#issuecomment-1291228502 https://api.github.com/repos/simonw/datasette/issues/1851 IC_kwDOBm6k_c5M9pVW eyeseast 25778 2022-10-25T23:02:10Z 2022-10-25T23:02:10Z CONTRIBUTOR

That's reasonable. Canned queries and custom endpoints are certainly going to give more room for specific needs.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
API to insert a single record into an existing table 1421544654  
1290615599 https://github.com/simonw/datasette/issues/1851#issuecomment-1290615599 https://api.github.com/repos/simonw/datasette/issues/1851 IC_kwDOBm6k_c5M7Tsv eyeseast 25778 2022-10-25T14:05:12Z 2022-10-25T14:05:12Z CONTRIBUTOR

This could use a new plugin hook, too. I don't want to complicate your life too much, but for things like GIS, I'd want a way to turn regular JSON into SpatiaLite geometries or combine X/Y coordinates into point geometries and such. Happy to help however I can.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
API to insert a single record into an existing table 1421544654  
1258712931 https://github.com/simonw/sqlite-utils/issues/491#issuecomment-1258712931 https://api.github.com/repos/simonw/sqlite-utils/issues/491 IC_kwDOCGYnMM5LBm9j eyeseast 25778 2022-09-26T22:31:58Z 2022-09-26T22:31:58Z CONTRIBUTOR

Right. The backup command will copy tables completely, but in the case of conflicting table names, the destination gets overwritten silently. That might not be what you want here.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Ability to merge databases and tables 1383646615  
1258508215 https://github.com/simonw/sqlite-utils/issues/491#issuecomment-1258508215 https://api.github.com/repos/simonw/sqlite-utils/issues/491 IC_kwDOCGYnMM5LA0-3 eyeseast 25778 2022-09-26T19:22:14Z 2022-09-26T19:22:14Z CONTRIBUTOR

This might be fairly straightforward using SQLite's backup utility: https://docs.python.org/3/library/sqlite3.html#sqlite3.Connection.backup
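A minimal sketch of that API, showing the whole-database copy (and why conflicting names get overwritten):

```python
import sqlite3

src = sqlite3.connect(":memory:")
src.execute("create table places (id integer primary key, name text)")
src.execute("insert into places (name) values ('Boston'), ('Suffolk')")
src.commit()

dest = sqlite3.connect(":memory:")
# Copies the entire database; a table named "places" already in dest
# would be replaced by the source version.
src.backup(dest)
count = dest.execute("select count(*) from places").fetchone()[0]
print(count)  # 2
```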

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Ability to merge databases and tables 1383646615  
1151887842 https://github.com/simonw/datasette/issues/1528#issuecomment-1151887842 https://api.github.com/repos/simonw/datasette/issues/1528 IC_kwDOBm6k_c5EqGni eyeseast 25778 2022-06-10T03:23:08Z 2022-06-10T03:23:08Z CONTRIBUTOR

I just put together a version of this in a plugin: https://github.com/eyeseast/datasette-query-files. Happy to have any feedback.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add new `"sql_file"` key to Canned Queries in metadata? 1060631257  
1128064864 https://github.com/simonw/datasette/issues/1742#issuecomment-1128064864 https://api.github.com/repos/simonw/datasette/issues/1742 IC_kwDOBm6k_c5DPOdg eyeseast 25778 2022-05-16T19:42:13Z 2022-05-16T19:42:13Z CONTRIBUTOR

Just to add a wrinkle here, this loads fine: https://alltheplaces-datasette.fly.dev/alltheplaces/places.geojson?_trace=1

But also, this doesn't add any trace data: https://alltheplaces-datasette.fly.dev/alltheplaces/places.json?_trace=1

What am I missing?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
?_trace=1 fails with datasette-geojson for some reason 1237586379  
1128049716 https://github.com/simonw/datasette/issues/1742#issuecomment-1128049716 https://api.github.com/repos/simonw/datasette/issues/1742 IC_kwDOBm6k_c5DPKw0 eyeseast 25778 2022-05-16T19:24:44Z 2022-05-16T19:24:44Z CONTRIBUTOR

Where is _trace getting injected? And is it something a plugin should be able to handle? (If it is, I guess I should handle it in this case.)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
?_trace=1 fails with datasette-geojson for some reason 1237586379  
1125342229 https://github.com/simonw/datasette/issues/741#issuecomment-1125342229 https://api.github.com/repos/simonw/datasette/issues/741 IC_kwDOBm6k_c5DE1wV eyeseast 25778 2022-05-12T19:21:16Z 2022-05-12T19:21:16Z CONTRIBUTOR

Came here to check if this had been flagged already. Was helping a colleague get something on Cloud Run and had to dig to find --extra-options="--setting sql_time_limit_ms 2500".

If I get some time next week, maybe I'll try to tackle it. Would definitely make things easier to be able to do something like this:

```sh
datasette publish cloudrun something.db --setting sql_time_limit_ms 2500
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Replace "datasette publish --extra-options" with "--setting" 607223136  
1105642187 https://github.com/simonw/datasette/issues/1101#issuecomment-1105642187 https://api.github.com/repos/simonw/datasette/issues/1101 IC_kwDOBm6k_c5B5sLL eyeseast 25778 2022-04-21T18:59:08Z 2022-04-21T18:59:08Z CONTRIBUTOR

Ha! That was your idea (and a good one).

But it's probably worth measuring to see what overhead it adds. It did require both passing in the database and making the whole thing async.

Just timing the queries themselves:

  1. Using AsGeoJSON(geometry) as geometry takes 10.235 ms
  2. Leaving as binary takes 8.63 ms

Looking at the network panel:

  1. Takes about 200 ms for the fetch request
  2. Takes about 300 ms

I'm not sure how best to time the GeoJSON generation, but it would be interesting to check. Maybe I'll write a plugin to add query times to response headers.

The other thing to consider with async streaming is that it might be well-suited for a slower response. When I have to get the whole result and send a response in a fixed amount of time, I need the most efficient query possible. If I can hang onto a connection and get things one chunk at a time, maybe it's ok if there's some overhead.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
register_output_renderer() should support streaming data 749283032  
1105588651 https://github.com/simonw/datasette/issues/1101#issuecomment-1105588651 https://api.github.com/repos/simonw/datasette/issues/1101 IC_kwDOBm6k_c5B5fGr eyeseast 25778 2022-04-21T18:15:39Z 2022-04-21T18:15:39Z CONTRIBUTOR

What if you split rendering and streaming into two things:

  • render is a function that returns a response
  • stream is a function that sends chunks, or yields chunks passed to an ASGI send callback

That way current plugins still work, and streaming is purely additive. A stream function could get a cursor or iterator of rows, instead of a list, so it could more efficiently handle large queries.
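The split described above can be sketched generically; the names and signatures here are illustrative, not Datasette's actual plugin API:

```python
from typing import Callable, Iterator


def render_plain(rows: list) -> str:
    # Current-style renderer: build the whole response body at once,
    # which requires holding every row in memory.
    return "\n".join(str(row) for row in rows)


def stream_plain(rows: Iterator, send: Callable[[str], None]) -> None:
    # Additive streaming path: emit one chunk per row as it arrives,
    # so a large query never has to be materialized as a list.
    for row in rows:
        send(str(row) + "\n")
```

Usage might look like `stream_plain(cursor, chunks.append)`, with `send` ultimately backed by an ASGI send callable.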

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
register_output_renderer() should support streaming data 749283032  
1099540225 https://github.com/simonw/datasette/issues/1713#issuecomment-1099540225 https://api.github.com/repos/simonw/datasette/issues/1713 IC_kwDOBm6k_c5BiacB eyeseast 25778 2022-04-14T19:09:57Z 2022-04-14T19:09:57Z CONTRIBUTOR

I wonder if this overlaps with what I outlined in #1605. You could run something like this:

```sh
datasette freeze -d exports/
aws s3 cp exports/ s3://my-export-bucket/$(date)
```

And maybe that does what you need. Of course, that plugin isn't built yet. But that's the idea.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Datasette feature for publishing snapshots of query results 1203943272  
1094453751 https://github.com/simonw/datasette/issues/1699#issuecomment-1094453751 https://api.github.com/repos/simonw/datasette/issues/1699 IC_kwDOBm6k_c5BPAn3 eyeseast 25778 2022-04-11T01:32:12Z 2022-04-11T01:32:12Z CONTRIBUTOR

Was looking through old issues and realized a bunch of this got discussed in #1101 (including by me!), so sorry to rehash all this. Happy to help with whatever piece of it I can. Would be very excited to be able to use format plugins with exports.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Proposal: datasette query 1193090967  
1092386254 https://github.com/simonw/datasette/issues/1699#issuecomment-1092386254 https://api.github.com/repos/simonw/datasette/issues/1699 IC_kwDOBm6k_c5BHH3O eyeseast 25778 2022-04-08T02:39:25Z 2022-04-08T02:39:25Z CONTRIBUTOR

And just to think this through a little more, here's what stream_geojson might look like:

```python
async def stream_geojson(datasette, columns, rows, database, stream):
    db = datasette.get_database(database)
    for row in rows:
        feature = await row_to_geojson(row, db)
        stream.write(feature + "\n")  # just assuming newline mode for now
```

Alternately, that could be an async generator, like this:

```python
async def stream_geojson(datasette, columns, rows, database):
    db = datasette.get_database(database)
    for row in rows:
        feature = await row_to_geojson(row, db)
        yield feature
```

Not sure which makes more sense, but I think this pattern would open up a lot of possibilities. If you had your `stream_indented_json` function, you could do `yield from stream_indented_json(rows, 2)` and be on your way.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Proposal: datasette query 1193090967  
1092370880 https://github.com/simonw/datasette/issues/1699#issuecomment-1092370880 https://api.github.com/repos/simonw/datasette/issues/1699 IC_kwDOBm6k_c5BHEHA eyeseast 25778 2022-04-08T02:07:40Z 2022-04-08T02:07:40Z CONTRIBUTOR

So maybe render_output_render returns something like this:

```python
@hookimpl
def register_output_renderer(datasette):
    return {
        "extension": "geojson",
        "render": render_geojson,
        "stream": stream_geojson,
        "can_render": can_render_geojson,
    }
```

And stream gets an iterator, instead of a list of rows, so it can efficiently handle large queries. Maybe it also gets passed a destination stream, or it returns an iterator. I'm not sure what makes more sense. Either way, that might cover both CLI exports and streaming responses.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Proposal: datasette query 1193090967  
1092357672 https://github.com/simonw/datasette/issues/1699#issuecomment-1092357672 https://api.github.com/repos/simonw/datasette/issues/1699 IC_kwDOBm6k_c5BHA4o eyeseast 25778 2022-04-08T01:39:40Z 2022-04-08T01:39:40Z CONTRIBUTOR

> My best thought on how to differentiate them so far is plugins: if Datasette plugins that provide alternative outputs - like .geojson and .yml and suchlike - also work for the datasette query command that would make a lot of sense to me.

That's my thinking, too. It's really the thing I've been wanting since writing datasette-geojson, since I'm always exporting with datasette --get. The workflow I'm always looking for is something like this:

```sh
cd alltheplaces-datasette
datasette query dunkin_in_suffolk -f geojson -o dunkin_in_suffolk.geojson
```

I think this probably needs either a new plugin hook separate from register_output_renderer or a way to use that without going through the HTTP stack. Or maybe a render mode that writes to a stream instead of a response. Maybe there's a new key in the dictionary that register_output_renderer returns that handles CLI exports.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Proposal: datasette query 1193090967  
1077671779 https://github.com/simonw/sqlite-utils/issues/399#issuecomment-1077671779 https://api.github.com/repos/simonw/sqlite-utils/issues/399 IC_kwDOCGYnMM5AO_dj eyeseast 25778 2022-03-24T14:11:33Z 2022-03-24T14:11:43Z CONTRIBUTOR

Coming back to this. I was about to add a utility function to datasette-geojson to convert lat/lng columns to geometries. Thankfully I googled first. There's a SpatiaLite function for this: MakePoint.

```sql
select MakePoint(longitude, latitude) as geometry from places;
```

I'm not sure if that would work with conversions, since it needs two columns, but it's an option for tables that already have latitude, longitude columns.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Make it easier to insert geometries, with documentation and maybe code 1124731464  
1067981656 https://github.com/simonw/sqlite-utils/issues/131#issuecomment-1067981656 https://api.github.com/repos/simonw/sqlite-utils/issues/131 IC_kwDOCGYnMM4_qBtY eyeseast 25778 2022-03-15T13:21:42Z 2022-03-15T13:21:42Z CONTRIBUTOR

Just ran into this issue last night. I have a big table that's mostly numbers, but also a zip code column in a state where ZIP codes start with 0. Would be great to run something like this:

```sh
sqlite-utils insert data.db places file.csv --csv --detect-types --type zipcode text
```

Maybe I'll take a crack at this one.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
sqlite-utils insert: options for column types 675753042  
1065477258 https://github.com/simonw/sqlite-utils/issues/411#issuecomment-1065477258 https://api.github.com/repos/simonw/sqlite-utils/issues/411 IC_kwDOCGYnMM4_geSK eyeseast 25778 2022-03-11T20:14:59Z 2022-03-11T20:14:59Z CONTRIBUTOR

Good call on adding this to create-table, especially for stored columns. Having the stored/virtual split might make this tricky to implement, but I haven't gone any farther than thinking about what the CLI looks like. I'm going to try making the SQL side work first and figure that'll tell me more about what it needs.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Support for generated columns 1160034488  
1059647114 https://github.com/simonw/sqlite-utils/issues/412#issuecomment-1059647114 https://api.github.com/repos/simonw/sqlite-utils/issues/412 IC_kwDOCGYnMM4_KO6K eyeseast 25778 2022-03-05T01:54:24Z 2022-03-05T01:54:24Z CONTRIBUTOR

I haven't tried this, but it looks like Pandas has a method for this: https://pandas.pydata.org/docs/reference/api/pandas.read_sql_query.html
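For example, assuming pandas is installed (this sketch is not from the original comment):

```python
import sqlite3

import pandas as pd

conn = sqlite3.connect(":memory:")
conn.execute("create table places (id integer, name text)")
conn.executemany("insert into places values (?, ?)", [(1, "Boston"), (2, "Suffolk")])

# read_sql_query runs the SQL against the connection and
# returns the result set as a DataFrame
df = pd.read_sql_query("select * from places", conn)
print(df.shape)  # (2, 2)
```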

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Optional Pandas integration 1160182768  
1040998433 https://github.com/simonw/sqlite-utils/pull/407#issuecomment-1040998433 https://api.github.com/repos/simonw/sqlite-utils/issues/407 IC_kwDOCGYnMM4-DGAh eyeseast 25778 2022-02-16T01:29:39Z 2022-02-16T01:29:39Z CONTRIBUTOR

Happy to do it and have it in the library. Going to use it a bunch. This whole SpatiaLite toolchain become a huge part of my work in the past year.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add SpatiaLite helpers to CLI 1138948786  
1040580250 https://github.com/simonw/sqlite-utils/pull/407#issuecomment-1040580250 https://api.github.com/repos/simonw/sqlite-utils/issues/407 IC_kwDOCGYnMM4-Bf6a eyeseast 25778 2022-02-15T17:40:00Z 2022-02-15T17:40:00Z CONTRIBUTOR

@simonw I think this is ready for a look.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add SpatiaLite helpers to CLI 1138948786  
1038336591 https://github.com/simonw/sqlite-utils/issues/398#issuecomment-1038336591 https://api.github.com/repos/simonw/sqlite-utils/issues/398 IC_kwDOCGYnMM4948JP eyeseast 25778 2022-02-13T18:48:21Z 2022-02-13T18:49:49Z CONTRIBUTOR

Been chipping away at this between other things and realized sqlite-utils init-spatialite is probably unnecessary. Any of the other commands requires running db.init_spatialite to have the extension functions available, and that will do everything init-spatialite would do.

I think it's probably worth keeping a SpatiaLite flag on create-database in case you wanted to create all the spatial metadata up front. Otherwise, it's going to get added the first time you run add-geometry-column or create-spatial-index, which is probably fine in most cases.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add SpatiaLite helpers to CLI 1124237013  
1035057014 https://github.com/simonw/sqlite-utils/issues/402#issuecomment-1035057014 https://api.github.com/repos/simonw/sqlite-utils/issues/402 IC_kwDOCGYnMM49sbd2 eyeseast 25778 2022-02-10T15:30:28Z 2022-02-10T15:30:40Z CONTRIBUTOR

Yeah, the CLI experience is probably where any kind of multi-column, configured setup is going to fall apart. Sticking with GIS examples, one way I might think about this is using the fiona CLI:

```sh
# assuming a database is already created and has SpatiaLite
fio cat boundary.shp | sqlite-utils insert boundaries --conversion geometry GeometryGeoJSON -
```

Anyway, very interested to see where you land here.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Advanced class-based `conversions=` mechanism 1125297737  
1032732242 https://github.com/simonw/sqlite-utils/issues/402#issuecomment-1032732242 https://api.github.com/repos/simonw/sqlite-utils/issues/402 IC_kwDOCGYnMM49jj5S eyeseast 25778 2022-02-08T15:26:59Z 2022-02-08T15:26:59Z CONTRIBUTOR

What if you did something like this:

```python
class Conversion:
    def __init__(self, *args, **kwargs):
        "Put whatever settings you need here"

    def python(self, row, column, value):  # not sure on args here
        "Python step to transform value"
        return value

    def sql(self, row, column, value):
        "Return the actual sql that goes in the insert/update step, and maybe params"
        # value is the return of self.python()
        return value, []
```

This way, you're always passing an instance, which has methods that do the conversion. (Or you're passing a SQL string, as you would now.) The __init__ could take column names, or SRID, or whatever other setup state you need per row, but the row is getting processed with the python and sql methods (or whatever you want to call them). This is pretty rough, so do what you will with names and args and such.

You'd then use it like this:

```python
# subclass might be unneeded here, if methods are present
class LngLatConversion(Conversion):
    def __init__(self, x="longitude", y="latitude"):
        self.x = x
        self.y = y

    def python(self, row, column, value):
        x = row[self.x]
        y = row[self.y]
        return x, y

    def sql(self, row, column, value):
        # value is now a tuple, returned above
        s = "GeomFromText(POINT(? ?))"
        return s, value


table.insert_all(rows, conversions={"point": LngLatConversion("lng", "lat")})
```

I haven't thought through all the implementation details here, and it'll probably break in ways I haven't foreseen, but wanted to get this idea out of my head. Hope it helps.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Advanced class-based `conversions=` mechanism 1125297737  
1031791783 https://github.com/simonw/sqlite-utils/issues/402#issuecomment-1031791783 https://api.github.com/repos/simonw/sqlite-utils/issues/402 IC_kwDOCGYnMM49f-Sn eyeseast 25778 2022-02-07T18:37:40Z 2022-02-07T18:37:40Z CONTRIBUTOR

I've never used it either, but it's interesting, right? Feel like I should try it for something.

I'm trying to get my head around how this conversions feature might work, because I really like the idea of it.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Advanced class-based `conversions=` mechanism 1125297737  
1031779460 https://github.com/simonw/sqlite-utils/issues/402#issuecomment-1031779460 https://api.github.com/repos/simonw/sqlite-utils/issues/402 IC_kwDOCGYnMM49f7SE eyeseast 25778 2022-02-07T18:24:56Z 2022-02-07T18:24:56Z CONTRIBUTOR

I wonder if there's any overlap with the goals here and the sqlite3 module's concept of adapters and converters: https://docs.python.org/3/library/sqlite3.html#sqlite-and-python-types

I'm not sure that's exactly what we're talking about here, but it might be a parallel with some useful ideas to borrow.
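For reference, the sqlite3 adapter/converter machinery works roughly like this (a minimal stdlib sketch, separate from anything sqlite-utils does today):

```python
import datetime
import sqlite3

# Adapter: turns a Python type into a SQLite storage value on insert
sqlite3.register_adapter(datetime.date, lambda d: d.isoformat())
# Converter: turns a stored value back into a Python type, keyed by the declared column type
sqlite3.register_converter("date", lambda b: datetime.date.fromisoformat(b.decode()))

conn = sqlite3.connect(":memory:", detect_types=sqlite3.PARSE_DECLTYPES)
conn.execute("create table events (name text, day date)")
conn.execute("insert into events values (?, ?)", ("launch", datetime.date(2022, 2, 7)))
row = conn.execute("select name, day from events").fetchone()
# row[1] comes back as a datetime.date, not a string
```

The interesting parallel is that adapters/converters are registered per type, while conversions= is per column, so they solve overlapping but not identical problems.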

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Advanced class-based `conversions=` mechanism 1125297737  
1030741289 https://github.com/simonw/sqlite-utils/issues/399#issuecomment-1030741289 https://api.github.com/repos/simonw/sqlite-utils/issues/399 IC_kwDOCGYnMM49b90p eyeseast 25778 2022-02-06T03:03:43Z 2022-02-06T03:03:43Z CONTRIBUTOR

I wonder if there are any interesting non-geospatial canned conversions that it would be worth including?

Off the top of my head:

  • Un-nesting JSON objects into columns
  • Splitting arrays
  • Normalizing dates and times
  • URL munging with urlparse
  • Converting strings to numbers

Some of this is easy enough with SQL functions, some is easier in Python. Maybe that's where having pre-built classes gets really handy, because it saves you from thinking about which way it's implemented.
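Normalizing dates, for instance, is a one-line SQL fragment wrapped around the bound value, which is essentially what the existing conversions= mechanism does. A plain-sqlite3 sketch of the same idea:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table events (name text, day text)")
# the "conversion" is just a SQL function wrapped around the placeholder,
# comparable to passing conversions={"day": "date(?)"} in sqlite-utils
conn.execute("insert into events values (?, date(?))", ("launch", "2022-02-07 18:24:56"))
day = conn.execute("select day from events").fetchone()[0]
# day is now the normalized date string "2022-02-07"
```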

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Make it easier to insert geometries, with documentation and maybe code 1124731464  
1030740826 https://github.com/simonw/sqlite-utils/issues/399#issuecomment-1030740826 https://api.github.com/repos/simonw/sqlite-utils/issues/399 IC_kwDOCGYnMM49b9ta eyeseast 25778 2022-02-06T02:59:10Z 2022-02-06T02:59:10Z CONTRIBUTOR

All this said, I don't think it's unreasonable to point people to dedicated tools like geojson-to-sqlite. If I'm dealing with a bunch of GeoJSON or Shapefiles, I need something to read those anyway (or I need to figure out virtual tables). But something like this might make it easier to build those libraries, or standardize the underlying parts.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Make it easier to insert geometries, with documentation and maybe code 1124731464  
1030740653 https://github.com/simonw/sqlite-utils/issues/399#issuecomment-1030740653 https://api.github.com/repos/simonw/sqlite-utils/issues/399 IC_kwDOCGYnMM49b9qt eyeseast 25778 2022-02-06T02:57:17Z 2022-02-06T02:57:17Z CONTRIBUTOR

I like the idea of having stock conversions you could import. I'd actually move them to a dedicated module (call it sqlite_utils.conversions or something), because it's different from other utilities. Maybe they even take configuration, or they're composable.

```python
from sqlite_utils.conversions import LongitudeLatitude

db["places"].insert(
    {
        "name": "London",
        "lng": -0.118092,
        "lat": 51.509865,
    },
    conversions={"point": LongitudeLatitude("lng", "lat")},
)
```

I would definitely use that for every CSV I get with lat/lng columns where I actually need GeoJSON.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Make it easier to insert geometries, with documentation and maybe code 1124731464  
1030629879 https://github.com/simonw/sqlite-utils/issues/398#issuecomment-1030629879 https://api.github.com/repos/simonw/sqlite-utils/issues/398 IC_kwDOCGYnMM49bin3 eyeseast 25778 2022-02-05T13:57:33Z 2022-02-05T19:49:38Z CONTRIBUTOR

I'm mostly using geojson-to-sqlite at the moment. Even with shapefiles, I'm usually converting to GeoJSON and projecting to EPSG:4326 (with ogr2ogr) first.

I think an open question here is how much you want to leave to external libraries and how much you want here. My thinking has been that adding Spatialite helpers here would make external stuff easier, but it would be nice to have some standard way to insert geometries.

I'm in the middle of adding GeoJSON and Spatialite support to geocode-sqlite, and that will probably use WKT. Since that's all points, I think I can just make the string inline. But for polygons, I'd generally use Shapely, which probably isn't a dependency you want to add to sqlite-utils.
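For points, the inline WKT string really is trivial (a tiny sketch; polygons are where something like Shapely earns its keep):

```python
def wkt_point(lng, lat):
    # inline WKT for a single point; could be bound into GeomFromText(?, 4326) in SQL
    return f"POINT({lng} {lat})"
```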

I've also been trying to get some of the approaches here to work, but haven't had any success so far.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add SpatiaLite helpers to CLI 1124237013  
1030002502 https://github.com/simonw/sqlite-utils/pull/385#issuecomment-1030002502 https://api.github.com/repos/simonw/sqlite-utils/issues/385 IC_kwDOCGYnMM49ZJdG eyeseast 25778 2022-02-04T13:50:19Z 2022-02-04T13:50:19Z CONTRIBUTOR

Awesome. Thanks for your help getting it in. Will now look at adding CLI versions of this. It's going to be super helpful on a bunch of my projects.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add new spatialite helper methods 1102899312  
1029370537 https://github.com/simonw/sqlite-utils/pull/385#issuecomment-1029370537 https://api.github.com/repos/simonw/sqlite-utils/issues/385 IC_kwDOCGYnMM49WvKp eyeseast 25778 2022-02-03T20:25:58Z 2022-02-03T20:25:58Z CONTRIBUTOR

OK, I moved all the GIS helpers into db.py as methods on Database and Table, and I put find_spatialite back in utils.py. I deleted gis.py, since there's nothing left in it. Docs and tests are updated and passing.

I think this is better.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add new spatialite helper methods 1102899312  
1029338360 https://github.com/simonw/sqlite-utils/pull/385#issuecomment-1029338360 https://api.github.com/repos/simonw/sqlite-utils/issues/385 IC_kwDOCGYnMM49WnT4 eyeseast 25778 2022-02-03T19:43:56Z 2022-02-03T19:43:56Z CONTRIBUTOR

Works for me. I was just looking at how the FTS extensions work and they're just methods, too. So this can be consistent with that.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add new spatialite helper methods 1102899312  
1029326568 https://github.com/simonw/sqlite-utils/pull/385#issuecomment-1029326568 https://api.github.com/repos/simonw/sqlite-utils/issues/385 IC_kwDOCGYnMM49Wkbo eyeseast 25778 2022-02-03T19:28:26Z 2022-02-03T19:28:26Z CONTRIBUTOR

from sqlite_utils.utils import find_spatialite is part of the documented API already:

https://sqlite-utils.datasette.io/en/3.22.1/python-api.html#finding-spatialite

To avoid needing to bump the major version number to 4 to indicate a backwards incompatible change, we should keep a from .gis import find_spatialite line at the top of utils.py such that any existing code with that documented import continues to work.

This is fixed now. I had to take out the type annotations for Database and Table to avoid a circular import, but that's fine and may be moot if these become class methods.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add new spatialite helper methods 1102899312  
1029317527 https://github.com/simonw/sqlite-utils/issues/79#issuecomment-1029317527 https://api.github.com/repos/simonw/sqlite-utils/issues/79 IC_kwDOCGYnMM49WiOX eyeseast 25778 2022-02-03T19:18:02Z 2022-02-03T19:18:02Z CONTRIBUTOR

Taking part of the conversation from #385 here.

Would sqlite-utils add-geometry-column ... be a good CLI enhancement, for example?

Yes. And also sqlite-utils create-spatial-index would be great to have. My plan would be to add those once the Python API is settled.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Helper methods for working with SpatiaLite 557842245  
1029306428 https://github.com/simonw/sqlite-utils/pull/385#issuecomment-1029306428 https://api.github.com/repos/simonw/sqlite-utils/issues/385 IC_kwDOCGYnMM49Wfg8 eyeseast 25778 2022-02-03T19:03:43Z 2022-02-03T19:03:43Z CONTRIBUTOR

I thought about adding these as methods on Database and Table, and I'm back and forth on it for the same reasons you are. It's certainly cleaner, and it's clearer what you're operating on. I could go either way.

I do sort of like having all the Spatialite stuff in its own module, just because it's built around an extension you might not have or want, but I don't know if that's a good reason to have a different API.

You could have init_spatialite add methods to Database and Table, so they're only there if you have Spatialite set up. Is that too clever? It feels too clever.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add new spatialite helper methods 1102899312  
1029180984 https://github.com/simonw/sqlite-utils/pull/385#issuecomment-1029180984 https://api.github.com/repos/simonw/sqlite-utils/issues/385 IC_kwDOCGYnMM49WA44 eyeseast 25778 2022-02-03T16:42:04Z 2022-02-03T16:42:04Z CONTRIBUTOR

Fixed my spelling. That's a useful thing.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add new spatialite helper methods 1102899312  
1029175907 https://github.com/simonw/sqlite-utils/pull/385#issuecomment-1029175907 https://api.github.com/repos/simonw/sqlite-utils/issues/385 IC_kwDOCGYnMM49V_pj eyeseast 25778 2022-02-03T16:36:54Z 2022-02-03T16:36:54Z CONTRIBUTOR

@simonw Not sure if you've seen this, but any chance you can run the tests?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add new spatialite helper methods 1102899312  
1018778667 https://github.com/simonw/datasette/issues/1605#issuecomment-1018778667 https://api.github.com/repos/simonw/datasette/issues/1605 IC_kwDOBm6k_c48uVQr eyeseast 25778 2022-01-21T19:00:01Z 2022-01-21T19:00:01Z CONTRIBUTOR

Let me know if you want help prototyping any of this, because I'm thinking about it and trying stuff out. Happy to be a sounding board, if it helps.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Scripted exports 1108671952  
1018741262 https://github.com/simonw/datasette/issues/1605#issuecomment-1018741262 https://api.github.com/repos/simonw/datasette/issues/1605 IC_kwDOBm6k_c48uMIO eyeseast 25778 2022-01-21T18:05:09Z 2022-01-21T18:05:09Z CONTRIBUTOR

Thinking about this more, as well as #1356 and various other tickets related to output formats, I think there's a missing plugin hook for formatting results, separate from register_output_renderer (or maybe part of it, depending on #1101).

Right now, as I understand it, getting output in any format goes through the normal view stack -- a table, a row or a query -- and so by the time register_output_renderer gets it, the results have already been truncated or paginated. What I'd want, I think, is to be able to register ways to format results independent of where those results are sent.

It's possible this could be done using conn.row_factory (maybe in the prepare_connection hook), but I'm not sure that's where it belongs.

Another option is some kind of registry of serializers, which register_output_renderer and other plugin hooks could use. What I'm trying to avoid here is writing a plugin that also needs plugins for formats I haven't thought of yet.
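A minimal sketch of that registry idea (every name here is invented for illustration; this is not a real Datasette hook):

```python
# hypothetical serializer registry; plugins would register formats here
SERIALIZERS = {}

def register_serializer(format_name):
    def decorator(func):
        SERIALIZERS[format_name] = func
        return func
    return decorator

@register_serializer("geojson")
def to_geojson(rows):
    # turn a list of row dicts into a GeoJSON FeatureCollection shell
    return {
        "type": "FeatureCollection",
        "features": [{"type": "Feature", "properties": dict(row)} for row in rows],
    }

# register_output_renderer (or any other hook) could then look formats up by name
features = SERIALIZERS["geojson"]([{"name": "Boston"}])
```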

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Scripted exports 1108671952  
1016994329 https://github.com/simonw/datasette/issues/1605#issuecomment-1016994329 https://api.github.com/repos/simonw/datasette/issues/1605 IC_kwDOBm6k_c48nhoZ eyeseast 25778 2022-01-20T00:27:17Z 2022-01-20T00:27:17Z CONTRIBUTOR

Right now, I usually have a line in a Makefile like this:

```make
combined.geojson: project.db
	pipenv run datasette project.db --get /project/combined.geojson \
	--load-extension spatialite \
	--setting sql_time_limit_ms 5000 \
	--setting max_returned_rows 20000 \
	-m metadata.yml > $@
```

That all assumes I've loaded whatever I need into project.db and created a canned query called combined (and then uses datasette-geojson for geojson output).

It works, but as you can see, it's a lot to manage, a lot of boilerplate, and it wasn't obvious how to get there. If there's an error in the canned query, I get an HTML error page, so that's hard to debug. And it's only one query, so each output needs a line like this. Make isn't ideal, either, for that reason.

The thing I really liked with datafreeze was doing templated filenames. I have a project now where I need to export a bunch of little geojson files, based on queries, and it would be awesome to be able to do something like this:

```yml
databases:
  project:
    queries:
      boundaries:
        sql: "SELECT * FROM boundaries"
        filename: "boundaries/{id}.geojson"
        mode: "item"
        format: geojson
```

And then do:

```sh
datasette freeze -m metadata.yml project.db
```

For HTML export, maybe there's a template argument, or format: template or something. And that gets you a static site generator, kinda for free.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Scripted exports 1108671952  
1016651485 https://github.com/simonw/datasette/issues/1601#issuecomment-1016651485 https://api.github.com/repos/simonw/datasette/issues/1601 IC_kwDOBm6k_c48mN7d eyeseast 25778 2022-01-19T16:39:03Z 2022-01-19T16:39:03Z CONTRIBUTOR

I think both of these are Spatialite specific. They get generated when you first initialize the extension. KNN is actually deprecated in favor of KNN2, as I understand it.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add KNN and data_licenses to hidden tables list 1105916061  
1013698557 https://github.com/simonw/sqlite-utils/issues/79#issuecomment-1013698557 https://api.github.com/repos/simonw/sqlite-utils/issues/79 IC_kwDOCGYnMM48a8_9 eyeseast 25778 2022-01-15T15:15:22Z 2022-01-15T15:15:22Z CONTRIBUTOR

@simonw I have a PR here https://github.com/simonw/sqlite-utils/pull/385 that adds Spatialite helpers on the Python side. Please let me know how it looks.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Helper methods for working with SpatiaLite 557842245  
1012413729 https://github.com/simonw/sqlite-utils/issues/79#issuecomment-1012413729 https://api.github.com/repos/simonw/sqlite-utils/issues/79 IC_kwDOCGYnMM48WDUh eyeseast 25778 2022-01-13T18:50:00Z 2022-01-13T18:50:00Z CONTRIBUTOR

One more thing I'm going to add: A method to add a geometry column, which I'll need to do to create a spatial index on a table.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Helper methods for working with SpatiaLite 557842245  
1012253198 https://github.com/simonw/sqlite-utils/issues/79#issuecomment-1012253198 https://api.github.com/repos/simonw/sqlite-utils/issues/79 IC_kwDOCGYnMM48VcIO eyeseast 25778 2022-01-13T15:39:14Z 2022-01-13T15:39:14Z CONTRIBUTOR

Other thing: If there get to be enough utils, I think it's worth moving all the spatialite stuff into its own file (gis.py or something) just so it's easier to find later.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Helper methods for working with SpatiaLite 557842245  
1012230212 https://github.com/simonw/sqlite-utils/issues/79#issuecomment-1012230212 https://api.github.com/repos/simonw/sqlite-utils/issues/79 IC_kwDOCGYnMM48VWhE eyeseast 25778 2022-01-13T15:15:13Z 2022-01-13T15:15:13Z CONTRIBUTOR

Some proposals I'd add to sqlite-utils:

Some version of this, from geojson-to-sqlite:

```python
def init_spatialite(db, lib):
    db.conn.enable_load_extension(True)
    db.conn.load_extension(lib)
    # Initialize SpatiaLite if not yet initialized
    if "spatial_ref_sys" in db.table_names():
        return
    db.conn.execute("select InitSpatialMetadata(1)")
```

Also a function for creating a spatial index:

```python
db.conn.execute("select CreateSpatialIndex(?, ?)", [table, "geometry"])
```

I don't know the nuances of updating a spatial index, or checking if one already exists. This could be a CLI method like:

```sh
sqlite-utils spatial-index spatial.db table-name column-name
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Helper methods for working with SpatiaLite 557842245  
1012158895 https://github.com/simonw/sqlite-utils/issues/79#issuecomment-1012158895 https://api.github.com/repos/simonw/sqlite-utils/issues/79 IC_kwDOCGYnMM48VFGv eyeseast 25778 2022-01-13T13:55:59Z 2022-01-13T13:55:59Z CONTRIBUTOR

Came here to add this. I might pick it up.

Would also add a utility to create (and update and delete?) a spatial index. It's not much code but I have to look it up every time.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Helper methods for working with SpatiaLite 557842245  
983155079 https://github.com/simonw/sqlite-utils/issues/348#issuecomment-983155079 https://api.github.com/repos/simonw/sqlite-utils/issues/348 IC_kwDOCGYnMM46mcGH eyeseast 25778 2021-12-01T00:28:40Z 2021-12-01T00:28:40Z CONTRIBUTOR

I'd use this. Right now, I tend to do touch my.db and then enable-wal or whatever else, but I'm never sure if that's a bad idea.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Command for creating an empty database 1067771698  
953911245 https://github.com/simonw/sqlite-utils/issues/242#issuecomment-953911245 https://api.github.com/repos/simonw/sqlite-utils/issues/242 IC_kwDOCGYnMM4424fN eyeseast 25778 2021-10-28T14:37:55Z 2021-10-28T14:37:55Z CONTRIBUTOR

I've been thinking about this a bit lately, doing a project that involves moving a lot of data in and out of SQLite files, datasette and GeoJSON. This has me leaning toward the idea that something like datasette query would be a better place to do async queries.

I know there's a lot of overlap in sqlite-utils and datasette, and maybe keeping sqlite-utils synchronous would let datasette be entirely async and give a cleaner separation of implementations.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Async support 817989436  
869191854 https://github.com/simonw/datasette/issues/1101#issuecomment-869191854 https://api.github.com/repos/simonw/datasette/issues/1101 MDEyOklzc3VlQ29tbWVudDg2OTE5MTg1NA== eyeseast 25778 2021-06-27T16:42:14Z 2021-06-27T16:42:14Z CONTRIBUTOR

This would really help with this issue: https://github.com/eyeseast/datasette-geojson/issues/7

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
register_output_renderer() should support streaming data 749283032  
861944202 https://github.com/simonw/sqlite-utils/issues/272#issuecomment-861944202 https://api.github.com/repos/simonw/sqlite-utils/issues/272 MDEyOklzc3VlQ29tbWVudDg2MTk0NDIwMg== eyeseast 25778 2021-06-16T01:41:03Z 2021-06-16T01:41:03Z CONTRIBUTOR

So, I do things like this a lot, too. I like the idea of piping in from stdin. Something like this would be nice to do in a makefile:

```sh
cat file.csv | sqlite-utils --csv --table data - 'SELECT * FROM data WHERE col="whatever"' > filtered.csv
```

If you assumed that you're always piping out the same format you're piping in, the option names don't have to change. Depends how much you want to change formats.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Idea: import CSV to memory, run SQL, export in a single command 921878733  
857298526 https://github.com/simonw/datasette/pull/1370#issuecomment-857298526 https://api.github.com/repos/simonw/datasette/issues/1370 MDEyOklzc3VlQ29tbWVudDg1NzI5ODUyNg== eyeseast 25778 2021-06-09T01:18:59Z 2021-06-09T01:18:59Z CONTRIBUTOR

I'm happy to grab some or all of these in this PR, if you want.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Ensure db.path is a string before trying to insert into internal database 914130834  
853895159 https://github.com/simonw/datasette/issues/1356#issuecomment-853895159 https://api.github.com/repos/simonw/datasette/issues/1356 MDEyOklzc3VlQ29tbWVudDg1Mzg5NTE1OQ== eyeseast 25778 2021-06-03T14:03:59Z 2021-06-03T14:03:59Z CONTRIBUTOR

(Putting thoughts here to keep the conversation in one place.)

I think using datasette for this use-case is the right approach. I usually have both datasette and sqlite-utils installed in the same project, and that's where I'm trying out queries, so it probably makes the most sense to have datasette also manage the output (and maybe the input, too).

It seems like both --get and --query could work better as subcommands, rather than options, if you're looking at building out a full CLI experience in datasette. It would give a cleaner separation in what you're trying to do and let each have its own dedicated options. So something like this:

```sh
# run an arbitrary query
datasette query covid.db "select * from ny_times_us_counties limit 1" --format yaml

# run a canned query
datasette get covid.db some-canned-query --format yaml
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Research: syntactic sugar for using --get with SQL queries, maybe "datasette query" 910092577  
787121933 https://github.com/simonw/sqlite-utils/issues/242#issuecomment-787121933 https://api.github.com/repos/simonw/sqlite-utils/issues/242 MDEyOklzc3VlQ29tbWVudDc4NzEyMTkzMw== eyeseast 25778 2021-02-27T19:18:57Z 2021-02-27T19:18:57Z CONTRIBUTOR

I think HTTPX gets it exactly right, with a clear separation between sync and async clients, each with a basically identical API. (I'm about to switch feed-to-sqlite over to it, from Requests, to eventually make way for async support.)
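The pattern is easy to mimic: two client classes with mirrored method signatures, one sync and one async (a toy sketch of the shape, not HTTPX itself):

```python
import asyncio

class SyncClient:
    def get(self, url):
        return f"GET {url}"

class AsyncClient:
    async def get(self, url):
        return f"GET {url}"

# identical call shape; only the await (via asyncio.run here) differs
sync_result = SyncClient().get("https://example.com")
async_result = asyncio.run(AsyncClient().get("https://example.com"))
```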

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Async support 817989436  
618758326 https://github.com/simonw/datasette/issues/731#issuecomment-618758326 https://api.github.com/repos/simonw/datasette/issues/731 MDEyOklzc3VlQ29tbWVudDYxODc1ODMyNg== eyeseast 25778 2020-04-24T01:55:00Z 2020-04-24T01:55:00Z CONTRIBUTOR

Mounting ./static at /static seems the simplest way. Saves you the trouble of deciding what else (img for example) gets special treatment.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Option to automatically configure based on directory layout 605110015  
618126449 https://github.com/simonw/datasette/issues/731#issuecomment-618126449 https://api.github.com/repos/simonw/datasette/issues/731 MDEyOklzc3VlQ29tbWVudDYxODEyNjQ0OQ== eyeseast 25778 2020-04-23T01:38:55Z 2020-04-23T01:38:55Z CONTRIBUTOR

I've almost suggested this same thing a couple times. I tend to have a Makefile (because I'm doing other make stuff anyway to get data prepped), and I end up putting all those CLI options in something like make run. But it would be way easier to just have all those typical options -- plugins, templates, metadata -- be defaults.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Option to automatically configure based on directory layout 605110015  
489105665 https://github.com/simonw/datasette/pull/434#issuecomment-489105665 https://api.github.com/repos/simonw/datasette/issues/434 MDEyOklzc3VlQ29tbWVudDQ4OTEwNTY2NQ== eyeseast 25778 2019-05-03T14:01:30Z 2019-05-03T14:01:30Z CONTRIBUTOR

This is exactly what I needed. Thank you.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
"datasette publish cloudrun" command to publish to Google Cloud Run 434321685  

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
, [performed_via_github_app] TEXT);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);
Powered by Datasette · Queries took 1.2ms · About: github-to-sqlite