github
id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at | closed_at | author_association | pull_request | body | repo | type | active_lock_reason | performed_via_github_app | reactions | draft | state_reason |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1084193403 | PR_kwDOBm6k_c4wDKmb | 1574 | introduce new option for datasette package to use a slim base image | 33631 | closed | 0 | 6 | 2021-12-19T21:18:19Z | 2022-08-15T08:49:31Z | 2022-08-15T08:49:31Z | NONE | simonw/datasette/pulls/1574 | The official python images on docker hub come with a slim variant that is significantly smaller than the default. The diff does not change the default, but allows to switch to the `slim` variant with commandline switch (`--slim-base-image`) Size comparison: ``` $ datasette package some.db -t fat --install "datasette-basemap datasette-cluster-map" $ datasette package some.db -t slim --slim-base-image --install "datasette-basemap datasette-cluster-map" $ docker images REPOSITORY TAG IMAGE ID CREATED SIZE fat latest 807b393ace0d 9 seconds ago 978MB slim latest 31bc5e63505c 8 minutes ago 191MB ``` | 107914493 | pull | { "url": "https://api.github.com/repos/simonw/datasette/issues/1574/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
0 | |||||
395236066 | MDU6SXNzdWUzOTUyMzYwNjY= | 393 | CSV export in "Advanced export" pane doesn't respect query | 1727065 | closed | 0 | 6 | 2019-01-02T12:39:41Z | 2021-06-17T18:14:24Z | 2019-01-03T02:44:10Z | NONE | It looks like there's an inconsistency when exporting to CSV via the the web interface. Say I'm looking at [songs released in 1989](https://fivethirtyeight.datasettes.com/fivethirtyeight-c300360/classic-rock%2Fclassic-rock-song-list?Release+Year__exact=1989) in the `classic-rock/classic-rock-song-list` table from the Five Thirty Eight data. The JSON and CSV export links at the top of the page both give me filtered data using `Release+Year__exact=1989` in the URL. In the `Advanced export` tab, though, the CSV option gives me the whole data set, while the JSON options preserve the query. It may be that this is intended behaviour related to the streaming CSV stuff [discussed here](https://github.com/simonw/datasette/issues/266), but if that's the case then I think it should be a little clearer. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/393/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
826613352 | MDExOlB1bGxSZXF1ZXN0NTg4NjAxNjI3 | 1254 | Update Docker Spatialite version to 5.0.1 + add support for Spatialite topology functions | 3200608 | closed | 0 | 6 | 2021-03-09T20:49:08Z | 2021-03-10T18:27:45Z | 2021-03-09T22:04:23Z | NONE | simonw/datasette/pulls/1254 | This requires adding the RT Topology library (Spatialite changed to RT Topology from LWGEOM between 4.4 and 5.0), as well as upgrading the GEOS version (which is the reason for switching to `python:3.7.10-slim-buster` as the base image.) `autoconf` and `libtool` are added to build RT Topology, and Spatialite is now built with `--disable-minizip` (minizip wasn't an option in 4.4 and I didn't want to add another dependency) and `--disable-dependency-tracking` which, according to Spatialite, "speeds up one-time builds" | 107914493 | pull | { "url": "https://api.github.com/repos/simonw/datasette/issues/1254/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
0 | |||||
573583971 | MDU6SXNzdWU1NzM1ODM5NzE= | 689 | "Templates considered" comment broken in >=0.35 | 35075 | closed | 0 | 6 | 2020-03-01T17:31:21Z | 2020-04-05T19:39:44Z | 2020-04-05T19:39:44Z | NONE | Noticed that the "Templates Considered" comment is missing in 0.37. Believe I traced it back to #664 as you can see it in https://v0-34.datasette.io/ but not https://v0-35.datasette.io/. Looking at the template context debug between the two you can see what is missing from 0.35 vs. 0.34: ```diff < "datasette_version": "0.34", < "app_css_hash": "ffa51a", < "select_templates": [ < "*index.html" < ], < "zip": "<class 'zip'>", < "body_scripts": [], < "extra_css_urls": "<generator object BaseView._asset_urls at 0x7f6529ac05f0>", < "extra_js_urls": "<generator object BaseView._asset_urls at 0x7f6529ac0660>", < "format_bytes": "<function format_bytes at 0x7f652a1588b0>", < "database_url": "<bound method BaseView.database_url of <datasette.views.index.IndexView object at 0x7f6529b03e50>>", < "database_color": "<bound method BaseView.database_color of <datasette.views.index.IndexView object at 0x7f6529b03e50>>" --- > "datasette_version": "0.35", > "database_url": "<bound method BaseView.database_url of <datasette.views.index.IndexView object at 0x7f6140dacd90>>", > "database_color": "<bound method BaseView.database_color of <datasette.views.index.IndexView object at 0x7f6140dacd90>>" ``` | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/689/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
512996469 | MDU6SXNzdWU1MTI5OTY0Njk= | 607 | Ways to improve fuzzy search speed on larger data sets? | 8431341 | closed | 0 | 6 | 2019-10-27T17:31:37Z | 2019-11-07T03:38:10Z | 2019-11-07T03:38:10Z | NONE | I have an sqlite table with 16 million rows in it. Having read @simonw article "[Fast Autocomplete Search for Your Website](https://24ways.org/2018/fast-autocomplete-search-for-your-website/)" I was curious to try datasette to see what kind of query performance I could get out of it. In truth I don't need to do full text search since all I would like to do is give my users a way to search for the names of investors such as "Warren Buffet", or "Tim Cook" (who's names are in a single column). On the first search, Datasette takes over 20 seconds to return all records associated with `elon musk`: > ![image](https://user-images.githubusercontent.com/8431341/67638889-a86e1100-f8b7-11e9-9f7e-a9d13a42e988.png) > ![image](https://user-images.githubusercontent.com/8431341/67638825-ed457800-f8b6-11e9-94d1-b44f1a40ee8c.png) If I rerun the same search, it then takes almost 9 seconds: > ![image](https://user-images.githubusercontent.com/8431341/67638908-e4a17180-f8b7-11e9-9d00-748c80ef1f21.png) That's far to slow to implement an autocomplete feature. I could reduce the latency by making a special table of only unique investor names, thereby reducing the search space to less than a million rows (then I'd need to implement a way to add only new investor names to the table as I received new data.. about 4,000 rows a day). If I did that, I'm still concerned the new table wouldn't be lean enough to lookup investor names quickly. Plus, even if I can implement the autocomplete feature, I would still finally have to lookup records for that investors which would take between 8 - 20 seconds. Are there any tricks for speeding this up? Here's my hardware: > ![image](https://user-images.githubusercontent.com/8431341/67638861-55945980-f8b7-11e9-96a8-ca76c7c68c5d.png) | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/607/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed |