github
id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at | closed_at | author_association | pull_request | body | repo | type | active_lock_reason | performed_via_github_app | reactions | draft | state_reason |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1907281675 | I_kwDOCGYnMM5xrs8L | 595 | Cascading DELETE not working with Table.delete(pk) | 123451970 | closed | 0 | 1 | 2023-09-21T15:46:41Z | 2023-09-25T09:38:57Z | 2023-09-25T09:38:13Z | NONE | Hi ! I noticed that when I am trying to use the delete method of the Table object, the record get properly deleted from the table, but the cascading delete triggers on foreign keys do not activate. `self.db["contact"].delete(contact_id)` I tried querying the database directly via DB Browser and the triggers work without any issue. Looked up the source code and behind the scene this method is just querying the database normally so I'm not exactly sure where this behavior comes from. Thank you in advance for your time ! | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/595/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
1366423176 | I_kwDOCGYnMM5RcfaI | 485 | Progressbar not shown when inserting/upserting jsonlines file | 99098079 | closed | 0 | 1 | 2022-09-08T14:13:18Z | 2022-09-15T20:39:52Z | 2022-09-15T20:37:52Z | CONTRIBUTOR | When inserting or upserting a jsonlines file, no progressbar is shown. Expected behavior is that, just like with .csv/.tsv files, also for a jsonlines file (--nl), unless --silent is provided, a progressbar is shown. ```bash sql-utils upsert mydb.db posts posts.jl --nl --pk post_id (silence) ``` Currently `file_progress` is only called within the tsv/csv logic, however I think it can be safely wrapped around all the all the input formats that use `decoded`: https://github.com/simonw/sqlite-utils/blob/main/sqlite_utils/cli.py#L963 | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/485/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
1620254998 | I_kwDOCGYnMM5gkyEW | 532 | Show more information when JSON can't be imported with sqlite-utils insert | 83080728 | closed | 0 | 2 | 2023-03-12T06:41:44Z | 2023-05-08T20:32:16Z | 2023-05-08T20:32:02Z | NONE | I am currently trying to import the [JSON export of my data from Discord](https://support.discord.com/hc/en-us/articles/360004027692-Requesting-a-Copy-of-your-Data), specifically `activity/reporting/events-*.json` ``` sqlite-utils.exe insert test.db reporting events-2023-00000-of-00001.json [###################################-] 99% 00:00:00 Error: Invalid JSON - use --csv for CSV or --tsv for TSV files ``` Please show more information as to *why* this is invalid, if possible. I am using version 3.30 with Python 3.10 on Windows 11. | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/532/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
1579695809 | I_kwDOBm6k_c5eKD7B | 2023 | Error: Invalid setting 'hash_urls' in settings.json in 0.64.1 | 80409402 | closed | 0 | 2 | 2023-02-10T13:35:01Z | 2023-02-10T15:40:00Z | 2023-02-10T15:39:59Z | NONE | On a Debian machine, using datasette 0.64.1 installed with `pip3`, I am getting a `datasette[114272]: Error: Invalid setting 'hash_urls' in settings.json` in `journalctl -xe`. The same settings work on 0.54.1 on another Debian server. This is my `settings.json`: ```json { "default_page_size": 200, "max_returned_rows": 8000, "num_sql_threads": 3, "sql_time_limit_ms": 1000, "default_facet_size": 30, "facet_time_limit_ms": 200, "facet_suggest_time_limit_ms": 50, "hash_urls": false, "allow_facet": true, "allow_download": true, "suggest_facets": true, "default_cache_ttl": 5, "default_cache_ttl_hashed": 31536000, "cache_size_kb": 0, "allow_csv_stream": true, "max_csv_mb": 100, "truncate_cells_html": 2048, "force_https_urls": false, "template_debug": false, "base_url": "/pclim/db/" } ``` This looks ok to me. Would you have any ideas? | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/2023/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
994390593 | MDU6SXNzdWU5OTQzOTA1OTM= | 1468 | Faceting for custom SQL queries | 72577720 | closed | 0 | 2 | 2021-09-13T02:52:16Z | 2021-09-13T04:54:22Z | 2021-09-13T04:54:17Z | CONTRIBUTOR | Facets are awesome. But not when I need to join to tidy tables together. Or even just running explicitly the default SQL query that simply lists all the rows and columns of a table (up to SIZE). That is to say, when I browse a table, I see facets: https://latest.datasette.io/fixtures/compound_three_primary_keys But when I run a custom query, I don't: https://latest.datasette.io/fixtures?sql=select+pk1%2C+pk2%2C+pk3%2C+content+from+compound_three_primary_keys+order+by+pk1%2C+pk2%2C+pk3+limit+101 Is there an idiom to cause custom SQL to come back with facet suggestions? | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1468/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
1388631785 | I_kwDOBm6k_c5SxNbp | 1826 | render_cell documentation example doesn't match the method signature | 66709385 | closed | 0 | 3 | 2022-09-28T02:37:59Z | 2022-09-28T04:30:28Z | 2022-09-28T04:05:16Z | NONE | Open Datasette stable doc at https://docs.datasette.io/en/stable/plugin_hooks.html?highlight=render_cell#render-cell-row-value-column-table-database-datasette render_cell plugin hook method signature is `render_cell(row, value, column, table, database, datasette)`, the example shown inline uses `render_cell(value)`. ![image](https://user-images.githubusercontent.com/66709385/192674691-34265b81-6cdd-41d2-8424-aa12f8bc8c94.png) | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1826/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
976399638 | MDU6SXNzdWU5NzYzOTk2Mzg= | 319 | [Enhancement] Please allow 'insert-files' to insert content as text. | 66709385 | closed | 0 | 10 | 2021-08-22T15:10:46Z | 2021-08-24T23:33:45Z | 2021-08-24T23:33:44Z | NONE | 'insert-files' creates BLOB columns for file contents. Transforming the column to TEXT still keep the content as binary. Even though I'm sure there is a transform that can be applied decoding the text it would be great to have a argument to make 'insert-files' to do it as text (with optional text encoding). The use case is a bunch of htmls (single file) on a directory structure that inserted with this command could be served in Datasette allowing full text search. | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/319/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
568091133 | MDU6SXNzdWU1NjgwOTExMzM= | 676 | ?_searchmode=raw option for running FTS searches without escaping characters | 58088336 | closed | 0 | 9 | 2020-02-20T06:56:57Z | 2020-02-25T05:57:24Z | 2020-02-25T05:56:04Z | NONE | After the version 0.34. I am not able to use the wildchar in the _search option( or the full text search). It will not return any result unless I specify the whole word for text search. If I use 'match :search || "*" ' in the sql statement then it will work as expected. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/676/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
551834842 | MDU6SXNzdWU1NTE4MzQ4NDI= | 659 | README information is obscured by feature history | 55480210 | closed | 0 | 1 | 2020-01-18T22:34:51Z | 2020-12-10T23:28:51Z | 2020-12-10T23:28:51Z | NONE | While it's sometimes valuable to know how a project has developed, there is usually little justification for including this information in the README, and certainly not immediately after other key information such as "what does this package do, and who might want to use it?" Might I recommend that the feature history is migrated to an Appendix in the documentation? | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/659/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
1026794056 | I_kwDOCGYnMM49M6JI | 331 | Mypy error: found module but no type hints or library stubs | 53032010 | closed | 0 | 2 | 2021-10-14T20:29:50Z | 2021-11-14T23:21:08Z | 2021-11-14T23:21:08Z | NONE | ``` Python 3.9.5 mypy 0.910 sqlite-utils 3.17.1 ``` While using sqlite-utils as a library, when I use mypy for static type checking, it throws an error: ``` mypy . src/etl.py:5: error: Skipping analyzing "sqlite_utils": found module but no type hints or library stubs import sqlite_utils ^ src/etl.py:5: note: See https://mypy.readthedocs.io/en/stable/running_mypy.html#missing-imports test/test_etl.py:4: error: Skipping analyzing "sqlite_utils": found module but no type hints or library stubs import sqlite_utils ^ Found 2 errors in 2 files (checked 7 source files) ``` When I add a `py.typed` file to the sqlite-utils package to mark it as PEP 561 compatible, the error goes away. ``` al@nbal ..b/python3.9/site-packages/sqlite_utils (git)-[main] % la total 200 drwx------ 3 al al 4096 Oct 14 22:00 . drwx------ 117 al al 4096 Oct 12 21:12 .. -rw------- 1 al al 64409 Oct 12 21:11 cli.py -rw------- 1 al al 109092 Oct 12 21:11 db.py -rw------- 1 al al 0 Oct 14 22:00 py.typed -rw------- 1 al al 684 Oct 12 21:11 recipes.py -rw------- 1 al al 7988 Oct 12 21:11 utils.py -rw------- 1 al al 113 Oct 12 21:11 __init__.py ``` I would like to suggest adding a `py.typed` file to the repository. See also the mypy docs on creating PEP 561 compatible packages: https://mypy.readthedocs.io/en/stable/installed_packages.html#creating-pep-561-compatible-packages | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/331/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
1076057610 | I_kwDOBm6k_c5AI1YK | 1546 | validating the sql | 50336793 | closed | 0 | 1 | 2021-12-09T21:35:57Z | 2021-12-18T02:05:17Z | 2021-12-18T02:05:16Z | NONE | Could someone tell me that part of the code is responsible for validating the sql that guarantees that only a table can be read | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1546/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
1353441389 | I_kwDOCGYnMM5Qq-Bt | 477 | Conda Forge | 49702524 | closed | 0 | 2 | 2022-08-28T19:03:08Z | 2022-09-07T03:46:55Z | 2022-09-07T03:46:55Z | NONE | Hello! I have successfully put this package on to Conda Forge, and I have extending the invitation for the owner/maintainers of this package to be maintainers on Conda Forge as well. Let me know if you are interested! Thanks. https://github.com/conda-forge/sqlite-utils-feedstock | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/477/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
518506242 | MDU6SXNzdWU1MTg1MDYyNDI= | 616 | Datasette FTS detection bug | 49656826 | closed | 0 | 2 | 2019-11-06T14:25:47Z | 2019-11-08T15:31:33Z | 2019-11-08T02:06:56Z | NONE | I'm having a trouble with datasette. I deployed EXACTLY the same project on two different apps on Heroku. Both have databases (not all) with FTS activated but only one detects and works fine. You can take a look here: With search: http://teste-templates.herokuapp.com/amazonia_protege/car Without search: http://bases.vortex.media/amazonia_protege/car ![teste](https://user-images.githubusercontent.com/49656826/68306310-11a80e00-0088-11ea-8d1c-db3bd3375518.jpg) | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/616/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
743297582 | MDU6SXNzdWU3NDMyOTc1ODI= | 7 | evernote-to-sqlite on windows 10 give this error: TypeError: insert() got an unexpected keyword argument 'replace' | 42387931 | closed | 0 | 1 | 2020-11-15T16:57:28Z | 2021-02-11T22:13:17Z | 2021-02-11T22:13:17Z | NONE | running evernote-to-sqlite 0.2 on windows 10. Command: evernote-to-sqlite enex evernote.db MyNotes.enex I get the followinng error: File "C:\Users\marti\AppData\Roaming\Python\Python38\site-packages\evernote_to_sqlite\utils.py", line 46, in save_note note_id = db["notes"].insert(row, hash_id="id", replace=True, alter=True).last_pk TypeError: insert() got an unexpected keyword argument 'replace' Removing replace=True, Leads to below error: note_id = db["notes"].insert(row, hash_id="id", alter=True).last_pk File "C:\Users\marti\AppData\Roaming\Python\Python38\site-packages\sqlite_utils\db.py", line 924, in insert return self.insert_all( File "C:\Users\marti\AppData\Roaming\Python\Python38\site-packages\sqlite_utils\db.py", line 1046, in insert_all result = self.db.conn.execute(sql, values) sqlite3.IntegrityError: UNIQUE constraint failed: notes.id | 303218369 | issue | { "url": "https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/7/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
796736607 | MDU6SXNzdWU3OTY3MzY2MDc= | 56 | Not all quoted statuses get fetched? | 42315895 | closed | 0 | 3 | 2021-01-29T09:48:44Z | 2021-02-03T10:36:36Z | 2021-02-03T10:36:36Z | NONE | ![image](https://user-images.githubusercontent.com/42315895/106259325-5f75dc80-621f-11eb-8311-db8f2fe2a257.png) In my database I have 13300 quote tweets, but eta 3600 have `quoted_status` empty. I fetched some of them using `https://api.twitter.com/1.1/statuses/show.json?id=xx` and they did have ids of quoted tweets. | 206156866 | issue | { "url": "https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/56/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
1617823309 | I_kwDOJHON9s5gbgZN | 8 | Increase performance using macnotesapp | 41546558 | closed | 0 | 1 | 2023-03-09T18:51:05Z | 2023-03-14T22:00:22Z | 2023-03-14T22:00:21Z | NONE | Neat project! You can probably increase performance using my python interface to Notes, [macnotesapp](https://github.com/RhetTbull/macnotesapp), which uses Scripting Bridge and bulk queries for much better performance than AppleScript. Another related project is [PyXA](https://github.com/SKaplanOfficial/PyXA) which uses Scripting Bridge to access Notes (and many other apps) and can return all the notes at once as opposed to calling AppleScript for each note. macnotesapp allows you to access multiple accounts and folders as well. ```python from macnotesapp import NotesApp # NotesApp() provides interface to Notes.app notesapp = NotesApp() # Get list of notes (Note objects for each note) notes = notesapp.notes() note = notes[0] print( note.id, note.account, note.folder, note.name, note.body, note.plaintext, note.password_protected, ) print(note.asdict()) ``` | 611552758 | issue | { "url": "https://api.github.com/repos/dogsheep/apple-notes-to-sqlite/issues/8/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
1023243105 | I_kwDOBm6k_c48_XNh | 1486 | pipx installation instructions for plugins don't reference pipx inject | 41546558 | closed | 0 | 0 | 2021-10-12T00:43:42Z | 2021-10-13T21:09:11Z | 2021-10-13T21:09:11Z | CONTRIBUTOR | The datasette [installation instructions](https://github.com/simonw/datasette/blob/main/docs/installation.rst) discuss how to install with pipx, how to upgrade with pipx, and how to upgrade plugins with pipx but do not mention how to install a plugin with pipx. You discussed this on your [blog](https://til.simonwillison.net/python/installing-upgrading-plugins-with-pipx) but looks like this didn't make it in when you updated the docs for pipx (#756). I'll submit a PR shortly to fix this. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1486/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
1976986318 | I_kwDOCGYnMM511mrO | 599 | Cannot find spatialite on arm64 linux | 37802088 | closed | 0 | 1 | 2023-11-03T22:05:51Z | 2023-11-04T01:06:31Z | 2023-11-04T00:33:28Z | CONTRIBUTOR | Initially, I found an issue in `datasette` where it wouldn’t find `spatialite` when running on my Radxa Rock 5B - an RK3588 powered SBC, running the arm64 build of Debian Bullseye. I confirmed the same behaviour on my Raspberry Pi 4 - a BCM2711 powered SBC, running the arm64 build of Debian Bookworm. ``` $ datasette --load-extension=spatialite example.db Error: Could not find SpatiaLite extension ``` I did some digging and realised the issue originates in this project. Even with the `libsqlite3-mod-spatialite` package installed, `pytest` skips all of the GIS tests in the project. ``` $ apt list --installed | grep spatial […] libsqlite3-mod-spatialite/stable,now 5.0.1-3 arm64 [installed] $ ls -l /usr/lib/*/*spatial* lrwxrwxrwx 1 root root 23 Dec 1 2022 /usr/lib/aarch64-linux-gnu/mod_spatialite.so -> mod_spatialite.so.7.1.0 lrwxrwxrwx 1 root root 23 Dec 1 2022 /usr/lib/aarch64-linux-gnu/mod_spatialite.so.7 -> mod_spatialite.so.7.1.0 -rw-r--r-- 1 root root 7348584 Dec 1 2022 /usr/lib/aarch64-linux-gnu/mod_spatialite.so.7.1.0 ``` ``` $ pytest tests/test_get.py ...... [ 73%] tests/test_gis.py ssssssssssss [ 75%] tests/test_hypothesis.py .... [ 75%] ``` I tracked the issue down to the [`find_sqlite()` function in the `utils.py`](https://github.com/simonw/sqlite-utils/blob/622c3a5a7dd53a09c029e2af40c2643fe7579340/sqlite_utils/utils.py#L60) file. The [`SPATIALITE_PATHS`](https://github.com/simonw/sqlite-utils/blob/main/sqlite_utils/utils.py#L34-L39) array doesn’t have an entry for the location of this module on arm64 linux. | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/599/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
1199158210 | I_kwDOCGYnMM5HebPC | 423 | .extract() doesn't set foreign key when extracted columns contain NULL value | 37447552 | closed | 0 | 1 | 2022-04-10T20:05:30Z | 2022-08-27T14:45:04Z | 2022-08-27T14:45:04Z | NONE | I've run into an issue with `extract` and I don't believe this is the intended behaviour. I'm working with a database with music listening information. Currently it has one large table `listens` that contains all information. I'm trying to normalize the database by extracting relevant columns to separate tables (`artists`, `tracks`, `albums`). Not every track has an album. A simplified demonstration with just `track_title` and `album_title` columns: ```ipython In [1]: import sqlite_utils In [2]: db = sqlite_utils.Database(memory=True) In [3]: db["listens"].insert_all([ ...: {"id": 1, "track_title": "foo", "album_title": "bar"}, ...: {"id": 2, "track_title": "baz", "album_title": None} ...: ], pk="id") Out[3]: <Table listens (id, track_title, album_title)> ``` The track in the first row has an album, the second track doesn't. Now I extract album information into a separate column: ```ipython In [4]: db["listens"].extract(columns=["album_title"], table="albums", fk_column="album_id") Out[4]: <Table listens (id, track_title, album_id)> In [5]: list(db["albums"].rows) Out[5]: [{'id': 1, 'album_title': 'bar'}, {'id': 2, 'album_title': None}] In [6]: list(db["listens"].rows) Out[6]: [{'id': 1, 'track_title': 'foo', 'album_id': 1}, {'id': 2, 'track_title': 'baz', 'album_id': None}] ``` This behaves as expected -- the `album` table contains entries for both the existing album and the NULL album. The `listens` table has a foreign key only for the first row (since the album in the second row was empty). Now I want to extract the track information as well. Album information belongs to the track so I want to extract both columns to a new table. ```ipython In [7]: db["listens"].extract(columns=["track_title", "album_id"], table="tracks", fk_column="track_id") Out[7]: <Table listens (id, track_id)> In [8]: list(db["tracks"].rows) Out[8]: [{'id': 1, 'track_title': 'foo', 'album_id': 1}, {'id': 2, 'track_title': 'baz', 'album_id': None}] In [9]: list(db["… | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/423/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
418329842 | MDU6SXNzdWU0MTgzMjk4NDI= | 415 | Add query parameter to hide SQL textarea | 36796532 | closed | 0 | 3 | 2019-03-07T14:11:30Z | 2019-03-15T09:30:57Z | 2019-03-15T05:22:43Z | NONE | It would be cool if there was a query parameter to hide / remove the SQL textarea. Then I could simply save a bookmark for a certain query and open it to see the data without having to scroll below the (long) SQL query first. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/415/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
1871935751 | I_kwDOD079W85vk3kH | 40 | ImportError: cannot import name 'formatargspec' from 'inspect' | 36752421 | closed | 0 | 0 | 2023-08-29T15:36:31Z | 2023-08-31T03:18:07Z | 2023-08-31T03:18:06Z | NONE | I get the following error when running "pip3 install dogsheep-photos" " from inspect import ismethod, isclass, formatargspec ImportError: cannot import name 'formatargspec' from 'inspect' (/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/inspect.py). Did you mean: 'formatargvalues'?" Python 3.12.0rc1 sqlite 3.43.0 datasette, version 0.64.3 | 256834907 | issue | { "url": "https://api.github.com/repos/dogsheep/dogsheep-photos/issues/40/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
1838266862 | I_kwDOBm6k_c5tkbnu | 2126 | Permissions in metadata.yml / metadata.json | 36199671 | closed | 0 | 3 | 2023-08-06T16:24:10Z | 2023-08-11T05:52:30Z | 2023-08-11T05:52:29Z | NONE | https://docs.datasette.io/en/latest/authentication.html#other-permissions-in-metadata says the following: > For all other permissions, you can use one or more "permissions" blocks in your metadata. > To grant access to the permissions debug tool to all signed in users you can grant permissions-debug to any actor with an id matching the wildcard * by adding this a the root of your metadata: ```yaml permissions: debug-menu: id: '*' ``` I tried this. My `metadata.yml` file looks like: ```yaml permissions: debug-menu: id: '*' permissions-debug: id: '*' plugins: datasette-auth-passwords: myuser_password_hash: $env: "PASSWORD_HASH_MYUSER" ``` And then I run ```zsh datasette -m metadata.yml tiddlywiki.db --root ``` And I open a session for the "root" user of datasette with the link given. I open a private browser session and log in as "myuser" from http://127.0.0.1:8001/-/login Then I check http://127.0.0.1:8001/-/actor which confirms that I am logged in as the "myuser" actor ```json { "actor": { "id": "myuser" } } ``` In the session where I am logged in as "myuser" I then try to go to http://127.0.0.1:8001/-/permissions But all I get there as the logged in user "myuser" is > Forbidden > > Permission denied And then if I check the http://127.0.0.1:8001/-/permissions as the datasette "root" user from another browser session, I see: > permissions-debug checked at 2023-08-06T16:22:58.997841 ✗ (used default) > > Actor: {"id": "myuser"} It seems that in spite of having tried to give the `permissions-debug` permission to the "myuser" user in my `metadata.yml` file, datasette does not agree that "myuser" has permission `permissions-debug`.. What do I need to do differently so that my "myuser" user is able to access http://127.0.0.1:8001/-/permissions ? | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/2126/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
343728754 | MDU6SXNzdWUzNDM3Mjg3NTQ= | 346 | Logo design for DATASETTE | 35750428 | closed | 0 | 0 | 2018-07-23T17:40:17Z | 2018-08-02T02:31:59Z | 2018-08-02T02:31:59Z | NONE | Hello :) , I'm a graphic designer, I'm interested in collaborating with open source projects, besides this helps me expand my portfolio. I would like to design a logo for your project. I will be happy to collaborate with you :). | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/346/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
1496652622 | I_kwDOBm6k_c5ZNRtO | 1955 | invoke_startup() is not run in some conditions, e.g. gunicorn/uvicorn workers, breaking lots of things | 32839123 | closed | 0 | 36 | 2022-12-14T13:39:56Z | 2022-12-19T04:34:16Z | 2022-12-18T02:45:18Z | NONE | In the past (pre-september 14, #1809) I had a running deployment of Datasette on Azure WebApps by emulating the call in cli.py to Gunicorn: `gunicorn -w 2 -k uvicorn.workers.UvicornWorker app:app`. My most recent deployment, however, fails loudly by shouting that `Datasette.invoke_startup()` was not called. It does not seem to be possible to call `invoke_startup` when running using a uvicorn command directly like this (I've reproduced this locally using `uvicorn`). Two candidates that I have tried: * Uvicorn has a `--factory` option, but the app factory has to be synchronous, so no `await invoke_startup` there * `asyncio.get_event_loop().run_until_complete` is also not an option because `uvicorn` already has the event loop running. One additional option is: * Use Gunicorn's [server hooks](https://docs.gunicorn.org/en/stable/settings.html#server-hooks) to call `invoke_startup`. These are also synchronous, but I might be able to get ahead of the event loop starting here. In my current deployment setup, it does not appear to be possible to use `datasette serve` directly, so I'm stuck either * Trying to rework my complete deployment setup, for instance, using Azure functions as described [here](https://github.com/simonw/azure-functions-datasette)) * Or dig into the ASGI spec and write a wrapper for the sole purpose of launching Datasette using a direct Uvicorn invocation. Questions for the maintainers: * Is this intended behaviour/will not support/etc.? If so, I'd be happy to add a PR with a couple lines in the documentation. * if this is not intended behaviour, what is a good way to fix it? I could have a go at the ASGI spec thing (I think the Azure Functions thing is related) and provide a PR with the wrapper here, but I'm all ears! Almost forgot, minimal reproducer: ```python from datasette import Datasette ds = Datasette(files=['./global-power-plants.db'])] app = ds.app() ``` Save as app.py in the same folder as global-power-plants.db, and then try running `uvicorn app:app`. O… | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1955/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
1170497629 | I_kwDOBm6k_c5FxGBd | 1662 | [feature request] Publish to fully static website | 32609395 | closed | 0 | 1 | 2022-03-16T03:32:28Z | 2022-03-19T00:42:23Z | 2022-03-19T00:42:23Z | NONE | It seems currently all datasette publish requires a real backend server which is able to query the database and send results back to the frontend. There are a few projects to on-demand download a portion of data from the database from a sqlite lite database url, and present it directly to the user. These methods leverages web assembly under the hood. I think datasette is a perfect use case for this technology. Below are a few examples of querying sqlite database from frontend directly. * [Using sqlite3 as a notekeeping document graph with automatic reference indexing](https://epilys.github.io/bibliothecula/notekeeping.html) * [Hosting SQLite databases on Github Pages - (or any static file hoster) - phiresky's blog](https://phiresky.github.io/blog/2021/hosting-sqlite-databases-on-github-pages/) * [Static torrent website with peer-to-peer queries over BitTorrent on 2M records](https://boredcaveman.xyz/post/0x2_static-torrent-website-p2p-queries.html) | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1662/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
610517472 | MDU6SXNzdWU2MTA1MTc0NzI= | 103 | sqlite3.OperationalError: too many SQL variables in insert_all when using rows with varying numbers of columns | 32605365 | closed | 0 | 8 | 2020-05-01T02:26:14Z | 2020-05-14T00:18:57Z | 2020-05-14T00:18:57Z | CONTRIBUTOR | If using insert_all to put in 1000 rows of data with varying number of columns, it comes up with this message `sqlite3.OperationalError: too many SQL variables` if the number of columns is larger in later records (past the first row) I've reduced `SQLITE_MAX_VARS` by 100 to 899 at the top of `db.py` to add wiggle room, so that if the column count increases it wont go past SQLite's batch limit as calculated by this line of code based on the count of the first row's dict keys batch_size = max(1, min(batch_size, SQLITE_MAX_VARS // num_columns)) | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/103/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
944326512 | MDU6SXNzdWU5NDQzMjY1MTI= | 296 | `table.search(..., quote=True)` parameter and `sqlite-utils search --quote` option | 32427188 | closed | 0 | 6 | 2021-07-14T11:26:47Z | 2021-08-18T20:13:12Z | 2021-08-18T20:10:48Z | NONE | Hi, Recently got this error: ``` Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/home/ethan/git/music-metadata-indexer/src/mmindexer/__init__.py", line 38, in <module> start("/home/ethan/git/music-metadata-indexer/sample", "/home/ethan/git/music-metadata-indexer/test.db") File "/home/ethan/git/music-metadata-indexer/src/mmindexer/__init__.py", line 23, in start scanner.build_database() File "/home/ethan/git/music-metadata-indexer/src/mmindexer/scan.py", line 79, in build_database _import_song(self.db, Path(dirpath).joinpath(f), self.logger) File "/home/ethan/git/music-metadata-indexer/src/mmindexer/scan.py", line 23, in _import_song db.add_song(filepath) File "/home/ethan/git/music-metadata-indexer/src/mmindexer/index.py", line 166, in add_song for match in self.search("albums", album): File "/home/ethan/git/music-metadata-indexer/env/lib/python3.9/site-packages/sqlite_utils/db.py", line 1625, in search cursor = self.db.execute( File "/home/ethan/git/music-metadata-indexer/env/lib/python3.9/site-packages/sqlite_utils/db.py", line 243, in execute return self.conn.execute(sql, parameters) sqlite3.OperationalError: fts5: syntax error near "." ``` So, the error seems to suggest there was a "." character somewhere in the SQL command that was causing the error. I did a little digging and found this in the docs: https://www.sqlite.org/fts5.html#fts5_strings. "." is one of the many prohibited characters. My solution was to just strip these out of the query using this line `query = query.translate({e: None for e in itertools.chain(range(0,26), range(27, 48), range(58,65), range(91,95), [96], range(123,128))})` Perhaps this could be included into the `table.search()` function? | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/296/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
1435917503 | I_kwDOBm6k_c5Vlly_ | 1883 | Errors when using table filters behind a proxy | 31312775 | closed | 0 | 13 | 2022-11-04T11:18:47Z | 2022-11-11T09:20:22Z | 2022-11-11T06:54:58Z | NONE | Using datasette==0.63 table filters do not respect the `base_url` setting as described [here](https://docs.datasette.io/en/stable/deploying.html#running-datasette-behind-a-proxy) To reproduce, go to: https://datasette-apache-proxy-demo.datasette.io/prefix/fixtures/binary_data Then use the table filter buttons. The `/prefix/` is dropped, resulting in URL not found: https://datasette-apache-proxy-demo.datasette.io/fixtures/binary_data?_sort=rowid&rowid__exact=1 | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1883/reactions", "total_count": 1, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 1 } |
completed | ||||||
708185405 | MDU6SXNzdWU3MDgxODU0MDU= | 975 | Dependabot couldn't authenticate with https://pypi.python.org/simple/ | 27856297 | closed | 0 | 0 | 2020-09-24T13:44:40Z | 2020-09-25T13:34:34Z | 2020-09-25T13:34:34Z | CONTRIBUTOR | Dependabot couldn't authenticate with https://pypi.python.org/simple/. You can provide authentication details in your [Dependabot dashboard](https://app.dependabot.com/accounts/simonw) by clicking into the account menu (in the top right) and selecting 'Config variables'. [View the update logs](https://app.dependabot.com/accounts/simonw/update-logs/48611311). | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/975/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
1180427792 | I_kwDOCGYnMM5GW-YQ | 421 | "Error: near "(": syntax error" when using sqlite-utils indexes CLI | 24938923 | closed | 0 | 8 | 2022-03-25T07:12:51Z | 2022-04-13T22:41:59Z | 2022-04-13T22:41:59Z | NONE | This bug relates to https://github.com/simonw/sqlite-utils/issues/408#issuecomment-1066139147 **New error when using CLI: "sqlite-utils indexes global.db --table"** ``` (app-root) sqlite-utils indexes global.db --table Error: near "(": syntax error (app-root) sqlite-utils --version sqlite-utils, version 3.25.1 (app-root) sqlite3 --version 3.36.0 2021-06-18 18:36:39 (app-root) python --version Python 3.8.11 ``` Dockerfile ``` FROM centos/python-38-centos7 USER root RUN yum update -y RUN yum upgrade -y # epel RUN yum -y install epel-release && yum clean all # SQLite RUN yum -y install zlib-devel geos geos-devel proj proj-devel freexl freexl-devel libxml2-devel WORKDIR /build/ COPY sqlite-autoconf-3360000.tar.gz ./ RUN tar -zxf sqlite-autoconf-3360000.tar.gz WORKDIR /build/sqlite-autoconf-3360000 RUN ./configure RUN make RUN make install # RUN /opt/app-root/bin/python3.8 -m pip install --upgrade pip RUN pip install sqlite-utils ``` | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/421/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
1145882578 | I_kwDOCGYnMM5ETMfS | 408 | `deterministic=True` fails on versions of SQLite prior to 3.8.3 | 24938923 | closed | 0 | 6 | 2022-02-21T14:36:43Z | 2022-03-13T16:54:09Z | 2022-03-02T00:38:11Z | NONE | Hi, love your work. I am unable to lookup indexes in a database using sqlite-utils: ` sqlite-utils indexes city_spec.db --table` or `sqlite-utils indexes city_spec.db MyTable ` **Software** sqlite-utils, version 3.24 sqlite3 --version: 3.36.0 **Output:** Traceback (most recent call last): File "/opt/app-root/bin/sqlite-utils", line 8, in <module> sys.exit(cli()) File "/opt/app-root/lib64/python3.8/site-packages/click/core.py", line 1128, in __call__ return self.main(*args, **kwargs) File "/opt/app-root/lib64/python3.8/site-packages/click/core.py", line 1053, in main rv = self.invoke(ctx) File "/opt/app-root/lib64/python3.8/site-packages/click/core.py", line 1659, in invoke return _process_result(sub_ctx.command.invoke(sub_ctx)) File "/opt/app-root/lib64/python3.8/site-packages/click/core.py", line 1395, in invoke return ctx.invoke(self.callback, **ctx.params) File "/opt/app-root/lib64/python3.8/site-packages/click/core.py", line 754, in invoke return __callback(*args, **kwargs) File "/opt/app-root/lib64/python3.8/site-packages/click/decorators.py", line 26, in new_func return f(get_current_context(), *args, **kwargs) File "/opt/app-root/lib64/python3.8/site-packages/sqlite_utils/cli.py", line 2123, in indexes ctx.invoke( File "/opt/app-root/lib64/python3.8/site-packages/click/core.py", line 754, in invoke return __callback(*args, **kwargs) File "/opt/app-root/lib64/python3.8/site-packages/sqlite_utils/cli.py", line 1624, in query db.register_fts4_bm25() File "/opt/app-root/lib64/python3.8/site-packages/sqlite_utils/db.py", line 403, in register_fts4_bm25 self.register_function(rank_bm25, deterministic=True) File "/opt/app-root/lib64/python3.8/site-packages/sqlite_utils/db.py", line 399, in register_function register(fn) File "/opt/app-root/lib64/python3.8/site-packages/sqlite_utils/db.py", line 392, in register self.conn.create_function(name, arity, fn, **kwargs) sqlite3.NotSupportedE… | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/408/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
767685961 | MDU6SXNzdWU3Njc2ODU5NjE= | 210 | Support of RData files | 23739126 | closed | 0 | 1 | 2020-12-15T15:04:14Z | 2021-01-02T00:02:40Z | 2021-01-02T00:02:40Z | NONE | Hi Simon, Would be great if you could ingest RData files! I could do this in a few lines of code but I am too lazy - sorry! Peter | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/210/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
925406964 | MDU6SXNzdWU5MjU0MDY5NjQ= | 1382 | Datasette with Glitch - is it possible to use CSV with ISO-8859-1 encoding? | 23701514 | closed | 0 | 1 | 2021-06-19T14:37:20Z | 2021-06-20T00:21:02Z | 2021-06-20T00:20:06Z | NONE | Hi Please, I used Remix on Glitch to create a project on Glitch and uploaded a CSV But it's a CSV with ISO-8859-1 encoding (https://en.wikipedia.org/wiki/ISO/IEC_8859-1) Is it possible for me to change the encoding to correctly visualize the data? Example: https://emphasized-carpal-pillow.glitch.me/data/Emendas Best | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1382/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
1028056713 | I_kwDOCGYnMM49RuaJ | 332 | `sqlite-utils memory --flatten` option to flatten nested JSON | 22523840 | closed | 0 | 1 | 2021-10-16T14:04:42Z | 2021-11-14T23:05:05Z | 2021-11-14T23:05:05Z | NONE | currently --flatten option works only for `insert` command, it would be cool if it worked for `memory` as well to query nested json | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/332/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
1572766460 | I_kwDOCGYnMM5dvoL8 | 524 | Transformation type `--type DATETIME` | 21095447 | closed | 0 | 15 | 2023-02-06T15:18:42Z | 2023-02-15T12:10:54Z | 2023-02-15T12:10:54Z | NONE | Hey. Currently i do transformation with the type `--type TEXT`, but i noticed using the sqlalchemy based library [dataset](https://github.com/pudo/dataset) that is reading and writing differ depending on the column types `TEXT`, `DATETIME`. Is it possible to alter a column type to `DATETIME` somehow using Sqlite-Utils? | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/524/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
892457208 | MDU6SXNzdWU4OTI0NTcyMDg= | 1327 | Support Unicode characters in metadata.json | 20846286 | closed | 0 | 2 | 2021-05-15T14:33:58Z | 2021-05-24T19:10:21Z | 2021-05-24T19:10:21Z | NONE | Hello , when I used Burmese (Unicode) characters in metadata.json like below - ![image](https://user-images.githubusercontent.com/20846286/118364978-cba70100-b5c0-11eb-967c-7dc3b62478f2.png) It gave wrong results when I run datasette - ![image](https://user-images.githubusercontent.com/20846286/118365025-fc873600-b5c0-11eb-97ce-19541b8cc6d8.png) It would be great & helpful for us if metadata.json can support in Unicode supported Asian Languages. Thanks & Regards. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1327/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
995098231 | MDU6SXNzdWU5OTUwOTgyMzE= | 1470 | ?_sort=rowid with _next= returns error | 19851673 | closed | 0 | 4 | 2021-09-13T16:36:15Z | 2021-10-18T19:30:15Z | 2021-10-10T01:15:03Z | NONE | For example: - Go to https://cryptics.eigenfoo.xyz/clues/clues?_next=100 (this is the second page of results in a Datasette site) - Search anything using the FTS search bar. For example, searching for `hello` will take you to https://cryptics.eigenfoo.xyz/clues/clues?_search=hello&_sort=rowid&_next=100 - A `500 Error: list index out of range` is raised. This is because the search URL includes the `&_next=100` UTM parameter, carried over from where the FTS search was run. However, there isn't a second page in the search results, so a `list index out of range` error is raised. You can confirm that removing this UTM parameter from the URL returns the appropriate search results. The FTS search request should strip any `_next` UTM parameter. --- ```bash datasette, version 0.58.1 sqlite-utils, version 3.17 ``` | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1470/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
1594383280 | I_kwDOBm6k_c5fCFuw | 2030 | How to use Datasette with apache webserver on GCP? | 19700859 | closed | 0 | 2 | 2023-02-22T03:08:49Z | 2023-02-22T21:54:39Z | 2023-02-22T21:54:39Z | NONE | Hi Simon and Datasette team- I have installed apache2 webserver inside GCP VM using apt. I can see my "Hello World" index.html if I use the external IP of this GCP in a browser. However, when I try to run datasette with different combinations of -h and -p, I am still unable to access the webpage. I cannot invest Docker on this VM. Any pointers to use datasette with already existing apache2 webserver on GCP is appreciated. Thanks. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/2030/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
963897111 | MDU6SXNzdWU5NjM4OTcxMTE= | 309 | sqlite-utils insert errors should show SQL and parameters, if possible | 16622642 | closed | 0 | 6 | 2021-08-09T11:24:14Z | 2021-08-09T23:40:29Z | 2021-08-09T22:25:58Z | NONE | I've tried several approaches, but this is the current one: ```sh echo $json-line | sqlite-utils insert json.db jsontable --truncate --alter --detect-types - ``` In all cases, I get this error: ```sh OverflowError: Python int too large to convert to SQLite INTEGER Traceback (most recent call last): File "/home/sean/.local/bin/sqlite-utils", line 8, in <module> sys.exit(cli()) File "/usr/lib/python3/dist-packages/click/core.py", line 764, in __call__ return self.main(*args, **kwargs) File "/usr/lib/python3/dist-packages/click/core.py", line 717, in main rv = self.invoke(ctx) File "/usr/lib/python3/dist-packages/click/core.py", line 1137, in invoke return _process_result(sub_ctx.command.invoke(sub_ctx)) File "/usr/lib/python3/dist-packages/click/core.py", line 956, in invoke return ctx.invoke(self.callback, **ctx.params) File "/usr/lib/python3/dist-packages/click/core.py", line 555, in invoke return callback(*args, **kwargs) File "/home/sean/.local/lib/python3.8/site-packages/sqlite_utils/cli.py", line 841, in insert insert_upsert_implementation( File "/home/sean/.local/lib/python3.8/site-packages/sqlite_utils/cli.py", line 780, in insert_upsert_implementation db[table].insert_all( File "/home/sean/.local/lib/python3.8/site-packages/sqlite_utils/db.py", line 2145, in insert_all self.insert_chunk( File "/home/sean/.local/lib/python3.8/site-packages/sqlite_utils/db.py", line 1957, in insert_chunk result = self.db.execute(query, params) File "/home/sean/.local/lib/python3.8/site-packages/sqlite_utils/db.py", line 257, in execute return self.conn.execute(sql, parameters) ``` I googled the error and checked SO answers and advice, all good. I changed my JSON file to not use integers so I no longer get this error. Of course, that makes using the database a bit harder, so I also tried to solve the problem by modifying DB structure (while using integers in JSON). If change all `INTEGER` Data Types to something else (`ST… | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/309/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
831751367 | MDU6SXNzdWU4MzE3NTEzNjc= | 246 | Escaping FTS search strings | 16001974 | closed | 0 | 4 | 2021-03-15T12:15:09Z | 2021-08-18T18:57:13Z | 2021-08-18T18:43:12Z | CONTRIBUTOR | Thanks for the excellent library, it's very nice to use! I've been building some in memory search functionality for a data annotation tool i'm making, and I got tripped up a little bit with escaping the full text search queries. First I tried using `db.quote(q)`, which doesn't work, because sqlite FTS has it's own (separate)[ query syntax](https://www2.sqlite.org/fts5.html#full_text_query_syntax). You can see this happening here also: http://search-24ways.herokuapp.com/24ways-f8f455f/articles?_search=acces%2A I got around this by aggressively escaping quotes inside the query string like this: ```python quoted = q.replace('"', '""') quoted = f'"{quoted}"' print(quoted) results = db["data"].search(quoted, columns=["id"]) return [x["id"] for x in results] ``` This works in the sense it doesn't crash, but it also removes access to the search query syntax. Given the well specified definition, it might be possible for sqlite-utils to provide a `db.quote_query(q)` which would intelligently escape a query whilst leaving the syntax intact. This would be very nice! | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/246/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
823035080 | MDU6SXNzdWU4MjMwMzUwODA= | 1248 | duckdb database (very low performance in SQLite) | 15836677 | closed | 0 | 1 | 2021-03-05T12:20:29Z | 2021-03-08T00:25:27Z | 2021-03-08T00:25:27Z | NONE | My sqlite is getting too big to be processed by datasette (more than 10 minutes waiting to load) so I am working with duckdb and is waaaaay faster. I think the fastest embeddable database actually. https://duckdb.org/ Taking into account DuckDb is SQLite based it would be GREAT to use it with datasette. is that possible? Regards and thanks for a superb job | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1248/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
1157182254 | I_kwDOBm6k_c5E-TMu | 1646 | Configuration directory mode does not pick up other file extensions than .db | 15640196 | closed | 0 | 3 | 2022-03-02T13:15:23Z | 2022-10-07T23:06:17Z | 2022-10-07T23:03:35Z | NONE | Hello, I've been trying to run Datasette with the [configuration directory mode](https://docs.datasette.io/en/stable/settings.html#configuration-directory-mode) with a structure such as this one: ```plain some-directory/ example.sqlite3 another-example.db one-more.custom [...] ``` (In my scenario I can't just change the filename extension without other problems arising) Now databases with the `.sqlite3` or the custom filename extension are ignored by Datasette in this case. I'm aware that the docs state that a `.db` extension is required, but I was wondering if there is a reason for restricting this or any workaround available? When I run `datasette example.sqlite3` or `datasette one-more.custom` the databases are served by Datasette without a problem. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1646/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
1801394744 | I_kwDOCGYnMM5rXxo4 | 567 | Plugin system | 15178711 | closed | 0 | 9 | 2023-07-12T17:02:14Z | 2023-07-22T22:59:37Z | 2023-07-22T22:59:36Z | CONTRIBUTOR | I'd like there to be a plugin system for sqlite-utils, similar to the datasette/llm plugins. I'd like to make plugins that would do things like: - Register SQLite extensions for more SQL functions + virtual tables - Register new subcommands - Different input file formats for `sqlite-utils memory` - Different output file formats (in addition to `--csv` `--tsv` `--nl` etc. A few real-world use-cases of plugins I'd like to see in sqlite-utils: - Register many of my sqlite extensions in sqlite-utils (`sqlite-http`, `sqlite-lines`, `sqlite-regex`, etc.) - New subcommands to work with `sqlite-vss` vector tables - Input/ouput Parquet/Avro/Arrow IPC files with `sqlite-arrow` | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/567/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
1339663518 | I_kwDOBm6k_c5P2aSe | 1784 | Include "entrypoint" option on `--load-extension`? | 15178711 | closed | 0 | 2 | 2022-08-16T00:22:57Z | 2022-08-23T18:34:31Z | 2022-08-23T18:34:31Z | CONTRIBUTOR | ## Problem SQLite extensions have the option to define multiple "entrypoints" in each loadable extension. For example, the upcoming version of `sqlite-lines` will have 2 entrypoints: the default `sqlite3_lines_init` (which SQLite will automatically guess for) and `sqlite3_lines_noread_init`. The `sqlite3_lines_noread_init` version omits functions that read from the filesystem, which is necessary for security purposes when running untrusted SQL (which Datasette does). (Similar multiple entrypoints will also be added for sqlite-http). The `--load-extension` flag, however, doesn't give the option to specify a different entrypoint, so the default one is always used. ## Proposal I want there to be a new command line option of the `--load-extension` flag to specify a custom entrypoint like so: ``` datasette my.db \ --load-extension ./lines0 sqlite3_lines0_noread_init ``` Then, under the hood, this line of code: https://github.com/simonw/datasette/blob/7af67b54b7d9bca43e948510fc62f6db2b748fa8/datasette/app.py#L562 Would look something like this: ```python conn.execute("SELECT load_extension(?, ?)", [extension, entrypoint]) ``` One potential problem: For backward compatibility, I'm not sure if Click allows cli flags to have variable number of options ("arity"). So I guess it could also use a `:` delimiter like `--static`: ``` datasette my.db \ --load-extension ./lines0:sqlite3_lines0_noread_init ``` Or maybe even a new flag name? ``` datasette my.db \ --load-extension-entrypoint ./lines0 sqlite3_lines0_noread_init ``` Personally I prefer the `:` option... and maybe even `--load-extension` -> `--load`? Definitely out of scope for this issue tho ``` datasette my.db \ --load./lines0:sqlite3_lines0_noread_init ``` | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1784/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
274160723 | MDU6SXNzdWUyNzQxNjA3MjM= | 100 | TemplateAssertionError: no filter named 'tojson' | 13304454 | closed | 0 | 2 | 2017-11-15T13:43:41Z | 2017-11-16T09:25:10Z | 2017-11-16T00:14:13Z | NONE | A 500 error is raised upon clicking on the name of a table on the homepage, say _http://0.0.0.0:8001/_ to _http://0.0.0.0:8001/test_check-c1f4771/users_ The API part seems to function as intended, though... ``` 2017-11-15 14:33:57 - (sanic)[ERROR]: Traceback (most recent call last): File "/usr/local/lib/python3.5/dist-packages/sanic/app.py", line 503, in handle_request response = await response File "/usr/local/lib/python3.5/dist-packages/datasette/app.py", line 155, in get return await self.view_get(request, name, hash, **kwargs) File "/usr/local/lib/python3.5/dist-packages/datasette/app.py", line 219, in view_get **context, File "/usr/local/lib/python3.5/dist-packages/sanic_jinja2/__init__.py", line 84, in render return html(self.render_string(template, request, **context)) File "/usr/local/lib/python3.5/dist-packages/sanic_jinja2/__init__.py", line 81, in render_string return self.env.get_template(template).render(**context) File "/usr/lib/python3/dist-packages/jinja2/environment.py", line 812, in get_template return self._load_template(name, self.make_globals(globals)) File "/usr/lib/python3/dist-packages/jinja2/environment.py", line 786, in _load_template template = self.loader.load(self, name, globals) File "/usr/lib/python3/dist-packages/jinja2/loaders.py", line 125, in load code = environment.compile(source, name, filename) File "/usr/lib/python3/dist-packages/jinja2/environment.py", line 565, in compile self.handle_exception(exc_info, source_hint=source_hint) File "/usr/lib/python3/dist-packages/jinja2/environment.py", line 754, in handle_exception reraise(exc_type, exc_value, tb) File "/usr/lib/python3/dist-packages/jinja2/_compat.py", line 37, in reraise raise value.with_traceback(tb) File "/usr/local/lib/python3.5/dist-packages/datasette/templates/table.html", line 29, in template <pre>params = {{ query.params|tojson(4) }}</pre> File "/usr/lib/python3/dist-packages/jinja2/environment.py", line 515, i… | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/100/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
505512251 | MDU6SXNzdWU1MDU1MTIyNTE= | 588 | Queries per DB table in metadata.json | 12617395 | closed | 0 | 3 | 2019-10-10T21:08:19Z | 2019-10-21T12:58:22Z | 2019-10-21T01:48:42Z | NONE | It doesn't appear possible to have separate queries defined per database table. When I do something like below, my table descriptions show up but not the queries: ` "databases": { "MYDB": { "tables": { "MYFIRSTTABLE": { "source": "Test", "source_url": "https://www.google.com", "queries": { "Query 1": { "sql": "select * from MYFIRSTTABLE", "title": "Query 1", "description": "This is the first query" }, } }, "MYSECONDTABLE": { "source":"Test2", "source_url":"https://www.google.com", "queries": { "Query 2" : { "sql":"select * from MYSECONDTABLE;", "title": "Query 2", "description":"This is the second query" } } } }` | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/588/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
341123355 | MDU6SXNzdWUzNDExMjMzNTU= | 342 | Requesting support for query description | 12617395 | closed | 0 | 4 | 2018-07-13T18:50:16Z | 2018-07-24T04:53:21Z | 2018-07-16T02:33:54Z | NONE | It would be great if the metadata file allowed you to enter a description for the query. We have a lot of pre-defined queries that can only be so descriptive by their name. It would be nice if an optional description could be included underneath the name within the UI, or on hover where it currently shows the SQL. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/342/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
340396247 | MDU6SXNzdWUzNDAzOTYyNDc= | 339 | Expose SANIC_RESPONSE_TIMEOUT config option in a sensible way | 12617395 | closed | 0 | 4 | 2018-07-11T20:38:06Z | 2022-03-21T22:22:40Z | 2022-03-21T22:22:34Z | NONE | Is it possible to configure the sql_time_limit_ms beyond 60 seconds? It seems queries are still timing out at 60 seconds when sql_time_limit_ms is set to 180000. We have a very large data set and often encounter timeouts when testing new queries from the datasette UI. We are optimizing our database as much as we can, but still may require more than 60 seconds for complex queries. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/339/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
334190959 | MDU6SXNzdWUzMzQxOTA5NTk= | 321 | Wildcard support in query parameters | 12617395 | closed | 0 | 3439337 | 8 | 2018-06-20T18:03:56Z | 2018-06-21T17:00:10Z | 2018-06-21T04:55:26Z | NONE | I haven't found a way to get the wildcard (%) inserted automatically in to a query parameter. This would be useful for cases the query parameter is followed by a LIKE clause. Wrapping the parameter name using the wildcard character within the metadata file (ie - ...where xyz like %:querystring%) does not seem to work. Can this be made possible? Or if not, can the template be extended to provide a tip to the user that they need to insert the wildcard characters themselves? | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/321/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | |||||
2007893839 | I_kwDOCGYnMM53rgdP | 605 | Insert fails with `Error: Python int too large to convert to SQLite INTEGER`; can we use `NUMERIC` here? | 12229877 | closed | 0 | 1 | 2023-11-23T10:19:46Z | 2023-12-08T05:07:54Z | 2023-12-08T05:07:54Z | NONE | I'm currently working on a new feature for Hypothesis, where we can dump a tidy jsonlines table of all the test cases we tried - including arguments, outcomes, timings, coverage, etc. Exploring this seems like a perfect cases for `sqlite-utils` and `datasette`, but I pretty quickly ran into an integer overflow problem and don't want to recommend that experience to my users. I originally went to report this as a bug... and then found https://github.com/simonw/sqlite-utils/issues/309#issuecomment-895581038 almost exactly matched my repro 😅 https://github.com/simonw/sqlite-utils/issues/110#issuecomment-626391063 suggests that using `NUMERIC` would avoid this overflow error, although "If the TEXT value is a well-formed integer literal that is too large to fit in a 64-bit signed integer, it is converted to REAL." suggests that this would come at the cost of rounding to the nearest float value. Maybe I should just convert large integers to float before writing out my json? After a bit more hacking, "manually cast large integers to float" seems like a decent solution for my particular case, but having written it up I thought I might as well post this issue anyway - I hope it's useful feedback, and won't mind at all if you close as wontfix if it's not. | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/605/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
309033998 | MDU6SXNzdWUzMDkwMzM5OTg= | 187 | Windows installation error | 11855322 | closed | 0 | 7 | 2018-03-27T16:04:37Z | 2019-06-15T21:44:23Z | 2019-06-15T21:44:23Z | NONE | On attempting install on a Win 7 PC with py 3.6.2 (Anaconda dist) I get the error: ``` Collecting uvloop>=0.5.3 (from Sanic==0.7.0->datasette) Downloading uvloop-0.9.1.tar.gz (1.8MB) 100% |¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦| 1.8MB 12.8MB/s Complete output from command python setup.py egg_info: Traceback (most recent call last): File "<string>", line 1, in <module> File "C:\Users\RCole\AppData\Local\Temp\pip-build-juakfqt8\uvloop\setup.py ", line 10, in <module> raise RuntimeError('uvloop does not support Windows at the moment') RuntimeError: uvloop does not support Windows at the moment ``` | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/187/reactions", "total_count": 4, "+1": 4, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
1497909798 | I_kwDOBm6k_c5ZSEom | 1958 | datasette --root running in Docker doesn't reliably show the magic URL | 11729897 | closed | 0 | 11 | 2022-12-13T16:29:13Z | 2022-12-16T00:59:12Z | 2022-12-16T00:55:19Z | NONE | I followed these steps: `docker run datasetteproject/datasette pip install datasette-upload-csvs` `docker commit $(docker ps -lq) datasette-with-plugins` `docker run -p 8001:8001 -v $(pwd):/mnt datasette-with-plugins datasette --root -p 8001 -h 0.0.0.0` Visited: http://127.0.0.1:8001/-/plugins ![image](https://user-images.githubusercontent.com/11729897/207392071-d939cd5e-1d96-4e11-b0be-dc06dd207866.png) Visited: http://localhost:8001/-/upload-csvs ![image](https://user-images.githubusercontent.com/11729897/207389241-3e96ca66-ca74-4a16-8b7d-4427ee862c5e.png) I may have missed a step? Thank you. --- Ubuntu 22.04.1 LTS | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1958/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
1041778507 | I_kwDOCGYnMM4-GEdL | 334 | Filter by datetime objects using rows_where() | 11642379 | closed | 0 | 0 | 2021-11-02T00:44:08Z | 2021-11-13T19:23:21Z | 2021-11-13T19:23:21Z | NONE | Firstly, thanks for this nice utility. It would be nice to have an example in the docs on how to filter by date range using `rows_where()`. This doesn't seem to work: ``` table.rows_where('datetime(created) between datetime("2021-10-31T17:29:59.277428-04:00") AND datetime("2021-11-01T03:44:04.544651+00:00")') ``` I could probably just use `db.query()`, which works for the above, but it would be nice if I could pass in `datetime` objects in `rows_where()`. Thanks. | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/334/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
549287310 | MDU6SXNzdWU1NDkyODczMTA= | 76 | order_by mechanism | 10501166 | closed | 0 | 4 | 2020-01-14T02:06:03Z | 2020-04-16T06:23:29Z | 2020-04-16T03:13:06Z | NONE | In some cases, I want to iterate rows in a table with `ORDER BY` clause. It would be nice to have a `rows_order_by` function similar to `rows_where`. In a more general case, `rows_filter` function might be added to allow more customized filtering to iterate rows. | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/76/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
1817281557 | I_kwDOC8SPRc5sUYQV | 37 | cannot use jinja filters in display? | 10352819 | closed | 0 | 1 | 2023-07-23T20:09:54Z | 2023-07-23T20:18:27Z | 2023-07-23T20:18:26Z | NONE | Hi, I'm trying to have a display function in Dogsheep's `config.yml` that includes something like this: ``` <h3> <a href="{{ urls.row('my_database', 'my_table', key) }}">{{ display.title }}</a> <a href="{{ display.url }}🔗" target="_blank">(source)</a> </h3> <p>{{ display.snippet|safe }}</p> ``` Unfortunately, rendering fails with a message 'urls is undefined'. The same happens if I'm trying to build a row URL manually, using filters like `quote_plus` (as my keys are URLs). Any hints? Thanks! | 197431109 | issue | { "url": "https://api.github.com/repos/dogsheep/dogsheep-beta/issues/37/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
471292050 | MDU6SXNzdWU0NzEyOTIwNTA= | 563 | incorrect json url for row-level data? | 10352819 | closed | 0 | 0 | 2019-07-22T19:59:38Z | 2019-10-21T02:03:09Z | 2019-10-21T02:03:09Z | CONTRIBUTOR | While visiting [this example page](https://register-of-members-interests.datasettes.com/regmem-98dc8b7/people/uk.org.publicwhip%2Fperson%2F10001) (linked from Datasette documentation), manually clicking on [the link](https://register-of-members-interests.datasettes.com/regmem-98dc8b7/people/uk.org.publicwhip%2Fperson%2F10001?_format=json) ("This data as .json") to the json data results in an error 500 `data() got an unexpected keyword argument 'as_format'` The [JSON page linked to from the documentation](https://register-of-members-interests.datasettes.com/regmem-d22c12c/people/uk.org.publicwhip%2Fperson%2F10001.json) however is correct (the page address ends in `.json` rather than using a query string `?format=json`) This particular datasette demo page is now a few versions behind, but I was able to reproduce the issue using v0.29.2 and a downloaded copy of the demo database (and also with the current HEAD). Here is a stack trace: ``` Traceback (most recent call last): File "/home/romain/miniconda3/envs/dsbug/lib/python3.7/site-packages/datasette/utils/asgi.py", line 101, in __call__ return await view(new_scope, receive, send) File "/home/romain/miniconda3/envs/dsbug/lib/python3.7/site-packages/datasette/utils/asgi.py", line 173, in view request, **scope["url_route"]["kwargs"] File "/home/romain/miniconda3/envs/dsbug/lib/python3.7/site-packages/datasette/views/base.py", line 267, in get request, database, hash, correct_hash_provided, **kwargs File "/home/romain/miniconda3/envs/dsbug/lib/python3.7/site-packages/datasette/views/base.py", line 399, in view_get request, database, hash, **kwargs TypeError: data() got an unexpected keyword argument 'as_format' ``` | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/563/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
398559195 | MDU6SXNzdWUzOTg1NTkxOTU= | 400 | datasette publish cloudrun plugin | 10352819 | closed | 0 | 1 | 2019-01-12T14:35:11Z | 2019-05-03T16:57:35Z | 2019-05-03T16:57:35Z | CONTRIBUTOR | Google announced that they may launch a simple service for running Docker containers (previously serverless containers, now called "cloud run" -- link to alpha [here](https://services.google.com/fb/forms/serverlesscontainers/)). If/when this happens, it might be a good fit for publishing datasettes? (at least using the current version, manually publishing a datasette seems relatively painless). | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/400/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
559197745 | MDU6SXNzdWU1NTkxOTc3NDU= | 82 | Tutorial command no longer works | 10350886 | closed | 0 | 3 | 2020-02-03T16:36:11Z | 2020-02-27T04:16:43Z | 2020-02-27T04:16:30Z | NONE | Issue with command on [tutorial](https://simonwillison.net/2019/Feb/25/sqlite-utils/) on Simon's site. The following command no longer works, and breaks with the previous too many variables error: #50 ``` cmd > curl "https://data.nasa.gov/resource/y77d-th95.json" | \ sqlite-utils insert meteorites.db meteorites - --pk=id ``` Output: ``` cmd Traceback (most recent call last): File "continuum\miniconda3\envs\main\lib\runpy.py", line 193, in _run_module_as_main "__main__", mod_spec) File "continuum\miniconda3\envs\main\lib\runpy.py", line 85, in _run_code exec(code, run_globals) File "Continuum\miniconda3\envs\main\Scripts\sqlite-utils.exe\__main__.py", line 9, in <module> File "continuum\miniconda3\envs\main\lib\site-packages\click\core.py", line 764, in __call__ return self.main(*args, **kwargs) File "continuum\miniconda3\envs\main\lib\site-packages\click\core.py", line 717, in main rv = self.invoke(ctx) File "continuum\miniconda3\envs\main\lib\site-packages\click\core.py", line 1137, in invoke return _process_result(sub_ctx.command.invoke(sub_ctx)) File "continuum\miniconda3\envs\main\lib\site-packages\click\core.py", line 956, in invoke return ctx.invoke(self.callback, **ctx.params) File "continuum\miniconda3\envs\main\lib\site-packages\click\core.py", line 555, in invoke return callback(*args, **kwargs) File "continuum\miniconda3\envs\main\lib\site-packages\sqlite_utils\cli.py", line 434, in insert default=default, File "continuum\miniconda3\envs\main\lib\site-packages\sqlite_utils\cli.py", line 384, in insert_upsert_implementation docs, pk=pk, batch_size=batch_size, alter=alter, **extra_kwargs File "continuum\miniconda3\envs\main\lib\site-packages\sqlite_utils\db.py", line 1081, in insert_all result = self.db.conn.execute(query, params) sqlite3.OperationalError: too many SQL variables ``` My thought is that maybe the dataset grew over the last few years and so didn't run into this issue before. No error… | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/82/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
1181432624 | I_kwDOBm6k_c5Gazsw | 1688 | [plugins][documentation] Is it possible to serve per-plugin static folders when writing one-off (single file) plugins? | 9020979 | closed | 0 | 3 | 2022-03-26T01:17:44Z | 2022-03-27T01:01:14Z | 2022-03-26T21:34:47Z | CONTRIBUTOR | I'm trying to make a small plugin that depends on static assets, by following the guide [here](https://docs.datasette.io/en/stable/writing_plugins.html#writing-one-off-plugins). I made a `plugins/` directory with `datasette_nteract_data_explorer.py`. I am trying to follow the example of `datasette_vega`, and serving static assets. I created a `statics/` directory within `plugins/` to serve my JS and CSS. https://github.com/simonw/datasette-vega/blob/00de059ab1ef77394ba9f9547abfacf966c479c4/datasette_vega/__init__.py#L13 Unfortunately, datasette doesn't seem to be able to find my assets. Input: ```bash datasette ~/Library/Safari/History.db --plugins-dir=plugins/ ``` ![Image 2022-03-25 at 9 18 17 PM](https://user-images.githubusercontent.com/9020979/160218979-a3ff474b-5255-4a76-85d1-6f90ab2e3b44.jpg) Output: ![Image 2022-03-25 at 9 11 00 PM](https://user-images.githubusercontent.com/9020979/160218733-ca5144cf-f23f-43d8-a8d3-e3a871e57f3a.jpg) I suspect this issue might go away if I move away from "one-off" plugin mode, but it's been a while since I created a new python package so I'm not sure how much work there is to go between "one off" and "packaged for PyPI". I'd like to try to avoid needing to repackage a new `tar.gz` file and or reinstall my library repeatedly when developing new python code. 1. Is there a way to serve a static assets when using the `plugins/` directory method instead of installing plugins as a new python package? 2. If not, is there a way I can work on developing a plugin without creating and repackaging tar.gz files after every change, or is that the recommended path? Thanks for your help! | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1688/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
512996469 | MDU6SXNzdWU1MTI5OTY0Njk= | 607 | Ways to improve fuzzy search speed on larger data sets? | 8431341 | closed | 0 | 6 | 2019-10-27T17:31:37Z | 2019-11-07T03:38:10Z | 2019-11-07T03:38:10Z | NONE | I have an sqlite table with 16 million rows in it. Having read @simonw article "[Fast Autocomplete Search for Your Website](https://24ways.org/2018/fast-autocomplete-search-for-your-website/)" I was curious to try datasette to see what kind of query performance I could get out of it. In truth I don't need to do full text search since all I would like to do is give my users a way to search for the names of investors such as "Warren Buffet", or "Tim Cook" (who's names are in a single column). On the first search, Datasette takes over 20 seconds to return all records associated with `elon musk`: > ![image](https://user-images.githubusercontent.com/8431341/67638889-a86e1100-f8b7-11e9-9f7e-a9d13a42e988.png) > ![image](https://user-images.githubusercontent.com/8431341/67638825-ed457800-f8b6-11e9-94d1-b44f1a40ee8c.png) If I rerun the same search, it then takes almost 9 seconds: > ![image](https://user-images.githubusercontent.com/8431341/67638908-e4a17180-f8b7-11e9-9d00-748c80ef1f21.png) That's far to slow to implement an autocomplete feature. I could reduce the latency by making a special table of only unique investor names, thereby reducing the search space to less than a million rows (then I'd need to implement a way to add only new investor names to the table as I received new data.. about 4,000 rows a day). If I did that, I'm still concerned the new table wouldn't be lean enough to lookup investor names quickly. Plus, even if I can implement the autocomplete feature, I would still finally have to lookup records for that investors which would take between 8 - 20 seconds. Are there any tricks for speeding this up? Here's my hardware: > ![image](https://user-images.githubusercontent.com/8431341/67638861-55945980-f8b7-11e9-96a8-ca76c7c68c5d.png) | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/607/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
476437213 | MDU6SXNzdWU0NzY0MzcyMTM= | 566 | Unexpected keyword argument 'hidden' | 8330931 | closed | 0 | 1 | 2019-08-03T10:07:57Z | 2019-08-03T16:13:36Z | 2019-08-03T16:13:36Z | NONE | I couldn't get a test example running. I am running python 3.6.8 and tried both windows and windows subsystem for linux, getting the same error. My test.db was created by converting a five line csv file with csvs-to-sqlite. The csv file is: col1, col2, col3 1,2,3 4,5,6 7,8,9 10,11,12 Here is the error message: (myvenv) davido@DESKTOP-L29G79U:~/dot/datasette-eg$ datasette test.db Traceback (most recent call last): File "/home/davido/dot/datasette-eg/myvenv/bin/datasette", line 7, in <module> from datasette.cli import cli File "/home/davido/dot/datasette-eg/myvenv/lib/python3.6/site-packages/datasette/cli.py", line 2, in <module> import uvicorn File "/home/davido/dot/datasette-eg/myvenv/lib/python3.6/site-packages/uvicorn/__init__.py", line 2, in <module> from uvicorn.main import Server, main, run File "/home/davido/dot/datasette-eg/myvenv/lib/python3.6/site-packages/uvicorn/main.py", line 224, in <module> headers: typing.List[str], File "/home/davido/dot/datasette-eg/myvenv/lib/python3.6/site-packages/click/decorators.py", line 170, in decorator _param_memo(f, OptionClass(param_decls, **attrs)) File "/home/davido/dot/datasette-eg/myvenv/lib/python3.6/site-packages/click/core.py", line 1430, in __init__ Parameter.__init__(self, param_decls, type=type, **attrs) TypeError: __init__() got an unexpected keyword a… | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/566/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
564579430 | MDU6SXNzdWU1NjQ1Nzk0MzA= | 86 | Problem with square bracket in CSV column name | 8149512 | closed | 0 | 7 | 2020-02-13T10:19:57Z | 2020-02-27T04:16:08Z | 2020-02-27T04:16:07Z | NONE | testing some data from european power information (entsoe.eu), the title of the csv contains square brackets. as I am playing with glitch, sqlite-utils are used for creating the db. Traceback (most recent call last): File "/app/.local/bin/sqlite-utils", line 8, in <module> sys.exit(cli()) File "/app/.local/lib/python3.7/site-packages/click/core.py", line 764, in __call__ return self.main(*args, **kwargs) File "/app/.local/lib/python3.7/site-packages/click/core.py", line 717, in main rv = self.invoke(ctx) File "/app/.local/lib/python3.7/site-packages/click/core.py", line 1137, in invoke return _process_result(sub_ctx.command.invoke(sub_ctx)) File "/app/.local/lib/python3.7/site-packages/click/core.py", line 956, in invoke return ctx.invoke(self.callback, **ctx.params) File "/app/.local/lib/python3.7/site-packages/click/core.py", line 555, in invoke return callback(*args, **kwargs) File "/app/.local/lib/python3.7/site-packages/sqlite_utils/cli.py", line 434, in insert default=default, File "/app/.local/lib/python3.7/site-packages/sqlite_utils/cli.py", line 384, in insert_upsert_implementation docs, pk=pk, batch_size=batch_size, alter=alter, **extra_kwargs File "/app/.local/lib/python3.7/site-packages/sqlite_utils/db.py", line 997, in insert_all extracts=extracts, File "/app/.local/lib/python3.7/site-packages/sqlite_utils/db.py", line 618, in create extracts=extracts, File "/app/.local/lib/python3.7/site-packages/sqlite_utils/db.py", line 310, in create_table self.conn.execute(sql) sqlite3.OperationalError: unrecognized token: "]" entsoe_2016.csv renamed to txt for uploading compatibility [entsoe_2016.txt](https://github.com/simonw/sqlite-utils/files/4197688/entsoe_2016.txt) code is remixed directly from your https://glitch.com/edit/#!/datasette-csvs repo | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/86/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
459397625 | MDU6SXNzdWU0NTkzOTc2MjU= | 514 | Documentation with recommendations on running Datasette in production without using Docker | 7936571 | closed | 0 | 5971510 | 27 | 2019-06-21T22:48:12Z | 2020-10-08T23:55:53Z | 2020-10-08T23:33:05Z | NONE | I've got some SQLite databases too big to push to Heroku or the other services with built-in support in datasette. So instead I moved my datasette code and databases to a remote server on Kimsufi. In the folder containing the SQLite databases I run the following code. `nohup datasette serve -h 0.0.0.0 *.db --cors --port 8000 --metadata metadata.json > output.log 2>&1 &`. When I go to `http://my-remote-server.com:8000`, the site loads. But I know this is not a good long-term solution to running datasette on this server. What is the "correct" way to have this site run, preferably on server port 80? | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/514/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | |||||
457201907 | MDU6SXNzdWU0NTcyMDE5MDc= | 513 | Is it possible to publish to Heroku despite slug size being too large? | 7936571 | closed | 0 | 2 | 2019-06-18T00:12:02Z | 2019-06-21T22:35:54Z | 2019-06-21T22:35:54Z | NONE | I'm trying to push more than 1.5GB worth of SQLite databases -- 535MB compressed -- to Heroku but I get this error when I run the `datasette publish heroku` command. Compiled slug size: 535.5M is too large (max is 500M). Can I publish the databases and make datasette work on Heroku despite the large slug size? | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/513/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
453243459 | MDU6SXNzdWU0NTMyNDM0NTk= | 503 | Handle SQLite databases with spaces in their names? | 7936571 | closed | 0 | 9599 | 1 | 2019-06-06T21:20:59Z | 2019-11-04T23:16:30Z | 2019-11-04T23:16:30Z | NONE | I named my SQLite database "Government workers" and published it to Heroku. When I clicked the "Government workers" database online it lead to a 404 page: `Database not found: Government%20workers`. I believe this is because the database name has a space. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/503/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | |||||
453131917 | MDU6SXNzdWU0NTMxMzE5MTc= | 502 | Exporting sqlite database(s)? | 7936571 | closed | 0 | 3 | 2019-06-06T16:39:53Z | 2021-04-03T05:16:54Z | 2019-06-11T18:50:42Z | NONE | I'm working on datasette from one computer. But if I want to work on it from another computer and want to copy the SQLite database(s) already on the Heroku datasette instance, how to I copy the database(s) to the second computer so that I can then update it and push to online via datasette's command line code that pushes code to Heroku? | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/502/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
451513541 | MDU6SXNzdWU0NTE1MTM1NDE= | 498 | Full text search of all tables at once? | 7936571 | closed | 0 | 12 | 2019-06-03T14:24:43Z | 2020-05-30T17:26:02Z | 2020-05-30T17:26:02Z | NONE | Does datasette have a built-in way, in a browser, to do a full-text search of all columns, in all databases and tables, that have full-text search enabled? Is there a plugin that does this? | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/498/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
1740150327 | I_kwDOCGYnMM5nuJY3 | 557 | Aliased ROWID option for tables created from alter=True commands | 7908073 | closed | 0 | 2 | 2023-06-04T05:29:28Z | 2023-06-14T06:09:21Z | 2023-06-05T19:26:26Z | CONTRIBUTOR | > If you use INTEGER PRIMARY KEY column, the VACUUM does not change the values of that column. However, if you use unaliased rowid, the VACUUM command will reset the rowid values. ROWID should never be used with foreign keys but the simple act of aliasing rowid to id (which is what happens when one does `id integer primary key` DDL) makes it OK. It would be convenient if there were more options to use a string column (eg. filepath) as the PK, and be able to use it during upserts, but when creating a foreign key, to create an integer column which aliases rowid I made an attempt to switch to integer primary keys here but it is not going well... In my usecase the path column is a business key. Yes, it should be as simple as including the `id` column in any select statement where I plan on using `upsert` but it would be nice if this could be abstracted away somehow https://github.com/chapmanjacobd/library/commit/788cd125be01d76f0fe2153335d9f6b21db1343c https://github.com/chapmanjacobd/library/actions/runs/5173602136/jobs/9319024777 | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/557/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
1655860104 | I_kwDOCGYnMM5ismuI | 535 | rows: --transpose or psql extended view-like functionality | 7908073 | closed | 0 | 2 | 2023-04-05T15:37:33Z | 2023-06-15T08:39:49Z | 2023-06-14T22:05:28Z | CONTRIBUTOR | It would be nice if the rows subcommand had a flag, perhaps called `--transpose` which would print in long form instead of wide. Similar to extended display mode in psql (`\x`) In other words instead of this: ``` sqlite-utils rows --limit 5 --fmt github track_metadata.db songs ``` | track_id | title | song_id | release | artist_id | artist_mbid | artist_name | duration | artist_familiarity | artist_hotttnesss | year | track_7digitalid | shs_perf | shs_work | |--------------------|-------------------|--------------------|--------------------------------------|--------------------|--------------------------------------|------------------|------------|----------------------|---------------------|--------|--------------------|------------|------------| | TRMMMYQ128F932D901 | Silent Night | SOQMMHC12AB0180CB8 | Monster Ballads X-Mas | ARYZTJS1187B98C555 | 357ff05d-848a-44cf-b608-cb34b5701ae5 | Faster Pussy cat | 252.055 | 0.649822 | 0.394032 | 2003 | 7032331 | -1 | 0 | | TRMMMKD128F425225D | Tanssi vaan | SOVFVAK12A8C1350D9 | Karkuteillä | ARMVN3U1187FB3A1EB | 8d7ef530-a6fd-4f8f-b2e2-74aec765e0f9 | Karkkiautomaatti | 156.551 | 0.439604 | 0.356992 | 1995 | 1514808 | -1 | 0 | | TRMMMRX128F93187D9 | No One Could Ever | SOGTUKN12AB017F4F1 | Butter | ARGEKB01187FB50750 | 3d403d44-36ce-465c-ad43-ae877e65adc4 | Hudson Mohawke | 138.971 | 0.643681 | 0.437504 | 2006 | 6945353 | -1 | 0 | | TRMMMCH128F425532C | Si Vos Querés | SOBNYVR12A8C13558C | De Culo | ARNWYLR1187B9B2F9C | 12be7648-7094-495f-90e6-df4189d68615 | Yerba Brava | 145.058 | 0.448501 | 0.372349 | 2003 | 2168257 |… | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/535/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
1581090327 | I_kwDOCGYnMM5ePYYX | 529 | Microsoft line endings | 7908073 | closed | 0 | 1 | 2023-02-12T02:20:48Z | 2023-06-14T23:12:12Z | 2023-06-14T23:11:47Z | CONTRIBUTOR | sqlite-utils prints `\r\n` but [it should probably](https://devblogs.microsoft.com/commandline/extended-eol-in-notepad/) print `\n` (unless the platform is detected as Windows?) It has tripped me up a few times when piping the output of sqlite-utils to other programs: ``` $ sqlite-utils --no-headers --csv ~/lb/fs/d.db 'select path from media limit 1' | cat -A /mnt/d7/file^M$ $ sqlite-utils --no-headers --csv ~/lb/fs/d.db 'select path from media limit 1' | tr -d '\r' | cat -A /mnt/d7/file$ ``` | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/529/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
1436539554 | I_kwDOCGYnMM5Vn9qi | 511 | [insert_all, upsert_all] IntegrityError: constraint failed | 7908073 | closed | 0 | 2 | 2022-11-04T19:21:48Z | 2022-11-04T22:59:54Z | 2022-11-04T22:54:09Z | CONTRIBUTOR | My understand is that `INSERT OR IGNORE` will ignore when inserts would cause duplicate keys so I'm not sure exactly why the error is raised from `sqlite3`. ``` import argparse from pathlib import Path from xklb import db, utils from xklb.utils import log def parse_args() -> argparse.Namespace: parser = argparse.ArgumentParser() parser.add_argument("database") parser.add_argument("dbs", nargs="*") parser.add_argument("--upsert") parser.add_argument("--db", "-db", help=argparse.SUPPRESS) parser.add_argument("--verbose", "-v", action="count", default=0) args = parser.parse_args() if args.db: args.database = args.db Path(args.database).touch() args.db = db.connect(args) log.info(utils.dict_filter_bool(args.__dict__)) return args def merge_db(args, source_db): source_db = str(Path(source_db).resolve()) s_db = db.connect(argparse.Namespace(database=source_db, verbose=args.verbose)) for table in [s for s in s_db.table_names() if not "_fts" in s and not s.startswith("sqlite_")]: log.info("[%s]: %s", source_db, table) with s_db.conn: data = s_db[table].rows with args.db.conn: if args.upsert: args.db[table].upsert_all(data, pk=args.upsert.split(","), alter=True) else: args.db[table].insert_all(data, alter=True, replace=True) def merge_dbs(): args = parse_args() for s_db in args.dbs: merge_db(args, s_db) if __name__ == "__main__": merge_dbs() ``` ``` $ lb-dev merge video.db tube_71.db --upsert path -vv SQL: INSERT OR IGNORE INTO [media]([path]) VALUES(?); - params: ['https://archive.org/details/088ghostofachanceroygetssackedrevengeofthelivinglunchdvdripxvidphz'] ... File ~/.local/lib/python3.10/site-packages/sqlite_utils/db.py:3122, in Table.insert_all(self, records, pk, foreign_keys, column_order, not_null, defaults, batch_size, hash_id, hash_id_columns, alter, ignore, re… | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/511/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
1430325103 | I_kwDOCGYnMM5VQQdv | 507 | conn.execute: UnicodeEncodeError: 'utf-8' codec can't encode character | 7908073 | closed | 0 | 1 | 2022-10-31T18:49:51Z | 2022-11-01T00:40:17Z | 2022-11-01T00:40:16Z | CONTRIBUTOR | I'm not really sure what caused this and it happened in the middle of my program (after running for 35775 seconds). ``` Extracting metadata 49.9% (chunk 9893 of 19831) ... File "/home/xk/.local/lib/python3.10/site-packages/xklb/fs_extract.py", line 90, in extract_chunk args.db["media"].insert_all(utils.list_dict_filter_bool(media), pk="path", alter=True, replace=True) File "/home/xk/.local/lib/python3.10/site-packages/sqlite_utils/db.py", line 3107, in insert_all self.insert_chunk( File "/home/xk/.local/lib/python3.10/site-packages/sqlite_utils/db.py", line 2872, in insert_chunk result = self.db.execute(query, params) File "/home/xk/.local/lib/python3.10/site-packages/sqlite_utils/db.py", line 483, in execute return self.conn.execute(sql, parameters) UnicodeEncodeError: 'utf-8' codec can't encode character '\udcc3' in position 62: surrogates not allowed ``` This might be relevant: https://stackoverflow.com/questions/31898353/python-cant-encode-with-surrogateescape I'm going to try re-running with ```py def execute( self, sql: str, parameters: Optional[Union[Iterable, dict]] = None ) -> sqlite3.Cursor: """ Execute SQL query and return a ``sqlite3.Cursor``. :param sql: SQL query to execute :param parameters: Parameters to use in that query - an iterable for ``where id = ?`` parameters, or a dictionary for ``where id = :id`` """ try: if self._tracer: self._tracer(sql, parameters) if parameters is not None: return self.conn.execute(sql, parameters) else: return self.conn.execute(sql) except UnicodeEncodeError: sql = sql.encode('utf-8', 'surrogatepass').decode('utf-8') if parameters is not None: parameters = parameters.encode('utf-8', 'surrogatepass').decode('utf-8') return self.execute(sql, parameters) ``` | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/507/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
1393212964 | I_kwDOCGYnMM5TCr4k | 497 | column_names | 7908073 | closed | 0 | 1 | 2022-10-01T03:34:21Z | 2022-10-25T21:09:28Z | 2022-10-25T21:09:28Z | CONTRIBUTOR | It would be nice to have a `column_names`. Similar to `table_names`. Or if you could get one or all of the following syntax to work for both Database and Table that might be even better: Style 1 - `if 'table1' in db` - `if 'col1' in db['table1']` Style 2 - `if 'table1' in db.tables` - `if 'col1' in db['table1'].columns` maybe the table ones actually work but I'm too lazy to check. I just know that I have to do: `[c.name for c in db['table1'].columns]` Edit: This is possible with `columns_dict`. I have actually used that before but I forgot about it. Feel free to close, but I do think accessing this data could be more consistent and intuitive. | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/497/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
1361355564 | I_kwDOCGYnMM5RJKMs | 482 | balanced table default column_order | 7908073 | closed | 0 | 1 | 2022-09-05T03:00:18Z | 2022-10-10T17:43:02Z | 2022-09-06T20:17:27Z | CONTRIBUTOR | Is there any performance or size difference with column order in SQLITE ? similar to this https://www.cybertec-postgresql.com/en/column-order-in-postgresql-does-matter/ It might be interesting to have an option to create with an optimized column order. I'm assuming this would look something like INTEGER columns, REAL columns, BLOB columns, TEXT columns, NULL columns. NULL columns at the end because they are more likely to be TEXT and it is impossible to know if they will become INTEGER (Of course, any schema evolution would reduce optimization but maybe column order could also be re-evaluated when schema changes) edit: this is easy to accomplish with the existing `transform` method: ``` int_columns = [k for k, v in table_columns.items() if v == int] db[table].transform(column_order=[*int_columns]) ``` | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/482/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
1239034903 | I_kwDOCGYnMM5J2iwX | 433 | CLI eats my cursor | 7908073 | closed | 0 | 10 | 2022-05-17T18:52:52Z | 2023-11-04T00:46:30Z | 2023-11-04T00:46:30Z | CONTRIBUTOR | I'm not sure why this happens but `sqlite-utils` makes my terminal cursor disappear after running commands like `sqlite-utils insert`. I've only noticed this behavior in `sqlite-utils`, not in any other CLI tools I can still type commands after it runs but the text cursor is invisible | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/433/reactions", "total_count": 5, "+1": 5, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
442327592 | MDU6SXNzdWU0NDIzMjc1OTI= | 456 | Installing installs the tests package | 7725188 | closed | 0 | 3 | 2019-05-09T16:35:16Z | 2020-07-24T20:39:54Z | 2020-07-24T20:39:54Z | CONTRIBUTOR | Because `setup.py` uses `find_packages` and `tests` is on the top-level, `pip install datasette` will install a top-level package called `tests`, which is probably not desired behavior. The offending line is here: https://github.com/simonw/datasette/blob/bfa2ae0d16d39bb82dbe4da4f3fdc3c7f6257418/setup.py#L40 And only `pip uninstall datasette` with a conflicting package would warn you by default; apparently another package had the same problem, which is why I get this message when uninstalling: ``` $ pip uninstall datasette Uninstalling datasette-0.27: Would remove: /usr/local/bin/datasette /usr/local/lib/python3.7/site-packages/datasette-0.27.dist-info/* /usr/local/lib/python3.7/site-packages/datasette/* /usr/local/lib/python3.7/site-packages/tests/* Would not remove (might be manually added): [ .. snip .. ] Proceed (y/n)? ``` This should be a relatively simple fix, and I could drop a PR if desired! Cheers | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/456/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
1501900064 | I_kwDOBm6k_c5ZhS0g | 1966 | Broken link to live demo in Getting started docs | 7551922 | closed | 0 | 1 | 2022-12-18T13:17:00Z | 2022-12-31T19:15:19Z | 2022-12-31T19:15:10Z | NONE | The link in [Play with a live demo in Getting started](https://github.com/simonw/datasette/blob/main/docs/getting_started.rst#play-with-a-live-demo) to [https://fivethirtyeight.datasettes.com/fivethirtyeight](https://fivethirtyeight.datasettes.com/fivethirtyeight) is broken and the datasette is no longer working (maybe due to the end of the free tier). | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1966/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
934123448 | MDU6SXNzdWU5MzQxMjM0NDg= | 295 | Insert with --tsv and --no-headers give error about --nl arguments | 7288187 | closed | 0 | 1 | 2021-06-30T21:01:01Z | 2021-08-18T20:19:04Z | 2021-08-18T20:18:57Z | NONE | Not quite sure if this is a bug, or just an assumption I made but I thought `--tsv` and `--no-headers` would work together when inserting from a file, and currently they seem not to (sqlite-utils, version 3.12, installed on Mac OS X via brew) Instead it says: `Error: Use just one of --nl, --csv or --tsv` As if it has interpreted the --no-headers as --nl. The --help does specifically say CSV: `--no-headers CSV file has no header row` And this heading in the documentation also only refers to CSV, but the text does mention TSV in passing, and I'd generally expect them to behave the same in most cases. https://sqlite-utils.datasette.io/en/stable/cli.html#csv-files-without-a-header-row | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/295/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
814591962 | MDU6SXNzdWU4MTQ1OTE5NjI= | 1240 | Allow facetting on custom queries | 7107523 | closed | 0 | 3 | 2021-02-23T15:52:19Z | 2021-02-26T18:19:46Z | 2021-02-26T18:18:18Z | NONE | Facets are a tremendously useful feature, especially for people peeking at the database for the first time and still having little knowledge about the details of the data. It is of great assistance to discover interesting features to explore futher in advanced queries. Yet, it seems it's impossible to use facets when running a custom SQL query, be it from the little gear icons in column names, the facet suggestions at the top (hidden when performing a custom query), or by appending a facet code to the URL. Is there a technical limitation, or is this something that could be unlocked easily? | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1240/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
810397025 | MDU6SXNzdWU4MTAzOTcwMjU= | 1228 | 500 error caused by faceting if a column called `n` exists | 7107523 | closed | 0 | 5 | 2021-02-17T17:41:20Z | 2022-03-19T06:44:40Z | 2022-03-19T01:38:04Z | NONE | I recently discovered `datasette` thanks to your great talk at FOSDEM and would like to use it for some projects. However, when trying to use it on databases created from some csv ot tsv files, I am sometimes getting this issue when going to http://127.0.0.1:8001/databasetest/databasetest and I don't exactly understand what it refers to. So far, I couldn't find anything relevant when reviewing the raw text files that could explain this issue, nor could I find something obvious between the files that generate this issue and those that don't. Does the error ring a bell and, if so, could you please point me to the right direction? ``` $ datasette databasetest.db INFO: Started server process [1408482] INFO: Waiting for application startup. INFO: Application startup complete. INFO: Uvicorn running on http://127.0.0.1:8001 (Press CTRL+C to quit) INFO: 127.0.0.1:56394 - "GET / HTTP/1.1" 200 OK INFO: 127.0.0.1:56394 - "GET /-/static/app.css?4e362c HTTP/1.1" 200 OK INFO: 127.0.0.1:56396 - "GET /-/static-plugins/datasette_vega/main.2acbb312.css HTTP/1.1" 200 OK INFO: 127.0.0.1:56398 - "GET /-/static-plugins/datasette_vega/main.08f5d3d8.js HTTP/1.1" 200 OK Traceback (most recent call last): File "/home/kabouik/.local/lib/python3.7/site-packages/datasette/app.py", line 1099, in route_path response = await view(request, send) File "/home/kabouik/.local/lib/python3.7/site-packages/datasette/views/base.py", line 147, in view request, **request.scope["url_route"]["kwargs"] File "/home/kabouik/.local/lib/python3.7/site-packages/datasette/views/base.py", line 121, in dispatch_request return await handler(request, *args, **kwargs) File "/home/kabouik/.local/lib/python3.7/site-packages/datasette/views/base.py", line 260, in get request, database, hash, correct_hash_provided, **kwargs File "/home/kabouik/.local/lib/python3.7/site-packages/datasette/views/base.py", line 434, in view_get request, database, hash, **kwargs File "/home/kabouik/.loc… | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1228/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
810618495 | MDU6SXNzdWU4MTA2MTg0OTU= | 235 | Extract columns cannot create foreign key relation: sqlite3.OperationalError: table sqlite_master may not be modified | 6913891 | closed | 0 | 18 | 2021-02-17T23:33:23Z | 2023-06-26T01:47:01Z | 2023-06-25T23:25:53Z | NONE | Thanks for what seems like a truly great suite of libraries. I wanted to try out Datasette, but never got more than half way through your YouTube video with the SF tree dataset. Whenever I try to extract a column, I get a `sqlite3.OperationalError: table sqlite_master may not be modified` error from Python. This snippet reproduces the error on my system, Python 3.9.1 and sqlite-utils 3.5 on an M1 Macbook Pro running in rosetta mode: ``` curl "https://data.nasa.gov/resource/y77d-th95.json" | \ sqlite-utils insert meteorites.db meteorites - --pk=id sqlite-utils extract meteorites.db meteorites recclass ``` I have tried googling the problem, but all I've found is that this *might* be a problem with the sqlite3 database running in defensive mode, but I definitely can't know for sure. Does the problem seem familiar to you? | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/235/reactions", "total_count": 3, "+1": 3, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
1088816961 | I_kwDODEm0Qs5A5gdB | 62 | KeyError: 'created_at' for private accounts? | 6764957 | closed | 0 | 2 | 2021-12-26T17:51:51Z | 2022-03-12T02:36:32Z | 2022-02-24T18:10:18Z | NONE | hey Simon! i was running `twitter-to-sqlite user-timeline twitter.db` for [my private alt](https://twitter.com/swyxio) and ran into this error: <details> <summary> ![image](https://user-images.githubusercontent.com/6764957/147416165-46b69c30-100a-406f-8534-8612b75547ae.png) </summary> ```bash Traceback (most recent call last): File "/Users/swyx/Work/datasette/env/bin/twitter-to-sqlite", line 8, in <module> sys.exit(cli()) File "/Users/swyx/Work/datasette/env/lib/python3.9/site-packages/click/core.py", line 1128, in __call__ return self.main(*args, **kwargs) File "/Users/swyx/Work/datasette/env/lib/python3.9/site-packages/click/core.py", line 1053, in main rv = self.invoke(ctx) File "/Users/swyx/Work/datasette/env/lib/python3.9/site-packages/click/core.py", line 1659, in invoke return _process_result(sub_ctx.command.invoke(sub_ctx)) File "/Users/swyx/Work/datasette/env/lib/python3.9/site-packages/click/core.py", line 1395, in invoke return ctx.invoke(self.callback, **ctx.params) File "/Users/swyx/Work/datasette/env/lib/python3.9/site-packages/click/core.py", line 754, in invoke return __callback(*args, **kwargs) File "/Users/swyx/Work/datasette/env/lib/python3.9/site-packages/twitter_to_sqlite/cli.py", line 291, in user_timeline profile = utils.get_profile(db, session, **kwargs) File "/Users/swyx/Work/datasette/env/lib/python3.9/site-packages/twitter_to_sqlite/utils.py", line 133, in get_profile save_users(db, [profile]) File "/Users/swyx/Work/datasette/env/lib/python3.9/site-packages/twitter_to_sqlite/utils.py", line 453, in save_users transform_user(user) File "/Users/swyx/Work/datasette/env/lib/python3.9/site-packages/twitter_to_sqlite/utils.py", line 285, in transform_user user["created_at"] = parser.parse(user["created_at"]) KeyError: 'created_at' ``` </details> this looks awfully like #37 but it can't be, because i'm authed into my account and obviously i have perms to read my own account. wonder i… | 206156866 | issue | { "url": "https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/62/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
644582921 | MDU6SXNzdWU2NDQ1ODI5MjE= | 865 | base_url doesn't seem to work when adding criteria and clicking "apply" | 6739646 | closed | 0 | 6026070 | 11 | 2020-06-24T12:39:57Z | 2020-11-12T23:49:24Z | 2020-10-20T05:22:59Z | NONE | Over on Apache Tika, we're using datasette to allow users to make sense of the metadata for our file regression testing corpus. This could be user error in how I've set up the reverse proxy! I started datasette like so: `docker run -d -p 8001:8001 -v `pwd`:/mnt datasetteproject/datasette datasette -p 8001 -h 0.0.0.0 /mnt/corpora-metadata.db --config sql_time_limit_ms:60000 --config base_url:/datasette/` I then reverse proxied like so: ProxyPreserveHost On ProxyPass /datasette http://x.y.z.q:xxxx ProxyPassReverse /datasette http://x.y.z.q:xxx Regular sql works perfectly: https://corpora.tika.apache.org/datasette/corpora-metadata?sql=select+mime_string%2C+count%281%29+as+cnt%0D%0Afrom+profiles+p%0D%0Ajoin+mimes+m+on+p.mime_id%3Dm.mime_id%0D%0Agroup+by+mime_string%0D%0Aorder+by+cnt+desc However, adding criteria and clicking 'Apply' https://corpora.tika.apache.org/datasette/corpora-metadata/tika_1_24_1_mimes?_sort=file&mime__exact=text%2Fplain bounces back to: https://corpora.tika.apache.org/corpora-metadata/tika_1_24_1_mimes?_sort=file&file__contains=bug&mime__exact=text%2Fplain | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/865/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | |||||
868188068 | MDU6SXNzdWU4NjgxODgwNjg= | 257 | Insert from JSON containing strings with non-ascii characters are escaped as unicode for lists, tuples, dicts. | 6586811 | closed | 0 | 0 | 2021-04-26T20:46:25Z | 2021-05-19T02:57:05Z | 2021-05-19T02:57:05Z | CONTRIBUTOR | JSON Test File (test.json): ```json [ { "id": 123, "text": "FR Théâtre" }, { "id": 223, "text": [ "FR Théâtre" ] } ] ``` Command to import: ```bash sqlite-utils insert test.db text test.json --pk=id ``` Resulting table view from datasette: ![image](https://user-images.githubusercontent.com/6586811/116147833-cdf2fb00-a6a5-11eb-8412-0aae81b6e6dd.png) Original, db.py line 2225: ```python return json.dumps(value, default=repr) ``` Fix, db.py line 2225: ```python return json.dumps(value, default=repr, ensure_ascii=False) ``` | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/257/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
600120439 | MDU6SXNzdWU2MDAxMjA0Mzk= | 726 | Foreign key : case of a link to the associated row not displayed | 6371750 | closed | 0 | 1 | 2020-04-15T08:31:27Z | 2020-04-27T22:05:47Z | 2020-04-27T22:05:46Z | CONTRIBUTOR | Hello, I use Datasette to publish tsv files linked together by foreign keys declared thanks to sqlite-utils. In one table, [prelib_personne](http://crbc-dataset.huma-num.fr/prelib/prelib_personne), the foreign keys are properly noticed by a link to the associated row (for instance ville_naissance_id is properly linked to prelib_ville). But every link to the foreign key prelib_oeuvre.id fails. For instance, [prelib_ecritoeuvre](http://crbc-dataset.huma-num.fr/prelib/prelib_ecritoeuvre) has links to prelib_personne but none to prelib_oeuvre. In despite of the schema: CREATE TABLE "prelib_ecritoeuvre" ( "id" INTEGER, "fonction_id" INTEGER, "oeuvre_id" INTEGER, "personne_id" INTEGER ,PRIMARY KEY ([id]), FOREIGN KEY(fonction_id) REFERENCES prelib_fonctionecritoeuvre(id), FOREIGN KEY(personne_id) REFERENCES prelib_personne(id), FOREIGN KEY(oeuvre_id) REFERENCES prelib_oeuvre(id) ); Would you have any clue to investigate the reason of this problem? Thanks, | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/726/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
546961357 | MDU6SXNzdWU1NDY5NjEzNTc= | 656 | Display of the column definitions | 6371750 | closed | 0 | 1 | 2020-01-08T16:16:53Z | 2020-01-20T14:17:11Z | 2020-01-20T14:14:33Z | CONTRIBUTOR | Hello, Is the nice display of headers and definitions at the top of https://fivethirtyeight.datasettes.com/fivethirtyeight-ac35616/antiquities-act%2Factions_under_antiquities_act is configured in the metadata.json file ? Thank you, | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/656/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
1382457780 | I_kwDOCGYnMM5SZqG0 | 490 | Ability to insert multi-line files | 6180701 | closed | 0 | 4 | 2022-09-22T13:29:22Z | 2022-09-26T18:24:44Z | 2022-09-23T16:37:58Z | NONE | I was looking into how to parse application log files that contain multiline text (e.g. Java stack traces) into sqlite. I can see that at the moment `--lines` helps, but falls short when processing multi-line texts. I wonder if this functionality would be useful for sqlite-utils. A similar approach to Elastic logstash/filebeat can be adopted: https://www.elastic.co/guide/en/beats/filebeat/current/multiline-examples.html Potential changes: - add a `--multiline` option - additional properties for - multiline-pattern (regex expression) - multiline-negate: true/false - multiline-what: previous or next Or if this is achievable in a different way, please share. Thanks! | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/490/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
611284481 | MDU6SXNzdWU2MTEyODQ0ODE= | 38 | [Feature Request] Support Repo Name in Search 🥺 | 5779832 | closed | 0 | 4 | 2020-05-02T22:08:51Z | 2020-05-03T02:34:32Z | 2020-05-02T23:15:11Z | NONE | ## Description Per your [v2.2 release tweet](https://twitter.com/simonw/status/1256700238099693568) I played with the demo, but the output did not match my expectations. ## Expected Behavior Expected a search query for "twitter" contained within the `repo` column to return non-zero results. ## Actual Behavior 😭 [0 rows where repo contains "twitter" sorted by starred_at descending](https://github-to-sqlite.dogsheep.net/github/stars?repo__contains=twitter&_sort_desc=starred_at) ## Best Explanation Per the table schema (see appendix) `repo` is of type `INTEGER` which built from `repo_id` and does not expose the repo name in search. ## Desired Behavior Given that searching for "206156866" is less intuitive than "twitter", it would be great to support this via extending the search capabilities or by adding an additional column. ✅ 104 rows where repo contains "twitter" ❌ [104 rows where repo contains "206156866" sorted by starred_at descending](https://github-to-sqlite.dogsheep.net/github/stars?repo__contains=206156866&_sort_desc=starred_at) ## Appendix ``` CREATE TABLE [stars] ( [user] INTEGER REFERENCES [users]([id]), [repo] INTEGER REFERENCES [repos]([id]), [starred_at] TEXT, PRIMARY KEY ([user], [repo]) ); CREATE INDEX [idx_stars_repo] ON [stars] ([repo]); CREATE INDEX [idx_stars_user] ON [stars] ([user]); ``` | 207052882 | issue | { "url": "https://api.github.com/repos/dogsheep/github-to-sqlite/issues/38/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
891969037 | MDU6SXNzdWU4OTE5NjkwMzc= | 1326 | How to limit fields returned from the JSON API? | 5268174 | closed | 0 | 1 | 2021-05-14T14:27:41Z | 2021-05-23T02:55:06Z | 2021-05-23T02:55:00Z | NONE | Hi, I have quite wide tables, and in many cases only want a subset of the data (to save on network bandwidth). I need to use the JSON API as handling pagination is so much easier, but I can't see a way to select specific columns. Is there a way to do this, or is it a feature request? Thanks! | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1326/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
797651831 | MDU6SXNzdWU3OTc2NTE4MzE= | 1212 | Tests are very slow. | 4488943 | closed | 0 | 4 | 2021-01-31T08:06:16Z | 2021-02-19T22:54:13Z | 2021-02-19T22:54:13Z | CONTRIBUTOR | Working on my PR i noticed that tests are very slow. The plain pytest run took about 37 minutes for me. However i could shave of about 10 minutes from that if i used pytest-xdist to parallelize execution. `pytest -n 8` is run only in 28 minutes on my machine. I can create a PR to mention that in your documentation. This will be a simple change to add pytest-xdist to requirements and change a command to run pytest in documentation. Does that make sense to you? After a bit more investigation it looks like python-xdist is not an answer. It creates a race condition for tests that try to clead temp dir before run. Profiling shows that most time is spent on conn.executescript(TABLES) in make_app_client function. Which makes sense. Perhaps the better approach would be look at the app_client fixture which is already session scoped, but not used by all test cases. And/or use conn = sqlite3.connect(":memory:") which is much faster. And/or truncate tables after each TC instead of deleting the file and re-creating them. I can take a look which is the best approach if you give the go-ahead. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1212/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
794554881 | MDU6SXNzdWU3OTQ1NTQ4ODE= | 1208 | A lot of open(file) functions are used without a context manager thus producing ResourceWarning: unclosed file <_io.TextIOWrapper | 4488943 | closed | 0 | 2 | 2021-01-26T20:56:28Z | 2021-03-11T16:15:49Z | 2021-03-11T16:15:49Z | CONTRIBUTOR | Your code is full of open files that are never closed, especially when you deal with reading/writing json/yaml files. If you run python with warnings enabled this problem becomes evident. This probably contributes to some memory leaks in long running datasettes if the GC will not 'collect' those resources properly. This is easily fixed by using a context manager instead of just using open: ```python with open('some_file', 'w') as opened_file: opened_file.write('string') ``` In some newer parts of the code you use Path objects 'read_text' and 'write_text' functions which close the file properly and are prefered in some cases. If you want I can create a PR for all places i found this pattern in. Bellow is a fraction of places where i found a ResourceWarning: ```python update-docs-help.py: 20 actual = actual.replace("Usage: cli ", "Usage: datasette ") 21: open(docs_path / filename, "w").write(actual) 22 datasette\app.py: 210 ): 211: inspect_data = json.load((config_dir / "inspect-data.json").open()) 212 if immutables is None: 266 if config_dir and (config_dir / "settings.json").exists() and not config: 267: config = json.load((config_dir / "settings.json").open()) 268 self._settings = dict(DEFAULT_SETTINGS, **(config or {})) 445 self._app_css_hash = hashlib.sha1( 446: open(os.path.join(str(app_root), "datasette/static/app.css")) 447 .read() datasette\cli.py: 130 else: 131: out = open(inspect_file, "w") 132 loop = asyncio.get_event_loop() 459 if inspect_file: 460: inspect_data = json.load(open(inspect_file)) 461 ``` | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1208/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
500783373 | MDU6SXNzdWU1MDA3ODMzNzM= | 62 | [enhancement] Method to delete a row in python | 4454869 | closed | 0 | 5 | 2019-10-01T09:45:47Z | 2019-11-04T16:30:34Z | 2019-11-04T16:18:18Z | NONE | Hi ! Thanks for the lib ! Obviously, every possible sql queries won't have a dedicated method. But I was thinking : a method to delete a row (I'm terrible with names, maybe `delete_where()` or something, would be useful. I have a Database, with primary key. For the moment, I use : ```Python3 db.conn.execute(f"DELETE FROM table WHERE key = {key_id}") db.conn.commit() ``` to delete a row I don't need anymore, giving his primary key. Works like a charm. Just an idea : ```Python3 table.delete_where_pkey({'key': key_id}) ``` or something (I know, I'm terrible at naming methods...). Pros : well, no need to write SQL query. Cons : WHERE normally allows to do many more things (operators =, <>, >, <, BETWEEN), not to mention AND, OR, etc... Method is maybe to specific, and/or a pain to render more flexible. Again, just a thought. Writing his own sql works too, so... Thanks again. See yah. | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/62/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
506297048 | MDU6SXNzdWU1MDYyOTcwNDg= | 594 | upgrade to uvicorn-0.9 to be Python-3.8 friendly | 4312421 | closed | 0 | 3 | 2019-10-13T09:23:43Z | 2019-11-12T04:47:04Z | 2019-11-12T04:47:04Z | NONE | uvicorn-0.8 relies on websockets-0.7 which lacks python-3.8 compatiblity | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/594/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
506183241 | MDU6SXNzdWU1MDYxODMyNDE= | 593 | make uvicorn optional dependancy (because not ok on windows python yet) | 4312421 | closed | 0 | 3 | 2019-10-12T12:51:07Z | 2019-10-13T06:22:08Z | 2019-10-13T06:22:07Z | NONE | would it be possible to: - remove uvicorn mandatory dependancy ? - eventually make a fallback to hypercorn ? reason: - uvloop not yet supported on Windows/Python-3.8 and below, may happen with Python-3.9 only. - it seems a 6 lines effort (but I'm not expert) | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/593/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
745393298 | MDU6SXNzdWU3NDUzOTMyOTg= | 52 | Discussion: Adding support for fetching only fresh tweets | 4169772 | closed | 0 | 1 | 2020-11-18T07:01:48Z | 2020-11-18T07:12:45Z | 2020-11-18T07:12:45Z | NONE | I think it'd be very useful if this tool has an option like `--incremental` to fetch only newer tweets. This way operations could complete very fast in sequential runs. I'd want to try to implement this feature if it seems OK for this tool's purpose. | 206156866 | issue | { "url": "https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/52/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
1079422215 | I_kwDOCGYnMM5AVq0H | 357 | pytest-runner is not required | 4067843 | closed | 0 | 1 | 2021-12-14T07:51:24Z | 2021-12-16T20:43:19Z | 2021-12-16T20:43:13Z | NONE | Deprecated pytest-runner is not necessary for running the testsuite. | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/357/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
748372469 | MDU6SXNzdWU3NDgzNzI0Njk= | 9 | ParseError: undefined entity š | 4028322 | closed | 0 | 1 | 2020-11-22T23:04:35Z | 2021-02-11T22:10:55Z | 2021-02-11T22:10:55Z | CONTRIBUTOR | I encountered a parse error if the enex file contained š or Run command: evernote-to-sqlite enex evernote.db evernote.enex ``` Traceback (most recent call last): ... File "evernote_to_sqlite/cli.py", line 31, in enex save_note(db, note) File "evernote_to_sqlite/utils.py", line 35, in save_note content = ET.tostring(ET.fromstring(content_xml)).decode("utf-8") File "/usr/lib/python3.8/xml/etree/ElementTree.py", line 1320, in XML parser.feed(text) xml.etree.ElementTree.ParseError: undefined entity š: line 3, column 35 ``` Workaround: ``` sed -i 's/š//g' evernote.enex sed -i 's/ //g' evernote.enex ``` | 303218369 | issue | { "url": "https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/9/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
1171599874 | I_kwDOCGYnMM5F1TIC | 415 | Convert with `--multi` and `--dry-run` flag does not work | 3976183 | closed | 0 | 2 | 2022-03-16T21:59:46Z | 2022-03-21T04:18:24Z | 2022-03-21T04:18:24Z | NONE | It's not possible to combine `--multi` and `--dry-run` flag in the `convert` command. Let's first create a simple database from JSON string ```console $ echo '[{"foo": "abc"}]' | sqlite-utils insert demo.db demo - $ sqlite-utils query demo.db "SELECT * FROM demo" [{"foo": "abc"}] ``` and then try to convert the "foo" column with a static value "bar" (see docs [Converting a column into multiple columns](https://sqlite-utils.datasette.io/en/stable/cli.html#converting-a-column-into-multiple-columns)) ```console $ sqlite-utils convert demo.db demo foo '{"foo": "bar"}' --multi --dry-run Traceback (most recent call last): File "/home/dotcs/anaconda3/envs/tools/bin/sqlite-utils", line 8, in <module> sys.exit(cli()) File "/home/dotcs/anaconda3/envs/tools/lib/python3.9/site-packages/click/core.py", line 1128, in __call__ return self.main(*args, **kwargs) File "/home/dotcs/anaconda3/envs/tools/lib/python3.9/site-packages/click/core.py", line 1053, in main rv = self.invoke(ctx) File "/home/dotcs/anaconda3/envs/tools/lib/python3.9/site-packages/click/core.py", line 1659, in invoke return _process_result(sub_ctx.command.invoke(sub_ctx)) File "/home/dotcs/anaconda3/envs/tools/lib/python3.9/site-packages/click/core.py", line 1395, in invoke return ctx.invoke(self.callback, **ctx.params) File "/home/dotcs/anaconda3/envs/tools/lib/python3.9/site-packages/click/core.py", line 754, in invoke return __callback(*args, **kwargs) File "/home/dotcs/anaconda3/envs/tools/lib/python3.9/site-packages/sqlite_utils/cli.py", line 2686, in convert for row in db.conn.execute(sql, where_args).fetchall(): sqlite3.OperationalError: user-defined function raised exception ``` But without the `--dry-run` flag it does work as expected: ```console $ sqlite-utils convert demo.db demo foo '{"foo": "bar"}' --multi $ sqlite-utils query demo.db "SELECT * FROM demo" [{"foo": "bar"}] ``` ```console $ sqlite-utils --version sqlite-utils, versio… | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/415/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
870125126 | MDU6SXNzdWU4NzAxMjUxMjY= | 1310 | I'm creating a plugin to export a spreadsheet file (.ods or .xlsx) | 3747136 | closed | 0 | 2 | 2021-04-28T16:20:11Z | 2021-04-30T07:26:11Z | 2021-04-30T06:58:46Z | NONE | Hi, I have started developing a plugin to export records as a spreadsheet file. It could be ods or xlsx, whatever is easier. I have spotted the following packages: - ods files: https://pypi.org/project/odswriter/ - xlsx files: https://openpyxl.readthedocs.io/en/stable/index.html (quite powerful) or https://xlsxwriter.readthedocs.io/ (faster) This is the code I have so far, I test it with the `--plugins-dir` option: ```python from datasette import hookimpl from datasette.utils.asgi import Response import odswriter as ods def render_spreadsheet(rows): with ods.writer(open("test.ods","wb")) as odsfile: for row in rows: odsfile.writerow(["String", "ABCDEF123456", "123456"]) return Response(odsfile, content_type="application/vnd.oasis.opendocument.spreadsheet", status=200) @hookimpl def register_output_renderer(): return {"extension": "ods", "render": render_spreadsheet} ``` I get the following error: ``` Traceback (most recent call last): File "/home/colin/.local/lib/python3.8/site-packages/datasette/app.py", line 1128, in route_path await response.asgi_send(send) File "/home/colin/.local/lib/python3.8/site-packages/datasette/utils/asgi.py", line 339, in asgi_send body = body.encode("utf-8") AttributeError: 'ODSWriter' object has no attribute 'encode' ERROR: Exception in ASGI application Traceback (most recent call last): File "/home/colin/.local/lib/python3.8/site-packages/datasette/app.py", line 1128, in route_path await response.asgi_send(send) File "/home/colin/.local/lib/python3.8/site-packages/datasette/utils/asgi.py", line 339, in asgi_send body = body.encode("utf-8") AttributeError: 'ODSWriter' object has no attribute 'encode' During handling of the above exception, another exception occurred: Traceback (most recent call last): File "/home/colin/.local/lib/python3.8/site-packages/uvicorn/protocols/http/h11_impl.py", line 396, in run_asgi result = await app(self.scope, self.recei… | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1310/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
792851444 | MDU6SXNzdWU3OTI4NTE0NDQ= | 11 | XML parse error | 3613583 | closed | 0 | 2 | 2021-01-24T17:38:54Z | 2021-02-11T21:18:58Z | 2021-02-11T21:18:48Z | NONE | I am on Windows 10 using Windows Subsystem for Linux, Python 3.8. I installed evernote-to-sqlite via pipx (in a venv). I tried using enex files from the latest version of Evernote for Windows (10.6.9 which only lets you export 50 notes at a time) and from Legacy Evernote (6.25.2.9198 which lets you export all your notes at once). The enex file from latest evernote gives this error: File "/usr/lib/python3.8/xml/etree/ElementTree.py", line 1320, in XML parser.feed(text) xml.etree.ElementTree.ParseError: XML or text declaration not at start of entity: line 2, column 6 The enex file from Legacy Evernote gives this error: File "/home/david/.local/pipx/venvs/evernote-to-sqlite/lib/python3.8/site-packages/evernote_to_sqlite/utils.py", line 28, in save_note updated = note.find("updated").text AttributeError: 'NoneType' object has no attribute 'text' | 303218369 | issue | { "url": "https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/11/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
860625833 | MDU6SXNzdWU4NjA2MjU4MzM= | 1300 | Make row available to `render_cell` plugin hook | 3243482 | closed | 0 | 5 | 2021-04-18T10:14:37Z | 2022-07-07T16:34:05Z | 2022-07-07T16:31:22Z | CONTRIBUTOR | *Original title: **Generating URL for a row inside `render_cell` hook*** Hey, I am using Datasette to view a database that contains video metadata. It has BLOB columns that contain video thumbnails in JPG format (around 100-500KB per row). I've registered an output formatter that extends `datasette.blob_renderer.render_blob` function and serves the column with `image/jpeg` content type. ```python from datasette.blob_renderer import render_blob async def render_jpg(datasette, database, rows, columns, request, table, view_name): response = await render_blob(datasette, database, rows, columns, request, table, view_name) response.content_type = "image/jpeg" response.headers["Content-Disposition"] = f'inline; filename="image.jpg"' return response @hookimpl def register_output_renderer(): return { "extension": "jpg", "render": render_jpg, "can_render": lambda: True, } ``` This works well. I can visit `http://localhost:8001/mydb/videos/1.jpg?_blob_column=thumbnail` and view the image. I want to display the image directly with an `<img>` tag (lazy-loaded of course). So, I need a URL, because embedding base64 would increase the page size too much (each image > 100KB). Datasette generates a link with `.blob` extension for blob columns. It does this by calling `datasette.urls.row_blob` https://github.com/simonw/datasette/blob/7a2ed9f8a119e220b66d67c7b9e07cbab47b1196/datasette/views/table.py#L169-L179 But I have no way of getting the row inside the `render_cell` hook. ```python @hookimpl def render_cell(value, column, table, database, datasette): if isinstance(value, bytes) and imghdr.what(None, value): # generate url return '$renderedLink' ``` Any pointers? | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1300/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed |