github
id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at | closed_at | author_association | pull_request | body | repo | type | active_lock_reason | performed_via_github_app | reactions | draft | state_reason |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1988525411 | I_kwDOCGYnMM52hn1j | 603 | Pyhton 3.12 Bug report | 1324252 | open | 0 | 1 | 2023-11-10T22:57:48Z | 2023-12-08T05:10:31Z | NONE | I start with new python3 verison 3.12.0 Also have the error where connect DataBase ``` Traceback (most recent call last): File "/home/t/Development/python/FKPJ/ClinicSYS/run.py", line 1, in <module> import re, os, io, json, sqlite_utils, requests, pytz, logging File "/home/t/.local/lib/python3.12/site-packages/sqlite_utils/__init__.py", line 1, in <module> from .db import Database File "/home/t/.local/lib/python3.12/site-packages/sqlite_utils/db.py", line 277, in <module> class Database: File "/home/t/.local/lib/python3.12/site-packages/sqlite_utils/db.py", line 306, in Database filename_or_conn: Optional[Union[str, pathlib.Path, sqlite3.Connection]] = None, ^^^^^^^^^^^^^^^^^^ ``` This bug come from `sqlite-utils` since's v3.33. Anyone get the same ? As well now of the resolved plan just keep the sqlite-utils version in python3.12 with v3.32.1 [tested] but where are the sqlite3.Connection problem.... This won't happen on python version down to 3.11[tested] Just the python3.12.0, I have test this error are come from the sqlite3 connection The error say from `sqlite_utils` and with the sqlite3 Connection, what can I do. Let fix together. | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/603/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
2029908157 | I_kwDOBm6k_c54_fC9 | 2214 | CSV export fails for some `text` foreign key references | 2874 | open | 0 | 1 | 2023-12-07T05:04:34Z | 2023-12-07T07:36:34Z | NONE | I'm starting this issue without a clear reproduction in case someone else has seen this behavior, and to use the issue as a notebook for research. I'm using Datasette with the [SWITRS](https://iswitrs.chp.ca.gov/) data set, which is a California Highway Patrol collection of traffic incident data from the past decade or so. I receive data from them in CSV and want to work with it in Datasette, then export it to CSV for mapping in Felt.com. Their data makes extensive use of codes for incident column data (`1` for `Monday` and so on), some of it integer codes and some of it letter/text codes. The text codes are sometimes blank or `-`. During import, I'm creating lookup tables for foreign key references to make the Datasette UI presentation of the data easier to read. If I import the data and set up the integer foreign keys, everything works fine, but if I set up the text foreign keys, CSV export starts to fail. The foreign key configuration is as follows: ``` # Some tables use integer ids, like sensible tables do. Let's import them first # since we favor them. for TABLE in DAY_OF_WEEK CHP_SHIFT POPULATION SPECIAL_COND BEAT_TYPE COLLISION_SEVERITY do sqlite-utils create-table records.db $TABLE id integer name text --pk=id sqlite-utils insert records.db $TABLE lookup-tables/$TABLE.csv --csv sqlite-utils add-foreign-key records.db collisions $TABLE $TABLE id sqlite-utils create-index records.db collisions $TABLE done # *Other* tables use letter keys, like they were raised by WOLVES. Let's put them # at the end of the import queue. for TABLE in WEATHER_1 WEATHER_2 LOCATION_TYPE RAMP_INTERSECTION SIDE_OF_HWY \ PRIMARY_COLL_FACTOR PCF_CODE_OF_VIOL PCF_VIOL_CATEGORY TYPE_OF_COLLISION MVIW \ PED_ACTION ROAD_SURFACE ROAD_COND_1 ROAD_COND_2 LIGHTING CONTROL_DEVICE \ STWD_VEHTYPE_AT_FAULT CHP_VEHTYPE_AT_FAULT PRIMARY_RAMP SECONDARY_RAMP do sqlite-utils create-table records.db $TABLE key text name text --pk=key sqlite-utils insert records.db $TABLE lookup-tables/$TABLE.csv --csv … | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/2214/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
2023057255 | I_kwDOBm6k_c54lWdn | 2212 | Can't filter with numbers | 605070 | open | 0 | 0 | 2023-12-04T05:26:29Z | 2023-12-04T05:26:29Z | NONE | I have a schema that uses numbers for a column (actually it's a boolean 1 or 0 but SQLite doesn't have Boolean). I can't seem to get the facet to work or even filtering on this column. My guess is that Datasette is "stringifying" the number and it's not matching? Example: https://debian-sqlelf.fly.dev/debian/elf_symbols?_sort_desc=name&_facet=exported&exported=0 | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/2212/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
2019811176 | I_kwDOBm6k_c54Y99o | 2211 | Unreachable exception handlers for `sqlite3.OperationalError` | 1214074 | open | 0 | 0 | 2023-12-01T00:50:22Z | 2023-12-01T00:50:22Z | NONE | There are several places where `sqlite3.OperationalError` is caught as part of an exception handler which catches multiple exceptions, but is then caught again immediately afterwards by a dedicated exception handler. Because the exception will be caught by the first handler, the logic in the second handler is unreachable and will never be executed. If this is intended behavior, the second handler can be removed. If this is not intended, and the second handler should be the one that catches this exception, then `sqlite3.OperationalError` should be removed from the tuple of exceptions in the first handler. This issue was found via a CodeQL query on the repository, and I've listed the occurrences found by the query below. There may be other instances of this issue in the code that were not surfaced by the query. I'd be happy to share the query if others would like to view or run it. One example: https://github.com/simonw/datasette/blob/452a587e236ef642cbc6ae345b58767ea8420cb5/datasette/views/database.py#L534-L537 Other instances: https://github.com/simonw/datasette/blob/main/datasette/views/base.py#L266-L270 https://github.com/simonw/datasette/blob/main/datasette/views/base.py#L452-L456 | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/2211/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1994845152 | I_kwDOBm6k_c525uvg | 2207 | ModuleNotFoundError: No module named 'click_default_group | 283441 | open | 0 | 0 | 2023-11-15T14:04:32Z | 2023-11-15T14:04:32Z | NONE | No matter what I do, I'm getting this error: ``` $ datasette Traceback (most recent call last): File "/Users/honza/Library/Caches/pypoetry/virtualenvs/juniorguru-Lgaxwd2n-py3.11/bin/datasette", line 5, in <module> from datasette.cli import cli File "/Users/honza/Library/Caches/pypoetry/virtualenvs/juniorguru-Lgaxwd2n-py3.11/lib/python3.11/site-packages/datasette/cli.py", line 6, in <module> from click_default_group import DefaultGroup ModuleNotFoundError: No module named 'click_default_group' ``` I have datasette in my dependencies like this: ```toml [tool.poetry.group.dev.dependencies] datasette = {version = "1.0a7", allow-prereleases = true} ``` I had the latest regular version (not pre-release) there originally, but the result was the same: ```toml [tool.poetry.group.dev.dependencies] datasette = "0.64.5" ``` Full pyproject.toml is at https://github.com/honzajavorek/junior.guru/ Previously datasette worked for me, but I guess something had to upgrade and now I can't even launch it. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/2207/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1978603203 | I_kwDOCGYnMM517xbD | 602 | `sqlite-utils transform` removes the `AUTOINCREMENT` keyword | 4472046 | open | 0 | 0 | 2023-11-06T08:48:43Z | 2023-11-06T08:48:43Z | NONE | ### Context We ran into this bug randomly, noticing that deleted `ROWID` would get reused after migrating the DB. Using `transform` to change any column in the table will also unexpectedly strip away the `AUTOINCREMENT` keyword from the primary key definition, even if it was not the transformation target. ### Reproducible example **Original database** ```sql $ sqlite3 test.db << EOF CREATE TABLE mytable ( col1 INTEGER PRIMARY KEY AUTOINCREMENT, col2 TEXT NOT NULL ) EOF $ sqlite3 test.db ".schema mytable" CREATE TABLE mytable ( col1 INTEGER PRIMARY KEY AUTOINCREMENT, col2 TEXT NOT NULL ); ``` **Modified database after sqlite-utils** ```sql $ sqlite-utils transform test.db mytable --rename col2 renamedcol2 $ sqlite3 test.db "SELECT sql FROM sqlite_master WHERE name = 'mytable';" CREATE TABLE IF NOT EXISTS "mytable" ( [col1] INTEGER PRIMARY KEY, [renamedcol2] TEXT NOT NULL ); ``` | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/602/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1977726056 | I_kwDOBm6k_c514bRo | 2203 | custom plugin not seen as sql function | 7113541 | open | 0 | 0 | 2023-11-05T10:30:19Z | 2023-11-05T10:30:19Z | NONE | Hi, I'm not sure if this is the right repo for this issue. I'm using datasette with the parquet (to read a duckdb), and jellyfish plugins. Both work perfectly. Now I need to create a simple plugin that uses the python rouge package and returns a similarity score (similarly to how the jellyfish plugin works). If I create a custom plugin, even the example hello_world one, copied directly from the tutorial, I get the following error: ```duckdb.duckdb.CatalogException: Catalog Error: Scalar Function with name hello_world does not exist!``` Since the jellyfish plugin doesn't do anything more complex, I'm wondering if there is some other kind of issue with my setup. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/2203/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1955676270 | I_kwDOBm6k_c50kUBu | 2201 | Discord invite link is invalid | 11708906 | open | 0 | 0 | 2023-10-21T21:50:05Z | 2023-10-21T21:50:05Z | NONE | https://datasette.io/discord leads to https://discord.com/invite/ktd74dm5mw and returns the following: <img width="448" alt="CleanShot 2023-10-21 at 14 49 45@2x" src="https://github.com/simonw/datasette/assets/11708906/a126cbf6-0a4d-41b8-9986-7567d40d453f"> | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/2201/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1943259395 | I_kwDOEhK-wc5z08kD | 16 | time data '2014-11-21T11:44:12.000Z' does not match format '%Y%m%dT%H%M%SZ' | 3746270 | open | 0 | 0 | 2023-10-14T13:24:39Z | 2023-10-14T13:24:39Z | NONE | ``` evernote-to-sqlite enex evernote.db ./我的笔记.enex Importing from ENEX [#####-------------------------------] 14% Traceback (most recent call last): File "/usr/local/bin/evernote-to-sqlite", line 8, in <module> sys.exit(cli()) ^^^^^ File "/usr/local/lib/python3.11/site-packages/click/core.py", line 1157, in __call__ return self.main(*args, **kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/click/core.py", line 1078, in main rv = self.invoke(ctx) ^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/click/core.py", line 1688, in invoke return _process_result(sub_ctx.command.invoke(sub_ctx)) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/click/core.py", line 1434, in invoke return ctx.invoke(self.callback, **ctx.params) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/click/core.py", line 783, in invoke return __callback(*args, **kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/evernote_to_sqlite/cli.py", line 31, in enex save_note(db, note) File "/usr/local/lib/python3.11/site-packages/evernote_to_sqlite/utils.py", line 46, in save_note "created": convert_datetime(created), ^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/evernote_to_sqlite/utils.py", line 111, in convert_datetime return datetime.datetime.strptime(s, "%Y%m%dT%H%M%SZ").isoformat() ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/Cellar/python@3.11/3.11.5/Frameworks/Python.framework/Versions/3.11/lib/python3.11/_strptime.py", line 568, in _strptime_datetime tt, fraction, gmtoff_fraction = _strptime(data_string, format) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/Cellar/python@3.11/3.11.5/Frameworks/Python.framework/Versions/3.1… | 303218369 | issue | { "url": "https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/16/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1931794126 | I_kwDOBm6k_c5zJNbO | 2198 | --load-extension=spatialite not working with Windows | 363004 | open | 0 | 0 | 2023-10-08T12:50:22Z | 2023-10-08T12:50:22Z | NONE | Using each of `python -m datasette counties.db -m metadata.yml --load-extension=SpatiaLite` and `python -m datasette counties.db --load-extension="C:\Windows\System32\mod_spatialite.dll"` and `python -m datasette counties.db --load-extension=C:\Windows\System32\mod_spatialite.dll` I got the error: ``` File "C:\Users\m3n7es\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\datasette\database.py", line 209, in in_thread self.ds._prepare_connection(conn, self.name) File "C:\Users\m3n7es\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\datasette\app.py", line 596, in _prepare_connection conn.execute("SELECT load_extension(?, ?)", [path, entrypoint]) sqlite3.OperationalError: The specified module could not be found. ``` I finally tried modifying the code in app.py to read: ``` def _prepare_connection(self, conn, database): conn.row_factory = sqlite3.Row conn.text_factory = lambda x: str(x, "utf-8", "replace") if self.sqlite_extensions: conn.enable_load_extension(True) for extension in self.sqlite_extensions: # "extension" is either a string path to the extension # or a 2-item tuple that specifies which entrypoint to load. #if isinstance(extension, tuple): # path, entrypoint = extension # conn.execute("SELECT load_extension(?, ?)", [path, entrypoint]) #else: conn.execute("SELECT load_extension('C:\Windows\System32\mod_spatialite.dll')") ``` At which point the counties example worked. Is there a correct way to install/use the extension on Windows? My method will cause issues if there's a second extension to be used. On an unrelated note, my next step is to figure out how to write a query across the two loaded databases supplied from the c… | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/2198/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1920416843 | I_kwDOCGYnMM5ydzxL | 597 | sqlite-utils insert-files should be able to convert fields | 1737541 | open | 0 | 0 | 2023-09-30T22:20:47Z | 2023-09-30T22:20:47Z | NONE | Currently using both `insert-files` and `convert` is needed in order to create sqlar files, it would be more convenient if it could be done with just one command. ```shell ~ ❯ cat test.py import os class Example: def __init__(self, arg1, arg2): self.arg1 = arg1 ~ ❯ sqlite-utils insert-files test.sqlar sqlar test.py -c name:name -c data:content -c mode:mode -c mtime:mtime -c sz:size --pk=name [####################################] 100% ~ ❯ sqlite-utils convert test.sqlar sqlar data "zlib.compress(value)" --import=zlib --where "name = 'test.py'" [####################################] 100% ~ ❯ cat test.py | sqlite-utils convert test.sqlar sqlar data "zlib.compress(sys.stdin.buffer.read())" --import=zlib --import=sys --where "name = 'test.py'" # Alternative way [####################################] 100% ~ ❯ sqlite3 test.sqlar "SELECT hex(data) FROM sqlar WHERE name = 'test.py';" | python3 -c "import sys, zlib; sys.stdout.buffer.write(zlib.decompress(bytes.fromhex(sys.stdin.read())))" import os class Example: def __init__(self, arg1, arg2): self.arg1 = arg1 ~ ❯ rm test.py ~ ❯ sqlar -l test.sqlar test.py ~ ❯ sqlar -x test.sqlar ~ ❯ cat test.py import os class Example: def __init__(self, arg1, arg2): self.arg1 = arg1 ``` | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/597/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1825007061 | I_kwDOBm6k_c5sx2XV | 2123 | datasette serve when invoked with --reload interprets the serve command as a file | 79087 | open | 0 | 2 | 2023-07-27T19:07:22Z | 2023-09-18T13:02:46Z | NONE | When running `datasette serve` with the `--reload` flag, the serve command is picked up as a file argument: ``` $ datasette serve --reload test_db Starting monitor for PID 13574. Error: Invalid value for '[FILES]...': Path 'serve' does not exist. Press ENTER or change a file to reload. ``` If a 'serve' file is created it launches properly (albeit with an empty database called serve): ``` $ touch serve; datasette serve --reload test_db Starting monitor for PID 13628. INFO: Started server process [13628] INFO: Waiting for application startup. INFO: Application startup complete. INFO: Uvicorn running on http://127.0.0.1:8001 (Press CTRL+C to quit) ``` Version (running from HEAD on main): ``` $ datasette --version datasette, version 1.0a2 ``` This issue appears to have existed for awhile as https://github.com/simonw/datasette/issues/1380#issuecomment-953366110 mentions the error in a different context. I'm happy to debug and land a patch if it's welcome. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/2123/reactions", "total_count": 2, "+1": 2, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1899310542 | I_kwDOBm6k_c5xNS3O | 2187 | Datasette for serving JSON only | 19705106 | open | 0 | 0 | 2023-09-16T05:48:29Z | 2023-09-16T05:48:29Z | NONE | Hi, is there any way to use datasette for serving json only without displaying webpage? I've tried to search about this in documentation but didn't get any information | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/2187/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1010112818 | I_kwDOBm6k_c48NRky | 1479 | Win32 "used by another process" error with datasette publish | 76450761 | open | 0 | 7 | 2021-09-28T19:12:00Z | 2023-09-07T02:14:16Z | NONE | I unfortunately was not successful to deploy to fly.io. Please see the details above of the three scenarios that I took. I am also new to datasette. Failed to deploy. Attaching logs: 1. Tried with an app created via `flyctl apps create frosty-fog-8565` and the ran `datasette publish fly covid.db --app frosty-fog-8565` ``` Deploying frosty-fog-8565 ==> Validating app configuration --> Validating app configuration done Services TCP 80/443 ⇢ 8080 Error error connecting to docker: An unknown error occured. Traceback (most recent call last): File "c:\users\grott\anaconda3\lib\runpy.py", line 193, in _run_module_as_main "__main__", mod_spec) File "c:\users\grott\anaconda3\lib\runpy.py", line 85, in _run_code exec(code, run_globals) File "C:\Users\grott\Anaconda3\Scripts\datasette.exe\__main__.py", line 7, in <module> File "c:\users\grott\anaconda3\lib\site-packages\click\core.py", line 829, in __call__ return self.main(*args, **kwargs) File "c:\users\grott\anaconda3\lib\site-packages\click\core.py", line 782, in main rv = self.invoke(ctx) File "c:\users\grott\anaconda3\lib\site-packages\click\core.py", line 1259, in invoke return _process_result(sub_ctx.command.invoke(sub_ctx)) File "c:\users\grott\anaconda3\lib\site-packages\click\core.py", line 1259, in invoke return _process_result(sub_ctx.command.invoke(sub_ctx)) File "c:\users\grott\anaconda3\lib\site-packages\click\core.py", line 1066, in invoke return ctx.invoke(self.callback, **ctx.params) File "c:\users\grott\anaconda3\lib\site-packages\click\core.py", line 610, in invoke return callback(*args, **kwargs) File "c:\users\grott\anaconda3\lib\site-packages\datasette_publish_fly\__init__.py", line 156, in fly "--remote-only", File "c:\users\grott\anaconda3\lib\contextlib.py", line 119, in __exit__ next(self.gen) File "c:\users\grott\anaconda3\lib\site-packages\datasette\utils\__init__.py", line 451, in temporary_docker_directory tmp.cleanup() File "c:\use… | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1479/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
814595021 | MDU6SXNzdWU4MTQ1OTUwMjE= | 1241 | Share button for copying current URL | 7107523 | open | 0 | 6 | 2021-02-23T15:55:40Z | 2023-08-24T20:09:52Z | NONE | I use datasette in an `iframe` inside another HTML file that contains other ways to represent my data (mostly leaflets maps built with R on summarized data), and the datasette `iframe` is a tab in that page. This particular use prevents users to access the full URLs of their datasette views and queries, which is a shame because the way datasette handles URLs to make every view or query easy to share is awesome. I know how to get the URL from the context menu of my browser, but I don't think many visitors would do it or even notice that datasette uses permalinks for pretty much every action they do. Would it be possible to add a "Share link" button to the interface, either in datasette itself or in a plugin? | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1241/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1858228057 | I_kwDOBm6k_c5uwk9Z | 2147 | Plugin hook for database queries that are run | 18899 | open | 0 | 6 | 2023-08-20T18:43:50Z | 2023-08-24T03:54:35Z | NONE | I'm interested in making a plugin that saves every query that gets run to a table in the database. (I know about datasette-query-history but thought it would be good to have a server-side option.) As far as I can tell reading the docs, there isn't really a hook setup to allow this. Maybe I could hack it with some of the hooks that are passed requests, but that doesn't seem good. I'm a little surprised this isn't possible, so I thought I would open an issue and see if that's a deeply considered decision or just "haven't needed it yet." I'm potentially interested in implementing the hook if the latter. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/2147/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1754174496 | I_kwDOCGYnMM5ojpQg | 558 | Ability to define unique columns when creating a table | 1910303 | open | 0 | 0 | 2023-06-13T06:56:19Z | 2023-08-18T01:06:03Z | NONE | When creating a new table, it would be good to have an option to set unique columns similar to how not_null is set. ```python from sqlite_utils import Database columns = {"mRID": str, "name": str} db = Database("example.db") db["ExampleTable"].create(columns, pk="mRID", not_null=["mRID"], if_not_exists=True) db["ExampleTable"].create_index(["mRID"], unique=True, if_not_exists=True) ``` So something like this would add the UNIQUE flag to the table definition. ```python db["ExampleTable"].create(columns, pk="mRID", not_null=["mRID"], unique=["mRID"], if_not_exists=True) ``` ```sql CREATE TABLE ExampleTable ( mRID TEXT PRIMARY KEY NOT NULL UNIQUE, name TEXT ); ``` | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/558/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
476852861 | MDU6SXNzdWU0NzY4NTI4NjE= | 568 | Add database_color as a configurable option | 50906992 | open | 0 | 1 | 2019-08-05T13:14:45Z | 2023-08-11T05:19:42Z | NONE | This would be really useful as it would allow us to tie in with colour schemes. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/568/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1839344979 | I_kwDOCGYnMM5toi1T | 582 | Handling CSV/file input that contains NUL bytes | 1448859 | open | 0 | 0 | 2023-08-07T12:24:14Z | 2023-08-07T12:24:14Z | NONE | I was using sqlite-utils to create a DB from a CSV and it turns out the CSV contains a NUL byte. When the processing reaches the line that contains the NUL an exception is raised. I'm wondering if there is something that can be done in `sqlite-utils` to say "skip lines with encoding errors" or some such. I think it isn't super straightforward though as the exception comes from inside the `csv` module that does all the parsing. Concretely the file is the `KernelVersions.csv` from https://www.kaggle.com/datasets/kaggle/meta-kaggle This is the command and output: ``` $ sqlite-utils insert --csv kaggle.db kaggle KernelVersions.csv [------------------------------------] 0% [#####################---------------] 60% 00:04:24Traceback (most recent call last): File "/home/foobar/miniconda/envs/meta-kaggle/bin/sqlite-utils", line 10, in <module> sys.exit(cli()) File "/home/foobar/miniconda/envs/meta-kaggle/lib/python3.10/site-packages/click/core.py", line 1128, in __call__ return self.main(*args, **kwargs) File "/home/foobar/miniconda/envs/meta-kaggle/lib/python3.10/site-packages/click/core.py", line 1053, in main rv = self.invoke(ctx) File "/home/foobar/miniconda/envs/meta-kaggle/lib/python3.10/site-packages/click/core.py", line 1659, in invoke return _process_result(sub_ctx.command.invoke(sub_ctx)) File "/home/foobar/miniconda/envs/meta-kaggle/lib/python3.10/site-packages/click/core.py", line 1395, in invoke return ctx.invoke(self.callback, **ctx.params) File "/home/foobar/miniconda/envs/meta-kaggle/lib/python3.10/site-packages/click/core.py", line 754, in invoke return __callback(*args, **kwargs) File "/home/foobar/miniconda/envs/meta-kaggle/lib/python3.10/site-packages/sqlite_utils/cli.py", line 1223, in insert insert_upsert_implementation( File "/home/foobar/miniconda/envs/meta-kaggle/lib/python3.10/site-packages/sqlite_utils/cli.py", line 1085, in insert_upsert_implementation db[table].insert_all( File "/home/foobar/minicond… | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/582/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1824457306 | I_kwDOBm6k_c5svwJa | 2122 | Parameters on canned queries: fixed or query-generated list? | 1563881 | open | 0 | 0 | 2023-07-27T14:07:07Z | 2023-07-27T14:07:07Z | NONE | Hi, currently parameters in canned queries are just text fields. It would be cool to have one of the options below. Would you accept a PR doing something in this direction? (Possibly this could even work as a plugin.) * adding facets, which would work like facets on tables or views, giving a list of selectable options (and leaving parameters as is) * making it possible to provide a query which returns selectable values for a parameter, e.g. ``` calendar_entries_current_instrument: sql: | select * from calendar_entries where DTEND_UNIX > UNIXEPOCH() and DTSTART_UNIX < UNIXEPOCH() + :days *24*60*60 and current = 1 and MACHINE = :instrument order by DTSTART_UNIX params: days: sql: "SELECT VALUE FROM generate_series(1, 30, 1)" # this obviously requires the corresponding sqlite extension instrument: sql: "SELECT DISTINCT MACHINE FROM calendar_entries" ``` * making it possible to provide a fixed list of parameters ``` calendar_entries_current_instrument: sql: | select * from calendar_entries where DTEND_UNIX > UNIXEPOCH() and DTSTART_UNIX < UNIXEPOCH() + :days *24*60*60 and current = 1 and MACHINE = :instrument order by DTSTART_UNIX params: days: values: [1, 2, 3, 5, 10, 20, 30] instrument: values: [supermachine, crappymachine, boringmachine] ``` | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/2122/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1822918995 | I_kwDOCGYnMM5sp4lT | 580 | Add way to export to a csv file using the Python library | 44324811 | open | 0 | 0 | 2023-07-26T18:09:26Z | 2023-07-26T18:09:26Z | NONE | According to the documentation, we can make a csv output using the CLI tool, but not the Python library. Could we have the latter? | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/580/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1816830546 | I_kwDODEm0Qs5sSqJS | 73 | Twitter v1 API shutdown | 6341745 | open | 0 | 0 | 2023-07-22T16:57:41Z | 2023-07-22T16:57:41Z | NONE | I've been using this project reliably over the past two years to periodically download my liked tweets, but unfortunately since 19th July I get: ``` [2023-07-19 21:00:04.937536] File "/home/pi/code/liked-tweets/lib/python3.7/site-packages/twitter_to_sqlite/utils.py", line 202, in fetch_timeline [2023-07-19 21:00:04.937606] raise Exception(str(tweets["errors"])) [2023-07-19 21:00:04.937678] Exception: [{'message': 'You currently have access to a subset of Twitter API v2 endpoints and limited v1.1 endpoints (e.g. media post, oauth) only. If you need access to this endpoint, you may need a different access level. You can learn more here: https://developer.twitter.com/en/portal/product', 'code': 453}] ``` It appears like Twitter has now shut down their v1 endpoints, which is rather gracious of them, considering they [announced they'd be deprecated on 29th April](https://twittercommunity.com/t/reminder-to-migrate-to-the-new-free-basic-or-enterprise-plans-of-the-twitter-api/189737). Unfortunately [retrieving likes using the v2 API](https://developer.twitter.com/en/docs/twitter-api/tweets/likes/introduction) is not part of their [free plan](https://developer.twitter.com/en/portal/products). In fact, with the free plan one can only post and delete tweets and retrieve information about oneself. So I'm afraid this is the end of this very nice project. It was very useful, thank you! | 206156866 | issue | { "url": "https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/73/reactions", "total_count": 1, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 1 } |
||||||||
1811824307 | I_kwDOBm6k_c5r_j6z | 2105 | When reverse proxying datasette with nginx an URL element gets erronously added | 2235371 | open | 0 | 3 | 2023-07-19T12:16:53Z | 2023-07-21T21:17:09Z | NONE | I use this nginx config: ``` location /datasette-llm { return 302 /datasette-llm/; } location /datasette-llm/ { proxy_set_header Upgrade $http_upgrade; proxy_set_header Connection "Upgrade"; proxy_http_version 1.1; proxy_set_header X-Real-IP $remote_addr; proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; proxy_set_header X-Forwarded-Proto https; proxy_set_header X-Forwarded-Host $http_host; proxy_set_header Host $host; proxy_max_temp_file_size 0; proxy_pass http://127.0.0.1:8001/datasette-llm/; proxy_redirect http:// https://; proxy_buffering off; proxy_request_buffering off; proxy_set_header Origin ''; client_max_body_size 0; auth_basic "datasette-llm"; auth_basic_user_file /etc/nginx/custom-userdb; } ``` Then I start datasette with this command: ``` datasette serve --setting base_url /datasette-llm/ $(llm logs path) ``` Everything else works right, except the links in "This data as json, CSV". They get an extra URL element "datasette-llm" like this: https://192.168.1.3:5432/datasette-llm/datasette-llm/logs.json?sql=select+*+from+_llm_migrations https://192.168.1.3:5432/datasette-llm/datasette-llm/logs.csv?sql=select+*+from+_llm_migrations&_size=max When I remove that extra "datasette-llm" from the URL, those links work too. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/2105/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
771608692 | MDU6SXNzdWU3NzE2MDg2OTI= | 14 | UNIQUE constraint failed: workouts.id | 1234956 | open | 0 | 5 | 2020-12-20T15:11:20Z | 2023-07-10T14:46:52Z | NONE | I'm getting an error on my initial attempt to import data: ```console $ healthkit-to-sqlite 20201119\ healthkit\ export.zip healthkit.db Importing from HealthKit [###################################-] 98% 00:00:01 Traceback (most recent call last): File "venv/bin/healthkit-to-sqlite", line 8, in <module> sys.exit(cli()) File "venv/lib/python3.9/site-packages/click/core.py", line 829, in __call__ return self.main(*args, **kwargs) File "venv/lib/python3.9/site-packages/click/core.py", line 782, in main rv = self.invoke(ctx) File "venv/lib/python3.9/site-packages/click/core.py", line 1066, in invoke return ctx.invoke(self.callback, **ctx.params) File "venv/lib/python3.9/site-packages/click/core.py", line 610, in invoke return callback(*args, **kwargs) File "venv/lib/python3.9/site-packages/healthkit_to_sqlite/cli.py", line 57, in cli convert_xml_to_sqlite(fp, db, progress_callback=bar.update, zipfile=zf) File "venv/lib/python3.9/site-packages/healthkit_to_sqlite/utils.py", line 34, in convert_xml_to_sqlite workout_to_db(el, db, zipfile) File "venv/lib/python3.9/site-packages/healthkit_to_sqlite/utils.py", line 57, in workout_to_db pk = db["workouts"].insert(record, alter=True, hash_id="id").last_pk File "venv/lib/python3.9/site-packages/sqlite_utils/db.py", line 1660, in insert return self.insert_all( File "venv/lib/python3.9/site-packages/sqlite_utils/db.py", line 1778, in insert_all self.insert_chunk( File "venv/lib/python3.9/site-packages/sqlite_utils/db.py", line 1588, in insert_chunk result = self.db.execute(query, params) File "venv/lib/python3.9/site-packages/sqlite_utils/db.py", line 213, in execute return self.conn.execute(sql, parameters) sqlite3.IntegrityError: UNIQUE constraint failed: workouts.id ``` | 197882382 | issue | { "url": "https://api.github.com/repos/dogsheep/healthkit-to-sqlite/issues/14/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1795219865 | I_kwDOCGYnMM5rAOGZ | 566 | `--no-headers` doesn't work on most formats | 33625 | open | 0 | 2 | 2023-07-09T03:43:36Z | 2023-07-09T04:13:35Z | NONE | Version 3.33 ``` sqlite-utils query library.db 'select asin from audible' --fmt plain --no-headers | head -3 asin 0062804006 0062891421 ``` | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/566/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1794097871 | I_kwDOBm6k_c5q78LP | 2095 | Introduce "dark mode" CSS | 3315059 | open | 0 | 0 | 2023-07-07T19:15:58Z | 2023-07-07T19:15:58Z | NONE | Using [the CSS media query `prefers-color-scheme`](https://developer.mozilla.org/en-US/docs/Web/CSS/@media/prefers-color-scheme) we can provide a dark-mode version of Datasette | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/2095/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1762180409 | I_kwDOBm6k_c5pCL05 | 2085 | Interactive row selection in Datasette | 24938923 | open | 0 | 0 | 2023-06-18T08:29:45Z | 2023-06-18T08:31:23Z | NONE | Simon did a excellent [prototype](https://til.simonwillison.net/datasette/row-selection-prototype) of an interactive row selection in Datasette. I hope this [functionality](https://camo.githubusercontent.com/3d4a0f31fb6a27fd279f809af5b53dc3b76faa63c7721e228951c5252b645a77/68747470733a2f2f7374617469632e73696d6f6e77696c6c69736f6e2e6e65742f7374617469632f323032332f6461746173657474652d7069636b65722e676966) can be turned into a Datasette plugin. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/2085/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1761613778 | I_kwDOBm6k_c5pABfS | 2084 | Support facets for columns that contain timestamps | 19492893 | open | 0 | 0 | 2023-06-17T03:33:54Z | 2023-06-17T03:33:54Z | NONE | Django has this very nice filter for datetime fields - <img width="176" alt="image" src="https://github.com/simonw/datasette/assets/19492893/3c66d7c4-1579-4d30-8f08-89d111f4566e"> It would be nice to have something similar to facet by a field that contains a timestamp in datasette too - Which doesn't seem to do anything with timestamps right now... <img width="283" alt="image" src="https://github.com/simonw/datasette/assets/19492893/069083e4-13f5-4b28-9473-a7b9d48839ea"> | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/2084/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1383646615 | I_kwDOCGYnMM5SeMWX | 491 | Ability to merge databases and tables | 8904453 | open | 0 | 7 | 2022-09-23T11:10:55Z | 2023-06-14T22:14:24Z | NONE | Hi! Let me firstly say that I am a big fan of your work -- I follow your tweets and blog posts with great interest 😄. Now onto the matter at hand: I think it would be great if `sqlite-utils` included a `merge` or `combine` command, with the purpose of combining different SQLite databases into a single SQLite database. This way, the newly "merged" database would contain all differently named tables contained in the databases to be merged as-is, as well a concatenation of all tables of the same name. This could look something like this: ```bash sqlite-utils merge cats.db dogs.db > animals.db ``` I imagine this is rather straightforward if all databases involved in the merge contain differently named tables (i.e. no chance of conflicts), but things get slightly more complicated if two or more of the databases to be merged contain tables with the same name. Not only do you have to "do something" with the primary key(s), but these tables could also simply have different schemas (and therefore be incompatible for concatenation to begin with). Anyhow, I would love your thoughts on this, and, if you are open to it, work together on the design and implementation! | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/491/reactions", "total_count": 2, "+1": 2, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1733198948 | I_kwDOCGYnMM5nToRk | 555 | Filter table by a large bunch of ids | 10843208 | open | 0 | 1 | 2023-05-31T00:29:51Z | 2023-06-14T22:01:57Z | NONE | Hi! this might be a question related to both SQLite & sqlite-utils, and you might be more experienced with them. I have a large bunch of ids, and I'm wondering which is the best way to query them in terms of performance, and simplicity if possible. The naive approach would be something like `select * from table where rowid in (?, ?, ?...)` but that wouldn't scale if ids are >1k. Another approach might be creating a temp table, or in-memory db table, insert all ids in that table and then join with the target one. I failed to attach an in-memory db both using sqlite-utils, and plain sql's execute(), so my closest approach is something like, ```python def filter_existing_video_ids(video_ids): db = get_db() # contains a "videos" table db.execute("CREATE TEMPORARY TABLE IF NOT EXISTS tmp (video_id TEXT NOT NULL PRIMARY KEY)") db["tmp"].insert_all([{"video_id": video_id} for video_id in video_ids]) for row in db["tmp"].rows_where("video_id not in (select video_id from videos)"): yield row["video_id"] db["tmp"].drop() ``` That kinda worked, I couldn't find an option in sqlite-utils's `create_table()` to tell it's a temporary table. Also, `tmp` table is not dropped finally, neither using `.drop()` despite being created with the keyword `TEMPORARY`. I believe it should be automatically dropped after connection/session ends though I read. | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/555/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1751214236 | I_kwDOC8SPRc5oYWic | 36 | Getting sqlite_master may not be modified when creating dogsheep index | 8711912 | open | 0 | 0 | 2023-06-11T03:21:53Z | 2023-06-11T03:21:53Z | NONE | When creating a `dogsheep` index from `config.yml` file on pocket.db (created using pocket-to-sqlite), I am getting this error ``` Traceback (most recent call last): File "/Users/khushmeeet/.pyenv/versions/3.11.2/bin/dogsheep-beta", line 8, in <module> sys.exit(cli()) ^^^^^ File "/Users/khushmeeet/.pyenv/versions/3.11.2/lib/python3.11/site-packages/click/core.py", line 1130, in __call__ return self.main(*args, **kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/Users/khushmeeet/.pyenv/versions/3.11.2/lib/python3.11/site-packages/click/core.py", line 1055, in main rv = self.invoke(ctx) ^^^^^^^^^^^^^^^^ File "/Users/khushmeeet/.pyenv/versions/3.11.2/lib/python3.11/site-packages/click/core.py", line 1657, in invoke return _process_result(sub_ctx.command.invoke(sub_ctx)) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/Users/khushmeeet/.pyenv/versions/3.11.2/lib/python3.11/site-packages/click/core.py", line 1404, in invoke return ctx.invoke(self.callback, **ctx.params) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/Users/khushmeeet/.pyenv/versions/3.11.2/lib/python3.11/site-packages/click/core.py", line 760, in invoke return __callback(*args, **kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/Users/khushmeeet/.pyenv/versions/3.11.2/lib/python3.11/site-packages/dogsheep_beta/cli.py", line 36, in index run_indexer( File "/Users/khushmeeet/.pyenv/versions/3.11.2/lib/python3.11/site-packages/dogsheep_beta/utils.py", line 32, in run_indexer ensure_table_and_indexes(db, tokenize) File "/Users/khushmeeet/.pyenv/versions/3.11.2/lib/python3.11/site-packages/dogsheep_beta/utils.py", line 91, in ensure_table_and_indexes table.add_foreign_key(*fk) File "/Users/khushmeeet/.pyenv/versions/3.11.2/lib/python3.11/site-packages/sqlite_utils/db.py", line 2155, in add_foreign_key self.db.add_foreign_keys([(self.name, column, other_table, other_column)]) File "/Users/khushmeeet/.pyenv/vers… | 197431109 | issue | { "url": "https://api.github.com/repos/dogsheep/dogsheep-beta/issues/36/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1727478903 | I_kwDOBm6k_c5m9zx3 | 2081 | Update Endpoints defined in metadata throws 403 Forbidden after a while | 15085007 | open | 0 | 0 | 2023-05-26T11:52:30Z | 2023-05-26T11:52:30Z | NONE | Hello. I expose an endpoint to update `tasks`: ``` { "title": "My Datasette Instance", "databases": { "tasks": { "queries": { "update_task": { "sql": "UPDATE tasks SET status = :status, result = :result, systemMessage = :systemMessage WHERE queueID = :queueID", "write": true, "on_success_message": "Task updated", "on_success_redirect": "/tasks/tasks.json", "on_error_message": "Task update failed", "on_error_redirect": "/tasks.json", "params": ["queueID", "taskData", "status", "result", "systemMessage"] } } } } } ``` This works really well! But after a while, the Datasette Instanz answers with **403 Forbidden**. I have to delete the database and recreate it in order to work again. Any help here? (´。_。`) | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/2081/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1720096994 | I_kwDOCGYnMM5mhpji | 554 | `IndexError` when doing `.insert(..., pk='id')` after `insert_all` | 1231935 | open | 0 | 1 | 2023-05-22T17:13:02Z | 2023-05-22T17:18:33Z | NONE | I believe this is related to https://github.com/simonw/sqlite-utils/issues/98. When `pk` is specified by table A's `insert` call, it throws an index error if a different table has written a row with a higher rowid than exists in the first table. Here's a basic example: ```py from sqlite_utils import Database def test_pk_for_insert(fresh_db): user = {"id": "abc", "name": "david"} fresh_db["users"].insert(user, pk="id") fresh_db["comments"].insert_all( [ {"id": "def", "text": "ok"}, {"id": "ghi", "text": "great"}, ], ) fresh_db["users"].insert( user, ignore=True, # BUG: when specifying pk on the second insert call # db.py goes into a block it doesn't expect and we get the error pk="id", ) if __name__ == "__main__": db = Database("bug.db") if db["users"].exists(): raise ValueError( "bug only shows on a new database - remove bug.db before running the script" ) test_pk_for_insert(db) ``` The error is: ```py File "/Users/david/projects/reddit-to-sqlite/.venv/lib/python3.11/site-packages/sqlite_utils/db.py", line 2960, in insert_chunk row = list(self.rows_where("rowid = ?", [self.last_rowid]))[0] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^ IndexError: list index out of range ``` The issue is in this block: https://github.com/simonw/sqlite-utils/blob/2747257a3334d55e890b40ec58fada57ae8cfbfd/sqlite_utils/db.py#L2954-L2958 relevant locals are: - `pk`: `'id'` - `result.lastrowid`: `2` What's most interesting is the comment `# self.last_rowid will be 0 if a "INSERT OR IGNORE" happened`, which doesn't seem to be the case here. | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/554/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1698865182 | I_kwDOBm6k_c5lQqAe | 2069 | [BUG] Cannot insert new data to deployed instance | 31861128 | open | 0 | 1 | 2023-05-07T02:59:42Z | 2023-05-07T03:17:35Z | NONE | ## Summary Recently, I deployed an instance of datasette to Vercel with the following plugins: - datasette-auth-tokens - datasette-insert With the above plugins, I was able to insert new data to local sqlite db. However, when it comes to the deployment on Vercel, things behave differently. I observed some errors from the logs console on Vercel: ```console File "/var/task/datasette/database.py", line 179, in _execute_writes conn = self.connect(write=True) File "/var/task/datasette/database.py", line 93, in connect assert not (write and not self.is_mutable) AssertionError ``` <img width="1380" alt="image" src="https://user-images.githubusercontent.com/31861128/236655274-3fd5e16f-4e06-4ba8-87ff-41130b8ebe90.png"> I think it is a potential bug. ## Reproduce <details><summary>metadata.json</summary> </br> ```json { "plugins": { "datasette-insert": { "allow": { "id": "*" } }, "datasette-auth-tokens": { "tokens": [ { "token": { "$env": "INSERT_TOKEN" }, "actor": { "id": "repeater" } } ], "param": "_auth_token" } } } ``` </details> <details><summary>commands</summary> </br> ```bash # deploy datasette publish vercel remote.db \ --project=repeater-bot-sqlite \ --metadata metadata.json \ --install datasette-auth-tokens \ --install datasette-insert \ --vercel-json=vercel.json # test insert cat fixtures/dogs.json | curl --request POST -d @- -H "Authorization: Bearer <token>" \ 'https://repeater-bot-sqlite.vercel.app/-/insert/remote/dogs?pk=id' ``` </details> <details><summary>logs</summary> </br> ```console Traceback (most recent call last): File "/var/task/datasette/app.py", line 1354, in route_path response = await view(request, send) File "/var/task/datasette/app.py", line 1500, in async_view_fn response = await async_call_with_supported_arguments( File "/var/task/datasette/utils… | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/2069/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1690765434 | I_kwDOBm6k_c5kxwh6 | 2067 | Litestream-restored db: errors on 3.11 and 3.10.8; but works on py3.10.7 and 3.10.6 | 39538958 | open | 0 | 1 | 2023-05-01T12:42:28Z | 2023-05-03T00:16:03Z | NONE | Hi! Wondering if this issue is limited to my local system or if it affects others as well. It seems like 3.11 errors out on a "litestream-restored" database. On further investigation, it also appears to conk out on 3.10.8 but works on 3.10.7 and 3.10.6. To demo issue I created a test database, replicated it to an aws s3 bucket, then restored the same under various .pyenv-versioned shells where I test whether I can read the database via the sqlite3 cli. ```sh # create new shell with 3.11.3 litestream restore -o data/db.sqlite s3://mytestbucketxx/db sqlite3 data/db.sqlite # SQLite version 3.41.2 2023-03-22 11:56:21 # Enter ".help" for usage hints. # sqlite> .tables # _litestream_lock _litestream_seq movie # sqlite> ``` However this get me an `OperationalError` when reading via datasette: <details> <summary>Error on 3.11.3 and 3.10.8</summary> ```sh datasette data/db.sqlite ``` ```console /tester/.venv/lib/python3.11/site-packages/pkg_resources/__init__.py:121: DeprecationWarning: pkg_resources is deprecated as an API warnings.warn("pkg_resources is deprecated as an API", DeprecationWarning) Traceback (most recent call last): File "/tester/.venv/bin/datasette", line 8, in <module> sys.exit(cli()) ^^^^^ File "/tester/.venv/lib/python3.11/site-packages/click/core.py", line 1130, in __call__ return self.main(*args, **kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/tester/.venv/lib/python3.11/site-packages/click/core.py", line 1055, in main rv = self.invoke(ctx) ^^^^^^^^^^^^^^^^ File "/tester/.venv/lib/python3.11/site-packages/click/core.py", line 1657, in invoke return _process_result(sub_ctx.command.invoke(sub_ctx)) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/tester/.venv/lib/python3.11/site-packages/click/core.py", line 1404, in invoke return ctx.invoke(self.callback, **ctx.params) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/tester/.venv/lib/python3.11/… | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/2067/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1665053646 | I_kwDOBm6k_c5jPrPO | 2059 | "Deceptive site ahead" alert on Heroku deployment | 1186275 | open | 0 | 1 | 2023-04-12T18:34:51Z | 2023-04-13T01:13:01Z | NONE | I deployed a fairly basic instance of Datasette (`datasette-auth-passwords` is the only plugin) using Heroku. The deployed URL now gives a "Deceptive site ahead" warning to users. Is there way around this? Maybe a way to add ownership verification [through Google's search console](https://search.google.com/search-console/welcome)? | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/2059/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1373210675 | I_kwDODD6af85R2Ygz | 13 | fails before generating views. ERR: table sqlite_master may not be modified | 116795 | open | 0 | 4 | 2022-09-14T15:41:50Z | 2023-04-11T03:46:17Z | NONE | generates checkins.db but seems to fail before generating views note: it worked on an Ubuntu WSL but fails on macOS 12.5.1 later edit: I suspect this is a problem with my local set-up, `dogsheep-beta index` also throws the same error full error: Importing 2591 checkins [###################################-] 98% 00:00:00 Traceback (most recent call last): File "/Users/pax/devbox/envAll/bin/swarm-to-sqlite", line 8, in <module> sys.exit(cli()) File "/Users/pax/devbox/envAll/lib/python3.8/site-packages/click/core.py", line 829, in __call__ return self.main(*args, **kwargs) File "/Users/pax/devbox/envAll/lib/python3.8/site-packages/click/core.py", line 782, in main rv = self.invoke(ctx) File "/Users/pax/devbox/envAll/lib/python3.8/site-packages/click/core.py", line 1066, in invoke return ctx.invoke(self.callback, **ctx.params) File "/Users/pax/devbox/envAll/lib/python3.8/site-packages/click/core.py", line 610, in invoke return callback(*args, **kwargs) File "/Users/pax/devbox/envAll/lib/python3.8/site-packages/swarm_to_sqlite/cli.py", line 77, in cli ensure_foreign_keys(db) File "/Users/pax/devbox/envAll/lib/python3.8/site-packages/swarm_to_sqlite/utils.py", line 145, in ensure_foreign_keys db[fk.table].add_foreign_key(fk.column, fk.other_table, fk.other_column) File "/Users/pax/devbox/envAll/lib/python3.8/site-packages/sqlite_utils/db.py", line 2123, in add_foreign_key self.db.add_foreign_keys([(self.name, column, other_table, other_column)]) File "/Users/pax/devbox/envAll/lib/python3.8/site-packages/sqlite_utils/db.py", line 1086, in add_foreign_keys cursor.execute( sqlite3.OperationalError: table sqlite_master may not be modified | 205429375 | issue | { "url": "https://api.github.com/repos/dogsheep/swarm-to-sqlite/issues/13/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1650981564 | I_kwDOJHON9s5iZ_q8 | 12 | Error running pytest | 14314871 | open | 0 | 0 | 2023-04-02T15:02:36Z | 2023-04-02T15:07:10Z | NONE | `______________________________________________________ ERROR collecting tests/test_apple_notes_to_sqlite.py _______________________________________________________ ImportError while importing test module '/Users/lol/development/apple-notes-to-sqlite/tests/test_apple_notes_to_sqlite.py'. Hint: make sure your test modules/packages have valid Python names. Traceback: /opt/homebrew/Cellar/python@3.9/3.9.16/Frameworks/Python.framework/Versions/3.9/lib/python3.9/importlib/__init__.py:127: in import_module return _bootstrap._gcd_import(name[level:], package, level) tests/test_apple_notes_to_sqlite.py:2: in <module> from apple_notes_to_sqlite.cli import cli, COUNT_SCRIPT, FOLDERS_SCRIPT E ModuleNotFoundError: No module named 'apple_notes_to_sqlite'` Solution: This is likely a PYTHONPATH issue due to having pytest installed both globally and in the venv. We can guarantee the tests run by adding the current directory to sys.path automatically using `python -m pytest` The alternative is to activate the venv, install pytest, deactivate, then activate the venv again (https://stackoverflow.com/questions/35045038/how-do-i-use-pytest-with-virtualenv) | 611552758 | issue | { "url": "https://api.github.com/repos/dogsheep/apple-notes-to-sqlite/issues/12/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
907795562 | MDU6SXNzdWU5MDc3OTU1NjI= | 265 | Using enable_fts before search term | 36287 | open | 0 | 1 | 2021-06-01T01:43:34Z | 2023-04-01T17:27:18Z | NONE | Many thanks for the sqlite-utils suite of utilities. Has made my life much much easier. I used this to create a table and enable FTS. All works fine. The datasette utility detects FTS and shows a text box. Searching for a term using that interface works well. However, when I start to use features by following https://www.sqlite.org/fts5.html section **"3. Full-text Query Syntax"** I seem to run into issues that I suspect is due to `escape_fts` wrapper function. As an example, if i search for the term `"^குகை" `on the text box in datasette it produces 140 results. However, when i tweak the query produced by datasette to not use "escape_fts" it produces 5 results. Similarly, when I try to restrict the search to a single column in FTS using a spec like `{title : ^குகை}` it returns no rows. The same thing pulls results when used without `escape_fts`. The text in the table is in Tamil language and the search term is a Tamil word. ``` ... where posts_fts match escape_fts(:search) ``` vs ``` ... where posts_fts match (:search) ``` Any ideas why? How can I get the benefits of both escaping as well as utilizing different facets of providing / controlling search terms? Thanks. | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/265/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
702386948 | MDU6SXNzdWU3MDIzODY5NDg= | 159 | .delete_where() does not auto-commit (unlike .insert() or .upsert()) | 11712349 | open | 0 | 9 | 2020-09-16T01:55:52Z | 2023-04-01T17:21:05Z | NONE | When you use the delete_where() function on a table, it never commits.... Is that intentional? | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/159/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1531991339 | I_kwDOBm6k_c5bUFUr | 1989 | Suggestion: Hiding columns | 116795 | open | 0 | 3 | 2023-01-13T09:33:32Z | 2023-03-31T06:18:05Z | NONE | As there's the possibility of [hiding tables](https://docs.datasette.io/en/stable/metadata.html#hiding-tables) - I've run into the **need of hiding specific columns** - data that's either not relevant for public or can't be shown due to privacy reasons. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1989/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1515883470 | I_kwDOC8tyDs5aWovO | 24 | DOC: xml.etree.ElementTree.ParseError due to healthkit version 12 | 6231413 | open | 0 | 2 | 2023-01-01T23:00:38Z | 2023-03-30T10:17:31Z | NONE | Hi @simonw I hope you find this issue ok, the idea is provide some documentation to other users like me about how to solve this problem and save some time. Following the instructions from the `README.md` I've faced this error: ```bash (venv) mgreco@pop-os apple-health master* (23:44|0s) $ healthkit-to-sqlite apple_health_export/export.xml healthkit.db --xml Importing from HealthKit [------------------------------------] 0% Traceback (most recent call last): File "/home/mgreco/github/mmngreco/apple-health/venv/bin/healthkit-to-sqlite", line 33, in <module> sys.exit(load_entry_point('healthkit-to-sqlite', 'console_scripts', 'healthkit-to-sqlite')()) File "/home/mgreco/github/mmngreco/apple-health/venv/lib/python3.10/site-packages/click/core.py", line 1130, in __call__ return self.main(*args, **kwargs) File "/home/mgreco/github/mmngreco/apple-health/venv/lib/python3.10/site-packages/click/core.py", line 1055, in main rv = self.invoke(ctx) File "/home/mgreco/github/mmngreco/apple-health/venv/lib/python3.10/site-packages/click/core.py", line 1404, in invoke return ctx.invoke(self.callback, **ctx.params) File "/home/mgreco/github/mmngreco/apple-health/venv/lib/python3.10/site-packages/click/core.py", line 760, in invoke return __callback(*args, **kwargs) File "/home/mgreco/github/mmngreco/apple-health/.deps/healthkit-to-sqlite/healthkit_to_sqlite/cli.py", line 57, in cli convert_xml_to_sqlite(fp, db, progress_callback=bar.update, zipfile=zf) File "/home/mgreco/github/mmngreco/apple-health/.deps/healthkit-to-sqlite/healthkit_to_sqlite/utils.py", line 25, in convert_xml_to_sqlite for tag, el in find_all_tags( File "/home/mgreco/github/mmngreco/apple-health/.deps/healthkit-to-sqlite/healthkit_to_sqlite/utils.py", line 12, in find_all_tags for event, el in parser.read_events(): File "/home/mgreco/github/mmngreco/apple-health/venv/lib/python3.10/xml/etree/ElementTree.py", line 1324, in read_events raise event File "/home/mgreco/gith… | 197882382 | issue | { "url": "https://api.github.com/repos/dogsheep/healthkit-to-sqlite/issues/24/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1646068413 | I_kwDOBm6k_c5iHQK9 | 2048 | Test failures encountered while packaging for GNU Guix | 8332263 | open | 0 | 0 | 2023-03-29T15:36:54Z | 2023-03-29T15:36:54Z | NONE | Hello, While reviewing a packaged submitted to Guix to add `datasette`, the test suite produces the following errors: ``` =================================== FAILURES =================================== _________________________ test_row_strange_table_name __________________________ [gw21] linux -- Python 3.9.9 /gnu/store/slsh0qjv5j68xda2bb6h8gsxwyi1j25a-python-wrapper-3.9.9/bin/python app_client = <datasette.utils.testing.TestClient object at 0x7fffef099be0> def test_row_strange_table_name(app_client): response = app_client.get( "/fixtures/table~2Fwith~2Fslashes~2Ecsv/3.json?_shape=objects" ) > assert response.status == 200 E assert 400 == 200 E + where 400 = <datasette.utils.testing.TestResponse object at 0x7fffef236a30>.status /tmp/guix-build-datasette-0.64.2.drv-0/source/tests/test_api.py:701: AssertionError ----------------------------- Captured stderr call ----------------------------- ERROR: conn=<sqlite3.Connection object at 0x7fffeedfe5d0>, sql = 'select rowid, * from [table%7E2Fwith%7E2Fslashes%7E2Ecsv] where "rowid"=:p0', params = {'p0': '3'}: no such table: table%7E2Fwith%7E2Fslashes%7E2Ecsv _______________ test_database_page_for_database_with_dot_in_name _______________ [gw15] linux -- Python 3.9.9 /gnu/store/slsh0qjv5j68xda2bb6h8gsxwyi1j25a-python-wrapper-3.9.9/bin/python app_client_with_dot = <datasette.utils.testing.TestClient object at 0x7fffef3416a0> def test_database_page_for_database_with_dot_in_name(app_client_with_dot): response = app_client_with_dot.get("/fixtures~2Edot.json") > assert response.status == 200 E assert 302 == 200 E + where 302 = <datasette.utils.testing.TestResponse object at 0x7fffef089fa0>.status /tmp/guix-build-datasette-0.64.2.drv-0/source/tests/test_api.py:633: AssertionError ___________________ test_tilde_encoded_database_names[fo%o] ____________________ [gw6] linux -- Python 3.9.9 /gnu/store/slsh0qjv5j68xda2bb6h8gsxwyi1j25a-python-wrapper-3.9.9/b… | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/2048/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1590183272 | I_kwDOBm6k_c5eyEVo | 2027 | How to redirect from "/" to a specific db/table | 1350673 | open | 0 | 4 | 2023-02-18T03:14:01Z | 2023-03-08T04:42:22Z | NONE | Using nginx to redirect public IP to the local uvicorn server as 'normal'. I can't figure out how to redirect such that '/' results in accessing the one db/table I want to serve; redirecting / to /db/table breaks some of the CSS; fooling with base_url doesn't seem to help. Can someone explain this, if it's possible? | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/2027/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
828858421 | MDU6SXNzdWU4Mjg4NTg0MjE= | 1258 | Allow canned query params to specify default values | 1385831 | open | 0 | 5 | 2021-03-11T07:19:02Z | 2023-02-20T23:39:58Z | NONE | If I call a canned query that includes named parameters, without passing any parameters, datasette runs the query anyway, resulting in an HTTP status code 400, and a visible error in the browser, with only a link back to home. This means that one of the default links on https://site/database/ will lead to a broken page with no apparent way out. ![image](https://user-images.githubusercontent.com/1385831/110748683-13e72300-820e-11eb-855c-32e03dfef5bf.png) Is there any way to skip performing the query when parameters aren't supplied, but otherwise render the usual canned query page? Alternatively, can I supply default values for my parameters, either when defining my canned queries or when linking to the canned query page from the default database template. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1258/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1592327343 | I_kwDOBm6k_c5e6Pyv | 2029 | Sorry Simon, didn't know how else to contact you | 5804626 | open | 0 | 0 | 2023-02-20T19:02:53Z | 2023-02-20T19:02:53Z | NONE | Hi Simon, Would you be willing to chat with me about Datasette? I have some questions. I am working on a project to evaluate data ingestion tools for a research organization and I ran across Datasette. I have looked through a lot of your documentation, but still have some questions, which are very specific. If you would be willing to write me back about this, my email is laura@renci.org. Thanks, Laura | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/2029/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1323346408 | I_kwDOBm6k_c5O4Kno | 1775 | i18n support | 428820 | open | 0 | 9 | 2022-07-31T02:51:04Z | 2023-02-10T18:04:40Z | NONE | I want contribute for translate UI to es, de, de and it if you share strings | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1775/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1577548579 | I_kwDOBm6k_c5eB3sj | 2021 | Docker images for 1.0 alphas? | 1563881 | open | 0 | 0 | 2023-02-09T09:35:52Z | 2023-02-09T09:35:52Z | NONE | Hi, would you consider putting 1.0alpha images on Dockerhub? (Also, how usable are the alphas?) | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/2021/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1575880841 | I_kwDOBm6k_c5d7giJ | 2020 | Documentation refers to "off" setting; doesn't seem to work, "false" does | 1350673 | open | 0 | 0 | 2023-02-08T10:38:10Z | 2023-02-08T10:38:10Z | NONE | https://docs.datasette.io/en/stable/settings.html#suggest-facets, among others, suggests using "off" to disable the setting; however, this doesn't appear to work in the JSON config files, where it apparently needs to be a "JSON boolean" and have the values "true" or "false". Perhaps the Python code is more flexible?...but either way, the documentation probably should mention it. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/2020/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1571207083 | I_kwDOBm6k_c5dprer | 2016 | Database metadata fields like description are not available in the index page template's context | 9993 | open | 0 | 3268330 | 1 | 2023-02-05T02:25:53Z | 2023-02-05T22:56:43Z | NONE | When looping through `databases` in the index.html template, I'd like to print the description of each database alongside its name. But it appears that isn't passed in from the view, unless I'm missing it. It would be great to have that. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/2016/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
|||||||
1557599877 | I_kwDODFE5qs5c1xaF | 12 | location history changes | 14809320 | open | 0 | 0 | 2023-01-26T03:57:25Z | 2023-01-26T03:57:25Z | NONE | not sure if each download is unique, but I had to change some things to work with the takeout zip I made 2023-01-25 filename changed from "Location History.json" to "Records.json" `"timestampMs"` is not present, `"timestamp"` is roughly iso timestamp ```py def get_timestamp_ms(raw_timestamp): try: return datetime.datetime.strptime(raw_timestamp, "%Y-%m-%dT%H:%M:%SZ").timestamp() except ValueError: return datetime.datetime.strptime(raw_timestamp, "%Y-%m-%dT%H:%M:%S.%fZ").timestamp() def save_location_history(db, zf): location_history = json.load( zf.open("Takeout/Location History/Records.json") ) db["location_history"].upsert_all( ( { "id": id_for_location_history(row), "latitude": row["latitudeE7"] / 1e7, "longitude": row["longitudeE7"] / 1e7, "accuracy": row["accuracy"], "timestampMs": get_timestamp_ms(row["timestamp"]), "when": row["timestamp"], } for row in location_history["locations"] ), pk="id", ) def id_for_location_history(row): # We want an ID that is unique but can be sorted by in # date order - so we use the isoformat date + the first # 6 characters of a hash of the JSON first_six = hashlib.sha1( json.dumps(row, separators=(",", ":"), sort_keys=True).encode("utf8") ).hexdigest()[:6] return "{}-{}".format( row['timestamp'], first_six, ) ``` example locations from mine ```json { "latitudeE7": 427220206, "longitudeE7": -923423972, "accuracy": 10, "deviceTag": -1312429967, "deviceDesignation": "PRIMARY", "timestamp": "2019-01-08T23:31:50.867Z" } ``` ```json { "latitudeE7": 427011317, "longitudeE7": -923448300, "accuracy": 5, "deviceTag": -1312429967, "deviceDesignation": "PRIMARY", "timestamp": "2019-01-08T23:33:53Z" }, ``` | 206649770 | issue | { "url": "https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/12/reactions", "total_count": 2, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 2 } |
||||||||
1553615704 | I_kwDOBm6k_c5cmktY | 2001 | Datasette is not compatible with SQLite's strict quoting compilation option | 406380 | open | 0 | 4 | 2023-01-23T19:10:07Z | 2023-01-25T04:59:58Z | NONE | I have linked Python3.11 on macOS against recent SQLite that was compiled using `-DSQLITE_DQS=0`. This option disables interpretation of double-quoted identifiers as string literals, described in the SQLite docs as a "MySQL 3.x misfeature". See https://www.sqlite.org/quirks.html#dblquote for background. Datasette uses the double-quote syntax in a number of key places, and is thus completely broken in this environment. My experience was to `pip install datasette`, then run `datasette serve -I my-data.db`. When I visit `http://127.0.0.1:8001` I get a 500 response. The error: `sqlite3.OperationalError: no such column: geometry_columns` The responsible SQL: `'select 1 from sqlite_master where tbl_name = "geometry_columns"'` I then installed datasette from GitHub master in development mode and changed the offending SQL to use correct quotes: `"select 1 from sqlite_master where tbl_name = 'geometry_columns'"`. With this change, I get a little further, but have the same problem with the first table name in my database (in my case, "Meta"): ``` OperationalError: no such column: Meta Traceback (most recent call last): File "/Users/gwk/external/datasette/datasette/app.py", line 1522, in route_path response = await view(request, send) ^^^^^^^^^^^^^^^^^^^^^^^^^ File "/Users/gwk/external/datasette/datasette/views/base.py", line 151, in view return await self.dispatch_request(request) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/Users/gwk/external/datasette/datasette/views/base.py", line 105, in dispatch_request response = await handler(request) ^^^^^^^^^^^^^^^^^^^^^^ File "/Users/gwk/external/datasette/datasette/views/index.py", line 70, in get "fts_table": await db.fts_table(table), ^^^^^^^^^^^^^^^^^^^^^^^^^ File "/Users/gwk/external/datasette/datasette/database.py", line 363, in fts_table return await self.execute_fn(lambda conn: detect_fts(conn, table)) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^… | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/2001/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1536851861 | I_kwDOBm6k_c5bmn-V | 1994 | Stuck on loading screen | 10913053 | open | 0 | 1 | 2023-01-17T18:33:49Z | 2023-01-23T08:21:08Z | NONE | Can’t actually open it! Downloaded today from the releases tab Running macOS13.1 ``` bin/python3.9 --version Python 3.9.6 Took 83ms bin/python3.9 --version Python 3.9.6 Took 113ms bin/pip install datasette>=0.59 datasette-app-support>=0.11.6 datasette-vega>=0.6.2 datasette-cluster-map>=0.17.1 datasette-pretty-json>=0.2.1 datasette-edit-schema>=0.4 datasette-configure-fts>=1.1 datasette-leaflet>=0.2.2 --disable-pip-version-check Requirement already satisfied: datasette>=0.59 in lib/python3.9/site-packages (0.63) Requirement already satisfied: datasette-app-support>=0.11.6 in lib/python3.9/site-packages (0.11.6) Requirement already satisfied: datasette-vega>=0.6.2 in lib/python3.9/site-packages (0.6.2) Requirement already satisfied: datasette-cluster-map>=0.17.1 in lib/python3.9/site-packages (0.17.2) Requirement already satisfied: datasette-pretty-json>=0.2.1 in lib/python3.9/site-packages (0.2.2) Requirement already satisfied: datasette-edit-schema>=0.4 in lib/python3.9/site-packages (0.5.1) Requirement already satisfied: datasette-configure-fts>=1.1 in lib/python3.9/site-packages (1.1) Requirement already satisfied: datasette-leaflet>=0.2.2 in lib/python3.9/site-packages (0.2.2) Requirement already satisfied: click>=7.1.1 in lib/python3.9/site-packages (from datasette>=0.59) (8.1.3) Requirement already satisfied: hupper>=1.9 in lib/python3.9/site-packages (from datasette>=0.59) (1.10.3) Requirement already satisfied: pint>=0.9 in lib/python3.9/site-packages (from datasette>=0.59) (0.20.1) Requirement already satisfied: PyYAML>=5.3 in lib/python3.9/site-packages (from datasette>=0.59) (6.0) Requirement already satisfied: httpx>=0.20 in lib/python3.9/site-packages (from datasette>=0.59) (0.23.0) Requirement already satisfied: aiofiles>=0.4 in lib/python3.9/site-packages (from datasette>=0.59) (22.1.0) Requirement already satisfied: asgi-csrf>=0.9 in lib/python3.9/site-packages (from datasette>=0.59) (0.9) Requirement already satisfied: asgiref>=3.2.10 in lib/python3.9/site-packages… | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1994/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1550536442 | I_kwDOCGYnMM5ca076 | 521 | Custom JSON encoder | 31504 | open | 0 | 0 | 2023-01-20T09:19:40Z | 2023-01-20T09:19:40Z | NONE | It would be nice if we could specify a custom encoder (and decoder) for types that will need extra deserialisation – e.g., sets, enums or sparse matrices – or even project-specific types | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/521/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1533673397 | I_kwDOBm6k_c5baf-1 | 1991 | fts5 tables are not auto-detected and hidden | 83819 | open | 0 | 0 | 2023-01-15T06:00:42Z | 2023-01-20T04:54:24Z | NONE | I set up a [Datasette instance](https://huggingface.co/spaces/Sygil/INE-dataset-explorer/tree/main) and was following the docs on full-text search. When I used fts4, datasette automatically hid the FTS tables and added the FTS search box where appropriate, but when I changed to fts5 it no longer does either. If I [manually set](https://huggingface.co/spaces/keturn/INED-datasette/blob/main/metadata.json#L9) `fts_table` for a view, then search does work as expected. My table and view creation code looks like this: ```py connection.execute("""CREATE TABLE IF NOT EXISTS captions(image_key text PRIMARY KEY, caption text NOT NULL) """) connection.execute("""CREATE VIRTUAL TABLE captions_fts USING fts5(caption, image_key UNINDEXED, content=captions) """) ``` | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1991/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1538197093 | I_kwDOBm6k_c5brwZl | 1995 | foreign_keys error 500 | 137183 | open | 0 | 0 | 2023-01-18T15:27:36Z | 2023-01-18T16:44:01Z | NONE | **Error 500 expected string or bytes-like object** [espial-new.sqlite3.zip](https://github.com/simonw/datasette/files/10447965/espial-new.sqlite3.zip) run `datasette espial-new.sqlite3` & click on any table other than `User` ``` /home/jon/.local/lib/python3.10/site-packages/datasette/app.py:814 in │ │ expand_foreign_keys │ │ │ │ 811 │ │ │ from {other_table} │ │ 812 │ │ │ where {other_column} in ({placeholders}) │ │ 813 │ │ """.format( │ │ ❱ 814 │ │ │ other_column=escape_sqlite(fk["other_column"]), │ │ 815 │ │ │ label_column=escape_sqlite(label_column), │ │ 816 │ │ │ other_table=escape_sqlite(fk["other_table"]), │ │ 817 │ │ │ placeholders=", ".join(["?"] * len(set(values))), │ │ │ │ ╭───────────────────────────── locals ──────────────────────────────╮ │ │ │ column = 'user_id' │ │ │ │ database = 'espial-new' │ │ │ │ db = <Database: espial-new (mutable, size=53248)> │ │ │ │ fk = { │ │ │ │ │ 'column': 'user_id', │ │ │ │ │ 'other_table': 'user', │ │ │ │ │ 'other_column': None │ │ │ │ } │ │ │ │ foreign_keys = [ │ │ │ │ │ { … | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1995/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1532000914 | I_kwDOBm6k_c5bUHqS | 1990 | Suggestion: Highlight error messages ('These facets timed out') | 116795 | open | 0 | 0 | 2023-01-13T09:40:58Z | 2023-01-13T09:40:58Z | NONE | I had trouble figuring out why faceting didn't work in some instances, it took a while before I noticed the _These facets timed out_ notice. It might help if that would be highlighted, or fading out highlight - if one might think it would be too visually disturbing. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1990/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1524431805 | I_kwDODEm0Qs5a3Pu9 | 72 | Import thread, including self- and others' replies | 601708 | open | 0 | 0 | 2023-01-08T09:51:06Z | 2023-01-08T09:51:06Z | NONE | statuses-lookup, home-timeline, mentions (only for auth'ed user) don't cover this. `twitter-to-sqlite fetch-thread tw-group1.db 1234123412341234` twitter-to-sqlite focuses on archiving users, but does not easily support archiving conversations or community activity. For reference, this is [implemented in twarc](https://sourcegraph.com/github.com/DocNow/twarc/-/blob/twarc/client.py?L708-766&subtree=true), using a search, optionally recursively. Other research suggests that this formerly, or currently, requires a [search query](https://stackoverflow.com/a/30480103/1020467), use of [undocumented `related_results` api](https://stackoverflow.com/a/9419346/1020467), or with requested inclusion of [newer conversation_id](https://stackoverflow.com/a/68115718/1020467) with subsequent query. | 206156866 | issue | { "url": "https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/72/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1505411725 | I_kwDODFdgUs5ZusKN | 78 | self-hosted or corp github enterprise | 549431 | open | 0 | 0 | 2022-12-20T22:51:45Z | 2022-12-20T22:51:45Z | NONE | We use github enterprise at work and I would like to use this tool to pull info from that site rather than the public github.com instance. Is there an option for this? If not, can one be added for a custom repo URL? | 207052882 | issue | { "url": "https://api.github.com/repos/dogsheep/github-to-sqlite/issues/78/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
664485022 | MDU6SXNzdWU2NjQ0ODUwMjI= | 46 | Feature: pull request reviews and comments | 1326704 | open | 0 | 6 | 2020-07-23T13:43:45Z | 2022-12-20T14:40:15Z | NONE | Hi there! I saw your [presentation at Boston Python](https://www.meetup.com/bostonpython/events/271887195). I'm already a light user of Datasette (thank you!), but wasn't aware of this project. I've been working on a "pull request dashboard" to get a comprehensive view of the state of open PR's, esp. related to reviews (i.e., pending, approved, changes requested). Currently it's a CLI command, but I thought a Datasette UI might be fun. I see that PR's are available from the `issues` command, but I don't see reviews anywhere. From the [API docs](https://docs.github.com/en/rest/reference/pulls#reviews), it looks like there are separate endpoints for those (as well as pull requests in general). What do you think about adding that? Would you accept a PR? Any sense of the level of effort? | 207052882 | issue | { "url": "https://api.github.com/repos/dogsheep/github-to-sqlite/issues/46/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1504352503 | I_kwDOBm6k_c5Zqpj3 | 1968 | Allow to hide some queries in metadata.yml | 562352 | open | 0 | 0 | 2022-12-20T10:45:41Z | 2022-12-20T10:45:41Z | NONE | By default all queries are displayed. But there are many cases where it would be interesting to hide the queries by default: * the website is targeting non-tech people * the query is veeeeeery long ([eg.](https://mirabelle.openfoodfacts.org/products/energy_calculator)) * reading the query is not important for the users, they only want to see the result Of course, the user still could have the option to see the query. It could be an option in the metadata file: ```yml databases: awesome_db: tables: products: hide_sql: true queries: great_query: hide_sql: true sql: select * from products where code = :barcode ``` The priority could be: * no option in the metadata and nothing in the URL: query displayed * hide_sql in the metadata and nothing in the URL: query displayed as asked in the metadata * hide_sql in the metadata and &_hide_sql= in the URL: query as asked in the URL See also: #1824 | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1968/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1485017981 | I_kwDODEpn8M5Yg5N9 | 2 | table identifications has no column named previous_observation_taxon | 520541 | open | 0 | 0 | 2022-12-08T16:47:17Z | 2022-12-08T16:47:17Z | NONE | Installed successfully with pip and ran `inaturalist-to-sqlite inaturalist.db simonw` and got the error: ``` sqlite3.OperationalError: table identifications has no column named previous_observation_taxon ``` | 206202864 | issue | { "url": "https://api.github.com/repos/dogsheep/inaturalist-to-sqlite/issues/2/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1452572348 | I_kwDOBm6k_c5WlH68 | 1900 | datasette package --spatialite throws error during build | 419145 | open | 0 | 11 | 2022-11-17T02:03:28Z | 2022-11-18T08:00:38Z | NONE | Hello! Attempting to use `datasette package` to bundle up a SpatiaLite DB and I'm getting this error during the `docker build`: ``` sqlite3.OperationalError: /usr/lib/x86_64-linux-gnu/mod_spatialite.so.so: cannot open shared object file: No such file or directory ``` Seems to be throwing when this step is ran: ``` ERROR [6/6] RUN datasette inspect results.db --inspect-file inspect-data.json ``` This is with `v0.63.1`. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1900/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1453134846 | I_kwDOCGYnMM5WnRP- | 513 | Add or document streamlined workflow for importing Datasette csv / json exports | 19328961 | open | 0 | 0 | 2022-11-17T10:54:47Z | 2022-11-17T10:54:47Z | NONE | I'm working on some small front-end enhancements to the laion-aesthetic-datasette project, and I wanted to partially populate a database directly using exports from the existing Datasette instance instead of downloading the parquet files and creating my own multi-GB database. There have been a number of small issues that are certainly related to my relative lack of familiarity with the toolkit, but that are still surprising. For example: a CSV export of the images table (http://laion-aesthetic.datasette.io/laion-aesthetic-6pls.csv?sql=select+rowid%2C+url%2C+text%2C+domain_id%2C+width%2C+height%2C+similarity%2C+punsafe%2C+pwatermark%2C+aesthetic%2C+hash%2C+__index_level_0__+from+images+order+by+random%28%29+limit+100) has nested single quotes, double quotes, and commas that aren't handled by rows_from_file. Similarly, the json output has to be manually transformed to add the column names and remove extraneous information before sqlite_utils can import it. I was able to work through these issues, but as an enhancement it would be really helpful to create or document a clear workflow that avoids the friction of this data transformation. | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/513/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1452360613 | I_kwDOBm6k_c5WkUOl | 1895 | Avoid using host name when building absolute URLs? | 14294 | open | 0 | 0 | 2022-11-16T22:21:27Z | 2022-11-16T22:21:27Z | NONE | When deploying Datasette to Cloud Run and rewriting certain routes from a Firebase app to the Cloud Run service, some of the URLs in the page start with `https://[service].run.app` rather than the (custom) domain of the Firebase app. I guess this is because a) the custom domain of the Firebase app isn't being passed through in the `host` header of the request to the Cloud Run instance and b) the `absolute_url` function in Datasette is using information from the request to build the URL. Would it be possible to not use the host name when building the absolute URLs, i.e. only include the path in the URL? | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1895/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1433576351 | I_kwDOBm6k_c5VcqOf | 1880 | Datasette with many and large databases > Memory use | 525934 | open | 0 | 4 | 2022-11-02T18:10:27Z | 2022-11-16T17:50:29Z | NONE | > Datasette maintains an in-memory SQLite database with details of the the databases, tables and columns for all of the attached databases. The above is from the docs ^. There's two problems here - the number of datasette "instances" in a single server/VM and the size of the database itself. We want the **opposite** of in-memory, including what happens on SQLlite - documented in https://www.sqlite.org/inmemorydb.html From the context in https://github.com/simonw/datasette/issues/1150 - does it mean datasette is memory-bound to the size of the dataset - which might be a deal-breaker for many large-scale use cases? In an extreme case - let's say a single server had 100 SQLlite databases, which would enable 100 "instances" of datasette to run, one per client (e.g. in a SaaS multi-tenant environment). How could we achieve all these goals: 1. Allow any _one_ of these 100 databases to grow to say 2Tb in size 2. Have one datasette instance, which connects to 1 of the 100 instances, based on incoming credentials/tenant ID 3. Minimize memory use entirely - both by datasette and SQLlite, such that almost all operations are executed in real-time on-disk with little to no memory consumption per-tenant, or per-database. Any ideas appreciated - we're looking to use this in a SaaS type of setting - many instances, single server. @simonw great work on datasette, in general! Possibly related to https://github.com/simonw/datasette/issues/1480 but we don't want use any kind of serverless infra - this is a long-running VM/server. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1880/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1446657889 | I_kwDOBm6k_c5WOj9h | 1885 | Integrate inside GUI app (tkinter) | 5115787 | open | 0 | 0 | 2022-11-13T00:10:43Z | 2022-11-13T00:11:09Z | NONE | Hi, I'd like to integrate datasette inside a tkinter app. The app should be able to start/stop datasette server. How could I integrate datasette inside my app, so it can start and stop datasette server? | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1885/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
802513359 | MDU6SXNzdWU4MDI1MTMzNTk= | 1217 | Possible to deploy as a python app (for Rstudio connect server)? | 6165713 | open | 0 | 4 | 2021-02-05T22:21:24Z | 2022-11-04T11:37:52Z | NONE | Is it possible to deploy a `datasette` application as a python web app? In my enterprise, I have option to deploy python apps via [Rstudio Connect](https://github.com/rstudio/rsconnect-python), and I would like to publish a `datasette` dashboard for sharing. I welcome any pointers to converting `datasette serve` into a python app that can be run as something like `python datasette.py --my_data.db` | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1217/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1077560091 | I_kwDODEm0Qs5AOkMb | 61 | Data Pull fails for "Essential" level access to the Twitter API (for Documentation) | 57161638 | open | 0 | 1 | 2021-12-11T14:59:41Z | 2022-10-31T14:47:58Z | NONE | Per Twitter documentation: https://developer.twitter.com/en/docs/twitter-api/getting-started/about-twitter-api#v2-access-leve This isn't any fault of twitter-to-sqlite of course, but it should probably be documented as a side-note. ![image](https://user-images.githubusercontent.com/57161638/145681272-8c85b3b9-be95-44ff-9760-1bafa4917ce2.png) And this is how I'm surfacing the message from utils.py: ![image](https://user-images.githubusercontent.com/57161638/145681005-2776c0ad-9822-4461-b43a-450ab2e828eb.png) | 206156866 | issue | { "url": "https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/61/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1063982712 | I_kwDODEm0Qs4_axZ4 | 60 | Execution on Windows | 1733616 | open | 0 | 1 | 2021-11-26T00:24:34Z | 2022-10-14T16:58:27Z | NONE | My installation on Windows using pip has been successful. I have Python 3.6. How do I run twitter-to-sqlite? I cannot even figure out how "auth" is a command. I have python on my path: C:\prog\python\Python36;C:\prog\python\Python36\Scripts Where should the commands be executed, and where are the files created? Could some basics please be added to the documentation to get beginners started? | 206156866 | issue | { "url": "https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/60/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1387712501 | I_kwDOBm6k_c5Sts_1 | 1824 | Convert &_hide_sql=1 to #_hide_sql | 562352 | open | 0 | 1 | 2022-09-27T12:53:31Z | 2022-10-05T12:56:27Z | NONE | Hiding the SQL textarea with `&_hide_sql=1` enforces a page reload, which can take several seconds and use server resource (which is annoying for big database or complex queries). It could probably be done with a few lines of Javascript (I'm going to see if I can do that). | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1824/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1363552780 | I_kwDOBm6k_c5RRioM | 1805 | truncate_cells_html does not work for links? | 562352 | open | 0 | 7 | 2022-09-06T16:41:29Z | 2022-10-03T09:18:06Z | NONE | We have many links inside our dataset (please don't blame us ;-). When I use `--settings truncate_cells_html 60` it is not working for the links. Eg. https://images.openfoodfacts.org/images/products/000/000/000/088/nutrition_fr.5.200.jpg (87 chars) is not truncated: ![image](https://user-images.githubusercontent.com/562352/188689045-1946d776-2305-47cf-bfc5-b5685b9206b7.png) IMHO It would make sense that links should be treated as HTML. The link should work of course, but Datasette could truncate it: [https://images.openfoodfacts.org/images/products/00[...].jpg](https://images.openfoodfacts.org/images/products/000/000/000/088/nutrition_fr.5.200.jpg) | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1805/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
reopened | |||||||
459882902 | MDU6SXNzdWU0NTk4ODI5MDI= | 526 | Stream all results for arbitrary SQL and canned queries | 50578294 | open | 0 | 23 | 2019-06-24T13:09:45Z | 2022-09-28T04:01:25Z | NONE | I think that there is a difficulty with canned queries. When I want to stream all results of a canned query TwoDays I get only first 1.000 records. Example: `http://myserver/history_sample/two_days.csv?_stream=on` returns only first 1.000 records. If I do the same with the whole database i.e. `http://myserver/history_sample/database.csv?_stream=on` I get correctly all records. Any ideas? | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/526/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1082651698 | I_kwDOCGYnMM5Ah_Qy | 358 | Support for CHECK constraints | 11597658 | open | 0 | 7 | 2021-12-16T21:19:45Z | 2022-09-25T07:15:59Z | NONE | Hi, I noticed the `transform.table()` method doesn't have an option to add/change or drop a check constraint (see https://sqlite.org/lang_createtable.html -> 3.7 Check Constraints. would be great to have this as an option! | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/358/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1375792876 | I_kwDOBm6k_c5SAO7s | 1811 | Drop-down menu with "REGEXP" choice | 562352 | open | 0 | 0 | 2022-09-16T11:06:18Z | 2022-09-16T15:30:31Z | NONE | Drop-down menu below could add "REGEXP" choice when REGEXP sqlite extension is installed and used ![image](https://user-images.githubusercontent.com/562352/190675352-810fbdca-0827-4034-8b9f-fd67d5c35afb.png) Not sure. Close the issue if you don't find it relevant. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1811/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1128466114 | I_kwDOCGYnMM5DQwbC | 406 | Creating tables with custom datatypes | 82988 | open | 0 | 5 | 2022-02-09T12:16:31Z | 2022-09-15T18:13:50Z | NONE | Via https://stackoverflow.com/a/18622264/454773 I note the ability to register custom handlers for novel datatypes that can map into and out of things like sqlite `BLOB`s. From a quick look and a quick play, I didn't spot a way to do this in `sqlite_utils`? For example: ```python # Via https://stackoverflow.com/a/18622264/454773 import sqlite3 import numpy as np import io def adapt_array(arr): """ http://stackoverflow.com/a/31312102/190597 (SoulNibbler) """ out = io.BytesIO() np.save(out, arr) out.seek(0) return sqlite3.Binary(out.read()) def convert_array(text): out = io.BytesIO(text) out.seek(0) return np.load(out) # Converts np.array to TEXT when inserting sqlite3.register_adapter(np.ndarray, adapt_array) # Converts TEXT to np.array when selecting sqlite3.register_converter("array", convert_array) ``` ```python from sqlite_utils import Database db = Database('test.db') # Reset the database connection to used the parsed datatype # sqlite_utils doesn't seem to support eg: # Database('test.db', detect_types=sqlite3.PARSE_DECLTYPES) db.conn = sqlite3.connect(db_name, detect_types=sqlite3.PARSE_DECLTYPES) # Create a table the old fashioned way # but using the new custom data type vector_table_create = """ CREATE TABLE dummy (title TEXT, vector array ); """ cur = db.conn.cursor() cur.execute(vector_table_create) # sqlite_utils doesn't appear to support custom types (yet?!) # The following errors on the "array" datatype """ db["dummy"].create({ "title": str, "vector": "array", }) """ ``` We can then add / retrieve records from the database where the datatype of the `vector` field is a custom registered `array` type (which is to say, a `numpy` array): ```python import numpy as np db["dummy"].insert({'title':"test1", 'vector':np.array([1,2,3])}) for row in db.query("SELECT * FROM dummy"): print(row['title'], row['vector'], type(row['vector'])) """ test1 [1 2 3] <class '… | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/406/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1365741480 | I_kwDOBm6k_c5RZ4-o | 1806 | UX to recover from Error 500: "You can only execute one statement at a time." | 1470389 | open | 0 | 0 | 2022-09-08T08:01:27Z | 2022-09-08T08:01:37Z | NONE | When using the Custom SQL query view, when accidentally adding a semicolon in the middle of my query, datasette errors with: > # Error 500 > You can only execute one statement at a time. The error view doesn't contain the query textarea anymore, so it provides no easy way recover from the error. It would be nice if I could change and submit it again. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1806/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1363244199 | I_kwDODFdgUs5RQXSn | 75 | Fetch repos doesn't support organisations | 2757699 | open | 0 | 0 | 2022-09-06T12:55:06Z | 2022-09-06T12:55:06Z | NONE | Say I want to get all my Github Org's repos info, for data analysis. Not just the public repos, but also the private/internal repos. The endpoints are different for organisation, and this tool doesn't take it into account: https://github.com/dogsheep/github-to-sqlite/blob/ace13ec3d98090d99bd71871c286a4a612c96a50/github_to_sqlite/utils.py#L453 https://github.com/dogsheep/github-to-sqlite/blob/ace13ec3d98090d99bd71871c286a4a612c96a50/github_to_sqlite/utils.py#L455 The endpoints for organisation repos is instead ([source](https://docs.github.com/en/rest/repos/repos#list-organization-repositories)): `url = "https://api.github.com/orgs/{}/repos".format(username)` Let's add support for organisations repo scraping. | 207052882 | issue | { "url": "https://api.github.com/repos/dogsheep/github-to-sqlite/issues/75/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1353074021 | I_kwDOCGYnMM5QpkVl | 474 | Add an option for specifying column names when inserting CSV data | 14294 | open | 0 | 3 | 2022-08-27T15:29:59Z | 2022-08-31T03:42:36Z | NONE | https://sqlite-utils.datasette.io/en/stable/cli.html#csv-files-without-a-header-row > The first row of any CSV or TSV file is expected to contain the names of the columns in that file. > If your file does not include this row, you can use the `--no-headers` option to specify that the tool should not use that fist row as headers. > If you do this, the table will be created with column names called `untitled_1` and `untitled_2` and so on. You can then rename them using the `sqlite-utils transform ... --rename` command. It would be nice to be able to specify the column names when importing CSV/TSV without a header row, via an extra command line option. (renaming a column of a large table can take a long time, which makes it an inconvenient workaround) | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/474/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1353411865 | I_kwDODEpn8M5Qq20Z | 1 | Problem with my user | 2467 | open | 0 | 0 | 2022-08-28T16:59:37Z | 2022-08-28T16:59:37Z | NONE | If I call the program with: inaturalist-to-sqlite inaturalist.db ftricas the program exits with an error: `Importing 36 observations Traceback (most recent call last): File "/home/ftricas/.pyenv/versions/3.10.6/bin/inaturalist-to-sqlite", line 8, in <module> sys.exit(cli()) File "/home/ftricas/.pyenv/versions/3.10.6/lib/python3.10/site-packages/click/core.py", line 1130, in __call__ return self.main(*args, **kwargs) File "/home/ftricas/.pyenv/versions/3.10.6/lib/python3.10/site-packages/click/core.py", line 1055, in main rv = self.invoke(ctx) File "/home/ftricas/.pyenv/versions/3.10.6/lib/python3.10/site-packages/click/core.py", line 1404, in invoke return ctx.invoke(self.callback, **ctx.params) File "/home/ftricas/.pyenv/versions/3.10.6/lib/python3.10/site-packages/click/core.py", line 760, in invoke return __callback(*args, **kwargs) File "/home/ftricas/.pyenv/versions/3.10.6/lib/python3.10/site-packages/inaturalist_to_sqlite/cli.py", line 51, in cli save_observation(observation, db) File "/home/ftricas/.pyenv/versions/3.10.6/lib/python3.10/site-packages/inaturalist_to_sqlite/utils.py", line 34, in save_observation db["observations"] File "/home/ftricas/.pyenv/versions/3.10.6/lib/python3.10/site-packages/sqlite_utils/db.py", line 2965, in insert return self.insert_all( File "/home/ftricas/.pyenv/versions/3.10.6/lib/python3.10/site-packages/sqlite_utils/db.py", line 3068, in insert_all self.create( File "/home/ftricas/.pyenv/versions/3.10.6/lib/python3.10/site-packages/sqlite_utils/db.py", line 1564, in create self.db.create_table( File "/home/ftricas/.pyenv/versions/3.10.6/lib/python3.10/site-packages/sqlite_utils/db.py", line 951, in create_table sql = self.create_table_sql( File "/home/ftricas/.pyenv/versions/3.10.6/lib/python3.10/site-packages/sqlite_utils/db.py", line 765, in create_table_sql foreign_keys = self.resolve_foreign_keys(name, foreign_keys or []) File "/home/ftricas/.pyenv/versions/3.10.6/… | 206202864 | issue | { "url": "https://api.github.com/repos/dogsheep/inaturalist-to-sqlite/issues/1/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1347717749 | I_kwDOBm6k_c5QVIp1 | 1791 | Updating metadata.json on Datasette for MacOS | 1780782 | open | 0 | 1 | 2022-08-23T10:41:16Z | 2022-08-23T13:29:51Z | NONE | I've installed Datasette for Mac as per [the documentation](https://docs.datasette.io/en/stable/installation.html#datasette-desktop-for-mac) and it's working great! However, I'm not sure how to go about adding something like "[Canned Queries](https://docs.datasette.io/en/stable/sql_queries.html#canned-queries)" or utilising other advanced features or settings by manipulating the `metadata.json` or `settings.json` files. I can view these files from the Datasette App from the top right "burger" menu but it only shows the contents of the file with no way to edit or change it. Am I missing something? Where can I update the `metadata.json` file using the MacOS App? PS: This is a fantastic tool! Thanks so much for all the effort and especially adding a bunch of different ways to get started quickly! | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1791/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1337541526 | I_kwDOBm6k_c5PuUOW | 1780 | `facet_time_limit_ms` and `sql_time_limit_ms` overlap? | 53165 | open | 0 | 1 | 2022-08-12T17:55:37Z | 2022-08-15T23:50:08Z | NONE | I needed more than the default 200ms to facet a specific column in a database I was working with, so I ran `datasette` with `--setting facet_time_limit_ms 30000` — definitely overkill! But it still didn't work; it took a moment to realize I also needed to up my `sql_time_limit_ms` to something larger too. I'm happy to submit a PR that documents this behavior if it's helpful. Or, if there's a code change we'd like to make (like making sure `sql_time_limit_ms` is always set to the larger of itself and `facet_time_limit_ms`), happy to do that too. Apologies if I missed this somewhere in the docs. And: thanks. I'm really enjoying the simple, effective tooling datasette gives me out of the box for exploring my databases! | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1780/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1323332006 | I_kwDOBm6k_c5O4HGm | 1774 | Request of feature for mongo | 428820 | open | 0 | 0 | 2022-07-31T01:00:05Z | 2022-07-31T01:00:05Z | NONE | Will love if can we use datasette for mongo and all pipelines and workflows | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1774/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1227571375 | I_kwDOCGYnMM5JK0Cv | 431 | Allow making m2m relation of a table to itself | 738408 | open | 0 | 3 | 2022-05-06T08:30:43Z | 2022-06-23T14:12:51Z | NONE | I am building a database, in which one of the tables has a many-to-many relationship to itself. As far as I can see, this is not (yet) possible using `.m2m()` in sqlite-utils. This may be a bit of a niche use case, so feel free to close this issue if you feel it would introduce too much complexity compared to the benefits. Example: suppose I have a table of people, and I want to store the information that John and Mary have two children, Michael and Suzy. It would be neat if I could do something like this: ```python from sqlite_utils import Database db = Database(memory=True) db["people"].insert({"name": "John"}, pk="name").m2m( "people", [{"name": "Michael"}, {"name": "Suzy"}], m2m_table="parent_child", pk="name" ) db["people"].insert({"name": "Mary"}, pk="name").m2m( "people", [{"name": "Michael"}, {"name": "Suzy"}], m2m_table="parent_child", pk="name" ) ``` But if I do that, the many-to-many table `parent_child` has only one column: ``` CREATE TABLE [parent_child] ( [people_id] TEXT REFERENCES [people]([name]), PRIMARY KEY ([people_id], [people_id]) ) ``` This could be solved by adding one or two keyword_arguments to `.m2m()`, e.g. `.m2m(..., left_name=None, right_name=None)` or `.m2m(..., names=(None, None))`. | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/431/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1280799259 | I_kwDOBm6k_c5MV3Ib | 1761 | ensure_ascii=False | 1473102 | open | 0 | 0 | 2022-06-22T19:58:13Z | 2022-06-22T19:58:30Z | NONE | Hi, thanks for the project! For the JSON output, I would consider defaulting to `ensure_ascii=False` (UTF-8 seems pretty universal) or making it an option. When dealing with non-Latin text, `ensure_ascii=True` (the default) can triple the size of the output. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1761/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1250495688 | I_kwDOCGYnMM5KiQzI | 439 | Misleading progress bar against utf-16-le CSV input | 4068 | open | 0 | 12 | 2022-05-27T08:34:49Z | 2022-06-15T03:53:43Z | NONE | The program crashes without any error. ``` wget "https://artsdatabanken.no/Fab2018/api/export/csv" sqlite-utils create-database test.db sqlite-utils insert --csv --delimiter ";" --encoding "utf-16-le" test test.db csv [------------------------------------] 0% [#################-------------------] 49% 00:00:01 ``` I would like to highlight various issues: 1. sqlite-utils catches exceptions without printing the stacktrace and/or reraising the exception, so there is no easy way to use `pdb` or similar to debug the program, solution: add a debug option 2. Silent crash: this is related to (1.), and it happens when there is a catch-all mechanism; solution: let the program fail. | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/439/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1224112817 | I_kwDOCGYnMM5I9nqx | 430 | Document how to use `PRAGMA temp_store` to avoid errors when running VACUUM against huge databases | 9308268 | open | 0 | 2 | 2022-05-03T13:33:58Z | 2022-06-14T23:26:37Z | NONE | I'm trying to figure out a way to get the `table.extract()` method to complete successfully -- I'm not sure if maybe the cause (and a possible solution) of this on Ubuntu Server 22.04 is to adjust some of the PRAGMA values within SQLite itself ... on another Linux system (PopOS), using this method on this same database appears to work just fine. Here's the bit that's causing the error, and the resulting error output: ```python # combine these columns into 1 table "bib_properties" : # best_title # bib_level_code # mat_type # material_code # best_author db["circ_trans"].extract( ["best_title", "bib_level_code", "mat_type", "material_code", "best_author"], table="bib_properties", fk_column="bib_properties_id" ) db["circ_trans"].extract( ["call_number"], table="call_number", fk_column="call_number_id", rename={"call_number": "value"} ) ``` ```python --------------------------------------------------------------------------- OperationalError Traceback (most recent call last) Input In [17], in <cell line: 7>() 1 # combine these columns into 1 table "bib_properties" : 2 # best_title 3 # bib_level_code 4 # mat_type 5 # material_code 6 # best_author ----> 7 db["circ_trans"].extract( 8 ["best_title", "bib_level_code", "mat_type", "material_code", "best_author"], 9 table="bib_properties", 10 fk_column="bib_properties_id" 11 ) 13 db["circ_trans"].extract( 14 ["call_number"], 15 table="call_number", 16 fk_column="call_number_id", 17 rename={"call_number": "value"} 18 ) File ~/jupyter/venv/lib/python3.10/site-packages/sqlite_utils/db.py:1764, in Table.extract(self, columns, table, fk_column, rename) 1761 column_order.append(c.name) 1763 # Drop the unnecessary columns and rename lookup column -> 1764 self.transform( 1765 drop=set(columns), 1766 rename={magic_lookup_column:… | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/430/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1236693079 | I_kwDOCGYnMM5JtnBX | 432 | Support `rows_where()`, `delete_where()` etc for attached alias databases | 11597658 | open | 0 | 5 | 2022-05-16T06:38:58Z | 2022-06-14T22:16:48Z | NONE | Hi, I noticed `rows_where()` doesn't return any rows from tables which are from attached databases. The `exists()` function returns false. As far as I can see this is because the `table_names()` function only looks for table names in the current database and not in attached (or temp) databases. Besides, `rows_where()`, also `insert_all()` and `delete_where()` didn't do what I was expecting because of this. For the moment I've patched `table_names()` for myself, see below but I'm not sure what the total impact is on the other functions like lookup truncate etc which all use `exists()`. Also `view_names()` doesn't look for views in attached or temp databases. ```python def table_names(self, fts4: bool = False, fts5: bool = False) -> List[str]: "A list of string table names in this database." where = ["type = 'table'"] if fts4: where.append("sql like '%USING FTS4%'") if fts5: where.append("sql like '%USING FTS5%'") dbs = [x[1] for x in self.execute('pragma database_list').fetchall()] lst=[] for db in dbs: sql = "select name from {} where {}".format(db+".sqlite_master"," AND ".join(where)) lst.extend(r[0] for r in self.execute(sql).fetchall()) return lst ``` | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/432/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1266207143 | I_kwDOBm6k_c5LeMmn | 1755 | Gunicorn | 1176293 | open | 0 | 0 | 2022-06-09T14:18:46Z | 2022-06-09T14:18:46Z | NONE | I've read issue #514 which resulted in running Datasette via systemd as recommended approach. We've also adopted this (for now), but I notice that Uvicorn [says the following](https://www.uvicorn.org/#running-with-gunicorn): > Uvicorn includes a Gunicorn worker class allowing you to run ASGI applications, with all of Uvicorn's performance benefits, while also giving you Gunicorn's fully-featured process management. > > This allows you to increase or decrease the number of worker processes on the fly, restart worker processes gracefully, or perform server upgrades without downtime. > > For production deployments we recommend using gunicorn with the uvicorn worker class. We usually deploy Python applications via Gunicorn for these process management features (e.g. `--daemon` and `--pid`). Is this something that would/could work with Datasette as well? | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1755/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1251700382 | I_kwDOBm6k_c5Km26e | 1750 | Allow `label_column` to specify array of columns | 408765 | open | 0 | 0 | 2022-05-28T18:45:48Z | 2022-05-28T18:45:48Z | NONE | I think it would be great if the Datasette metadata would allow the `label_column` table key to list multiple columns. Something like: ```json "tables": { "person": { "label_column": ["first_name", "last_name"] }, ``` It would even be interesting with a "label expression" similar to a Python f-string. E.g. `{row.last_name}, {row.first_name}`. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1750/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1247315144 | I_kwDOBm6k_c5KWITI | 1749 | LDAP auth plugin | 380241 | open | 0 | 0 | 2022-05-25T01:35:12Z | 2022-05-25T01:35:12Z | NONE | A [search of the plugins directory](https://datasette.io/plugins?q=ldap) doesn't turn up anything, but is is possible to set up a Datasette app which uses my organisation's LDAP for auth? If not, how much work would it be to write one (I _may_ have some spare cycles on my team to do this, but we haven't written a datasette plugin before). | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1749/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1221849746 | I_kwDOBm6k_c5I0_KS | 1732 | Custom page variables aren't decoded | 52649 | open | 0 | 2 | 2022-04-30T14:55:46Z | 2022-05-03T01:50:45Z | NONE | I have a page `templates/filer/{filer_id}.html`. It uses `filer_id` in a `sql()` call to fetch data. With 0.61.1 this no longer works because the spaces in IDs isn't preserved. Instead, the escaped version is passed into the template and the id isn't present in my db. Datasette should unescape the url component before passing them into the template. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1732/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1129052172 | I_kwDOBm6k_c5DS_gM | 1633 | base_url or prefix does not work with _exact match | 6613091 | open | 0 | 2 | 2022-02-09T21:45:07Z | 2022-04-28T09:12:56Z | NONE | When i hit "Apply" button to search with "_exact" for a column syntax the URL prefix is removed from the url. ![image](https://user-images.githubusercontent.com/6613091/153293758-0b757d55-5757-4987-992e-9426e69a7956.png) And the result is: ![image](https://user-images.githubusercontent.com/6613091/153294672-87be7809-bb7b-455d-bf1a-41e90bbfa4ae.png) If I add the marked row to url_builder.py it seams to work: ![image](https://user-images.githubusercontent.com/6613091/153295231-bdd52e37-efcf-4b21-9d37-69f182a922f4.png) | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1633/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1211283427 | I_kwDODFdgUs5IMrfj | 72 | feature: display progress bar when downloading multi-page responses | 9020979 | open | 0 | 1 | 2022-04-21T16:37:12Z | 2022-04-21T17:29:31Z | NONE | ## Motivation For a long running command (longer than 1 minute) for a big table (like pull requests or commits), it can be tricky to know if the script is still running, or if a rate limit/error was encountered We know how many pages there are, so it may be possible to indicate how many remain. https://github.com/dogsheep/github-to-sqlite/blob/a6e237f75a4b86963d91dcb5c9582e3a1b3349d6/github_to_sqlite/utils.py#L367 ## Resources - Using the existing Click API: - https://click.palletsprojects.com/en/5.x/utils/#showing-progress-bars - Loading spinner: https://github.com/pavdmyt/yaspin - Progress bar: https://github.com/tqdm/tqdm | 207052882 | issue | { "url": "https://api.github.com/repos/dogsheep/github-to-sqlite/issues/72/reactions", "total_count": 3, "+1": 3, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1205867842 | I_kwDODtX3eM5H4BVC | 4 | Retrieve the top-level story for a comment | 1755789 | open | 0 | 0 | 2022-04-15T20:25:39Z | 2022-04-15T20:25:39Z | NONE | I think that each comment inserted into the database should include a column `onstory` that contains the ID of the story on which the comment was made. This is exactly equivalent to the link after "on:" at the top of an HN comment page ([example](https://news.ycombinator.com/item?id=18358028)). We could do this either by directly retrieving the HTML page and using Beautiful Soup to find that link, or alternatively recurse up the tree in the Firebase API using the `parent` field (probably using `functools.lru_cache` in case a person has commented a bunch of times on the same story). | 248903544 | issue | { "url": "https://api.github.com/repos/dogsheep/hacker-news-to-sqlite/issues/4/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1181037277 | I_kwDOBm6k_c5GZTLd | 1686 | heroku bails if app name specifed in datasette publish is the same as existing app | 2115933 | open | 0 | 0 | 2022-03-25T17:10:34Z | 2022-03-25T17:10:34Z | NONE | Seem that `heroku` does not accept an app overwrite triggered by specifying the app name using `datasette publish`, as below: ``` datasette publish heroku some.db --name "jazzy-name" ``` The resulting error has the below traceback: ``` Creating jazzy-name... ! ▸ Name jazzy-name is already taken Traceback (most recent call last): File "/opt/homebrew/bin/datasette", line 33, in <module> sys.exit(load_entry_point('datasette==0.60.1', 'console_scripts', 'datasette')()) File "/opt/homebrew/Cellar/datasette/0.60.1/libexec/lib/python3.10/site-packages/click/core.py", line 1128, in __call__ return self.main(*args, **kwargs) File "/opt/homebrew/Cellar/datasette/0.60.1/libexec/lib/python3.10/site-packages/click/core.py", line 1053, in main rv = self.invoke(ctx) File "/opt/homebrew/Cellar/datasette/0.60.1/libexec/lib/python3.10/site-packages/click/core.py", line 1659, in invoke return _process_result(sub_ctx.command.invoke(sub_ctx)) File "/opt/homebrew/Cellar/datasette/0.60.1/libexec/lib/python3.10/site-packages/click/core.py", line 1659, in invoke return _process_result(sub_ctx.command.invoke(sub_ctx)) File "/opt/homebrew/Cellar/datasette/0.60.1/libexec/lib/python3.10/site-packages/click/core.py", line 1395, in invoke return ctx.invoke(self.callback, **ctx.params) File "/opt/homebrew/Cellar/datasette/0.60.1/libexec/lib/python3.10/site-packages/click/core.py", line 754, in invoke return __callback(*args, **kwargs) File "/opt/homebrew/Cellar/datasette/0.60.1/libexec/lib/python3.10/site-packages/datasette/publish/heroku.py", line 127, in heroku create_output = check_output(cmd).decode("utf8") File "/opt/homebrew/Cellar/python@3.10/3.10.2/Frameworks/Python.framework/Versions/3.10/lib/python3.10/subprocess.py", line 420, in check_output return run(*popenargs, stdout=PIPE, timeout=timeout, check=True, File "/opt/homebrew/Cellar/python@3.10/3.10.2/Frameworks/Python.framework/Versions/3.10/lib/python3.10/subprocess.py", line 524, in run … | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1686/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1174655187 | I_kwDOBm6k_c5GA9DT | 1671 | Filters fail to work correctly against calculated numeric columns returned by SQL views because type affinity rules do not apply | 9308268 | open | 0 | 8 | 2022-03-20T19:17:24Z | 2022-03-22T17:43:12Z | NONE | I found a strange behavior, and I'm not sure if it's related to views and boolean values perhaps, or if there's something else weird going on here, but I'll provide an example that may help show what I'm seeing happen. ```bash #!/bin/bash echo "\"id\",\"expiration_date\" 0,2018-01-04 1,2019-01-05 2,2020-01-06 3,2021-01-07 4,2022-01-08 5,2023-01-09 6,2024-01-10 7,2025-01-11 8,2026-01-12 9,2027-01-13 " > test.csv csvs-to-sqlite test.csv test.db sqlite-utils create-view --replace test.db test_view "select id, expiration_date, case when julianday('NOW') >= julianday(expiration_date) then 1 else 0 end as has_expired FROM test" ``` ```bash datasette test.db ``` ![image](https://user-images.githubusercontent.com/9308268/159178745-9c6152f7-eac6-4bf9-bef5-a2d63d3ee13f.png) ![image](https://user-images.githubusercontent.com/9308268/159178824-c8952137-270c-42a4-ad1c-f6ad2c51e499.png) ![image](https://user-images.githubusercontent.com/9308268/159178877-23e00b36-443a-43ef-83e5-e0bdddd3fdcd.png) ![image](https://user-images.githubusercontent.com/9308268/159178918-65922cc7-2514-4735-a72d-4904b99976d4.png) Thanks again and let me know if you want me to provide anything else! | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1671/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1123393829 | I_kwDODFE5qs5C9aEl | 10 | sqlite3.OperationalError: no such table: main.my_activity | 69208826 | open | 0 | 1 | 2022-02-03T17:59:29Z | 2022-03-20T02:38:07Z | NONE | Hello, When i run the command `google-takeout-to-sqlite my-activity db.db takeout-20220203T174446Z-001.zip`, i get this error : ``` Traceback (most recent call last): File "c:\users\julie\appdata\local\programs\python\python39-32\lib\runpy.py", line 197, in _run_module_as_main return _run_code(code, main_globals, None, File "c:\users\julie\appdata\local\programs\python\python39-32\lib\runpy.py", line 87, in _run_code exec(code, run_globals) File "C:\Users\julie\AppData\Local\Programs\Python\Python39-32\Scripts\google-takeout-to-sqlite.exe\__main__.py", line 7, in <module> File "c:\users\julie\appdata\local\programs\python\python39-32\lib\site-packages\click\core.py", line 1128, in __call__ return self.main(*args, **kwargs) File "c:\users\julie\appdata\local\programs\python\python39-32\lib\site-packages\click\core.py", line 1053, in main rv = self.invoke(ctx) File "c:\users\julie\appdata\local\programs\python\python39-32\lib\site-packages\click\core.py", line 1659, in invoke return _process_result(sub_ctx.command.invoke(sub_ctx)) File "c:\users\julie\appdata\local\programs\python\python39-32\lib\site-packages\click\core.py", line 1395, in invoke return ctx.invoke(self.callback, **ctx.params) File "c:\users\julie\appdata\local\programs\python\python39-32\lib\site-packages\click\core.py", line 754, in invoke return __callback(*args, **kwargs) File "c:\users\julie\appdata\local\programs\python\python39-32\lib\site-packages\google_takeout_to_sqlite\cli.py", line 31, in my_activity utils.save_my_activity(db, zf) File "c:\users\julie\appdata\local\programs\python\python39-32\lib\site-packages\google_takeout_to_sqlite\utils.py", line 19, in save_my_activity db["my_activity"].create_index(["time"]) File "c:\users\julie\appdata\local\programs\python\python39-32\lib\site-packages\sqlite_utils\db.py", line 629, in create_index self.db.conn.execute(sql) sqlite3.OperationalError: no such table: main.my_activity ``` Thank you for your help … | 206649770 | issue | { "url": "https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/10/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
531502365 | MDU6SXNzdWU1MzE1MDIzNjU= | 646 | Make database level information from metadata.json available in the index.html template | 18017473 | open | 0 | 3268330 | 3 | 2019-12-02T19:55:10Z | 2022-03-15T20:50:34Z | NONE | Did a search on the issues here and didn't find anything related to what I want. I want to have information that is on the database level of the JSON like title, source and source_url, and use it on the index page. I tried some small tweaks on the python and html files, but failed to get that result. Is there a way? Thanks! | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/646/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
|||||||
1154399841 | I_kwDOBm6k_c5Ezr5h | 1645 | Sensible `cache-control` headers for static assets, including those served by plugins | 697092 | open | 0 | 3268330 | 4 | 2022-02-28T18:12:03Z | 2022-03-08T02:59:29Z | NONE | ## What I'm seeing With `default_cache_ttl = 86400`, I see the following: A table view returns `Cache-control: max-age=86400`: ![Screenshot_20220228_190000](https://user-images.githubusercontent.com/697092/156034352-4d64683e-39c8-49af-81df-0217a5957bbd.png) A static asset returns no `Cache-control` header: ![Screenshot_20220228_185933](https://user-images.githubusercontent.com/697092/156034363-d0b03cc2-5889-4ed2-b601-8c1846b8469a.png) ## What I expected to see I expected the static asset to return a `Cache-control` header indicating that this response can be cached. ## Why this matters I'm productionising a Datasette deployment right now and was looking into putting it behind a Varnish instance. I was surprised to see requests for static assets being served from Datasette rather than Varnish, this is what led me to look more closely at the response headers. While Datasette serves those static assets pretty quickly, I don't see why Datasette should serve them. By their nature, static assets like images and JS files are very cacheable, so it should be easy to serve them from a cache like Varnish. (Note that Varnish can easily be configured to override this header, enabling caching for static assets. But it would be better if this override was not necessary.) ## Discussion It seems clear to me that serving static assets without a `Cache-control` header is not ideal. I see two options here: A. Static assets use the same logic as table / SQL views to set the `Cache-control` header based on `default_cache_ttl`. B. An additional setting for static assets is introduced (`default_static_cache_ttl`, say). | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1645/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
|||||||
1148725876 | I_kwDOBm6k_c5EeCp0 | 1640 | Support static assets where file length may change, e.g. logs | 57859326 | open | 0 | 2 | 2022-02-24T00:34:42Z | 2022-03-05T01:19:25Z | NONE | This is a bit of an oxymoron. I am serving a log.txt file for a background process using the Datasette --static CLI. This is useful as I can observe a background process from the web UI to see any errors that occur (instead of spelunking the logs via docker exec/ssh etc). I get this error, which I think is because Datasette assumes that the size of the content does not change (but appending new log lines means the content length changes). ```python Traceback (most recent call last): File "/usr/local/lib/python3.9/site-packages/datasette/app.py", line 1181, in route_path response = await view(request, send) File "/usr/local/lib/python3.9/site-packages/datasette/utils/asgi.py", line 305, in inner_static await asgi_send_file(send, full_path, chunk_size=chunk_size) File "/usr/local/lib/python3.9/site-packages/datasette/utils/asgi.py", line 280, in asgi_send_file await send( File "/usr/local/lib/python3.9/site-packages/asgi_csrf.py", line 104, in wrapped_send await send(event) File "/usr/local/lib/python3.9/site-packages/uvicorn/protocols/http/h11_impl.py", line 460, in send output = self.conn.send(event) File "/usr/local/lib/python3.9/site-packages/h11/_connection.py", line 468, in send data_list = self.send_with_data_passthrough(event) File "/usr/local/lib/python3.9/site-packages/h11/_connection.py", line 501, in send_with_data_passthrough writer(event, data_list.append) File "/usr/local/lib/python3.9/site-packages/h11/_writers.py", line 58, in __call__ self.send_data(event.data, write) File "/usr/local/lib/python3.9/site-packages/h11/_writers.py", line 78, in send_data raise LocalProtocolError("Too much data for declared Content-Length") h11._util.LocalProtocolError: Too much data for declared Content-Length ERROR: Exception in ASGI application Traceback (most recent call last): File "/usr/local/lib/python3.9/site-packages/datasette/app.py", line 1181, in route_path response = await view(request, send) File "/usr/lo… | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1640/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |