github
html_url | issue_url | id | node_id | user | created_at | updated_at | author_association | body | reactions | issue | performed_via_github_app |
---|---|---|---|---|---|---|---|---|---|---|---|
https://github.com/simonw/datasette/issues/943#issuecomment-693004572 | https://api.github.com/repos/simonw/datasette/issues/943 | 693004572 | MDEyOklzc3VlQ29tbWVudDY5MzAwNDU3Mg== | 9599 | 2020-09-15T22:05:39Z | 2020-09-15T22:05:39Z | OWNER | Maybe these methods become the way most Datasette tests are written, replacing the existing `TestClient` mechanism? | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
681375466 | |
https://github.com/simonw/datasette/issues/943#issuecomment-693004296 | https://api.github.com/repos/simonw/datasette/issues/943 | 693004296 | MDEyOklzc3VlQ29tbWVudDY5MzAwNDI5Ng== | 9599 | 2020-09-15T22:04:54Z | 2020-09-15T22:04:54Z | OWNER | So what should I do about streaming responses? I could deliberately ignore them - throw an exception if you attempt to run `await datasette.get(...)` against a streaming URL. I could load the entire response into memory and return it as a wrapped object. I could support some kind of asynchronous iterator mechanism. This would be pretty elegant if I could decide the right syntax for it - it would allow plugins to take advantage of other internal URLs that return streaming content without needing to load that content entirely into memory in order to process it. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
681375466 | |
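One possible shape for that asynchronous iterator mechanism - a purely hypothetical sketch, nothing here is a settled API - would be a response wrapper whose status and headers are available immediately while the body is consumed chunk by chunk:

```python
import asyncio


class StreamingResponse:
    # Hypothetical wrapper: status and headers are available up front,
    # the body is consumed incrementally rather than loaded into memory.
    def __init__(self, status, headers, chunks):
        self.status = status
        self.headers = headers
        self._chunks = chunks

    async def __aiter__(self):
        for chunk in self._chunks:
            yield chunk


async def demo():
    # Simulates iterating over a ?_stream=1 CSV export one chunk at a time
    response = StreamingResponse(
        200,
        {"content-type": "text/csv"},
        [b"id,name\r\n", b"1,Cleo\r\n", b"2,Pancakes\r\n"],
    )
    body = b""
    async for chunk in response:
        body += chunk
    return response.status, body


status, body = asyncio.run(demo())
```

A plugin could process each chunk inside the `async for` loop instead of accumulating it, which is the whole point of not buffering the response.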
https://github.com/simonw/datasette/issues/943#issuecomment-693003652 | https://api.github.com/repos/simonw/datasette/issues/943 | 693003652 | MDEyOklzc3VlQ29tbWVudDY5MzAwMzY1Mg== | 9599 | 2020-09-15T22:03:08Z | 2020-09-15T22:03:08Z | OWNER | I'm not going to mess around with formats - you'll get back the exact response that a web client would receive. Question: what should the response object look like? e.g. if you do:

    response = await datasette.get("/db/table.json")

What should `response` be? I could reuse the Datasette `Response` class from `datasette.utils.asgi`. This would work well for regular responses, which just have a status code, some headers and a response body. It wouldn't be great for streaming responses though, such as the ones you get back from `?_stream=1` CSV exports. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
681375466 | |
https://github.com/simonw/datasette/issues/891#issuecomment-693001937 | https://api.github.com/repos/simonw/datasette/issues/891 | 693001937 | MDEyOklzc3VlQ29tbWVudDY5MzAwMTkzNw== | 9599 | 2020-09-15T21:58:56Z | 2020-09-15T21:58:56Z | OWNER | Here's what that looks like: ``` Traceback (most recent call last): File "/Users/simon/Dropbox/Development/datasette/plugins/sql_error.py", line 5, in oh_no_error return 100 / 0 ZeroDivisionError: division by zero ERROR: conn=<sqlite3.Connection object at 0x10bce0030>, sql = 'select oh_no_error()', params = {}: user-defined function raised exception INFO: 127.0.0.1:54066 - "GET /data?sql=select+oh_no_error%28%29 HTTP/1.1" 400 Bad Request ``` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
653529088 | |
https://github.com/simonw/datasette/issues/891#issuecomment-693000522 | https://api.github.com/repos/simonw/datasette/issues/891 | 693000522 | MDEyOklzc3VlQ29tbWVudDY5MzAwMDUyMg== | 9599 | 2020-09-15T21:55:11Z | 2020-09-15T21:55:11Z | OWNER | I'm going to turn this on. If people complain about it I can turn it off again (or make it a configuration setting). | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
653529088 | |
https://github.com/simonw/datasette/issues/891#issuecomment-692999893 | https://api.github.com/repos/simonw/datasette/issues/891 | 692999893 | MDEyOklzc3VlQ29tbWVudDY5Mjk5OTg5Mw== | 9599 | 2020-09-15T21:53:36Z | 2020-09-15T21:53:36Z | OWNER | Here's the commit (from 15 years ago) where `enable_callback_tracebacks` was first added: https://github.com/ghaering/pysqlite/commit/1e8bd36be93b7d7425910642b72e4152c77b0dfd > - Exceptions in callbacks lead to the query being aborted now instead of silently leading to generating values. > - Exceptions in callbacks can be echoed to stderr if you call the module level function enable_callback_tracebacks: enable_callback_tracebacks(1). | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
653529088 | |
https://github.com/simonw/datasette/issues/891#issuecomment-692998061 | https://api.github.com/repos/simonw/datasette/issues/891 | 692998061 | MDEyOklzc3VlQ29tbWVudDY5Mjk5ODA2MQ== | 9599 | 2020-09-15T21:49:03Z | 2020-09-15T21:49:03Z | OWNER | I've been trying to figure out why this is an optional setting that defaults to off. I think it's because it writes directly to `stderr`, so the maintainers of `sqlite3` reasonably decided that people should be able to opt in to that rather than having weird stuff show up on `stderr` that they weren't expecting. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
653529088 | |
https://github.com/simonw/datasette/issues/891#issuecomment-692968792 | https://api.github.com/repos/simonw/datasette/issues/891 | 692968792 | MDEyOklzc3VlQ29tbWVudDY5Mjk2ODc5Mg== | 9599 | 2020-09-15T20:44:15Z | 2020-09-15T20:44:15Z | OWNER | https://github.com/peter-wangxu/persist-queue/issues/74 warns that this might not work with PyPy. I could solve that with: ```python if hasattr(sqlite3, "enable_callback_tracebacks"): sqlite3.enable_callback_tracebacks(True) ``` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
653529088 | |
https://github.com/simonw/datasette/issues/877#issuecomment-692967733 | https://api.github.com/repos/simonw/datasette/issues/877 | 692967733 | MDEyOklzc3VlQ29tbWVudDY5Mjk2NzczMw== | 9599 | 2020-09-15T20:42:04Z | 2020-09-15T20:42:04Z | OWNER | I'm not going to drop CSRF protection - it's still needed for older browsers - but I have relaxed the circumstances under which it is applied. It only applies to requests that include cookies for example, so API clients that don't send cookies don't need to worry about it. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
648421105 | |
https://github.com/simonw/datasette/issues/889#issuecomment-692967123 | https://api.github.com/repos/simonw/datasette/issues/889 | 692967123 | MDEyOklzc3VlQ29tbWVudDY5Mjk2NzEyMw== | 9599 | 2020-09-15T20:40:52Z | 2020-09-15T20:40:52Z | OWNER | Thanks - I've fixed this in `datasette-media` and the other plugins that use that hook now I think. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
649907676 | |
https://github.com/simonw/datasette/issues/888#issuecomment-692966625 | https://api.github.com/repos/simonw/datasette/issues/888 | 692966625 | MDEyOklzc3VlQ29tbWVudDY5Mjk2NjYyNQ== | 9599 | 2020-09-15T20:39:49Z | 2020-09-15T20:39:49Z | OWNER | Thanks, I've fixed that now. It only affected the GitHub release notes - the ones at https://docs.datasette.io/en/stable/changelog.html#v0-45 had the correct links. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
649702801 | |
https://github.com/simonw/datasette/issues/634#issuecomment-692965761 | https://api.github.com/repos/simonw/datasette/issues/634 | 692965761 | MDEyOklzc3VlQ29tbWVudDY5Mjk2NTc2MQ== | 9599 | 2020-09-15T20:37:58Z | 2020-09-15T20:37:58Z | OWNER | I fixed this in 5e0b72247ecab4ce0fcec599b77a83d73a480872 | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
522352520 | |
https://github.com/simonw/datasette/issues/849#issuecomment-692965391 | https://api.github.com/repos/simonw/datasette/issues/849 | 692965391 | MDEyOklzc3VlQ29tbWVudDY5Mjk2NTM5MQ== | 9599 | 2020-09-15T20:37:14Z | 2020-09-15T20:37:14Z | OWNER | I've been running on `main` for a while now with no issues. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
639072811 | |
https://github.com/simonw/datasette/issues/956#issuecomment-692965022 | https://api.github.com/repos/simonw/datasette/issues/956 | 692965022 | MDEyOklzc3VlQ29tbWVudDY5Mjk2NTAyMg== | 9599 | 2020-09-15T20:36:34Z | 2020-09-15T20:36:34Z | OWNER | https://hub.docker.com/r/datasetteproject/datasette/tags - 0.49.1 was successfully pushed to Docker Hub by https://github.com/simonw/datasette/runs/1119815175?check_suite_focus=true | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
688427751 | |
https://github.com/simonw/datasette/issues/956#issuecomment-692955850 | https://api.github.com/repos/simonw/datasette/issues/956 | 692955850 | MDEyOklzc3VlQ29tbWVudDY5Mjk1NTg1MA== | 9599 | 2020-09-15T20:17:49Z | 2020-09-15T20:17:49Z | OWNER | I think I've fixed this with recent changes I made as part of #941 - but I won't know until I release the next version. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
688427751 | |
https://github.com/simonw/datasette/issues/946#issuecomment-692955379 | https://api.github.com/repos/simonw/datasette/issues/946 | 692955379 | MDEyOklzc3VlQ29tbWVudDY5Mjk1NTM3OQ== | 9599 | 2020-09-15T20:16:50Z | 2020-09-15T20:16:50Z | OWNER | Can't reproduce this bug now. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
682184050 | |
https://github.com/simonw/datasette/issues/492#issuecomment-692953174 | https://api.github.com/repos/simonw/datasette/issues/492 | 692953174 | MDEyOklzc3VlQ29tbWVudDY5Mjk1MzE3NA== | 9599 | 2020-09-15T20:12:29Z | 2020-09-15T20:12:29Z | OWNER | I fixed this in ea340cf320a2566d24517fb4a0c9852c5059e771 for #963 (a duplicate of this issue). | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
449854604 | |
https://github.com/simonw/datasette/issues/967#issuecomment-692951144 | https://api.github.com/repos/simonw/datasette/issues/967 | 692951144 | MDEyOklzc3VlQ29tbWVudDY5Mjk1MTE0NA== | 9599 | 2020-09-15T20:08:12Z | 2020-09-15T20:08:12Z | OWNER | I think the easiest fix is for me to ensure that calls to `__len__` on the `MagicParameters` class always return at least 1. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
702069429 | |
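A minimal sketch of that proposed fix - the class name comes from the issue, but this is not the actual Datasette implementation - would override `__len__` so a zero-length check can never cause the parameters to be treated as absent:

```python
class MagicParameters(dict):
    # Sketch of the proposed fix: report a length of at least 1 so
    # downstream code that checks len() (or truthiness, which falls
    # back to __len__) never concludes there are no parameters just
    # because nothing was POSTed.
    def __len__(self):
        return max(super().__len__(), 1)
```

Since `dict` has no `__bool__`, truthiness also goes through `__len__`, so an "empty" instance stays truthy as a side effect.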
https://github.com/simonw/datasette/issues/967#issuecomment-692946616 | https://api.github.com/repos/simonw/datasette/issues/967 | 692946616 | MDEyOklzc3VlQ29tbWVudDY5Mjk0NjYxNg== | 9599 | 2020-09-15T19:59:21Z | 2020-09-15T19:59:21Z | OWNER | I wish I could call https://www.sqlite.org/c3ref/bind_parameter_count.html and https://www.sqlite.org/c3ref/bind_parameter_name.html from Python. Might be possible to do that using `ctypes` - see this example code: https://mail.python.org/pipermail//pypy-commit/2013-February/071372.html ```python param_count = lib.sqlite3_bind_parameter_count(self.statement) for idx in range(1, param_count + 1): param_name = lib.sqlite3_bind_parameter_name(self.statement, idx) ``` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
702069429 | |
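The snippet above references a `lib` and `self.statement` from PyPy's internals. A self-contained version of the same idea - assuming `libsqlite3` is installed on the system and loadable via `ctypes`; the library-name fallback is a guess - looks like this:

```python
import ctypes
import ctypes.util

# Load the system SQLite C library directly (assumes libsqlite3 exists)
lib = ctypes.CDLL(ctypes.util.find_library("sqlite3") or "libsqlite3.so.0")

lib.sqlite3_open.argtypes = [ctypes.c_char_p, ctypes.POINTER(ctypes.c_void_p)]
lib.sqlite3_prepare_v2.argtypes = [
    ctypes.c_void_p,
    ctypes.c_char_p,
    ctypes.c_int,
    ctypes.POINTER(ctypes.c_void_p),
    ctypes.POINTER(ctypes.c_char_p),
]
lib.sqlite3_bind_parameter_count.argtypes = [ctypes.c_void_p]
lib.sqlite3_bind_parameter_count.restype = ctypes.c_int
lib.sqlite3_bind_parameter_name.argtypes = [ctypes.c_void_p, ctypes.c_int]
lib.sqlite3_bind_parameter_name.restype = ctypes.c_char_p

db = ctypes.c_void_p()
lib.sqlite3_open(b":memory:", ctypes.byref(db))

stmt = ctypes.c_void_p()
lib.sqlite3_prepare_v2(db, b"select :name, :age", -1, ctypes.byref(stmt), None)

# Parameter indexes are 1-based; names include the leading colon
param_count = lib.sqlite3_bind_parameter_count(stmt)
param_names = [
    lib.sqlite3_bind_parameter_name(stmt, idx).decode("utf-8")
    for idx in range(1, param_count + 1)
]
print(param_count, param_names)
```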
https://github.com/simonw/datasette/issues/967#issuecomment-692945504 | https://api.github.com/repos/simonw/datasette/issues/967 | 692945504 | MDEyOklzc3VlQ29tbWVudDY5Mjk0NTUwNA== | 9599 | 2020-09-15T19:57:10Z | 2020-09-15T19:57:10Z | OWNER | So the problem actually occurs when the `MagicParameters` class wraps an empty dictionary. Relevant code: https://github.com/simonw/datasette/blob/853c5fc37011a7bc09ca3a1af287102f00827c82/datasette/views/database.py#L228-L236 And: https://github.com/simonw/datasette/blob/853c5fc37011a7bc09ca3a1af287102f00827c82/datasette/views/database.py#L364-L383 I'm passing a special magic parameters dictionary for the Python `sqlite3` module to look up parameters in. When that dictionary is `{}` a `__len__` check is performed on that dictionary, the result comes back as 0 and as a result it assumes there are no parameters. I tracked down the relevant C code: https://github.com/python/cpython/blob/81715808716198471fbca0a3db42ac408468dbc5/Modules/_sqlite/statement.c#L218-L237 ```c Py_BEGIN_ALLOW_THREADS num_params_needed = sqlite3_bind_parameter_count(self->st); Py_END_ALLOW_THREADS if (PyTuple_CheckExact(parameters) || PyList_CheckExact(parameters) || (!PyDict_Check(parameters) && PySequence_Check(parameters))) { /* parameters passed as sequence */ if (PyTuple_CheckExact(parameters)) { num_params = PyTuple_GET_SIZE(parameters); } else if (PyList_CheckExact(parameters)) { num_params = PyList_GET_SIZE(parameters); } else { num_params = PySequence_Size(parameters); } if (num_params != num_params_needed) { PyErr_Format(pysqlite_ProgrammingError, "Incorrect number of bindings supplied. The current " "statement uses %d, and there are %zd supplied.", num_params_needed, num_params); return; } ``` It looks to me like this should fail if the number of keys known to be in the dictionary differs from the number of named parameters in the query. 
But even when those numbers don't match it still works as far as I can tell - it's only a dictionary length of 0 that causes the problem. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
702069429 | |
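The lazy-lookup pattern described above can be illustrated with a greatly simplified stand-in - this is not Datasette's actual `MagicParameters` class, which resolves `_header_*`, `_actor_*` and friends from the request - where the value only exists once `sqlite3` asks for it, so an "empty" instance still has answers even though `len()` sees 0:

```python
import sqlite3


class MagicParams(dict):
    # Simplified stand-in: the value is computed on demand when
    # sqlite3 looks the parameter name up via __getitem__/__missing__.
    # len() and truthiness checks still see 0, which is exactly the
    # trap discussed above.
    def __missing__(self, key):
        if key == "_header_host":
            return "localhost"
        raise KeyError(key)


conn = sqlite3.connect(":memory:")
# The :_header_host binding is resolved lazily by __missing__
row = conn.execute("select :_header_host", MagicParams()).fetchone()
print(row)
```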
https://github.com/simonw/datasette/issues/967#issuecomment-692940375 | https://api.github.com/repos/simonw/datasette/issues/967 | 692940375 | MDEyOklzc3VlQ29tbWVudDY5Mjk0MDM3NQ== | 9599 | 2020-09-15T19:47:09Z | 2020-09-15T19:47:09Z | OWNER | Yes! The tests all pass if I update the test function to do this: ```python response = magic_parameters_client.post( "/data/runme_post{}".format(qs), {"ignore_me": "1"}, csrftoken_from=use_csrf or None, allow_redirects=False, ) ``` So the bug only occurs if the POST body is completely empty. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
702069429 | |
https://github.com/simonw/datasette/issues/967#issuecomment-692938935 | https://api.github.com/repos/simonw/datasette/issues/967 | 692938935 | MDEyOklzc3VlQ29tbWVudDY5MjkzODkzNQ== | 9599 | 2020-09-15T19:44:21Z | 2020-09-15T19:44:41Z | OWNER | While I'm running the above test, in the rounds that work the `receive()` awaitable returns `{'type': 'http.request', 'body': b'csrftoken=IlpwUGlSMFVVa3Z3ZlVoamQi.uY2U1tF4i0M-5M6x34vnBCmJgr0'}` In the rounds that fail it returns `{'type': 'http.request'}` So it looks like the `csrftoken_from=True` parameter may be helping just by ensuring the `body` key is present. I wonder if it would work if a body of `b''` was present there? | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
702069429 | |
https://github.com/simonw/datasette/issues/967#issuecomment-692937150 | https://api.github.com/repos/simonw/datasette/issues/967 | 692937150 | MDEyOklzc3VlQ29tbWVudDY5MjkzNzE1MA== | 9599 | 2020-09-15T19:42:57Z | 2020-09-15T19:42:57Z | OWNER | New (failing) test: ```python @pytest.mark.parametrize("use_csrf", [True, False]) @pytest.mark.parametrize("return_json", [True, False]) def test_magic_parameters_csrf_json(magic_parameters_client, use_csrf, return_json): magic_parameters_client.ds._metadata["databases"]["data"]["queries"]["runme_post"][ "sql" ] = "insert into logs (line) values (:_header_host)" qs = "" if return_json: qs = "?_json=1" response = magic_parameters_client.post( "/data/runme_post{}".format(qs), {}, csrftoken_from=use_csrf or None, allow_redirects=False, ) if return_json: assert response.status == 200 assert response.json["ok"], response.json else: assert response.status == 302 messages = magic_parameters_client.ds.unsign( response.cookies["ds_messages"], "messages" ) assert [["Query executed, 1 row affected", 1]] == messages post_actual = magic_parameters_client.get( "/data/logs.json?_sort_desc=rowid&_shape=array" ).json[0]["line"] assert post_actual == "localhost" ``` It passes twice, fails twice - failures are for the ones where `use_csrf` is `False`. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
702069429 | |
https://github.com/simonw/datasette/issues/967#issuecomment-692927867 | https://api.github.com/repos/simonw/datasette/issues/967 | 692927867 | MDEyOklzc3VlQ29tbWVudDY5MjkyNzg2Nw== | 9599 | 2020-09-15T19:25:23Z | 2020-09-15T19:25:23Z | OWNER | Hunch: I think the `asgi-csrf` middleware may be consuming the request body and failing to restore it. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
702069429 | |
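If that hunch is right, the fix shape is familiar ASGI middleware territory: after reading the body to inspect the `csrftoken` field, hand the downstream app a `receive()` that replays the consumed body instead of swallowing it. A minimal sketch (not `asgi-csrf`'s actual code - the function name here is invented):

```python
import asyncio


def restore_body(body, receive):
    # After middleware has consumed the request body, wrap receive()
    # so the downstream application sees the body replayed once, then
    # falls through to the original receive() for later messages.
    replayed = False

    async def wrapped_receive():
        nonlocal replayed
        if not replayed:
            replayed = True
            return {"type": "http.request", "body": body, "more_body": False}
        return await receive()

    return wrapped_receive


async def demo():
    async def inner_receive():
        return {"type": "http.disconnect"}

    receive = restore_body(b"csrftoken=abc123", inner_receive)
    return await receive(), await receive()


first, second = asyncio.run(demo())
```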
https://github.com/simonw/datasette/issues/967#issuecomment-692835066 | https://api.github.com/repos/simonw/datasette/issues/967 | 692835066 | MDEyOklzc3VlQ29tbWVudDY5MjgzNTA2Ng== | 9599 | 2020-09-15T16:40:12Z | 2020-09-15T16:40:12Z | OWNER | Is the bug here that magic parameters are incompatible with CSRF-exempt requests (e.g. request with no cookies)? | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
702069429 | |
https://github.com/simonw/datasette/issues/967#issuecomment-692834670 | https://api.github.com/repos/simonw/datasette/issues/967 | 692834670 | MDEyOklzc3VlQ29tbWVudDY5MjgzNDY3MA== | 9599 | 2020-09-15T16:39:29Z | 2020-09-15T16:39:29Z | OWNER | Relevant code: https://github.com/simonw/datasette/blob/853c5fc37011a7bc09ca3a1af287102f00827c82/datasette/views/database.py#L222-L236 This issue may not be about `_json=1` interacting with magic parameters after all. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
702069429 | |
https://github.com/simonw/datasette/issues/967#issuecomment-692834064 | https://api.github.com/repos/simonw/datasette/issues/967 | 692834064 | MDEyOklzc3VlQ29tbWVudDY5MjgzNDA2NA== | 9599 | 2020-09-15T16:38:21Z | 2020-09-15T16:38:21Z | OWNER | So the mystery here is why does omitting `csrftoken_from=True` break the `MagicParameters` mechanism? | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
702069429 | |
https://github.com/simonw/datasette/issues/967#issuecomment-692832113 | https://api.github.com/repos/simonw/datasette/issues/967 | 692832113 | MDEyOklzc3VlQ29tbWVudDY5MjgzMjExMw== | 9599 | 2020-09-15T16:34:53Z | 2020-09-15T16:37:43Z | OWNER | This is so weird. In the test I wrote for this the following passed:

    response = magic_parameters_client.post("/data/runme_post?_json=1", {}, csrftoken_from=True)

But without the `csrftoken_from=True` parameter it failed with the bindings error:

    response = magic_parameters_client.post("/data/runme_post?_json=1", {})

Here's the test I wrote:

```python
def test_magic_parameters_json_body(magic_parameters_client):
    magic_parameters_client.ds._metadata["databases"]["data"]["queries"]["runme_post"][
        "sql"
    ] = "insert into logs (line) values (:_header_host)"
    response = magic_parameters_client.post("/data/runme_post?_json=1", {}, csrftoken_from=True)
    assert response.status == 200
    assert response.json["ok"], response.json
    post_actual = magic_parameters_client.get(
        "/data/logs.json?_sort_desc=rowid&_shape=array"
    ).json[0]["line"]
``` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
702069429 | |
https://github.com/simonw/datasette/issues/940#issuecomment-692340275 | https://api.github.com/repos/simonw/datasette/issues/940 | 692340275 | MDEyOklzc3VlQ29tbWVudDY5MjM0MDI3NQ== | 9599 | 2020-09-14T22:09:35Z | 2020-09-14T22:09:35Z | OWNER | I'm going to cross my fingers and hope that this works - I don't want to leave this issue open until Datasette 0.50. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
679808124 | |
https://github.com/simonw/datasette/issues/940#issuecomment-692339645 | https://api.github.com/repos/simonw/datasette/issues/940 | 692339645 | MDEyOklzc3VlQ29tbWVudDY5MjMzOTY0NQ== | 9599 | 2020-09-14T22:07:58Z | 2020-09-14T22:07:58Z | OWNER | I shipped the Docker build manually by running the following in a tmate session:

    docker login  # Typed my username and password interactively
    export REPO=datasetteproject/datasette
    docker build -f Dockerfile -t $REPO:0.49 .
    docker tag $REPO:0.49 $REPO:latest
    docker push $REPO
 | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
679808124 | |
https://github.com/simonw/datasette/issues/940#issuecomment-692337397 | https://api.github.com/repos/simonw/datasette/issues/940 | 692337397 | MDEyOklzc3VlQ29tbWVudDY5MjMzNzM5Nw== | 9599 | 2020-09-14T22:01:56Z | 2020-09-14T22:01:56Z | OWNER | I'm going to switch to using this logic to decide if I should ship to Docker: https://github.community/t/release-prerelease-action-triggers/17275/2

    if: "!github.event.release.prerelease"
 | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
679808124 | |
https://github.com/simonw/datasette/issues/940#issuecomment-692336564 | https://api.github.com/repos/simonw/datasette/issues/940 | 692336564 | MDEyOklzc3VlQ29tbWVudDY5MjMzNjU2NA== | 9599 | 2020-09-14T21:59:40Z | 2020-09-14T21:59:40Z | OWNER | Using https://github.com/marketplace/actions/debugging-with-tmate to manually submit a new build from within an interactive GitHub Actions session. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
679808124 | |
https://github.com/simonw/datasette/issues/940#issuecomment-692332430 | https://api.github.com/repos/simonw/datasette/issues/940 | 692332430 | MDEyOklzc3VlQ29tbWVudDY5MjMzMjQzMA== | 9599 | 2020-09-14T21:48:59Z | 2020-09-14T21:48:59Z | OWNER | So now I've released Datasette 0.49 but failed to push a new Docker image. This is bad, and I need to fix it. I'd like to push to Docker from GitHub Actions, so I think I'm going to create a one-off workflow task for doing that. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
679808124 | |
https://github.com/simonw/datasette/issues/940#issuecomment-692331919 | https://api.github.com/repos/simonw/datasette/issues/940 | 692331919 | MDEyOklzc3VlQ29tbWVudDY5MjMzMTkxOQ== | 9599 | 2020-09-14T21:47:39Z | 2020-09-14T21:47:39Z | OWNER | I bet that's because the `github.ref` actually looks like this: `${GITHUB_REF#refs/tags/}` And the `refs/tags/` part has an `a` in it. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
679808124 | |
https://github.com/simonw/datasette/issues/940#issuecomment-692331349 | https://api.github.com/repos/simonw/datasette/issues/940 | 692331349 | MDEyOklzc3VlQ29tbWVudDY5MjMzMTM0OQ== | 9599 | 2020-09-14T21:46:11Z | 2020-09-14T21:46:11Z | OWNER | Just released Datasette 0.49 - which shipped to PyPI just fine but skipped the Docker step for some reason! https://github.com/simonw/datasette/runs/1114585275?check_suite_focus=true <img width="929" alt="Release_0_49_·_simonw_datasette_c024952" src="https://user-images.githubusercontent.com/9599/93141431-0571e180-f699-11ea-93c3-acaa68bd1272.png"> | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
679808124 | |
https://github.com/simonw/datasette/issues/880#issuecomment-692324230 | https://api.github.com/repos/simonw/datasette/issues/880 | 692324230 | MDEyOklzc3VlQ29tbWVudDY5MjMyNDIzMA== | 9599 | 2020-09-14T21:28:15Z | 2020-09-14T21:28:21Z | OWNER | Documentation here: https://docs.datasette.io/en/latest/sql_queries.html#json-api-for-writable-canned-queries | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
648637666 | |
https://github.com/simonw/datasette/issues/880#issuecomment-692299770 | https://api.github.com/repos/simonw/datasette/issues/880 | 692299770 | MDEyOklzc3VlQ29tbWVudDY5MjI5OTc3MA== | 9599 | 2020-09-14T20:36:40Z | 2020-09-14T20:36:40Z | OWNER | The JSON response will look like this: ```json { "ok": true, "message": "A message", "redirect": "/blah" } ``` `"ok"` will be `true` if everything went right and `false` if there was an error. The `"message"` and `"redirect"` will be whatever was configured using the `on_success_message`, `on_success_redirect`, `on_error_message` and `on_error_redirect` settings, see https://docs.datasette.io/en/stable/sql_queries.html#writable-canned-queries | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
648637666 | |
https://github.com/simonw/datasette/issues/880#issuecomment-692298011 | https://api.github.com/repos/simonw/datasette/issues/880 | 692298011 | MDEyOklzc3VlQ29tbWVudDY5MjI5ODAxMQ== | 9599 | 2020-09-14T20:33:13Z | 2020-09-14T20:33:13Z | OWNER | I'm going to support several ways of indicating that you would like a JSON response instead of getting an HTTP redirect from your writable canned query submission:
- Use the `Accept: application/json` request header
- Include `?_json=1` in the request query string
- Include `"_json": 1` in the form submission (or the JSON body submission) | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
648637666 | |
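The three signals above could be checked by a helper along these lines - a sketch only; the function name and argument shapes are assumptions, not Datasette's actual code:

```python
def wants_json(accept_header, query_args, post_vars):
    # Returns True if any of the three signals described above is
    # present: the Accept header, ?_json=1 in the query string, or
    # "_json": 1 in the submitted form/JSON body.
    if "application/json" in (accept_header or ""):
        return True
    if query_args.get("_json") == "1":
        return True
    if str(post_vars.get("_json", "")) == "1":
        return True
    return False
```

Note that `str(...)` normalizes the body case, since a JSON body carries `1` as a number while a form submission carries it as the string `"1"`.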
https://github.com/simonw/datasette/issues/880#issuecomment-692272860 | https://api.github.com/repos/simonw/datasette/issues/880 | 692272860 | MDEyOklzc3VlQ29tbWVudDY5MjI3Mjg2MA== | 9599 | 2020-09-14T19:43:47Z | 2020-09-14T19:43:47Z | OWNER | I'm going to add support for POST content that is sent as a JSON document, in addition to the existing support for key=value encoded POST bodies. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
648637666 | |
https://github.com/simonw/datasette/issues/880#issuecomment-692271804 | https://api.github.com/repos/simonw/datasette/issues/880 | 692271804 | MDEyOklzc3VlQ29tbWVudDY5MjI3MTgwNA== | 9599 | 2020-09-14T19:41:37Z | 2020-09-14T19:41:37Z | OWNER | Relevant code section: https://github.com/simonw/datasette/blob/1552ac931e4d2cf516caac3ceeab4fd24da1510a/datasette/views/database.py#L209-L232 | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
648637666 | |
https://github.com/simonw/datasette/issues/965#issuecomment-692244252 | https://api.github.com/repos/simonw/datasette/issues/965 | 692244252 | MDEyOklzc3VlQ29tbWVudDY5MjI0NDI1Mg== | 9599 | 2020-09-14T18:49:48Z | 2020-09-14T18:49:48Z | OWNER | Documented here: https://docs.datasette.io/en/latest/custom_templates.html#custom-error-pages | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
701294727 | |
https://github.com/simonw/datasette/issues/965#issuecomment-692231257 | https://api.github.com/repos/simonw/datasette/issues/965 | 692231257 | MDEyOklzc3VlQ29tbWVudDY5MjIzMTI1Nw== | 9599 | 2020-09-14T18:25:04Z | 2020-09-14T18:25:04Z | OWNER | In documenting this I realized that it's confusing that the default `500.html` template is often used for non-500 errors (404 for example). I think I'll rename that default template to `error.html` instead. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
701294727 | |
https://github.com/simonw/datasette/issues/964#issuecomment-692212641 | https://api.github.com/repos/simonw/datasette/issues/964 | 692212641 | MDEyOklzc3VlQ29tbWVudDY5MjIxMjY0MQ== | 9599 | 2020-09-14T17:49:44Z | 2020-09-14T17:49:44Z | OWNER | Documentation: https://docs.datasette.io/en/latest/custom_templates.html#returning-404s | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
700728217 | |
https://github.com/simonw/datasette/issues/965#issuecomment-692207341 | https://api.github.com/repos/simonw/datasette/issues/965 | 692207341 | MDEyOklzc3VlQ29tbWVudDY5MjIwNzM0MQ== | 9599 | 2020-09-14T17:40:05Z | 2020-09-14T17:40:05Z | OWNER | Also link to these from the docs added in #964. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
701294727 | |
https://github.com/simonw/datasette/issues/944#issuecomment-691788478 | https://api.github.com/repos/simonw/datasette/issues/944 | 691788478 | MDEyOklzc3VlQ29tbWVudDY5MTc4ODQ3OA== | 9599 | 2020-09-14T03:21:45Z | 2020-09-14T03:21:45Z | OWNER | Having tried this out I think it does need a `raise_404()` mechanism - which needs to be smart enough to trigger the default 404 handler without accidentally going into an infinite loop. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
681516976 | |
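One minimal shape for such a mechanism - every name here is hypothetical, this is not how Datasette implements it - uses a dedicated exception plus a flag so a 404 raised while rendering the 404 page itself re-raises instead of recursing:

```python
class NotFoundExplicit(Exception):
    # Hypothetical exception raised by raise_404() inside a template
    def __init__(self, message=""):
        self.message = message
        super().__init__(message)


def raise_404(message=""):
    raise NotFoundExplicit(message)


def render_route(render, handling_404=False):
    # Sketch of the infinite-loop guard: if the default 404 handler
    # itself triggers raise_404(), propagate instead of recursing.
    try:
        return render()
    except NotFoundExplicit as exc:
        if handling_404:
            raise
        return render_route(lambda: "404: " + exc.message, handling_404=True)


def museum_page():
    raise_404("Museum not found")


result = render_route(museum_page)
```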
https://github.com/simonw/datasette/issues/880#issuecomment-691785692 | https://api.github.com/repos/simonw/datasette/issues/880 | 691785692 | MDEyOklzc3VlQ29tbWVudDY5MTc4NTY5Mg== | 9599 | 2020-09-14T03:10:11Z | 2020-09-14T03:10:11Z | OWNER | Answer: no, it's [not safe](https://twitter.com/glenathan/status/1305081266065244162) to skip CSRF if there's an `Accept: application/json` header because of a nasty old `crossdomain.xml` Flash vulnerability: https://blog.appsecco.com/exploiting-csrf-on-json-endpoints-with-flash-and-redirects-681d4ad6b31b?gi=a5ee3d7a8235 | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
648637666 | |
https://github.com/simonw/datasette/issues/940#issuecomment-691781345 | https://api.github.com/repos/simonw/datasette/issues/940 | 691781345 | MDEyOklzc3VlQ29tbWVudDY5MTc4MTM0NQ== | 9599 | 2020-09-14T02:53:25Z | 2020-09-14T02:53:49Z | OWNER | That worked: https://github.com/simonw/datasette/runs/1110040212?check_suite_focus=true ran and deployed https://pypi.org/project/datasette/0.49a1/ to PyPI but it skipped the push to Docker step because there was an "a" in the tag. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
679808124 | |
https://github.com/simonw/datasette/issues/940#issuecomment-691779693 | https://api.github.com/repos/simonw/datasette/issues/940 | 691779693 | MDEyOklzc3VlQ29tbWVudDY5MTc3OTY5Mw== | 9599 | 2020-09-14T02:46:39Z | 2020-09-14T02:46:39Z | OWNER | I think those should be single quoted. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
679808124 | |
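A single-quoted version of that step condition might look like this - the step name and surrounding context are assumed, not copied from the actual workflow; GitHub Actions expression syntax uses single quotes for string literals, which is why the double-quoted `"a"` was rejected:

```yaml
    - name: Push to Docker Hub
      # Skip pushes for alpha/beta tags - single quotes are required
      # for string literals inside an expression
      if: ${{ !(contains(github.ref, 'a') || contains(github.ref, 'b')) }}
```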
https://github.com/simonw/datasette/issues/940#issuecomment-691779510 | https://api.github.com/repos/simonw/datasette/issues/940 | 691779510 | MDEyOklzc3VlQ29tbWVudDY5MTc3OTUxMA== | 9599 | 2020-09-14T02:45:53Z | 2020-09-14T02:45:53Z | OWNER | This bit here: https://github.com/simonw/datasette/blob/c18117cf08ad67c704dab29e3cb3b88f1de4026b/.github/workflows/publish.yml#L58-L62 | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
679808124 | |
https://github.com/simonw/datasette/issues/940#issuecomment-691779361 | https://api.github.com/repos/simonw/datasette/issues/940 | 691779361 | MDEyOklzc3VlQ29tbWVudDY5MTc3OTM2MQ== | 9599 | 2020-09-14T02:45:04Z | 2020-09-14T02:45:04Z | OWNER | Package deploys are still broken, just got this error trying to ship 0.49a1: https://github.com/simonw/datasette/actions/runs/253099665 > The workflow is not valid. .github/workflows/publish.yml (Line: 61, Col: 9): Unexpected symbol: '"a"'. Located at position 24 within expression: !(contains(github.ref, "a") || contains(github.ref, "b")) | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
679808124 | |
https://github.com/simonw/datasette/issues/944#issuecomment-691774262 | https://api.github.com/repos/simonw/datasette/issues/944 | 691774262 | MDEyOklzc3VlQ29tbWVudDY5MTc3NDI2Mg== | 9599 | 2020-09-14T02:24:08Z | 2020-09-14T02:24:08Z | OWNER | Actually don't need `{{ raise_404("Museum not found") }}` because we already have `{{ custom_status(404) }}`. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
681516976 | |
https://github.com/simonw/datasette/issues/944#issuecomment-691769222 | https://api.github.com/repos/simonw/datasette/issues/944 | 691769222 | MDEyOklzc3VlQ29tbWVudDY5MTc2OTIyMg== | 9599 | 2020-09-14T02:01:33Z | 2020-09-14T02:01:33Z | OWNER | I'm going to cache the `list_templates()` result in memory. If you want to add a new template-defined route you will need to restart the server. I think that's acceptable. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
681516976 | |
https://github.com/simonw/datasette/issues/519#issuecomment-691566247 | https://api.github.com/repos/simonw/datasette/issues/519 | 691566247 | MDEyOklzc3VlQ29tbWVudDY5MTU2NjI0Nw== | 9599 | 2020-09-12T22:48:53Z | 2020-09-12T22:48:53Z | OWNER | I think I've figured out what to do about stability of the HTML and the default templates with respect to semantic versioning. I'm going to announce that the JSON API - including the variables made available to templates - should be considered stable according to semver. I will only break backwards compatibility at that level in a major version release. The template HTML (and default CSS) will not be considered a stable interface. They won't change on bug fix releases but they may change (with any changes described in the release notes) on minor version bumps. Since the template inputs are stable, you can run your own copy of the previous version's templates if something breaks. This means users (and plugin authors) who make changes to the default Datasette UI will have to test their changes against every minor release. I think that's OK. If you write plugins that don't affect the Datasette HTML UI you will be able to expect stability across minor version releases. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
459590021 | |
https://github.com/simonw/datasette/issues/880#issuecomment-691558387 | https://api.github.com/repos/simonw/datasette/issues/880 | 691558387 | MDEyOklzc3VlQ29tbWVudDY5MTU1ODM4Nw== | 9599 | 2020-09-12T22:04:48Z | 2020-09-12T22:04:48Z | OWNER | Is it safe to skip CSRF checks if the incoming request has `Accept: application/json` on it? I'm not sure that matters since `asgi-csrf` already won't reject requests that either have no cookies or are using an `Authorization: Bearer ...` header. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
648637666 | |
https://github.com/simonw/datasette/issues/880#issuecomment-691557675 | https://api.github.com/repos/simonw/datasette/issues/880 | 691557675 | MDEyOklzc3VlQ29tbWVudDY5MTU1NzY3NQ== | 9599 | 2020-09-12T22:01:02Z | 2020-09-12T22:01:11Z | OWNER | Maybe POST to `.json` doesn't actually make sense. I could instead support `POST /db/queryname` with an optional mechanism for requesting that the response to that POST be in a JSON format. Could be an `Accept: application/json` header with an option of including `"_accept": "json"` as a POST parameter instead. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
648637666 | |
https://github.com/simonw/datasette/issues/880#issuecomment-691557429 | https://api.github.com/repos/simonw/datasette/issues/880 | 691557429 | MDEyOklzc3VlQ29tbWVudDY5MTU1NzQyOQ== | 9599 | 2020-09-12T21:59:39Z | 2020-09-12T21:59:39Z | OWNER | What should happen when something does a POST to an extension that was registered by a plugin, e.g. `POST /db/table.atom` ? | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
648637666 | |
https://github.com/simonw/datasette/issues/782#issuecomment-691554088 | https://api.github.com/repos/simonw/datasette/issues/782 | 691554088 | MDEyOklzc3VlQ29tbWVudDY5MTU1NDA4OA== | 9599 | 2020-09-12T21:39:03Z | 2020-09-12T21:39:03Z | OWNER | Plan: release a new version of Datasette (probably 0.49) with the new JSON API design, but provide a plugin called something like `datasette-api-0-48` which runs as ASGI wrapping middleware and internally rewrites incoming requests to e.g. `/db/table.json` to behave as if they have the `?_extra=` params on them necessary to produce the 0.48 version of the JSON. Anyone who has built applications against 0.48 can install that plugin. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
627794879 | |
https://github.com/simonw/datasette/issues/262#issuecomment-691526975 | https://api.github.com/repos/simonw/datasette/issues/262 | 691526975 | MDEyOklzc3VlQ29tbWVudDY5MTUyNjk3NQ== | 9599 | 2020-09-12T18:22:44Z | 2020-09-12T18:22:44Z | OWNER | Are there any interesting use-cases for a plugin hook that allows plugins to define their own `?_extra=` blocks? | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
323658641 | |
https://github.com/simonw/datasette/issues/782#issuecomment-691526878 | https://api.github.com/repos/simonw/datasette/issues/782 | 691526878 | MDEyOklzc3VlQ29tbWVudDY5MTUyNjg3OA== | 9599 | 2020-09-12T18:21:41Z | 2020-09-12T18:22:20Z | OWNER | Would it be so bad if the default format had a `"rows"` key containing the array of rows? Maybe it wouldn't. The reason I always use `?_shape=array` is because I want an array of objects, rather than an array of arrays that I have to match up again with their columns. A default format that's an object rather than array also gives something for the `?_extra=` parameter to add its extras to. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
627794879 | |
https://github.com/simonw/datasette/issues/782#issuecomment-691526762 | https://api.github.com/repos/simonw/datasette/issues/782 | 691526762 | MDEyOklzc3VlQ29tbWVudDY5MTUyNjc2Mg== | 9599 | 2020-09-12T18:20:19Z | 2020-09-12T18:20:19Z | OWNER | I'd like to revisit the idea of using `?_extra=x` to opt-in to extra blocks of JSON, from #262 | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
627794879 | |
https://github.com/simonw/datasette/issues/262#issuecomment-691526719 | https://api.github.com/repos/simonw/datasette/issues/262 | 691526719 | MDEyOklzc3VlQ29tbWVudDY5MTUyNjcxOQ== | 9599 | 2020-09-12T18:19:50Z | 2020-09-12T18:19:50Z | OWNER | > Idea: `?_extra=sqllog` could output a log of every individual SQL statement that was executed in order to generate the page - useful for seeing how foreign key expansion and faceting actually works. I built a version of that a while ago as the `?_trace=1` argument. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
323658641 | |
https://github.com/simonw/datasette/issues/262#issuecomment-389702480 | https://api.github.com/repos/simonw/datasette/issues/262 | 389702480 | MDEyOklzc3VlQ29tbWVudDM4OTcwMjQ4MA== | 9599 | 2018-05-17T00:00:39Z | 2020-09-12T18:19:30Z | OWNER | Idea: `?_extra=sqllog` could output a log of every individual SQL statement that was executed in order to generate the page - useful for seeing how foreign key expansion and faceting actually works. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
323658641 | |
https://github.com/simonw/datasette/issues/680#issuecomment-691526635 | https://api.github.com/repos/simonw/datasette/issues/680 | 691526635 | MDEyOklzc3VlQ29tbWVudDY5MTUyNjYzNQ== | 9599 | 2020-09-12T18:18:50Z | 2020-09-12T18:18:50Z | OWNER | I'm happy with the not-quite-automated way I'm doing this, so I'm going to close this issue. That's documented here https://docs.datasette.io/en/0.48/contributing.html#release-process - I use https://euangoddard.github.io/clipboard2markdown/ to create the GitHub releases markdown version. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
569275763 | |
https://github.com/simonw/datasette/issues/782#issuecomment-691526489 | https://api.github.com/repos/simonw/datasette/issues/782 | 691526489 | MDEyOklzc3VlQ29tbWVudDY5MTUyNjQ4OQ== | 9599 | 2020-09-12T18:17:16Z | 2020-09-12T18:17:16Z | OWNER | (I think I may have been over-thinking the details of this is for a couple of years now.) | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
627794879 | |
https://github.com/simonw/datasette/issues/782#issuecomment-691526416 | https://api.github.com/repos/simonw/datasette/issues/782 | 691526416 | MDEyOklzc3VlQ29tbWVudDY5MTUyNjQxNg== | 9599 | 2020-09-12T18:16:36Z | 2020-09-12T18:16:36Z | OWNER | I'm going to hack together a preview of this in a branch and deploy it somewhere so people can see what I've got planned. Much easier to evaluate a working prototype than static examples. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
627794879 | |
https://github.com/dogsheep/twitter-to-sqlite/issues/50#issuecomment-691501132 | https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/50 | 691501132 | MDEyOklzc3VlQ29tbWVudDY5MTUwMTEzMg== | 706257 | 2020-09-12T14:48:10Z | 2020-09-12T14:48:10Z | NONE | This seems to be an issue even with larger values of `--stop_after`: ``` $ twitter-to-sqlite favorites twitter.db --stop_after=2000 Importing favorites [####################################] 198 $ ``` | { "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
698791218 | |
https://github.com/simonw/datasette/issues/963#issuecomment-691379980 | https://api.github.com/repos/simonw/datasette/issues/963 | 691379980 | MDEyOklzc3VlQ29tbWVudDY5MTM3OTk4MA== | 9599 | 2020-09-12T01:50:56Z | 2020-09-12T01:50:56Z | OWNER | Good bug - looks like a problem with the hidden form fields. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
699947574 | |
https://github.com/simonw/datasette/issues/782#issuecomment-691323302 | https://api.github.com/repos/simonw/datasette/issues/782 | 691323302 | MDEyOklzc3VlQ29tbWVudDY5MTMyMzMwMg== | 9599 | 2020-09-11T21:38:27Z | 2020-09-11T21:40:04Z | OWNER | Another idea: the default output could be the list of dicts: ```json [ { "pk1": "a", "pk2": "a", "pk3": "a", "content": "a-a-a" }, ... ] ``` BUT... I could include pagination information in the HTTP headers - as seen in the WordPress REST API or the GitHub API: ``` ~ % curl -s -i 'https://api.github.com/repos/simonw/datasette/commits' | head -n 40 HTTP/1.1 200 OK server: GitHub.com date: Fri, 11 Sep 2020 21:37:46 GMT content-type: application/json; charset=utf-8 status: 200 OK cache-control: public, max-age=60, s-maxage=60 vary: Accept, Accept-Encoding, Accept, X-Requested-With etag: W/"71c99379743513394e880c6306b66bf9" last-modified: Fri, 11 Sep 2020 21:32:54 GMT x-github-media-type: github.v3; format=json link: <https://api.github.com/repositories/107914493/commits?page=2>; rel="next", <https://api.github.com/repositories/107914493/commits?page=44>; rel="last" access-control-expose-headers: ETag, Link, Location, Retry-After, X-GitHub-OTP, X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Used, X-RateLimit-Reset, X-OAuth-Scopes, X-Accepted-OAuth-Scopes, X-Poll-Interval, X-GitHub-Media-Type, Deprecation, Sunset access-control-allow-origin: * strict-transport-security: max-age=31536000; includeSubdomains; preload x-frame-options: deny x-content-type-options: nosniff x-xss-protection: 1; mode=block referrer-policy: origin-when-cross-origin, strict-origin-when-cross-origin content-security-policy: default-src 'none' X-Ratelimit-Limit: 60 X-Ratelimit-Remaining: 55 X-Ratelimit-Reset: 1599863850 X-Ratelimit-Used: 5 Accept-Ranges: bytes Content-Length: 118240 X-GitHub-Request-Id: EC76:0EAD:313F40:5291A4:5F5BEE37 [ { "sha": "d02f6151dae073135a22d0123e8abdc6cbef7c50", "node_id": 
"MDY6Q29tbWl0MTA3OTE0NDkzOmQwMmY2MTUxZGFlMDczMTM1YTIyZDAxMjNlOGFiZGM2Y2JlZjdjNTA=", "commit": { ``` Alternative shapes would provide the pagination information (and other extensions) in the JSON, e.g.: `/squirrels/squirrels.json?_shape=paginated` ```jsonā¦ | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
627794879 | |
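The Link-header approach quoted above is straightforward to consume from a client. A minimal sketch of a parser for GitHub-style pagination headers - the header value is taken from the curl output above, and `parse_link_header` is a made-up helper name, not part of Datasette or the GitHub API:

```python
def parse_link_header(value):
    # Split a Link header like '<url>; rel="next", <url>; rel="last"'
    # into a {rel: url} dict.
    links = {}
    for part in value.split(","):
        segments = [s.strip() for s in part.split(";")]
        if len(segments) < 2 or not segments[0].startswith("<"):
            continue
        url = segments[0].lstrip("<").rstrip(">")
        for seg in segments[1:]:
            if seg.startswith("rel="):
                links[seg[4:].strip('"')] = url
    return links

header = (
    '<https://api.github.com/repositories/107914493/commits?page=2>; rel="next", '
    '<https://api.github.com/repositories/107914493/commits?page=44>; rel="last"'
)
links = parse_link_header(header)
print(links["next"])  # https://api.github.com/repositories/107914493/commits?page=2
```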
https://github.com/simonw/datasette/issues/947#issuecomment-691318133 | https://api.github.com/repos/simonw/datasette/issues/947 | 691318133 | MDEyOklzc3VlQ29tbWVudDY5MTMxODEzMw== | 9599 | 2020-09-11T21:23:40Z | 2020-09-11T21:23:40Z | OWNER | I'm going to use exit code 1 for any errors, be they 500 or 404. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
684111953 | |
https://github.com/simonw/datasette/issues/962#issuecomment-691250299 | https://api.github.com/repos/simonw/datasette/issues/962 | 691250299 | MDEyOklzc3VlQ29tbWVudDY5MTI1MDI5OQ== | 9599 | 2020-09-11T18:33:50Z | 2020-09-11T18:33:50Z | OWNER | Since this is purely a debugging option I'm going to allow myself not to write a unit test for it! | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
699622046 | |
https://github.com/dogsheep/twitter-to-sqlite/issues/50#issuecomment-690860653 | https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/50 | 690860653 | MDEyOklzc3VlQ29tbWVudDY5MDg2MDY1Mw== | 370930 | 2020-09-11T04:04:08Z | 2020-09-11T04:04:08Z | CONTRIBUTOR | There's probably a nicer way of doing (hence this is a comment rather than a PR), but this appears to fix it: ```diff --- a/twitter_to_sqlite/utils.py +++ b/twitter_to_sqlite/utils.py @@ -181,6 +181,7 @@ def fetch_timeline( args["tweet_mode"] = "extended" min_seen_id = None num_rate_limit_errors = 0 + seen_count = 0 while True: if min_seen_id is not None: args["max_id"] = min_seen_id - 1 @@ -208,6 +209,7 @@ def fetch_timeline( yield tweet min_seen_id = min(t["id"] for t in tweets) max_seen_id = max(t["id"] for t in tweets) + seen_count += len(tweets) if last_since_id is not None: max_seen_id = max((last_since_id, max_seen_id)) last_since_id = max_seen_id @@ -217,7 +219,9 @@ def fetch_timeline( replace=True, ) if stop_after is not None: - break + if seen_count >= stop_after: + break + args["count"] = min(args["count"], stop_after - seen_count) time.sleep(sleep) ``` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
698791218 | |
https://github.com/simonw/sqlite-utils/issues/157#issuecomment-689850509 | https://api.github.com/repos/simonw/sqlite-utils/issues/157 | 689850509 | MDEyOklzc3VlQ29tbWVudDY4OTg1MDUwOQ== | 9599 | 2020-09-09T22:14:49Z | 2020-09-09T22:14:49Z | OWNER | It will call this method: https://github.com/simonw/sqlite-utils/blob/367082e787101fb90901ef3214804ab23a92ce46/sqlite_utils/db.py#L405-L411 | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
697179806 | |
https://github.com/simonw/sqlite-utils/issues/157#issuecomment-689850289 | https://api.github.com/repos/simonw/sqlite-utils/issues/157 | 689850289 | MDEyOklzc3VlQ29tbWVudDY4OTg1MDI4OQ== | 9599 | 2020-09-09T22:14:19Z | 2020-09-09T22:14:19Z | OWNER | This can accept four arguments: table, column, other_table, other_column: ``` sqlite-utils add-foreign-keys calands.db \ units_with_maps ACCESS_TYP ACCESS_TYP id \ units_with_maps AGNCY_NAME AGNCY_NAME id \ units_with_maps AGNCY_LEV AGNCY_LEV id ``` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
697179806 | |
https://github.com/simonw/sqlite-utils/pull/156#issuecomment-689735140 | https://api.github.com/repos/simonw/sqlite-utils/issues/156 | 689735140 | MDEyOklzc3VlQ29tbWVudDY4OTczNTE0MA== | 9599 | 2020-09-09T18:21:06Z | 2020-09-09T18:21:06Z | OWNER | Good spot, thanks. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
697030843 | |
https://github.com/simonw/datasette/issues/961#issuecomment-689635754 | https://api.github.com/repos/simonw/datasette/issues/961 | 689635754 | MDEyOklzc3VlQ29tbWVudDY4OTYzNTc1NA== | 9599 | 2020-09-09T15:24:31Z | 2020-09-09T15:24:31Z | OWNER | I thought about checking that every database in the `databases:` section exists and ditto for `tables:` - but actually I think it's useful to be able to keep a `metadata.yml` around with configuration for databases or tables that aren't currently attached to Datasette. I could treat those as warnings and output a warning to standard out when the server starts instead. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
696908389 | |
https://github.com/simonw/datasette/issues/961#issuecomment-689635094 | https://api.github.com/repos/simonw/datasette/issues/961 | 689635094 | MDEyOklzc3VlQ29tbWVudDY4OTYzNTA5NA== | 9599 | 2020-09-09T15:23:24Z | 2020-09-09T15:23:24Z | OWNER | Checks can include: - `facets:` lists columns that exist - `sort:` and `sort_desc:` columns - `fts_table` and `fts_pk` are valid | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
696908389 | |
https://github.com/dogsheep/dogsheep-beta/issues/17#issuecomment-689226390 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/17 | 689226390 | MDEyOklzc3VlQ29tbWVudDY4OTIyNjM5MA== | 9599 | 2020-09-09T00:36:07Z | 2020-09-09T00:36:07Z | MEMBER | Alternative names: - type - record_type - doctype I think `type` is right. It matches what Elasticsearch used to call their equivalent of this (before they removed the feature!). https://www.elastic.co/guide/en/elasticsearch/reference/current/removal-of-types.html | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
694500679 | |
https://github.com/simonw/sqlite-utils/issues/145#issuecomment-689186423 | https://api.github.com/repos/simonw/sqlite-utils/issues/145 | 689186423 | MDEyOklzc3VlQ29tbWVudDY4OTE4NjQyMw== | 9599 | 2020-09-08T23:21:23Z | 2020-09-08T23:21:23Z | OWNER | Fixed in PR #146. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
688659182 | |
https://github.com/simonw/sqlite-utils/pull/146#issuecomment-689185393 | https://api.github.com/repos/simonw/sqlite-utils/issues/146 | 689185393 | MDEyOklzc3VlQ29tbWVudDY4OTE4NTM5Mw== | 9599 | 2020-09-08T23:17:42Z | 2020-09-08T23:17:42Z | OWNER | That seems like a reasonable approach to me, especially since this is going to be a pretty rare edge-case. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
688668680 | |
https://github.com/simonw/sqlite-utils/issues/155#issuecomment-689166404 | https://api.github.com/repos/simonw/sqlite-utils/issues/155 | 689166404 | MDEyOklzc3VlQ29tbWVudDY4OTE2NjQwNA== | 9599 | 2020-09-08T22:20:03Z | 2020-09-08T22:20:03Z | OWNER | I'm going to update `sqlite-utils optimize` to also take an optional list of tables, for consistency. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
696045581 | |
https://github.com/simonw/sqlite-utils/issues/153#issuecomment-689165985 | https://api.github.com/repos/simonw/sqlite-utils/issues/153 | 689165985 | MDEyOklzc3VlQ29tbWVudDY4OTE2NTk4NQ== | 9599 | 2020-09-08T22:18:52Z | 2020-09-08T22:18:52Z | OWNER | I've reverted this change again, because it turns out using the `rebuild` FTS mechanism is a better way of repairing this issue - see #155. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
695377804 | |
https://github.com/simonw/sqlite-utils/issues/155#issuecomment-689163158 | https://api.github.com/repos/simonw/sqlite-utils/issues/155 | 689163158 | MDEyOklzc3VlQ29tbWVudDY4OTE2MzE1OA== | 9599 | 2020-09-08T22:10:27Z | 2020-09-08T22:10:27Z | OWNER | For the command version: sqlite-utils rebuild-fts mydb.db This will rebuild all detected FTS tables. You can also specify one or more explicit tables: sqlite-utils rebuild-fts mydb.db dogs | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
696045581 | |
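Under the hood an FTS5 rebuild is a single special `INSERT` against the FTS table itself. A minimal sketch of what a `rebuild-fts` command could execute, using an illustrative schema rather than sqlite-utils' actual implementation:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dogs (id INTEGER PRIMARY KEY, name TEXT);
CREATE VIRTUAL TABLE dogs_fts USING fts5(name, content='dogs', content_rowid='id');
INSERT INTO dogs (name) VALUES ('Cleo'), ('Pancakes');
""")
# The FTS5 'rebuild' command repopulates the index from the content table
conn.execute("INSERT INTO dogs_fts(dogs_fts) VALUES ('rebuild')")
rows = conn.execute(
    "SELECT name FROM dogs_fts WHERE dogs_fts MATCH 'cleo'"
).fetchall()
print(rows)  # [('Cleo',)]
```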
https://github.com/dogsheep/dogsheep-beta/issues/19#issuecomment-688626037 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/19 | 688626037 | MDEyOklzc3VlQ29tbWVudDY4ODYyNjAzNw== | 9599 | 2020-09-08T05:27:07Z | 2020-09-08T05:27:07Z | MEMBER | A really clever way to do this would be with triggers. The indexer script would add triggers to each of the database tables that it is indexing - each in their own database. Those triggers would then maintain a `_index_queue_` table. This table would record the primary key of rows that are added, modified or deleted. The indexer could then work by reading through the `_index_queue_` table, re-indexing (or deleting) just the primary keys listed there, and then emptying the queue once it has finished. This would add a small amount of overhead to insert/update/delete queries run against the table. My hunch is that the overhead would be minuscule, but I could still allow people to opt out for tables that are so high traffic that this would matter. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
695556681 | |
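The trigger idea described above can be sketched in a few lines of SQL - table and column names here are illustrative, not dogsheep-beta's actual schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT);
CREATE TABLE _index_queue_ (pk INTEGER, op TEXT);
-- Triggers record every touched primary key in the queue table
CREATE TRIGGER notes_ai AFTER INSERT ON notes BEGIN
  INSERT INTO _index_queue_ VALUES (new.id, 'upsert');
END;
CREATE TRIGGER notes_au AFTER UPDATE ON notes BEGIN
  INSERT INTO _index_queue_ VALUES (new.id, 'upsert');
END;
CREATE TRIGGER notes_ad AFTER DELETE ON notes BEGIN
  INSERT INTO _index_queue_ VALUES (old.id, 'delete');
END;
""")
conn.execute("INSERT INTO notes (body) VALUES ('hello')")
conn.execute("DELETE FROM notes WHERE id = 1")
queue = conn.execute("SELECT pk, op FROM _index_queue_").fetchall()
print(queue)  # [(1, 'upsert'), (1, 'delete')]
# The indexer would process these keys, then empty the queue:
conn.execute("DELETE FROM _index_queue_")
```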
https://github.com/dogsheep/dogsheep-beta/issues/19#issuecomment-688625430 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/19 | 688625430 | MDEyOklzc3VlQ29tbWVudDY4ODYyNTQzMA== | 9599 | 2020-09-08T05:24:50Z | 2020-09-08T05:24:50Z | MEMBER | I thought about allowing tables to define an incremental indexing SQL query - maybe something that can return just records touched in the past hour, or records since a recorded "last indexed record" value. The problem with this is deletes - if you delete a record, how does the indexer know to remove it? See #18 - that's already caused problems. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
695556681 | |
https://github.com/dogsheep/dogsheep-beta/issues/18#issuecomment-688623097 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/18 | 688623097 | MDEyOklzc3VlQ29tbWVudDY4ODYyMzA5Nw== | 9599 | 2020-09-08T05:15:51Z | 2020-09-08T05:15:51Z | MEMBER | I'm inclined to go with the first, simpler option. I have longer term plans for efficient incremental index updates based on clever trickery with triggers. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
695553522 | |
https://github.com/dogsheep/dogsheep-beta/issues/18#issuecomment-688622995 | https://api.github.com/repos/dogsheep/dogsheep-beta/issues/18 | 688622995 | MDEyOklzc3VlQ29tbWVudDY4ODYyMjk5NQ== | 9599 | 2020-09-08T05:15:21Z | 2020-09-08T05:15:21Z | MEMBER | Alternatively it could run as it does now but add a `DELETE FROM index1.search_index WHERE key not in (select key from ...)`. I'm not sure which would be more efficient. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
695553522 | |
https://github.com/simonw/sqlite-utils/pull/146#issuecomment-688573964 | https://api.github.com/repos/simonw/sqlite-utils/issues/146 | 688573964 | MDEyOklzc3VlQ29tbWVudDY4ODU3Mzk2NA== | 96218 | 2020-09-08T01:55:07Z | 2020-09-08T01:55:07Z | CONTRIBUTOR | Okay, I've rewritten this PR to preserve the batching behaviour but still fix #145, and rebased the branch to account for the `db.execute()` api change. It's not terribly sophisticated -- if it attempts to insert a batch which has too many variables, the exception is caught, the batch is split in two and each half is inserted separately, and then it carries on as before with the same `batch_size`. In the edge case where this gets triggered, subsequent batches will all be inserted in two groups too if they continue to have the same number of columns (which is presumably reasonably likely). Do you reckon this is acceptable when set against the awkwardness of recalculating the `batch_size` on the fly? | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
688668680 | |
https://github.com/simonw/sqlite-utils/issues/154#issuecomment-688544156 | https://api.github.com/repos/simonw/sqlite-utils/issues/154 | 688544156 | MDEyOklzc3VlQ29tbWVudDY4ODU0NDE1Ng== | 9599 | 2020-09-07T23:47:10Z | 2020-09-07T23:47:10Z | OWNER | This is already covered in the tests though: https://github.com/simonw/sqlite-utils/blob/deb2eb013ff85bbc828ebc244a9654f0d9c3139e/tests/test_cli.py#L1300-L1328 | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
695441530 | |
https://github.com/simonw/sqlite-utils/issues/154#issuecomment-688543128 | https://api.github.com/repos/simonw/sqlite-utils/issues/154 | 688543128 | MDEyOklzc3VlQ29tbWVudDY4ODU0MzEyOA== | 9599 | 2020-09-07T23:43:10Z | 2020-09-07T23:43:10Z | OWNER | Running this against the same file works: ``` $ sqlite3 beta.db SQLite version 3.31.1 2020-01-27 19:55:54 Enter ".help" for usage hints. sqlite> PRAGMA journal_mode=wal; wal ``` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
695441530 | |
https://github.com/simonw/sqlite-utils/issues/152#issuecomment-688500704 | https://api.github.com/repos/simonw/sqlite-utils/issues/152 | 688500704 | MDEyOklzc3VlQ29tbWVudDY4ODUwMDcwNA== | 9599 | 2020-09-07T20:28:45Z | 2020-09-07T21:17:48Z | OWNER | The principal reason to turn these on - at least so far - is that without it weird things happen where FTS tables (in particular `*_fts_docsize`) grow without limit over time, because calls to `INSERT OR REPLACE` against the parent table cause additional rows to be inserted into `*_fts_docsize` even if the row was replaced rather than being inserted. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
695376054 | |
https://github.com/simonw/sqlite-utils/issues/153#issuecomment-688511161 | https://api.github.com/repos/simonw/sqlite-utils/issues/153 | 688511161 | MDEyOklzc3VlQ29tbWVudDY4ODUxMTE2MQ== | 9599 | 2020-09-07T21:07:20Z | 2020-09-07T21:07:29Z | OWNER | FTS4 uses a different column name here: https://datasette-sqlite-fts4.datasette.io/24ways-fts4/articles_fts_docsize ``` CREATE TABLE 'articles_fts_docsize'(docid INTEGER PRIMARY KEY, size BLOB); ``` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
695377804 | |
https://github.com/simonw/sqlite-utils/pull/146#issuecomment-688508510 | https://api.github.com/repos/simonw/sqlite-utils/issues/146 | 688508510 | MDEyOklzc3VlQ29tbWVudDY4ODUwODUxMA== | 9599 | 2020-09-07T20:56:03Z | 2020-09-07T20:56:24Z | OWNER | The problem with this approach is that it requires us to consume the entire iterator before we can start inserting rows into the table - here on line 1052: https://github.com/simonw/sqlite-utils/blob/bb131793feac16bc7181ab997568f941b0220ef2/sqlite_utils/db.py#L1047-L1054 I designed the `.insert_all()` to avoid doing this, because I want to be able to pass it an iterator (or more likely a generator) that could produce potentially millions of records. Doing things one batch of 100 records at a time means that the Python process doesn't need to pull millions of records into memory at once. `db-to-sqlite` is one example of a tool that uses that characteristic, in https://github.com/simonw/db-to-sqlite/blob/63e4ee972f292de13bb11767c0fb64b35339d954/db_to_sqlite/cli.py#L94-L106 So we need to solve this issue without consuming the entire iterator with a `records = list(records)` call. I think one way to do this is to execute each chunk one at a time and watch out for an exception that indicates that we sent too many parameters - then adjust the chunk size down and try again. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
688668680 | |
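The catch-and-split approach discussed above can be sketched without consuming the iterator up front: attempt each chunk, and if SQLite rejects it for having too many variables, halve it and retry. The `insert_chunk` helper below is a stand-in for illustration, not sqlite-utils' actual internals:

```python
import sqlite3

def insert_chunk(conn, table, columns, chunk):
    # Build a multi-row INSERT: INSERT INTO t (a, b) VALUES (?, ?), (?, ?), ...
    placeholders = ", ".join(
        "({})".format(", ".join("?" * len(columns))) for _ in chunk
    )
    sql = "INSERT INTO {} ({}) VALUES {}".format(
        table, ", ".join(columns), placeholders
    )
    params = [value for row in chunk for value in row]
    try:
        conn.execute(sql, params)
    except sqlite3.OperationalError as e:
        # "too many SQL variables" - split the chunk in half and retry
        if "too many" in str(e) and len(chunk) > 1:
            mid = len(chunk) // 2
            insert_chunk(conn, table, columns, chunk[:mid])
            insert_chunk(conn, table, columns, chunk[mid:])
        else:
            raise

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a, b)")
insert_chunk(conn, "t", ["a", "b"], [(i, i * 2) for i in range(600)])
print(conn.execute("SELECT count(*) FROM t").fetchone()[0])  # 600
```

Whether the split triggers depends on the SQLite build's `SQLITE_MAX_VARIABLE_NUMBER`; either way every row ends up inserted.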
https://github.com/simonw/sqlite-utils/issues/153#issuecomment-688506015 | https://api.github.com/repos/simonw/sqlite-utils/issues/153 | 688506015 | MDEyOklzc3VlQ29tbWVudDY4ODUwNjAxNQ== | 9599 | 2020-09-07T20:46:58Z | 2020-09-07T20:46:58Z | OWNER | Writing a test for this will be a tiny bit tricky. I think I'll use a test that replicates the bug in #149. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
695377804 | |
https://github.com/simonw/sqlite-utils/issues/149#issuecomment-688501064 | https://api.github.com/repos/simonw/sqlite-utils/issues/149 | 688501064 | MDEyOklzc3VlQ29tbWVudDY4ODUwMTA2NA== | 9599 | 2020-09-07T20:30:15Z | 2020-09-07T20:30:38Z | OWNER | The second challenge here is cleaning up all of those junk rows in existing `*_fts_docsize` tables. Doing that just to the demo database from https://github-to-sqlite.dogsheep.net/github.db dropped its size from 22MB to 16MB! Here's the SQL: ```sql DELETE FROM [licenses_fts_docsize] WHERE id NOT IN ( SELECT rowid FROM [licenses_fts]); ``` I can do that as part of the existing `table.optimize()` method, which optimizes FTS tables. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
695319258 | |
https://github.com/simonw/sqlite-utils/issues/152#issuecomment-688500294 | https://api.github.com/repos/simonw/sqlite-utils/issues/152 | 688500294 | MDEyOklzc3VlQ29tbWVudDY4ODUwMDI5NA== | 9599 | 2020-09-07T20:27:07Z | 2020-09-07T20:27:07Z | OWNER | I'm going to make this an argument to the `Database()` class constructor which defaults to `True`. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
695376054 | |
https://github.com/simonw/sqlite-utils/issues/149#issuecomment-688499924 | https://api.github.com/repos/simonw/sqlite-utils/issues/149 | 688499924 | MDEyOklzc3VlQ29tbWVudDY4ODQ5OTkyNA== | 9599 | 2020-09-07T20:25:40Z | 2020-09-07T20:25:50Z | OWNER | https://www.sqlite.org/pragma.html#pragma_recursive_triggers says: > Prior to SQLite [version 3.6.18](https://www.sqlite.org/releaselog/3_6_18.html) (2009-09-11), recursive triggers were not supported. The behavior of SQLite was always as if this pragma was set to OFF. Support for recursive triggers was added in version 3.6.18 but was initially turned OFF by default, for compatibility. Recursive triggers may be turned on by default in future versions of SQLite. So I think the fix is to turn on `recursive_triggers` globally by default for `sqlite-utils`. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
695319258 | |
https://github.com/simonw/sqlite-utils/issues/149#issuecomment-688499650 | https://api.github.com/repos/simonw/sqlite-utils/issues/149 | 688499650 | MDEyOklzc3VlQ29tbWVudDY4ODQ5OTY1MA== | 9599 | 2020-09-07T20:24:35Z | 2020-09-07T20:24:35Z | OWNER | This replicates the problem: ``` (github-to-sqlite) /tmp % sqlite-utils tables --counts github.db | grep licenses {"table": "licenses", "count": 7}, {"table": "licenses_fts_data", "count": 35}, {"table": "licenses_fts_idx", "count": 16}, {"table": "licenses_fts_docsize", "count": 9151}, {"table": "licenses_fts_config", "count": 1}, {"table": "licenses_fts", "count": 7}, (github-to-sqlite) /tmp % github-to-sqlite repos github.db dogsheep (github-to-sqlite) /tmp % sqlite-utils tables --counts github.db | grep licenses {"table": "licenses", "count": 7}, {"table": "licenses_fts_data", "count": 45}, {"table": "licenses_fts_idx", "count": 26}, {"table": "licenses_fts_docsize", "count": 9161}, {"table": "licenses_fts_config", "count": 1}, {"table": "licenses_fts", "count": 7}, ``` Note how the number of rows in `licenses_fts_docsize` goes from 9151 to 9161. The number went up by ten. I used tracing from #151 to show that the following SQL executed ten times: ``` INSERT OR REPLACE INTO [licenses] ([key], [name], [node_id], [spdx_id], [url]) VALUES (?, ?, ?, ?, ?); ``` Then I tried executing `PRAGMA recursive_triggers=on;` at the start of the script. This fixed the problem - running the script did not increase the number of rows in `licenses_fts_docsize`. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
695319258 | |
https://github.com/simonw/sqlite-utils/issues/149#issuecomment-688482355 | https://api.github.com/repos/simonw/sqlite-utils/issues/149 | 688482355 | MDEyOklzc3VlQ29tbWVudDY4ODQ4MjM1NQ== | 9599 | 2020-09-07T19:22:51Z | 2020-09-07T19:22:51Z | OWNER | And the SQLite documentation says: > When the REPLACE conflict resolution strategy deletes rows in order to satisfy a constraint, [delete triggers](https://www.sqlite.org/lang_createtrigger.html) fire if and only if [recursive triggers](https://www.sqlite.org/pragma.html#pragma_recursive_triggers) are enabled. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
695319258 | |
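That documented behaviour can be reproduced without FTS at all: an `AFTER DELETE` trigger does not fire for the implicit delete performed by `INSERT OR REPLACE` unless recursive triggers are enabled. A minimal sketch (illustrative names, plain `sqlite3`):

```python
import sqlite3


def replace_fires_delete_trigger(recursive_triggers):
    """Return True if a delete trigger fires when INSERT OR REPLACE
    removes the conflicting row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "PRAGMA recursive_triggers = {}".format(
            "on" if recursive_triggers else "off"
        )
    )
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE log (id INTEGER)")
    conn.execute(
        "CREATE TRIGGER items_ad AFTER DELETE ON items "
        "BEGIN INSERT INTO log VALUES (old.id); END"
    )
    conn.execute("INSERT INTO items VALUES (1, 'first')")
    # Conflicts on the primary key, so REPLACE deletes the existing row
    conn.execute("INSERT OR REPLACE INTO items VALUES (1, 'second')")
    return conn.execute("SELECT count(*) FROM log").fetchone()[0] > 0
```

With the pragma off the log table stays empty, which is exactly why the FTS triggers left junk rows behind in `licenses_fts_docsize`.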
https://github.com/simonw/sqlite-utils/issues/149#issuecomment-688482055 | https://api.github.com/repos/simonw/sqlite-utils/issues/149 | 688482055 | MDEyOklzc3VlQ29tbWVudDY4ODQ4MjA1NQ== | 9599 | 2020-09-07T19:21:42Z | 2020-09-07T19:21:42Z | OWNER | Using `replace=True` there executes `INSERT OR REPLACE` - and Dan Kennedy (SQLite maintainer) on the SQLite forums said this: > Are you using "REPLACE INTO", or "UPDATE OR REPLACE" on the "licenses" table without having first executed "PRAGMA recursive_triggers = 1"? The docs note that delete triggers will not be fired in this case, which would explain things. Second paragraph under "REPLACE" here: > > https://www.sqlite.org/lang_conflict.html | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
695319258 | |
https://github.com/simonw/sqlite-utils/issues/149#issuecomment-688481374 | https://api.github.com/repos/simonw/sqlite-utils/issues/149 | 688481374 | MDEyOklzc3VlQ29tbWVudDY4ODQ4MTM3NA== | 9599 | 2020-09-07T19:19:08Z | 2020-09-07T19:19:08Z | OWNER | Reading through the code for `github-to-sqlite repos` - one of the things it does is call `save_license` for each repo: https://github.com/dogsheep/github-to-sqlite/blob/39b2234253096bd579feed4e25104698b8ccd2ba/github_to_sqlite/utils.py#L259-L262 ```python def save_license(db, license): if license is None: return None return db["licenses"].insert(license, pk="key", replace=True).last_pk ``` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
695319258 | |
https://github.com/simonw/sqlite-utils/pull/146#issuecomment-688481317 | https://api.github.com/repos/simonw/sqlite-utils/issues/146 | 688481317 | MDEyOklzc3VlQ29tbWVudDY4ODQ4MTMxNw== | 96218 | 2020-09-07T19:18:55Z | 2020-09-07T19:18:55Z | CONTRIBUTOR | Just force-pushed to update d042f9c with more formatting changes to satisfy `black==20.8b1` and pass the GitHub Actions "Test" workflow. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
688668680 |