{"html_url": "https://github.com/simonw/datasette/issues/2147#issuecomment-1686747420", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/2147", "id": 1686747420, "node_id": "IC_kwDOBm6k_c5kibkc", "user": {"value": 9599, "label": "simonw"}, "created_at": "2023-08-21T17:31:42Z", "updated_at": "2023-08-21T17:34:19Z", "author_association": "OWNER", "body": "Are you talking just about queries submitted to `/database?sql=` using the interface on https://latest.datasette.io/fixtures?sql=select+*+from+facetable or are you interested in queries that are run to power other pages like https://latest.datasette.io/fixtures/facetable as well? I'll assume the former.\r\n\r\nThere are a few ways you could solve this at the moment.\r\n\r\nThe easiest would be with a piece of ASGI middleware that looks for URLs matching `/dbname?sql=...` and logs those. I played with a version of that a few years ago: https://simonwillison.net/2019/Dec/16/logging-sqlite-asgi-middleware/ - see also https://github.com/simonw/asgi-log-to-sqlite\r\n\r\nThen you can load that middleware from a plugin using https://docs.datasette.io/en/stable/plugin_hooks.html#asgi-wrapper-datasette\r\n\r\nThat feels a bit delicate because it's relying on the URL design not changing, but I'm happy to confirm that URL is going to stay the same for Datasette 1.0 and I have no plans to change it ever.\r\n\r\nThere's also a tracing mechanism built into Datasette itself that you could hook into. The internals of that are documented here: https://docs.datasette.io/en/stable/internals.html#datasette-tracer - but I don't yet consider it a 100% stable API. I don't plan to change it but I won't promise not to either.\r\n\r\nI used that mechanism in this plugin: https://datasette.io/plugins/datasette-pretty-traces - demonstrated here: https://latest-with-plugins.datasette.io/github/commits?_trace=1\r\n\r\nThe hackiest way to do this would be to patch Datasette itself and try to replace the `query_view`. 
This definitely isn't a documented, stable API though and would be very likely to break at arbitrary points in the future.\r\n\r\nSo my recommendation for the moment is the ASGI middleware option.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1858228057, "label": "Plugin hook for database queries that are run"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/2147#issuecomment-1686749342", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/2147", "id": 1686749342, "node_id": "IC_kwDOBm6k_c5kicCe", "user": {"value": 9599, "label": "simonw"}, "created_at": "2023-08-21T17:33:11Z", "updated_at": "2023-08-21T17:33:11Z", "author_association": "OWNER", "body": "I'm definitely open to suggestions for plugin hooks that might make this kind of thing easier.\r\n\r\nOne idea I've been mulling is whether there should be a plugin hook that fires on arbitrary views - similar to Django's `process_view` mechanism: https://docs.djangoproject.com/en/4.2/topics/http/middleware/#process-view\r\n\r\nThat would allow people to set up code that runs before or after any of the default views in Datasette.\r\n\r\nI'm not yet 100% sold on the idea, because I worry about implementing it in a way that guarantees plugins won't break on future releases. 
But I'm open to considering it.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1858228057, "label": "Plugin hook for database queries that are run"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/2147#issuecomment-1687433388", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/2147", "id": 1687433388, "node_id": "IC_kwDOBm6k_c5klDCs", "user": {"value": 18899, "label": "jackowayed"}, "created_at": "2023-08-22T05:05:33Z", "updated_at": "2023-08-22T05:05:33Z", "author_association": "NONE", "body": "Thanks for all this! You're totally right that the ASGI option is doable, if a bit low level and coupled to the current URI design. I'm totally fine with that being the final answer.\r\n\r\nprocess_view is interesting and in the general direction of what I had in mind.\r\n\r\nA somewhat less powerful idea: Is there value in giving a hook for just the query that's about to be run? Maybe I'm thinking a little narrowly about this problem I decided I wanted to solve, but I could see other uses for a hook of the sketch below:\r\n\r\n```\r\ndef prepare_query(database, table, query):\r\n \"\"\"Modify query that is about to be run in some way. Return the (possibly rewritten) query to run, or None to disallow running the query\"\"\"\r\n```\r\n(Maybe you actually want to return a tuple so there can be an error message when you disallow, or something.)\r\n\r\nMaybe it's too narrowly useful and some of the other pieces of datasette obviate some of these ideas, but off the cuff I could imagine using it to:\r\n* Require a LIMIT. Either fail the query or add the limit if it's not there.\r\n* Do logging, like my usecase.\r\n* Do other analysis on whether you want to allow the query to run; a linter? query complexity? \r\n\r\nDefinitely feel free to say no, or not now. 
This is all me just playing around with what datasette and its plugin architecture can do with toy ideas, so don't let me push you to commit to a hook you don't feel confident fits well in the design.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1858228057, "label": "Plugin hook for database queries that are run"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/2147#issuecomment-1689206170", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/2147", "id": 1689206170, "node_id": "IC_kwDOBm6k_c5krz2a", "user": {"value": 9599, "label": "simonw"}, "created_at": "2023-08-23T03:05:32Z", "updated_at": "2023-08-23T03:05:32Z", "author_association": "OWNER", "body": "Interestingly enough there's actually a mechanism that looks like that a bit already: https://github.com/simonw/datasette/blob/64fd1d788eeed2624f107ac699f2370590ae1aa3/datasette/views/database.py#L496-L508\r\n\r\nThat `validate_sql_select()` function is defined here: https://github.com/simonw/datasette/blob/64fd1d788eeed2624f107ac699f2370590ae1aa3/datasette/utils/__init__.py#L256-L265\r\n\r\nAgainst these constants:\r\n\r\nhttps://github.com/simonw/datasette/blob/64fd1d788eeed2624f107ac699f2370590ae1aa3/datasette/utils/__init__.py#L223-L253\r\n\r\nWhich isn't a million miles away from your suggestion to have a hook that can say if the query should be executed or not.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1858228057, "label": "Plugin hook for database queries that are run"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/2147#issuecomment-1689206768", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/2147", "id": 1689206768, "node_id": 
"IC_kwDOBm6k_c5krz_w", "user": {"value": 9599, "label": "simonw"}, "created_at": "2023-08-23T03:06:32Z", "updated_at": "2023-08-23T03:06:32Z", "author_association": "OWNER", "body": "I'm less convinced by the \"rewrite the query in some way\" optional idea. What kind of use-cases can you imagine for that?\r\n\r\nMy hunch is that it's much more likely to cause weird breakages than it is to allow for useful plugin extensions, but I'm willing to be convinced otherwise.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1858228057, "label": "Plugin hook for database queries that are run"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/2147#issuecomment-1690955706", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/2147", "id": 1690955706, "node_id": "IC_kwDOBm6k_c5kye-6", "user": {"value": 18899, "label": "jackowayed"}, "created_at": "2023-08-24T03:54:35Z", "updated_at": "2023-08-24T03:54:35Z", "author_association": "NONE", "body": "That's fair. The best idea I can think of is that if a plugin wanted to limit intensive queries, it could add LIMITs or something. A hook that gives you visibility of queries and maybe the option to reject felt a little more limited than the existing plugin hooks, so I was trying to think of what else one might want to do while looking at to-be-run queries.\r\n\r\nBut without a real motivating example, I see why you don't want to add that.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1858228057, "label": "Plugin hook for database queries that are run"}, "performed_via_github_app": null}