{"html_url": "https://github.com/simonw/datasette/issues/1618#issuecomment-1028294089", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1618", "id": 1028294089, "node_id": "IC_kwDOBm6k_c49SoXJ", "user": {"value": 770231, "label": "strada"}, "created_at": "2022-02-02T19:42:03Z", "updated_at": "2022-02-02T19:42:03Z", "author_association": "NONE", "body": "Thanks for looking into this. It might have been nice if `explain` surfaced these function calls. Looks like `explain query plan` does, but only for basic queries.\r\n\r\n```\r\nsqlite-utils fixtures.db 'explain query plan select * from pragma_function_list(), pragma_database_list(), pragma_module_list()' -t\r\n id parent notused detail\r\n---- -------- --------- ------------------------------------------------\r\n 4 0 0 SCAN pragma_function_list VIRTUAL TABLE INDEX 0:\r\n 8 0 0 SCAN pragma_database_list VIRTUAL TABLE INDEX 0:\r\n 12 0 0 SCAN pragma_module_list VIRTUAL TABLE INDEX 0:\r\n```\r\n\r\n\r\n```\r\nsqlite-utils fixtures.db 'explain query plan select * from pragma_function_list() as fl, pragma_database_list() as dl, pragma_module_list() as ml' -t\r\n id parent notused detail\r\n---- -------- --------- ------------------------------\r\n 4 0 0 SCAN fl VIRTUAL TABLE INDEX 0:\r\n 8 0 0 SCAN dl VIRTUAL TABLE INDEX 0:\r\n 12 0 0 SCAN ml VIRTUAL TABLE INDEX 0:\r\n```\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1121121305, "label": "Reconsider policy on blocking queries containing the string \"pragma\""}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1618#issuecomment-1027659018", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1618", "id": 1027659018, "node_id": "IC_kwDOBm6k_c49QNUK", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-02-02T07:32:47Z", "updated_at": "2022-02-02T07:32:47Z", "author_association": "OWNER", "body": "I was hoping that `explain select ...` might be able to easily spot when people are calling PRAGMA functions, but this output doesn't look very helpful:\r\n```\r\n% sqlite-utils fixtures.db 'explain select * from pragma_database_list()' -t\r\n addr opcode p1 p2 p3 p4 p5 comment\r\n------ ----------- ---- ---- ---- ----------------- ---- ---------\r\n 0 Init 0 11 0 0\r\n 1 VOpen 0 0 0 vtab:7F9C90AC3070 0\r\n 2 Integer 0 1 0 0\r\n 3 Integer 0 2 0 0\r\n 4 VFilter 0 10 1 0\r\n 5 VColumn 0 0 3 0\r\n 6 VColumn 0 1 4 0\r\n 7 VColumn 0 2 5 0\r\n 8 ResultRow 3 3 0 0\r\n 9 VNext 0 5 0 0\r\n 10 Halt 0 0 0 0\r\n 11 Transaction 0 0 35 0 1\r\n 12 Goto 0 1 0 0\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1121121305, "label": "Reconsider policy on blocking queries containing the string \"pragma\""}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1618#issuecomment-1027656518", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1618", "id": 1027656518, "node_id": "IC_kwDOBm6k_c49QMtG", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-02-02T07:28:14Z", "updated_at": "2022-02-02T07:31:30Z", "author_association": "OWNER", "body": "I also need to consider if supposedly harmless side-effect free pragma functions could be used to work around the Datasette permissions system. My hunch is that wouldn't be a problem, because if you're allowing arbitrary SQL queries you're already letting people ignore the permissions system.\r\n\r\nOne example:\r\n```\r\nsqlite-utils fixtures.db 'pragma database_list' -t\r\n seq name file\r\n----- ------ ------------------------------------------------------\r\n 0 main /Users/simon/Dropbox/Development/datasette/fixtures.db\r\n```\r\nThough it looks like I already allow-listed that one in #761: https://latest.datasette.io/_memory?sql=select+*+from+pragma_database_list%28%29", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1121121305, "label": "Reconsider policy on blocking queries containing the string \"pragma\""}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1618#issuecomment-1027656000", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1618", "id": 1027656000, "node_id": "IC_kwDOBm6k_c49QMlA", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-02-02T07:27:14Z", "updated_at": "2022-02-02T07:27:14Z", "author_association": "OWNER", "body": "I also just realized that `pragma pragma_list` can be used to generate a list of all known pragmas for the connection:\r\n\r\n sqlite-utils fixtures.db 'pragma pragma_list' --fmt github\r\n\r\n| name |\r\n|---------------------------|\r\n| analysis_limit |\r\n| application_id |\r\n| auto_vacuum |\r\n| automatic_index |\r\n| busy_timeout |\r\n| cache_size |\r\n| cache_spill |\r\n| case_sensitive_like |\r\n| cell_size_check |\r\n| checkpoint_fullfsync |\r\n| collation_list |\r\n| compile_options |\r\n| count_changes |\r\n| data_version |\r\n| database_list |\r\n| default_cache_size |\r\n| defer_foreign_keys |\r\n| empty_result_callbacks |\r\n| encoding |\r\n| foreign_key_check |\r\n| foreign_key_list |\r\n| foreign_keys |\r\n| freelist_count |\r\n| full_column_names |\r\n| fullfsync |\r\n| function_list |\r\n| hard_heap_limit |\r\n| ignore_check_constraints |\r\n| incremental_vacuum |\r\n| index_info |\r\n| index_list |\r\n| index_xinfo |\r\n| integrity_check |\r\n| journal_mode |\r\n| journal_size_limit |\r\n| legacy_alter_table |\r\n| lock_proxy_file |\r\n| locking_mode |\r\n| max_page_count |\r\n| mmap_size |\r\n| module_list |\r\n| optimize |\r\n| page_count |\r\n| page_size |\r\n| pragma_list |\r\n| query_only |\r\n| quick_check |\r\n| read_uncommitted |\r\n| recursive_triggers |\r\n| reverse_unordered_selects |\r\n| schema_version |\r\n| secure_delete |\r\n| short_column_names |\r\n| shrink_memory |\r\n| soft_heap_limit |\r\n| synchronous |\r\n| table_info |\r\n| table_list |\r\n| table_xinfo |\r\n| temp_store |\r\n| temp_store_directory |\r\n| threads |\r\n| trusted_schema |\r\n| user_version |\r\n| wal_autocheckpoint |\r\n| wal_checkpoint |\r\n| writable_schema |\r\n\r\nSo I could use that list to create a much more specific regular expression, which would then allow the word \"pragma\" to be used more freely while still protecting against any known pragma function being called.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1121121305, "label": "Reconsider policy on blocking queries containing the string \"pragma\""}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1618#issuecomment-1027654979", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1618", "id": 1027654979, "node_id": "IC_kwDOBm6k_c49QMVD", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-02-02T07:25:22Z", "updated_at": "2022-02-02T07:25:22Z", "author_association": "OWNER", "body": "But... I just noticed something I had missed in the docs for https://www.sqlite.org/pragma.html#pragfunc\r\n\r\n> Table-valued functions exist only for PRAGMAs that return results and that have no side-effects.\r\n\r\nSo it's possible I'm being overly paranoid here after all: what I want to block here is people running things like `PRAGMA case_sensitive_like = 1` which could affect the global state for that connection and cause unexpected behaviour later on.\r\n\r\nSo maybe I should allow all pragma functions. I previously allowed an allow-list of them in:\r\n- #761", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1121121305, "label": "Reconsider policy on blocking queries containing the string \"pragma\""}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1618#issuecomment-1027653005", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1618", "id": 1027653005, "node_id": "IC_kwDOBm6k_c49QL2N", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-02-02T07:22:13Z", "updated_at": "2022-02-02T07:22:13Z", "author_association": "OWNER", "body": "There's a workaround for this at the moment, which is to use parameterized SQL queries. For example, this:\r\n\r\nhttps://fivethirtyeight.datasettes.com/polls?sql=select+*+from+books+where+title+%3D+%3Atitle&title=The+Pragmatic+Programmer\r\n\r\nSo the SQL query is `select * from books where title = :title` and then `&title=...` is added to the URL.\r\n\r\nThe reason behind the quite aggressive pragma filtering is that SQLite allows you to execute pragmas using function calls, like this one:\r\n\r\n```sql\r\nSELECT * FROM pragma_index_info('idx52');\r\n```\r\nThese can be nested arbitrarily deeply in sub-queries, so it's difficult to write a regular expression that will definitely catch them.\r\n\r\nI'm open to relaxing the regex a bit, but I need to be very confident that it's safe to do so.\r\n\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1121121305, "label": "Reconsider policy on blocking queries containing the string \"pragma\""}, "performed_via_github_app": null}