{"html_url": "https://github.com/simonw/datasette/issues/1618#issuecomment-1028294089", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1618", "id": 1028294089, "node_id": "IC_kwDOBm6k_c49SoXJ", "user": {"value": 770231, "label": "strada"}, "created_at": "2022-02-02T19:42:03Z", "updated_at": "2022-02-02T19:42:03Z", "author_association": "NONE", "body": "Thanks for looking into this. It might have been nice if `explain` surfaced these function calls. Looks like `explain query plan` does, but only for basic queries.\r\n\r\n```\r\nsqlite-utils fixtures.db 'explain query plan select * from pragma_function_list(), pragma_database_list(), pragma_module_list()' -t\r\n  id    parent    notused  detail\r\n----  --------  ---------  ------------------------------------------------\r\n   4         0          0  SCAN pragma_function_list VIRTUAL TABLE INDEX 0:\r\n   8         0          0  SCAN pragma_database_list VIRTUAL TABLE INDEX 0:\r\n  12         0          0  SCAN pragma_module_list VIRTUAL TABLE INDEX 0:\r\n```\r\n\r\n\r\n```\r\nsqlite-utils fixtures.db 'explain query plan select * from pragma_function_list() as fl, pragma_database_list() as dl, pragma_module_list() as ml' -t\r\n  id    parent    notused  detail\r\n----  --------  ---------  ------------------------------\r\n   4         0          0  SCAN fl VIRTUAL TABLE INDEX 0:\r\n   8         0          0  SCAN dl VIRTUAL TABLE INDEX 0:\r\n  12         0          0  SCAN ml VIRTUAL TABLE INDEX 0:\r\n```\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1121121305, "label": "Reconsider policy on blocking queries containing the string \"pragma\""}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1618#issuecomment-1027659018", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1618", "id": 1027659018, "node_id": "IC_kwDOBm6k_c49QNUK", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-02-02T07:32:47Z", "updated_at": "2022-02-02T07:32:47Z", "author_association": "OWNER", "body": "I was hoping that `explain select ...` might be able to easily spot when people are calling PRAGMA functions, but this output doesn't look very helpful:\r\n```\r\n% sqlite-utils fixtures.db 'explain select * from pragma_database_list()' -t\r\n  addr  opcode         p1    p2    p3  p4                   p5  comment\r\n------  -----------  ----  ----  ----  -----------------  ----  ---------\r\n     0  Init            0    11     0                        0\r\n     1  VOpen           0     0     0  vtab:7F9C90AC3070     0\r\n     2  Integer         0     1     0                        0\r\n     3  Integer         0     2     0                        0\r\n     4  VFilter         0    10     1                        0\r\n     5  VColumn         0     0     3                        0\r\n     6  VColumn         0     1     4                        0\r\n     7  VColumn         0     2     5                        0\r\n     8  ResultRow       3     3     0                        0\r\n     9  VNext           0     5     0                        0\r\n    10  Halt            0     0     0                        0\r\n    11  Transaction     0     0    35  0                     1\r\n    12  Goto            0     1     0                        0\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1121121305, "label": "Reconsider policy on blocking queries containing the string \"pragma\""}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1618#issuecomment-1027656518", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1618", "id": 1027656518, "node_id": "IC_kwDOBm6k_c49QMtG", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-02-02T07:28:14Z", "updated_at": "2022-02-02T07:31:30Z", "author_association": "OWNER", "body": "I also need to consider if supposedly harmless side-effect free pragma functions could be used to work around the Datasette permissions system. My hunch is that wouldn't be a problem, because if you're allowing arbitrary SQL queries you're already letting people ignore the permissions system.\r\n\r\nOne example:\r\n```\r\nsqlite-utils fixtures.db 'pragma database_list' -t\r\n  seq  name    file\r\n-----  ------  ------------------------------------------------------\r\n    0  main    /Users/simon/Dropbox/Development/datasette/fixtures.db\r\n```\r\nThough it looks like I already allow-listed that one in #761: https://latest.datasette.io/_memory?sql=select+*+from+pragma_database_list%28%29", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1121121305, "label": "Reconsider policy on blocking queries containing the string \"pragma\""}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1618#issuecomment-1027656000", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1618", "id": 1027656000, "node_id": "IC_kwDOBm6k_c49QMlA", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-02-02T07:27:14Z", "updated_at": "2022-02-02T07:27:14Z", "author_association": "OWNER", "body": "I also just realized that `pragma pragma_list` can be used to generate a list of all known pragmas for the connection:\r\n\r\n    sqlite-utils fixtures.db 'pragma pragma_list' --fmt github\r\n\r\n| name                      |\r\n|---------------------------|\r\n| analysis_limit            |\r\n| application_id            |\r\n| auto_vacuum               |\r\n| automatic_index           |\r\n| busy_timeout              |\r\n| cache_size                |\r\n| cache_spill               |\r\n| case_sensitive_like       |\r\n| cell_size_check           |\r\n| checkpoint_fullfsync      |\r\n| collation_list            |\r\n| compile_options           |\r\n| count_changes             |\r\n| data_version              |\r\n| database_list             |\r\n| default_cache_size        |\r\n| defer_foreign_keys        |\r\n| empty_result_callbacks    |\r\n| encoding                  |\r\n| foreign_key_check         |\r\n| foreign_key_list          |\r\n| foreign_keys              |\r\n| freelist_count            |\r\n| full_column_names         |\r\n| fullfsync                 |\r\n| function_list             |\r\n| hard_heap_limit           |\r\n| ignore_check_constraints  |\r\n| incremental_vacuum        |\r\n| index_info                |\r\n| index_list                |\r\n| index_xinfo               |\r\n| integrity_check           |\r\n| journal_mode              |\r\n| journal_size_limit        |\r\n| legacy_alter_table        |\r\n| lock_proxy_file           |\r\n| locking_mode              |\r\n| max_page_count            |\r\n| mmap_size                 |\r\n| module_list               |\r\n| optimize                  |\r\n| page_count                |\r\n| page_size                 |\r\n| pragma_list               |\r\n| query_only                |\r\n| quick_check               |\r\n| read_uncommitted          |\r\n| recursive_triggers        |\r\n| reverse_unordered_selects |\r\n| schema_version            |\r\n| secure_delete             |\r\n| short_column_names        |\r\n| shrink_memory             |\r\n| soft_heap_limit           |\r\n| synchronous               |\r\n| table_info                |\r\n| table_list                |\r\n| table_xinfo               |\r\n| temp_store                |\r\n| temp_store_directory      |\r\n| threads                   |\r\n| trusted_schema            |\r\n| user_version              |\r\n| wal_autocheckpoint        |\r\n| wal_checkpoint            |\r\n| writable_schema           |\r\n\r\nSo I could use that list to create a much more specific regular expression, which would then allow the word \"pragma\" to be used more freely while still protecting against any known pragma function being called.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1121121305, "label": "Reconsider policy on blocking queries containing the string \"pragma\""}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1618#issuecomment-1027654979", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1618", "id": 1027654979, "node_id": "IC_kwDOBm6k_c49QMVD", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-02-02T07:25:22Z", "updated_at": "2022-02-02T07:25:22Z", "author_association": "OWNER", "body": "But... I just noticed something I had missed in the docs for https://www.sqlite.org/pragma.html#pragfunc\r\n\r\n> Table-valued functions exist only for PRAGMAs that return results and that have no side-effects.\r\n\r\nSo it's possible I'm being overly paranoid here after all: what I want to block is people running statements like `PRAGMA case_sensitive_like = 1`, which could change the global state of that connection and cause unexpected behaviour later on.\r\n\r\nSo maybe I should allow all pragma functions. I previously added an allow-list of them in:\r\n- #761", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1121121305, "label": "Reconsider policy on blocking queries containing the string \"pragma\""}, "performed_via_github_app": null}
{"html_url": "https://github.com/simonw/datasette/issues/1618#issuecomment-1027653005", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1618", "id": 1027653005, "node_id": "IC_kwDOBm6k_c49QL2N", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-02-02T07:22:13Z", "updated_at": "2022-02-02T07:22:13Z", "author_association": "OWNER", "body": "There's a workaround for this at the moment, which is to use parameterized SQL queries. For example, this:\r\n\r\nhttps://fivethirtyeight.datasettes.com/polls?sql=select+*+from+books+where+title+%3D+%3Atitle&title=The+Pragmatic+Programmer\r\n\r\nSo the SQL query is `select * from books where title = :title` and then `&title=...` is added to the URL.\r\n\r\nThe reason behind the quite aggressive pragma filtering is that SQLite allows you to execute pragmas using function calls, like this one:\r\n\r\n```sql\r\nSELECT * FROM pragma_index_info('idx52');\r\n```\r\nThese can be nested arbitrarily deeply in sub-queries, so it's difficult to write a regular expression that will definitely catch them.\r\n\r\nI'm open to relaxing the regex a bit, but I need to be very confident that it's safe to do so.\r\n\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1121121305, "label": "Reconsider policy on blocking queries containing the string \"pragma\""}, "performed_via_github_app": null}