{"html_url": "https://github.com/simonw/datasette/issues/1679#issuecomment-1074331743", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1679", "id": 1074331743, "node_id": "IC_kwDOBm6k_c5ACQBf", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T19:30:05Z", "updated_at": "2022-03-21T19:30:05Z", "author_association": "OWNER", "body": "https://github.com/simonw/datasette/blob/1a7750eb29fd15dd2eea3b9f6e33028ce441b143/datasette/app.py#L118-L122 sets it to 50ms for facet suggestion but that's not going to pass `ms < 50`:\r\n\r\n```python\r\n Setting(\r\n \"facet_suggest_time_limit_ms\",\r\n 50,\r\n \"Time limit for calculating a suggested facet\",\r\n ),\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1175854982, "label": "Research: how much overhead does the n=1 time limit have?"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1679#issuecomment-1074332325", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1679", "id": 1074332325, "node_id": "IC_kwDOBm6k_c5ACQKl", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T19:30:44Z", "updated_at": "2022-03-21T19:30:44Z", "author_association": "OWNER", "body": "So it looks like even for facet suggestion `n=1000` always - it's never reduced to `n=1`.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1175854982, "label": "Research: how much overhead does the n=1 time limit have?"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1679#issuecomment-1074332718", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1679", "id": 1074332718, "node_id": "IC_kwDOBm6k_c5ACQQu", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T19:31:10Z", "updated_at": "2022-03-21T19:31:10Z", "author_association": "OWNER", "body": "How long does it take for SQLite to execute 1000 opcodes anyway?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1175854982, "label": "Research: how much overhead does the n=1 time limit have?"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1679#issuecomment-1074337997", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1679", "id": 1074337997, "node_id": "IC_kwDOBm6k_c5ACRjN", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T19:37:08Z", "updated_at": "2022-03-21T19:37:08Z", "author_association": "OWNER", "body": "This is weird:\r\n```python\r\nimport sqlite3\r\n\r\ndb = sqlite3.connect(\":memory:\")\r\n\r\ni = 0\r\n\r\ndef count():\r\n global i\r\n i += 1\r\n\r\n\r\ndb.set_progress_handler(count, 1)\r\n\r\ndb.execute(\"\"\"\r\nwith recursive counter(x) as (\r\n select 0\r\n union\r\n select x + 1 from counter\r\n)\r\nselect * from counter limit 10000;\r\n\"\"\")\r\n\r\nprint(i)\r\n```\r\nOutputs `24`. But if you try the same thing in the SQLite console:\r\n```\r\nsqlite> .stats vmstep\r\nsqlite> with recursive counter(x) as (\r\n ...> select 0\r\n ...> union\r\n ...> select x + 1 from counter\r\n ...> )\r\n ...> select * from counter limit 10000;\r\n...\r\nVM-steps: 200007\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1175854982, "label": "Research: how much overhead does the n=1 time limit have?"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1679#issuecomment-1074341924", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1679", "id": 1074341924, "node_id": "IC_kwDOBm6k_c5ACSgk", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T19:42:08Z", "updated_at": "2022-03-21T19:42:08Z", "author_association": "OWNER", "body": "Here's the Python-C implementation of `set_progress_handler`: https://github.com/python/cpython/blob/4674fd4e938eb4a29ccd5b12c15455bd2a41c335/Modules/_sqlite/connection.c#L1177-L1201\r\n\r\nIt calls `sqlite3_progress_handler(self->db, n, progress_callback, ctx);`\r\n\r\nhttps://www.sqlite.org/c3ref/progress_handler.html says:\r\n\r\n> The parameter N is the approximate number of [virtual machine instructions](https://www.sqlite.org/opcode.html) that are evaluated between successive invocations of the callback X\r\n\r\nSo maybe VM-steps and virtual machine instructions are different things?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1175854982, "label": "Research: how much overhead does the n=1 time limit have?"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1679#issuecomment-1074347023", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1679", "id": 1074347023, "node_id": "IC_kwDOBm6k_c5ACTwP", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T19:48:59Z", "updated_at": "2022-03-21T19:48:59Z", "author_association": "OWNER", "body": "Posed a question about that here: https://sqlite.org/forum/forumpost/de9ff10fa7", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1175854982, "label": "Research: how much overhead does the n=1 time limit have?"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1679#issuecomment-1074439309", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1679", "id": 1074439309, "node_id": "IC_kwDOBm6k_c5ACqSN", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T21:28:58Z", "updated_at": "2022-03-21T21:28:58Z", "author_association": "OWNER", "body": "David Raymond solved it there: https://sqlite.org/forum/forumpost/330c8532d8a88bcd\r\n\r\n> Don't forget to step through the results. All .execute() has done is prepared it.\r\n>\r\n> db.execute(query).fetchall()\r\n\r\nSure enough, adding that gets the VM steps number up to 190,007 which is close enough that I'm happy.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1175854982, "label": "Research: how much overhead does the n=1 time limit have?"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1679#issuecomment-1074446576", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1679", "id": 1074446576, "node_id": "IC_kwDOBm6k_c5ACsDw", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T21:38:27Z", "updated_at": "2022-03-21T21:38:27Z", "author_association": "OWNER", "body": "OK here's a microbenchmark script:\r\n```python\r\nimport sqlite3\r\nimport timeit\r\n\r\ndb = sqlite3.connect(\":memory:\")\r\ndb_with_progress_handler_1 = sqlite3.connect(\":memory:\")\r\ndb_with_progress_handler_1000 = sqlite3.connect(\":memory:\")\r\n\r\ndb_with_progress_handler_1.set_progress_handler(lambda: None, 1)\r\ndb_with_progress_handler_1000.set_progress_handler(lambda: None, 1000)\r\n\r\ndef execute_query(db):\r\n cursor = db.execute(\"\"\"\r\n with recursive counter(x) as (\r\n select 0\r\n union\r\n select x + 1 from counter\r\n )\r\n select * from counter limit 10000;\r\n \"\"\")\r\n list(cursor.fetchall())\r\n\r\n\r\nprint(\"Without progress_handler\")\r\nprint(timeit.timeit(lambda: execute_query(db), number=100))\r\n\r\nprint(\"progress_handler every 1000 ops\")\r\nprint(timeit.timeit(lambda: execute_query(db_with_progress_handler_1000), number=100))\r\n\r\nprint(\"progress_handler every 1 op\")\r\nprint(timeit.timeit(lambda: execute_query(db_with_progress_handler_1), number=100))\r\n```\r\nResults:\r\n```\r\n% python3 bench.py\r\nWithout progress_handler\r\n0.8789225700311363\r\nprogress_handler every 1000 ops\r\n0.8829826560104266\r\nprogress_handler every 1 op\r\n2.8892734259716235\r\n```\r\n\r\nSo running every 1000 ops makes almost no difference at all, but running every single op is a 3.2x performance degradation.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1175854982, "label": "Research: how much overhead does the n=1 time limit have?"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1679#issuecomment-1074454687", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1679", "id": 1074454687, "node_id": "IC_kwDOBm6k_c5ACuCf", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T21:48:02Z", "updated_at": "2022-03-21T21:48:02Z", "author_association": "OWNER", "body": "Here's another microbenchmark that measures how many nanoseconds it takes to run 1,000 vmops:\r\n\r\n```python\r\nimport sqlite3\r\nimport time\r\n\r\ndb = sqlite3.connect(\":memory:\")\r\n\r\ni = 0\r\nout = []\r\n\r\ndef count():\r\n global i\r\n i += 1000\r\n out.append(((i, time.perf_counter_ns())))\r\n\r\ndb.set_progress_handler(count, 1000)\r\n\r\nprint(\"Start:\", time.perf_counter_ns())\r\nall = db.execute(\"\"\"\r\nwith recursive counter(x) as (\r\n select 0\r\n union\r\n select x + 1 from counter\r\n)\r\nselect * from counter limit 10000;\r\n\"\"\").fetchall()\r\nprint(\"End:\", time.perf_counter_ns())\r\n\r\nprint()\r\nprint(\"So how long does it take to execute 1000 ops?\")\r\n\r\nprev_time_ns = None\r\nfor i, time_ns in out:\r\n if prev_time_ns is not None:\r\n print(time_ns - prev_time_ns, \"ns\")\r\n prev_time_ns = time_ns\r\n```\r\nRunning it:\r\n```\r\n% python nanobench.py\r\nStart: 330877620374821\r\nEnd: 330877632515822\r\n\r\nSo how long does it take to execute 1000 ops?\r\n47290 ns\r\n49573 ns\r\n48226 ns\r\n45674 ns\r\n53238 ns\r\n47313 ns\r\n52346 ns\r\n48689 ns\r\n47092 ns\r\n87596 ns\r\n69999 ns\r\n52522 ns\r\n52809 ns\r\n53259 ns\r\n52478 ns\r\n53478 ns\r\n65812 ns\r\n```\r\n87596ns is 0.087596ms - so even a measure rate of every 1000 ops is easily finely grained enough to capture differences of less than 0.1ms.\r\n\r\nIf anything I could bump that default 1000 up - and I can definitely eliminate the `if ms < 50` branch entirely.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1175854982, "label": "Research: how much overhead does the n=1 time limit have?"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1679#issuecomment-1074458506", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1679", "id": 1074458506, "node_id": "IC_kwDOBm6k_c5ACu-K", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T21:53:47Z", "updated_at": "2022-03-21T21:53:47Z", "author_association": "OWNER", "body": "Oh interesting, it turns out there is ONE place in the code that sets the `ms` to less than 20 - this test fixture: https://github.com/simonw/datasette/blob/4e47a2d894b96854348343374c8e97c9d7055cf6/tests/fixtures.py#L224-L226", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1175854982, "label": "Research: how much overhead does the n=1 time limit have?"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1679#issuecomment-1074459746", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1679", "id": 1074459746, "node_id": "IC_kwDOBm6k_c5ACvRi", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T21:55:45Z", "updated_at": "2022-03-21T21:55:45Z", "author_association": "OWNER", "body": "I'm going to change the original logic to set n=1 for times that are `<= 20ms` - and update the comments to make it more obvious what is happening.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1175854982, "label": "Research: how much overhead does the n=1 time limit have?"}, "performed_via_github_app": null}