{"html_url": "https://github.com/simonw/datasette/issues/1576#issuecomment-1030530071", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1576", "id": 1030530071, "node_id": "IC_kwDOBm6k_c49bKQX", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-02-05T05:21:35Z", "updated_at": "2022-02-05T05:21:35Z", "author_association": "OWNER", "body": "New documentation section: https://docs.datasette.io/en/latest/internals.html#datasette-tracer", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1087181951, "label": "Traces should include SQL executed by subtasks created with `asyncio.gather`"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1576#issuecomment-1030528532", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1576", "id": 1030528532, "node_id": "IC_kwDOBm6k_c49bJ4U", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-02-05T05:09:57Z", "updated_at": "2022-02-05T05:09:57Z", "author_association": "OWNER", "body": "Needs documentation. 
I'll document `from datasette.tracer import trace` too.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1087181951, "label": "Traces should include SQL executed by subtasks created with `asyncio.gather`"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1576#issuecomment-1030525218", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1576", "id": 1030525218, "node_id": "IC_kwDOBm6k_c49bJEi", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-02-05T04:45:11Z", "updated_at": "2022-02-05T04:45:11Z", "author_association": "OWNER", "body": "Got a prototype working with `contextvars` - it identified two parallel executing queries using the patch from above:\r\n\r\n![CleanShot 2022-02-04 at 20 41 50@2x](https://user-images.githubusercontent.com/9599/152628949-cf766b13-13cf-4831-b48d-2f23cadb6a05.png)\r\n\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1087181951, "label": "Traces should include SQL executed by subtasks created with `asyncio.gather`"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1576#issuecomment-1017112543", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1576", "id": 1017112543, "node_id": "IC_kwDOBm6k_c48n-ff", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-01-20T04:35:00Z", "updated_at": "2022-02-05T04:33:46Z", "author_association": "OWNER", "body": "I dropped support for Python 3.6 in fae3983c51f4a3aca8335f3e01ff85ef27076fbf so now free to use `contextvars` for this.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1087181951, "label": "Traces 
should include SQL executed by subtasks created with `asyncio.gather`"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1576#issuecomment-1027635925", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1576", "id": 1027635925, "node_id": "IC_kwDOBm6k_c49QHrV", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-02-02T06:47:20Z", "updated_at": "2022-02-02T06:47:20Z", "author_association": "OWNER", "body": "Here's what I was hacking around with when I uncovered this problem:\r\n```diff\r\ndiff --git a/datasette/views/table.py b/datasette/views/table.py\r\nindex 77fb285..8c57d08 100644\r\n--- a/datasette/views/table.py\r\n+++ b/datasette/views/table.py\r\n@@ -1,3 +1,4 @@\r\n+import asyncio\r\n import urllib\r\n import itertools\r\n import json\r\n@@ -615,44 +616,37 @@ class TableView(RowTableShared):\r\n if request.args.get(\"_timelimit\"):\r\n extra_args[\"custom_time_limit\"] = int(request.args.get(\"_timelimit\"))\r\n \r\n- # Execute the main query!\r\n- results = await db.execute(sql, params, truncate=True, **extra_args)\r\n-\r\n- # Calculate the total count for this query\r\n- filtered_table_rows_count = None\r\n- if (\r\n- not db.is_mutable\r\n- and self.ds.inspect_data\r\n- and count_sql == f\"select count(*) from {table} \"\r\n- ):\r\n- # We can use a previously cached table row count\r\n- try:\r\n- filtered_table_rows_count = self.ds.inspect_data[database][\"tables\"][\r\n- table\r\n- ][\"count\"]\r\n- except KeyError:\r\n- pass\r\n-\r\n- # Otherwise run a select count(*) ...\r\n- if count_sql and filtered_table_rows_count is None and not nocount:\r\n- try:\r\n- count_rows = list(await db.execute(count_sql, from_sql_params))\r\n- filtered_table_rows_count = count_rows[0][0]\r\n- except QueryInterrupted:\r\n- pass\r\n-\r\n- # Faceting\r\n- if not self.ds.setting(\"allow_facet\") and any(\r\n- arg.startswith(\"_facet\") for arg in request.args\r\n- ):\r\n- raise BadRequest(\"_facet= is not 
allowed\")\r\n+ async def execute_count():\r\n+ # Calculate the total count for this query\r\n+ filtered_table_rows_count = None\r\n+ if (\r\n+ not db.is_mutable\r\n+ and self.ds.inspect_data\r\n+ and count_sql == f\"select count(*) from {table} \"\r\n+ ):\r\n+ # We can use a previously cached table row count\r\n+ try:\r\n+ filtered_table_rows_count = self.ds.inspect_data[database][\r\n+ \"tables\"\r\n+ ][table][\"count\"]\r\n+ except KeyError:\r\n+ pass\r\n+\r\n+ if count_sql and filtered_table_rows_count is None and not nocount:\r\n+ try:\r\n+ count_rows = list(await db.execute(count_sql, from_sql_params))\r\n+ filtered_table_rows_count = count_rows[0][0]\r\n+ except QueryInterrupted:\r\n+ pass\r\n+\r\n+ return filtered_table_rows_count\r\n+\r\n+ filtered_table_rows_count = await execute_count()\r\n \r\n # pylint: disable=no-member\r\n facet_classes = list(\r\n itertools.chain.from_iterable(pm.hook.register_facet_classes())\r\n )\r\n- facet_results = {}\r\n- facets_timed_out = []\r\n facet_instances = []\r\n for klass in facet_classes:\r\n facet_instances.append(\r\n@@ -668,33 +662,58 @@ class TableView(RowTableShared):\r\n )\r\n )\r\n \r\n- if not nofacet:\r\n- for facet in facet_instances:\r\n- (\r\n- instance_facet_results,\r\n- instance_facets_timed_out,\r\n- ) = await facet.facet_results()\r\n- for facet_info in instance_facet_results:\r\n- base_key = facet_info[\"name\"]\r\n- key = base_key\r\n- i = 1\r\n- while key in facet_results:\r\n- i += 1\r\n- key = f\"{base_key}_{i}\"\r\n- facet_results[key] = facet_info\r\n- facets_timed_out.extend(instance_facets_timed_out)\r\n-\r\n- # Calculate suggested facets\r\n- suggested_facets = []\r\n- if (\r\n- self.ds.setting(\"suggest_facets\")\r\n- and self.ds.setting(\"allow_facet\")\r\n- and not _next\r\n- and not nofacet\r\n- and not nosuggest\r\n- ):\r\n- for facet in facet_instances:\r\n- suggested_facets.extend(await facet.suggest())\r\n+ async def execute_suggested_facets():\r\n+ # Calculate suggested 
facets\r\n+ suggested_facets = []\r\n+ if (\r\n+ self.ds.setting(\"suggest_facets\")\r\n+ and self.ds.setting(\"allow_facet\")\r\n+ and not _next\r\n+ and not nofacet\r\n+ and not nosuggest\r\n+ ):\r\n+ for facet in facet_instances:\r\n+ suggested_facets.extend(await facet.suggest())\r\n+ return suggested_facets\r\n+\r\n+ async def execute_facets():\r\n+ facet_results = {}\r\n+ facets_timed_out = []\r\n+ if not self.ds.setting(\"allow_facet\") and any(\r\n+ arg.startswith(\"_facet\") for arg in request.args\r\n+ ):\r\n+ raise BadRequest(\"_facet= is not allowed\")\r\n+\r\n+ if not nofacet:\r\n+ for facet in facet_instances:\r\n+ (\r\n+ instance_facet_results,\r\n+ instance_facets_timed_out,\r\n+ ) = await facet.facet_results()\r\n+ for facet_info in instance_facet_results:\r\n+ base_key = facet_info[\"name\"]\r\n+ key = base_key\r\n+ i = 1\r\n+ while key in facet_results:\r\n+ i += 1\r\n+ key = f\"{base_key}_{i}\"\r\n+ facet_results[key] = facet_info\r\n+ facets_timed_out.extend(instance_facets_timed_out)\r\n+\r\n+ return facet_results, facets_timed_out\r\n+\r\n+ # Execute the main query, facets and facet suggestions in parallel:\r\n+ (\r\n+ results,\r\n+ suggested_facets,\r\n+ (facet_results, facets_timed_out),\r\n+ ) = await asyncio.gather(\r\n+ db.execute(sql, params, truncate=True, **extra_args),\r\n+ execute_suggested_facets(),\r\n+ execute_facets(),\r\n+ )\r\n+\r\n+ results = await db.execute(sql, params, truncate=True, **extra_args)\r\n \r\n # Figure out columns and rows for the query\r\n columns = [r[0] for r in results.description]\r\n```\r\nIt's a hacky attempt at running some of the table page queries in parallel to see what happens.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1087181951, "label": "Traces should include SQL executed by subtasks created with `asyncio.gather`"}, "performed_via_github_app": null} {"html_url": 
"https://github.com/simonw/datasette/issues/1576#issuecomment-1000935523", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1576", "id": 1000935523, "node_id": "IC_kwDOBm6k_c47qRBj", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-24T21:33:05Z", "updated_at": "2021-12-24T21:33:05Z", "author_association": "OWNER", "body": "Another option would be to attempt to import `contextvars` and, if the import fails (for Python 3.6) continue using the current mechanism - then let Python 3.6 users know in the documentation that under Python 3.6 they will miss out on nested traces.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1087181951, "label": "Traces should include SQL executed by subtasks created with `asyncio.gather`"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1576#issuecomment-999990414", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1576", "id": 999990414, "node_id": "IC_kwDOBm6k_c47mqSO", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-23T02:08:39Z", "updated_at": "2021-12-23T18:16:35Z", "author_association": "OWNER", "body": "It's tiny: I'm tempted to vendor it. 
https://github.com/Skyscanner/aiotask-context/blob/master/aiotask_context/__init__.py\r\n\r\nNo, I'll add it as a pinned dependency, which I can then drop when I drop 3.6 support.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1087181951, "label": "Traces should include SQL executed by subtasks created with `asyncio.gather`"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1576#issuecomment-999987418", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1576", "id": 999987418, "node_id": "IC_kwDOBm6k_c47mpja", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-23T01:59:58Z", "updated_at": "2021-12-23T02:02:12Z", "author_association": "OWNER", "body": "Another option: https://github.com/Skyscanner/aiotask-context - looks like it might be better as it's been updated for Python 3.7 in this commit https://github.com/Skyscanner/aiotask-context/commit/67108c91d2abb445655cc2af446fdb52ca7890c4\r\n\r\nThe Skyscanner one doesn't attempt to wrap any existing factories, but that's OK for my purposes since I don't need to handle arbitrary `asyncio` code written by other people.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1087181951, "label": "Traces should include SQL executed by subtasks created with `asyncio.gather`"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1576#issuecomment-999876666", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1576", "id": 999876666, "node_id": "IC_kwDOBm6k_c47mOg6", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-22T20:59:22Z", "updated_at": "2021-12-22T21:18:09Z", "author_association": "OWNER", "body": "This article is relevant: [Context 
information storage for asyncio](https://blog.sqreen.com/asyncio/) - in particular the section https://blog.sqreen.com/asyncio/#context-inheritance-between-tasks which describes exactly the problem I have and their solution, which involves this trickery:\r\n\r\n```python\r\ndef request_task_factory(loop, coro):\r\n child_task = asyncio.tasks.Task(coro, loop=loop)\r\n parent_task = asyncio.Task.current_task(loop=loop)\r\n current_request = getattr(parent_task, 'current_request', None)\r\n setattr(child_task, 'current_request', current_request)\r\n return child_task\r\n\r\nloop = asyncio.get_event_loop()\r\nloop.set_task_factory(request_task_factory)\r\n```\r\n\r\nThey released their solution as a library: https://pypi.org/project/aiocontext/ and https://github.com/sqreen/AioContext - but that company was acquired by Datadog back in April and doesn't seem to be actively maintaining their open source stuff any more: https://twitter.com/SqreenIO/status/1384906075506364417", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1087181951, "label": "Traces should include SQL executed by subtasks created with `asyncio.gather`"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1576#issuecomment-999878907", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1576", "id": 999878907, "node_id": "IC_kwDOBm6k_c47mPD7", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-22T21:03:49Z", "updated_at": "2021-12-22T21:10:46Z", "author_association": "OWNER", "body": "`contextvars` can solve this but it was introduced in Python 3.7: https://www.python.org/dev/peps/pep-0567/\r\n\r\nPython 3.6 support ends in a few days' time, and it looks like Glitch has updated to 3.7 now - so maybe I can get away with Datasette needing 3.7 these days?\r\n\r\nTweeted about that here: 
https://twitter.com/simonw/status/1473761478155010048", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1087181951, "label": "Traces should include SQL executed by subtasks created with `asyncio.gather`"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1576#issuecomment-999874886", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1576", "id": 999874886, "node_id": "IC_kwDOBm6k_c47mOFG", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-22T20:55:42Z", "updated_at": "2021-12-22T20:57:28Z", "author_association": "OWNER", "body": "One way to solve this would be to introduce a `set_task_id()` method, which sets an ID which will be returned by `get_task_id()` instead of using `id(current_task(loop=loop))`.\r\n\r\nIt would be really nice if I could solve this using `with` syntax somehow. Something like:\r\n```python\r\nwith trace_child_tasks():\r\n (\r\n suggested_facets,\r\n (facet_results, facets_timed_out),\r\n ) = await asyncio.gather(\r\n execute_suggested_facets(),\r\n execute_facets(),\r\n )\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1087181951, "label": "Traces should include SQL executed by subtasks created with `asyncio.gather`"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1576#issuecomment-999874484", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1576", "id": 999874484, "node_id": "IC_kwDOBm6k_c47mN-0", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-12-22T20:54:52Z", "updated_at": "2021-12-22T20:54:52Z", "author_association": "OWNER", "body": "Here's the full current relevant code from `tracer.py`: 
https://github.com/simonw/datasette/blob/ace86566b28280091b3844cf5fbecd20158e9004/datasette/tracer.py#L8-L64\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1087181951, "label": "Traces should include SQL executed by subtasks created with `asyncio.gather`"}, "performed_via_github_app": null}