{"html_url": "https://github.com/simonw/datasette/issues/878#issuecomment-973635157", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/878", "id": 973635157, "node_id": "IC_kwDOBm6k_c46CH5V", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T01:07:08Z", "updated_at": "2021-11-19T01:07:08Z", "author_association": "OWNER", "body": "This exercise is proving so useful in getting my head around how the enormous and complex `TableView` class works again.\r\n\r\nHere's where I've got to now - I'm systematically working through the variables that are returned for HTML and for JSON copying across code to get it to work:\r\n\r\n```python\r\nfrom datasette.database import QueryInterrupted\r\nfrom datasette.utils import escape_sqlite\r\nfrom datasette.utils.asgi import Response, NotFound, Forbidden\r\nfrom datasette.views.base import DatasetteError\r\nfrom datasette import hookimpl\r\nfrom asyncinject import AsyncInject, inject\r\nfrom pprint import pformat\r\n\r\n\r\nclass Table(AsyncInject):\r\n @inject\r\n async def database(self, request, datasette):\r\n # TODO: all that nasty hash resolving stuff can go here\r\n db_name = request.url_vars[\"db_name\"]\r\n try:\r\n db = datasette.databases[db_name]\r\n except KeyError:\r\n raise NotFound(f\"Database '{db_name}' does not exist\")\r\n return db\r\n\r\n @inject\r\n async def table_and_format(self, request, database, datasette):\r\n table_and_format = request.url_vars[\"table_and_format\"]\r\n # TODO: be a lot smarter here\r\n if \".\" in table_and_format:\r\n return table_and_format.split(\".\", 2)\r\n else:\r\n return table_and_format, \"html\"\r\n\r\n @inject\r\n async def main(self, request, database, table_and_format, datasette):\r\n # TODO: if this is actually a canned query, dispatch to it\r\n\r\n table, format = table_and_format\r\n\r\n is_view = bool(await database.get_view_definition(table))\r\n table_exists = bool(await database.table_exists(table))\r\n if not is_view and not table_exists:\r\n raise NotFound(f\"Table not found: {table}\")\r\n\r\n await check_permissions(\r\n datasette,\r\n request,\r\n [\r\n (\"view-table\", (database.name, table)),\r\n (\"view-database\", database.name),\r\n \"view-instance\",\r\n ],\r\n )\r\n\r\n private = not await datasette.permission_allowed(\r\n None, \"view-table\", (database.name, table), default=True\r\n )\r\n\r\n pks = await database.primary_keys(table)\r\n table_columns = await database.table_columns(table)\r\n\r\n specified_columns = await columns_to_select(datasette, database, table, request)\r\n select_specified_columns = \", \".join(\r\n escape_sqlite(t) for t in specified_columns\r\n )\r\n select_all_columns = \", \".join(escape_sqlite(t) for t in table_columns)\r\n\r\n use_rowid = not pks and not is_view\r\n if use_rowid:\r\n select_specified_columns = f\"rowid, {select_specified_columns}\"\r\n select_all_columns = f\"rowid, {select_all_columns}\"\r\n order_by = \"rowid\"\r\n order_by_pks = \"rowid\"\r\n else:\r\n order_by_pks = \", \".join([escape_sqlite(pk) for pk in pks])\r\n order_by = order_by_pks\r\n\r\n if is_view:\r\n order_by = \"\"\r\n\r\n nocount = request.args.get(\"_nocount\")\r\n nofacet = request.args.get(\"_nofacet\")\r\n\r\n if request.args.get(\"_shape\") in (\"array\", \"object\"):\r\n nocount = True\r\n nofacet = True\r\n\r\n # Next, a TON of SQL to build where_params and filters and suchlike\r\n # skipping that and jumping straight to...\r\n where_clauses = []\r\n where_clause = \"\"\r\n if where_clauses:\r\n where_clause = f\"where 
{' and '.join(where_clauses)} \"\r\n\r\n from_sql = \"from {table_name} {where}\".format(\r\n table_name=escape_sqlite(table),\r\n where=(\"where {} \".format(\" and \".join(where_clauses)))\r\n if where_clauses\r\n else \"\",\r\n )\r\n from_sql_params ={}\r\n params = {}\r\n count_sql = f\"select count(*) {from_sql}\"\r\n sql_no_order_no_limit = (\r\n \"select {select_all_columns} from {table_name} {where}\".format(\r\n select_all_columns=select_all_columns,\r\n table_name=escape_sqlite(table),\r\n where=where_clause,\r\n )\r\n )\r\n\r\n page_size = 100\r\n offset = \" offset 0\"\r\n\r\n sql = \"select {select_specified_columns} from {table_name} {where}{order_by} limit {page_size}{offset}\".format(\r\n select_specified_columns=select_specified_columns,\r\n table_name=escape_sqlite(table),\r\n where=where_clause,\r\n order_by=order_by,\r\n page_size=page_size + 1,\r\n offset=offset,\r\n )\r\n\r\n # Fetch rows\r\n results = await database.execute(sql, params, truncate=True)\r\n columns = [r[0] for r in results.description]\r\n rows = list(results.rows)\r\n\r\n # Fetch count\r\n filtered_table_rows_count = None\r\n if count_sql:\r\n try:\r\n count_rows = list(await database.execute(count_sql, from_sql_params))\r\n filtered_table_rows_count = count_rows[0][0]\r\n except QueryInterrupted:\r\n pass\r\n\r\n\r\n vars = {\r\n \"json\": {\r\n # THIS STUFF is from the regular JSON\r\n \"database\": database.name,\r\n \"table\": table,\r\n \"is_view\": is_view,\r\n # \"human_description_en\": human_description_en,\r\n \"rows\": rows[:page_size],\r\n \"truncated\": results.truncated,\r\n \"filtered_table_rows_count\": filtered_table_rows_count,\r\n # \"expanded_columns\": expanded_columns,\r\n # \"expandable_columns\": expandable_columns,\r\n \"columns\": columns,\r\n \"primary_keys\": pks,\r\n # \"units\": units,\r\n \"query\": {\"sql\": sql, \"params\": params},\r\n # \"facet_results\": facet_results,\r\n # \"suggested_facets\": suggested_facets,\r\n # \"next\": next_value and str(next_value) or None,\r\n # \"next_url\": next_url,\r\n \"private\": private,\r\n \"allow_execute_sql\": await datasette.permission_allowed(\r\n request.actor, \"execute-sql\", database, default=True\r\n ),\r\n },\r\n \"html\": {\r\n # ... 
this is the HTML special stuff\r\n # \"table_actions\": table_actions,\r\n # \"supports_search\": bool(fts_table),\r\n # \"search\": search or \"\",\r\n \"use_rowid\": use_rowid,\r\n # \"filters\": filters,\r\n # \"display_columns\": display_columns,\r\n # \"filter_columns\": filter_columns,\r\n # \"display_rows\": display_rows,\r\n # \"facets_timed_out\": facets_timed_out,\r\n # \"sorted_facet_results\": sorted(\r\n # facet_results.values(),\r\n # key=lambda f: (len(f[\"results\"]), f[\"name\"]),\r\n # reverse=True,\r\n # ),\r\n # \"show_facet_counts\": special_args.get(\"_facet_size\") == \"max\",\r\n # \"extra_wheres_for_ui\": extra_wheres_for_ui,\r\n # \"form_hidden_args\": form_hidden_args,\r\n # \"is_sortable\": any(c[\"sortable\"] for c in display_columns),\r\n # \"path_with_replaced_args\": path_with_replaced_args,\r\n # \"path_with_removed_args\": path_with_removed_args,\r\n # \"append_querystring\": append_querystring,\r\n \"request\": request,\r\n # \"sort\": sort,\r\n # \"sort_desc\": sort_desc,\r\n \"disable_sort\": is_view,\r\n # \"custom_table_templates\": [\r\n # f\"_table-{to_css_class(database)}-{to_css_class(table)}.html\",\r\n # f\"_table-table-{to_css_class(database)}-{to_css_class(table)}.html\",\r\n # \"_table.html\",\r\n # ],\r\n # \"metadata\": metadata,\r\n # \"view_definition\": await db.get_view_definition(table),\r\n # \"table_definition\": await db.get_table_definition(table),\r\n },\r\n }\r\n\r\n # I'm just trying to get HTML to work for the moment\r\n if format == \"json\":\r\n return Response.json(dict(vars, locals=locals()), default=repr)\r\n else:\r\n return Response.html(repr(vars[\"html\"]))\r\n\r\n async def view(self, request, datasette):\r\n return await self.main(request=request, datasette=datasette)\r\n\r\n\r\n@hookimpl\r\ndef register_routes():\r\n return [\r\n (r\"/t/(?P[^/]+)/(?P[^/]+?$)\", Table().view),\r\n ]\r\n\r\n\r\nasync def check_permissions(datasette, request, permissions):\r\n \"\"\"permissions is a list of (action, resource) tuples or 'action' strings\"\"\"\r\n for permission in permissions:\r\n if isinstance(permission, str):\r\n action = permission\r\n resource = None\r\n elif isinstance(permission, (tuple, list)) and len(permission) == 2:\r\n action, resource = permission\r\n else:\r\n assert (\r\n False\r\n ), \"permission should be string or tuple of two items: {}\".format(\r\n repr(permission)\r\n )\r\n ok = await datasette.permission_allowed(\r\n request.actor,\r\n action,\r\n resource=resource,\r\n default=None,\r\n )\r\n if ok is not None:\r\n if ok:\r\n return\r\n else:\r\n raise Forbidden(action)\r\n\r\n\r\nasync def columns_to_select(datasette, database, table, request):\r\n table_columns = await database.table_columns(table)\r\n pks = await database.primary_keys(table)\r\n columns = list(table_columns)\r\n if \"_col\" in request.args:\r\n columns = list(pks)\r\n _cols = request.args.getlist(\"_col\")\r\n bad_columns = [column for column in _cols if column not in table_columns]\r\n if bad_columns:\r\n raise DatasetteError(\r\n \"_col={} - invalid columns\".format(\", \".join(bad_columns)),\r\n status=400,\r\n )\r\n # De-duplicate maintaining order:\r\n columns.extend(dict.fromkeys(_cols))\r\n if \"_nocol\" in request.args:\r\n # Return all columns EXCEPT these\r\n bad_columns = [\r\n column\r\n for column in request.args.getlist(\"_nocol\")\r\n if (column not in table_columns) or (column in pks)\r\n ]\r\n if bad_columns:\r\n raise DatasetteError(\r\n \"_nocol={} - invalid columns\".format(\", \".join(bad_columns)),\r\n 
status=400,\r\n            )\r\n    tmp_columns = [\r\n        column for column in columns if column not in request.args.getlist(\"_nocol\")\r\n    ]\r\n    columns = tmp_columns\r\n    return columns\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 648435885, "label": "New pattern for views that return either JSON or HTML, available for plugins"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/878#issuecomment-973568285", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/878", "id": 973568285, "node_id": "IC_kwDOBm6k_c46B3kd", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T00:29:20Z", "updated_at": "2021-11-19T00:29:20Z", "author_association": "OWNER", "body": "This is working!\r\n```python\r\nfrom datasette.utils.asgi import Response\r\nfrom datasette import hookimpl\r\nimport html\r\nfrom asyncinject import AsyncInject, inject\r\n\r\n\r\nclass Table(AsyncInject):\r\n    @inject\r\n    async def database(self, request):\r\n        return request.url_vars[\"db_name\"]\r\n\r\n    @inject\r\n    async def main(self, request, database):\r\n        return Response.html(\"Database: {}\".format(\r\n            html.escape(database)\r\n        ))\r\n\r\n    async def view(self, request):\r\n        return await self.main(request=request)\r\n\r\n\r\n@hookimpl\r\ndef register_routes():\r\n    return [\r\n        (r\"/t/(?P[^/]+)/(?P[^/]+?$)\", Table().view),\r\n    ]\r\n```\r\nThis project will definitely show me if I actually like the `asyncinject` patterns or not.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 648435885, "label": "New pattern for views that return either JSON or HTML, available for plugins"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/878#issuecomment-973564260", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/878", "id": 973564260, "node_id": "IC_kwDOBm6k_c46B2lk", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T00:27:06Z", "updated_at": "2021-11-19T00:27:06Z", "author_association": "OWNER", "body": "Problem: the fancy `asyncinject` stuff interferes with the fancy Datasette thing that introspects view functions to look for what parameters they take:\r\n```python\r\nclass Table(asyncinject.AsyncInjectAll):\r\n    async def view(self, request):\r\n        return Response.html(\"Hello from {}\".format(\r\n            html.escape(repr(request.url_vars))\r\n        ))\r\n\r\n\r\n@hookimpl\r\ndef register_routes():\r\n    return [\r\n        (r\"/t/(?P[^/]+)/(?P[^/]+?$)\", Table().view),\r\n    ]\r\n```\r\nThis failed with error: \"Table.view() takes 1 positional argument but 2 were given\"\r\n\r\nSo I'm going to use `AsyncInject` and have the `view` function NOT use the `@inject` decorator.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 648435885, "label": "New pattern for views that return either JSON or HTML, available for plugins"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/878#issuecomment-973554024", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/878", "id": 973554024, "node_id": "IC_kwDOBm6k_c46B0Fo", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T00:21:20Z", "updated_at":
"2021-11-19T00:21:20Z", "author_association": "OWNER", "body": "That's annoying: it looks like plugins can't use `register_routes()` to over-ride default routes within Datasette itself. This didn't work:\r\n```python\r\nfrom datasette.utils.asgi import Response\r\nfrom datasette import hookimpl\r\nimport html\r\n\r\n\r\nasync def table(request):\r\n return Response.html(\"Hello from {}\".format(\r\n html.escape(repr(request.url_vars))\r\n ))\r\n\r\n\r\n@hookimpl\r\ndef register_routes():\r\n return [\r\n (r\"/(?P[^/]+)/(?P[^/]+?$)\", table),\r\n ]\r\n```\r\nI'll use a `/t/` prefix for the moment, but this is probably something I'll fix in Datasette itself later.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 648435885, "label": "New pattern for views that return either JSON or HTML, available for plugins"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/878#issuecomment-973542284", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/878", "id": 973542284, "node_id": "IC_kwDOBm6k_c46BxOM", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T00:16:44Z", "updated_at": "2021-11-19T00:16:44Z", "author_association": "OWNER", "body": "```\r\nDevelopment % cookiecutter gh:simonw/datasette-plugin\r\nYou've downloaded /Users/simon/.cookiecutters/datasette-plugin before. Is it okay to delete and re-download it? [yes]: yes\r\nplugin_name []: table-new\r\ndescription []: New implementation of TableView, see https://github.com/simonw/datasette/issues/878\r\nhyphenated [table-new]: \r\nunderscored [table_new]: \r\ngithub_username []: simonw\r\nauthor_name []: Simon Willison\r\ninclude_static_directory []: \r\ninclude_templates_directory []: \r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 648435885, "label": "New pattern for views that return either JSON or HTML, available for plugins"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/878#issuecomment-973527870", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/878", "id": 973527870, "node_id": "IC_kwDOBm6k_c46Bts-", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-19T00:13:43Z", "updated_at": "2021-11-19T00:13:43Z", "author_association": "OWNER", "body": "New plan: I'm going to build a brand new implementation of `TableView` starting out as a plugin, using the `register_routes()` plugin hook.\r\n\r\nIt will reuse the existing HTML template but will be a completely new Python implementation, based on `asyncinject`.\r\n\r\nI'm going to start by just getting the table to show up on the page - then I'll add faceting, suggested facets, filters and so-on.\r\n\r\nBonus: I'm going to see if I can get it to work for arbitrary SQL queries too (stretch goal).", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 648435885, "label": "New pattern for views that return either JSON or HTML, available for plugins"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1516#issuecomment-972858458", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1516", "id": 972858458, "node_id": "IC_kwDOBm6k_c45_KRa", 
"user": {"value": 22429695, "label": "codecov[bot]"}, "created_at": "2021-11-18T13:19:01Z", "updated_at": "2021-11-18T13:19:01Z", "author_association": "NONE", "body": "# [Codecov](https://codecov.io/gh/simonw/datasette/pull/1516?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison) Report\n> Merging [#1516](https://codecov.io/gh/simonw/datasette/pull/1516?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison) (a82c620) into [main](https://codecov.io/gh/simonw/datasette/commit/0156c6b5e52d541e93f0d68e9245f20ae83bc933?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison) (0156c6b) will **not change** coverage.\n> The diff coverage is `n/a`.\n\n[![Impacted file tree graph](https://codecov.io/gh/simonw/datasette/pull/1516/graphs/tree.svg?width=650&height=150&src=pr&token=eSahVY7kw1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison)](https://codecov.io/gh/simonw/datasette/pull/1516?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison)\n\n```diff\n@@ Coverage Diff @@\n## main #1516 +/- ##\n=======================================\n Coverage 91.82% 91.82% \n=======================================\n Files 34 34 \n Lines 4430 4430 \n=======================================\n Hits 4068 4068 \n Misses 362 362 \n```\n\n\n\n------\n\n[Continue to review full report at Codecov](https://codecov.io/gh/simonw/datasette/pull/1516?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison).\n> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison)\n> `\u0394 = absolute (impact)`, `\u00f8 = not affected`, `? = missing data`\n> Powered by [Codecov](https://codecov.io/gh/simonw/datasette/pull/1516?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison). Last update [0156c6b...a82c620](https://codecov.io/gh/simonw/datasette/pull/1516?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison). 
Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison).\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1057340779, "label": "Bump black from 21.9b0 to 21.11b1"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1514#issuecomment-972852184", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1514", "id": 972852184, "node_id": "IC_kwDOBm6k_c45_IvY", "user": {"value": 49699333, "label": "dependabot[bot]"}, "created_at": "2021-11-18T13:11:15Z", "updated_at": "2021-11-18T13:11:15Z", "author_association": "CONTRIBUTOR", "body": "Superseded by #1516.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1056117435, "label": "Bump black from 21.9b0 to 21.11b0"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1514#issuecomment-971575746", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1514", "id": 971575746, "node_id": "IC_kwDOBm6k_c456RHC", "user": {"value": 22429695, "label": "codecov[bot]"}, "created_at": "2021-11-17T13:18:58Z", "updated_at": "2021-11-17T13:18:58Z", "author_association": "NONE", "body": "# [Codecov](https://codecov.io/gh/simonw/datasette/pull/1514?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison) Report\n> Merging [#1514](https://codecov.io/gh/simonw/datasette/pull/1514?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison) (b02c35a) into [main](https://codecov.io/gh/simonw/datasette/commit/0156c6b5e52d541e93f0d68e9245f20ae83bc933?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison) (0156c6b) will **not change** coverage.\n> The diff coverage is `n/a`.\n\n[![Impacted file tree graph](https://codecov.io/gh/simonw/datasette/pull/1514/graphs/tree.svg?width=650&height=150&src=pr&token=eSahVY7kw1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison)](https://codecov.io/gh/simonw/datasette/pull/1514?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison)\n\n```diff\n@@ Coverage Diff @@\n## main #1514 +/- ##\n=======================================\n Coverage 91.82% 91.82% \n=======================================\n Files 34 34 \n Lines 4430 4430 \n=======================================\n Hits 4068 4068 \n Misses 362 362 \n```\n\n\n\n------\n\n[Continue to review full report at Codecov](https://codecov.io/gh/simonw/datasette/pull/1514?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison).\n> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison)\n> `\u0394 = absolute (impact)`, `\u00f8 = not affected`, `? 
= missing data`\n> Powered by [Codecov](https://codecov.io/gh/simonw/datasette/pull/1514?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison). Last update [0156c6b...b02c35a](https://codecov.io/gh/simonw/datasette/pull/1514?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison).\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1056117435, "label": "Bump black from 21.9b0 to 21.11b0"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1500#issuecomment-971568829", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1500", "id": 971568829, "node_id": "IC_kwDOBm6k_c456Pa9", "user": {"value": 49699333, "label": "dependabot[bot]"}, "created_at": "2021-11-17T13:13:58Z", "updated_at": "2021-11-17T13:13:58Z", "author_association": "CONTRIBUTOR", "body": "Superseded by #1514.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1041158024, "label": "Bump black from 21.9b0 to 21.10b0"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/878#issuecomment-971209475", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/878", "id": 971209475, "node_id": "IC_kwDOBm6k_c4543sD", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-17T05:41:42Z", "updated_at": "2021-11-17T05:41:42Z", "author_association": "OWNER", "body": "I'm going to build a brand new implementation of the `TableView` class that doesn't subclass `BaseView` at all, instead using `asyncinject`. 
If I'm lucky that will clean up the grungiest part of the codebase.\r\n\r\nI can maybe even run the tests against old `TableView` and `TableView2` to check that they behave the same.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 648435885, "label": "New pattern for views that return either JSON or HTML, available for plugins"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/878#issuecomment-971057553", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/878", "id": 971057553, "node_id": "IC_kwDOBm6k_c454SmR", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-17T01:40:45Z", "updated_at": "2021-11-17T01:40:45Z", "author_association": "OWNER", "body": "I shipped that code as a new library, `asyncinject`: https://pypi.org/project/asyncinject/ - I'll open a new PR to attempt to refactor `TableView` to use it.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 648435885, "label": "New pattern for views that return either JSON or HTML, available for plugins"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1512#issuecomment-971056169", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1512", "id": 971056169, "node_id": "IC_kwDOBm6k_c454SQp", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-17T01:39:44Z", "updated_at": "2021-11-17T01:39:44Z", "author_association": "OWNER", "body": "Closing this PR because I shipped the code in it as a separate library instead.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1055402144, "label": "New pattern for async view classes"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1512#issuecomment-971055677", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1512", "id": 971055677, "node_id": "IC_kwDOBm6k_c454SI9", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-17T01:39:25Z", "updated_at": "2021-11-17T01:39:25Z", "author_association": "OWNER", "body": "https://github.com/simonw/asyncinject version 0.1a0 is now live on PyPI: https://pypi.org/project/asyncinject/", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1055402144, "label": "New pattern for async view classes"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1512#issuecomment-971010724", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1512", "id": 971010724, "node_id": "IC_kwDOBm6k_c454HKk", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-17T01:12:22Z", "updated_at": "2021-11-17T01:12:22Z", "author_association": "OWNER", "body": "I'm going to extract out the `asyncinject` stuff into a separate library.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1055402144, "label": "New pattern for async view classes"}, "performed_via_github_app": null} {"html_url": 
"https://github.com/simonw/datasette/pull/1512#issuecomment-970718652", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1512", "id": 970718652, "node_id": "IC_kwDOBm6k_c452_28", "user": {"value": 22429695, "label": "codecov[bot]"}, "created_at": "2021-11-16T22:02:59Z", "updated_at": "2021-11-16T23:51:48Z", "author_association": "NONE", "body": "# [Codecov](https://codecov.io/gh/simonw/datasette/pull/1512?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison) Report\n> Merging [#1512](https://codecov.io/gh/simonw/datasette/pull/1512?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison) (8f757da) into [main](https://codecov.io/gh/simonw/datasette/commit/0156c6b5e52d541e93f0d68e9245f20ae83bc933?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison) (0156c6b) will **decrease** coverage by `2.10%`.\n> The diff coverage is `36.20%`.\n\n[![Impacted file tree graph](https://codecov.io/gh/simonw/datasette/pull/1512/graphs/tree.svg?width=650&height=150&src=pr&token=eSahVY7kw1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison)](https://codecov.io/gh/simonw/datasette/pull/1512?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison)\n\n```diff\n@@ Coverage Diff @@\n## main #1512 +/- ##\n==========================================\n- Coverage 91.82% 89.72% -2.11% \n==========================================\n Files 34 36 +2 \n Lines 4430 4604 +174 \n==========================================\n+ Hits 4068 4131 +63 \n- Misses 362 473 +111 \n```\n\n\n| [Impacted Files](https://codecov.io/gh/simonw/datasette/pull/1512?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison) | Coverage \u0394 | |\n|---|---|---|\n| [datasette/utils/vendored\\_graphlib.py](https://codecov.io/gh/simonw/datasette/pull/1512/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison#diff-ZGF0YXNldHRlL3V0aWxzL3ZlbmRvcmVkX2dyYXBobGliLnB5) | `0.00% <0.00%> (\u00f8)` | |\n| [datasette/utils/asyncdi.py](https://codecov.io/gh/simonw/datasette/pull/1512/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison#diff-ZGF0YXNldHRlL3V0aWxzL2FzeW5jZGkucHk=) | `96.92% <96.92%> (\u00f8)` | |\n\n------\n\n[Continue to review full report at Codecov](https://codecov.io/gh/simonw/datasette/pull/1512?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison).\n> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison)\n> `\u0394 = absolute (impact)`, `\u00f8 = not affected`, `? = missing data`\n> Powered by [Codecov](https://codecov.io/gh/simonw/datasette/pull/1512?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison). Last update [0156c6b...8f757da](https://codecov.io/gh/simonw/datasette/pull/1512?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison). 
Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison).\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1055402144, "label": "New pattern for async view classes"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1512#issuecomment-970861628", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1512", "id": 970861628, "node_id": "IC_kwDOBm6k_c453iw8", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-16T23:46:07Z", "updated_at": "2021-11-16T23:46:07Z", "author_association": "OWNER", "body": "I made the changes locally and tested them with Python 3.6 like so:\r\n```\r\ncd /tmp\r\nmkdir v\r\ncd v\r\npipenv shell --python=python3.6\r\ncd ~/Dropbox/Development/datasette\r\npip install -e '.[test]'\r\npytest tests/test_asyncdi.py\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1055402144, "label": "New pattern for async view classes"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1512#issuecomment-970857411", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1512", "id": 970857411, "node_id": "IC_kwDOBm6k_c453hvD", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-16T23:43:21Z", "updated_at": "2021-11-16T23:43:21Z", "author_association": "OWNER", "body": "```\r\nE File \"/home/runner/work/datasette/datasette/datasette/utils/vendored_graphlib.py\", line 56\r\nE if (result := self._node2info.get(node)) is None:\r\nE ^\r\nE SyntaxError: invalid syntax\r\n```\r\nOh no - the vendored code I use has `:=` so doesn't work on Python 3.6! Will have to backport it more thoroughly.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1055402144, "label": "New pattern for async view classes"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1513#issuecomment-970855084", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1513", "id": 970855084, "node_id": "IC_kwDOBm6k_c453hKs", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-16T23:41:46Z", "updated_at": "2021-11-16T23:41:46Z", "author_association": "OWNER", "body": "Conclusion: using a giant convoluted CTE and UNION ALL query to attempt to calculate facets at the same time as retrieving rows is a net LOSS for performance! 
Very surprised to see that.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1055469073, "label": "Research: CTEs and union all to calculate facets AND query at the same time"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1513#issuecomment-970853917", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1513", "id": 970853917, "node_id": "IC_kwDOBm6k_c453g4d", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-16T23:41:01Z", "updated_at": "2021-11-16T23:41:01Z", "author_association": "OWNER", "body": "One very interesting difference between the two: on the single giant query page:\r\n\r\n```json\r\n{\r\n  \"request_duration_ms\": 376.4317020000476,\r\n  \"sum_trace_duration_ms\": 370.0828700000329,\r\n  \"num_traces\": 5\r\n}\r\n```\r\nAnd on the page that uses separate queries:\r\n```json\r\n{\r\n  \"request_duration_ms\": 819.012272000009,\r\n  \"sum_trace_duration_ms\": 201.52852100000018,\r\n  \"num_traces\": 19\r\n}\r\n```\r\nThe separate-queries page takes 819ms total to render, but spends 201ms across 19 SQL queries.\r\n\r\nThe single big query takes 376ms total to render the page, spending 370ms in 5 queries.\r\n\r\n
Those 5 queries, if you're interested\r\n\r\n```sql\r\nselect database_name, schema_version from databases\r\nPRAGMA schema_version\r\nPRAGMA schema_version\r\nexplain with cte as (\\r\\n select rowid, date, county, state, fips, cases, deaths\\r\\n from ny_times_us_counties\\r\\n),\\r\\ntruncated as (\\r\\n select null as _facet, null as facet_name, null as facet_count, rowid, date, county, state, fips, cases, deaths\\r\\n from cte order by date desc limit 4\\r\\n),\\r\\nstate_facet as (\\r\\n select 'state' as _facet, state as facet_name, count(*) as facet_count,\\r\\n null, null, null, null, null, null, null\\r\\n from cte group by facet_name order by facet_count desc limit 3\\r\\n),\\r\\nfips_facet as (\\r\\n select 'fips' as _facet, fips as facet_name, count(*) as facet_count,\\r\\n null, null, null, null, null, null, null\\r\\n from cte group by facet_name order by facet_count desc limit 3\\r\\n),\\r\\ncounty_facet as (\\r\\n select 'county' as _facet, county as facet_name, count(*) as facet_count,\\r\\n null, null, null, null, null, null, null\\r\\n from cte group by facet_name order by facet_count desc limit 3\\r\\n)\\r\\nselect * from truncated\\r\\nunion all select * from state_facet\\r\\nunion all select * from fips_facet\\r\\nunion all select * from county_facet\r\nwith cte as (\\r\\n select rowid, date, county, state, fips, cases, deaths\\r\\n from ny_times_us_counties\\r\\n),\\r\\ntruncated as (\\r\\n select null as _facet, null as facet_name, null as facet_count, rowid, date, county, state, fips, cases, deaths\\r\\n from cte order by date desc limit 4\\r\\n),\\r\\nstate_facet as (\\r\\n select 'state' as _facet, state as facet_name, count(*) as facet_count,\\r\\n null, null, null, null, null, null, null\\r\\n from cte group by facet_name order by facet_count desc limit 3\\r\\n),\\r\\nfips_facet as (\\r\\n select 'fips' as _facet, fips as facet_name, count(*) as facet_count,\\r\\n null, null, null, null, null, null, null\\r\\n from cte group by facet_name order by facet_count desc limit 3\\r\\n),\\r\\ncounty_facet as (\\r\\n select 'county' as _facet, county as facet_name, count(*) as facet_count,\\r\\n null, null, null, null, null, null, null\\r\\n from cte group by facet_name order by facet_count desc limit 3\\r\\n)\\r\\nselect * from truncated\\r\\nunion all select * from state_facet\\r\\nunion all select * from fips_facet\\r\\nunion all select * from county_facet\r\n```\r\n
\r\n\r\nAll of that additional non-SQL overhead must be stuff relating to Python and template rendering code running on the page. I'm really surprised at how much overhead that is! This is worth researching separately.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1055469073, "label": "Research: CTEs and union all to calculate facets AND query at the same time"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1513#issuecomment-970845844", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1513", "id": 970845844, "node_id": "IC_kwDOBm6k_c453e6U", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-16T23:35:38Z", "updated_at": "2021-11-16T23:35:38Z", "author_association": "OWNER", "body": "I tried adding `cases > 10000` but the SQL query now takes too long - so moving this to my laptop.\r\n\r\n```\r\ncd /tmp\r\nwget https://covid-19.datasettes.com/covid.db\r\ndatasette covid.db \\\r\n --setting facet_time_limit_ms 10000 \\\r\n --setting sql_time_limit_ms 10000 \\\r\n --setting trace_debug 1\r\n```\r\n`http://127.0.0.1:8006/covid/ny_times_us_counties?_trace=1&_facet_size=3&_size=2&cases__gt=10000` shows in the traces:\r\n\r\n```json\r\n[\r\n {\r\n \"type\": \"sql\",\r\n \"start\": 12.693033525,\r\n \"end\": 12.694056904,\r\n \"duration_ms\": 1.0233789999993803,\r\n \"traceback\": [\r\n \" File \\\"/usr/local/Cellar/datasette/0.58.1/libexec/lib/python3.9/site-packages/datasette/views/base.py\\\", line 262, in get\\n return await self.view_get(\\n\",\r\n \" File \\\"/usr/local/Cellar/datasette/0.58.1/libexec/lib/python3.9/site-packages/datasette/views/base.py\\\", line 477, in view_get\\n response_or_template_contexts = await self.data(\\n\",\r\n \" File \\\"/usr/local/Cellar/datasette/0.58.1/libexec/lib/python3.9/site-packages/datasette/views/table.py\\\", line 705, in data\\n results = await db.execute(sql, params, truncate=True, **extra_args)\\n\"\r\n ],\r\n \"database\": \"covid\",\r\n \"sql\": \"select rowid, date, county, state, fips, cases, deaths from ny_times_us_counties where \\\"cases\\\" > :p0 order by rowid limit 3\",\r\n \"params\": {\r\n \"p0\": 10000\r\n }\r\n },\r\n {\r\n \"type\": \"sql\",\r\n \"start\": 12.694285093,\r\n \"end\": 12.814936275,\r\n \"duration_ms\": 120.65118200000136,\r\n \"traceback\": [\r\n \" File \\\"/usr/local/Cellar/datasette/0.58.1/libexec/lib/python3.9/site-packages/datasette/views/base.py\\\", line 262, in get\\n return await self.view_get(\\n\",\r\n \" File \\\"/usr/local/Cellar/datasette/0.58.1/libexec/lib/python3.9/site-packages/datasette/views/base.py\\\", line 477, in view_get\\n response_or_template_contexts = await self.data(\\n\",\r\n \" File \\\"/usr/local/Cellar/datasette/0.58.1/libexec/lib/python3.9/site-packages/datasette/views/table.py\\\", line 723, in data\\n count_rows = list(await db.execute(count_sql, from_sql_params))\\n\"\r\n ],\r\n \"database\": \"covid\",\r\n \"sql\": \"select count(*) from ny_times_us_counties where \\\"cases\\\" > :p0\",\r\n \"params\": {\r\n \"p0\": 10000\r\n }\r\n },\r\n {\r\n \"type\": \"sql\",\r\n \"start\": 12.818812089,\r\n \"end\": 12.851172544,\r\n \"duration_ms\": 32.360455000000954,\r\n \"traceback\": [\r\n \" File \\\"/usr/local/Cellar/datasette/0.58.1/libexec/lib/python3.9/site-packages/datasette/views/table.py\\\", line 856, in data\\n suggested_facets.extend(await facet.suggest())\\n\",\r\n \" File 
\\\"/usr/local/Cellar/datasette/0.58.1/libexec/lib/python3.9/site-packages/datasette/facets.py\\\", line 164, in suggest\\n distinct_values = await self.ds.execute(\\n\",\r\n \" File \\\"/usr/local/Cellar/datasette/0.58.1/libexec/lib/python3.9/site-packages/datasette/app.py\\\", line 634, in execute\\n return await self.databases[db_name].execute(\\n\"\r\n ],\r\n \"database\": \"covid\",\r\n \"sql\": \"select county, count(*) as n from (\\n select rowid, date, county, state, fips, cases, deaths from ny_times_us_counties where \\\"cases\\\" > :p0 \\n ) where county is not null\\n group by county\\n limit 4\",\r\n \"params\": {\r\n \"p0\": 10000\r\n }\r\n },\r\n {\r\n \"type\": \"sql\",\r\n \"start\": 12.851418868,\r\n \"end\": 12.871268359,\r\n \"duration_ms\": 19.84949100000044,\r\n \"traceback\": [\r\n \" File \\\"/usr/local/Cellar/datasette/0.58.1/libexec/lib/python3.9/site-packages/datasette/views/table.py\\\", line 856, in data\\n suggested_facets.extend(await facet.suggest())\\n\",\r\n \" File \\\"/usr/local/Cellar/datasette/0.58.1/libexec/lib/python3.9/site-packages/datasette/facets.py\\\", line 164, in suggest\\n distinct_values = await self.ds.execute(\\n\",\r\n \" File \\\"/usr/local/Cellar/datasette/0.58.1/libexec/lib/python3.9/site-packages/datasette/app.py\\\", line 634, in execute\\n return await self.databases[db_name].execute(\\n\"\r\n ],\r\n \"database\": \"covid\",\r\n \"sql\": \"select state, count(*) as n from (\\n select rowid, date, county, state, fips, cases, deaths from ny_times_us_counties where \\\"cases\\\" > :p0 \\n ) where state is not null\\n group by state\\n limit 4\",\r\n \"params\": {\r\n \"p0\": 10000\r\n }\r\n },\r\n {\r\n \"type\": \"sql\",\r\n \"start\": 12.871497655,\r\n \"end\": 12.897715027,\r\n \"duration_ms\": 26.217371999999628,\r\n \"traceback\": [\r\n \" File \\\"/usr/local/Cellar/datasette/0.58.1/libexec/lib/python3.9/site-packages/datasette/views/table.py\\\", line 856, in data\\n suggested_facets.extend(await facet.suggest())\\n\",\r\n \" File \\\"/usr/local/Cellar/datasette/0.58.1/libexec/lib/python3.9/site-packages/datasette/facets.py\\\", line 164, in suggest\\n distinct_values = await self.ds.execute(\\n\",\r\n \" File \\\"/usr/local/Cellar/datasette/0.58.1/libexec/lib/python3.9/site-packages/datasette/app.py\\\", line 634, in execute\\n return await self.databases[db_name].execute(\\n\"\r\n ],\r\n \"database\": \"covid\",\r\n \"sql\": \"select fips, count(*) as n from (\\n select rowid, date, county, state, fips, cases, deaths from ny_times_us_counties where \\\"cases\\\" > :p0 \\n ) where fips is not null\\n group by fips\\n limit 4\",\r\n \"params\": {\r\n \"p0\": 10000\r\n }\r\n }\r\n]\r\n```\r\nSo that's:\r\n```\r\nfetch rows: 1.0233789999993803 ms\r\ncount: 120.65118200000136 ms\r\nfacet county: 32.360455000000954 ms\r\nfacet state: 19.84949100000044 ms\r\nfacet fips: 26.217371999999628 ms\r\n```\r\n= 200.1 ms total\r\n\r\nCompared to: 
`http://127.0.0.1:8006/covid?sql=with+cte+as+(%0D%0A++select+rowid%2C+date%2C+county%2C+state%2C+fips%2C+cases%2C+deaths%0D%0A++from+ny_times_us_counties%0D%0A)%2C%0D%0Atruncated+as+(%0D%0A++select+null+as+_facet%2C+null+as+facet_name%2C+null+as+facet_count%2C+rowid%2C+date%2C+county%2C+state%2C+fips%2C+cases%2C+deaths%0D%0A++from+cte+order+by+date+desc+limit+4%0D%0A)%2C%0D%0Astate_facet+as+(%0D%0A++select+%27state%27+as+_facet%2C+state+as+facet_name%2C+count(*)+as+facet_count%2C%0D%0A++null%2C+null%2C+null%2C+null%2C+null%2C+null%2C+null%0D%0A++from+cte+group+by+facet_name+order+by+facet_count+desc+limit+3%0D%0A)%2C%0D%0Afips_facet+as+(%0D%0A++select+%27fips%27+as+_facet%2C+fips+as+facet_name%2C+count(*)+as+facet_count%2C%0D%0A++null%2C+null%2C+null%2C+null%2C+null%2C+null%2C+null%0D%0A++from+cte+group+by+facet_name+order+by+facet_count+desc+limit+3%0D%0A)%2C%0D%0Acounty_facet+as+(%0D%0A++select+%27county%27+as+_facet%2C+county+as+facet_name%2C+count(*)+as+facet_count%2C%0D%0A++null%2C+null%2C+null%2C+null%2C+null%2C+null%2C+null%0D%0A++from+cte+group+by+facet_name+order+by+facet_count+desc+limit+3%0D%0A)%0D%0Aselect+*+from+truncated%0D%0Aunion+all+select+*+from+state_facet%0D%0Aunion+all+select+*+from+fips_facet%0D%0Aunion+all+select+*+from+county_facet&_trace=1`\r\n\r\nWhich is 353ms total.\r\n\r\nThe separate queries ran faster! Really surprising result there.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1055469073, "label": "Research: CTEs and union all to calculate facets AND query at the same time"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1513#issuecomment-970828568", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1513", "id": 970828568, "node_id": "IC_kwDOBm6k_c453asY", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-16T23:27:11Z", "updated_at": "2021-11-16T23:27:11Z", "author_association": "OWNER", "body": "One last experiment: I'm going to try running an expensive query in the CTE portion.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1055469073, "label": "Research: CTEs and union all to calculate facets AND query at the same time"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1513#issuecomment-970827674", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1513", "id": 970827674, "node_id": "IC_kwDOBm6k_c453aea", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-16T23:26:58Z", "updated_at": "2021-11-16T23:26:58Z", "author_association": "OWNER", "body": "With trace.\r\n\r\nhttps://covid-19.datasettes.com/covid/ny_times_us_counties?_trace=1&_facet_size=3&_size=2&_trace=1 shows the following:\r\n\r\n```\r\nfetch rows: 0.41762600005768036 ms\r\nfacet state: 284.30423800000426 ms\r\nfacet county: 273.2565999999679 ms\r\nfacet fips: 197.80996999998024 ms\r\n```\r\n= 755.78843400001ms total\r\n\r\nIt didn't run a count because that's the homepage and the count is cached. 
So I dropped the count from the query and ran it:\r\n\r\nhttps://covid-19.datasettes.com/covid?sql=with+cte+as+(%0D%0A++select+rowid%2C+date%2C+county%2C+state%2C+fips%2C+cases%2C+deaths%0D%0A++from+ny_times_us_counties%0D%0A)%2C%0D%0Atruncated+as+(%0D%0A++select+null+as+_facet%2C+null+as+facet_name%2C+null+as+facet_count%2C+rowid%2C+date%2C+county%2C+state%2C+fips%2C+cases%2C+deaths%0D%0A++from+cte+order+by+date+desc+limit+4%0D%0A)%2C%0D%0Astate_facet+as+(%0D%0A++select+%27state%27+as+_facet%2C+state+as+facet_name%2C+count(*)+as+facet_count%2C%0D%0A++null%2C+null%2C+null%2C+null%2C+null%2C+null%2C+null%0D%0A++from+cte+group+by+facet_name+order+by+facet_count+desc+limit+3%0D%0A)%2C%0D%0Afips_facet+as+(%0D%0A++select+%27fips%27+as+_facet%2C+fips+as+facet_name%2C+count(*)+as+facet_count%2C%0D%0A++null%2C+null%2C+null%2C+null%2C+null%2C+null%2C+null%0D%0A++from+cte+group+by+facet_name+order+by+facet_count+desc+limit+3%0D%0A)%2C%0D%0Acounty_facet+as+(%0D%0A++select+%27county%27+as+_facet%2C+county+as+facet_name%2C+count(*)+as+facet_count%2C%0D%0A++null%2C+null%2C+null%2C+null%2C+null%2C+null%2C+null%0D%0A++from+cte+group+by+facet_name+order+by+facet_count+desc+limit+3%0D%0A)%0D%0Aselect+*+from+truncated%0D%0Aunion+all+select+*+from+state_facet%0D%0Aunion+all+select+*+from+fips_facet%0D%0Aunion+all+select+*+from+county_facet&_trace=1\r\n\r\nShows 649.4359889999259 ms for the query - compared to 755.78843400001ms for the separate. So it saved about 100ms.\r\n\r\nStill not a huge difference though!\r\n\r\n\r\n\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1055469073, "label": "Research: CTEs and union all to calculate facets AND query at the same time"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1513#issuecomment-970780866", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1513", "id": 970780866, "node_id": "IC_kwDOBm6k_c453PDC", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-16T23:01:57Z", "updated_at": "2021-11-16T23:01:57Z", "author_association": "OWNER", "body": "One disadvantage to this approach: if you have a SQL time limit of 1s and it takes 0.9s to return the rows but then 0.5s to calculate each of the requested facets the entire query will exceed the time limit.\r\n\r\nCould work around this by catching that error and then re-running the query just for the rows, but that would result in the user having to wait longer for the results.\r\n\r\nCould try to remember if that has happened using an in-memory Python data structure and skip the faceting optimization if it's caused problems in the past? That seems a bit gross.\r\n\r\nMaybe this becomes an opt-in optimization you can request in your `metadata.json` setting for that table, which massively increases the time limit? That's a bit weird too - now there are two separate implementations of the faceting logic, which had better have a REALLY big pay-off to be worth maintaining.\r\n\r\nWhat if we kept the query that returns the rows to be displayed on the page separate from the facets, but then executed all of the facets together using this method such that the `cte` only (presumably) has to be calculated once? 
That would still lead to multiple facets potentially exceeding the SQL time limit when single facets would not have.\r\n\r\nMaybe a better optimization would be to move facets to happening via `fetch()` calls from the client, so the user gets to see their rows instantly and the facets then appear as and when they are available (though it would cause page jank).\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1055469073, "label": "Research: CTEs and union all to calculate facets AND query at the same time"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1513#issuecomment-970766486", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1513", "id": 970766486, "node_id": "IC_kwDOBm6k_c453LiW", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-16T22:52:56Z", "updated_at": "2021-11-16T22:56:07Z", "author_association": "OWNER", "body": "https://covid-19.datasettes.com/covid is 805.2MB\r\n\r\nhttps://covid-19.datasettes.com/covid/ny_times_us_counties?_trace=1&_facet_size=3&_size=2\r\n\r\nEquivalent SQL:\r\n\r\nhttps://covid-19.datasettes.com/covid?sql=with+cte+as+%28%0D%0A++select+rowid%2C+date%2C+county%2C+state%2C+fips%2C+cases%2C+deaths%0D%0A++from+ny_times_us_counties%0D%0A%29%2C%0D%0Atruncated+as+%28%0D%0A++select+null+as+_facet%2C+null+as+facet_name%2C+null+as+facet_count%2C+rowid%2C+date%2C+county%2C+state%2C+fips%2C+cases%2C+deaths%0D%0A++from+cte+order+by+date+desc+limit+4%0D%0A%29%2C%0D%0Astate_facet+as+%28%0D%0A++select+%27state%27+as+_facet%2C+state+as+facet_name%2C+count%28*%29+as+facet_count%2C%0D%0A++null%2C+null%2C+null%2C+null%2C+null%2C+null%2C+null%0D%0A++from+cte+group+by+facet_name+order+by+facet_count+desc+limit+3%0D%0A%29%2C%0D%0Afips_facet+as+%28%0D%0A++select+%27fips%27+as+_facet%2C+fips+as+facet_name%2C+count%28*%29+as+facet_count%2C%0D%0A++null%2C+null%2C+null%2C+null%2C+null%2C+null%2C+null%0D%0A++from+cte+group+by+facet_name+order+by+facet_count+desc+limit+3%0D%0A%29%2C%0D%0Acounty_facet+as+%28%0D%0A++select+%27county%27+as+_facet%2C+county+as+facet_name%2C+count%28*%29+as+facet_count%2C%0D%0A++null%2C+null%2C+null%2C+null%2C+null%2C+null%2C+null%0D%0A++from+cte+group+by+facet_name+order+by+facet_count+desc+limit+3%0D%0A%29%2C%0D%0Atotal_count+as+%28%0D%0A++select+%27COUNT%27+as+_facet%2C+%27%27+as+facet_name%2C+count%28*%29+as+facet_count%2C%0D%0A++null%2C+null%2C+null%2C+null%2C+null%2C+null%2C+null%0D%0A++from+cte%0D%0A%29%0D%0Aselect+*+from+truncated%0D%0Aunion+all+select+*+from+state_facet%0D%0Aunion+all+select+*+from+fips_facet%0D%0Aunion+all+select+*+from+county_facet%0D%0Aunion+all+select+*+from+total_count\r\n\r\n```sql\r\nwith cte as (\r\n select rowid, date, county, state, fips, cases, deaths\r\n from ny_times_us_counties\r\n),\r\ntruncated as (\r\n select null as _facet, null as facet_name, null as facet_count, rowid, date, county, state, fips, cases, deaths\r\n from cte order by date desc limit 4\r\n),\r\nstate_facet as (\r\n select 'state' as _facet, state as facet_name, count(*) as facet_count,\r\n null, null, null, null, null, null, null\r\n from cte group by facet_name order by facet_count desc limit 3\r\n),\r\nfips_facet as (\r\n select 'fips' as _facet, fips as facet_name, count(*) as facet_count,\r\n null, null, null, null, null, null, null\r\n from cte group by facet_name order by facet_count desc limit 3\r\n),\r\ncounty_facet as (\r\n select 'county' 
as _facet, county as facet_name, count(*) as facet_count,\r\n null, null, null, null, null, null, null\r\n from cte group by facet_name order by facet_count desc limit 3\r\n),\r\ntotal_count as (\r\n select 'COUNT' as _facet, '' as facet_name, count(*) as facet_count,\r\n null, null, null, null, null, null, null\r\n from cte\r\n)\r\nselect * from truncated\r\nunion all select * from state_facet\r\nunion all select * from fips_facet\r\nunion all select * from county_facet\r\nunion all select * from total_count\r\n```\r\n\r\n_facet | facet_name | facet_count | rowid | date | county | state | fips | cases | deaths\r\n-- | -- | -- | -- | -- | -- | -- | -- | -- | --\r\n\u00a0 | \u00a0 | \u00a0 | 1917344 | 2021-11-15 | Autauga | Alabama | 1001 | 10407 | 154\r\n\u00a0 | \u00a0 | \u00a0 | 1917345 | 2021-11-15 | Baldwin | Alabama | 1003 | 37875 | 581\r\n\u00a0 | \u00a0 | \u00a0 | 1917346 | 2021-11-15 | Barbour | Alabama | 1005 | 3648 | 79\r\n\u00a0 | \u00a0 | \u00a0 | 1917347 | 2021-11-15 | Bibb | Alabama | 1007 | 4317 | 92\r\nstate | Texas | 148028 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0\r\nstate | Georgia | 96249 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0\r\nstate | Virginia | 79315 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0\r\nfips | \u00a0 | 17580 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0\r\nfips | 53061 | 665 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0\r\nfips | 17031 | 662 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0\r\ncounty | Washington | 18666 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0\r\ncounty | Unknown | 15840 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0\r\ncounty | Jefferson | 15637 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0\r\nCOUNT | \u00a0 | 1920593 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1055469073, "label": "Research: CTEs and union all to calculate facets AND query at the same time"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1513#issuecomment-970770304", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1513", "id": 970770304, "node_id": "IC_kwDOBm6k_c453MeA", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-16T22:55:19Z", "updated_at": "2021-11-16T22:55:19Z", "author_association": "OWNER", "body": "(One thing I really like about this pattern is that it should work exactly the same when used to facet the results of arbitrary SQL queries as it does when faceting results from the table page.)", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1055469073, "label": "Research: CTEs and union all to calculate facets AND query at the same time"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1513#issuecomment-970767952", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1513", "id": 970767952, "node_id": "IC_kwDOBm6k_c453L5Q", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-16T22:53:52Z", "updated_at": "2021-11-16T22:53:52Z", "author_association": "OWNER", "body": "It's going to take another 15 minutes for the build to finish and deploy the 
version with `_trace=1`: https://github.com/simonw/covid-19-datasette/actions/runs/1469150112", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1055469073, "label": "Research: CTEs and union all to calculate facets AND query at the same time"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1513#issuecomment-970758179", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1513", "id": 970758179, "node_id": "IC_kwDOBm6k_c453Jgj", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-16T22:47:38Z", "updated_at": "2021-11-16T22:47:38Z", "author_association": "OWNER", "body": "Trace now enabled: https://global-power-plants.datasettes.com/global-power-plants/global-power-plants?_facet_size=3&_size=2&_nocount=1&_trace=1\r\n\r\nHere are the relevant traces:\r\n```json\r\n[\r\n {\r\n \"type\": \"sql\",\r\n \"start\": 31.214430154,\r\n \"end\": 31.214817089,\r\n \"duration_ms\": 0.3869350000016425,\r\n \"traceback\": [\r\n \" File \\\"/usr/local/lib/python3.8/site-packages/datasette/views/base.py\\\", line 262, in get\\n return await self.view_get(\\n\",\r\n \" File \\\"/usr/local/lib/python3.8/site-packages/datasette/views/base.py\\\", line 477, in view_get\\n response_or_template_contexts = await self.data(\\n\",\r\n \" File \\\"/usr/local/lib/python3.8/site-packages/datasette/views/table.py\\\", line 705, in data\\n results = await db.execute(sql, params, truncate=True, **extra_args)\\n\"\r\n ],\r\n \"database\": \"global-power-plants\",\r\n \"sql\": \"select rowid, country, country_long, name, gppd_idnr, capacity_mw, latitude, longitude, primary_fuel, other_fuel1, other_fuel2, other_fuel3, commissioning_year, owner, source, url, geolocation_source, wepp_id, year_of_capacity_data, generation_gwh_2013, generation_gwh_2014, generation_gwh_2015, generation_gwh_2016, generation_gwh_2017, generation_data_source, estimated_generation_gwh from [global-power-plants] order by rowid limit 3\",\r\n \"params\": {}\r\n },\r\n {\r\n \"type\": \"sql\",\r\n \"start\": 31.215234586,\r\n \"end\": 31.220110342,\r\n \"duration_ms\": 4.875756000000564,\r\n \"traceback\": [\r\n \" File \\\"/usr/local/lib/python3.8/site-packages/datasette/views/table.py\\\", line 760, in data\\n ) = await facet.facet_results()\\n\",\r\n \" File \\\"/usr/local/lib/python3.8/site-packages/datasette/facets.py\\\", line 212, in facet_results\\n facet_rows_results = await self.ds.execute(\\n\",\r\n \" File \\\"/usr/local/lib/python3.8/site-packages/datasette/app.py\\\", line 634, in execute\\n return await self.databases[db_name].execute(\\n\"\r\n ],\r\n \"database\": \"global-power-plants\",\r\n \"sql\": \"select country_long as value, count(*) as count from (\\n select rowid, country, country_long, name, gppd_idnr, capacity_mw, latitude, longitude, primary_fuel, other_fuel1, other_fuel2, other_fuel3, commissioning_year, owner, source, url, geolocation_source, wepp_id, year_of_capacity_data, generation_gwh_2013, generation_gwh_2014, generation_gwh_2015, generation_gwh_2016, generation_gwh_2017, generation_data_source, estimated_generation_gwh from [global-power-plants] \\n )\\n where country_long is not null\\n group by country_long order by count desc, value limit 4\",\r\n \"params\": []\r\n },\r\n {\r\n \"type\": \"sql\",\r\n \"start\": 31.221062485,\r\n \"end\": 31.228968364,\r\n \"duration_ms\": 7.905878999999061,\r\n \"traceback\": [\r\n \" File 
\\\"/usr/local/lib/python3.8/site-packages/datasette/views/table.py\\\", line 760, in data\\n ) = await facet.facet_results()\\n\",\r\n \" File \\\"/usr/local/lib/python3.8/site-packages/datasette/facets.py\\\", line 212, in facet_results\\n facet_rows_results = await self.ds.execute(\\n\",\r\n \" File \\\"/usr/local/lib/python3.8/site-packages/datasette/app.py\\\", line 634, in execute\\n return await self.databases[db_name].execute(\\n\"\r\n ],\r\n \"database\": \"global-power-plants\",\r\n \"sql\": \"select owner as value, count(*) as count from (\\n select rowid, country, country_long, name, gppd_idnr, capacity_mw, latitude, longitude, primary_fuel, other_fuel1, other_fuel2, other_fuel3, commissioning_year, owner, source, url, geolocation_source, wepp_id, year_of_capacity_data, generation_gwh_2013, generation_gwh_2014, generation_gwh_2015, generation_gwh_2016, generation_gwh_2017, generation_data_source, estimated_generation_gwh from [global-power-plants] \\n )\\n where owner is not null\\n group by owner order by count desc, value limit 4\",\r\n \"params\": []\r\n },\r\n {\r\n \"type\": \"sql\",\r\n \"start\": 31.229809757,\r\n \"end\": 31.253902162,\r\n \"duration_ms\": 24.09240499999754,\r\n \"traceback\": [\r\n \" File \\\"/usr/local/lib/python3.8/site-packages/datasette/views/table.py\\\", line 760, in data\\n ) = await facet.facet_results()\\n\",\r\n \" File \\\"/usr/local/lib/python3.8/site-packages/datasette/facets.py\\\", line 212, in facet_results\\n facet_rows_results = await self.ds.execute(\\n\",\r\n \" File \\\"/usr/local/lib/python3.8/site-packages/datasette/app.py\\\", line 634, in execute\\n return await self.databases[db_name].execute(\\n\"\r\n ],\r\n \"database\": \"global-power-plants\",\r\n \"sql\": \"select primary_fuel as value, count(*) as count from (\\n select rowid, country, country_long, name, gppd_idnr, capacity_mw, latitude, longitude, primary_fuel, other_fuel1, other_fuel2, other_fuel3, commissioning_year, owner, source, url, geolocation_source, wepp_id, year_of_capacity_data, generation_gwh_2013, generation_gwh_2014, generation_gwh_2015, generation_gwh_2016, generation_gwh_2017, generation_data_source, estimated_generation_gwh from [global-power-plants] \\n )\\n where primary_fuel is not null\\n group by primary_fuel order by count desc, value limit 4\",\r\n \"params\": []\r\n },\r\n {\r\n \"type\": \"sql\",\r\n \"start\": 31.255699745,\r\n \"end\": 31.256243889,\r\n \"duration_ms\": 0.544143999999136,\r\n \"traceback\": [\r\n \" File \\\"/usr/local/lib/python3.8/site-packages/datasette/facets.py\\\", line 145, in suggest\\n row_count = await self.get_row_count()\\n\",\r\n \" File \\\"/usr/local/lib/python3.8/site-packages/datasette/facets.py\\\", line 132, in get_row_count\\n await self.ds.execute(\\n\",\r\n \" File \\\"/usr/local/lib/python3.8/site-packages/datasette/app.py\\\", line 634, in execute\\n return await self.databases[db_name].execute(\\n\"\r\n ],\r\n \"database\": \"global-power-plants\",\r\n \"sql\": \"select count(*) from (select rowid, country, country_long, name, gppd_idnr, capacity_mw, latitude, longitude, primary_fuel, other_fuel1, other_fuel2, other_fuel3, commissioning_year, owner, source, url, geolocation_source, wepp_id, year_of_capacity_data, generation_gwh_2013, generation_gwh_2014, generation_gwh_2015, generation_gwh_2016, generation_gwh_2017, generation_data_source, estimated_generation_gwh from [global-power-plants] )\",\r\n \"params\": []\r\n }\r\n]\r\n```\r\n```\r\nfetch rows: 0.3869350000016425 ms\r\nfacet country_long: 
4.875756000000564 ms\r\nfacet owner: 7.905878999999061 ms\r\nfacet primary_fuel: 24.09240499999754 ms\r\ncount: 0.544143999999136 ms\r\n```\r\nTotal = 37.8ms\r\n\r\nI modified the query to include the total count as well: https://global-power-plants.datasettes.com/global-power-plants?sql=with+cte+as+%28%0D%0A++select+rowid%2C+country%2C+country_long%2C+name%2C+owner%2C+primary_fuel%0D%0A++from+%5Bglobal-power-plants%5D%0D%0A%29%2C%0D%0Atruncated+as+%28%0D%0A++select+null+as+_facet%2C+null+as+facet_name%2C+null+as+facet_count%2C+rowid%2C+country%2C+country_long%2C+name%2C+owner%2C+primary_fuel%0D%0A++from+cte+order+by+rowid+limit+4%0D%0A%29%2C%0D%0Acountry_long_facet+as+%28%0D%0A++select+%27country_long%27+as+_facet%2C+country_long+as+facet_name%2C+count%28*%29+as+facet_count%2C%0D%0A++null%2C+null%2C+null%2C+null%2C+null%2C+null%0D%0A++from+cte+group+by+facet_name+order+by+facet_count+desc+limit+3%0D%0A%29%2C%0D%0Aowner_facet+as+%28%0D%0A++select+%27owner%27+as+_facet%2C+owner+as+facet_name%2C+count%28*%29+as+facet_count%2C%0D%0A++null%2C+null%2C+null%2C+null%2C+null%2C+null%0D%0A++from+cte+group+by+facet_name+order+by+facet_count+desc+limit+3%0D%0A%29%2C%0D%0Aprimary_fuel_facet+as+%28%0D%0A++select+%27primary_fuel%27+as+_facet%2C+primary_fuel+as+facet_name%2C+count%28*%29+as+facet_count%2C%0D%0A++null%2C+null%2C+null%2C+null%2C+null%2C+null%0D%0A++from+cte+group+by+facet_name+order+by+facet_count+desc+limit+3%0D%0A%29%2C%0D%0Atotal_count+as+%28%0D%0A++select+%27COUNT%27+as+_facet%2C+%27%27+as+facet_name%2C+count%28*%29+as+facet_count%2C%0D%0A++null%2C+null%2C+null%2C+null%2C+null%2C+null%0D%0A++from+cte%0D%0A%29%0D%0Aselect+*+from+truncated%0D%0Aunion+all+select+*+from+country_long_facet%0D%0Aunion+all+select+*+from+owner_facet%0D%0Aunion+all+select+*+from+primary_fuel_facet%0D%0Aunion+all+select+*+from+total_count&_trace=1\r\n\r\n```sql\r\nwith cte as (\r\n select rowid, country, country_long, name, owner, primary_fuel\r\n from [global-power-plants]\r\n),\r\ntruncated as (\r\n select null as _facet, null as facet_name, null as facet_count, rowid, country, country_long, name, owner, primary_fuel\r\n from cte order by rowid limit 4\r\n),\r\ncountry_long_facet as (\r\n select 'country_long' as _facet, country_long as facet_name, count(*) as facet_count,\r\n null, null, null, null, null, null\r\n from cte group by facet_name order by facet_count desc limit 3\r\n),\r\nowner_facet as (\r\n select 'owner' as _facet, owner as facet_name, count(*) as facet_count,\r\n null, null, null, null, null, null\r\n from cte group by facet_name order by facet_count desc limit 3\r\n),\r\nprimary_fuel_facet as (\r\n select 'primary_fuel' as _facet, primary_fuel as facet_name, count(*) as facet_count,\r\n null, null, null, null, null, null\r\n from cte group by facet_name order by facet_count desc limit 3\r\n),\r\ntotal_count as (\r\n select 'COUNT' as _facet, '' as facet_name, count(*) as facet_count,\r\n null, null, null, null, null, null\r\n from cte\r\n)\r\nselect * from truncated\r\nunion all select * from country_long_facet\r\nunion all select * from owner_facet\r\nunion all select * from primary_fuel_facet\r\nunion all select * from total_count\r\n```\r\nThe trace says that query took 34.801436999998714 ms.\r\n\r\nTo my huge surprise, this convoluted optimization only shaves the sum query time down from 37.8ms to 34.8ms!\r\n\r\nThat entire database file is just 11.1 MB though. 
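Here's a throwaway harness (invented schema and row count, nothing from Datasette itself) to time both strategies against a synthetic table with far more rows:\r\n```python\r\nimport random\r\nimport sqlite3\r\nimport time\r\n\r\nconn = sqlite3.connect(':memory:')\r\nconn.execute('create table plants (id integer primary key, state text, fuel text)')\r\nstates = ['CA', 'TX', 'NY', 'WA', 'OR']\r\nfuels = ['Solar', 'Hydro', 'Wind', 'Gas']\r\nconn.executemany(\r\n    'insert into plants (state, fuel) values (?, ?)',\r\n    [(random.choice(states), random.choice(fuels)) for _ in range(1_000_000)],\r\n)\r\n\r\ndef best_of(queries, runs=5):\r\n    # Best-of-n wall-clock time in ms for running the queries in sequence\r\n    best = float('inf')\r\n    for _ in range(runs):\r\n        start = time.perf_counter()\r\n        for sql in queries:\r\n            conn.execute(sql).fetchall()\r\n        best = min(best, (time.perf_counter() - start) * 1000)\r\n    return best\r\n\r\nseparate = [\r\n    'select id, state, fuel from plants order by id limit 4',\r\n    'select state, count(*) from plants group by state order by count(*) desc limit 3',\r\n    'select fuel, count(*) from plants group by fuel order by count(*) desc limit 3',\r\n    'select count(*) from plants',\r\n]\r\ncombined = ['''\r\nwith cte as (select id, state, fuel from plants),\r\ntruncated as (select null as _facet, null as facet_name, null as facet_count, id, state, fuel from cte order by id limit 4),\r\nstate_facet as (select 'state', state, count(*), null, null, null from cte group by state order by count(*) desc limit 3),\r\nfuel_facet as (select 'fuel', fuel, count(*), null, null, null from cte group by fuel order by count(*) desc limit 3),\r\ntotal as (select 'COUNT', '', count(*), null, null, null from cte)\r\nselect * from truncated union all select * from state_facet\r\nunion all select * from fuel_facet union all select * from total\r\n''']\r\nprint('separate: %.1f ms' % best_of(separate))\r\nprint('combined: %.1f ms' % best_of(combined))\r\n```\r\n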
Maybe it would make a meaningful difference on something larger?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1055469073, "label": "Research: CTEs and union all to calculate facets AND query at the same time"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1513#issuecomment-970742415", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1513", "id": 970742415, "node_id": "IC_kwDOBm6k_c453FqP", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-16T22:37:14Z", "updated_at": "2021-11-16T22:37:14Z", "author_association": "OWNER", "body": "The query takes 42.794ms to run.\r\n\r\nHere's the equivalent page using separate queries: https://global-power-plants.datasettes.com/global-power-plants/global-power-plants?_facet_size=3&_size=2&_nocount=1\r\n\r\nAnnoyingly I can't disable facet suggestions but keep facets.\r\n\r\nI'm going to turn on tracing so I can see how long the separate queries took.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1055469073, "label": "Research: CTEs and union all to calculate facets AND query at the same time"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1513#issuecomment-970738130", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1513", "id": 970738130, "node_id": "IC_kwDOBm6k_c453EnS", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-16T22:32:19Z", "updated_at": "2021-11-16T22:32:19Z", "author_association": "OWNER", "body": "I came up with the following query which seems to work!\r\n\r\n```sql\r\nwith cte as (\r\n select rowid, country, country_long, name, owner, primary_fuel\r\n from [global-power-plants]\r\n),\r\ntruncated as (\r\n select null as _facet, null as facet_name, null as facet_count, rowid, country, country_long, name, owner, primary_fuel\r\n from cte order by rowid limit 4\r\n),\r\ncountry_long_facet as (\r\n select 'country_long' as _facet, country_long as facet_name, count(*) as facet_count,\r\n null, null, null, null, null, null\r\n from cte group by facet_name order by facet_count desc limit 3\r\n),\r\nowner_facet as (\r\n select 'owner' as _facet, owner as facet_name, count(*) as facet_count,\r\n null, null, null, null, null, null\r\n from cte group by facet_name order by facet_count desc limit 3\r\n),\r\nprimary_fuel_facet as (\r\n select 'primary_fuel' as _facet, primary_fuel as facet_name, count(*) as facet_count,\r\n null, null, null, null, null, null\r\n from cte group by facet_name order by facet_count desc limit 3\r\n)\r\nselect * from truncated\r\nunion all select * from country_long_facet\r\nunion all select * from owner_facet\r\nunion all select * from primary_fuel_facet\r\n```\r\n(Limits should be 101, 31, 31, 31 but I reduced size to get a shorter example table).\r\n\r\nResults [look like 
this](https://global-power-plants.datasettes.com/global-power-plants?sql=with+cte+as+%28%0D%0A++select+rowid%2C+country%2C+country_long%2C+name%2C+owner%2C+primary_fuel%0D%0A++from+%5Bglobal-power-plants%5D%0D%0A%29%2C%0D%0Atruncated+as+%28%0D%0A++select+null+as+_facet%2C+null+as+facet_name%2C+null+as+facet_count%2C+rowid%2C+country%2C+country_long%2C+name%2C+owner%2C+primary_fuel%0D%0A++from+cte+order+by+rowid+limit+4%0D%0A%29%2C%0D%0Acountry_long_facet+as+%28%0D%0A++select+%27country_long%27+as+_facet%2C+country_long+as+facet_name%2C+count%28*%29+as+facet_count%2C%0D%0A++null%2C+null%2C+null%2C+null%2C+null%2C+null%0D%0A++from+cte+group+by+facet_name+order+by+facet_count+desc+limit+3%0D%0A%29%2C%0D%0Aowner_facet+as+%28%0D%0A++select+%27owner%27+as+_facet%2C+owner+as+facet_name%2C+count%28*%29+as+facet_count%2C%0D%0A++null%2C+null%2C+null%2C+null%2C+null%2C+null%0D%0A++from+cte+group+by+facet_name+order+by+facet_count+desc+limit+3%0D%0A%29%2C%0D%0Aprimary_fuel_facet+as+%28%0D%0A++select+%27primary_fuel%27+as+_facet%2C+primary_fuel+as+facet_name%2C+count%28*%29+as+facet_count%2C%0D%0A++null%2C+null%2C+null%2C+null%2C+null%2C+null%0D%0A++from+cte+group+by+facet_name+order+by+facet_count+desc+limit+3%0D%0A%29%0D%0Aselect+*+from+truncated%0D%0Aunion+all+select+*+from+country_long_facet%0D%0Aunion+all+select+*+from+owner_facet%0D%0Aunion+all+select+*+from+primary_fuel_facet):\r\n\r\n_facet | facet_name | facet_count | rowid | country | country_long | name | owner | primary_fuel\r\n-- | -- | -- | -- | -- | -- | -- | -- | --\r\n\u00a0 | \u00a0 | \u00a0 | 1 | AFG | Afghanistan | Kajaki Hydroelectric Power Plant Afghanistan | \u00a0 | Hydro\r\n\u00a0 | \u00a0 | \u00a0 | 2 | AFG | Afghanistan | Kandahar DOG | \u00a0 | Solar\r\n\u00a0 | \u00a0 | \u00a0 | 3 | AFG | Afghanistan | Kandahar JOL | \u00a0 | Solar\r\n\u00a0 | \u00a0 | \u00a0 | 4 | AFG | Afghanistan | Mahipar Hydroelectric Power Plant Afghanistan | \u00a0 | Hydro\r\ncountry_long | United States of America | 8688 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0\r\ncountry_long | China | 4235 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0\r\ncountry_long | United Kingdom | 2603 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0\r\nowner | \u00a0 | 14112 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0\r\nowner | Lightsource Renewable Energy | 120 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0\r\nowner | Cypress Creek Renewables | 109 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0\r\nprimary_fuel | Solar | 9662 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0\r\nprimary_fuel | Hydro | 7155 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0\r\nprimary_fuel | Wind | 5188 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0 | \u00a0\r\n\r\nThis is a neat proof of concept. 
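\r\n\r\nTo use it from Datasette the union-ed rows would need to be pulled back apart afterwards. A minimal sketch of that post-processing - a hypothetical helper, not anything that exists in Datasette yet:\r\n```python\r\ndef split_results(rows):\r\n    # Rows with a null _facet are page rows; 'COUNT' carries the total;\r\n    # anything else is a (value, count) pair for the named facet column\r\n    page_rows, facets, total = [], {}, None\r\n    for row in rows:\r\n        facet, facet_name, facet_count = row[0], row[1], row[2]\r\n        if facet is None:\r\n            page_rows.append(row[3:])\r\n        elif facet == 'COUNT':\r\n            total = facet_count\r\n        else:\r\n            facets.setdefault(facet, []).append((facet_name, facet_count))\r\n    return page_rows, facets, total\r\n```\r\nEvery branch of the union shares the same column count, so a single pass over the cursor is enough to rebuild the page rows, the facet counts and the total.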
", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1055469073, "label": "Research: CTEs and union all to calculate facets AND query at the same time"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1512#issuecomment-970718337", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1512", "id": 970718337, "node_id": "IC_kwDOBm6k_c452_yB", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-16T22:02:30Z", "updated_at": "2021-11-16T22:02:30Z", "author_association": "OWNER", "body": "I've decided to make the clever `asyncio` dependency injection opt-in - so you can either decorate with `@inject` or you can set `inject_all = True` on the class - for example:\r\n```python\r\nimport asyncio\r\nfrom datasette.utils.asyncdi import AsyncBase, inject\r\n\r\n\r\nclass Simple(AsyncBase):\r\n def __init__(self):\r\n self.log = []\r\n\r\n @inject\r\n async def two(self):\r\n self.log.append(\"two\")\r\n\r\n @inject\r\n async def one(self, two):\r\n self.log.append(\"one\")\r\n return self.log\r\n\r\n async def not_inject(self, one, two):\r\n return one + two\r\n\r\n\r\nclass Complex(AsyncBase):\r\n inject_all = True\r\n\r\n def __init__(self):\r\n self.log = []\r\n\r\n async def b(self):\r\n self.log.append(\"b\")\r\n\r\n async def a(self, b):\r\n self.log.append(\"a\")\r\n\r\n async def go(self, a):\r\n self.log.append(\"go\")\r\n return self.log\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1055402144, "label": "New pattern for async view classes"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/878#issuecomment-970712713", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/878", "id": 970712713, "node_id": "IC_kwDOBm6k_c452-aJ", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-16T21:54:33Z", "updated_at": "2021-11-16T21:54:33Z", "author_association": "OWNER", "body": "I'm going to continue working on this in a PR.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 648435885, "label": "New pattern for views that return either JSON or HTML, available for plugins"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/878#issuecomment-970705738", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/878", "id": 970705738, "node_id": "IC_kwDOBm6k_c4528tK", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-16T21:44:31Z", "updated_at": "2021-11-16T21:44:31Z", "author_association": "OWNER", "body": "Wrote a TIL about what I learned using `TopologicalSorter`: https://til.simonwillison.net/python/graphlib-topologicalsorter", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 648435885, "label": "New pattern for views that return either JSON or HTML, available for plugins"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/878#issuecomment-970673085", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/878", "id": 970673085, "node_id": "IC_kwDOBm6k_c4520u9", "user": 
{"value": 9599, "label": "simonw"}, "created_at": "2021-11-16T20:58:24Z", "updated_at": "2021-11-16T20:58:24Z", "author_association": "OWNER", "body": "New test:\r\n```python\r\n\r\nclass Complex(AsyncBase):\r\n def __init__(self):\r\n self.log = []\r\n\r\n async def d(self):\r\n await asyncio.sleep(random() * 0.1)\r\n print(\"LOG: d\")\r\n self.log.append(\"d\")\r\n\r\n async def c(self):\r\n await asyncio.sleep(random() * 0.1)\r\n print(\"LOG: c\")\r\n self.log.append(\"c\")\r\n\r\n async def b(self, c, d):\r\n print(\"LOG: b\")\r\n self.log.append(\"b\")\r\n\r\n async def a(self, b, c):\r\n print(\"LOG: a\")\r\n self.log.append(\"a\")\r\n\r\n async def go(self, a):\r\n print(\"LOG: go\")\r\n self.log.append(\"go\")\r\n return self.log\r\n\r\n\r\n@pytest.mark.asyncio\r\nasync def test_complex():\r\n result = await Complex().go()\r\n # 'c' should only be called once\r\n assert tuple(result) in (\r\n # c and d could happen in either order\r\n (\"c\", \"d\", \"b\", \"a\", \"go\"),\r\n (\"d\", \"c\", \"b\", \"a\", \"go\"),\r\n )\r\n```\r\nAnd this code passes it:\r\n```python\r\nimport asyncio\r\nfrom functools import wraps\r\nimport inspect\r\n\r\ntry:\r\n import graphlib\r\nexcept ImportError:\r\n from . import vendored_graphlib as graphlib\r\n\r\n\r\nclass AsyncMeta(type):\r\n def __new__(cls, name, bases, attrs):\r\n # Decorate any items that are 'async def' methods\r\n _registry = {}\r\n new_attrs = {\"_registry\": _registry}\r\n for key, value in attrs.items():\r\n if inspect.iscoroutinefunction(value) and not value.__name__ == \"resolve\":\r\n new_attrs[key] = make_method(value)\r\n _registry[key] = new_attrs[key]\r\n else:\r\n new_attrs[key] = value\r\n # Gather graph for later dependency resolution\r\n graph = {\r\n key: {\r\n p\r\n for p in inspect.signature(method).parameters.keys()\r\n if p != \"self\" and not p.startswith(\"_\")\r\n }\r\n for key, method in _registry.items()\r\n }\r\n new_attrs[\"_graph\"] = graph\r\n return super().__new__(cls, name, bases, new_attrs)\r\n\r\n\r\ndef make_method(method):\r\n parameters = inspect.signature(method).parameters.keys()\r\n\r\n @wraps(method)\r\n async def inner(self, _results=None, **kwargs):\r\n print(\"\\n{}.{}({}) _results={}\".format(self, method.__name__, kwargs, _results))\r\n\r\n # Any parameters not provided by kwargs are resolved from registry\r\n to_resolve = [p for p in parameters if p not in kwargs and p != \"self\"]\r\n missing = [p for p in to_resolve if p not in self._registry]\r\n assert (\r\n not missing\r\n ), \"The following DI parameters could not be found in the registry: {}\".format(\r\n missing\r\n )\r\n\r\n results = {}\r\n results.update(kwargs)\r\n if to_resolve:\r\n resolved_parameters = await self.resolve(to_resolve, _results)\r\n results.update(resolved_parameters)\r\n return_value = await method(self, **results)\r\n if _results is not None:\r\n _results[method.__name__] = return_value\r\n return return_value\r\n\r\n return inner\r\n\r\n\r\nclass AsyncBase(metaclass=AsyncMeta):\r\n async def resolve(self, names, results=None):\r\n print(\"\\n resolve: \", names)\r\n if results is None:\r\n results = {}\r\n\r\n # Come up with an execution plan, just for these nodes\r\n ts = graphlib.TopologicalSorter()\r\n to_do = set(names)\r\n done = set()\r\n while to_do:\r\n item = to_do.pop()\r\n dependencies = self._graph[item]\r\n ts.add(item, *dependencies)\r\n done.add(item)\r\n # Add any not-done dependencies to the queue\r\n to_do.update({k for k in dependencies if k not in done})\r\n\r\n ts.prepare()\r\n plan = 
[]\r\n while ts.is_active():\r\n node_group = ts.get_ready()\r\n plan.append(node_group)\r\n ts.done(*node_group)\r\n\r\n print(\"plan:\", plan)\r\n\r\n results = {}\r\n for node_group in plan:\r\n awaitables = [\r\n self._registry[name](\r\n self,\r\n _results=results,\r\n **{k: v for k, v in results.items() if k in self._graph[name]},\r\n )\r\n for name in node_group\r\n ]\r\n print(\" results = \", results)\r\n print(\" awaitables: \", awaitables)\r\n awaitable_results = await asyncio.gather(*awaitables)\r\n results.update(\r\n {p[0].__name__: p[1] for p in zip(awaitables, awaitable_results)}\r\n )\r\n\r\n print(\" End of resolve(), returning\", results)\r\n return {key: value for key, value in results.items() if key in names}\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 648435885, "label": "New pattern for views that return either JSON or HTML, available for plugins"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/878#issuecomment-970660299", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/878", "id": 970660299, "node_id": "IC_kwDOBm6k_c452xnL", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-16T20:39:43Z", "updated_at": "2021-11-16T20:42:27Z", "author_association": "OWNER", "body": "But that does seem to be the plan that `TopologicalSorter` provides:\r\n```python\r\ngraph = {\"go\": {\"a\"}, \"a\": {\"b\", \"c\"}, \"b\": {\"c\", \"d\"}}\r\n\r\nts = TopologicalSorter(graph)\r\nts.prepare()\r\nwhile ts.is_active():\r\n nodes = ts.get_ready()\r\n print(nodes)\r\n ts.done(*nodes)\r\n```\r\nOutputs:\r\n```\r\n('c', 'd')\r\n('b',)\r\n('a',)\r\n('go',)\r\n```\r\nAlso:\r\n```python\r\ngraph = {\"go\": {\"d\", \"e\", \"f\"}, \"d\": {\"b\", \"c\"}, \"b\": {\"c\"}}\r\n\r\nts = TopologicalSorter(graph)\r\nts.prepare()\r\nwhile ts.is_active():\r\n nodes = ts.get_ready()\r\n print(nodes)\r\n ts.done(*nodes)\r\n```\r\nGives:\r\n```\r\n('e', 'f', 'c')\r\n('b',)\r\n('d',)\r\n('go',)\r\n```\r\nI'm confident that `TopologicalSorter` is the way to do this. 
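In miniature, the plan-then-execute pattern could look like this (a sketch with stand-in coroutines rather than the real registry methods):\r\n```python\r\nimport asyncio\r\nfrom graphlib import TopologicalSorter  # Python 3.9+; vendored copy otherwise\r\n\r\ndef make_node(name):\r\n    async def coro():\r\n        return name  # a real node would await its dependencies' results\r\n    return coro\r\n\r\nregistry = {name: make_node(name) for name in ('a', 'b', 'c', 'd', 'go')}\r\ngraph = {'go': {'a'}, 'a': {'b', 'c'}, 'b': {'c', 'd'}}\r\n\r\nasync def execute(graph, registry):\r\n    ts = TopologicalSorter(graph)\r\n    ts.prepare()\r\n    results = {}\r\n    while ts.is_active():\r\n        group = ts.get_ready()\r\n        # Nodes in the same ready group have no edges between them,\r\n        # so it is safe to run them concurrently\r\n        values = await asyncio.gather(*(registry[name]() for name in group))\r\n        results.update(zip(group, values))\r\n        ts.done(*group)\r\n    return results\r\n\r\nprint(asyncio.run(execute(graph, registry)))\r\n```\r\n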
I think I need to rewrite my code to call it once to get that plan, then `await asyncio.gather(*nodes)` in turn to execute it.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 648435885, "label": "New pattern for views that return either JSON or HTML, available for plugins"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/878#issuecomment-970657874", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/878", "id": 970657874, "node_id": "IC_kwDOBm6k_c452xBS", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-16T20:36:01Z", "updated_at": "2021-11-16T20:36:01Z", "author_association": "OWNER", "body": "My goal here is to calculate the most efficient way to resolve the different nodes, running them in parallel where possible.\r\n\r\nSo for this class:\r\n\r\n```python\r\nclass Complex(AsyncBase):\r\n async def d(self):\r\n pass\r\n\r\n async def c(self):\r\n pass\r\n\r\n async def b(self, c, d):\r\n pass\r\n\r\n async def a(self, b, c):\r\n pass\r\n\r\n async def go(self, a):\r\n pass\r\n```\r\nA call to `go()` should do this:\r\n\r\n- `c` and `d` in parallel\r\n- `b`\r\n- `a`\r\n- `go`", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 648435885, "label": "New pattern for views that return either JSON or HTML, available for plugins"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/878#issuecomment-970655927", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/878", "id": 970655927, "node_id": "IC_kwDOBm6k_c452wi3", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-16T20:33:11Z", "updated_at": "2021-11-16T20:33:11Z", "author_association": "OWNER", "body": "What should be happening here instead is it should resolve the full graph and notice that `c` is depended on by both `b` and `a` - so it should run `c` first, then run the next ones in parallel.\r\n\r\nSo maybe the algorithm I'm inheriting from https://docs.python.org/3/library/graphlib.html isn't the correct algorithm?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 648435885, "label": "New pattern for views that return either JSON or HTML, available for plugins"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/878#issuecomment-970655304", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/878", "id": 970655304, "node_id": "IC_kwDOBm6k_c452wZI", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-16T20:32:16Z", "updated_at": "2021-11-16T20:32:16Z", "author_association": "OWNER", "body": "This code is really fiddly. I just got to this version:\r\n```python\r\nimport asyncio\r\nfrom functools import wraps\r\nimport inspect\r\n\r\ntry:\r\n import graphlib\r\nexcept ImportError:\r\n from . 
import vendored_graphlib as graphlib\r\n\r\n\r\nclass AsyncMeta(type):\r\n def __new__(cls, name, bases, attrs):\r\n # Decorate any items that are 'async def' methods\r\n _registry = {}\r\n new_attrs = {\"_registry\": _registry}\r\n for key, value in attrs.items():\r\n if inspect.iscoroutinefunction(value) and not value.__name__ == \"resolve\":\r\n new_attrs[key] = make_method(value)\r\n _registry[key] = new_attrs[key]\r\n else:\r\n new_attrs[key] = value\r\n # Gather graph for later dependency resolution\r\n graph = {\r\n key: {\r\n p\r\n for p in inspect.signature(method).parameters.keys()\r\n if p != \"self\" and not p.startswith(\"_\")\r\n }\r\n for key, method in _registry.items()\r\n }\r\n new_attrs[\"_graph\"] = graph\r\n return super().__new__(cls, name, bases, new_attrs)\r\n\r\n\r\ndef make_method(method):\r\n @wraps(method)\r\n async def inner(self, _results=None, **kwargs):\r\n print(\"inner - _results=\", _results)\r\n parameters = inspect.signature(method).parameters.keys()\r\n # Any parameters not provided by kwargs are resolved from registry\r\n to_resolve = [p for p in parameters if p not in kwargs and p != \"self\"]\r\n missing = [p for p in to_resolve if p not in self._registry]\r\n assert (\r\n not missing\r\n ), \"The following DI parameters could not be found in the registry: {}\".format(\r\n missing\r\n )\r\n results = {}\r\n results.update(kwargs)\r\n if to_resolve:\r\n resolved_parameters = await self.resolve(to_resolve, _results)\r\n results.update(resolved_parameters)\r\n return_value = await method(self, **results)\r\n if _results is not None:\r\n _results[method.__name__] = return_value\r\n return return_value\r\n\r\n return inner\r\n\r\n\r\nclass AsyncBase(metaclass=AsyncMeta):\r\n async def resolve(self, names, results=None):\r\n print(\"\\n resolve: \", names)\r\n if results is None:\r\n results = {}\r\n\r\n # Resolve them in the correct order\r\n ts = graphlib.TopologicalSorter()\r\n for name in names:\r\n ts.add(name, *self._graph[name])\r\n ts.prepare()\r\n\r\n async def resolve_nodes(nodes):\r\n print(\" resolve_nodes\", nodes)\r\n print(\" (current results = {})\".format(repr(results)))\r\n awaitables = [\r\n self._registry[name](\r\n self,\r\n _results=results,\r\n **{k: v for k, v in results.items() if k in self._graph[name]},\r\n )\r\n for name in nodes\r\n if name not in results\r\n ]\r\n print(\" awaitables: \", awaitables)\r\n awaitable_results = await asyncio.gather(*awaitables)\r\n results.update(\r\n {p[0].__name__: p[1] for p in zip(awaitables, awaitable_results)}\r\n )\r\n\r\n if not ts.is_active():\r\n # Nothing has dependencies - just resolve directly\r\n print(\" no dependencies, resolve directly\")\r\n await resolve_nodes(names)\r\n else:\r\n # Resolve in topological order\r\n while ts.is_active():\r\n nodes = ts.get_ready()\r\n print(\" ts.get_ready() returned nodes:\", nodes)\r\n await resolve_nodes(nodes)\r\n for node in nodes:\r\n ts.done(node)\r\n\r\n print(\" End of resolve(), returning\", results)\r\n return {key: value for key, value in results.items() if key in names}\r\n```\r\nWith this test:\r\n```python\r\nclass Complex(AsyncBase):\r\n def __init__(self):\r\n self.log = []\r\n\r\n async def c(self):\r\n print(\"LOG: c\")\r\n self.log.append(\"c\")\r\n\r\n async def b(self, c):\r\n print(\"LOG: b\")\r\n self.log.append(\"b\")\r\n\r\n async def a(self, b, c):\r\n print(\"LOG: a\")\r\n self.log.append(\"a\")\r\n\r\n async def go(self, a):\r\n print(\"LOG: go\")\r\n self.log.append(\"go\")\r\n return 
self.log\r\n\r\n\r\n@pytest.mark.asyncio\r\nasync def test_complex():\r\n result = await Complex().go()\r\n # 'c' should only be called once\r\n assert result == [\"c\", \"b\", \"a\", \"go\"]\r\n```\r\nThis test sometimes passes, and sometimes fails!\r\n\r\nOutput for a pass:\r\n```\r\ntests/test_asyncdi.py inner - _results= None\r\n\r\n resolve: ['a']\r\n ts.get_ready() returned nodes: ('c', 'b')\r\n resolve_nodes ('c', 'b')\r\n (current results = {})\r\n awaitables: [, ]\r\ninner - _results= {}\r\nLOG: c\r\ninner - _results= {'c': None}\r\n\r\n resolve: ['c']\r\n ts.get_ready() returned nodes: ('c',)\r\n resolve_nodes ('c',)\r\n (current results = {'c': None})\r\n awaitables: []\r\n End of resolve(), returning {'c': None}\r\nLOG: b\r\n ts.get_ready() returned nodes: ('a',)\r\n resolve_nodes ('a',)\r\n (current results = {'c': None, 'b': None})\r\n awaitables: []\r\ninner - _results= {'c': None, 'b': None}\r\nLOG: a\r\n End of resolve(), returning {'c': None, 'b': None, 'a': None}\r\nLOG: go\r\n```\r\nOutput for a fail:\r\n```\r\ntests/test_asyncdi.py inner - _results= None\r\n\r\n resolve: ['a']\r\n ts.get_ready() returned nodes: ('b', 'c')\r\n resolve_nodes ('b', 'c')\r\n (current results = {})\r\n awaitables: [, ]\r\ninner - _results= {}\r\n\r\n resolve: ['c']\r\n ts.get_ready() returned nodes: ('c',)\r\n resolve_nodes ('c',)\r\n (current results = {})\r\n awaitables: []\r\ninner - _results= {}\r\nLOG: c\r\ninner - _results= {'c': None}\r\nLOG: c\r\n End of resolve(), returning {'c': None}\r\nLOG: b\r\n ts.get_ready() returned nodes: ('a',)\r\n resolve_nodes ('a',)\r\n (current results = {'c': None, 'b': None})\r\n awaitables: []\r\ninner - _results= {'c': None, 'b': None}\r\nLOG: a\r\n End of resolve(), returning {'c': None, 'b': None, 'a': None}\r\nLOG: go\r\nF\r\n\r\n=================================================================================================== FAILURES ===================================================================================================\r\n_________________________________________________________________________________________________ test_complex _________________________________________________________________________________________________\r\n\r\n @pytest.mark.asyncio\r\n async def test_complex():\r\n result = await Complex().go()\r\n # 'c' should only be called once\r\n> assert result == [\"c\", \"b\", \"a\", \"go\"]\r\nE AssertionError: assert ['c', 'c', 'b', 'a', 'go'] == ['c', 'b', 'a', 'go']\r\nE At index 1 diff: 'c' != 'b'\r\nE Left contains one more item: 'go'\r\nE Use -v to get the full diff\r\n\r\ntests/test_asyncdi.py:48: AssertionError\r\n================== short test summary info ================================\r\nFAILED tests/test_asyncdi.py::test_complex - AssertionError: assert ['c', 'c', 'b', 'a', 'go'] == ['c', 'b', 'a', 'go']\r\n```\r\nI figured out why this is happening.\r\n\r\n`a` requires `b` and `c`\r\n\r\n`b` also requires `c`\r\n\r\nThe code decides to run `b` and `c` in parallel.\r\n\r\nIf `c` completes first, then when `b` runs it gets to use the already-calculated result for `c` - so it doesn't need to call `c` again.\r\n\r\nIf `b` gets to that point before `c` does it also needs to call `c`.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 648435885, "label": "New pattern for views that return either JSON or HTML, available for plugins"}, "performed_via_github_app": null} {"html_url": 
"https://github.com/simonw/datasette/issues/878#issuecomment-970624197", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/878", "id": 970624197, "node_id": "IC_kwDOBm6k_c452ozF", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-16T19:49:05Z", "updated_at": "2021-11-16T19:49:05Z", "author_association": "OWNER", "body": "Here's the latest version of my weird dependency injection async class:\r\n```python\r\nimport inspect\r\n\r\nclass AsyncMeta(type):\r\n def __new__(cls, name, bases, attrs):\r\n # Decorate any items that are 'async def' methods\r\n _registry = {}\r\n new_attrs = {\"_registry\": _registry}\r\n for key, value in attrs.items():\r\n if inspect.iscoroutinefunction(value) and not value.__name__ == \"resolve\":\r\n new_attrs[key] = make_method(value)\r\n _registry[key] = new_attrs[key]\r\n else:\r\n new_attrs[key] = value\r\n\r\n # Topological sort of _registry by parameter dependencies\r\n graph = {\r\n key: {\r\n p for p in inspect.signature(method).parameters.keys()\r\n if p != \"self\" and not p.startswith(\"_\")\r\n }\r\n for key, method in _registry.items()\r\n }\r\n new_attrs[\"_graph\"] = graph\r\n return super().__new__(cls, name, bases, new_attrs)\r\n\r\n\r\ndef make_method(method):\r\n @wraps(method)\r\n async def inner(self, **kwargs):\r\n parameters = inspect.signature(method).parameters.keys()\r\n # Any parameters not provided by kwargs are resolved from registry\r\n to_resolve = [p for p in parameters if p not in kwargs and p != \"self\"]\r\n missing = [p for p in to_resolve if p not in self._registry]\r\n assert (\r\n not missing\r\n ), \"The following DI parameters could not be found in the registry: {}\".format(\r\n missing\r\n )\r\n results = {}\r\n results.update(kwargs)\r\n results.update(await self.resolve(to_resolve))\r\n return await method(self, **results)\r\n\r\n return inner\r\n\r\n\r\nbad = [0]\r\n\r\nclass AsyncBase(metaclass=AsyncMeta):\r\n async def resolve(self, names):\r\n print(\" resolve({})\".format(names))\r\n results = {}\r\n # Resolve them in the correct order\r\n ts = TopologicalSorter()\r\n ts2 = TopologicalSorter()\r\n print(\" names = \", names)\r\n print(\" self._graph = \", self._graph)\r\n for name in names:\r\n if self._graph[name]:\r\n ts.add(name, *self._graph[name])\r\n ts2.add(name, *self._graph[name])\r\n print(\" static_order =\", tuple(ts2.static_order()))\r\n ts.prepare()\r\n while ts.is_active():\r\n print(\" is_active, i = \", bad[0])\r\n bad[0] += 1\r\n if bad[0] > 20:\r\n print(\" Infinite loop?\")\r\n break\r\n nodes = ts.get_ready()\r\n print(\" Do nodes:\", nodes)\r\n awaitables = [self._registry[name](self, **{\r\n k: v for k, v in results.items() if k in self._graph[name]\r\n }) for name in nodes]\r\n print(\" awaitables: \", awaitables)\r\n awaitable_results = await asyncio.gather(*awaitables)\r\n results.update({\r\n p[0].__name__: p[1] for p in zip(awaitables, awaitable_results)\r\n })\r\n print(results)\r\n for node in nodes:\r\n ts.done(node)\r\n\r\n return results\r\n```\r\nExample usage:\r\n```python\r\nclass Foo(AsyncBase):\r\n async def graa(self, boff):\r\n print(\"graa\")\r\n return 5\r\n async def boff(self):\r\n print(\"boff\")\r\n return 8\r\n async def other(self, boff, graa):\r\n print(\"other\")\r\n return 5 + boff + graa\r\n\r\nfoo = Foo()\r\nawait foo.other()\r\n```\r\nOutput:\r\n```\r\n resolve(['boff', 'graa'])\r\n names = ['boff', 'graa']\r\n self._graph = {'graa': {'boff'}, 'boff': set(), 'other': {'graa', 'boff'}}\r\n static_order = ('boff', 'graa')\r\n 
is_active, i = 0\r\n Do nodes: ('boff',)\r\n awaitables: []\r\n resolve([])\r\n names = []\r\n self._graph = {'graa': {'boff'}, 'boff': set(), 'other': {'graa', 'boff'}}\r\n static_order = ()\r\nboff\r\n{'boff': 8}\r\n is_active, i = 1\r\n Do nodes: ('graa',)\r\n awaitables: []\r\n resolve([])\r\n names = []\r\n self._graph = {'graa': {'boff'}, 'boff': set(), 'other': {'graa', 'boff'}}\r\n static_order = ()\r\ngraa\r\n{'boff': 8, 'graa': 5}\r\nother\r\n18\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 648435885, "label": "New pattern for views that return either JSON or HTML, available for plugins"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/782#issuecomment-970554697", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/782", "id": 970554697, "node_id": "IC_kwDOBm6k_c452X1J", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-16T18:32:03Z", "updated_at": "2021-11-16T18:32:03Z", "author_association": "OWNER", "body": "I'm going to take another look at this:\r\n- https://github.com/simonw/datasette/issues/878", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 627794879, "label": "Redesign default .json format"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/782#issuecomment-970553780", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/782", "id": 970553780, "node_id": "IC_kwDOBm6k_c452Xm0", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-16T18:30:51Z", "updated_at": "2021-11-16T18:30:58Z", "author_association": "OWNER", "body": "OK, I'm ready to start working on this today.\r\n\r\nI'm going to go with a default representation that looks like this:\r\n\r\n```json\r\n{\r\n \"rows\": [\r\n {\"id\": 1, \"name\": \"One\"},\r\n {\"id\": 2, \"name\": \"Two\"}\r\n ],\r\n \"next_url\": null\r\n}\r\n```\r\nNote that there's no `count` - all it provides is the current selection of results and an indication as to how the next can be retrieved (`null` if there are no more results).\r\n\r\nI'll implement `?_extra=` to provide everything else.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 627794879, "label": "Redesign default .json format"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1509#issuecomment-970544733", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1509", "id": 970544733, "node_id": "IC_kwDOBm6k_c452VZd", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-16T18:22:32Z", "updated_at": "2021-11-16T18:22:32Z", "author_association": "OWNER", "body": "This is mainly happening here:\r\n- https://github.com/simonw/datasette/issues/782", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1054243511, "label": "Datasette 1.0 JSON API (and documentation)"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1012#issuecomment-970266123", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1012", "id": 970266123, "node_id": 
"IC_kwDOBm6k_c451RYL", "user": {"value": 45380, "label": "bollwyvl"}, "created_at": "2021-11-16T13:18:36Z", "updated_at": "2021-11-16T13:18:36Z", "author_association": "CONTRIBUTOR", "body": "Congratulations, looks like it went through! There was a bit of a hold-up\non the JupyterLab ones, but it's semi automated: a dependabot pr to\nwarehouse and a CI deploy, with a click in between.\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 718540751, "label": "For 1.0 update trove classifier in setup.py"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1505#issuecomment-970188065", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1505", "id": 970188065, "node_id": "IC_kwDOBm6k_c450-Uh", "user": {"value": 7094907, "label": "Segerberg"}, "created_at": "2021-11-16T11:40:52Z", "updated_at": "2021-11-16T11:40:52Z", "author_association": "NONE", "body": "A suggestion is to have the option to choose an arbitrary delimiter (and quoting characters )", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1052247023, "label": "Datasette should have an option to output CSV with semicolons"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/448#issuecomment-969621662", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/448", "id": 969621662, "node_id": "IC_kwDOBm6k_c45y0Ce", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-16T01:32:04Z", "updated_at": "2021-11-16T01:32:04Z", "author_association": "OWNER", "body": "Tests are failing and I think it's because the facets come back in different orders, need a tie-breaker. 
https://github.com/simonw/datasette/runs/4219325197?check_suite_focus=true", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 440222719, "label": "_facet_array should work against views"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1176#issuecomment-969616626", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1176", "id": 969616626, "node_id": "IC_kwDOBm6k_c45yyzy", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-16T01:29:13Z", "updated_at": "2021-11-16T01:29:13Z", "author_association": "OWNER", "body": "I'm inclined to create a Sphinx reference documentation page for this, as I did for `sqlite-utils` here: https://sqlite-utils.datasette.io/en/stable/reference.html", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 779691739, "label": "Policy on documenting \"public\" datasette.utils functions"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1012#issuecomment-969613166", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1012", "id": 969613166, "node_id": "IC_kwDOBm6k_c45yx9u", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-16T01:27:25Z", "updated_at": "2021-11-16T01:27:25Z", "author_association": "OWNER", "body": "Requested here:\r\n\r\n- https://github.com/pypa/trove-classifiers/pull/85", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 718540751, "label": "For 1.0 update trove classifier in setup.py"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1012#issuecomment-969602825", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1012", "id": 969602825, "node_id": "IC_kwDOBm6k_c45yvcJ", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-16T01:21:14Z", "updated_at": "2021-11-16T01:21:14Z", "author_association": "OWNER", "body": "I'd been wondering how to get new classifiers into Trove - thanks, I'll give this a go.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 718540751, "label": "For 1.0 update trove classifier in setup.py"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1511#issuecomment-969600859", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1511", "id": 969600859, "node_id": "IC_kwDOBm6k_c45yu9b", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-16T01:20:13Z", "updated_at": "2021-11-16T01:20:13Z", "author_association": "OWNER", "body": "See:\r\n- #830", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1054246919, "label": "Review plugin hooks for Datasette 1.0"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/448#issuecomment-969582098", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/448", "id": 969582098, "node_id": "IC_kwDOBm6k_c45yqYS", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-16T01:10:28Z", 
"updated_at": "2021-11-16T01:10:28Z", "author_association": "OWNER", "body": "Also note that this demo data is using a SQL view to create the JSON arrays - the view is defined as such:\r\n\r\n```sql\r\nCREATE VIEW ads_with_targets as\r\nselect\r\n ads.*,\r\n json_group_array(targets.name) as target_names\r\nfrom\r\n ads\r\n join ad_targets on ad_targets.ad_id = ads.id\r\n join targets on ad_targets.target_id = targets.id\r\ngroup by\r\n ad_targets.ad_id;\r\n```\r\nSo running JSON faceting on top of that view is a pretty big ask!", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 440222719, "label": "_facet_array should work against views"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/448#issuecomment-969578466", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/448", "id": 969578466, "node_id": "IC_kwDOBm6k_c45ypfi", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-16T01:08:29Z", "updated_at": "2021-11-16T01:08:29Z", "author_association": "OWNER", "body": "Actually with the cache warmed up it looks like the facet query is taking 150ms which is good enough.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 440222719, "label": "_facet_array should work against views"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/448#issuecomment-969572281", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/448", "id": 969572281, "node_id": "IC_kwDOBm6k_c45yn-5", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-16T01:05:11Z", "updated_at": "2021-11-16T01:05:11Z", "author_association": "OWNER", "body": "I tried this and it seems to work correctly:\r\n```python\r\n for source_and_config in self.get_configs():\r\n config = source_and_config[\"config\"]\r\n source = source_and_config[\"source\"]\r\n column = config.get(\"column\") or config[\"simple\"]\r\n facet_sql = \"\"\"\r\n with inner as ({sql}),\r\n deduped_array_items as (\r\n select\r\n distinct j.value,\r\n inner.*\r\n from\r\n json_each([inner].{col}) j\r\n join inner\r\n )\r\n select\r\n value as value,\r\n count(*) as count\r\n from\r\n deduped_array_items\r\n group by\r\n value\r\n order by\r\n count(*) desc limit {limit}\r\n \"\"\".format(\r\n col=escape_sqlite(column), sql=self.sql, limit=facet_size + 1\r\n )\r\n```\r\nThe queries are _very_ slow though - I had to bump up to 2s time limit even against only a view returning 3,499 rows.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 440222719, "label": "_facet_array should work against views"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/448#issuecomment-969557008", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/448", "id": 969557008, "node_id": "IC_kwDOBm6k_c45ykQQ", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-16T00:56:09Z", "updated_at": "2021-11-16T00:59:59Z", "author_association": "OWNER", "body": "This looks like it might work:\r\n```sql\r\nwith inner as (\r\n select\r\n *\r\n from\r\n ads_with_targets\r\n where\r\n :p0 in (\r\n select\r\n value\r\n from\r\n 
json_each([ads_with_targets].[target_names])\r\n )\r\n),\r\ndeduped_array_items as (\r\n select\r\n distinct j.value,\r\n inner.*\r\n from\r\n json_each([inner].[target_names]) j\r\n join inner\r\n)\r\nselect\r\n value,\r\n count(*)\r\nfrom\r\n deduped_array_items\r\ngroup by\r\n value\r\norder by\r\n count(*) desc\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 440222719, "label": "_facet_array should work against views"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/448#issuecomment-969557972", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/448", "id": 969557972, "node_id": "IC_kwDOBm6k_c45ykfU", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-16T00:56:58Z", "updated_at": "2021-11-16T00:56:58Z", "author_association": "OWNER", "body": "It uses a CTE which were introduced in SQLite 3.8 - and AWS Lambda Python 3.9 still provides 3.7 - but I've checked and I can use `pysqlite3-binary` to work around that there so I'm OK relying on CTEs for this.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 440222719, "label": "_facet_array should work against views"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/448#issuecomment-969449772", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/448", "id": 969449772, "node_id": "IC_kwDOBm6k_c45yKEs", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-15T23:48:37Z", "updated_at": "2021-11-15T23:48:37Z", "author_association": "OWNER", "body": "Given this query: https://json-view-facet-bug-demo-j7hipcg4aq-uc.a.run.app/russian-ads?sql=select%0D%0A++j.value+as+value%2C%0D%0A++count%28*%29+as+count%0D%0Afrom%0D%0A++%28%0D%0A++++select%0D%0A++++++id%2C%0D%0A++++++file%2C%0D%0A++++++clicks%2C%0D%0A++++++impressions%2C%0D%0A++++++text%2C%0D%0A++++++url%2C%0D%0A++++++spend_amount%2C%0D%0A++++++spend_currency%2C%0D%0A++++++created%2C%0D%0A++++++ended%2C%0D%0A++++++target_names%0D%0A++++from%0D%0A++++++ads_with_targets%0D%0A++++where%0D%0A++++++%3Ap0+in+%28%0D%0A++++++++select%0D%0A++++++++++value%0D%0A++++++++from%0D%0A++++++++++json_each%28%5Bads_with_targets%5D.%5Btarget_names%5D%29%0D%0A++++++%29%0D%0A++%29%0D%0A++join+json_each%28target_names%29+j%0D%0Agroup+by%0D%0A++j.value%0D%0Aorder+by%0D%0A++count+desc%2C%0D%0A++value%0D%0Alimit%0D%0A++31&p0=people_who_match%3Ainterests%3AAfrican-American+culture\r\n\r\n```sql\r\nselect\r\n j.value as value,\r\n count(*) as count\r\nfrom\r\n (\r\n select\r\n id,\r\n file,\r\n clicks,\r\n impressions,\r\n text,\r\n url,\r\n spend_amount,\r\n spend_currency,\r\n created,\r\n ended,\r\n target_names\r\n from\r\n ads_with_targets\r\n where\r\n :p0 in (\r\n select\r\n value\r\n from\r\n json_each([ads_with_targets].[target_names])\r\n )\r\n )\r\n join json_each(target_names) j\r\ngroup by\r\n j.value\r\norder by\r\n count desc,\r\n value\r\nlimit\r\n 31\r\n```\r\nHow can I return a count of the number of documents containing each tag, but not the number of total tags that match including duplicates?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 440222719, "label": "_facet_array should work against views"}, 
"performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/448#issuecomment-969446972", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/448", "id": 969446972, "node_id": "IC_kwDOBm6k_c45yJY8", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-15T23:46:13Z", "updated_at": "2021-11-15T23:46:13Z", "author_association": "OWNER", "body": "It looks like the problem here is that some of the tags occur more than once in the documents:\r\n\r\n\"russian-ads__ads_with_targets__172_rows_where_where_target_names_contains__people_who_match_interests_African-American_culture_\"\r\n\r\nSo they get counted more than once, hence the 182 count for something that couldn't possibly return more than 172 documents.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 440222719, "label": "_facet_array should work against views"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/448#issuecomment-969442215", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/448", "id": 969442215, "node_id": "IC_kwDOBm6k_c45yIOn", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-15T23:42:03Z", "updated_at": "2021-11-15T23:42:03Z", "author_association": "OWNER", "body": "I think this code is wrong in the `ArrayFacet` class: https://github.com/simonw/datasette/blob/502c02fa6dde6a8bb840af6c4c8cf858aa1db687/datasette/facets.py#L357-L364", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 440222719, "label": "_facet_array should work against views"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/448#issuecomment-969440918", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/448", "id": 969440918, "node_id": "IC_kwDOBm6k_c45yH6W", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-15T23:40:17Z", "updated_at": "2021-11-15T23:40:35Z", "author_association": "OWNER", "body": "Applied that fix to the `arraycontains` filter but I'm still getting bad results for the faceting:\r\n\r\n\"russian-ads__ads_with_targets__172_rows_where_where_target_names_contains__people_who_match_interests_African-American_culture__and_datasette_\u2014_pipenv_shell_\u25b8_python_\u2014_80\u00d724\"\r\n\r\nShould never get 182 results on a page that faceting against only 172 items.\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 440222719, "label": "_facet_array should work against views"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/448#issuecomment-969436930", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/448", "id": 969436930, "node_id": "IC_kwDOBm6k_c45yG8C", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-15T23:31:58Z", "updated_at": "2021-11-15T23:31:58Z", "author_association": "OWNER", "body": "I think this SQL recipe may work instead:\r\n```sql\r\nselect\r\n *\r\nfrom\r\n ads_with_targets\r\nwhere\r\n 'people_who_match:interests:African-American Civil Rights Movement (1954\u201468)' in (\r\n select\r\n value\r\n from\r\n json_each(target_names)\r\n )\r\n and 'interests:Martin Luther King III' in (\r\n 
select\r\n value\r\n from\r\n json_each(target_names)\r\n )\r\n```\r\nhttps://json-view-facet-bug-demo-j7hipcg4aq-uc.a.run.app/russian-ads?sql=select%0D%0A++*%0D%0Afrom%0D%0A++ads_with_targets%0D%0Awhere%0D%0A++%27people_who_match%3Ainterests%3AAfrican-American+Civil+Rights+Movement+%281954%E2%80%9468%29%27+in+%28%0D%0A++++select%0D%0A++++++value%0D%0A++++from%0D%0A++++++json_each%28target_names%29%0D%0A++%29%0D%0A++and+%27interests%3AMartin+Luther+King+III%27+in+%28%0D%0A++++select%0D%0A++++++value%0D%0A++++from%0D%0A++++++json_each%28target_names%29%0D%0A++%29&interests=&African=&Martin=", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 440222719, "label": "_facet_array should work against views"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/519#issuecomment-969433734", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/519", "id": 969433734, "node_id": "IC_kwDOBm6k_c45yGKG", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-15T23:26:11Z", "updated_at": "2021-11-15T23:26:11Z", "author_association": "OWNER", "body": "I'm happy with this as the goals for 1.0. I'm going to close this issue and create three tracking tickets for the three key themes:\r\n\r\n- https://github.com/simonw/datasette/issues/1509\r\n- https://github.com/simonw/datasette/issues/1510\r\n- https://github.com/simonw/datasette/issues/1511", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 459590021, "label": "Decide what goes into Datasette 1.0"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1508#issuecomment-968904414", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1508", "id": 968904414, "node_id": "IC_kwDOBm6k_c45wE7e", "user": {"value": 22429695, "label": "codecov[bot]"}, "created_at": "2021-11-15T13:20:49Z", "updated_at": "2021-11-15T13:20:49Z", "author_association": "NONE", "body": "# [Codecov](https://codecov.io/gh/simonw/datasette/pull/1508?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison) Report\n> Merging [#1508](https://codecov.io/gh/simonw/datasette/pull/1508?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison) (299774b) into [main](https://codecov.io/gh/simonw/datasette/commit/502c02fa6dde6a8bb840af6c4c8cf858aa1db687?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison) (502c02f) will **not change** coverage.\n> The diff coverage is `n/a`.\n\n[![Impacted file tree graph](https://codecov.io/gh/simonw/datasette/pull/1508/graphs/tree.svg?width=650&height=150&src=pr&token=eSahVY7kw1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison)](https://codecov.io/gh/simonw/datasette/pull/1508?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison)\n\n```diff\n@@ Coverage Diff @@\n## main #1508 +/- ##\n=======================================\n Coverage 91.82% 91.82% \n=======================================\n Files 34 34 \n Lines 4430 4430 \n=======================================\n Hits 4068 4068 \n Misses 362 362 
\n```\n\n\n\n------\n\n[Continue to review full report at Codecov](https://codecov.io/gh/simonw/datasette/pull/1508?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison).\n> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison)\n> `\u0394 = absolute (impact)`, `\u00f8 = not affected`, `? = missing data`\n> Powered by [Codecov](https://codecov.io/gh/simonw/datasette/pull/1508?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison). Last update [502c02f...299774b](https://codecov.io/gh/simonw/datasette/pull/1508?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison).\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1053655062, "label": "Update docutils requirement from <0.18 to <0.19"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/329#issuecomment-968470212", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/329", "id": 968470212, "node_id": "IC_kwDOCGYnMM45ua7E", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-15T02:49:28Z", "updated_at": "2021-11-15T02:49:28Z", "author_association": "OWNER", "body": "I was going to replace all of the `validate_column_names()` bits with something that fixed them instead, but I think I have a better idea: I'm only going to apply the fix for the various `.insert()` methods that create the initial tables.\r\n\r\nI'll keep the `validate_column_names()` where they are at the moment. Once you've inserted the data and created the tables it will be up to you to use the new, correct column names.\r\n\r\nThis avoids the whole issue of needing to rewrite parameters, and solves the immediate problem, which is consuming CSV files with bad column names.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1005891028, "label": "Rethink approach to [ and ] in column names (currently throws error)"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/329#issuecomment-968458837", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/329", "id": 968458837, "node_id": "IC_kwDOCGYnMM45uYJV", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-15T02:21:15Z", "updated_at": "2021-11-15T02:21:15Z", "author_association": "OWNER", "body": "I'm not going to implement a fix that rewrites the `pk` and `column_order` and other parameters - at least not yet. 
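For illustration only, a minimal sketch of the kind of `fix_column_names()` helper being discussed in this thread - the function name and the underscore replacement are assumptions taken from the examples in the surrounding comments, not the shipped implementation:

```python
def fix_column_names(row):
    # Replace the problematic [ and ] characters in column names with _,
    # matching the foo[bar] -> foo_bar_ example discussed below.
    return {
        key.replace("[", "_").replace("]", "_"): value
        for key, value in row.items()
    }


print(fix_column_names({"foo[bar]": 4}))
# {'foo_bar_': 4}
```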
The main thing I'm trying to fix here is what happens when you attempt to import a CSV file with `[ ]` in the column names, which should be unaffected by that second challenge.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1005891028, "label": "Rethink approach to [ and ] in column names (currently throws error)"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/329#issuecomment-968453129", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/329", "id": 968453129, "node_id": "IC_kwDOCGYnMM45uWwJ", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-15T02:07:46Z", "updated_at": "2021-11-15T02:07:46Z", "author_association": "OWNER", "body": "If I replace `validate_column_names(row.keys())` with `fix_column_names(row)` I need to decide what to do about things like `pk=` and `column_order=`.\r\n\r\nWhat should the following do?\r\n\r\n```python\r\ntable.insert({\"foo[bar]\": 4}, pk=\"foo[bar]\", column_order=[\"foo[bar]\"])\r\n```\r\nShould it spot the old column names in the `pk=` and `column_order=` parameters and pretend that `foo_bar_` was passed instead?\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1005891028, "label": "Rethink approach to [ and ] in column names (currently throws error)"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/329#issuecomment-968451954", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/329", "id": 968451954, "node_id": "IC_kwDOCGYnMM45uWdy", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-15T02:05:29Z", "updated_at": "2021-11-15T02:05:29Z", "author_association": "OWNER", "body": "> I could even have those replacement characters be properties of the `Database` class, so users can sub-class and change them.\r\n\r\nI'm not going to do this - it's unnecessary extra complexity and it means the function that fixes the column names needs to have access to the current `Database` instance.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1005891028, "label": "Rethink approach to [ and ] in column names (currently throws error)"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/339#issuecomment-968450579", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/339", "id": 968450579, "node_id": "IC_kwDOCGYnMM45uWIT", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-15T02:02:34Z", "updated_at": "2021-11-15T02:02:34Z", "author_association": "OWNER", "body": "Documentation: https://github.com/simonw/sqlite-utils/blob/54a2269e91ce72b059618662ed133a85f3d42e4a/docs/python-api.rst#working-with-lookup-tables", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1053122092, "label": "`table.lookup()` option to populate additional columns when creating a record"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/339#issuecomment-968435041", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/339", "id": 
968435041, "node_id": "IC_kwDOCGYnMM45uSVh", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-15T01:44:42Z", "updated_at": "2021-11-15T01:44:42Z", "author_association": "OWNER", "body": "`lookup(column_values, extra_values)` is one option.\r\n\r\n`column_values` isn't actually a great name for the first parameter any more, since the second parameter also takes column values. The first parameter is now all about the unique lookup values.\r\n\r\nMaybe this:\r\n\r\n lookup(lookup_values, extra_values)", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1053122092, "label": "`table.lookup()` option to populate additional columns when creating a record"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/339#issuecomment-968434594", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/339", "id": 968434594, "node_id": "IC_kwDOCGYnMM45uSOi", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-15T01:43:10Z", "updated_at": "2021-11-15T01:43:10Z", "author_association": "OWNER", "body": "What should I call this parameter? Django has a similar feature where it calls them `defaults=` (for `get_or_create()`) but I'm not a huge fan of that name.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1053122092, "label": "`table.lookup()` option to populate additional columns when creating a record"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/339#issuecomment-968434425", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/339", "id": 968434425, "node_id": "IC_kwDOCGYnMM45uSL5", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-15T01:42:36Z", "updated_at": "2021-11-15T01:42:36Z", "author_association": "OWNER", "body": "Here's the current signature of `table.lookup()`: https://github.com/simonw/sqlite-utils/blob/9cda5b070f885a7995f0c307bcc4f45f0812994a/sqlite_utils/db.py#L2716-L2729\r\n\r\nI'm going to add a second positional argument which can provide a dictionary of column->value to use when creating the original table and populating the initial row. 
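Based on that description, usage could look something like this - a sketch against the proposed two-positional-argument form, written before the final parameter names were settled:

```python
import sqlite_utils

db = sqlite_utils.Database(memory=True)

# First argument: the unique values the lookup is based on.
# Second (proposed) argument: extra columns, only set when the row is created.
id1 = db["species"].lookup({"name": "Palm"}, {"first_seen": "2021-11-15"})

# A second call with the same lookup values returns the same id and
# does not overwrite the extra columns.
id2 = db["species"].lookup({"name": "Palm"}, {"first_seen": "2022-01-01"})
assert id1 == id2
```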
If the row already exists, those columns will be ignored entirely.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1053122092, "label": "`table.lookup()` option to populate additional columns when creating a record"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/pull/322#issuecomment-968401459", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/322", "id": 968401459, "node_id": "IC_kwDOCGYnMM45uKIz", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-15T00:26:42Z", "updated_at": "2021-11-15T00:26:42Z", "author_association": "OWNER", "body": "This relates to the fact that dictionaries, lists and tuples get special treatment and are converted to JSON strings, using this code: https://github.com/simonw/sqlite-utils/blob/e8d958109ee290cfa1b44ef7a39629bb50ab673e/sqlite_utils/db.py#L2937-L2947\r\n\r\nSo the `COLUMN_TYPE_MAPPING` should include those too - right now it looks like this: https://github.com/simonw/sqlite-utils/blob/e8d958109ee290cfa1b44ef7a39629bb50ab673e/sqlite_utils/db.py#L165-L188", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 979612115, "label": "Add dict type to be mapped as TEXT in sqllite"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/pull/324#issuecomment-968384988", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/324", "id": 968384988, "node_id": "IC_kwDOCGYnMM45uGHc", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-14T23:25:16Z", "updated_at": "2021-11-14T23:25:16Z", "author_association": "OWNER", "body": "Yes this was absolutely the intention! Thanks, I wonder how often I've made that mistake in other projects?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 988013247, "label": "Use python-dateutil package instead of dateutils"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/331#issuecomment-968384005", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/331", "id": 968384005, "node_id": "IC_kwDOCGYnMM45uF4F", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-14T23:19:29Z", "updated_at": "2021-11-14T23:20:32Z", "author_association": "OWNER", "body": "Tested it like this, against a freshly built `.tar.gz` package from my development environment:\r\n```\r\n(w) w % mypy . \r\nhello.py:1: error: Skipping analyzing \"sqlite_utils\": found module but no type hints or library stubs\r\nhello.py:1: note: See https://mypy.readthedocs.io/en/stable/running_mypy.html#missing-imports\r\nFound 1 error in 1 file (checked 1 source file)\r\n(w) w % pip install ~/Dropbox/Development/sqlite-utils/dist/sqlite-utils-3.17.1.tar.gz\r\nProcessing /Users/simon/Dropbox/Development/sqlite-utils/dist/sqlite-utils-3.17.1.tar.gz\r\n...\r\nSuccessfully installed sqlite-utils-3.17.1\r\n(w) w % mypy . 
\r\nSuccess: no issues found in 1 source file\r\n```\r\nI tested against the `.whl` too.\r\n\r\nMy `hello.py` script contained this:\r\n```python\r\nimport sqlite_utils\r\nfrom typing import cast\r\n\r\nif __name__ == \"__main__\":\r\n db = sqlite_utils.Database(memory=True)\r\n table = cast(sqlite_utils.db.Table, db[\"foo\"])\r\n table.insert({\"id\": 5})\r\n print(list(db.query(\"select * from foo\")))\r\n```\r\nThat `cast()` is necessary because without it you get this error:\r\n```\r\n(w) w % mypy .\r\nhello.py:7: error: Item \"View\" of \"Union[Table, View]\" has no attribute \"insert\"\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1026794056, "label": "Mypy error: found module but no type hints or library stubs"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/331#issuecomment-968381939", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/331", "id": 968381939, "node_id": "IC_kwDOCGYnMM45uFXz", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-14T23:06:20Z", "updated_at": "2021-11-14T23:06:20Z", "author_association": "OWNER", "body": "Thanks - I didn't know this was needed!", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1026794056, "label": "Mypy error: found module but no type hints or library stubs"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/332#issuecomment-968380675", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/332", "id": 968380675, "node_id": "IC_kwDOCGYnMM45uFED", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-14T22:57:56Z", "updated_at": "2021-11-14T22:57:56Z", "author_association": "OWNER", "body": "This is a great idea.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1028056713, "label": "`sqlite-utils memory --flatten` option to flatten nested JSON"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/335#issuecomment-968380387", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/335", "id": 968380387, "node_id": "IC_kwDOCGYnMM45uE_j", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-14T22:55:56Z", "updated_at": "2021-11-14T22:55:56Z", "author_association": "OWNER", "body": "OK, this should fix it.", "reactions": "{\"total_count\": 1, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 1, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1042569687, "label": "sqlite-utils index-foreign-keys fails due to pre-existing index"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/335#issuecomment-968371112", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/335", "id": 968371112, "node_id": "IC_kwDOCGYnMM45uCuo", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-14T21:57:43Z", "updated_at": "2021-11-14T22:21:31Z", "author_association": "OWNER", "body": "`create_index(..., find_unique_name=)` is good. Default to false. 
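A rough sketch of the retry loop that `find_unique_name=True` implies - a hypothetical standalone helper with an assumed `_2`, `_3`, ... naming scheme, not the shipped code:

```python
import sqlite3


def create_index_finding_unique_name(conn, table, columns, base_name):
    # Try the derived name first; on a name collision, append a numeric
    # suffix and retry until CREATE INDEX succeeds.
    suffix = None
    while True:
        name = base_name if suffix is None else "{}_{}".format(base_name, suffix)
        columns_sql = ", ".join("[{}]".format(c) for c in columns)
        try:
            conn.execute(
                "CREATE INDEX [{}] ON [{}] ({})".format(name, table, columns_sql)
            )
            return name
        except sqlite3.OperationalError as ex:
            # "index ... already exists" signals a name collision
            if "already exists" not in str(ex):
                raise
            suffix = 2 if suffix is None else suffix + 1
```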
`index_foreign_keys` can set it to true.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1042569687, "label": "sqlite-utils index-foreign-keys fails due to pre-existing index"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/335#issuecomment-968361671", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/335", "id": 968361671, "node_id": "IC_kwDOCGYnMM45uAbH", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-14T20:54:53Z", "updated_at": "2021-11-14T21:01:14Z", "author_association": "OWNER", "body": "I'm leaning towards `table.create_index(columns, ignore_existing_name=True)` now.\r\n\r\nOr `resolve_existing_name` - or `skip_existing_name`?\r\n\r\n\"ignore\" sounds like it might not create the index if the name exists, but we want to still create the index but pick a new name.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1042569687, "label": "sqlite-utils index-foreign-keys fails due to pre-existing index"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/335#issuecomment-968362285", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/335", "id": 968362285, "node_id": "IC_kwDOCGYnMM45uAkt", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-14T20:59:44Z", "updated_at": "2021-11-14T20:59:44Z", "author_association": "OWNER", "body": "I think I'll attempt to create the index and re-try if it fails with that error.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1042569687, "label": "sqlite-utils index-foreign-keys fails due to pre-existing index"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/335#issuecomment-968362214", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/335", "id": 968362214, "node_id": "IC_kwDOCGYnMM45uAjm", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-14T20:59:15Z", "updated_at": "2021-11-14T20:59:15Z", "author_association": "OWNER", "body": "How to figure out if an index name is already in use? `PRAGMA index_list(t)` requires a table name. 
This does it:\r\n\r\n```sql\r\nSELECT name \r\nFROM sqlite_master \r\nWHERE type = 'index';\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1042569687, "label": "sqlite-utils index-foreign-keys fails due to pre-existing index"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/335#issuecomment-968361409", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/335", "id": 968361409, "node_id": "IC_kwDOCGYnMM45uAXB", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-14T20:52:55Z", "updated_at": "2021-11-14T20:52:55Z", "author_association": "OWNER", "body": "Looking at the method signature: https://github.com/simonw/sqlite-utils/blob/92aa5c9c5d26b0889c8c3d97c76a908d5f8af211/sqlite_utils/db.py#L1518-L1524\r\n\r\n`if_not_exists` just adds a `IF NOT EXISTS` clause here: https://github.com/simonw/sqlite-utils/blob/92aa5c9c5d26b0889c8c3d97c76a908d5f8af211/sqlite_utils/db.py#L1549-L1561", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1042569687, "label": "sqlite-utils index-foreign-keys fails due to pre-existing index"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/335#issuecomment-968361285", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/335", "id": 968361285, "node_id": "IC_kwDOCGYnMM45uAVF", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-14T20:51:57Z", "updated_at": "2021-11-14T20:51:57Z", "author_association": "OWNER", "body": "SQLite will happily create multiple identical indexes on a table, using more disk space each time:\r\n```pycon\r\n>>> import sqlite_utils\r\n>>> db = sqlite_utils.Database(\"dupes.db\")\r\n>>> db[\"t\"].insert_all({\"id\": i} for i in range(10000))\r\n\r\n# dupes.db is 98304 bytes\r\n>>> db[\"t\"].create_index([\"id\"])\r\n
\r\n# dupes.db is 204800 bytes\r\n>>> db[\"t\"].indexes\r\n[Index(seq=0, name='idx_t_id', unique=0, origin='c', partial=0, columns=['id'])]\r\n>>> db[\"t\"].create_index([\"id\"], index_name=\"t_idx_t_id_2\")\r\n
\r\n# 311296 bytes\r\n>>> db[\"t\"].create_index([\"id\"], index_name=\"t_idx_t_id_3\")\r\n
\r\n# 417792 bytes\r\n>>> db.vacuum()\r\n# Still 417792 bytes\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1042569687, "label": "sqlite-utils index-foreign-keys fails due to pre-existing index"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/335#issuecomment-968360538", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/335", "id": 968360538, "node_id": "IC_kwDOCGYnMM45uAJa", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-14T20:46:56Z", "updated_at": "2021-11-14T20:46:56Z", "author_association": "OWNER", "body": "I'm tempted to not provide an opt-out option either: if you call `table.create_index(...)` without specifying an index name I think the tool should create the index for you, quietly picking an index name that works.\r\n\r\nBut... it feels wasteful to create an index that exactly duplicates an existing index. Would SQLite even let you do that or would it notice and NOT double the amount of disk space used for that index?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1042569687, "label": "sqlite-utils index-foreign-keys fails due to pre-existing index"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/335#issuecomment-968360387", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/335", "id": 968360387, "node_id": "IC_kwDOCGYnMM45uAHD", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-14T20:45:44Z", "updated_at": "2021-11-14T20:45:44Z", "author_association": "OWNER", "body": "What would such an option be called? 
Some options:\r\n\r\n- `table.create_index([fk.column], force=True)` - not obvious what `force` means here\r\n- `table.create_index([fk.column], ignore_existing_name=True)` - not obvious what `ignore` means here\r\n- `table.create_index([fk.column], pick_unique_name=True)` - bit verbose\r\n\r\nIf the user doesn't pass in an explicit name it seems like their intent is \"just create me the index, I don't care what name you use\" - so actually perhaps the default behaviour here should be to pick a new unique name if that name is already in use.\r\n\r\nThen maybe there should be an option to disable that - some options there:\r\n\r\n- `table.create_index([fk.column], error_on_existing_index_name=True)` - too verbose\r\n- `table.create_index([fk.column], force=False)` - not clear what `force` means\r\n\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1042569687, "label": "sqlite-utils index-foreign-keys fails due to pre-existing index"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/335#issuecomment-968359868", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/335", "id": 968359868, "node_id": "IC_kwDOCGYnMM45t_-8", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-14T20:41:42Z", "updated_at": "2021-11-14T20:41:42Z", "author_association": "OWNER", "body": "The \"index idx_generators_eia860_report_date already exists\" error suggests that the problem here is actually one of an index name collision.\r\n\r\n```python\r\n table.create_index([fk.column]) \r\n```\r\nThis will derive a name for the index automatically from the name of the table and the name of the passed in columns: https://github.com/simonw/sqlite-utils/blob/92aa5c9c5d26b0889c8c3d97c76a908d5f8af211/sqlite_utils/db.py#L1536-L1539\r\n\r\nSo perhaps `.create_index()` should grow an extra option that creates the index even if the name already exists, by finding a new name.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1042569687, "label": "sqlite-utils index-foreign-keys fails due to pre-existing index"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/335#issuecomment-968359137", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/335", "id": 968359137, "node_id": "IC_kwDOCGYnMM45t_zh", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-14T20:37:00Z", "updated_at": "2021-11-14T20:37:00Z", "author_association": "OWNER", "body": "This is strange - the code already checks that an index doesn't exist before attempting to create it: https://github.com/simonw/sqlite-utils/blob/92aa5c9c5d26b0889c8c3d97c76a908d5f8af211/sqlite_utils/db.py#L893-L902", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1042569687, "label": "sqlite-utils index-foreign-keys fails due to pre-existing index"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1507#issuecomment-968210842", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1507", "id": 968210842, "node_id": "IC_kwDOBm6k_c45tbma", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-14T05:41:55Z", "updated_at": 
"2021-11-14T05:41:55Z", "author_association": "OWNER", "body": "Here's the build with that fix: https://readthedocs.org/projects/datasette/builds/15268498/\r\n\r\nIt passed and published the docs: https://docs.datasette.io/en/latest/changelog.html", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1052851176, "label": "ReadTheDocs build failed for 0.59.2 release"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1507#issuecomment-968210222", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1507", "id": 968210222, "node_id": "IC_kwDOBm6k_c45tbcu", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-14T05:34:14Z", "updated_at": "2021-11-14T05:34:14Z", "author_association": "OWNER", "body": "Here's the new build using Python 3: https://readthedocs.org/projects/datasette/builds/15268482/\r\n\r\nIt's still broken. Here's one of many issue threads about it, this one has a workaround fix: https://github.com/readthedocs/readthedocs.org/issues/8616#issuecomment-952034858\r\n\r\n> For future readers, the solution for this problem is to pin `docutils<0.18` in your `requirements.txt` file, and have a `.readthedocs.yaml` file with these contents:\r\n> \r\n> ```\r\n> version: 2\r\n> \r\n> python:\r\n> install:\r\n> - requirements: docs/requirements.txt\r\n> ```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1052851176, "label": "ReadTheDocs build failed for 0.59.2 release"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1507#issuecomment-968209957", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1507", "id": 968209957, "node_id": "IC_kwDOBm6k_c45tbYl", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-14T05:31:07Z", "updated_at": "2021-11-14T05:31:07Z", "author_association": "OWNER", "body": "Looks like ReadTheDocs builds started failing for `latest` a few weeks ago:\r\n\r\n\"Banners_and_Alerts_and_Builds___Read_the_Docs\"\r\n\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1052851176, "label": "ReadTheDocs build failed for 0.59.2 release"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1507#issuecomment-968209731", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1507", "id": 968209731, "node_id": "IC_kwDOBm6k_c45tbVD", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-14T05:28:41Z", "updated_at": "2021-11-14T05:28:41Z", "author_association": "OWNER", "body": "I will try adding a `.readthedocs.yml` file: https://docs.readthedocs.io/en/stable/config-file/v2.html#python-version\r\n\r\nThis might work:\r\n\r\n```\r\nversion: 2\r\n\r\nbuild:\r\n os: ubuntu-20.04\r\n tools:\r\n python: \"3.9\"\r\n\r\nsphinx:\r\n configuration: docs/conf.py\r\n```\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1052851176, "label": "ReadTheDocs build failed for 0.59.2 release"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1507#issuecomment-968209616", "issue_url": 
"https://api.github.com/repos/simonw/datasette/issues/1507", "id": 968209616, "node_id": "IC_kwDOBm6k_c45tbTQ", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-14T05:27:22Z", "updated_at": "2021-11-14T05:27:22Z", "author_association": "OWNER", "body": "https://blog.readthedocs.com/default-python-3/ they started defaulting new projects to Python 3 back in Feb 2019 but clearly my project was created before then.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1052851176, "label": "ReadTheDocs build failed for 0.59.2 release"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1507#issuecomment-968209560", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1507", "id": 968209560, "node_id": "IC_kwDOBm6k_c45tbSY", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-14T05:26:36Z", "updated_at": "2021-11-14T05:26:36Z", "author_association": "OWNER", "body": "It looks like my builds there still run on Python 2!\r\n\r\n```\r\ngit clone --no-single-branch --depth 50 https://github.com/simonw/datasette .\r\ngit checkout --force de1e031713f47fbd51eb7239db3e7e6025fbf81a\r\ngit clean -d -f -f\r\npython2.7 -mvirtualenv /home/docs/checkouts/readthedocs.org/user_builds/datasette/envs/0.59.2\r\n/home/docs/checkouts/readthedocs.org/user_builds/datasette/envs/0.59.2/bin/python -m pip install --upgrade --no-cache-dir pip setuptools\r\n/home/docs/checkouts/readthedocs.org/user_builds/datasette/envs/0.59.2/bin/python -m pip install --upgrade --no-cache-dir mock==1.0.1 pillow==5.4.1 alabaster>=0.7,<0.8,!=0.7.5 commonmark==0.8.1 recommonmark==0.5.0 sphinx<2 sphinx-rtd-theme<0.5 readthedocs-sphinx-ext<2.2\r\ncat docs/conf.py \r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1052851176, "label": "ReadTheDocs build failed for 0.59.2 release"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1503#issuecomment-968207906", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1503", "id": 968207906, "node_id": "IC_kwDOBm6k_c45ta4i", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-14T05:08:26Z", "updated_at": "2021-11-14T05:08:26Z", "author_association": "OWNER", "body": "Error:\r\n```\r\n def test_table_html_filter_form_column_options(\r\n path, expected_column_options, app_client\r\n ):\r\n response = app_client.get(path)\r\n assert response.status == 200\r\n form = Soup(response.body, \"html.parser\").find(\"form\")\r\n column_options = [\r\n o.attrs.get(\"value\") or o.string\r\n for o in form.select(\"select[name=_filter_column] option\")\r\n ]\r\n> assert expected_column_options == column_options\r\nE AssertionError: assert ['- column -'...wid', 'value'] == ['- column -', 'value']\r\nE At index 1 diff: 'rowid' != 'value'\r\nE Left contains one more item: 'value'\r\nE Use -v to get the full diff\r\n```\r\nThis is because `rowid` isn't a table column but IS returned by the query used on that page.\r\n\r\nMy solution: start with the query columns, but then add any table columns that were not already returned by the query to the end of the `filter_columns` list.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", 
"issue": {"value": 1050163432, "label": "`?_nocol=` removes that column from the filter interface"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1506#issuecomment-968192980", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1506", "id": 968192980, "node_id": "IC_kwDOBm6k_c45tXPU", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-14T02:22:40Z", "updated_at": "2021-11-14T02:22:40Z", "author_association": "OWNER", "body": "I think the answer is to spot this case and link to `?_item_exact=x` instead of `?_item=x` - it looks like that's already recommended in the documentation here: https://docs.datasette.io/en/stable/json_api.html#column-filter-arguments\r\n\r\n> **?column__exact=value** or **?_column=value**\r\n> Returns rows where the specified column exactly matches the value.\r\n\r\nSo maybe the facet selection rendering logic needs to spot this and link correctly to it?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1052826038, "label": "Columns beginning with an underscore do not facet correctly"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1380#issuecomment-967801997", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1380", "id": 967801997, "node_id": "IC_kwDOBm6k_c45r3yN", "user": {"value": 7094907, "label": "Segerberg"}, "created_at": "2021-11-13T08:05:37Z", "updated_at": "2021-11-13T08:09:11Z", "author_association": "NONE", "body": "@glasnt yeah I guess that could be an option. I run datasette on large databases > 75gb and the startup time is a bit slow for me even with -i --inspect-file options. Here's a quick sketch for a plugin that will reload db's in a folder that you set for the plugin in metadata.json. If you request /-reload-db new db's will be added. (You probably want to implement some authentication for this =) ) \r\n\r\nhttps://gist.github.com/Segerberg/b96a0e0a5389dce2396497323cda7042\r\n", "reactions": "{\"total_count\": 1, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 1, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 924748955, "label": "Serve all db files in a folder"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1380#issuecomment-967747190", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1380", "id": 967747190, "node_id": "IC_kwDOBm6k_c45rqZ2", "user": {"value": 813732, "label": "glasnt"}, "created_at": "2021-11-13T00:47:26Z", "updated_at": "2021-11-13T00:47:26Z", "author_association": "CONTRIBUTOR", "body": "Would it make sense to run datasette with a fswatch/inotifywait on a folder, then? ", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 924748955, "label": "Serve all db files in a folder"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1380#issuecomment-967181828", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1380", "id": 967181828, "node_id": "IC_kwDOBm6k_c45pgYE", "user": {"value": 7094907, "label": "Segerberg"}, "created_at": "2021-11-12T15:00:18Z", "updated_at": "2021-11-12T20:02:29Z", "author_association": "NONE", "body": "There is no such option see https://github.com/simonw/datasette/issues/43. 
But you could write a plugin using the datasette.add_database(db, name=None) https://docs.datasette.io/en/stable/internals.html#add-database-db-name-none ", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 924748955, "label": "Serve all db files in a folder"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/26#issuecomment-964205475", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/26", "id": 964205475, "node_id": "IC_kwDOCGYnMM45eJuj", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2021-11-09T14:31:29Z", "updated_at": "2021-11-09T14:31:29Z", "author_association": "CONTRIBUTOR", "body": "i was just reaching for a tool to do this this morning", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 455486286, "label": "Mechanism for turning nested JSON into foreign keys / many-to-many"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/336#issuecomment-962411119", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/336", "id": 962411119, "node_id": "IC_kwDOCGYnMM45XTpv", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-11-06T07:21:04Z", "updated_at": "2021-11-06T07:21:04Z", "author_association": "OWNER", "body": "I've never used `DEFAULT 'CURRENT_TIMESTAMP'` myself so this one should be an interesting bug to explore.", "reactions": "{\"total_count\": 1, \"+1\": 1, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1044267332, "label": "sqlite-util tranform --column-order mangles columns of type \"timestamp\""}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/pull/337#issuecomment-962259527", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/337", "id": 962259527, "node_id": "IC_kwDOCGYnMM45WupH", "user": {"value": 771193, "label": "urbas"}, "created_at": "2021-11-05T22:33:02Z", "updated_at": "2021-11-05T22:33:02Z", "author_association": "NONE", "body": "Smokes, it looks like there was a bug in click 8.0.2 (fixed in 8.0.3: https://github.com/pallets/click/issues/2089). Meaning this PR is not needed. Closing.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1046271107, "label": "Default values for `--attach` and `--param` options"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1284#issuecomment-851567204", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1284", "id": 851567204, "node_id": "MDEyOklzc3VlQ29tbWVudDg1MTU2NzIwNA==", "user": {"value": 192568, "label": "mroswell"}, "created_at": "2021-05-31T15:42:10Z", "updated_at": "2021-11-04T03:15:01Z", "author_association": "CONTRIBUTOR", "body": "I very much want to make:\r\n https://list.SaferDisinfectants.org/disinfectants/listN \r\nhave this URL:\r\n https://list.SaferDisinfectants.org/\r\n \r\nI'm using only one table page on the site, with no pagination. I'm not using the home page, though when I tried to move my table to the home page as mentioned above, I failed to figure out how. 
\r\n\r\nI am using cloudflare, but I haven't figured out a forwarding or HTML re-write method of doing this, either.\r\n\r\nIs there any way I can get a prettier list URL? I'm on Vercel.\r\n\r\n(I have a wordpress site on the main domain on a separate host.)", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 845794436, "label": "Feature or Documentation Request: Individual table as home page template"}, "performed_via_github_app": null}