issue_comments


26 rows where issue = 648435885 sorted by updated_at descending


user 2

  • simonw 25
  • 20after4 1

author_association 2

  • OWNER 25
  • NONE 1

issue 1

  • New pattern for views that return either JSON or HTML, available for plugins · 26 ✖
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions issue performed_via_github_app
1073037939 https://github.com/simonw/datasette/issues/878#issuecomment-1073037939 https://api.github.com/repos/simonw/datasette/issues/878 IC_kwDOBm6k_c4_9UJz simonw 9599 2022-03-19T16:19:30Z 2022-03-19T16:19:30Z OWNER

On revisiting https://gist.github.com/simonw/281eac9c73b062c3469607ad86470eb2 a few months later I'm having second thoughts about using @inject on the main() method.

But I still like the pattern as a way to resolve more complex cases like "to generate GeoJSON of the expanded view with labels, the label expansion code needs to run once at some point before the GeoJSON formatting code does".

So I'm going to stick with it a tiny bit longer, but maybe try to make it a lot more explicit when it's going to happen rather than having the main view methods themselves also use async DI.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
New pattern for views that return either JSON or HTML, available for plugins 648435885  
1001699559 https://github.com/simonw/datasette/issues/878#issuecomment-1001699559 https://api.github.com/repos/simonw/datasette/issues/878 IC_kwDOBm6k_c47tLjn simonw 9599 2021-12-27T18:53:04Z 2021-12-27T18:53:04Z OWNER

I'm going to see if I can come up with the simplest possible version of this pattern for the /-/metadata and /-/metadata.json page, then try it for the database query page, before tackling the much more complex table page.
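As a rough illustration of the shape this issue is after (one function gathers the data, a thin layer picks the representation), here is a minimal sketch using only the standard library. `gather_metadata`, `render`, `view` and the sample data are hypothetical names invented for this example; it does not use Datasette's `Response` class or its routing.

```python
import html
import json


def gather_metadata():
    # Hypothetical stand-in for whatever the real view would look up.
    return {"title": "Demo instance", "license": "Apache-2.0"}


def render(data, format_):
    """Return (content_type, body) for the requested format."""
    if format_ == "json":
        return "application/json", json.dumps(data)
    # Anything else falls back to HTML, escaping every key and value.
    items = "".join(
        "<li>{}: {}</li>".format(html.escape(str(k)), html.escape(str(v)))
        for k, v in data.items()
    )
    return "text/html", "<ul>{}</ul>".format(items)


def view(path):
    # "/-/metadata.json" selects JSON, "/-/metadata" selects HTML.
    format_ = "json" if path.endswith(".json") else "html"
    return render(gather_metadata(), format_)
```

The point of the split is that the data-gathering step knows nothing about the output format, so the same function can serve both URLs.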

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
New pattern for views that return either JSON or HTML, available for plugins 648435885  
973678931 https://github.com/simonw/datasette/issues/878#issuecomment-973678931 https://api.github.com/repos/simonw/datasette/issues/878 IC_kwDOBm6k_c46CSlT simonw 9599 2021-11-19T02:51:17Z 2021-11-19T02:51:17Z OWNER

OK, I managed to get a table to render! Here's the code I used - I had to copy a LOT of stuff. https://gist.github.com/simonw/281eac9c73b062c3469607ad86470eb2

I'm going to move this work into a new, separate issue.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
New pattern for views that return either JSON or HTML, available for plugins 648435885  
973635157 https://github.com/simonw/datasette/issues/878#issuecomment-973635157 https://api.github.com/repos/simonw/datasette/issues/878 IC_kwDOBm6k_c46CH5V simonw 9599 2021-11-19T01:07:08Z 2021-11-19T01:07:08Z OWNER

This exercise is proving so useful in getting my head around how the enormous and complex TableView class works again.

Here's where I've got to now - I'm systematically working through the variables that are returned for HTML and for JSON copying across code to get it to work:

```python
from datasette.database import QueryInterrupted
from datasette.utils import escape_sqlite
from datasette.utils.asgi import Response, NotFound, Forbidden
from datasette.views.base import DatasetteError
from datasette import hookimpl
from asyncinject import AsyncInject, inject
from pprint import pformat


class Table(AsyncInject):
    @inject
    async def database(self, request, datasette):
        # TODO: all that nasty hash resolving stuff can go here
        db_name = request.url_vars["db_name"]
        try:
            db = datasette.databases[db_name]
        except KeyError:
            raise NotFound(f"Database '{db_name}' does not exist")
        return db

    @inject
    async def table_and_format(self, request, database, datasette):
        table_and_format = request.url_vars["table_and_format"]
        # TODO: be a lot smarter here
        if "." in table_and_format:
            return table_and_format.split(".", 2)
        else:
            return table_and_format, "html"

    @inject
    async def main(self, request, database, table_and_format, datasette):
        # TODO: if this is actually a canned query, dispatch to it

        table, format = table_and_format

        is_view = bool(await database.get_view_definition(table))
        table_exists = bool(await database.table_exists(table))
        if not is_view and not table_exists:
            raise NotFound(f"Table not found: {table}")

        await check_permissions(
            datasette,
            request,
            [
                ("view-table", (database.name, table)),
                ("view-database", database.name),
                "view-instance",
            ],
        )

        private = not await datasette.permission_allowed(
            None, "view-table", (database.name, table), default=True
        )

        pks = await database.primary_keys(table)
        table_columns = await database.table_columns(table)

        specified_columns = await columns_to_select(datasette, database, table, request)
        select_specified_columns = ", ".join(
            escape_sqlite(t) for t in specified_columns
        )
        select_all_columns = ", ".join(escape_sqlite(t) for t in table_columns)

        use_rowid = not pks and not is_view
        if use_rowid:
            select_specified_columns = f"rowid, {select_specified_columns}"
            select_all_columns = f"rowid, {select_all_columns}"
            order_by = "rowid"
            order_by_pks = "rowid"
        else:
            order_by_pks = ", ".join([escape_sqlite(pk) for pk in pks])
            order_by = order_by_pks

        if is_view:
            order_by = ""

        nocount = request.args.get("_nocount")
        nofacet = request.args.get("_nofacet")

        if request.args.get("_shape") in ("array", "object"):
            nocount = True
            nofacet = True

        # Next, a TON of SQL to build where_params and filters and suchlike
        # skipping that and jumping straight to...
        where_clauses = []
        where_clause = ""
        if where_clauses:
            where_clause = f"where {' and '.join(where_clauses)} "

        from_sql = "from {table_name} {where}".format(
            table_name=escape_sqlite(table),
            where=("where {} ".format(" and ".join(where_clauses)))
            if where_clauses
            else "",
        )
        from_sql_params = {}
        params = {}
        count_sql = f"select count(*) {from_sql}"
        sql_no_order_no_limit = (
            "select {select_all_columns} from {table_name} {where}".format(
                select_all_columns=select_all_columns,
                table_name=escape_sqlite(table),
                where=where_clause,
            )
        )

        page_size = 100
        offset = " offset 0"

        sql = "select {select_specified_columns} from {table_name} {where}{order_by} limit {page_size}{offset}".format(
            select_specified_columns=select_specified_columns,
            table_name=escape_sqlite(table),
            where=where_clause,
            order_by=order_by,
            page_size=page_size + 1,
            offset=offset,
        )

        # Fetch rows
        results = await database.execute(sql, params, truncate=True)
        columns = [r[0] for r in results.description]
        rows = list(results.rows)

        # Fetch count
        filtered_table_rows_count = None
        if count_sql:
            try:
                count_rows = list(await database.execute(count_sql, from_sql_params))
                filtered_table_rows_count = count_rows[0][0]
            except QueryInterrupted:
                pass

        vars = {
            "json": {
                # THIS STUFF is from the regular JSON
                "database": database.name,
                "table": table,
                "is_view": is_view,
                # "human_description_en": human_description_en,
                "rows": rows[:page_size],
                "truncated": results.truncated,
                "filtered_table_rows_count": filtered_table_rows_count,
                # "expanded_columns": expanded_columns,
                # "expandable_columns": expandable_columns,
                "columns": columns,
                "primary_keys": pks,
                # "units": units,
                "query": {"sql": sql, "params": params},
                # "facet_results": facet_results,
                # "suggested_facets": suggested_facets,
                # "next": next_value and str(next_value) or None,
                # "next_url": next_url,
                "private": private,
                "allow_execute_sql": await datasette.permission_allowed(
                    request.actor, "execute-sql", database, default=True
                ),
            },
            "html": {
                # ... this is the HTML special stuff
                # "table_actions": table_actions,
                # "supports_search": bool(fts_table),
                # "search": search or "",
                "use_rowid": use_rowid,
                # "filters": filters,
                # "display_columns": display_columns,
                # "filter_columns": filter_columns,
                # "display_rows": display_rows,
                # "facets_timed_out": facets_timed_out,
                # "sorted_facet_results": sorted(
                #     facet_results.values(),
                #     key=lambda f: (len(f["results"]), f["name"]),
                #     reverse=True,
                # ),
                # "show_facet_counts": special_args.get("_facet_size") == "max",
                # "extra_wheres_for_ui": extra_wheres_for_ui,
                # "form_hidden_args": form_hidden_args,
                # "is_sortable": any(c["sortable"] for c in display_columns),
                # "path_with_replaced_args": path_with_replaced_args,
                # "path_with_removed_args": path_with_removed_args,
                # "append_querystring": append_querystring,
                "request": request,
                # "sort": sort,
                # "sort_desc": sort_desc,
                "disable_sort": is_view,
                # "custom_table_templates": [
                #     f"_table-{to_css_class(database)}-{to_css_class(table)}.html",
                #     f"_table-table-{to_css_class(database)}-{to_css_class(table)}.html",
                #     "_table.html",
                # ],
                # "metadata": metadata,
                # "view_definition": await db.get_view_definition(table),
                # "table_definition": await db.get_table_definition(table),
            },
        }

        # I'm just trying to get HTML to work for the moment
        if format == "json":
            return Response.json(dict(vars, locals=locals()), default=repr)
        else:
            return Response.html(repr(vars["html"]))

    async def view(self, request, datasette):
        return await self.main(request=request, datasette=datasette)


@hookimpl
def register_routes():
    return [
        (r"/t/(?P<db_name>[^/]+)/(?P<table_and_format>[^/]+?$)", Table().view),
    ]


async def check_permissions(datasette, request, permissions):
    """permissions is a list of (action, resource) tuples or 'action' strings"""
    for permission in permissions:
        if isinstance(permission, str):
            action = permission
            resource = None
        elif isinstance(permission, (tuple, list)) and len(permission) == 2:
            action, resource = permission
        else:
            assert (
                False
            ), "permission should be string or tuple of two items: {}".format(
                repr(permission)
            )
        ok = await datasette.permission_allowed(
            request.actor,
            action,
            resource=resource,
            default=None,
        )
        if ok is not None:
            if ok:
                return
            else:
                raise Forbidden(action)


async def columns_to_select(datasette, database, table, request):
    table_columns = await database.table_columns(table)
    pks = await database.primary_keys(table)
    columns = list(table_columns)
    if "_col" in request.args:
        columns = list(pks)
        _cols = request.args.getlist("_col")
        bad_columns = [column for column in _cols if column not in table_columns]
        if bad_columns:
            raise DatasetteError(
                "_col={} - invalid columns".format(", ".join(bad_columns)),
                status=400,
            )
        # De-duplicate maintaining order:
        columns.extend(dict.fromkeys(_cols))
    if "_nocol" in request.args:
        # Return all columns EXCEPT these
        bad_columns = [
            column
            for column in request.args.getlist("_nocol")
            if (column not in table_columns) or (column in pks)
        ]
        if bad_columns:
            raise DatasetteError(
                "_nocol={} - invalid columns".format(", ".join(bad_columns)),
                status=400,
            )
        tmp_columns = [
            column
            for column in columns
            if column not in request.args.getlist("_nocol")
        ]
        columns = tmp_columns
    return columns
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
New pattern for views that return either JSON or HTML, available for plugins 648435885  
973568285 https://github.com/simonw/datasette/issues/878#issuecomment-973568285 https://api.github.com/repos/simonw/datasette/issues/878 IC_kwDOBm6k_c46B3kd simonw 9599 2021-11-19T00:29:20Z 2021-11-19T00:29:20Z OWNER

This is working!

```python
from datasette.utils.asgi import Response
from datasette import hookimpl
import html
from asyncinject import AsyncInject, inject


class Table(AsyncInject):
    @inject
    async def database(self, request):
        return request.url_vars["db_name"]

    @inject
    async def main(self, request, database):
        return Response.html("Database: {}".format(
            html.escape(database)
        ))

    async def view(self, request):
        return await self.main(request=request)


@hookimpl
def register_routes():
    return [
        (r"/t/(?P<db_name>[^/]+)/(?P<table_and_format>[^/]+?$)", Table().view),
    ]
```

This project will definitely show me if I actually like the `asyncinject` patterns or not.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
New pattern for views that return either JSON or HTML, available for plugins 648435885  
973564260 https://github.com/simonw/datasette/issues/878#issuecomment-973564260 https://api.github.com/repos/simonw/datasette/issues/878 IC_kwDOBm6k_c46B2lk simonw 9599 2021-11-19T00:27:06Z 2021-11-19T00:27:06Z OWNER

Problem: the fancy asyncinject stuff interferes with the fancy Datasette thing that introspects view functions to look for what parameters they take:

```python
class Table(asyncinject.AsyncInjectAll):
    async def view(self, request):
        return Response.html("Hello from {}".format(
            html.escape(repr(request.url_vars))
        ))


@hookimpl
def register_routes():
    return [
        (r"/t/(?P<db_name>[^/]+)/(?P<table_and_format>[^/]+?$)", Table().view),
    ]
```

This failed with error: "Table.view() takes 1 positional argument but 2 were given"

So I'm going to use AsyncInject and have the view function NOT use the @inject decorator.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
New pattern for views that return either JSON or HTML, available for plugins 648435885  
973554024 https://github.com/simonw/datasette/issues/878#issuecomment-973554024 https://api.github.com/repos/simonw/datasette/issues/878 IC_kwDOBm6k_c46B0Fo simonw 9599 2021-11-19T00:21:20Z 2021-11-19T00:21:20Z OWNER

That's annoying: it looks like plugins can't use register_routes() to override default routes within Datasette itself. This didn't work:

```python
from datasette.utils.asgi import Response
from datasette import hookimpl
import html


async def table(request):
    return Response.html("Hello from {}".format(
        html.escape(repr(request.url_vars))
    ))


@hookimpl
def register_routes():
    return [
        (r"/(?P<db_name>[^/]+)/(?P<table_and_format>[^/]+?$)", table),
    ]
```

I'll use a `/t/` prefix for the moment, but this is probably something I'll fix in Datasette itself later.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
New pattern for views that return either JSON or HTML, available for plugins 648435885  
973542284 https://github.com/simonw/datasette/issues/878#issuecomment-973542284 https://api.github.com/repos/simonw/datasette/issues/878 IC_kwDOBm6k_c46BxOM simonw 9599 2021-11-19T00:16:44Z 2021-11-19T00:16:44Z OWNER

Development % cookiecutter gh:simonw/datasette-plugin
You've downloaded /Users/simon/.cookiecutters/datasette-plugin before. Is it okay to delete and re-download it? [yes]: yes
plugin_name []: table-new
description []: New implementation of TableView, see https://github.com/simonw/datasette/issues/878
hyphenated [table-new]:
underscored [table_new]:
github_username []: simonw
author_name []: Simon Willison
include_static_directory []:
include_templates_directory []:

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
New pattern for views that return either JSON or HTML, available for plugins 648435885  
973527870 https://github.com/simonw/datasette/issues/878#issuecomment-973527870 https://api.github.com/repos/simonw/datasette/issues/878 IC_kwDOBm6k_c46Bts- simonw 9599 2021-11-19T00:13:43Z 2021-11-19T00:13:43Z OWNER

New plan: I'm going to build a brand new implementation of TableView starting out as a plugin, using the register_routes() plugin hook.

It will reuse the existing HTML template but will be a completely new Python implementation, based on asyncinject.

I'm going to start by just getting the table to show up on the page - then I'll add faceting, suggested facets, filters and so-on.

Bonus: I'm going to see if I can get it to work for arbitrary SQL queries too (stretch goal).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
New pattern for views that return either JSON or HTML, available for plugins 648435885  
971209475 https://github.com/simonw/datasette/issues/878#issuecomment-971209475 https://api.github.com/repos/simonw/datasette/issues/878 IC_kwDOBm6k_c4543sD simonw 9599 2021-11-17T05:41:42Z 2021-11-17T05:41:42Z OWNER

I'm going to build a brand new implementation of the TableView class that doesn't subclass BaseView at all, instead using asyncinject. If I'm lucky that will clean up the grungiest part of the codebase.

I can maybe even run the tests against old TableView and TableView2 to check that they behave the same.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
New pattern for views that return either JSON or HTML, available for plugins 648435885  
971057553 https://github.com/simonw/datasette/issues/878#issuecomment-971057553 https://api.github.com/repos/simonw/datasette/issues/878 IC_kwDOBm6k_c454SmR simonw 9599 2021-11-17T01:40:45Z 2021-11-17T01:40:45Z OWNER

I shipped that code as a new library, asyncinject: https://pypi.org/project/asyncinject/ - I'll open a new PR to attempt to refactor TableView to use it.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
New pattern for views that return either JSON or HTML, available for plugins 648435885  
970712713 https://github.com/simonw/datasette/issues/878#issuecomment-970712713 https://api.github.com/repos/simonw/datasette/issues/878 IC_kwDOBm6k_c452-aJ simonw 9599 2021-11-16T21:54:33Z 2021-11-16T21:54:33Z OWNER

I'm going to continue working on this in a PR.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
New pattern for views that return either JSON or HTML, available for plugins 648435885  
970705738 https://github.com/simonw/datasette/issues/878#issuecomment-970705738 https://api.github.com/repos/simonw/datasette/issues/878 IC_kwDOBm6k_c4528tK simonw 9599 2021-11-16T21:44:31Z 2021-11-16T21:44:31Z OWNER

Wrote a TIL about what I learned using TopologicalSorter: https://til.simonwillison.net/python/graphlib-topologicalsorter

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
New pattern for views that return either JSON or HTML, available for plugins 648435885  
970673085 https://github.com/simonw/datasette/issues/878#issuecomment-970673085 https://api.github.com/repos/simonw/datasette/issues/878 IC_kwDOBm6k_c4520u9 simonw 9599 2021-11-16T20:58:24Z 2021-11-16T20:58:24Z OWNER

New test:

```python
class Complex(AsyncBase):
    def __init__(self):
        self.log = []

    async def d(self):
        await asyncio.sleep(random() * 0.1)
        print("LOG: d")
        self.log.append("d")

    async def c(self):
        await asyncio.sleep(random() * 0.1)
        print("LOG: c")
        self.log.append("c")

    async def b(self, c, d):
        print("LOG: b")
        self.log.append("b")

    async def a(self, b, c):
        print("LOG: a")
        self.log.append("a")

    async def go(self, a):
        print("LOG: go")
        self.log.append("go")
        return self.log


@pytest.mark.asyncio
async def test_complex():
    result = await Complex().go()
    # 'c' should only be called once
    assert tuple(result) in (
        # c and d could happen in either order
        ("c", "d", "b", "a", "go"),
        ("d", "c", "b", "a", "go"),
    )
```

And this code passes it:

```python
import asyncio
from functools import wraps
import inspect

try:
    import graphlib
except ImportError:
    from . import vendored_graphlib as graphlib


class AsyncMeta(type):
    def __new__(cls, name, bases, attrs):
        # Decorate any items that are 'async def' methods
        _registry = {}
        new_attrs = {"_registry": _registry}
        for key, value in attrs.items():
            if inspect.iscoroutinefunction(value) and not value.__name__ == "resolve":
                new_attrs[key] = make_method(value)
                _registry[key] = new_attrs[key]
            else:
                new_attrs[key] = value
        # Gather graph for later dependency resolution
        graph = {
            key: {
                p
                for p in inspect.signature(method).parameters.keys()
                if p != "self" and not p.startswith("_")
            }
            for key, method in _registry.items()
        }
        new_attrs["_graph"] = graph
        return super().__new__(cls, name, bases, new_attrs)


def make_method(method):
    parameters = inspect.signature(method).parameters.keys()

    @wraps(method)
    async def inner(self, _results=None, **kwargs):
        print("\n{}.{}({}) _results={}".format(self, method.__name__, kwargs, _results))

        # Any parameters not provided by kwargs are resolved from registry
        to_resolve = [p for p in parameters if p not in kwargs and p != "self"]
        missing = [p for p in to_resolve if p not in self._registry]
        assert (
            not missing
        ), "The following DI parameters could not be found in the registry: {}".format(
            missing
        )

        results = {}
        results.update(kwargs)
        if to_resolve:
            resolved_parameters = await self.resolve(to_resolve, _results)
            results.update(resolved_parameters)
        return_value = await method(self, **results)
        if _results is not None:
            _results[method.__name__] = return_value
        return return_value

    return inner


class AsyncBase(metaclass=AsyncMeta):
    async def resolve(self, names, results=None):
        print("\n resolve: ", names)
        if results is None:
            results = {}

        # Come up with an execution plan, just for these nodes
        ts = graphlib.TopologicalSorter()
        to_do = set(names)
        done = set()
        while to_do:
            item = to_do.pop()
            dependencies = self._graph[item]
            ts.add(item, *dependencies)
            done.add(item)
            # Add any not-done dependencies to the queue
            to_do.update({k for k in dependencies if k not in done})

        ts.prepare()
        plan = []
        while ts.is_active():
            node_group = ts.get_ready()
            plan.append(node_group)
            ts.done(*node_group)

        print("plan:", plan)

        results = {}
        for node_group in plan:
            awaitables = [
                self._registry[name](
                    self,
                    _results=results,
                    **{k: v for k, v in results.items() if k in self._graph[name]},
                )
                for name in node_group
            ]
            print("    results = ", results)
            print("    awaitables: ", awaitables)
            awaitable_results = await asyncio.gather(*awaitables)
            results.update(
                {p[0].__name__: p[1] for p in zip(awaitables, awaitable_results)}
            )

        print("  End of resolve(), returning", results)
        return {key: value for key, value in results.items() if key in names}
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
New pattern for views that return either JSON or HTML, available for plugins 648435885  
970660299 https://github.com/simonw/datasette/issues/878#issuecomment-970660299 https://api.github.com/repos/simonw/datasette/issues/878 IC_kwDOBm6k_c452xnL simonw 9599 2021-11-16T20:39:43Z 2021-11-16T20:42:27Z OWNER

But that does seem to be the plan that TopologicalSorter provides:

```python
graph = {"go": {"a"}, "a": {"b", "c"}, "b": {"c", "d"}}

ts = TopologicalSorter(graph)
ts.prepare()
while ts.is_active():
    nodes = ts.get_ready()
    print(nodes)
    ts.done(*nodes)
```

Outputs:

```
('c', 'd')
('b',)
('a',)
('go',)
```

Also:

```python
graph = {"go": {"d", "e", "f"}, "d": {"b", "c"}, "b": {"c"}}

ts = TopologicalSorter(graph)
ts.prepare()
while ts.is_active():
    nodes = ts.get_ready()
    print(nodes)
    ts.done(*nodes)
```

Gives:

```
('e', 'f', 'c')
('b',)
('d',)
('go',)
```

I'm confident that `TopologicalSorter` is the way to do this. I think I need to rewrite my code to call it once to get that plan, then `await asyncio.gather(*nodes)` in turn to execute it.
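The "get the plan once, then gather each batch in turn" idea can be sketched in isolation, assuming Python 3.9+ for `graphlib`. `make_plan`, `run_plan` and the `funcs` mapping are hypothetical names for this illustration; this is not the asyncinject implementation, just the bare technique.

```python
import asyncio
from graphlib import TopologicalSorter


def make_plan(graph):
    """Return a list of batches; every node in a batch has all its dependencies met."""
    ts = TopologicalSorter(graph)
    ts.prepare()
    plan = []
    while ts.is_active():
        batch = ts.get_ready()
        plan.append(batch)
        ts.done(*batch)
    return plan


async def run_plan(graph, funcs):
    """Run each batch with asyncio.gather, passing earlier results as keyword arguments."""
    results = {}
    for batch in make_plan(graph):
        values = await asyncio.gather(
            *(
                funcs[name](**{dep: results[dep] for dep in graph.get(name, ())})
                for name in batch
            )
        )
        # gather() returns values in the same order as its arguments
        results.update(zip(batch, values))
    return results


# Hypothetical coroutines matching the graph used in the comment above
calls = []


async def c():
    calls.append("c")
    return 1


async def d():
    calls.append("d")
    return 2


async def b(c, d):
    return c + d


async def a(b, c):
    return b + c


async def go(a):
    return a


graph = {"go": {"a"}, "a": {"b", "c"}, "b": {"c", "d"}}
funcs = {"a": a, "b": b, "c": c, "d": d, "go": go}
```

Because each node appears in exactly one batch, every coroutine is called once, no matter how many other nodes depend on it.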

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
New pattern for views that return either JSON or HTML, available for plugins 648435885  
970657874 https://github.com/simonw/datasette/issues/878#issuecomment-970657874 https://api.github.com/repos/simonw/datasette/issues/878 IC_kwDOBm6k_c452xBS simonw 9599 2021-11-16T20:36:01Z 2021-11-16T20:36:01Z OWNER

My goal here is to calculate the most efficient way to resolve the different nodes, running them in parallel where possible.

So for this class:

```python
class Complex(AsyncBase):
    async def d(self):
        pass

    async def c(self):
        pass

    async def b(self, c, d):
        pass

    async def a(self, b, c):
        pass

    async def go(self, a):
        pass
```

A call to `go()` should do this:

  • c and d in parallel
  • b
  • a
  • go
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
New pattern for views that return either JSON or HTML, available for plugins 648435885  
970655927 https://github.com/simonw/datasette/issues/878#issuecomment-970655927 https://api.github.com/repos/simonw/datasette/issues/878 IC_kwDOBm6k_c452wi3 simonw 9599 2021-11-16T20:33:11Z 2021-11-16T20:33:11Z OWNER

What should be happening here instead is it should resolve the full graph and notice that c is depended on by both b and a - so it should run c first, then run the next ones in parallel.

So maybe the algorithm I'm inheriting from https://docs.python.org/3/library/graphlib.html isn't the correct algorithm?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
New pattern for views that return either JSON or HTML, available for plugins 648435885  
970655304 https://github.com/simonw/datasette/issues/878#issuecomment-970655304 https://api.github.com/repos/simonw/datasette/issues/878 IC_kwDOBm6k_c452wZI simonw 9599 2021-11-16T20:32:16Z 2021-11-16T20:32:16Z OWNER

This code is really fiddly. I just got to this version:

```python
import asyncio
from functools import wraps
import inspect

try:
    import graphlib
except ImportError:
    from . import vendored_graphlib as graphlib


class AsyncMeta(type):
    def __new__(cls, name, bases, attrs):
        # Decorate any items that are 'async def' methods
        _registry = {}
        new_attrs = {"_registry": _registry}
        for key, value in attrs.items():
            if inspect.iscoroutinefunction(value) and not value.__name__ == "resolve":
                new_attrs[key] = make_method(value)
                _registry[key] = new_attrs[key]
            else:
                new_attrs[key] = value
        # Gather graph for later dependency resolution
        graph = {
            key: {
                p
                for p in inspect.signature(method).parameters.keys()
                if p != "self" and not p.startswith("_")
            }
            for key, method in _registry.items()
        }
        new_attrs["_graph"] = graph
        return super().__new__(cls, name, bases, new_attrs)


def make_method(method):
    @wraps(method)
    async def inner(self, _results=None, **kwargs):
        print("inner - _results=", _results)
        parameters = inspect.signature(method).parameters.keys()
        # Any parameters not provided by kwargs are resolved from registry
        to_resolve = [p for p in parameters if p not in kwargs and p != "self"]
        missing = [p for p in to_resolve if p not in self._registry]
        assert (
            not missing
        ), "The following DI parameters could not be found in the registry: {}".format(
            missing
        )
        results = {}
        results.update(kwargs)
        if to_resolve:
            resolved_parameters = await self.resolve(to_resolve, _results)
            results.update(resolved_parameters)
        return_value = await method(self, **results)
        if _results is not None:
            _results[method.__name__] = return_value
        return return_value

    return inner


class AsyncBase(metaclass=AsyncMeta):
    async def resolve(self, names, results=None):
        print("\n resolve: ", names)
        if results is None:
            results = {}

        # Resolve them in the correct order
        ts = graphlib.TopologicalSorter()
        for name in names:
            ts.add(name, *self._graph[name])
        ts.prepare()

        async def resolve_nodes(nodes):
            print("    resolve_nodes", nodes)
            print("    (current results = {})".format(repr(results)))
            awaitables = [
                self._registry[name](
                    self,
                    _results=results,
                    **{k: v for k, v in results.items() if k in self._graph[name]},
                )
                for name in nodes
                if name not in results
            ]
            print("    awaitables: ", awaitables)
            awaitable_results = await asyncio.gather(*awaitables)
            results.update(
                {p[0].__name__: p[1] for p in zip(awaitables, awaitable_results)}
            )

        if not ts.is_active():
            # Nothing has dependencies - just resolve directly
            print("    no dependencies, resolve directly")
            await resolve_nodes(names)
        else:
            # Resolve in topological order
            while ts.is_active():
                nodes = ts.get_ready()
                print("    ts.get_ready() returned nodes:", nodes)
                await resolve_nodes(nodes)
                for node in nodes:
                    ts.done(node)

        print("  End of resolve(), returning", results)
        return {key: value for key, value in results.items() if key in names}
```

With this test:

```python
class Complex(AsyncBase):
    def __init__(self):
        self.log = []

    async def c(self):
        print("LOG: c")
        self.log.append("c")

    async def b(self, c):
        print("LOG: b")
        self.log.append("b")

    async def a(self, b, c):
        print("LOG: a")
        self.log.append("a")

    async def go(self, a):
        print("LOG: go")
        self.log.append("go")
        return self.log


@pytest.mark.asyncio
async def test_complex():
    result = await Complex().go()
    # 'c' should only be called once
    assert result == ["c", "b", "a", "go"]
```

This test sometimes passes, and sometimes fails!

Output for a pass:

```
tests/test_asyncdi.py inner - _results= None

 resolve: ['a']
    ts.get_ready() returned nodes: ('c', 'b')
    resolve_nodes ('c', 'b')
    (current results = {})
    awaitables: [<coroutine object Complex.c at 0x1074ac890>, <coroutine object Complex.b at 0x1074ac820>]
inner - _results= {}
LOG: c
inner - _results= {'c': None}

 resolve: ['c']
    ts.get_ready() returned nodes: ('c',)
    resolve_nodes ('c',)
    (current results = {'c': None})
    awaitables: []
  End of resolve(), returning {'c': None}
LOG: b
    ts.get_ready() returned nodes: ('a',)
    resolve_nodes ('a',)
    (current results = {'c': None, 'b': None})
    awaitables: [<coroutine object Complex.a at 0x1074ac7b0>]
inner - _results= {'c': None, 'b': None}
LOG: a
  End of resolve(), returning {'c': None, 'b': None, 'a': None}
LOG: go
```

Output for a fail:

```
tests/test_asyncdi.py inner - _results= None

 resolve: ['a']
    ts.get_ready() returned nodes: ('b', 'c')
    resolve_nodes ('b', 'c')
    (current results = {})
    awaitables: [<coroutine object Complex.b at 0x10923c890>, <coroutine object Complex.c at 0x10923c820>]
inner - _results= {}

 resolve: ['c']
    ts.get_ready() returned nodes: ('c',)
    resolve_nodes ('c',)
    (current results = {})
    awaitables: [<coroutine object Complex.c at 0x10923c6d0>]
inner - _results= {}
LOG: c
inner - _results= {'c': None}
LOG: c
  End of resolve(), returning {'c': None}
LOG: b
    ts.get_ready() returned nodes: ('a',)
    resolve_nodes ('a',)
    (current results = {'c': None, 'b': None})
    awaitables: [<coroutine object Complex.a at 0x10923c6d0>]
inner - _results= {'c': None, 'b': None}
LOG: a
  End of resolve(), returning {'c': None, 'b': None, 'a': None}
LOG: go
F

=================================================================================================== FAILURES ===================================================================================================
_______________ test_complex _________________

    @pytest.mark.asyncio
    async def test_complex():
        result = await Complex().go()
        # 'c' should only be called once
>       assert result == ["c", "b", "a", "go"]
E       AssertionError: assert ['c', 'c', 'b', 'a', 'go'] == ['c', 'b', 'a', 'go']
E         At index 1 diff: 'c' != 'b'
E         Left contains one more item: 'go'
E         Use -v to get the full diff

tests/test_asyncdi.py:48: AssertionError
================== short test summary info ================================
FAILED tests/test_asyncdi.py::test_complex - AssertionError: assert ['c', 'c', 'b', 'a', 'go'] == ['c', 'b', 'a', 'go']
```

I figured out why this is happening.

a requires b and c

b also requires c

The code decides to run b and c in parallel.

If c completes first, then when b runs it gets to use the already-calculated result for c - so it doesn't need to call c again.

If b gets to that point before c has finished, it also needs to call c, so c runs twice.
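
One possible way to avoid the double call, sketched here under assumptions rather than taken from this issue: keep one in-flight task per dependency name, so whoever asks for c first starts it and everyone else awaits that same task. `OnceCache` and `demo` are hypothetical names introduced only for this illustration.

```python
import asyncio


class OnceCache:
    """Hypothetical helper: share a single in-flight task per dependency name."""

    def __init__(self):
        self._tasks = {}

    async def get(self, name, factory):
        # The first caller creates the task; later callers await the same one
        if name not in self._tasks:
            self._tasks[name] = asyncio.ensure_future(factory())
        return await self._tasks[name]


async def demo():
    log = []
    cache = OnceCache()

    async def c():
        await asyncio.sleep(0.01)  # stand-in for slow work
        log.append("c")
        return "c-result"

    async def b():
        # b depends on c, and asks the cache instead of calling c directly
        await cache.get("c", c)
        log.append("b")
        return "b-result"

    # b and c are scheduled together, as in the failing run above
    await asyncio.gather(cache.get("b", b), cache.get("c", c))
    return log


log = asyncio.run(demo())
```

Because b awaits the shared task before logging, c is recorded exactly once whichever of the two is scheduled first.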

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
New pattern for views that return either JSON or HTML, available for plugins 648435885  
970624197 https://github.com/simonw/datasette/issues/878#issuecomment-970624197 https://api.github.com/repos/simonw/datasette/issues/878 IC_kwDOBm6k_c452ozF simonw 9599 2021-11-16T19:49:05Z 2021-11-16T19:49:05Z OWNER

Here's the latest version of my weird dependency injection async class:

```python
import asyncio
import inspect
from functools import wraps
from graphlib import TopologicalSorter


class AsyncMeta(type):
    def __new__(cls, name, bases, attrs):
        # Decorate any items that are 'async def' methods
        _registry = {}
        new_attrs = {"_registry": _registry}
        for key, value in attrs.items():
            if inspect.iscoroutinefunction(value) and not value.__name__ == "resolve":
                new_attrs[key] = make_method(value)
                _registry[key] = new_attrs[key]
            else:
                new_attrs[key] = value

        # Topological sort of _registry by parameter dependencies
        graph = {
            key: {
                p for p in inspect.signature(method).parameters.keys()
                if p != "self" and not p.startswith("_")
            }
            for key, method in _registry.items()
        }
        new_attrs["_graph"] = graph
        return super().__new__(cls, name, bases, new_attrs)


def make_method(method):
    @wraps(method)
    async def inner(self, **kwargs):
        parameters = inspect.signature(method).parameters.keys()
        # Any parameters not provided by kwargs are resolved from registry
        to_resolve = [p for p in parameters if p not in kwargs and p != "self"]
        missing = [p for p in to_resolve if p not in self._registry]
        assert (
            not missing
        ), "The following DI parameters could not be found in the registry: {}".format(
            missing
        )
        results = {}
        results.update(kwargs)
        results.update(await self.resolve(to_resolve))
        return await method(self, **results)

    return inner


bad = [0]


class AsyncBase(metaclass=AsyncMeta):
    async def resolve(self, names):
        print(" resolve({})".format(names))
        results = {}
        # Resolve them in the correct order
        ts = TopologicalSorter()
        ts2 = TopologicalSorter()
        print(" names = ", names)
        print(" self._graph = ", self._graph)
        for name in names:
            if self._graph[name]:
                ts.add(name, *self._graph[name])
                ts2.add(name, *self._graph[name])
        print(" static_order =", tuple(ts2.static_order()))
        ts.prepare()
        while ts.is_active():
            print(" is_active, i = ", bad[0])
            bad[0] += 1
            if bad[0] > 20:
                print(" Infinite loop?")
                break
            nodes = ts.get_ready()
            print(" Do nodes:", nodes)
            # Assumption: the call arguments were garbled in this copy. Passing
            # each node's already-resolved dependencies matches the example output.
            awaitables = [
                self._registry[name](
                    self,
                    **{k: v for k, v in results.items() if k in self._graph[name]},
                )
                for name in nodes
            ]
            print(" awaitables: ", awaitables)
            awaitable_results = await asyncio.gather(*awaitables)
            results.update({
                p[0].__name__: p[1] for p in zip(awaitables, awaitable_results)
            })
            print(results)
            for node in nodes:
                ts.done(node)

        return results
```

Example usage:

```python
class Foo(AsyncBase):
    async def graa(self, boff):
        print("graa")
        return 5

    async def boff(self):
        print("boff")
        return 8

    async def other(self, boff, graa):
        print("other")
        return 5 + boff + graa


foo = Foo()
await foo.other()
```

Output:

```
resolve(['boff', 'graa'])
names = ['boff', 'graa']
self._graph = {'graa': {'boff'}, 'boff': set(), 'other': {'graa', 'boff'}}
static_order = ('boff', 'graa')
is_active, i = 0
Do nodes: ('boff',)
awaitables: [<coroutine object Foo.boff at 0x10bd81a40>]
resolve([])
names = []
self._graph = {'graa': {'boff'}, 'boff': set(), 'other': {'graa', 'boff'}}
static_order = ()
boff
{'boff': 8}
is_active, i = 1
Do nodes: ('graa',)
awaitables: [<coroutine object Foo.graa at 0x10d66b340>]
resolve([])
names = []
self._graph = {'graa': {'boff'}, 'boff': set(), 'other': {'graa', 'boff'}}
static_order = ()
graa
{'boff': 8, 'graa': 5}
other
18
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
New pattern for views that return either JSON or HTML, available for plugins 648435885  
951740637 https://github.com/simonw/datasette/issues/878#issuecomment-951740637 https://api.github.com/repos/simonw/datasette/issues/878 IC_kwDOBm6k_c44umjd 20after4 30934 2021-10-26T09:12:15Z 2021-10-26T09:12:15Z NONE

This sounds really ambitious but also really awesome. I like the idea that basically any piece of a page could be selectively replaced.

It sort of sounds like a python asyncio version of https://github.com/observablehq/runtime

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
New pattern for views that return either JSON or HTML, available for plugins 648435885  
803473015 https://github.com/simonw/datasette/issues/878#issuecomment-803473015 https://api.github.com/repos/simonw/datasette/issues/878 MDEyOklzc3VlQ29tbWVudDgwMzQ3MzAxNQ== simonw 9599 2021-03-20T22:33:05Z 2021-03-20T22:33:05Z OWNER

Things this mechanism needs to be able to support:

  • Returning a default JSON representation
  • Defining "extra" JSON representation blocks, which can be requested using ?_extra=
  • Returning rendered HTML, based on the default JSON + one or more extras + a template
  • Using Datasette output renderers to return e.g. CSV data
  • Potentially also supporting streaming output renderers for streaming CSV/TSV/JSON-nl etc
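
A rough sketch of how the first two requirements might fit together, under assumptions not stated in this thread: `build_payload` is a hypothetical helper, extras are modelled as zero-argument callables, and the comma-separated form of the `?_extra=` value is a guess rather than confirmed behavior.

```python
def build_payload(default, extras, requested):
    """Merge the default JSON with any extras named in the ?_extra= value.

    default:   dict holding the default JSON representation
    extras:    dict mapping extra name -> zero-argument callable (hypothetical)
    requested: raw ?_extra= value, assumed here to be comma-separated
    """
    payload = dict(default)
    for name in filter(None, requested.split(",")):
        # Extras are only computed when they are asked for
        if name in extras:
            payload[name] = extras[name]()
    return payload


payload = build_payload(
    default={"rows": [1, 2]},
    extras={"count": lambda: 2, "facets": lambda: {"state": 3}},
    requested="count",
)
```

An HTML renderer could call the same helper with the extras its template needs and pass the resulting dict as template context.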
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
New pattern for views that return either JSON or HTML, available for plugins 648435885  
803472595 https://github.com/simonw/datasette/issues/878#issuecomment-803472595 https://api.github.com/repos/simonw/datasette/issues/878 MDEyOklzc3VlQ29tbWVudDgwMzQ3MjU5NQ== simonw 9599 2021-03-20T22:28:12Z 2021-03-20T22:28:12Z OWNER

Another idea I had: a view is a class that takes the datasette instance in its constructor, and defines a __call__ method that accepts a request and returns a response. Except an async __call__ looks like it might be a bit messy; see the discussion in https://github.com/encode/starlette/issues/886
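
As a minimal sketch of that shape (not Datasette's actual API): `HelloView` is a hypothetical class, and plain dicts stand in for the real request and response objects.

```python
import asyncio


class HelloView:
    """Hypothetical view: configured once, then awaited per request."""

    def __init__(self, datasette):
        self.datasette = datasette

    async def __call__(self, request):
        # A real view would build a proper response object here
        return {"status": 200, "body": "Hello from {}".format(request["path"])}


view = HelloView(datasette=None)  # None stands in for a Datasette instance
response = asyncio.run(view({"path": "/-/hello"}))
```

Python accepts `async def __call__` directly, so `await view(request)` works like awaiting any other coroutine function.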

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
New pattern for views that return either JSON or HTML, available for plugins 648435885  
803472278 https://github.com/simonw/datasette/issues/878#issuecomment-803472278 https://api.github.com/repos/simonw/datasette/issues/878 MDEyOklzc3VlQ29tbWVudDgwMzQ3MjI3OA== simonw 9599 2021-03-20T22:25:04Z 2021-03-20T22:25:04Z OWNER

I came up with a slightly wild idea for this that would involve pytest-style dependency injection.

Prototype here: https://gist.github.com/simonw/496b24fdad44f6f8b7237fe394a0ced7

Copying from my private notes:

Use the lazy evaluated DI mechanism to break up table view into different pieces eg for faceting

Use that to solve JSON vs HTML views

Oh here's an idea: what if the various components of the table view were each defined as async functions.... and then executed using asyncio.gather in order to run the SQL queries in parallel? Then try benchmarking with different numbers of threads?

The async_call_with_arguments function could do this automatically for any awaitable dependencies

This would give me massively parallel dependency injection
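
To illustrate the asyncio.gather idea from these notes, here is a small sketch in which `asyncio.sleep` stands in for SQL queries; `count_rows`, `facets` and `table_view` are hypothetical names, and real queries running in threads may behave differently.

```python
import asyncio


async def count_rows():
    await asyncio.sleep(0.05)  # stand-in for a COUNT(*) query
    return 42


async def facets():
    await asyncio.sleep(0.05)  # stand-in for a facet query
    return {"state": 3}


async def table_view():
    # Both components run concurrently; results come back in argument order
    count, facet_results = await asyncio.gather(count_rows(), facets())
    return {"count": count, "facets": facet_results}


result = asyncio.run(table_view())
```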

(I could build an entire framework around this and call it c64)

Idea: arguments called eg "count" are executed and the result passed to the function. If called count_fn then a reference to the not-yet-called function is passed instead
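
A minimal sketch of that naming convention, assuming a plain dict as the registry; `call_with_di` is a hypothetical helper written for this illustration and is not the `async_call_with_arguments` function from the prototype gist.

```python
import asyncio
import inspect


async def call_with_di(fn, registry):
    """Call fn, filling each parameter from registry by name."""
    kwargs = {}
    for name in inspect.signature(fn).parameters:
        if name.endswith("_fn"):
            # "count_fn" receives the not-yet-called function itself
            kwargs[name] = registry[name[: -len("_fn")]]
        else:
            # "count" receives the result of calling the dependency
            kwargs[name] = await call_with_di(registry[name], registry)
    result = fn(**kwargs)
    if inspect.isawaitable(result):
        result = await result
    return result


async def count():
    return 3


async def summary(count, count_fn):
    # count is already a value; count_fn can be called again on demand
    return "count={}, again={}".format(count, await count_fn())


out = asyncio.run(call_with_di(summary, {"count": count}))
```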

I'm not going to completely combine the views mechanism and the render hooks. Instead, the core view will define a bunch of functions used to compose the page and the render hook will have conditional access to those functions - which will otherwise be asyncio.gather executed directly by the HTML page version

Using asyncio.gather to execute facets and suggest facets in parallel would be VERY interesting

suggest facets should be VERY cachable - doesn't matter if it's wrong unlike actual facets themselves

What if all Datasette views were defined in terms of dependency injection - and those dependency functions could themselves depend on others just like pytest fixtures. Everything would become composable and async stuff could execute in parallel

FURTHER IDEA: use this for the ?_extra= mechanism as well.

Any view in Datasette can be defined as a collection of named keys. Each of those keys maps to a function or an async function that accepts as input other named keys, using DI to handle them.

The HTML view is a defined function. So are the other outputs.

Default original inputs include “request” and “datasette”.

So… maybe a view is a class whose methods use DI. One of those methods is an .html() method used for the default page.

Output formats are a bit more complicated because they are supposed to be defined separately in plugins. They are unified across query, row and table though.

I’m going to try breaking up the TableView to see what happens.

{
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 1
}
New pattern for views that return either JSON or HTML, available for plugins 648435885  
803471917 https://github.com/simonw/datasette/issues/878#issuecomment-803471917 https://api.github.com/repos/simonw/datasette/issues/878 MDEyOklzc3VlQ29tbWVudDgwMzQ3MTkxNw== simonw 9599 2021-03-20T22:21:33Z 2021-03-20T22:21:33Z OWNER

This has been blocking things for too long.

If this becomes a documented pattern, things like adding a JSON output to https://github.com/dogsheep/dogsheep-beta become easier too.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
New pattern for views that return either JSON or HTML, available for plugins 648435885  
709503359 https://github.com/simonw/datasette/issues/878#issuecomment-709503359 https://api.github.com/repos/simonw/datasette/issues/878 MDEyOklzc3VlQ29tbWVudDcwOTUwMzM1OQ== simonw 9599 2020-10-15T18:15:28Z 2020-10-15T18:15:28Z OWNER

I think this is blocking #619

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
New pattern for views that return either JSON or HTML, available for plugins 648435885  
709502889 https://github.com/simonw/datasette/issues/878#issuecomment-709502889 https://api.github.com/repos/simonw/datasette/issues/878 MDEyOklzc3VlQ29tbWVudDcwOTUwMjg4OQ== simonw 9599 2020-10-15T18:14:34Z 2020-10-15T18:14:34Z OWNER

The BaseView class does this for Datasette internals at the moment, but I'm not convinced it works as well as it could.

I'd like to turn this into a class that is documented and available to plugins as well.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
New pattern for views that return either JSON or HTML, available for plugins 648435885  


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
, [performed_via_github_app] TEXT);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);
Powered by Datasette · Queries took 1.2ms · About: github-to-sqlite