github
html_url | issue_url | id | node_id | user | created_at | updated_at | author_association | body | reactions | issue | performed_via_github_app |
---|---|---|---|---|---|---|---|---|---|---|---|
https://github.com/simonw/datasette/issues/878#issuecomment-970712713 | https://api.github.com/repos/simonw/datasette/issues/878 | 970712713 | IC_kwDOBm6k_c452-aJ | 9599 | 2021-11-16T21:54:33Z | 2021-11-16T21:54:33Z | OWNER | I'm going to continue working on this in a PR. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
648435885 | |
https://github.com/simonw/datasette/issues/878#issuecomment-970705738 | https://api.github.com/repos/simonw/datasette/issues/878 | 970705738 | IC_kwDOBm6k_c4528tK | 9599 | 2021-11-16T21:44:31Z | 2021-11-16T21:44:31Z | OWNER | Wrote a TIL about what I learned using `TopologicalSorter`: https://til.simonwillison.net/python/graphlib-topologicalsorter | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
648435885 | |
https://github.com/simonw/datasette/issues/878#issuecomment-970673085 | https://api.github.com/repos/simonw/datasette/issues/878 | 970673085 | IC_kwDOBm6k_c4520u9 | 9599 | 2021-11-16T20:58:24Z | 2021-11-16T20:58:24Z | OWNER | New test: ```python class Complex(AsyncBase): def __init__(self): self.log = [] async def d(self): await asyncio.sleep(random() * 0.1) print("LOG: d") self.log.append("d") async def c(self): await asyncio.sleep(random() * 0.1) print("LOG: c") self.log.append("c") async def b(self, c, d): print("LOG: b") self.log.append("b") async def a(self, b, c): print("LOG: a") self.log.append("a") async def go(self, a): print("LOG: go") self.log.append("go") return self.log @pytest.mark.asyncio async def test_complex(): result = await Complex().go() # 'c' should only be called once assert tuple(result) in ( # c and d could happen in either order ("c", "d", "b", "a", "go"), ("d", "c", "b", "a", "go"), ) ``` And this code passes it: ```python import asyncio from functools import wraps import inspect try: import graphlib except ImportError: from . import vendored_graphlib as graphlib class AsyncMeta(type): def __new__(cls, name, bases, attrs): # Decorate any items that are 'async def' methods _registry = {} new_attrs = {"_registry": _registry} for key, value in attrs.items(): if inspect.iscoroutinefunction(value) and not value.__name__ == "resolve": new_attrs[key] = make_method(value) _registry[key] = new_attrs[key] else: new_attrs[key] = value # Gather graph for later dependency resolution graph = { key: { p for p in inspect.signature(method).parameters.keys() if p != "self" and not p.startswith("_") } for key, method in _registry.items() } new_attrs["_graph"] = graph return super().__new__(cls, name, bases, new_attrs) def make_met… | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
648435885 | |
https://github.com/simonw/datasette/issues/878#issuecomment-970660299 | https://api.github.com/repos/simonw/datasette/issues/878 | 970660299 | IC_kwDOBm6k_c452xnL | 9599 | 2021-11-16T20:39:43Z | 2021-11-16T20:42:27Z | OWNER | But that does seem to be the plan that `TopographicalSorter` provides: ```python graph = {"go": {"a"}, "a": {"b", "c"}, "b": {"c", "d"}} ts = TopologicalSorter(graph) ts.prepare() while ts.is_active(): nodes = ts.get_ready() print(nodes) ts.done(*nodes) ``` Outputs: ``` ('c', 'd') ('b',) ('a',) ('go',) ``` Also: ```python graph = {"go": {"d", "e", "f"}, "d": {"b", "c"}, "b": {"c"}} ts = TopologicalSorter(graph) ts.prepare() while ts.is_active(): nodes = ts.get_ready() print(nodes) ts.done(*nodes) ``` Gives: ``` ('e', 'f', 'c') ('b',) ('d',) ('go',) ``` I'm confident that `TopologicalSorter` is the way to do this. I think I need to rewrite my code to call it once to get that plan, then `await asyncio.gather(*nodes)` in turn to execute it. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
648435885 | |
https://github.com/simonw/datasette/issues/878#issuecomment-970657874 | https://api.github.com/repos/simonw/datasette/issues/878 | 970657874 | IC_kwDOBm6k_c452xBS | 9599 | 2021-11-16T20:36:01Z | 2021-11-16T20:36:01Z | OWNER | My goal here is to calculate the most efficient way to resolve the different nodes, running them in parallel where possible. So for this class: ```python class Complex(AsyncBase): async def d(self): pass async def c(self): pass async def b(self, c, d): pass async def a(self, b, c): pass async def go(self, a): pass ``` A call to `go()` should do this: - `c` and `d` in parallel - `b` - `a` - `go` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
648435885 | |
https://github.com/simonw/datasette/issues/878#issuecomment-970655927 | https://api.github.com/repos/simonw/datasette/issues/878 | 970655927 | IC_kwDOBm6k_c452wi3 | 9599 | 2021-11-16T20:33:11Z | 2021-11-16T20:33:11Z | OWNER | What should be happening here instead is it should resolve the full graph and notice that `c` is depended on by both `b` and `a` - so it should run `c` first, then run the next ones in parallel. So maybe the algorithm I'm inheriting from https://docs.python.org/3/library/graphlib.html isn't the correct algorithm? | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
648435885 | |
https://github.com/simonw/datasette/issues/878#issuecomment-970655304 | https://api.github.com/repos/simonw/datasette/issues/878 | 970655304 | IC_kwDOBm6k_c452wZI | 9599 | 2021-11-16T20:32:16Z | 2021-11-16T20:32:16Z | OWNER | This code is really fiddly. I just got to this version: ```python import asyncio from functools import wraps import inspect try: import graphlib except ImportError: from . import vendored_graphlib as graphlib class AsyncMeta(type): def __new__(cls, name, bases, attrs): # Decorate any items that are 'async def' methods _registry = {} new_attrs = {"_registry": _registry} for key, value in attrs.items(): if inspect.iscoroutinefunction(value) and not value.__name__ == "resolve": new_attrs[key] = make_method(value) _registry[key] = new_attrs[key] else: new_attrs[key] = value # Gather graph for later dependency resolution graph = { key: { p for p in inspect.signature(method).parameters.keys() if p != "self" and not p.startswith("_") } for key, method in _registry.items() } new_attrs["_graph"] = graph return super().__new__(cls, name, bases, new_attrs) def make_method(method): @wraps(method) async def inner(self, _results=None, **kwargs): print("inner - _results=", _results) parameters = inspect.signature(method).parameters.keys() # Any parameters not provided by kwargs are resolved from registry to_resolve = [p for p in parameters if p not in kwargs and p != "self"] missing = [p for p in to_resolve if p not in self._registry] assert ( not missing ), "The following DI parameters could not be found in the registry: {}".format( missing ) results = {} results.update(kwargs) if to_resolve: resolved_parameters = await self.resolve(to_resolve, _results) results.update(resolved_parameters) return_value = await method(self, **results) if _results is not None: _res… | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
648435885 | |
https://github.com/simonw/datasette/issues/878#issuecomment-970624197 | https://api.github.com/repos/simonw/datasette/issues/878 | 970624197 | IC_kwDOBm6k_c452ozF | 9599 | 2021-11-16T19:49:05Z | 2021-11-16T19:49:05Z | OWNER | Here's the latest version of my weird dependency injection async class: ```python import inspect class AsyncMeta(type): def __new__(cls, name, bases, attrs): # Decorate any items that are 'async def' methods _registry = {} new_attrs = {"_registry": _registry} for key, value in attrs.items(): if inspect.iscoroutinefunction(value) and not value.__name__ == "resolve": new_attrs[key] = make_method(value) _registry[key] = new_attrs[key] else: new_attrs[key] = value # Topological sort of _registry by parameter dependencies graph = { key: { p for p in inspect.signature(method).parameters.keys() if p != "self" and not p.startswith("_") } for key, method in _registry.items() } new_attrs["_graph"] = graph return super().__new__(cls, name, bases, new_attrs) def make_method(method): @wraps(method) async def inner(self, **kwargs): parameters = inspect.signature(method).parameters.keys() # Any parameters not provided by kwargs are resolved from registry to_resolve = [p for p in parameters if p not in kwargs and p != "self"] missing = [p for p in to_resolve if p not in self._registry] assert ( not missing ), "The following DI parameters could not be found in the registry: {}".format( missing ) results = {} results.update(kwargs) results.update(await self.resolve(to_resolve)) return await method(self, **results) return inner bad = [0] class AsyncBase(metaclass=AsyncMeta): async def resolve(self, names): print(" resolve({})".format(names)) results = {} # Resolve them in the correct order ts = TopologicalSorter() ts2 = TopologicalSorter() print(" names = ", names) print(" s… | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
648435885 |