issue_comments

7 rows where author_association = "OWNER", issue = 1058072543 and "updated_at" is on date 2021-12-22 sorted by updated_at descending

id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions issue performed_via_github_app
999870993 https://github.com/simonw/datasette/issues/1518#issuecomment-999870993 https://api.github.com/repos/simonw/datasette/issues/1518 IC_kwDOBm6k_c47mNIR simonw 9599 2021-12-22T20:47:18Z 2021-12-22T20:50:24Z OWNER

The reason they aren't showing up in the traces is that traces are stored just for the currently executing asyncio task ID: https://github.com/simonw/datasette/blob/ace86566b28280091b3844cf5fbecd20158e9004/datasette/tracer.py#L13-L25

This is so traces for other incoming requests don't end up mixed together. But there's no current mechanism to track async tasks that are effectively "child tasks" of the current request, and hence should be tracked the same.

https://stackoverflow.com/a/69349501/6083 suggests that you pass the task ID as an argument to the child tasks that are executed using asyncio.gather() to work around this kind of problem.
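A minimal sketch of that workaround, assuming a simplified trace registry keyed by task ID. The names `traces`, `record`, `child` and `handle_request` are hypothetical and are not Datasette's tracer API. It relies on the fact that `asyncio.gather()` wraps each coroutine in its own task, so a lookup by current task ID inside a child misses the parent's entry unless the parent's ID is passed in explicitly.

```python
import asyncio

# Hypothetical registry: task ID -> list of recorded events (not Datasette's tracer)
traces = {}


def record(event, task_id=None):
    # Default to the currently executing task when no ID is passed in
    if task_id is None:
        task_id = id(asyncio.current_task())
    # Events for task IDs that nobody registered are silently dropped
    if task_id in traces:
        traces[task_id].append(event)


async def child(name, parent_task_id=None):
    record(f"sql for {name}", task_id=parent_task_id)


async def handle_request(pass_parent_id):
    task_id = id(asyncio.current_task())
    traces[task_id] = []
    parent = task_id if pass_parent_id else None
    record("main query")
    # gather() wraps each coroutine in a new task with its own ID
    await asyncio.gather(child("facets", parent), child("suggestions", parent))
    return traces.pop(task_id)


# Without the parent ID only the main query is recorded;
# with it, the child events land in the same trace.
print(asyncio.run(handle_request(pass_parent_id=False)))
print(asyncio.run(handle_request(pass_parent_id=True)))
```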

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Complete refactor of TableView and table.html template 1058072543  
999870282 https://github.com/simonw/datasette/issues/1518#issuecomment-999870282 https://api.github.com/repos/simonw/datasette/issues/1518 IC_kwDOBm6k_c47mM9K simonw 9599 2021-12-22T20:45:56Z 2021-12-22T20:46:08Z OWNER

New short-term goal: get facets and suggested facets to execute in parallel with the main query. Generate a trace graph that proves that is happening using datasette-pretty-traces.

I wrote code to execute those in parallel using asyncio.gather() - which seems to work but causes the SQL run inside the parallel async def functions not to show up in the trace graph at all.

```diff
diff --git a/datasette/views/table.py b/datasette/views/table.py
index 9808fd2..ec9db64 100644
--- a/datasette/views/table.py
+++ b/datasette/views/table.py
@@ -1,3 +1,4 @@
+import asyncio
 import urllib
 import itertools
 import json
@@ -615,44 +616,37 @@ class TableView(RowTableShared):
         if request.args.get("_timelimit"):
             extra_args["custom_time_limit"] = int(request.args.get("_timelimit"))
 
-        # Execute the main query!
-        results = await db.execute(sql, params, truncate=True, **extra_args)
-
-        # Calculate the total count for this query
-        filtered_table_rows_count = None
-        if (
-            not db.is_mutable
-            and self.ds.inspect_data
-            and count_sql == f"select count(*) from {table} "
-        ):
-            # We can use a previously cached table row count
-            try:
-                filtered_table_rows_count = self.ds.inspect_data[database]["tables"][
-                    table
-                ]["count"]
-            except KeyError:
-                pass
-
-        # Otherwise run a select count(*) ...
-        if count_sql and filtered_table_rows_count is None and not nocount:
-            try:
-                count_rows = list(await db.execute(count_sql, from_sql_params))
-                filtered_table_rows_count = count_rows[0][0]
-            except QueryInterrupted:
-                pass
-
-        # Faceting
-        if not self.ds.setting("allow_facet") and any(
-            arg.startswith("_facet") for arg in request.args
-        ):
-            raise BadRequest("_facet= is not allowed")
+        async def execute_count():
+            # Calculate the total count for this query
+            filtered_table_rows_count = None
+            if (
+                not db.is_mutable
+                and self.ds.inspect_data
+                and count_sql == f"select count(*) from {table} "
+            ):
+                # We can use a previously cached table row count
+                try:
+                    filtered_table_rows_count = self.ds.inspect_data[database][
+                        "tables"
+                    ][table]["count"]
+                except KeyError:
+                    pass
+
+            if count_sql and filtered_table_rows_count is None and not nocount:
+                try:
+                    count_rows = list(await db.execute(count_sql, from_sql_params))
+                    filtered_table_rows_count = count_rows[0][0]
+                except QueryInterrupted:
+                    pass
+
+            return filtered_table_rows_count
+
+        filtered_table_rows_count = await execute_count()
 
         # pylint: disable=no-member
         facet_classes = list(
             itertools.chain.from_iterable(pm.hook.register_facet_classes())
         )
-        facet_results = {}
-        facets_timed_out = []
         facet_instances = []
         for klass in facet_classes:
             facet_instances.append(
@@ -668,33 +662,58 @@ class TableView(RowTableShared):
                 )
             )
 
-        if not nofacet:
-            for facet in facet_instances:
-                (
-                    instance_facet_results,
-                    instance_facets_timed_out,
-                ) = await facet.facet_results()
-                for facet_info in instance_facet_results:
-                    base_key = facet_info["name"]
-                    key = base_key
-                    i = 1
-                    while key in facet_results:
-                        i += 1
-                        key = f"{base_key}_{i}"
-                    facet_results[key] = facet_info
-                facets_timed_out.extend(instance_facets_timed_out)
-
-        # Calculate suggested facets
-        suggested_facets = []
-        if (
-            self.ds.setting("suggest_facets")
-            and self.ds.setting("allow_facet")
-            and not _next
-            and not nofacet
-            and not nosuggest
-        ):
-            for facet in facet_instances:
-                suggested_facets.extend(await facet.suggest())
+        async def execute_suggested_facets():
+            # Calculate suggested facets
+            suggested_facets = []
+            if (
+                self.ds.setting("suggest_facets")
+                and self.ds.setting("allow_facet")
+                and not _next
+                and not nofacet
+                and not nosuggest
+            ):
+                for facet in facet_instances:
+                    suggested_facets.extend(await facet.suggest())
+            return suggested_facets
+
+        async def execute_facets():
+            facet_results = {}
+            facets_timed_out = []
+            if not self.ds.setting("allow_facet") and any(
+                arg.startswith("_facet") for arg in request.args
+            ):
+                raise BadRequest("_facet= is not allowed")
+
+            if not nofacet:
+                for facet in facet_instances:
+                    (
+                        instance_facet_results,
+                        instance_facets_timed_out,
+                    ) = await facet.facet_results()
+                    for facet_info in instance_facet_results:
+                        base_key = facet_info["name"]
+                        key = base_key
+                        i = 1
+                        while key in facet_results:
+                            i += 1
+                            key = f"{base_key}_{i}"
+                        facet_results[key] = facet_info
+                    facets_timed_out.extend(instance_facets_timed_out)
+
+            return facet_results, facets_timed_out
+
+        # Execute the main query, facets and facet suggestions in parallel:
+        (
+            results,
+            suggested_facets,
+            (facet_results, facets_timed_out),
+        ) = await asyncio.gather(
+            db.execute(sql, params, truncate=True, **extra_args),
+            execute_suggested_facets(),
+            execute_facets(),
+        )
+
+        results = await db.execute(sql, params, truncate=True, **extra_args)
 
         # Figure out columns and rows for the query
         columns = [r[0] for r in results.description]
```

Here's the trace for `http://127.0.0.1:4422/fixtures/compound_three_primary_keys?_trace=1&_facet=pk1&_facet=pk2` with the missing facet and facet suggestion queries:

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Complete refactor of TableView and table.html template 1058072543  
999863269 https://github.com/simonw/datasette/issues/1518#issuecomment-999863269 https://api.github.com/repos/simonw/datasette/issues/1518 IC_kwDOBm6k_c47mLPl simonw 9599 2021-12-22T20:35:41Z 2021-12-22T20:37:13Z OWNER

It looks like the count has to be executed before facets can be, because the facet_class constructor needs that total count figure: https://github.com/simonw/datasette/blob/6b1384b2f529134998fb507e63307609a5b7f5c0/datasette/views/table.py#L660-L671

It's used in facet suggestion logic here: https://github.com/simonw/datasette/blob/ace86566b28280091b3844cf5fbecd20158e9004/datasette/facets.py#L172-L178
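A rough sketch of what that ordering constraint implies, assuming simplified stand-ins for the real steps. The three `execute_*` functions, `table_data` and the values they return are hypothetical. The count is awaited by itself first, and only the steps that consume it are handed to `asyncio.gather()`.

```python
import asyncio


async def execute_count():
    return 1001  # made-up row count


async def execute_facets(row_count):
    # Stand-in for facet classes that were constructed with the count
    return {"row_count": row_count}


async def execute_suggested_facets(row_count):
    return [] if row_count == 0 else ["hypothetical_column"]


async def table_data():
    # The count is awaited on its own first because both facet steps need it
    row_count = await execute_count()
    facets, suggested = await asyncio.gather(
        execute_facets(row_count),
        execute_suggested_facets(row_count),
    )
    return row_count, facets, suggested


print(asyncio.run(table_data()))
```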

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Complete refactor of TableView and table.html template 1058072543  
999850191 https://github.com/simonw/datasette/issues/1518#issuecomment-999850191 https://api.github.com/repos/simonw/datasette/issues/1518 IC_kwDOBm6k_c47mIDP simonw 9599 2021-12-22T20:29:38Z 2021-12-22T20:29:38Z OWNER

New short-term goal: get facets and suggested facets to execute in parallel with the main query. Generate a trace graph that proves that is happening using datasette-pretty-traces.
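One way to check for overlap without a trace graph, sketched under the assumption that each query can report its own start and end time. `fake_query`, `run_parallel` and `intervals_overlap` are hypothetical names, and `asyncio.sleep()` stands in for an awaited database call.

```python
import asyncio
import time


async def fake_query(name, seconds):
    # Stand-in for an awaited database call; returns (name, start, end)
    start = time.perf_counter()
    await asyncio.sleep(seconds)
    return name, start, time.perf_counter()


async def run_parallel():
    return await asyncio.gather(
        fake_query("main", 0.05),
        fake_query("facets", 0.05),
        fake_query("suggested", 0.05),
    )


def intervals_overlap(spans):
    # True when every query started before the earliest one finished
    latest_start = max(start for _, start, _ in spans)
    earliest_end = min(end for _, _, end in spans)
    return latest_start < earliest_end


spans = asyncio.run(run_parallel())
print(intervals_overlap(spans))
```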

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Complete refactor of TableView and table.html template 1058072543  
999837569 https://github.com/simonw/datasette/issues/1518#issuecomment-999837569 https://api.github.com/repos/simonw/datasette/issues/1518 IC_kwDOBm6k_c47mE-B simonw 9599 2021-12-22T20:15:45Z 2021-12-22T20:15:45Z OWNER

Also the whole special_args vs. request.args thing is pretty confusing. I think that might be an older code pattern from back when I was using Sanic.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Complete refactor of TableView and table.html template 1058072543  
999837220 https://github.com/simonw/datasette/issues/1518#issuecomment-999837220 https://api.github.com/repos/simonw/datasette/issues/1518 IC_kwDOBm6k_c47mE4k simonw 9599 2021-12-22T20:15:04Z 2021-12-22T20:15:04Z OWNER

I think I can move this much higher up in the method, it's a bit confusing having it half way through: https://github.com/simonw/datasette/blob/6b1384b2f529134998fb507e63307609a5b7f5c0/datasette/views/table.py#L414-L436

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Complete refactor of TableView and table.html template 1058072543  
999831967 https://github.com/simonw/datasette/issues/1518#issuecomment-999831967 https://api.github.com/repos/simonw/datasette/issues/1518 IC_kwDOBm6k_c47mDmf simonw 9599 2021-12-22T20:04:47Z 2021-12-22T20:10:11Z OWNER

I think I might be able to clean up a lot of the stuff in here using the render_cell plugin hook: https://github.com/simonw/datasette/blob/6b1384b2f529134998fb507e63307609a5b7f5c0/datasette/views/table.py#L87-L89

The catch with that hook - https://docs.datasette.io/en/stable/plugin_hooks.html#render-cell-value-column-table-database-datasette - is that it gets called for every single cell. I don't want the overhead of looking up the foreign key relationships etc once for every value in a specific column.

But maybe I could extend the hook to include a shared cache that gets used for all of the cells in a specific table? Something like this:

```python
render_cell(value, column, table, database, datasette, cache)
```

`cache` is a dictionary - and the same dictionary is passed to every call to that hook while rendering a specific page.
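To make the idea concrete, here is a small sketch of how a shared dictionary could save repeated lookups. It is plain Python rather than a real plugin: the extended `render_cell` signature is only the proposal from this comment, and `load_foreign_key_labels`, `render_page` and `lookups` are hypothetical helpers.

```python
lookups = []  # records each expensive lookup so the saving is visible


def load_foreign_key_labels(database, table, column):
    # Stand-in for an expensive per-column lookup
    lookups.append((database, table, column))
    return {1: "one", 2: "two"}


def render_cell(value, column, table, database, datasette, cache):
    # Proposed signature only; this is not Datasette's current hook
    key = (database, table, column)
    if key not in cache:
        cache[key] = load_foreign_key_labels(database, table, column)
    return cache[key].get(value)


def render_page(rows, column, table, database):
    cache = {}  # one dictionary shared by every cell on this page
    return [
        render_cell(row[column], column, table, database, None, cache)
        for row in rows
    ]
```

Rendering three rows this way calls `load_foreign_key_labels` once instead of three times.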

It's a bit of a gross hack though, and would it ever be useful for plugins outside of the default plugin in Datasette which does the foreign key stuff?

If I can think of one other potential application for this cache then I might implement it.

No, this optimization doesn't make sense: the most complex cell enrichment logic is the stuff that does a select * from categories where id in (2, 5, 6) query, using just the distinct set of IDs that are rendered on the current page. That's not going to fit in the render_cell hook no matter how hard I try to warp it into the right shape, because it needs full visibility of all of the results that are being rendered in order to collect those unique ID values.
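A sketch of that batch lookup using the standard library `sqlite3` module, to show why it needs every row on the page at once. The `fetch_labels` helper, the `name` column and the sample data are assumptions for illustration, not Datasette's implementation.

```python
import sqlite3


def fetch_labels(conn, rows, fk_column):
    # Collect the distinct, non-null IDs that appear on this page of results
    ids = sorted({row[fk_column] for row in rows if row[fk_column] is not None})
    if not ids:
        return {}
    placeholders = ", ".join("?" for _ in ids)
    # One query for the whole page, e.g. ... where id in (2, 5, 6)
    sql = f"select id, name from categories where id in ({placeholders})"
    return dict(conn.execute(sql, ids).fetchall())


conn = sqlite3.connect(":memory:")
conn.execute("create table categories (id integer primary key, name text)")
conn.executemany(
    "insert into categories values (?, ?)",
    [(2, "Books"), (5, "Music"), (6, "Film"), (9, "Games")],
)
page = [
    {"category": 2},
    {"category": 5},
    {"category": 2},
    {"category": 6},
    {"category": None},
]
print(fetch_labels(conn, page, "category"))
```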

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Complete refactor of TableView and table.html template 1058072543  

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
, [performed_via_github_app] TEXT);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);