issue_comments

11 rows where issue = 811367257 sorted by updated_at descending


user 1

  • simonw 11

issue 1

  • Race condition errors in new refresh_schemas() mechanism · 11 ✖

author_association 1

  • OWNER 11
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions issue performed_via_github_app
881677620 https://github.com/simonw/datasette/issues/1231#issuecomment-881677620 https://api.github.com/repos/simonw/datasette/issues/1231 IC_kwDOBm6k_c40jVU0 simonw 9599 2021-07-16T19:44:12Z 2021-07-16T19:44:12Z OWNER

That fixed the race condition in the datasette-graphql tests, which is the only place that I've been able to successfully replicate this. I'm going to land this change.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Race condition errors in new refresh_schemas() mechanism 811367257  
881674857 https://github.com/simonw/datasette/issues/1231#issuecomment-881674857 https://api.github.com/repos/simonw/datasette/issues/1231 IC_kwDOBm6k_c40jUpp simonw 9599 2021-07-16T19:38:39Z 2021-07-16T19:38:39Z OWNER

I can't replicate the race condition locally with or without this patch. I'm going to push the commit and then test the CI run from datasette-graphql that was failing against it.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Race condition errors in new refresh_schemas() mechanism 811367257  
881671706 https://github.com/simonw/datasette/issues/1231#issuecomment-881671706 https://api.github.com/repos/simonw/datasette/issues/1231 IC_kwDOBm6k_c40jT4a simonw 9599 2021-07-16T19:32:05Z 2021-07-16T19:32:05Z OWNER

The test suite passes with that change.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Race condition errors in new refresh_schemas() mechanism 811367257  
881668759 https://github.com/simonw/datasette/issues/1231#issuecomment-881668759 https://api.github.com/repos/simonw/datasette/issues/1231 IC_kwDOBm6k_c40jTKX simonw 9599 2021-07-16T19:27:46Z 2021-07-16T19:27:46Z OWNER

Second attempt at this:

```diff
diff --git a/datasette/app.py b/datasette/app.py
index 5976d8b..5f348cb 100644
--- a/datasette/app.py
+++ b/datasette/app.py
@@ -224,6 +224,7 @@ class Datasette:
         self.inspect_data = inspect_data
         self.immutables = set(immutables or [])
         self.databases = collections.OrderedDict()
+        self._refresh_schemas_lock = asyncio.Lock()
         self.crossdb = crossdb
         if memory or crossdb or not self.files:
             self.add_database(Database(self, is_memory=True), name="_memory")
@@ -332,6 +333,12 @@ class Datasette:
         self.client = DatasetteClient(self)
 
     async def refresh_schemas(self):
+        if self._refresh_schemas_lock.locked():
+            return
+        async with self._refresh_schemas_lock:
+            await self._refresh_schemas()
+
+    async def _refresh_schemas(self):
         internal_db = self.databases["_internal"]
         if not self.internal_db_created:
             await init_internal_db(internal_db)
```
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Race condition errors in new refresh_schemas() mechanism 811367257  
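
The locking pattern in the second-attempt diff can be tried in isolation. The following is a minimal sketch, not Datasette's code: `SchemaRefresher`, `refresh_count` and the `asyncio.sleep()` placeholder are hypothetical stand-ins, and only the `locked()` check plus `async with` come from the diff. It suggests that overlapping calls skip the refresh rather than queue behind it.

```python
import asyncio


class SchemaRefresher:
    """Hypothetical stand-in for the Datasette class; only the locking is modelled."""

    def __init__(self):
        self._refresh_schemas_lock = asyncio.Lock()
        self.refresh_count = 0  # how many times the guarded body actually ran

    async def refresh_schemas(self):
        # Skip, rather than wait, if another task is already refreshing.
        if self._refresh_schemas_lock.locked():
            return
        async with self._refresh_schemas_lock:
            await self._refresh_schemas()

    async def _refresh_schemas(self):
        await asyncio.sleep(0.01)  # placeholder for the real schema work
        self.refresh_count += 1


async def demo():
    refresher = SchemaRefresher()
    # Ten overlapping calls, loosely similar to ten concurrent requests.
    await asyncio.gather(*(refresher.refresh_schemas() for _ in range(10)))
    return refresher.refresh_count


refresh_count = asyncio.run(demo())
print(refresh_count)
```

A consequence of this design is that a caller arriving while a refresh is in progress returns immediately and may see schema data that is not yet up to date.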
881665383 https://github.com/simonw/datasette/issues/1231#issuecomment-881665383 https://api.github.com/repos/simonw/datasette/issues/1231 IC_kwDOBm6k_c40jSVn simonw 9599 2021-07-16T19:21:35Z 2021-07-16T19:21:35Z OWNER

https://stackoverflow.com/a/25799871/6083 has a good example of using asyncio.Lock():

```python
stuff_lock = asyncio.Lock()

async def get_stuff(url):
    async with stuff_lock:
        if url in cache:
            return cache[url]
        stuff = await aiohttp.request('GET', url)
        cache[url] = stuff
        return stuff
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Race condition errors in new refresh_schemas() mechanism 811367257  
881664408 https://github.com/simonw/datasette/issues/1231#issuecomment-881664408 https://api.github.com/repos/simonw/datasette/issues/1231 IC_kwDOBm6k_c40jSGY simonw 9599 2021-07-16T19:19:35Z 2021-07-16T19:19:35Z OWNER

The only place that calls refresh_schemas() is here: https://github.com/simonw/datasette/blob/dd5ee8e66882c94343cd3f71920878c6cfd0da41/datasette/views/base.py#L120-L124

Ideally only one call to refresh_schemas() would be running at any one time.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Race condition errors in new refresh_schemas() mechanism 811367257  
881663968 https://github.com/simonw/datasette/issues/1231#issuecomment-881663968 https://api.github.com/repos/simonw/datasette/issues/1231 IC_kwDOBm6k_c40jR_g simonw 9599 2021-07-16T19:18:42Z 2021-07-16T19:18:42Z OWNER

The race condition happens inside this method - initially with the call to await init_internal_db(): https://github.com/simonw/datasette/blob/dd5ee8e66882c94343cd3f71920878c6cfd0da41/datasette/app.py#L334-L359

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Race condition errors in new refresh_schemas() mechanism 811367257  
881204782 https://github.com/simonw/datasette/issues/1231#issuecomment-881204782 https://api.github.com/repos/simonw/datasette/issues/1231 IC_kwDOBm6k_c40hh4u simonw 9599 2021-07-16T06:14:12Z 2021-07-16T06:14:12Z OWNER

Here's the traceback I got from datasette-graphql (annoyingly only running the tests in GitHub Actions CI - I've not been able to replicate on my laptop yet):

```
tests/test_utils.py .                                                    [100%]

=================================== FAILURES ===================================
_________________________ test_graphql_examples[path0] _________________________

ds = <datasette.app.Datasette object at 0x7f6b8b6f8fd0>
path = PosixPath('/home/runner/work/datasette-graphql/datasette-graphql/examples/filters.md')

    @pytest.mark.asyncio
    @pytest.mark.parametrize(
        "path", (pathlib.Path(__file__).parent.parent / "examples").glob("*.md")
    )
    async def test_graphql_examples(ds, path):
        content = path.read_text()
        query = graphql_re.search(content)[1]
        try:
            variables = variables_re.search(content)[1]
        except TypeError:
            variables = "{}"
        expected = json.loads(json_re.search(content)[1])
        response = await ds.client.post(
            "/graphql",
            json={
                "query": query,
                "variables": json.loads(variables),
            },
        )
>       assert response.status_code == 200, response.json()
E       AssertionError: {'data': {'repos_arraycontains': None, 'users_contains': None, 'users_date': None, 'users_endswith': None, ...}, 'erro...", 'path': ['users_gt']}, {'locations': [{'column': 5, 'line': 34}], 'message': "'rows'", 'path': ['users_gte']}, ...]}
E       assert 500 == 200
E        +  where 500 = <Response [500 Internal Server Error]>.status_code

tests/test_graphql.py:142: AssertionError
----------------------------- Captured stderr call -----------------------------
table databases already exists
table databases already exists
table databases already exists
table databases already exists
table databases already exists
table databases already exists
table databases already exists
table databases already exists
table databases already exists
table databases already exists
table databases already exists
table databases already exists
table databases already exists
table databases already exists
table databases already exists
table databases already exists
table databases already exists
table databases already exists
table databases already exists
table databases already exists
table databases already exists
Traceback (most recent call last):
  File "/opt/hostedtoolcache/Python/3.7.11/x64/lib/python3.7/site-packages/datasette/app.py", line 1171, in route_path
    response = await view(request, send)
  File "/opt/hostedtoolcache/Python/3.7.11/x64/lib/python3.7/site-packages/datasette/views/base.py", line 151, in view
    request, **request.scope["url_route"]["kwargs"]
  File "/opt/hostedtoolcache/Python/3.7.11/x64/lib/python3.7/site-packages/datasette/views/base.py", line 123, in dispatch_request
    await self.ds.refresh_schemas()
  File "/opt/hostedtoolcache/Python/3.7.11/x64/lib/python3.7/site-packages/datasette/app.py", line 338, in refresh_schemas
    await init_internal_db(internal_db)
  File "/opt/hostedtoolcache/Python/3.7.11/x64/lib/python3.7/site-packages/datasette/utils/internal_db.py", line 16, in init_internal_db
    block=True,
  File "/opt/hostedtoolcache/Python/3.7.11/x64/lib/python3.7/site-packages/datasette/database.py", line 102, in execute_write
    return await self.execute_write_fn(_inner, block=block)
  File "/opt/hostedtoolcache/Python/3.7.11/x64/lib/python3.7/site-packages/datasette/database.py", line 118, in execute_write_fn
    raise result
  File "/opt/hostedtoolcache/Python/3.7.11/x64/lib/python3.7/site-packages/datasette/database.py", line 139, in _execute_writes
    result = task.fn(conn)
  File "/opt/hostedtoolcache/Python/3.7.11/x64/lib/python3.7/site-packages/datasette/database.py", line 100, in _inner
    return conn.execute(sql, params or [])
sqlite3.OperationalError: table databases already exists
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Race condition errors in new refresh_schemas() mechanism 811367257  
881204343 https://github.com/simonw/datasette/issues/1231#issuecomment-881204343 https://api.github.com/repos/simonw/datasette/issues/1231 IC_kwDOBm6k_c40hhx3 simonw 9599 2021-07-16T06:13:11Z 2021-07-16T06:13:11Z OWNER

This just broke the datasette-graphql test suite: https://github.com/simonw/datasette-graphql/issues/77 - I need to figure out a solution here.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Race condition errors in new refresh_schemas() mechanism 811367257  
781560989 https://github.com/simonw/datasette/issues/1231#issuecomment-781560989 https://api.github.com/repos/simonw/datasette/issues/1231 MDEyOklzc3VlQ29tbWVudDc4MTU2MDk4OQ== simonw 9599 2021-02-18T18:50:53Z 2021-02-18T18:50:53Z OWNER

Ideally I'd figure out a way to replicate this error in a concurrent unit test.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Race condition errors in new refresh_schemas() mechanism 811367257  
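
One possible shape for such a concurrent reproduction is sketched below. It is an assumption-laden illustration, not Datasette's test suite: `init_once()`, the module-level flag and the one-column `databases` table are hypothetical, and the `asyncio.sleep(0)` stands in for whatever `await` sits between the "already created?" check and the `CREATE TABLE`. Under those assumptions, several tasks pass the check before any of them sets the flag, and the later ones hit the same SQLite error.

```python
import asyncio
import sqlite3

conn = sqlite3.connect(":memory:")
internal_db_created = False  # hypothetical module-level flag for this sketch


async def init_once(errors):
    """Unguarded check-then-create, modelled loosely on the failing code path."""
    global internal_db_created
    if not internal_db_created:
        # Yielding here lets other tasks pass the same check before the flag is set.
        await asyncio.sleep(0)
        try:
            conn.execute("CREATE TABLE databases (name TEXT)")
        except sqlite3.OperationalError as ex:
            errors.append(str(ex))
        internal_db_created = True


async def demo():
    errors = []
    # Three overlapping calls: the first creates the table, the others collide.
    await asyncio.gather(*(init_once(errors) for _ in range(3)))
    return errors


errors = asyncio.run(demo())
print(errors)
```

A test built this way could assert that `errors` is empty once a guard such as a lock is added around the check and the create.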
781560865 https://github.com/simonw/datasette/issues/1231#issuecomment-781560865 https://api.github.com/repos/simonw/datasette/issues/1231 MDEyOklzc3VlQ29tbWVudDc4MTU2MDg2NQ== simonw 9599 2021-02-18T18:50:38Z 2021-02-18T18:50:38Z OWNER

I started trying to use locks to resolve this but I've not figured out the right way to do that yet - here's my first experiment:

```diff
diff --git a/datasette/app.py b/datasette/app.py
index 9e15a16..1681c9d 100644
--- a/datasette/app.py
+++ b/datasette/app.py
@@ -217,6 +217,7 @@ class Datasette:
         self.inspect_data = inspect_data
         self.immutables = set(immutables or [])
         self.databases = collections.OrderedDict()
+        self._refresh_schemas_lock = threading.Lock()
         if memory or not self.files:
             self.add_database(Database(self, is_memory=True), name="_memory")
         # memory_name is a random string so that each Datasette instance gets its own
@@ -324,6 +325,13 @@ class Datasette:
         self.client = DatasetteClient(self)
 
     async def refresh_schemas(self):
+        return
+        if self._refresh_schemas_lock.locked():
+            return
+        with self._refresh_schemas_lock:
+            await self._refresh_schemas()
+
+    async def _refresh_schemas(self):
         internal_db = self.databases["_internal"]
         if not self.internal_db_created:
             await init_internal_db(internal_db)
```
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Race condition errors in new refresh_schemas() mechanism 811367257  
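
The comment does not say why this first experiment fell short, so the following is only a general observation about the two lock types rather than a diagnosis. A `threading.Lock` held across an `await` stays locked while the coroutine is suspended, and a blocking `acquire()` from another coroutine on the same event loop stalls the loop, so the holder never gets to release. The sketch below uses hypothetical `holder()` and `waiter()` coroutines and a timeout purely to keep the demonstration finite.

```python
import asyncio
import threading


async def demo():
    lock = threading.Lock()
    results = {}

    async def holder():
        with lock:
            # Suspends while still holding the thread lock.
            await asyncio.sleep(0.05)

    async def waiter():
        # A blocking acquire stalls the whole event loop, so holder() cannot
        # resume and release; the timeout only keeps this demo from hanging.
        results["acquired"] = lock.acquire(timeout=0.2)
        if results["acquired"]:
            lock.release()

    await asyncio.gather(holder(), waiter())
    return results["acquired"]


acquired = asyncio.run(demo())
print(acquired)
```

An `asyncio.Lock` avoids this because waiting for it is itself an `await`, which hands control back to the event loop.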

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
, [performed_via_github_app] TEXT);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);
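
The filter and sort described at the top of this page can be approximated against that schema with Python's `sqlite3` module. This is a sketch under stated assumptions: the table is trimmed to four columns, three rows reuse ids and timestamps shown above, and the fourth row (id 1, issue 1) is made up to show the filter excluding it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Trimmed version of the schema above: only the columns this query touches.
conn.executescript(
    """
    CREATE TABLE [issue_comments] (
       [id] INTEGER PRIMARY KEY,
       [user] INTEGER,
       [updated_at] TEXT,
       [issue] INTEGER
    );
    CREATE INDEX [idx_issue_comments_issue]
        ON [issue_comments] ([issue]);
    """
)
conn.executemany(
    "INSERT INTO issue_comments (id, user, updated_at, issue) VALUES (?, ?, ?, ?)",
    [
        (781560865, 9599, "2021-02-18T18:50:38Z", 811367257),
        (881677620, 9599, "2021-07-16T19:44:12Z", 811367257),
        (881204343, 9599, "2021-07-16T06:13:11Z", 811367257),
        (1, 9599, "2021-01-01T00:00:00Z", 1),  # made-up row for another issue
    ],
)

# ISO 8601 timestamps in the same timezone sort correctly as plain text.
comment_ids = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM issue_comments WHERE issue = ? ORDER BY updated_at DESC",
        (811367257,),
    )
]
print(comment_ids)
```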
Powered by Datasette · Queries took 18.567ms · About: github-to-sqlite