issue_comments


37 rows where "updated_at" is on date 2022-12-15 sorted by updated_at descending


issue 8

  • invoke_startup() is not run in some conditions, e.g. gunicorn/uvicorn workers, breaking lots of things 15
  • Refactor test suite to use mostly `async def` tests 7
  • Port as many tests as possible to async def tests against ds_client 5
  • Intermittent "Too many open files" error running tests 4
  • `.json` errors should be returned as JSON 2
  • Release notes for Datasette 1.0a2 2
  • JSON link on row page is 404 if base_url setting is used 1
  • datasette --root running in Docker doesn't reliably show the magic URL 1

user 2

  • simonw 36
  • noslouch 1

author_association 2

  • OWNER 36
  • NONE 1
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions issue performed_via_github_app
1353812913 https://github.com/simonw/datasette/pull/1960#issuecomment-1353812913 https://api.github.com/repos/simonw/datasette/issues/1960 IC_kwDOBm6k_c5QsYux simonw 9599 2022-12-15T22:48:54Z 2022-12-15T22:48:54Z OWNER

This is all very broken:

```
% pytest -x --pdb
=============================== test session starts ===============================
platform darwin -- Python 3.10.3, pytest-7.1.3, pluggy-1.0.0
SQLite: 3.39.4
rootdir: /Users/simon/Dropbox/Development/datasette, configfile: pytest.ini
plugins: anyio-3.6.1, xdist-2.5.0, forked-1.4.0, asyncio-0.19.0, timeout-2.1.0, profiling-1.7.0
asyncio: mode=strict
collected 1295 items

tests/test_package.py ..                                                     [  0%]
tests/test_cli.py .                                                          [  0%]
tests/test_cli_serve_get.py ..                                               [  0%]
tests/test_cli.py .                                                          [  0%]
tests/test_black.py .                                                        [  0%]
tests/test_api.py E

traceback >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

fixturedef = <FixtureDef argname='event_loop' scope='session' baseid='tests'>, request = <SubRequest 'event_loop' for <Function test_homepage>>

@pytest.hookimpl(hookwrapper=True)
def pytest_fixture_setup(
    fixturedef: FixtureDef, request: SubRequest
) -> Optional[object]:
    """Adjust the event loop policy when an event loop is produced."""
    if fixturedef.argname == "event_loop":
        outcome = yield
      loop = outcome.get_result()

/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.10/site-packages/pytest_asyncio/plugin.py:377:


/Users/simon/Dropbox/Development/datasette/tests/conftest.py:30: in event_loop return asyncio.get_event_loop()


self = <asyncio.unix_events._UnixDefaultEventLoopPolicy object at 0x111e34670>

def get_event_loop(self):
    """Get the event loop for the current context.

    Returns an instance of EventLoop or raises an exception.
    """
    if (self._local._loop is None and
            not self._local._set_called and
            threading.current_thread() is threading.main_thread()):
        self.set_event_loop(self.new_event_loop())

    if self._local._loop is None:
      raise RuntimeError('There is no current event loop in thread %r.'
                           % threading.current_thread().name)

E RuntimeError: There is no current event loop in thread 'MainThread'.

/Users/simon/.pyenv/versions/3.10.3/lib/python3.10/asyncio/events.py:656: RuntimeError

entering PDB >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

PDB post_mortem (IO-capturing turned off) >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> /Users/simon/.pyenv/versions/3.10.3/lib/python3.10/asyncio/events.py(656)get_event_loop()
-> raise RuntimeError('There is no current event loop in thread %r.'
(Pdb) q

==================================== warnings summary ====================================
tests/test_cli.py::test_inspect_cli_writes_to_file
tests/test_cli.py::test_inspect_cli
  /Users/simon/Dropbox/Development/datasette/datasette/cli.py:163: DeprecationWarning: There is no current event loop
    loop = asyncio.get_event_loop()

tests/test_cli_serve_get.py::test_serve_with_get_exit_code_for_error
tests/test_cli_serve_get.py::test_serve_with_get
  /Users/simon/Dropbox/Development/datasette/datasette/cli.py:596: DeprecationWarning: There is no current event loop
    asyncio.get_event_loop().run_until_complete(ds.invoke_startup())

tests/test_cli_serve_get.py::test_serve_with_get_exit_code_for_error
tests/test_cli_serve_get.py::test_serve_with_get
  /Users/simon/Dropbox/Development/datasette/datasette/cli.py:599: DeprecationWarning: There is no current event loop
    asyncio.get_event_loop().run_until_complete(check_databases(ds))

tests/test_api.py::test_homepage
  /Users/simon/Dropbox/Development/datasette/tests/conftest.py:30: DeprecationWarning: There is no current event loop
    return asyncio.get_event_loop()

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
================================= short test summary info =================================
ERROR tests/test_api.py::test_homepage - RuntimeError: There is no current event loop in thread 'MainThread'.
!!!!!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!! _pytest.outcomes.Exit: Quitting debugger !!!!!!!!!!!!!!!!!!!!!!!
====================== 7 passed, 7 warnings, 1 error in 19.15s ======================
(datasette) datasette %
```

The problem looks to be caused by this:

https://github.com/simonw/datasette/blob/87737aa1ace82fa7b54c60c41471ec9a661f5299/tests/conftest.py#L28-L30

Which I found necessary in order to have `async def` fixtures that could be shared at `scope="session"`.

Can I work around this, or is scope="session" for async fixtures incompatible with my test suite for some reason?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Port as many tests as possible to async def tests against ds_client 1499150951  
1353805839 https://github.com/simonw/datasette/pull/1960#issuecomment-1353805839 https://api.github.com/repos/simonw/datasette/issues/1960 IC_kwDOBm6k_c5QsXAP simonw 9599 2022-12-15T22:38:37Z 2022-12-15T22:38:37Z OWNER

I'm going to make `.status_code` work on the `TestClient` response too, so I don't have to worry about whether to use `status` or `status_code` depending on which kind of response object I'm dealing with.
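Not shown in the thread, but the change can be as small as an alias property on the test response wrapper; a minimal sketch (the class here is illustrative, not the actual `datasette.utils.testing` code):

```python
class TestResponse:
    """Simplified stand-in for the response object returned by TestClient."""

    def __init__(self, status, headers, body):
        self.status = status  # the attribute older tests already use
        self.headers = headers
        self.body = body

    @property
    def status_code(self):
        # Alias so tests can use the same name as httpx.Response.status_code
        return self.status
```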

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Port as many tests as possible to async def tests against ds_client 1499150951  
1353765125 https://github.com/simonw/datasette/pull/1960#issuecomment-1353765125 https://api.github.com/repos/simonw/datasette/issues/1960 IC_kwDOBm6k_c5QsNEF simonw 9599 2022-12-15T22:00:04Z 2022-12-15T22:00:04Z OWNER

I'm going to punt on that for the moment and continue to use app_client for tests that use that mechanism.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Port as many tests as possible to async def tests against ds_client 1499150951  
1353763837 https://github.com/simonw/datasette/pull/1960#issuecomment-1353763837 https://api.github.com/repos/simonw/datasette/issues/1960 IC_kwDOBm6k_c5QsMv9 simonw 9599 2022-12-15T21:59:05Z 2022-12-15T21:59:05Z OWNER

Here's an annoying error: ```

  response4 = await ds_client.post(
        "/-/logout",
        csrftoken_from=True,
        cookies={"ds_actor": ds_client.actor_cookie({"id": "test"})},
    )

tests/test_auth.py:88:


self = <datasette.app.DatasetteClient object at 0x1033cc4f0>, path = '/-/logout' kwargs = {'cookies': {'ds_actor': 'eyJhIjp7ImlkIjoidGVzdCJ9fQ.fuFCTJG5XE-RNnUM7dcnXx9sPvE'}, 'csrftoken_from': True}, client = <httpx.AsyncClient object at 0x10347a9b0>

async def post(self, path, **kwargs):
    await self.ds.invoke_startup()
    async with httpx.AsyncClient(app=self.app) as client:
      return await client.post(self._fix(path), **kwargs)

E TypeError: AsyncClient.post() got an unexpected keyword argument 'csrftoken_from'
```

I need an alternative to the `csrftoken_from` mechanism I built for `TestClient`:

https://github.com/simonw/datasette/blob/0b68996cc511b3a801f0cd0157bd66332d75f46f/datasette/utils/testing.py#L77-L103
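One possible shape for that alternative, sketched here as a hypothetical helper (the cookie and form-field names are assumptions based on how the existing mechanism works, not code from this thread): fetch the page once so the CSRF cookie gets set, then send the token back on the POST.

```python
import httpx


async def post_with_csrftoken(client: httpx.AsyncClient, path, data=None, cookies=None):
    # GET first so the CSRF middleware sets its token cookie
    cookies = dict(cookies or {})
    get_response = await client.get(path, cookies=cookies)
    token = get_response.cookies.get("ds_csrftoken")
    if token:
        # Echo the token back as both a cookie and a form field
        cookies["ds_csrftoken"] = token
        data = dict(data or {}, csrftoken=token)
    return await client.post(path, data=data, cookies=cookies)
```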

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Port as many tests as possible to async def tests against ds_client 1499150951  
1353749401 https://github.com/simonw/datasette/pull/1960#issuecomment-1353749401 https://api.github.com/repos/simonw/datasette/issues/1960 IC_kwDOBm6k_c5QsJOZ simonw 9599 2022-12-15T21:47:27Z 2022-12-15T21:47:27Z OWNER

I'm using this new mark:

```python
@pytest.mark.ds_client
```

Purely so I can run all of the tests that I've refactored using:

```
pytest -m ds_client
```

I'll likely remove this once the test refactoring project is complete.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Port as many tests as possible to async def tests against ds_client 1499150951  
1353747370 https://github.com/simonw/datasette/issues/1959#issuecomment-1353747370 https://api.github.com/repos/simonw/datasette/issues/1959 IC_kwDOBm6k_c5QsIuq simonw 9599 2022-12-15T21:45:14Z 2022-12-15T21:45:14Z OWNER

I'm going to do this in a PR.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Refactor test suite to use mostly `async def` tests 1499081664  
1353738075 https://github.com/simonw/datasette/issues/1959#issuecomment-1353738075 https://api.github.com/repos/simonw/datasette/issues/1959 IC_kwDOBm6k_c5QsGdb simonw 9599 2022-12-15T21:35:56Z 2022-12-15T21:35:56Z OWNER

I built that `OldResponse` class:

```diff
diff --git a/tests/utils.py b/tests/utils.py
index 191ead9b..f39ac434 100644
--- a/tests/utils.py
+++ b/tests/utils.py
@@ -30,3 +30,25 @@ def inner_html(soup):
 def has_load_extension():
     conn = sqlite3.connect(":memory:")
     return hasattr(conn, "enable_load_extension")
+
+
+class OldResponse:
+    "Transform an HTTPX response to simulate the older TestClient responses"
+    # https://github.com/simonw/datasette/issues/1959#issuecomment-1353721091
+    def __init__(self, response):
+        self.response = response
+        self._json = None
+
+    @property
+    def headers(self):
+        return self.response.headers
+
+    @property
+    def status(self):
+        return self.response.status_code
+
+    @property
+    def json(self):
+        if self._json is None:
+            self._json = self.response.json()
+        return self._json
```

I can use it in tests like this:

```python
@pytest.mark.asyncio
async def test_homepage(ds_client):
    response = OldResponse(await ds_client.get("/.json"))
    assert response.status == 200
    assert "application/json; charset=utf-8" == response.headers["content-type"]
    assert response.json.keys() == {"fixtures": 0}.keys()
    d = response.json["fixtures"]
    assert d["name"] == "fixtures"
    assert d["tables_count"] == 24
    assert len(d["tables_and_views_truncated"]) == 5
    assert d["tables_and_views_more"] is True
    # 4 hidden FTS tables + no_primary_key (hidden in metadata)
    assert d["hidden_tables_count"] == 6
    # 201 in no_primary_key, plus 6 in other hidden tables:
    assert d["hidden_table_rows_sum"] == 207, response.json
    assert d["views_count"] == 4
```

But as I work through the tests I'm finding it's actually not too hard to port them over, so I likely won't use it after all.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Refactor test suite to use mostly `async def` tests 1499081664  
1353728682 https://github.com/simonw/datasette/issues/1959#issuecomment-1353728682 https://api.github.com/repos/simonw/datasette/issues/1959 IC_kwDOBm6k_c5QsEKq simonw 9599 2022-12-15T21:28:35Z 2022-12-15T21:28:35Z OWNER

Got this error trying to have two tests use the same ds_client async fixture when I added scope="session" to that fixture:

  • https://github.com/tortoise/tortoise-orm/issues/638

Adding this to conftest.py (as suggested in that issue thread) seemed to fix it:

```python
@pytest.fixture(scope="session")
def event_loop():
    return asyncio.get_event_loop()
```
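For reference, the pattern more commonly recommended for a session-scoped loop creates and closes a dedicated loop rather than calling asyncio.get_event_loop(); a sketch (not what this thread settled on):

```python
import asyncio

import pytest


@pytest.fixture(scope="session")
def event_loop():
    # A fresh loop avoids the "There is no current event loop" RuntimeError
    # and the DeprecationWarning from asyncio.get_event_loop()
    loop = asyncio.new_event_loop()
    yield loop
    loop.close()
```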

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Refactor test suite to use mostly `async def` tests 1499081664  
1353721442 https://github.com/simonw/datasette/issues/1619#issuecomment-1353721442 https://api.github.com/repos/simonw/datasette/issues/1619 IC_kwDOBm6k_c5QsCZi noslouch 2090382 2022-12-15T21:20:53Z 2022-12-15T21:20:53Z NONE

I'm also getting bit by this. I'm trying to set up an nginx reverse proxy in front of multiple datasette backends.

When I run it locally or behind the proxy, I see the base_url value added a second time to the path for various action links on table pages (view as JSON, sort by column, etc).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
JSON link on row page is 404 if base_url setting is used 1121583414  
1353721091 https://github.com/simonw/datasette/issues/1959#issuecomment-1353721091 https://api.github.com/repos/simonw/datasette/issues/1959 IC_kwDOBm6k_c5QsCUD simonw 9599 2022-12-15T21:20:32Z 2022-12-15T21:20:32Z OWNER

Rather than tediously rewriting every single test to the new shape I'm going to try a wrapper for that HTTPX response that transforms it into an imitation of the one returned by the existing TestClient class.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Refactor test suite to use mostly `async def` tests 1499081664  
1353720559 https://github.com/simonw/datasette/issues/1959#issuecomment-1353720559 https://api.github.com/repos/simonw/datasette/issues/1959 IC_kwDOBm6k_c5QsCLv simonw 9599 2022-12-15T21:19:56Z 2022-12-15T21:19:56Z OWNER

Here's a port of the first `def ...(app_client)` test. Note that the `TestClient` object works slightly differently from the HTTPX response returned by `await datasette.client.get(...)`:

```diff
diff --git a/datasette/app.py b/datasette/app.py
index f3cb8876..b770b469 100644
--- a/datasette/app.py
+++ b/datasette/app.py
@@ -281,7 +281,7 @@ class Datasette:
                 raise
         self.crossdb = crossdb
         self.nolock = nolock
-        if memory or crossdb or not self.files:
+        if memory or crossdb or (not self.files and memory is not False):
             self.add_database(
                 Database(self, is_mutable=False, is_memory=True), name="_memory"
             )
diff --git a/pytest.ini b/pytest.ini
index 559e518c..0bcb0d1e 100644
--- a/pytest.ini
+++ b/pytest.ini
@@ -8,4 +8,5 @@ filterwarnings=
     ignore:.*current_task.*:PendingDeprecationWarning
 markers =
     serial: tests to avoid using with pytest-xdist
+    ds_client: tests using the ds_client fixture
 asyncio_mode = strict
diff --git a/tests/conftest.py b/tests/conftest.py
index cd735e12..648423ba 100644
--- a/tests/conftest.py
+++ b/tests/conftest.py
@@ -2,6 +2,7 @@ import httpx
 import os
 import pathlib
 import pytest
+import pytest_asyncio
 import re
 import subprocess
 import tempfile
@@ -23,6 +24,22 @@ UNDOCUMENTED_PERMISSIONS = {
 }


+@pytest_asyncio.fixture
+async def ds_client():
+    from datasette.app import Datasette
+    from .fixtures import METADATA, PLUGINS_DIR
+    ds = Datasette(memory=False, metadata=METADATA, plugins_dir=PLUGINS_DIR)
+    from .fixtures import TABLES, TABLE_PARAMETERIZED_SQL
+    db = ds.add_memory_database("fixtures")
+    def prepare(conn):
+        conn.executescript(TABLES)
+        for sql, params in TABLE_PARAMETERIZED_SQL:
+            with conn:
+                conn.execute(sql, params)
+    await db.execute_write_fn(prepare)
+    return ds.client
+
+
 def pytest_report_header(config):
     return "SQLite: {}".format(
         sqlite3.connect(":memory:").execute("select sqlite_version()").fetchone()[0]
diff --git a/tests/test_api.py b/tests/test_api.py
index 5f2a6ea6..ddf4219c 100644
--- a/tests/test_api.py
+++ b/tests/test_api.py
@@ -23,12 +23,15 @@ import sys
 import urllib


-def test_homepage(app_client):
-    response = app_client.get("/.json")
-    assert response.status == 200
+@pytest.mark.ds_client
+@pytest.mark.asyncio
+async def test_homepage(ds_client):
+    response = await ds_client.get("/.json")
+    assert response.status_code == 200
     assert "application/json; charset=utf-8" == response.headers["content-type"]
-    assert response.json.keys() == {"fixtures": 0}.keys()
-    d = response.json["fixtures"]
+    data = response.json()
+    assert data.keys() == {"fixtures": 0}.keys()
+    d = data["fixtures"]
     assert d["name"] == "fixtures"
     assert d["tables_count"] == 24
     assert len(d["tables_and_views_truncated"]) == 5
@@ -36,7 +39,7 @@ def test_homepage(app_client):
     # 4 hidden FTS tables + no_primary_key (hidden in metadata)
     assert d["hidden_tables_count"] == 6
     # 201 in no_primary_key, plus 6 in other hidden tables:
-    assert d["hidden_table_rows_sum"] == 207, response.json
+    assert d["hidden_table_rows_sum"] == 207, data
     assert d["views_count"] == 4
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Refactor test suite to use mostly `async def` tests 1499081664  
1353707828 https://github.com/simonw/datasette/issues/1959#issuecomment-1353707828 https://api.github.com/repos/simonw/datasette/issues/1959 IC_kwDOBm6k_c5Qr_E0 simonw 9599 2022-12-15T21:06:29Z 2022-12-15T21:06:29Z OWNER

Previous, abandoned attempt at this work (for #1843):

```diff
diff --git a/datasette/app.py b/datasette/app.py
index 7e682498..cf35c3a2 100644
--- a/datasette/app.py
+++ b/datasette/app.py
@@ -228,7 +228,7 @@ class Datasette:
         template_dir=None,
         plugins_dir=None,
         static_mounts=None,
-        memory=False,
+        memory=None,
         settings=None,
         secret=None,
         version_note=None,
@@ -238,6 +238,7 @@ class Datasette:
         nolock=False,
     ):
         self._startup_invoked = False
+        self._extra_on_startup = []
         assert config_dir is None or isinstance(
             config_dir, Path
         ), "config_dir= should be a pathlib.Path"
@@ -278,7 +279,7 @@ class Datasette:
                 raise
         self.crossdb = crossdb
         self.nolock = nolock
-        if memory or crossdb or not self.files:
+        if memory or crossdb or (not self.files and memory is not False):
             self.add_database(
                 Database(self, is_mutable=False, is_memory=True), name="_memory"
             )
@@ -391,6 +392,9 @@ class Datasette:
         self._root_token = secrets.token_hex(32)
         self.client = DatasetteClient(self)

+    def _add_on_startup(self, fn):
+        self._extra_on_startup.append(fn)
+
     async def refresh_schemas(self):
         if self._refresh_schemas_lock.locked():
             return
@@ -431,6 +435,8 @@ class Datasette:
         # This must be called for Datasette to be in a usable state
         if self._startup_invoked:
             return
+        for fn in self._extra_on_startup:
+            await fn()
         # Register permissions, but watch out for duplicate name/abbr
         names = {}
         abbrs = {}
@@ -1431,9 +1437,9 @@ class Datasette:
         )
         if self.setting("trace_debug"):
             asgi = AsgiTracer(asgi)
+        asgi = AsgiRunOnFirstRequest(asgi, on_startup=[setup_db, self.invoke_startup])
         for wrapper in pm.hook.asgi_wrapper(datasette=self):
             asgi = wrapper(asgi)
-        asgi = AsgiRunOnFirstRequest(asgi, on_startup=[setup_db, self.invoke_startup])
         return asgi

diff --git a/datasette/utils/asgi.py b/datasette/utils/asgi.py
index 56690251..986755cb 100644
--- a/datasette/utils/asgi.py
+++ b/datasette/utils/asgi.py
@@ -423,9 +423,9 @@ class AsgiFileDownload:


 class AsgiRunOnFirstRequest:
-    def __init__(self, asgi, on_startup):
+    def __init__(self, app, on_startup):
         assert isinstance(on_startup, list)
-        self.asgi = asgi
+        self.app = app
         self.on_startup = on_startup
         self._started = False

@@ -434,4 +434,4 @@ class AsgiRunOnFirstRequest:
             self._started = True
             for hook in self.on_startup:
                 await hook()
-        return await self.asgi(scope, receive, send)
+        return await self.app(scope, receive, send)
diff --git a/tests/conftest.py b/tests/conftest.py
index cd735e12..d1301943 100644
--- a/tests/conftest.py
+++ b/tests/conftest.py
@@ -23,6 +23,15 @@ UNDOCUMENTED_PERMISSIONS = {
 }

+# @pytest.fixture(autouse=True)
+# def log_name_of_test_before_test(request):
+#     # To help identify tests that are hanging
+#     name = str(request.node)
+#     with open("/tmp/test.log", "a") as f:
+#         f.write(name + "\n")
+#     yield
+
+
 def pytest_report_header(config):
     return "SQLite: {}".format(
         sqlite3.connect(":memory:").execute("select sqlite_version()").fetchone()[0]
diff --git a/tests/fixtures.py b/tests/fixtures.py
index a6700239..18d3f1b7 100644
--- a/tests/fixtures.py
+++ b/tests/fixtures.py
@@ -101,6 +101,19 @@ EXPECTED_PLUGINS = [
 ]

+def _populate_connection(conn):
+    # Drop any tables and views that exist
+    to_drop = conn.execute(
+        "SELECT name, type FROM sqlite_master where type in ('table', 'view')"
+    ).fetchall()
+    for name, type in to_drop:
+        conn.execute(f"DROP {type} IF EXISTS [{name}]")
+    conn.executescript(TABLES)
+    for sql, params in TABLE_PARAMETERIZED_SQL:
+        with conn:
+            conn.execute(sql, params)
+
+
 @contextlib.contextmanager
 def make_app_client(
     sql_time_limit_ms=None,
@@ -117,45 +130,22 @@ def make_app_client(
     metadata=None,
     crossdb=False,
 ):
-    with tempfile.TemporaryDirectory() as tmpdir:
-        filepath = os.path.join(tmpdir, filename)
-        if is_immutable:
-            files = []
-            immutables = [filepath]
-        else:
-            files = [filepath]
-            immutables = []
-        conn = sqlite3.connect(filepath)
-        conn.executescript(TABLES)
-        for sql, params in TABLE_PARAMETERIZED_SQL:
-            with conn:
-                conn.execute(sql, params)
-        # Close the connection to avoid "too many open files" errors
-        conn.close()
-        if extra_databases is not None:
-            for extra_filename, extra_sql in extra_databases.items():
-                extra_filepath = os.path.join(tmpdir, extra_filename)
-                c2 = sqlite3.connect(extra_filepath)
-                c2.executescript(extra_sql)
-                c2.close()
-                # Insert at start to help test /-/databases ordering:
-                files.insert(0, extra_filepath)
-        os.chdir(os.path.dirname(filepath))
-        settings = settings or {}
-        for key, value in {
-            "default_page_size": 50,
-            "max_returned_rows": max_returned_rows or 100,
-            "sql_time_limit_ms": sql_time_limit_ms or 200,
-            # Default is 3 but this results in "too many open files"
-            # errors when running the full test suite:
-            "num_sql_threads": 1,
-        }.items():
-            if key not in settings:
-                settings[key] = value
+    settings = settings or {}
+    for key, value in {
+        "default_page_size": 50,
+        "max_returned_rows": max_returned_rows or 100,
+        "sql_time_limit_ms": sql_time_limit_ms or 200,
+        # Default is 3 but this results in "too many open files"
+        # errors when running the full test suite:
+        "num_sql_threads": 1,
+    }.items():
+        if key not in settings:
+            settings[key] = value
+    # We can use an in-memory database, but only if we're not doing anything
+    # with is_immutable or extra_databases and filename is the default
+    if not is_immutable and not extra_databases and filename == "fixtures.db":
         ds = Datasette(
-            files,
-            immutables=immutables,
-            memory=memory,
+            memory=memory or False,
             cors=cors,
             metadata=metadata or METADATA,
             plugins_dir=PLUGINS_DIR,
@@ -165,12 +155,57 @@ def make_app_client(
             template_dir=template_dir,
             crossdb=crossdb,
         )
+        db = ds.add_memory_database("fixtures")
+
+        async def populate_fixtures():
+            print("Here we go... populating fixtures")
+            await db.execute_write_fn(_populate_connection)
+
+        ds._add_on_startup(populate_fixtures)
         yield TestClient(ds)
-        # Close as many database connections as possible
-        # to try and avoid too many open files error
-        for db in ds.databases.values():
-            if not db.is_memory:
-                db.close()
+    else:
+        with tempfile.TemporaryDirectory() as tmpdir:
+            filepath = os.path.join(tmpdir, filename)
+            if is_immutable:
+                files = []
+                immutables = [filepath]
+            else:
+                files = [filepath]
+                immutables = []
+
+            conn = sqlite3.connect(filepath)
+            _populate_connection(conn)
+            # Close the connection to reduce "too many open files" errors
+            conn.close()
+
+            if extra_databases is not None:
+                for extra_filename, extra_sql in extra_databases.items():
+                    extra_filepath = os.path.join(tmpdir, extra_filename)
+                    c2 = sqlite3.connect(extra_filepath)
+                    c2.executescript(extra_sql)
+                    c2.close()
+                    # Insert at start to help test /-/databases ordering:
+                    files.insert(0, extra_filepath)
+            os.chdir(os.path.dirname(filepath))
+            ds = Datasette(
+                files,
+                immutables=immutables,
+                memory=memory,
+                cors=cors,
+                metadata=metadata or METADATA,
+                plugins_dir=PLUGINS_DIR,
+                settings=settings,
+                inspect_data=inspect_data,
+                static_mounts=static_mounts,
+                template_dir=template_dir,
+                crossdb=crossdb,
+            )
+            yield TestClient(ds)
+            # Close as many database connections as possible
+            # to try and avoid too many open files error
+            for db in ds.databases.values():
+                if not db.is_memory:
+                    db.close()

 @pytest.fixture(scope="session")
diff --git a/tests/test_cli.py b/tests/test_cli.py
index d3e015fa..d9e4e457 100644
--- a/tests/test_cli.py
+++ b/tests/test_cli.py
@@ -1,5 +1,6 @@
 from .fixtures import (
     app_client,
+    app_client_with_cors,
     make_app_client,
     TestClient as _TestClient,
     EXPECTED_PLUGINS,
@@ -38,7 +39,7 @@ def test_inspect_cli(app_client):
         assert expected_count == database["tables"][table_name]["count"]


-def test_inspect_cli_writes_to_file(app_client):
+def test_inspect_cli_writes_to_file(app_client_with_cors):
     runner = CliRunner()
     result = runner.invoke(
         cli, ["inspect", "fixtures.db", "--inspect-file", "foo.json"]
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Refactor test suite to use mostly `async def` tests 1499081664  
1353705072 https://github.com/simonw/datasette/issues/1959#issuecomment-1353705072 https://api.github.com/repos/simonw/datasette/issues/1959 IC_kwDOBm6k_c5Qr-Zw simonw 9599 2022-12-15T21:04:07Z 2022-12-15T21:04:07Z OWNER

I'm going to start by getting every test that uses the raw (app_client) fixture and nothing else (194 at the moment) to switch to async def using a shared Datasette instance and datasette.client.get().

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Refactor test suite to use mostly `async def` tests 1499081664  
1353701674 https://github.com/simonw/datasette/issues/1955#issuecomment-1353701674 https://api.github.com/repos/simonw/datasette/issues/1955 IC_kwDOBm6k_c5Qr9kq simonw 9599 2022-12-15T21:00:51Z 2022-12-15T21:00:51Z OWNER

OK, I've broken the test suite here.

I'm going to revert these two commits:

  • https://github.com/simonw/datasette/commit/dc18f62089e5672d03176f217d7840cdafa5c447
  • https://github.com/simonw/datasette/commit/51ee8caa4a697fa3f4120e93b1c205b714a6cdc7

Then I'll do a bunch of work making the test suite more robust before I try this again.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
invoke_startup() is not run in some conditions, e.g. gunicorn/uvicorn workers, breaking lots of things 1496652622  
1353694582 https://github.com/simonw/datasette/issues/1955#issuecomment-1353694582 https://api.github.com/repos/simonw/datasette/issues/1955 IC_kwDOBm6k_c5Qr712 simonw 9599 2022-12-15T20:52:46Z 2022-12-15T20:52:46Z OWNER

Just noticed this: https://github.com/simonw/datasette/actions/runs/3706504228/jobs/6281796135

This suggests that the regular tests passed in CI fine, but the non-serial ones failed.

I'm going to try running everything using pytest -n auto without splitting serial and non-serial tests. Maybe the serial thing isn't needed any more?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
invoke_startup() is not run in some conditions, e.g. gunicorn/uvicorn workers, breaking lots of things 1496652622  
1353690591 https://github.com/simonw/datasette/issues/1843#issuecomment-1353690591 https://api.github.com/repos/simonw/datasette/issues/1843 IC_kwDOBm6k_c5Qr63f simonw 9599 2022-12-15T20:49:05Z 2022-12-15T20:49:05Z OWNER

I have a nasty feeling the cleaner solution for this would involve porting my entire test suite from `def test_blah(app_client)` sync functions (which work due to an `@async_to_sync` call in `TestClient`) to `async def test_blah(async_fixture)` functions instead.

I've been using that latter pattern for new tests (and plugin tests) for quite a while now, but I never took on the job of refactoring all of the old ones.

A search for `(app_client):` across the whole project currently returns 194 results, which might make it a reasonable target for switching to the new pattern as a starting point.

No idea if it will have much impact on the "Too many open files" errors though.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Intermittent "Too many open files" error running tests 1408757705  
1353683238 https://github.com/simonw/datasette/issues/1955#issuecomment-1353683238 https://api.github.com/repos/simonw/datasette/issues/1955 IC_kwDOBm6k_c5Qr5Em simonw 9599 2022-12-15T20:42:18Z 2022-12-15T20:42:18Z OWNER

Possibly related issue: - https://github.com/pytest-dev/pytest-xdist/issues/60

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
invoke_startup() is not run in some conditions, e.g. gunicorn/uvicorn workers, breaking lots of things 1496652622  
1353680261 https://github.com/simonw/datasette/issues/1955#issuecomment-1353680261 https://api.github.com/repos/simonw/datasette/issues/1955 IC_kwDOBm6k_c5Qr4WF simonw 9599 2022-12-15T20:39:19Z 2022-12-15T20:39:19Z OWNER

When I hit Ctrl+C here's the traceback I get:

```
^C^CException ignored in: <module 'threading' from '/Users/simon/.pyenv/versions/3.10.3/lib/python3.10/threading.py'>
Traceback (most recent call last):
  File "/Users/simon/.pyenv/versions/3.10.3/lib/python3.10/threading.py", line 1530, in _shutdown
!!!!!!!!!!!!!!!!!!!!!!!!!! KeyboardInterrupt !!!!!!!!!!!!!!!!!!!!!!!!!!
/Users/simon/.pyenv/versions/3.10.3/lib/python3.10/threading.py:324: KeyboardInterrupt
(to show a full traceback on KeyboardInterrupt use --full-trace)
Traceback (most recent call last):
  File "/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/bin/pytest", line 8, in <module>
    atexit_call()
  File "/Users/simon/.pyenv/versions/3.10.3/lib/python3.10/concurrent/futures/thread.py", line 31, in _python_exit
    sys.exit(console_main())
  File "/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.10/site-packages/_pytest/config/__init__.py", line 187, in console_main
    t.join()
  File "/Users/simon/.pyenv/versions/3.10.3/lib/python3.10/threading.py", line 1089, in join
    self._wait_for_tstate_lock()
  File "/Users/simon/.pyenv/versions/3.10.3/lib/python3.10/threading.py", line 1109, in _wait_for_tstate_lock
    if lock.acquire(block, timeout):
KeyboardInterrupt:
    code = main()
  File "/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.10/site-packages/_pytest/config/__init__.py", line 164, in main
    ret: Union[ExitCode, int] = config.hook.pytest_cmdline_main(
  File "/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.10/site-packages/pluggy/_hooks.py", line 265, in __call__
    return self._hookexec(self.name, self.get_hookimpls(), kwargs, firstresult)
  File "/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.10/site-packages/pluggy/_manager.py", line 80, in _hookexec
    return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
  File "/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.10/site-packages/pluggy/_callers.py", line 60, in _multicall
    return outcome.get_result()
  File "/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.10/site-packages/pluggy/_result.py", line 60, in get_result
    raise ex[1].with_traceback(ex[2])
  File "/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.10/site-packages/pluggy/_callers.py", line 39, in _multicall
    res = hook_impl.function(*args)
  File "/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.10/site-packages/_pytest/main.py", line 315, in pytest_cmdline_main
    return wrap_session(config, _main)
  File "/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.10/site-packages/_pytest/main.py", line 303, in wrap_session
    config.hook.pytest_sessionfinish(
  File "/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.10/site-packages/pluggy/_hooks.py", line 265, in __call__
    return self._hookexec(self.name, self.get_hookimpls(), kwargs, firstresult)
  File "/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.10/site-packages/pluggy/_manager.py", line 80, in _hookexec
    return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
  File "/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.10/site-packages/pluggy/_callers.py", line 55, in _multicall
    gen.send(outcome)
  File "/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.10/site-packages/_pytest/terminal.py", line 798, in pytest_sessionfinish
    outcome.get_result()
  File "/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.10/site-packages/pluggy/_result.py", line 60, in get_result
    raise ex[1].with_traceback(ex[2])
  File "/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.10/site-packages/pluggy/_callers.py", line 39, in _multicall
    res = hook_impl.function(*args)
  File "/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.10/site-packages/xdist/dsession.py", line 88, in pytest_sessionfinish
    nm.teardown_nodes()
  File "/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.10/site-packages/xdist/workermanage.py", line 79, in teardown_nodes
    self.group.terminate(self.EXIT_TIMEOUT)
  File "/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.10/site-packages/execnet/multi.py", line 215, in terminate
    safe_terminate(
  File "/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.10/site-packages/execnet/multi.py", line 311, in safe_terminate
    reply.get()
  File "/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.10/site-packages/execnet/gateway_base.py", line 206, in get
    self.waitfinish(timeout)
  File "/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.10/site-packages/execnet/gateway_base.py", line 213, in waitfinish
    if not self._result_ready.wait(timeout):
  File "/Users/simon/.pyenv/versions/3.10.3/lib/python3.10/threading.py", line 600, in wait
    signaled = self._cond.wait(timeout)
  File "/Users/simon/.pyenv/versions/3.10.3/lib/python3.10/threading.py", line 320, in wait
    waiter.acquire()
KeyboardInterrupt
```

It looks to me like this relates to pytest-xdist itself - it's waiting on some locks, and site-packages/xdist/workermanage.py shows up in that traceback.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
invoke_startup() is not run in some conditions, e.g. gunicorn/uvicorn workers, breaking lots of things 1496652622  
1353522652 https://github.com/simonw/datasette/issues/1843#issuecomment-1353522652 https://api.github.com/repos/simonw/datasette/issues/1843 IC_kwDOBm6k_c5QrR3c simonw 9599 2022-12-15T18:21:27Z 2022-12-15T18:21:27Z OWNER

I'll still use on-disk test databases for is_immutable=True, but not for the majority of tests.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Intermittent "Too many open files" error running tests 1408757705  
1353522211 https://github.com/simonw/datasette/issues/1843#issuecomment-1353522211 https://api.github.com/repos/simonw/datasette/issues/1843 IC_kwDOBm6k_c5QrRwj simonw 9599 2022-12-15T18:21:02Z 2022-12-15T18:21:02Z OWNER

When I initially built this test suite Datasette didn't have the memory_name= mechanism for creating persistent in-memory databases.

I'm going to see if I can switch to that for the majority of Datasette's tests. I might find that doing so both fixes this "too many open files" issue AND gives a significant speed improvement for the test suite too!

Hopefully I can do most of the work on that in this big ugly function:

https://github.com/simonw/datasette/blob/dc18f62089e5672d03176f217d7840cdafa5c447/tests/fixtures.py#L104-L173
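A minimal sketch of what that switch could look like, using the add_memory_database() method that appears in the later diffs in this thread (the surrounding function is illustrative, not from the codebase):

```python
from datasette.app import Datasette


async def make_fixtures_datasette(tables_sql):
    ds = Datasette(memory=False)
    # A named in-memory database persists for the lifetime of the instance,
    # so no temporary file (and no extra file handle) is needed
    db = ds.add_memory_database("fixtures")

    def populate(conn):
        conn.executescript(tables_sql)

    await db.execute_write_fn(populate)
    return ds
```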

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Intermittent "Too many open files" error running tests 1408757705  
1353520615 https://github.com/simonw/datasette/issues/1843#issuecomment-1353520615 https://api.github.com/repos/simonw/datasette/issues/1843 IC_kwDOBm6k_c5QrRXn simonw 9599 2022-12-15T18:19:25Z 2022-12-15T18:19:25Z OWNER

I've been seeing this error again:

```
ERROR tests/test_api_write.py::test_create_table[input16-400-expected_response16] - OSError: [Errno 24] Too ...
ERROR tests/test_api_write.py::test_create_table[input17-400-expected_response17] - OSError: [Errno 24] Too ...
ERROR tests/test_api_write.py::test_create_table[input18-400-expected_response18] - OSError: [Errno 24] Too ...
```

It doesn't happen in CI, and it turns out that's because CI runs pytest -n auto which splits the tests across multiple parallel processes.

I've been working around the error on my laptop using pytest -n auto there too, but I'd rather not have to do that.
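One quick way to check whether this is just the default macOS per-process file-descriptor limit (an illustrative diagnostic, not something from this thread):

```python
import resource

soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"open file limit: soft={soft}, hard={hard}")
# Raise the soft limit for this process (bounded, to avoid macOS quirks
# around setting it to RLIM_INFINITY)
resource.setrlimit(resource.RLIMIT_NOFILE, (min(hard, 10240), hard))
```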

This is also getting in my way when I try to debug other issues, like this one:

  • 1955

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Intermittent "Too many open files" error running tests 1408757705  
1353516572 https://github.com/simonw/datasette/issues/1955#issuecomment-1353516572 https://api.github.com/repos/simonw/datasette/issues/1955 IC_kwDOBm6k_c5QrQYc simonw 9599 2022-12-15T18:15:28Z 2022-12-15T18:15:28Z OWNER

I added return to the first line of that test to disable it, then ran again - and now it's hanging at about the same progress point through the tests but in a different test:

So this time it was hanging at test_urlsafe_components().

So it's clearly not the individual tests themselves that are the problem - something about running the entire test suite in one go is incompatible with this change for some reason.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
invoke_startup() is not run in some conditions, e.g. gunicorn/uvicorn workers, breaking lots of things 1496652622  
1353512099 https://github.com/simonw/datasette/issues/1955#issuecomment-1353512099 https://api.github.com/repos/simonw/datasette/issues/1955 IC_kwDOBm6k_c5QrPSj simonw 9599 2022-12-15T18:11:27Z 2022-12-15T18:11:27Z OWNER

This is surprising!

The logs suggest that the test suite hung running this test here:

https://github.com/simonw/datasette/blob/dc18f62089e5672d03176f217d7840cdafa5c447/tests/test_utils.py#L55-L58

I find that very hard to believe.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
invoke_startup() is not run in some conditions, e.g. gunicorn/uvicorn workers, breaking lots of things 1496652622  
1353509776 https://github.com/simonw/datasette/issues/1955#issuecomment-1353509776 https://api.github.com/repos/simonw/datasette/issues/1955 IC_kwDOBm6k_c5QrOuQ simonw 9599 2022-12-15T18:09:26Z 2022-12-15T18:09:26Z OWNER

I added this to conftest.py:

```python
@pytest.fixture(autouse=True)
def log_name_of_test_before_test(request):
    # To help identify tests that are hanging
    name = str(request.node)
    with open("/tmp/test.log", "a") as f:
        f.write(name + "\n")
    yield
```

This logs out the name of each test to /tmp/test.log before running the test - so I can wait until it hangs and see which test it was that caused that.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
invoke_startup() is not run in some conditions, e.g. gunicorn/uvicorn workers, breaking lots of things 1496652622  
1353473571 https://github.com/simonw/datasette/issues/1955#issuecomment-1353473571 https://api.github.com/repos/simonw/datasette/issues/1955 IC_kwDOBm6k_c5QrF4j simonw 9599 2022-12-15T17:43:28Z 2022-12-15T17:43:48Z OWNER

Running:

pytest -n auto -x -v

On my laptop to see if I can replicate.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
invoke_startup() is not run in some conditions, e.g. gunicorn/uvicorn workers, breaking lots of things 1496652622  
1353473086 https://github.com/simonw/datasette/issues/1955#issuecomment-1353473086 https://api.github.com/repos/simonw/datasette/issues/1955 IC_kwDOBm6k_c5QrFw- simonw 9599 2022-12-15T17:43:08Z 2022-12-15T17:43:08Z OWNER

It looks like that fix almost works... except it seems to push the tests into an infinite loop or similar? They're not finishing their runs from what I can see.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
invoke_startup() is not run in some conditions, e.g. gunicorn/uvicorn workers, breaking lots of things 1496652622  
1353448095 https://github.com/simonw/datasette/issues/1955#issuecomment-1353448095 https://api.github.com/repos/simonw/datasette/issues/1955 IC_kwDOBm6k_c5Qq_qf simonw 9599 2022-12-15T17:25:05Z 2022-12-15T17:25:05Z OWNER

So actually that setup_db() function I wrote back in 2019 has not been executing for most of Datasette's tests. Which seems bad.

I'm inclined to ditch AsgiLifespan entirely in favour of the mechanism I described above, where invoke_startup() is triggered by the first request processed by the server.
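The shape of that mechanism, paraphrased from the AsgiRunOnFirstRequest wrapper that shows up in the diffs earlier in this thread (a sketch, not the exact Datasette code):

```python
class RunOnFirstRequest:
    """ASGI wrapper that runs startup hooks once, before the first request."""

    def __init__(self, app, on_startup):
        self.app = app
        self.on_startup = on_startup
        self._started = False

    async def __call__(self, scope, receive, send):
        if not self._started:
            self._started = True
            for hook in self.on_startup:
                await hook()
        return await self.app(scope, receive, send)
```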

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
invoke_startup() is not run in some conditions, e.g. gunicorn/uvicorn workers, breaking lots of things 1496652622  
1353443718 https://github.com/simonw/datasette/issues/1955#issuecomment-1353443718 https://api.github.com/repos/simonw/datasette/issues/1955 IC_kwDOBm6k_c5Qq-mG simonw 9599 2022-12-15T17:23:12Z 2022-12-15T17:23:55Z OWNER

That may not be the best fix here. It turns out this pattern:

```python
async def get(self, path, **kwargs):
    async with httpx.AsyncClient(app=self.app) as client:
        return await client.get(self._fix(path), **kwargs)
```

Doesn't trigger that AsgiLifespan class.

I wrote about that previously in this TIL: https://til.simonwillison.net/asgi/lifespan-test-httpx
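The workaround described in that TIL is to drive the lifespan events explicitly, for example with the asgi-lifespan package; this is a sketch from memory, so treat the details as approximate:

```python
import httpx
from asgi_lifespan import LifespanManager


async def get_with_lifespan(app, path):
    # Run ASGI startup/shutdown around the request - plain
    # httpx.AsyncClient(app=...) never sends lifespan events itself
    async with LifespanManager(app):
        async with httpx.AsyncClient(app=app) as client:
            return await client.get(path)
```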

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
invoke_startup() is not run in some conditions, e.g. gunicorn/uvicorn workers, breaking lots of things 1496652622  
1353423584 https://github.com/simonw/datasette/issues/1955#issuecomment-1353423584 https://api.github.com/repos/simonw/datasette/issues/1955 IC_kwDOBm6k_c5Qq5rg simonw 9599 2022-12-15T17:13:18Z 2022-12-15T17:22:59Z OWNER

Wow, just spotted this in the code - it turns out I solved this problem a different (and better) way long before I introduced invoke_startup()!

https://github.com/simonw/datasette/blob/e054704fb64d1f23154ec43b81b6c9481ff8202f/datasette/app.py#L1416-L1440

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
invoke_startup() is not run in some conditions, e.g. gunicorn/uvicorn workers, breaking lots of things 1496652622  
1352674924 https://github.com/simonw/datasette/issues/1955#issuecomment-1352674924 https://api.github.com/repos/simonw/datasette/issues/1955 IC_kwDOBm6k_c5QoC5s simonw 9599 2022-12-15T07:46:36Z 2022-12-15T07:46:36Z OWNER

It's possible the fix for this might be for the first incoming HTTP request to trigger invoke_startup() if it hasn't been called yet - similar to the hack I put in place for datasette.client.get() in tests:

https://github.com/simonw/datasette/blob/e054704fb64d1f23154ec43b81b6c9481ff8202f/datasette/app.py#L1728-L1731

This would be a much more elegant fix, I could remove those multiple invoke_startup() calls entirely - and remove this tip from the documentation too: https://docs.datasette.io/en/0.63.2/testing_plugins.html#setting-up-a-datasette-test-instance

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
invoke_startup() is not run in some conditions, e.g. gunicorn/uvicorn workers, breaking lots of things 1496652622  
1352644281 https://github.com/simonw/datasette/issues/1958#issuecomment-1352644281 https://api.github.com/repos/simonw/datasette/issues/1958 IC_kwDOBm6k_c5Qn7a5 simonw 9599 2022-12-15T07:08:14Z 2022-12-15T07:08:14Z OWNER

Thanks for the detailed write-up! This looks like a bug in Datasette itself when run with Docker. Moving this issue there.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
datasette --root running in Docker doesn't reliably show the magic URL 1497909798  
1352643333 https://github.com/simonw/datasette/issues/1955#issuecomment-1352643333 https://api.github.com/repos/simonw/datasette/issues/1955 IC_kwDOBm6k_c5Qn7MF simonw 9599 2022-12-15T07:07:29Z 2022-12-15T07:07:29Z OWNER

Datasette 0.63 is the release that broke this, thanks to this issue:

  • https://github.com/simonw/datasette/issues/1809
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
invoke_startup() is not run in some conditions, e.g. gunicorn/uvicorn workers, breaking lots of things 1496652622  
1352643049 https://github.com/simonw/datasette/issues/1955#issuecomment-1352643049 https://api.github.com/repos/simonw/datasette/issues/1955 IC_kwDOBm6k_c5Qn7Hp simonw 9599 2022-12-15T07:07:10Z 2022-12-15T07:07:10Z OWNER

This is definitely a regression: Datasette is meant to work in those environments, and I didn't think to test them when I added the invoke_startup() hook.

Coincidentally I actually built a plugin for running Datasette with Gunicorn just a couple of months ago:

https://datasette.io/plugins/datasette-gunicorn

And I just tested and it has the same bug you describe here! Filed:

  • https://github.com/simonw/datasette-gunicorn/issues/5
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
invoke_startup() is not run in some conditions, e.g. gunicorn/uvicorn workers, breaking lots of things 1496652622  
1352459146 https://github.com/simonw/datasette/issues/1953#issuecomment-1352459146 https://api.github.com/repos/simonw/datasette/issues/1953 IC_kwDOBm6k_c5QnOOK simonw 9599 2022-12-15T02:02:15Z 2022-12-15T02:02:15Z OWNER

The third Datasette 1.0 alpha release adds upsert support to the JSON API, plus the ability to specify finely grained permissions when creating an API token.

  • New /db/table/-/upsert API, documented here. upsert is an update-or-replace: existing rows will have specified keys updated, but if no row matches the incoming primary key a brand new row will be inserted instead. (#1878)
  • New register_permissions(datasette) plugin hook. Plugins can now register named permissions, which will then be listed in various interfaces that show available permissions. (#1940)
  • The /db/-/create API for creating a table now accepts "ignore": true and "replace": true options when called with the "rows" property that creates a new table based on an example set of rows. This means the API can be called multiple times with different rows, setting rules for what should happen if a primary key collides with an existing row. (#1927)
  • Arbitrary permissions can now be configured at the instance, database and resource (table, SQL view or canned query) level in Datasette's Metadata JSON and YAML files. The new "permissions" key can be used to specify which actors should have which permissions. See Other permissions in metadata for details. (#1636)
  • The /-/create-token page can now be used to create API tokens which are restricted to just a subset of actions, including against specific databases or resources. See API Tokens for details. (#1947)
  • Likewise, the datasette create-token CLI command can now create tokens with a subset of permissions. (#1855)
  • New datasette.create_token() API method for programmatically creating signed API tokens. (#1951)
  • /db/-/create API now requires actor to have insert-row permission in order to use the "row" or "rows" properties. (#1937)
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Release notes for Datasette 1.0a2 1495821607  
1352411327 https://github.com/simonw/datasette/issues/1949#issuecomment-1352411327 https://api.github.com/repos/simonw/datasette/issues/1949 IC_kwDOBm6k_c5QnCi_ simonw 9599 2022-12-15T00:46:27Z 2022-12-15T00:46:27Z OWNER

I got this far:

```diff
diff --git a/datasette/handle_exception.py b/datasette/handle_exception.py
index 8b7e83e3..31d41e00 100644
--- a/datasette/handle_exception.py
+++ b/datasette/handle_exception.py
@@ -54,7 +54,17 @@ def handle_exception(datasette, request, exception):
         headers = {}
         if datasette.cors:
             add_cors_headers(headers)
-        if request.path.split("?")[0].endswith(".json"):
+        # Return JSON error under certain conditions
+        should_return_json = (
+            # URL ends in .json
+            request.path.split("?")[0].endswith(".json")
+            or
+            # Hints from incoming request headers
+            request.headers.get("content-type") == "application/json"
+            or "application/json" in request.headers.get("accept", "")
+        )
+        breakpoint()
+        if should_return_json:
             return Response.json(info, status=status, headers=headers)
         else:
             template = datasette.jinja_env.select_template(templates)
diff --git a/tests/test_api_write.py b/tests/test_api_write.py
index f27d143f..982543a6 100644
--- a/tests/test_api_write.py
+++ b/tests/test_api_write.py
@@ -1140,6 +1140,38 @@ async def test_create_table_permissions(
     assert data["errors"] == expected_errors


+@pytest.mark.asyncio
+@pytest.mark.parametrize(
+    "headers,expect_json",
+    (
+        ({}, False),
+        ({"Accept": "text/html"}, True),
+        ({"Accept": "application/json"}, True),
+        ({"Content-Type": "application/json"}, True),
+        ({"Accept": "application/json, text/plain, */*"}, True),
+        ({"Content-Type": "application/json"}, True),
+        ({"accept": "application/json, text/plain, */*"}, True),
+        ({"content-type": "application/json"}, True),
+    ),
+)
+async def test_permission_errors_html_and_json(ds_write, headers, expect_json):
+    request_headers = {"Authorization": "Bearer bad_token"}
+    request_headers.update(headers)
+    response = await ds_write.client.post(
+        "/data/-/create",
+        json={},
+        headers=request_headers,
+    )
+    assert response.status_code == 403
+    if expect_json:
+        data = response.json()
+        assert data["ok"] is False
+        assert data["errors"] == ["Permission denied"]
+    else:
+        assert response.headers["Content-Type"] == "text/html; charset=utf-8"
+        assert "Permission denied" in response.text
+
+
 @pytest.mark.asyncio
 @pytest.mark.parametrize(
     "input,expected_rows_after",
```

Then decided I would punt this until the next milestone.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
`.json` errors should be returned as JSON 1493471221  
1352410078 https://github.com/simonw/datasette/issues/1953#issuecomment-1352410078 https://api.github.com/repos/simonw/datasette/issues/1953 IC_kwDOBm6k_c5QnCPe simonw 9599 2022-12-15T00:44:56Z 2022-12-15T00:44:56Z OWNER

Highlights:

  • /db/table/-/upsert
  • ignore and replace for /db/-/create
  • register_permissions() plugin hook
  • datasette create-token can create restricted tokens
  • /-/create-token can too
  • datasette --get --token option
  • datasette.create_token() API method

Plus some smaller things.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Release notes for Datasette 1.0a2 1495821607  
1352378370 https://github.com/simonw/datasette/issues/1949#issuecomment-1352378370 https://api.github.com/repos/simonw/datasette/issues/1949 IC_kwDOBm6k_c5Qm6gC simonw 9599 2022-12-15T00:02:08Z 2022-12-15T00:04:54Z OWNER

I fixed this issue to help research this further: - https://github.com/simonw/datasette-ripgrep/issues/26

Now this search works:

https://ripgrep.datasette.io/-/ripgrep?pattern=return+_error&literal=on&glob=datasette%2F**

I wish I had this feature! - https://github.com/simonw/datasette-ripgrep/issues/24

Looks like I have both _error() and _errors() functions in there!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
`.json` errors should be returned as JSON 1493471221  


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
, [performed_via_github_app] TEXT);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);