issue_comments


6,906 rows where user = 9599 sorted by updated_at descending


issue (>30 values)

  • Show column metadata plus links for foreign keys on arbitrary query results 50
  • Rethink how .ext formats (v.s. ?_format=) works before 1.0 47
  • Updated Dockerfile with SpatiaLite version 5.0 45
  • Complete refactor of TableView and table.html template 45
  • Redesign default .json format 43
  • Port Datasette to ASGI 38
  • Authentication (and permissions) as a core concept 38
  • JavaScript plugin hooks mechanism similar to pluggy 38
  • await datasette.client.get(path) mechanism for executing internal requests 33
  • Maintain an in-memory SQLite table of connected databases and their tables 31
  • Deploy a live instance of demos/apache-proxy 31
  • Ability to sort (and paginate) by column 29
  • Research: demonstrate if parallel SQL queries are worthwhile 29
  • Export to CSV 27
  • Optimize all those calls to index_list and foreign_key_list 27
  • Ability for a canned query to write to the database 26
  • table.transform() method for advanced alter table 26
  • Proof of concept for Datasette on AWS Lambda with EFS 25
  • New pattern for views that return either JSON or HTML, available for plugins 25
  • Support cross-database joins 24
  • Redesign register_output_renderer callback 24
  • "datasette insert" command and plugin hook 23
  • Datasette Plugins 21
  • table.extract(...) method and "sqlite-utils extract" command 21
  • ?sort=colname~numeric to sort by column cast to real 21
  • Idea: import CSV to memory, run SQL, export in a single command 21
  • base_url is omitted in JSON and CSV views 21
  • Switch documentation theme to Furo 21
  • "flash messages" mechanism 20
  • Move CI to GitHub Actions 20
  • …

author_association (2 values)

  • OWNER 6,420
  • MEMBER 486

user (1 value)

  • simonw · 6,906
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions issue performed_via_github_app
1133417432 https://github.com/simonw/sqlite-utils/issues/435#issuecomment-1133417432 https://api.github.com/repos/simonw/sqlite-utils/issues/435 IC_kwDOCGYnMM5DjpPY simonw 9599 2022-05-20T21:56:10Z 2022-05-20T21:56:10Z OWNER

[Before and after screenshots]

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Switch to Furo documentation theme 1243704847  
1133416698 https://github.com/simonw/sqlite-utils/issues/435#issuecomment-1133416698 https://api.github.com/repos/simonw/sqlite-utils/issues/435 IC_kwDOCGYnMM5DjpD6 simonw 9599 2022-05-20T21:54:43Z 2022-05-20T21:54:43Z OWNER

Done: https://sqlite-utils.datasette.io/en/latest/reference.html

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Switch to Furo documentation theme 1243704847  
1133396285 https://github.com/simonw/datasette/issues/1746#issuecomment-1133396285 https://api.github.com/repos/simonw/datasette/issues/1746 IC_kwDOBm6k_c5DjkE9 simonw 9599 2022-05-20T21:28:29Z 2022-05-20T21:28:29Z OWNER

That fixed it:

https://user-images.githubusercontent.com/9599/169614893-ec81fe0e-6043-4d7d-b429-7f087dbeaf61.png

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Switch documentation theme to Furo 1243498298  
1133348094 https://github.com/simonw/datasette/issues/1746#issuecomment-1133348094 https://api.github.com/repos/simonw/datasette/issues/1746 IC_kwDOBm6k_c5DjYT- simonw 9599 2022-05-20T20:40:09Z 2022-05-20T20:40:09Z OWNER

Relevant JavaScript: https://github.com/simonw/datasette/blob/1d33fd03b3c211e0f48a8f3bde83880af89e4e69/docs/_static/js/custom.js#L20-L24

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Switch documentation theme to Furo 1243498298  
1133347051 https://github.com/simonw/datasette/issues/1746#issuecomment-1133347051 https://api.github.com/repos/simonw/datasette/issues/1746 IC_kwDOBm6k_c5DjYDr simonw 9599 2022-05-20T20:39:17Z 2022-05-20T20:39:17Z OWNER

Now live at https://docs.datasette.io/en/latest/ - though the JavaScript that adds the banner warning that this isn't the stable version doesn't seem to work.

Before:

https://user-images.githubusercontent.com/9599/169607254-99bc0358-4a08-43bf-9aac-a24cc2121979.png

After:

https://user-images.githubusercontent.com/9599/169607296-bcec1ed2-517c-4acc-a9a9-d1119d0cc589.png

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Switch documentation theme to Furo 1243498298  
1133335940 https://github.com/simonw/datasette/issues/1746#issuecomment-1133335940 https://api.github.com/repos/simonw/datasette/issues/1746 IC_kwDOBm6k_c5DjVWE simonw 9599 2022-05-20T20:30:29Z 2022-05-20T20:30:29Z OWNER

I think the trick will be to extend Furo's base.html template using the same approach I used in https://til.simonwillison.net/readthedocs/custom-sphinx-templates

https://github.com/pradyunsg/furo/blob/2022.04.07/src/furo/theme/furo/base.html - the site_meta block looks good.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Switch documentation theme to Furo 1243498298  
1133333144 https://github.com/simonw/datasette/issues/1746#issuecomment-1133333144 https://api.github.com/repos/simonw/datasette/issues/1746 IC_kwDOBm6k_c5DjUqY simonw 9599 2022-05-20T20:28:25Z 2022-05-20T20:28:25Z OWNER

One last question: how to include the Plausible analytics?

Furo doesn't have any specific tools for this:

  • https://github.com/pradyunsg/furo/discussions/243
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Switch documentation theme to Furo 1243498298  
1133331997 https://github.com/simonw/datasette/issues/1746#issuecomment-1133331997 https://api.github.com/repos/simonw/datasette/issues/1746 IC_kwDOBm6k_c5DjUYd simonw 9599 2022-05-20T20:27:31Z 2022-05-20T20:27:31Z OWNER

I'm going to move my custom JavaScript from layout.html into js/custom.js, similar to how the custom CSS works.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Switch documentation theme to Furo 1243498298  
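
A plausible sketch of how that gets wired up in Sphinx's conf.py - this is the standard Sphinx mechanism, assuming the file lands in docs/_static/js/custom.js, not necessarily Datasette's exact configuration:

# docs/conf.py
html_static_path = ["_static"]

# Serve _static/js/custom.js on every page, the same way the custom CSS is served
html_js_files = ["js/custom.js"]
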
1133331564 https://github.com/simonw/datasette/issues/1746#issuecomment-1133331564 https://api.github.com/repos/simonw/datasette/issues/1746 IC_kwDOBm6k_c5DjURs simonw 9599 2022-05-20T20:27:12Z 2022-05-20T20:27:12Z OWNER

This seems to work for brand.html:

<div class="sidebar-brand centered">
  {% block brand_content %}
  <div class="sidebar-logo-container">
    <a href="https://datasette.io/"><img class="sidebar-logo" src="{{ logo_url }}" alt="Datasette"></a>
  </div>
  {%- set nav_version = version %}
  {% if READTHEDOCS and current_version %}
    {%- set nav_version = current_version %}
  {% endif %}
  {% if nav_version %}
    <div class="version">
      {{ nav_version }}
    </div>
  {% endif %}
  {% endblock brand_content %}
</div>
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Switch documentation theme to Furo 1243498298  
1133310253 https://github.com/simonw/datasette/issues/1746#issuecomment-1133310253 https://api.github.com/repos/simonw/datasette/issues/1746 IC_kwDOBm6k_c5DjPEt simonw 9599 2022-05-20T20:11:00Z 2022-05-20T20:11:00Z OWNER

Oh but rg display_version is a lot more interesting:

lib/python3.10/site-packages/sphinx/builders/html/__init__.py:from sphinx import __display_version__, package_dir
lib/python3.10/site-packages/sphinx/builders/html/__init__.py:            'sphinx_version': __display_version__,
lib/python3.10/site-packages/sphinx/application.py:        logger.info(bold(__('Running Sphinx v%s') % sphinx.__display_version__))
lib/python3.10/site-packages/sphinx/application.py:        if self.config.needs_sphinx and self.config.needs_sphinx > sphinx.__display_version__:
lib/python3.10/site-packages/sphinx/application.py:        if version > sphinx.__display_version__[:3]:
lib/python3.10/site-packages/sphinx/cmd/build.py:from sphinx import __display_version__, package_dir
lib/python3.10/site-packages/sphinx/cmd/build.py:                        version='%%(prog)s %s' % __display_version__)
lib/python3.10/site-packages/sphinx/cmd/make_mode.py:        print(bold("Sphinx v%s" % sphinx.__display_version__))
lib/python3.10/site-packages/sphinx/__init__.py:__display_version__ = __version__  # used for command line version
lib/python3.10/site-packages/sphinx/__init__.py:    __display_version__ = __version__
lib/python3.10/site-packages/sphinx/__init__.py:            __display_version__ += '/' + ret.stdout.strip()
lib/python3.10/site-packages/sphinx/ext/githubpages.py:    return {'version': sphinx.__display_version__, 'parallel_read_safe': True}
lib/python3.10/site-packages/sphinx/ext/intersphinx.py:        'version': sphinx.__display_version__,
lib/python3.10/site-packages/sphinx/cmd/quickstart.py:from sphinx import __display_version__, package_dir
lib/python3.10/site-packages/sphinx/cmd/quickstart.py:    print(bold(__('Welcome to the Sphinx %s quickstart utility.')) % __display_version__)
lib/python3.10/site-packages/sphinx/cmd/quickstart.py:                        version='%%(prog)s %s' % __display_version__)
lib/python3.10/site-packages/sphinx/ext/viewcode.py:        'version': sphinx.__display_version__,
lib/python3.10/site-packages/sphinx/util/__init__.py:                  (sphinx.__display_version__,
lib/python3.10/site-packages/sphinx/ext/ifconfig.py:    return {'version': sphinx.__display_version__, 'parallel_read_safe': True}
lib/python3.10/site-packages/sphinx/ext/todo.py:        'version': sphinx.__display_version__,
lib/python3.10/site-packages/sphinx/ext/doctest.py:    return {'version': sphinx.__display_version__, 'parallel_read_safe': True}
lib/python3.10/site-packages/sphinx/ext/autosummary/__init__.py:    return {'version': sphinx.__display_version__, 'parallel_read_safe': True}
lib/python3.10/site-packages/sphinx/ext/napoleon/__init__.py:from sphinx import __display_version__ as __version__
lib/python3.10/site-packages/sphinx/ext/autosummary/generate.py:from sphinx import __display_version__, package_dir
lib/python3.10/site-packages/sphinx/ext/autosummary/generate.py:                        version='%%(prog)s %s' % __display_version__)
lib/python3.10/site-packages/sphinx/ext/inheritance_diagram.py:    return {'version': sphinx.__display_version__, 'parallel_read_safe': True}
lib/python3.10/site-packages/sphinx/ext/imgmath.py:    return {'version': sphinx.__display_version__, 'parallel_read_safe': True}
lib/python3.10/site-packages/sphinx/ext/linkcode.py:    return {'version': sphinx.__display_version__, 'parallel_read_safe': True}
lib/python3.10/site-packages/sphinx/ext/coverage.py:    return {'version': sphinx.__display_version__, 'parallel_read_safe': True}
lib/python3.10/site-packages/sphinx/writers/texinfo.py:from sphinx import __display_version__, addnodes
lib/python3.10/site-packages/sphinx/writers/texinfo.py:@*Generated by Sphinx """ + __display_version__ + """.@*
lib/python3.10/site-packages/sphinx/ext/graphviz.py:    return {'version': sphinx.__display_version__, 'parallel_read_safe': True}
lib/python3.10/site-packages/sphinx/ext/mathjax.py:    return {'version': sphinx.__display_version__, 'parallel_read_safe': True}
lib/python3.10/site-packages/sphinx/ext/extlinks.py:    return {'version': sphinx.__display_version__, 'parallel_read_safe': True}
lib/python3.10/site-packages/sphinx/ext/apidoc.py:from sphinx import __display_version__, package_dir
lib/python3.10/site-packages/sphinx/ext/apidoc.py:                        version='%%(prog)s %s' % __display_version__)
lib/python3.10/site-packages/sphinx/ext/autodoc/type_comment.py:    return {'version': sphinx.__display_version__, 'parallel_read_safe': True}
lib/python3.10/site-packages/sphinx/ext/autodoc/__init__.py:    return {'version': sphinx.__display_version__, 'parallel_read_safe': True}
lib/python3.10/site-packages/pip/_internal/models/target_python.py:        display_version = None
lib/python3.10/site-packages/pip/_internal/models/target_python.py:            display_version = '.'.join(
lib/python3.10/site-packages/pip/_internal/models/target_python.py:            ('version_info', display_version),
lib/python3.10/site-packages/sphinx_rtd_theme/theme.conf:display_version = True
lib/python3.10/site-packages/sphinx_rtd_theme/layout.html:          {%- if theme_display_version %}
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Switch documentation theme to Furo 1243498298  
1133309452 https://github.com/simonw/datasette/issues/1746#issuecomment-1133309452 https://api.github.com/repos/simonw/datasette/issues/1746 IC_kwDOBm6k_c5DjO4M simonw 9599 2022-05-20T20:10:36Z 2022-05-20T20:10:36Z OWNER

Weird, I cannot figure out this theme_display_version thing - I even tried this:

cd /tmp
mkdir s
cd s
python3 -m venv venv
source venv/bin/activate
pip install sphinx_rtd_theme
cd venv
rg theme_display_version

And got just that one reference:

lib/python3.10/site-packages/sphinx_rtd_theme/layout.html
154:          {%- if theme_display_version %}
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Switch documentation theme to Furo 1243498298  
1133299417 https://github.com/simonw/datasette/issues/1746#issuecomment-1133299417 https://api.github.com/repos/simonw/datasette/issues/1746 IC_kwDOBm6k_c5DjMbZ simonw 9599 2022-05-20T20:05:34Z 2022-05-20T20:05:34Z OWNER

I can't get that thing that displays the version working.

https://github.com/readthedocs/sphinx_rtd_theme/blob/9264091087620d421b0804c00937b00980ac3916/sphinx_rtd_theme/layout.html#L154 is where sphinx_rtd_theme implements it - {%- if theme_display_version %} - but I can't find where that variable is first set, searching both that theme and Sphinx itself!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Switch documentation theme to Furo 1243498298  
1133288501 https://github.com/simonw/datasette/issues/1746#issuecomment-1133288501 https://api.github.com/repos/simonw/datasette/issues/1746 IC_kwDOBm6k_c5DjJw1 simonw 9599 2022-05-20T20:00:17Z 2022-05-20T20:00:17Z OWNER

Here's a TIL from when I first customized the layout.html template: https://til.simonwillison.net/readthedocs/custom-sphinx-templates

Note that Furo doesn't use layout.html so I need to completely change how I did that.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Switch documentation theme to Furo 1243498298  
1133267290 https://github.com/simonw/datasette/issues/1153#issuecomment-1133267290 https://api.github.com/repos/simonw/datasette/issues/1153 IC_kwDOBm6k_c5DjEla simonw 9599 2022-05-20T19:44:05Z 2022-05-20T19:50:58Z OWNER

Undocumented Sphinx feature: you can add extra classes to a code example like this:

.. code-block:: json
   :class: metadata-json

    {
        "databases": {
            "russian-ads": {
                "tables": {
                    "display_ads": {
                        "fts_table": "ads_fts",
                        "fts_pk": "id",
                        "searchmode": "raw"
                    }
                }
            }
        }
    }

https://www.sphinx-doc.org/en/master/usage/restructuredtext/directives.html#directive-code-block doesn't mention this.

Filed an issue about the lack of documentation here:
- https://github.com/sphinx-doc/sphinx/issues/10461

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Use YAML examples in documentation by default, not JSON 771202454  
1133254599 https://github.com/simonw/datasette/issues/1746#issuecomment-1133254599 https://api.github.com/repos/simonw/datasette/issues/1746 IC_kwDOBm6k_c5DjBfH simonw 9599 2022-05-20T19:33:08Z 2022-05-20T19:33:08Z OWNER

Actually maybe I don't? I just noticed that on other pages, like https://docs.datasette.io/en/stable/installation.html, the only way to get back to that useful table of contents / index page at https://docs.datasette.io/en/stable/index.html is by clicking the tiny house icon. Can I do better, or should I have the logo do that?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Switch documentation theme to Furo 1243498298  
1133252598 https://github.com/simonw/datasette/issues/1746#issuecomment-1133252598 https://api.github.com/repos/simonw/datasette/issues/1746 IC_kwDOBm6k_c5DjA_2 simonw 9599 2022-05-20T19:31:30Z 2022-05-20T19:31:30Z OWNER

I'd also like to bring back this stable / latest / version indicator:

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Switch documentation theme to Furo 1243498298  
1133250151 https://github.com/simonw/datasette/issues/1746#issuecomment-1133250151 https://api.github.com/repos/simonw/datasette/issues/1746 IC_kwDOBm6k_c5DjAZn simonw 9599 2022-05-20T19:29:37Z 2022-05-20T19:29:37Z OWNER

I want the Datasette logo in the sidebar to link to https://datasette.io/

Looks like I can do that by dropping in my own sidebar/brand.html template based on this:

https://github.com/pradyunsg/furo/blob/0c2acbbd23f8146dd0ae50a2ba57258c1f63ea9f/src/furo/theme/furo/sidebar/brand.html

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Switch documentation theme to Furo 1243498298  
1133246791 https://github.com/simonw/datasette/issues/1746#issuecomment-1133246791 https://api.github.com/repos/simonw/datasette/issues/1746 IC_kwDOBm6k_c5Di_lH simonw 9599 2022-05-20T19:26:49Z 2022-05-20T19:26:49Z OWNER

Putting this in the css/custom.css file seems to work for fixing that logo problem:

body[data-theme="dark"] .sidebar-logo-container {
    background-color: white;
    padding: 5px;
    opacity: 0.6;
}
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Switch documentation theme to Furo 1243498298  
1133242063 https://github.com/simonw/datasette/issues/1746#issuecomment-1133242063 https://api.github.com/repos/simonw/datasette/issues/1746 IC_kwDOBm6k_c5Di-bP simonw 9599 2022-05-20T19:22:49Z 2022-05-20T19:22:49Z OWNER

I have some custom CSS in this file:

https://github.com/simonw/datasette/blob/1465fea4798599eccfe7e8f012bd8d9adfac3039/docs/_static/css/custom.css#L1-L7

I tested and the overflow-wrap: anywhere is still needed for this fix:
- #828

The .wy-side-nav-search bit is no longer needed with the new theme.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Switch documentation theme to Furo 1243498298  
1133232301 https://github.com/simonw/datasette/issues/1748#issuecomment-1133232301 https://api.github.com/repos/simonw/datasette/issues/1748 IC_kwDOBm6k_c5Di8Ct simonw 9599 2022-05-20T19:15:00Z 2022-05-20T19:15:00Z OWNER

Now live on https://docs.datasette.io/en/latest/testing_plugins.html

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add copy buttons next to code examples in the documentation 1243517592  
1133229196 https://github.com/simonw/datasette/issues/1747#issuecomment-1133229196 https://api.github.com/repos/simonw/datasette/issues/1747 IC_kwDOBm6k_c5Di7SM simonw 9599 2022-05-20T19:12:30Z 2022-05-20T19:12:30Z OWNER

https://docs.datasette.io/en/latest/getting_started.html#follow-a-tutorial

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add tutorials to the getting started guide 1243512344  
1133225441 https://github.com/simonw/datasette/issues/1748#issuecomment-1133225441 https://api.github.com/repos/simonw/datasette/issues/1748 IC_kwDOBm6k_c5Di6Xh simonw 9599 2022-05-20T19:09:13Z 2022-05-20T19:09:13Z OWNER

I'm going to add this Sphinx plugin: https://github.com/executablebooks/sphinx-copybutton

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add copy buttons next to code examples in the documentation 1243517592  
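
Enabling that plugin is a small conf.py change - a sketch, assuming the package is pip-installed first (the prompt-stripping option is illustrative, not necessarily Datasette's configuration):

# docs/conf.py
extensions = [
    "sphinx_copybutton",
]

# Strip a leading "$ " prompt so copied shell examples paste cleanly
copybutton_prompt_text = "$ "
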
1133222848 https://github.com/simonw/datasette/issues/1153#issuecomment-1133222848 https://api.github.com/repos/simonw/datasette/issues/1153 IC_kwDOBm6k_c5Di5vA simonw 9599 2022-05-20T19:07:10Z 2022-05-20T19:07:10Z OWNER

I could use https://github.com/pradyunsg/sphinx-inline-tabs for this - recommended by https://pradyunsg.me/furo/recommendations/

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Use YAML examples in documentation by default, not JSON 771202454  
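
Adopting it would again be a conf.py one-liner (a sketch, assuming the package is installed - the tab markup itself then goes in the .rst files):

# docs/conf.py
extensions = [
    "sphinx_inline_tabs",
]
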
1133217219 https://github.com/simonw/datasette/issues/1746#issuecomment-1133217219 https://api.github.com/repos/simonw/datasette/issues/1746 IC_kwDOBm6k_c5Di4XD simonw 9599 2022-05-20T18:58:54Z 2022-05-20T18:58:54Z OWNER

Need to address other customizations I've made in https://github.com/simonw/datasette/blob/0.62a0/docs/_templates/layout.html - such as Plausible analytics and some custom JavaScript.

https://github.com/simonw/datasette/blob/943aa2e1f7341cb51e60332cde46bde650c64217/docs/_templates/layout.html#L1-L61

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Switch documentation theme to Furo 1243498298  
1133215684 https://github.com/simonw/datasette/issues/1746#issuecomment-1133215684 https://api.github.com/repos/simonw/datasette/issues/1746 IC_kwDOBm6k_c5Di3_E simonw 9599 2022-05-20T18:56:29Z 2022-05-20T18:56:29Z OWNER

One other problem: in dark mode the Datasette logo looks bad:

https://user-images.githubusercontent.com/9599/169594178-6a779d9b-1388-4ab9-837f-4b8bfbfc6011.png

This helps a bit:

.sidebar-logo-container {
  background-color: white;
  padding: 5px;
  opacity: 0.6;
}

https://user-images.githubusercontent.com/9599/169594107-9c153bee-e2e5-4ba1-90de-ecc0a853e608.png

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Switch documentation theme to Furo 1243498298  
1133210942 https://github.com/simonw/datasette/issues/1746#issuecomment-1133210942 https://api.github.com/repos/simonw/datasette/issues/1746 IC_kwDOBm6k_c5Di20- simonw 9599 2022-05-20T18:49:40Z 2022-05-20T18:49:40Z OWNER

And for those local tables of contents, do this:

.. contents::
   :local:
   :class: this-will-duplicate-information-and-it-is-still-useful-here
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Switch documentation theme to Furo 1243498298  
1133210651 https://github.com/simonw/datasette/issues/1746#issuecomment-1133210651 https://api.github.com/repos/simonw/datasette/issues/1746 IC_kwDOBm6k_c5Di2wb simonw 9599 2022-05-20T18:49:11Z 2022-05-20T18:49:11Z OWNER

I found a workaround for the no-longer-nested left hand navigation: drop this into _templates/sidebar/navigation.html:

<div class="sidebar-tree">
  {{ toctree(
    collapse=True,
    titles_only=False,
    maxdepth=3,
    includehidden=True,
) }}
</div>
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Switch documentation theme to Furo 1243498298  
1133210032 https://github.com/simonw/datasette/issues/1746#issuecomment-1133210032 https://api.github.com/repos/simonw/datasette/issues/1746 IC_kwDOBm6k_c5Di2mw simonw 9599 2022-05-20T18:48:17Z 2022-05-20T18:48:17Z OWNER

A couple of changes I want to make. First, I don't really like the way Furo keeps the in-page titles in a separate menu on the right rather than expanding them on the left.

I like this:

Furo wants to do this instead:

https://user-images.githubusercontent.com/9599/169592991-979865af-511b-43de-bb6c-13bc24e2c88e.png

I also still want to include those inline tables of contents on the two pages that have them:

  • https://docs.datasette.io/en/stable/installation.html
  • https://docs.datasette.io/en/stable/plugin_hooks.html
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Switch documentation theme to Furo 1243498298  
1129251699 https://github.com/simonw/datasette/issues/1744#issuecomment-1129251699 https://api.github.com/repos/simonw/datasette/issues/1744 IC_kwDOBm6k_c5DTwNz simonw 9599 2022-05-17T19:44:47Z 2022-05-17T19:46:38Z OWNER

Updated docs: https://docs.datasette.io/en/latest/getting_started.html#using-datasette-on-your-own-computer and https://docs.datasette.io/en/latest/cli-reference.html#datasette-serve-help

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
`--nolock` feature for opening locked databases 1239008850  
1129252603 https://github.com/simonw/datasette/issues/1745#issuecomment-1129252603 https://api.github.com/repos/simonw/datasette/issues/1745 IC_kwDOBm6k_c5DTwb7 simonw 9599 2022-05-17T19:45:51Z 2022-05-17T19:45:51Z OWNER

Now documented here: https://docs.datasette.io/en/latest/contributing.html#running-cog

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Documentation on running cog 1239080102  
1129243427 https://github.com/simonw/datasette/issues/1744#issuecomment-1129243427 https://api.github.com/repos/simonw/datasette/issues/1744 IC_kwDOBm6k_c5DTuMj simonw 9599 2022-05-17T19:35:02Z 2022-05-17T19:35:02Z OWNER

One thing to note is that the datasette-copy-to-memory plugin broke with a locked file, because it does this: https://github.com/simonw/datasette-copy-to-memory/blob/d541c18a78ae6f707a8f9b1e7fc4c020a9f68f2e/datasette_copy_to_memory/__init__.py#L27

tmp.execute("ATTACH DATABASE ? AS _copy_from", [db.path])

That would need to use a URI filename too for it to work with locked files.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
`--nolock` feature for opening locked databases 1239008850  
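
A hedged sketch of the URI-filename version of that ATTACH - URI handling has to be enabled when the connection is first opened, and "history.db" here is a placeholder path:

import sqlite3

# Open the connection with URI handling enabled so ATTACH accepts file: URIs
tmp = sqlite3.connect("file::memory:", uri=True)
# mode=ro and nolock=1 then apply to the attached file as well
tmp.execute(
    "ATTACH DATABASE ? AS _copy_from",
    ["file:history.db?mode=ro&nolock=1"],
)
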
1129241873 https://github.com/simonw/datasette/issues/1744#issuecomment-1129241873 https://api.github.com/repos/simonw/datasette/issues/1744 IC_kwDOBm6k_c5DTt0R simonw 9599 2022-05-17T19:33:16Z 2022-05-17T19:33:16Z OWNER

I'm going to skip adding a test for this - the test logic would have to be pretty convoluted to exercise it properly, and it's a pretty minor and low-risk feature in the scheme of things.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
`--nolock` feature for opening locked databases 1239008850  
1129241283 https://github.com/simonw/datasette/issues/1744#issuecomment-1129241283 https://api.github.com/repos/simonw/datasette/issues/1744 IC_kwDOBm6k_c5DTtrD simonw 9599 2022-05-17T19:32:35Z 2022-05-17T19:32:35Z OWNER

I tried writing a test like this:

import fcntl
import sqlite3

import pytest
from click.testing import CliRunner

from datasette.cli import cli


@pytest.mark.parametrize("locked", (True, False))
def test_locked_sqlite_db(tmp_path_factory, locked):
    dir = tmp_path_factory.mktemp("test_locked_sqlite_db")
    test_db = str(dir / "test.db")
    sqlite3.connect(test_db).execute("create table t (id integer primary key)")
    if locked:
        # "r+b" rather than "w", which would truncate the database file
        fp = open(test_db, "r+b")
        fcntl.lockf(fp.fileno(), fcntl.LOCK_EX)
    runner = CliRunner()
    result = runner.invoke(
        cli,
        [
            "serve",
            test_db,
            "--get",
            "/test",
        ],
        catch_exceptions=False,
    )

But it didn't work, because the test runs in the same process - so taking an exclusive lock on that file didn't cause an error when the test later tried to access it via Datasette!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
`--nolock` feature for opening locked databases 1239008850  
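
One hedged sketch of a way around that same-process problem: hold the lock from a child process, so it genuinely conflicts with the test process (untested, and not part of Datasette's test suite):

import subprocess
import sys
import textwrap


def lock_file_in_subprocess(path):
    # The child takes an exclusive lock on the file and then sleeps,
    # holding the lock until the caller kills the process
    code = textwrap.dedent(
        """
        import fcntl, sys, time
        fp = open(sys.argv[1], "r+b")
        fcntl.lockf(fp.fileno(), fcntl.LOCK_EX)
        print("locked", flush=True)
        time.sleep(60)
        """
    )
    proc = subprocess.Popen(
        [sys.executable, "-c", code, path], stdout=subprocess.PIPE
    )
    proc.stdout.readline()  # block until the child reports the lock is held
    return proc  # caller is responsible for proc.kill() afterwards
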
1129187486 https://github.com/simonw/datasette/issues/1744#issuecomment-1129187486 https://api.github.com/repos/simonw/datasette/issues/1744 IC_kwDOBm6k_c5DTgie simonw 9599 2022-05-17T18:28:49Z 2022-05-17T18:28:49Z OWNER

I think I can do that with fcntl.flock(): https://docs.python.org/3/library/fcntl.html#fcntl.flock

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
`--nolock` feature for opening locked databases 1239008850  
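
For reference, the fcntl.flock() call mentioned above looks like this ("test.db" is a placeholder path):

import fcntl

fp = open("test.db", "r+b")
fcntl.flock(fp.fileno(), fcntl.LOCK_EX)  # take an exclusive lock
# ... the file stays locked here ...
fcntl.flock(fp.fileno(), fcntl.LOCK_UN)  # release it
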
1129185356 https://github.com/simonw/datasette/issues/1744#issuecomment-1129185356 https://api.github.com/repos/simonw/datasette/issues/1744 IC_kwDOBm6k_c5DTgBM simonw 9599 2022-05-17T18:26:26Z 2022-05-17T18:26:26Z OWNER

Not sure how to test this - I'd need to take out my own lock on a database file somehow.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
`--nolock` feature for opening locked databases 1239008850  
1129184908 https://github.com/simonw/datasette/issues/1744#issuecomment-1129184908 https://api.github.com/repos/simonw/datasette/issues/1744 IC_kwDOBm6k_c5DTf6M simonw 9599 2022-05-17T18:25:57Z 2022-05-17T18:25:57Z OWNER

I knocked out a quick prototype of this and it worked!

datasette ~/Library/Application\ Support/Google/Chrome/Default/History --nolock

Here's the prototype diff:

diff --git a/datasette/app.py b/datasette/app.py
index b7b8437..f43700d 100644
--- a/datasette/app.py
+++ b/datasette/app.py
@@ -213,6 +213,7 @@ class Datasette:
         config_dir=None,
         pdb=False,
         crossdb=False,
+        nolock=False,
     ):
         assert config_dir is None or isinstance(
             config_dir, Path
@@ -238,6 +239,7 @@ class Datasette:
         self.databases = collections.OrderedDict()
         self._refresh_schemas_lock = asyncio.Lock()
         self.crossdb = crossdb
+        self.nolock = nolock
         if memory or crossdb or not self.files:
             self.add_database(Database(self, is_memory=True), name="_memory")
         # memory_name is a random string so that each Datasette instance gets its own
diff --git a/datasette/cli.py b/datasette/cli.py
index 3c6e1b2..7e44665 100644
--- a/datasette/cli.py
+++ b/datasette/cli.py
@@ -452,6 +452,11 @@ def uninstall(packages, yes):
     is_flag=True,
     help="Enable cross-database joins using the /_memory database",
 )
+@click.option(
+    "--nolock",
+    is_flag=True,
+    help="Ignore locking and open locked files in read-only mode",
+)
 @click.option(
     "--ssl-keyfile",
     help="SSL key file",
@@ -486,6 +491,7 @@ def serve(
     open_browser,
     create,
     crossdb,
+    nolock,
     ssl_keyfile,
     ssl_certfile,
     return_instance=False,
@@ -545,6 +551,7 @@ def serve(
         version_note=version_note,
         pdb=pdb,
         crossdb=crossdb,
+        nolock=nolock,
     )

     # if files is a single directory, use that as config_dir=
diff --git a/datasette/database.py b/datasette/database.py
index 44d3266..fa55804 100644
--- a/datasette/database.py
+++ b/datasette/database.py
@@ -89,6 +89,8 @@ class Database:
         # mode=ro or immutable=1?
         if self.is_mutable:
             qs = "?mode=ro"
+            if self.ds.nolock:
+                qs += "&nolock=1"
         else:
             qs = "?immutable=1"
         assert not (write and not self.is_mutable)
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
`--nolock` feature for opening locked databases 1239008850  
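
The nolock=1 option added by that diff is a standard SQLite URI query-string parameter, so the effect can be sketched directly with the sqlite3 module ("History" is a placeholder path):

import sqlite3

# mode=ro plus nolock=1 opens a possibly-locked file read-only,
# bypassing SQLite's locking protocol entirely
conn = sqlite3.connect("file:History?mode=ro&nolock=1", uri=True)
print(conn.execute("select count(*) from sqlite_master").fetchone())
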
1128052948 https://github.com/simonw/datasette/issues/1742#issuecomment-1128052948 https://api.github.com/repos/simonw/datasette/issues/1742 IC_kwDOBm6k_c5DPLjU simonw 9599 2022-05-16T19:28:31Z 2022-05-16T19:28:31Z OWNER

The trace mechanism is a bit gnarly - it's actually done by some ASGI middleware I wrote, so I'm pretty sure the bug is in there somewhere: https://github.com/simonw/datasette/blob/280ff372ab30df244f6c54f6f3002da57334b3d7/datasette/tracer.py#L73

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
?_trace=1 fails with datasette-geojson for some reason 1237586379  
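
A minimal sketch of the general pattern such middleware follows - buffer the response and rewrite it only if it parses as JSON - which also hints at the kind of step that can go wrong with an unusual renderer (an illustration, not Datasette's actual tracer.py):

import json


class TraceMiddleware:
    # Buffer the whole response, then inject trace data if (and only if)
    # the body turns out to be a JSON object
    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        messages = []

        async def capture(message):
            messages.append(message)

        await self.app(scope, receive, capture)
        start = next(m for m in messages if m["type"] == "http.response.start")
        body = b"".join(
            m.get("body", b"")
            for m in messages
            if m["type"] == "http.response.body"
        )
        try:
            data = json.loads(body)
        except ValueError:
            data = None
        if isinstance(data, dict):
            data["_trace"] = {"traces": []}  # placeholder trace payload
            body = json.dumps(data).encode("utf-8")
            # content-length must be rewritten to match the new body
            start["headers"] = [
                (key, value)
                for key, value in start["headers"]
                if key.lower() != b"content-length"
            ] + [(b"content-length", str(len(body)).encode("utf-8"))]
        await send(start)
        await send({"type": "http.response.body", "body": body})
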
1128033018 https://github.com/simonw/datasette/issues/1742#issuecomment-1128033018 https://api.github.com/repos/simonw/datasette/issues/1742 IC_kwDOBm6k_c5DPGr6 simonw 9599 2022-05-16T19:06:38Z 2022-05-16T19:06:38Z OWNER

The same URL with .json instead works fine: https://calands.datasettes.com/calands/CPAD_2020a_SuperUnits.json?_sort=id&id__exact=4&_labels=on&_trace=1

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
?_trace=1 fails with datasette-geojson for some reason 1237586379  
1117662420 https://github.com/simonw/datasette/issues/1739#issuecomment-1117662420 https://api.github.com/repos/simonw/datasette/issues/1739 IC_kwDOBm6k_c5CnizU simonw 9599 2022-05-04T18:21:18Z 2022-05-04T18:21:18Z OWNER

That prototype is now public: https://github.com/simonw/datasette-lite

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
.db downloads should be served with an ETag 1223699280  
1116215371 https://github.com/simonw/datasette/issues/1739#issuecomment-1116215371 https://api.github.com/repos/simonw/datasette/issues/1739 IC_kwDOBm6k_c5CiBhL simonw 9599 2022-05-03T15:12:16Z 2022-05-03T15:12:16Z OWNER

That worked - both DBs are 304 for me now on a subsequent load of the page:

https://user-images.githubusercontent.com/9599/166481669-9570d225-78d4-461b-88c8-044f02b68a64.png

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
.db downloads should be served with an ETag 1223699280  
1116183369 https://github.com/simonw/datasette/issues/1739#issuecomment-1116183369 https://api.github.com/repos/simonw/datasette/issues/1739 IC_kwDOBm6k_c5Ch5tJ simonw 9599 2022-05-03T14:43:14Z 2022-05-03T14:43:14Z OWNER

Relevant tests start here: https://github.com/simonw/datasette/blob/d60f163528f466b1127b2935c3b6869c34fd6545/tests/test_html.py#L395

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
.db downloads should be served with an ETag 1223699280  
1116180599 https://github.com/simonw/datasette/issues/1739#issuecomment-1116180599 https://api.github.com/repos/simonw/datasette/issues/1739 IC_kwDOBm6k_c5Ch5B3 simonw 9599 2022-05-03T14:40:32Z 2022-05-03T14:40:32Z OWNER

Database downloads are served here: https://github.com/simonw/datasette/blob/d60f163528f466b1127b2935c3b6869c34fd6545/datasette/views/database.py#L186-L192

Here's AsgiFileDownload: https://github.com/simonw/datasette/blob/d60f163528f466b1127b2935c3b6869c34fd6545/datasette/utils/asgi.py#L410-L430

I can add an etag= parameter to that and populate it with db.hash, if it is populated (which it always should be for immutable databases that can be downloaded).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
.db downloads should be served with an ETag 1223699280  
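
A hedged sketch of that idea - not Datasette's actual AsgiFileDownload, and real ETag values are normally quoted - send the header with the download and short-circuit to 304 when the client already holds a matching copy:

async def send_file_with_etag(scope, send, filepath, etag):
    # If the client's If-None-Match matches, skip the body entirely
    request_headers = dict(scope.get("headers") or [])
    if request_headers.get(b"if-none-match") == etag.encode("utf-8"):
        await send({"type": "http.response.start", "status": 304, "headers": []})
        await send({"type": "http.response.body", "body": b""})
        return
    await send(
        {
            "type": "http.response.start",
            "status": 200,
            "headers": [
                (b"content-type", b"application/octet-stream"),
                (b"etag", etag.encode("utf-8")),
            ],
        }
    )
    with open(filepath, "rb") as fp:
        await send({"type": "http.response.body", "body": fp.read()})
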
1116178727 https://github.com/simonw/datasette/issues/1739#issuecomment-1116178727 https://api.github.com/repos/simonw/datasette/issues/1739 IC_kwDOBm6k_c5Ch4kn simonw 9599 2022-05-03T14:38:46Z 2022-05-03T14:38:46Z OWNER

Reminded myself how this works by reviewing conditional-get: https://github.com/simonw/conditional-get/blob/db6dfec0a296080aaf68fcd80e55fb3f0714e738/conditional_get/cli.py#L33-L52

Simply add an If-None-Match: last-known-etag header to the request and check that the response is a status 304 with an empty body.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
.db downloads should be served with an ETag 1223699280  
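
Sketched with httpx (the URL here is a placeholder), that check looks something like this:

import httpx

url = "https://example.com/fixtures.db"  # placeholder
first = httpx.get(url)
etag = first.headers["etag"]
# Repeat the request with If-None-Match set to the last known ETag
second = httpx.get(url, headers={"If-None-Match": etag})
assert second.status_code == 304
assert second.content == b""
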
1115760104 https://github.com/simonw/datasette/issues/1739#issuecomment-1115760104 https://api.github.com/repos/simonw/datasette/issues/1739 IC_kwDOBm6k_c5CgSXo simonw 9599 2022-05-03T05:50:19Z 2022-05-03T05:50:19Z OWNER

Here's how Starlette does it: https://github.com/encode/starlette/blob/830f3486537916bae6b46948ff922adc14a22b7c/starlette/staticfiles.py#L213

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
.db downloads should be served with an ETag 1223699280  
1115533820 https://github.com/simonw/datasette/issues/1732#issuecomment-1115533820 https://api.github.com/repos/simonw/datasette/issues/1732 IC_kwDOBm6k_c5CfbH8 simonw 9599 2022-05-03T01:42:25Z 2022-05-03T01:42:25Z OWNER

Thanks, this definitely sounds like a bug. Do you have simple steps to reproduce this?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Custom page variables aren't decoded 1221849746  
1115470180 https://github.com/simonw/datasette/issues/1737#issuecomment-1115470180 https://api.github.com/repos/simonw/datasette/issues/1737 IC_kwDOBm6k_c5CfLlk simonw 9599 2022-05-02T23:39:29Z 2022-05-02T23:39:29Z OWNER

Test ran in 38 seconds and passed! https://github.com/simonw/datasette/runs/6265954274?check_suite_focus=true

I'm going to have it run on every commit and PR.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Automated test for Pyodide compatibility 1223459734  
1115468193 https://github.com/simonw/datasette/issues/1737#issuecomment-1115468193 https://api.github.com/repos/simonw/datasette/issues/1737 IC_kwDOBm6k_c5CfLGh simonw 9599 2022-05-02T23:35:26Z 2022-05-02T23:35:26Z OWNER

https://github.com/simonw/datasette/runs/6265915080?check_suite_focus=true failed, but looked like it passed, because I forgot to use set -e at the start of the bash script.

It actually failed because the build package wasn't available.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Automated test for Pyodide compatibility 1223459734  
1115464097 https://github.com/simonw/datasette/issues/1737#issuecomment-1115464097 https://api.github.com/repos/simonw/datasette/issues/1737 IC_kwDOBm6k_c5CfKGh simonw 9599 2022-05-02T23:27:40Z 2022-05-02T23:27:40Z OWNER

I'm going to start off by running this manually - I may run it on every commit once this is all a little bit more stable.

I can base the workflow on https://github.com/simonw/scrape-hacker-news-by-domain/blob/main/.github/workflows/scrape.yml

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Automated test for Pyodide compatibility 1223459734  
1115462720 https://github.com/simonw/datasette/issues/1737#issuecomment-1115462720 https://api.github.com/repos/simonw/datasette/issues/1737 IC_kwDOBm6k_c5CfJxA simonw 9599 2022-05-02T23:25:03Z 2022-05-02T23:25:03Z OWNER

Here's a script that seems to work. It builds the wheel, starts a Python web server that serves the wheel, runs a test with shot-scraper and then shuts down the server again.

#!/bin/bash

# Build the wheel
python3 -m build

# Find the name of the wheel, stripping off the dist/ prefix
wheel=$(basename $(ls dist/*.whl))


# Create a blank index page
echo '
<script src="https://cdn.jsdelivr.net/pyodide/v0.20.0/full/pyodide.js"></script>
' > dist/index.html

# Run a server for that dist/ folder
cd dist
python3 -m http.server 8529 &
cd ..

shot-scraper javascript http://localhost:8529/ "
async () => {
  let pyodide = await loadPyodide();
  await pyodide.loadPackage(['micropip', 'ssl', 'setuptools']);
  let output = await pyodide.runPythonAsync(\`
    import micropip
    await micropip.install('h11==0.12.0')
    await micropip.install('http://localhost:8529/$wheel')
    import ssl
    import setuptools
    from datasette.app import Datasette
    ds = Datasette(memory=True, settings={'num_sql_threads': 0})
    (await ds.client.get('/_memory.json?sql=select+55+as+itworks&_shape=array')).text
  \`);
  if (JSON.parse(output)[0].itworks != 55) {
    throw 'Got ' + output + ', expected itworks: 55';
  }
  return 'Test passed!';
}
"

# Shut down the server
pkill -f 'http.server 8529'
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Automated test for Pyodide compatibility 1223459734  
1115404729 https://github.com/simonw/datasette/issues/1733#issuecomment-1115404729 https://api.github.com/repos/simonw/datasette/issues/1733 IC_kwDOBm6k_c5Ce7m5 simonw 9599 2022-05-02T21:49:01Z 2022-05-02T21:49:38Z OWNER

That alpha release works!

https://pyodide.org/en/stable/console.html

Welcome to the Pyodide terminal emulator 🐍
Python 3.10.2 (main, Apr  9 2022 20:52:01) on WebAssembly VM
Type "help", "copyright", "credits" or "license" for more information.
>>> import micropip
>>> await micropip.install("datasette==0.62a0")
>>> import ssl
>>> import setuptools
>>> from datasette.app import Datasette
>>> ds = Datasette(memory=True, settings={"num_sql_threads": 0})
>>> await ds.client.get("/.json")
<Response [200 OK]>
>>> (await ds.client.get("/.json")).json()
{'_memory': {'name': '_memory', 'hash': None, 'color': 'a6c7b9', 'path': '/_memory', 'tables_and_views_truncated': [], 'tables_and_views_more': False, 'tables_count': 0, 'table_rows_sum': 0, 'show_table_row_counts': False, 'hidden_table_rows_sum': 0, 'hidden_tables_count': 0, 'views_count': 0, 'private': False}}
>>> 
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Get Datasette compatible with Pyodide 1223234932  
1115318417 https://github.com/simonw/datasette/issues/1733#issuecomment-1115318417 https://api.github.com/repos/simonw/datasette/issues/1733 IC_kwDOBm6k_c5CemiR simonw 9599 2022-05-02T20:13:43Z 2022-05-02T20:13:43Z OWNER

This is good enough to push an alpha.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Get Datasette compatible with Pyodide 1223234932  
1115318303 https://github.com/simonw/datasette/issues/1733#issuecomment-1115318303 https://api.github.com/repos/simonw/datasette/issues/1733 IC_kwDOBm6k_c5Cemgf simonw 9599 2022-05-02T20:13:36Z 2022-05-02T20:13:36Z OWNER

I got a build from the pyodide branch to work!

Welcome to the Pyodide terminal emulator 🐍
Python 3.10.2 (main, Apr  9 2022 20:52:01) on WebAssembly VM
Type "help", "copyright", "credits" or "license" for more information.
>>> import micropip
>>> await micropip.install("https://s3.amazonaws.com/simonwillison-cors-allowed-public/datasette-0.62a0-py3-none-any.whl")
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/lib/python3.10/asyncio/futures.py", line 284, in __await__
    yield self  # This tells Task to wait for completion.
  File "/lib/python3.10/asyncio/tasks.py", line 304, in __wakeup
    future.result()
  File "/lib/python3.10/asyncio/futures.py", line 201, in result
    raise self._exception
  File "/lib/python3.10/asyncio/tasks.py", line 234, in __step
    result = coro.throw(exc)
  File "/lib/python3.10/site-packages/micropip/_micropip.py", line 183, in install
    transaction = await self.gather_requirements(requirements, ctx, keep_going)
  File "/lib/python3.10/site-packages/micropip/_micropip.py", line 173, in gather_requirements
    await gather(*requirement_promises)
  File "/lib/python3.10/asyncio/futures.py", line 284, in __await__
    yield self  # This tells Task to wait for completion.
  File "/lib/python3.10/asyncio/tasks.py", line 304, in __wakeup
    future.result()
  File "/lib/python3.10/asyncio/futures.py", line 201, in result
    raise self._exception
  File "/lib/python3.10/asyncio/tasks.py", line 232, in __step
    result = coro.send(None)
  File "/lib/python3.10/site-packages/micropip/_micropip.py", line 245, in add_requirement
    await self.add_wheel(name, wheel, version, (), ctx, transaction)
  File "/lib/python3.10/site-packages/micropip/_micropip.py", line 316, in add_wheel
    await self.add_requirement(recurs_req, ctx, transaction)
  File "/lib/python3.10/site-packages/micropip/_micropip.py", line 291, in add_requirement
    await self.add_wheel(
  File "/lib/python3.10/site-packages/micropip/_micropip.py", line 316, in add_wheel
    await self.add_requirement(recurs_req, ctx, transaction)
  File "/lib/python3.10/site-packages/micropip/_micropip.py", line 291, in add_requirement
    await self.add_wheel(
  File "/lib/python3.10/site-packages/micropip/_micropip.py", line 316, in add_wheel
    await self.add_requirement(recurs_req, ctx, transaction)
  File "/lib/python3.10/site-packages/micropip/_micropip.py", line 276, in add_requirement
    raise ValueError(
ValueError: Requested 'h11<0.13,>=0.11', but h11==0.13.0 is already installed
>>> await micropip.install("https://s3.amazonaws.com/simonwillison-cors-allowed-public/datasette-0.62a0-py3-none-any.whl")
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/lib/python3.10/asyncio/futures.py", line 284, in __await__
    yield self  # This tells Task to wait for completion.
  File "/lib/python3.10/asyncio/tasks.py", line 304, in __wakeup
    future.result()
  File "/lib/python3.10/asyncio/futures.py", line 201, in result
    raise self._exception
  File "/lib/python3.10/asyncio/tasks.py", line 234, in __step
    result = coro.throw(exc)
  File "/lib/python3.10/site-packages/micropip/_micropip.py", line 183, in install
    transaction = await self.gather_requirements(requirements, ctx, keep_going)
  File "/lib/python3.10/site-packages/micropip/_micropip.py", line 173, in gather_requirements
    await gather(*requirement_promises)
  File "/lib/python3.10/asyncio/futures.py", line 284, in __await__
    yield self  # This tells Task to wait for completion.
  File "/lib/python3.10/asyncio/tasks.py", line 304, in __wakeup
    future.result()
  File "/lib/python3.10/asyncio/futures.py", line 201, in result
    raise self._exception
  File "/lib/python3.10/asyncio/tasks.py", line 232, in __step
    result = coro.send(None)
  File "/lib/python3.10/site-packages/micropip/_micropip.py", line 245, in add_requirement
    await self.add_wheel(name, wheel, version, (), ctx, transaction)
  File "/lib/python3.10/site-packages/micropip/_micropip.py", line 316, in add_wheel
    await self.add_requirement(recurs_req, ctx, transaction)
  File "/lib/python3.10/site-packages/micropip/_micropip.py", line 291, in add_requirement
    await self.add_wheel(
  File "/lib/python3.10/site-packages/micropip/_micropip.py", line 316, in add_wheel
    await self.add_requirement(recurs_req, ctx, transaction)
  File "/lib/python3.10/site-packages/micropip/_micropip.py", line 291, in add_requirement
    await self.add_wheel(
  File "/lib/python3.10/site-packages/micropip/_micropip.py", line 316, in add_wheel
    await self.add_requirement(recurs_req, ctx, transaction)
  File "/lib/python3.10/site-packages/micropip/_micropip.py", line 276, in add_requirement
    raise ValueError(
ValueError: Requested 'h11<0.13,>=0.11', but h11==0.13.0 is already installed
>>> await micropip.install("h11==0.12")
>>> await micropip.install("https://s3.amazonaws.com/simonwillison-cors-allowed-public/datasette-0.62a0-py3-none-any.whl")
>>> import datasette
>>> from datasette.app import Datasette
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/lib/python3.10/site-packages/datasette/app.py", line 9, in <module>
    import httpx
  File "/lib/python3.10/site-packages/httpx/__init__.py", line 2, in <module>
    from ._api import delete, get, head, options, patch, post, put, request, stream
  File "/lib/python3.10/site-packages/httpx/_api.py", line 4, in <module>
    from ._client import Client
  File "/lib/python3.10/site-packages/httpx/_client.py", line 9, in <module>
    from ._auth import Auth, BasicAuth, FunctionAuth
  File "/lib/python3.10/site-packages/httpx/_auth.py", line 10, in <module>
    from ._models import Request, Response
  File "/lib/python3.10/site-packages/httpx/_models.py", line 16, in <module>
    from ._content import ByteStream, UnattachedStream, encode_request, encode_response
  File "/lib/python3.10/site-packages/httpx/_content.py", line 17, in <module>
    from ._multipart import MultipartStream
  File "/lib/python3.10/site-packages/httpx/_multipart.py", line 7, in <module>
    from ._types import (
  File "/lib/python3.10/site-packages/httpx/_types.py", line 5, in <module>
    import ssl
  File "/lib/python3.10/ssl.py", line 98, in <module>
    import _ssl             # if we can't import it, let the error propagate
ModuleNotFoundError: No module named '_ssl'
>>> import ssl
>>> from datasette.app import Datasette
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/lib/python3.10/site-packages/datasette/app.py", line 14, in <module>
    import pkg_resources
ModuleNotFoundError: No module named 'pkg_resources'
>>> import setuptools
>>> from datasette.app import Datasette
>>> ds = Datasette(memory=True)
>>> ds
<datasette.app.Datasette object at 0x1cc4fb8>
>>> await ds.client.get("/")
Traceback (most recent call last):
  File "/lib/python3.10/site-packages/datasette/app.py", line 1268, in route_path
    response = await view(request, send)
  File "/lib/python3.10/site-packages/datasette/views/base.py", line 134, in view
    return await self.dispatch_request(request)
  File "/lib/python3.10/site-packages/datasette/views/base.py", line 89, in dispatch_request
    await self.ds.refresh_schemas()
  File "/lib/python3.10/site-packages/datasette/app.py", line 353, in refresh_schemas
    await self._refresh_schemas()
  File "/lib/python3.10/site-packages/datasette/app.py", line 358, in _refresh_schemas
    await init_internal_db(internal_db)
  File "/lib/python3.10/site-packages/datasette/utils/internal_db.py", line 65, in init_internal_db
    await db.execute_write_script(create_tables_sql)
  File "/lib/python3.10/site-packages/datasette/database.py", line 116, in execute_write_script
    results = await self.execute_write_fn(_inner, block=block)
  File "/lib/python3.10/site-packages/datasette/database.py", line 155, in execute_write_fn
    self._write_thread.start()
  File "/lib/python3.10/threading.py", line 928, in start
    _start_new_thread(self._bootstrap, ())
RuntimeError: can't start new thread
<Response [500 Internal Server Error]>
>>> ds = Datasette(memory=True, settings={"num_sql_threads": 0})
>>> await ds.client.get("/")
<Response [200 OK]>
>>> (await ds.client.get("/")).text
'<!DOCTYPE html>\n<html>\n<head>\n    <title>Datasette: _memory</title>\n    <link rel="stylesheet" href="/-/static/app.css?cead5a">\n    <meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">\n\n<link rel="alternate" type="application/json+datasette" href="http://localhost/.json"></head>\n<body class="index">\n<div class="not-footer">\n<header><nav>\n    \n    \n</nav></header>\n\n\n\n    \n\n\n\n<section class="content">\n\n<h1>Datasette</h1>\n\n\n\n\n\n    <h2 
<long output truncated>
r detailsClickedWithin = null;\n    while (target && target.tagName != \'DETAILS\') {\n        target = target.parentNode;\n    }\n    if (target && target.tagName == \'DETAILS\') {\n        detailsClickedWithin = target;\n    }\n    Array.from(document.getElementsByTagName(\'details\')).filter(\n        (details) => details.open && details != detailsClickedWithin\n    ).forEach(details => details.open = false);\n});\n</script>\n\n\n\n<!-- Templates considered: *index.html -->\n</body>\n</html>'
>>> 

That ValueError: Requested 'h11<0.13,>=0.11', but h11==0.13.0 is already installed error is annoying. I assume it's a uvicorn dependency clash of some sort, because I wasn't getting that when I removed uvicorn as a dependency.

I can avoid it by running this first though:

await micropip.install("h11==0.12")
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Get Datasette compatible with Pyodide 1223234932  
1115301733 https://github.com/simonw/datasette/issues/1735#issuecomment-1115301733 https://api.github.com/repos/simonw/datasette/issues/1735 IC_kwDOBm6k_c5Ceidl simonw 9599 2022-05-02T19:57:19Z 2022-05-02T19:59:03Z OWNER

This code breaks if that setting is 0:

https://github.com/simonw/datasette/blob/a29c1277896b6a7905ef5441c42a37bc15f67599/datasette/app.py#L291-L293

It's used here:

https://github.com/simonw/datasette/blob/a29c1277896b6a7905ef5441c42a37bc15f67599/datasette/database.py#L188-L190

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Datasette setting to disable threading (for Pyodide) 1223263540  
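
A hedged sketch of the kind of guard the fix needs - not the actual patch - only create the thread pool when the setting is non-zero:

from concurrent import futures

num_sql_threads = 0  # e.g. from --setting num_sql_threads 0

if num_sql_threads:
    executor = futures.ThreadPoolExecutor(max_workers=num_sql_threads)
else:
    # No threads available (e.g. Pyodide): run SQL in-process instead
    executor = None
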
1115288284 https://github.com/simonw/datasette/issues/1733#issuecomment-1115288284 https://api.github.com/repos/simonw/datasette/issues/1733 IC_kwDOBm6k_c5CefLc simonw 9599 2022-05-02T19:40:33Z 2022-05-02T19:40:33Z OWNER

I'll release this as a 0.62a0 as soon as it's ready, so I can start testing it out in Pyodide for real.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Get Datasette compatible with Pyodide 1223234932  
1115283922 https://github.com/simonw/datasette/issues/1734#issuecomment-1115283922 https://api.github.com/repos/simonw/datasette/issues/1734 IC_kwDOBm6k_c5CeeHS simonw 9599 2022-05-02T19:35:32Z 2022-05-02T19:35:32Z OWNER

I'll use my original from 2009: https://www.djangosnippets.org/snippets/1431/

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Remove python-baseconv dependency 1223241647  
1115282773 https://github.com/simonw/datasette/issues/1734#issuecomment-1115282773 https://api.github.com/repos/simonw/datasette/issues/1734 IC_kwDOBm6k_c5Ced1V simonw 9599 2022-05-02T19:34:15Z 2022-05-02T19:34:15Z OWNER

I'm going to vendor it and update the documentation.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Remove python-baseconv dependency 1223241647  
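
The vendored code boils down to simple base conversion; a minimal sketch of the idea (adapted to base 62, not a verbatim copy of that snippet):

BASE62 = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"


def base62_encode(number):
    if number == 0:
        return BASE62[0]
    digits = []
    while number:
        number, remainder = divmod(number, 62)
        digits.append(BASE62[remainder])
    return "".join(reversed(digits))


def base62_decode(encoded):
    number = 0
    for char in encoded:
        number = number * 62 + BASE62.index(char)
    return number


assert base62_decode(base62_encode(1234567890)) == 1234567890
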
1115278325 https://github.com/simonw/datasette/issues/1733#issuecomment-1115278325 https://api.github.com/repos/simonw/datasette/issues/1733 IC_kwDOBm6k_c5Cecv1 simonw 9599 2022-05-02T19:29:05Z 2022-05-02T19:29:05Z OWNER

I'm going to add a Datasette setting to disable threading entirely, designed for usage in this particular case.

I thought about adding a new setting, then I noticed this:

datasette mydatabase.db --setting num_sql_threads 10

I'm going to let users set that to 0 to disable threaded execution of SQL queries.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Get Datasette compatible with Pyodide 1223234932  
1115268245 https://github.com/simonw/datasette/issues/1733#issuecomment-1115268245 https://api.github.com/repos/simonw/datasette/issues/1733 IC_kwDOBm6k_c5CeaSV simonw 9599 2022-05-02T19:18:11Z 2022-05-02T19:18:11Z OWNER

Maybe I can leave uvicorn as a dependency? Installing it works OK, it only generates errors when you try to import it:

Welcome to the Pyodide terminal emulator 🐍
Python 3.10.2 (main, Apr  9 2022 20:52:01) on WebAssembly VM
Type "help", "copyright", "credits" or "license" for more information.
>>> import micropip
>>> await micropip.install("uvicorn")
>>> import uvicorn
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/lib/python3.10/site-packages/uvicorn/__init__.py", line 1, in <module>
    from uvicorn.config import Config
  File "/lib/python3.10/site-packages/uvicorn/config.py", line 8, in <module>
    import ssl
  File "/lib/python3.10/ssl.py", line 98, in <module>
    import _ssl             # if we can't import it, let the error propagate
ModuleNotFoundError: No module named '_ssl'
>>> import ssl
>>> import uvicorn
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/lib/python3.10/site-packages/uvicorn/__init__.py", line 2, in <module>
    from uvicorn.main import Server, main, run
  File "/lib/python3.10/site-packages/uvicorn/main.py", line 24, in <module>
    from uvicorn.supervisors import ChangeReload, Multiprocess
  File "/lib/python3.10/site-packages/uvicorn/supervisors/__init__.py", line 3, in <module>
    from uvicorn.supervisors.basereload import BaseReload
  File "/lib/python3.10/site-packages/uvicorn/supervisors/basereload.py", line 12, in <module>
    from uvicorn.subprocess import get_subprocess
  File "/lib/python3.10/site-packages/uvicorn/subprocess.py", line 14, in <module>
    multiprocessing.allow_connection_pickling()
  File "/lib/python3.10/multiprocessing/context.py", line 170, in allow_connection_pickling
    from . import connection
  File "/lib/python3.10/multiprocessing/connection.py", line 21, in <module>
    import _multiprocessing
ModuleNotFoundError: No module named '_multiprocessing'
>>> import multiprocessing
>>> import uvicorn
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/lib/python3.10/site-packages/uvicorn/__init__.py", line 2, in <module>
    from uvicorn.main import Server, main, run
  File "/lib/python3.10/site-packages/uvicorn/main.py", line 24, in <module>
    from uvicorn.supervisors import ChangeReload, Multiprocess
  File "/lib/python3.10/site-packages/uvicorn/supervisors/__init__.py", line 3, in <module>
    from uvicorn.supervisors.basereload import BaseReload
  File "/lib/python3.10/site-packages/uvicorn/supervisors/basereload.py", line 12, in <module>
    from uvicorn.subprocess import get_subprocess
  File "/lib/python3.10/site-packages/uvicorn/subprocess.py", line 14, in <module>
    multiprocessing.allow_connection_pickling()
  File "/lib/python3.10/multiprocessing/context.py", line 170, in allow_connection_pickling
    from . import connection
  File "/lib/python3.10/multiprocessing/connection.py", line 21, in <module>
    import _multiprocessing
ModuleNotFoundError: No module named '_multiprocessing'
>>> 

Since the import ssl trick fixed the _ssl error I was hopeful that import multiprocessing could fix the _multiprocessing one, but sadly it did not.

But it looks like I can address this issue just by making import uvicorn in app.py an optional import.
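
A minimal sketch of that optional-import pattern - not the exact patch:

try:
    import uvicorn
except ImportError:
    # Pyodide can install uvicorn but not import it (no _ssl / _multiprocessing)
    uvicorn = None

versions = {"asgi": "3.0"}
if uvicorn is not None:
    versions["uvicorn"] = uvicorn.__version__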

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Get Datasette compatible with Pyodide 1223234932  
1115262218 https://github.com/simonw/datasette/issues/1733#issuecomment-1115262218 https://api.github.com/repos/simonw/datasette/issues/1733 IC_kwDOBm6k_c5CeY0K simonw 9599 2022-05-02T19:11:51Z 2022-05-02T19:14:01Z OWNER

Here's the full diff I applied to Datasette to get it fully working in Pyodide:

https://github.com/simonw/datasette/compare/94a3171b01fde5c52697aeeff052e3ad4bab5391...8af32bc5b03c30b1f7a4a8cc4bd80eb7e2ee7b81

And as a visible diff:

diff --git a/datasette/app.py b/datasette/app.py
index d269372..6c0c5fc 100644
--- a/datasette/app.py
+++ b/datasette/app.py
@@ -15,7 +15,6 @@ import pkg_resources
 import re
 import secrets
 import sys
-import threading
 import traceback
 import urllib.parse
 from concurrent import futures
@@ -26,7 +25,6 @@ from itsdangerous import URLSafeSerializer
 from jinja2 import ChoiceLoader, Environment, FileSystemLoader, PrefixLoader
 from jinja2.environment import Template
 from jinja2.exceptions import TemplateNotFound
-import uvicorn

 from .views.base import DatasetteError, ureg
 from .views.database import DatabaseDownload, DatabaseView
@@ -813,7 +811,6 @@ class Datasette:
             },
             "datasette": datasette_version,
             "asgi": "3.0",
-            "uvicorn": uvicorn.__version__,
             "sqlite": {
                 "version": sqlite_version,
                 "fts_versions": fts_versions,
@@ -854,23 +851,7 @@ class Datasette:
         ]

     def _threads(self):
-        threads = list(threading.enumerate())
-        d = {
-            "num_threads": len(threads),
-            "threads": [
-                {"name": t.name, "ident": t.ident, "daemon": t.daemon} for t in threads
-            ],
-        }
-        # Only available in Python 3.7+
-        if hasattr(asyncio, "all_tasks"):
-            tasks = asyncio.all_tasks()
-            d.update(
-                {
-                    "num_tasks": len(tasks),
-                    "tasks": [_cleaner_task_str(t) for t in tasks],
-                }
-            )
-        return d
+        return {"num_threads": 0, "threads": []}

     def _actor(self, request):
         return {"actor": request.actor}
diff --git a/datasette/database.py b/datasette/database.py
index ba594a8..b50142d 100644
--- a/datasette/database.py
+++ b/datasette/database.py
@@ -4,7 +4,6 @@ from pathlib import Path
 import janus
 import queue
 import sys
-import threading
 import uuid

 from .tracer import trace
@@ -21,8 +20,6 @@ from .utils import (
 )
 from .inspect import inspect_hash

-connections = threading.local()
-
 AttachedDatabase = namedtuple("AttachedDatabase", ("seq", "name", "file"))


@@ -43,12 +40,12 @@ class Database:
         self.hash = None
         self.cached_size = None
         self._cached_table_counts = None
-        self._write_thread = None
-        self._write_queue = None
         if not self.is_mutable and not self.is_memory:
             p = Path(path)
             self.hash = inspect_hash(p)
             self.cached_size = p.stat().st_size
+        self._read_connection = None
+        self._write_connection = None

     @property
     def cached_table_counts(self):
@@ -134,60 +131,17 @@ class Database:
         return results

     async def execute_write_fn(self, fn, block=True):
-        task_id = uuid.uuid5(uuid.NAMESPACE_DNS, "datasette.io")
-        if self._write_queue is None:
-            self._write_queue = queue.Queue()
-        if self._write_thread is None:
-            self._write_thread = threading.Thread(
-                target=self._execute_writes, daemon=True
-            )
-            self._write_thread.start()
-        reply_queue = janus.Queue()
-        self._write_queue.put(WriteTask(fn, task_id, reply_queue))
-        if block:
-            result = await reply_queue.async_q.get()
-            if isinstance(result, Exception):
-                raise result
-            else:
-                return result
-        else:
-            return task_id
-
-    def _execute_writes(self):
-        # Infinite looping thread that protects the single write connection
-        # to this database
-        conn_exception = None
-        conn = None
-        try:
-            conn = self.connect(write=True)
-            self.ds._prepare_connection(conn, self.name)
-        except Exception as e:
-            conn_exception = e
-        while True:
-            task = self._write_queue.get()
-            if conn_exception is not None:
-                result = conn_exception
-            else:
-                try:
-                    result = task.fn(conn)
-                except Exception as e:
-                    sys.stderr.write("{}\n".format(e))
-                    sys.stderr.flush()
-                    result = e
-            task.reply_queue.sync_q.put(result)
+        # We always treat it as if block=True now
+        if self._write_connection is None:
+            self._write_connection = self.connect(write=True)
+            self.ds._prepare_connection(self._write_connection, self.name)
+        return fn(self._write_connection)

     async def execute_fn(self, fn):
-        def in_thread():
-            conn = getattr(connections, self.name, None)
-            if not conn:
-                conn = self.connect()
-                self.ds._prepare_connection(conn, self.name)
-                setattr(connections, self.name, conn)
-            return fn(conn)
-
-        return await asyncio.get_event_loop().run_in_executor(
-            self.ds.executor, in_thread
-        )
+        if self._read_connection is None:
+            self._read_connection = self.connect()
+            self.ds._prepare_connection(self._read_connection, self.name)
+        return fn(self._read_connection)

     async def execute(
         self,
diff --git a/setup.py b/setup.py
index 7f0562f..c41669c 100644
--- a/setup.py
+++ b/setup.py
@@ -44,20 +44,20 @@ setup(
     install_requires=[
         "asgiref>=3.2.10,<3.6.0",
         "click>=7.1.1,<8.2.0",
-        "click-default-group~=1.2.2",
+        # "click-default-group~=1.2.2",
         "Jinja2>=2.10.3,<3.1.0",
         "hupper~=1.9",
         "httpx>=0.20",
         "pint~=0.9",
         "pluggy>=1.0,<1.1",
-        "uvicorn~=0.11",
+        # "uvicorn~=0.11",
         "aiofiles>=0.4,<0.9",
         "janus>=0.6.2,<1.1",
         "asgi-csrf>=0.9",
         "PyYAML>=5.3,<7.0",
         "mergedeep>=1.1.1,<1.4.0",
         "itsdangerous>=1.1,<3.0",
-        "python-baseconv==1.2.2",
+        # "python-baseconv==1.2.2",
     ],
     entry_points="""
         [console_scripts]
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Get Datasette compatible with Pyodide 1223234932  
1115260999 https://github.com/simonw/datasette/issues/1734#issuecomment-1115260999 https://api.github.com/repos/simonw/datasette/issues/1734 IC_kwDOBm6k_c5CeYhH simonw 9599 2022-05-02T19:10:34Z 2022-05-02T19:10:34Z OWNER

This is actually mostly a documentation thing: here: https://docs.datasette.io/en/0.61.1/authentication.html#including-an-expiry-time

In the code it's only used in these two places:

https://github.com/simonw/datasette/blob/0a7621f96f8ad14da17e7172e8a7bce24ef78966/datasette/actor_auth_cookie.py#L16-L20

https://github.com/simonw/datasette/blob/0a7621f96f8ad14da17e7172e8a7bce24ef78966/tests/test_auth.py#L56-L60

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Remove python-baseconv dependency 1223241647  
1115258737 https://github.com/simonw/datasette/issues/1733#issuecomment-1115258737 https://api.github.com/repos/simonw/datasette/issues/1733 IC_kwDOBm6k_c5CeX9x simonw 9599 2022-05-02T19:08:17Z 2022-05-02T19:08:17Z OWNER

I was going to vendor baseconv.py, but then I reconsidered - what if there are plugins out there that expect import baseconv to work because they depend on Datasette?

I used https://cs.github.com/ and as far as I can tell there aren't any!

So I'm going to remove that dependency and work out a smarter way to do this - probably by providing a utility function within Datasette itself.
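
The utility itself can be tiny - a sketch of a self-contained base62 implementation, not what Datasette actually shipped:

import string

ALPHABET = string.digits + string.ascii_letters  # 62 characters

def base62_encode(number):
    if number == 0:
        return ALPHABET[0]
    digits = []
    while number:
        number, remainder = divmod(number, 62)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))

def base62_decode(encoded):
    number = 0
    for character in encoded:
        number = number * 62 + ALPHABET.index(character)
    return number

assert base62_decode(base62_encode(1651000000)) == 1651000000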

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Get Datasette compatible with Pyodide 1223234932  
1115256318 https://github.com/simonw/datasette/issues/1733#issuecomment-1115256318 https://api.github.com/repos/simonw/datasette/issues/1733 IC_kwDOBm6k_c5CeXX- simonw 9599 2022-05-02T19:05:55Z 2022-05-02T19:05:55Z OWNER

I released a click-default-group-wheel package to solve that dependency issue. I've already upgraded sqlite-utils to that, so now you can use that in Pyodide:

  • https://github.com/simonw/sqlite-utils/pull/429

python-baseconv is only used for actor cookie expiration times:

https://github.com/simonw/datasette/blob/0a7621f96f8ad14da17e7172e8a7bce24ef78966/datasette/actor_auth_cookie.py#L16-L20

Datasette never actually sets that cookie itself - it instead encourages plugins to set it in the authentication documentation here: https://docs.datasette.io/en/0.61.1/authentication.html#including-an-expiry-time
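
The documented pattern looks something like this - a sketch based on that documentation page, not code copied from Datasette (the int() call just normalizes whatever decode returns):

import time
from baseconv import base62

expires_at = base62.encode(int(time.time()) + 3600)   # valid for one hour
print(int(base62.decode(expires_at)) > time.time())   # True until it expires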

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Get Datasette compatible with Pyodide 1223234932  
1115196863 https://github.com/simonw/sqlite-utils/pull/429#issuecomment-1115196863 https://api.github.com/repos/simonw/sqlite-utils/issues/429 IC_kwDOCGYnMM5CeI2_ simonw 9599 2022-05-02T18:03:47Z 2022-05-02T18:52:42Z OWNER

I made a build of this branch and tested it like this: https://pyodide.org/en/stable/console.html

>>> import micropip
>>> await micropip.install("https://s3.amazonaws.com/simonwillison-cors-allowed-public/sqlite_utils-3.26-py3-none-any.whl")
>>> import sqlite_utils
>>> db = sqlite_utils.Database(memory=True)
>>> list(db.query("select 32443 + 55"))
[{'32443 + 55': 32498}]
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Depend on click-default-group-wheel 1223177069  
1115197644 https://github.com/simonw/sqlite-utils/pull/429#issuecomment-1115197644 https://api.github.com/repos/simonw/sqlite-utils/issues/429 IC_kwDOCGYnMM5CeJDM simonw 9599 2022-05-02T18:04:28Z 2022-05-02T18:04:28Z OWNER

I'm going to ship this straight away as 3.26.1.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Depend on click-default-group-wheel 1223177069  
1114058210 https://github.com/simonw/datasette/issues/1727#issuecomment-1114058210 https://api.github.com/repos/simonw/datasette/issues/1727 IC_kwDOBm6k_c5CZy3i simonw 9599 2022-04-30T21:39:34Z 2022-04-30T21:39:34Z OWNER

Something to consider if I look into subprocesses for parallel query execution:

https://sqlite.org/howtocorrupt.html#_carrying_an_open_database_connection_across_a_fork_

Do not open an SQLite database connection, then fork(), then try to use that database connection in the child process. All kinds of locking problems will result and you can easily end up with a corrupt database. SQLite is not designed to support that kind of behavior. Any database connection that is used in a child process must be opened in the child process, not inherited from the parent.
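
A minimal illustration of the safe pattern - each worker opens its own connection after the fork (github.db is just the example database from above):

import sqlite3
from concurrent.futures import ProcessPoolExecutor

def count_tables(db_path):
    # the connection is opened inside the child process, never inherited
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute("select count(*) from sqlite_master").fetchone()[0]
    finally:
        conn.close()

if __name__ == "__main__":
    with ProcessPoolExecutor() as pool:
        print(list(pool.map(count_tables, ["github.db"] * 4)))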

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Research: demonstrate if parallel SQL queries are worthwhile 1217759117  
1114038259 https://github.com/simonw/datasette/issues/1729#issuecomment-1114038259 https://api.github.com/repos/simonw/datasette/issues/1729 IC_kwDOBm6k_c5CZt_z simonw 9599 2022-04-30T19:06:03Z 2022-04-30T19:06:03Z OWNER

but actually the facet results would be better if they were a list rather than a dictionary

I think facet_results in the JSON should match this (used by the HTML) instead:

https://github.com/simonw/datasette/blob/942411ef946e9a34a2094944d3423cddad27efd3/datasette/views/table.py#L737-L741

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Implement ?_extra and new API design for TableView 1219385669  
1114036946 https://github.com/simonw/datasette/issues/1729#issuecomment-1114036946 https://api.github.com/repos/simonw/datasette/issues/1729 IC_kwDOBm6k_c5CZtrS simonw 9599 2022-04-30T18:56:25Z 2022-04-30T19:04:03Z OWNER

Related:
- #1558

Which talks about how there was confusion in this example: https://latest.datasette.io/fixtures/facetable.json?_facet=created&_facet_date=created&_facet=tags&_facet_array=tags&_nosuggest=1&_size=0

Which I fixed in #625 by introducing tags and tags_2 keys, but actually the facet results would be better if they were a list rather than a dictionary.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Implement ?_extra and new API design for TableView 1219385669  
1114037521 https://github.com/simonw/datasette/issues/1729#issuecomment-1114037521 https://api.github.com/repos/simonw/datasette/issues/1729 IC_kwDOBm6k_c5CZt0R simonw 9599 2022-04-30T19:01:07Z 2022-04-30T19:01:07Z OWNER

I had to look up what hideable means - it's false when the current facet can't be hidden because it was defined in metadata, not as a ?_facet= parameter:

https://github.com/simonw/datasette/blob/4e47a2d894b96854348343374c8e97c9d7055cf6/datasette/facets.py#L228

That's a bit of a weird thing to expose in the API. Maybe change that to source so it can be metadata or request? That's very slightly less coupled to how the UI works.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Implement ?_extra and new API design for TableView 1219385669  
1114013757 https://github.com/simonw/datasette/issues/1729#issuecomment-1114013757 https://api.github.com/repos/simonw/datasette/issues/1729 IC_kwDOBm6k_c5CZoA9 simonw 9599 2022-04-30T16:15:51Z 2022-04-30T18:54:39Z OWNER

Deployed a preview of this here: https://latest-1-0-alpha.datasette.io/

Examples:

  • https://latest-1-0-alpha.datasette.io/fixtures/facetable.json
  • https://latest-1-0-alpha.datasette.io/fixtures/facetable.json?_facet=state&_size=0&_extra=facet_results&_extra=count

Second example produces:

{
  "rows": [],
  "next": null,
  "next_url": null,
  "count": 15,
  "facet_results": {
    "state": {
      "name": "state",
      "type": "column",
      "hideable": true,
      "toggle_url": "/fixtures/facetable.json?_size=0&_extra=facet_results&_extra=count",
      "results": [
        {
          "value": "CA",
          "label": "CA",
          "count": 10,
          "toggle_url": "https://latest-1-0-alpha.datasette.io/fixtures/facetable.json?_facet=state&_size=0&_extra=facet_results&_extra=count&state=CA",
          "selected": false
        },
        {
          "value": "MI",
          "label": "MI",
          "count": 4,
          "toggle_url": "https://latest-1-0-alpha.datasette.io/fixtures/facetable.json?_facet=state&_size=0&_extra=facet_results&_extra=count&state=MI",
          "selected": false
        },
        {
          "value": "MC",
          "label": "MC",
          "count": 1,
          "toggle_url": "https://latest-1-0-alpha.datasette.io/fixtures/facetable.json?_facet=state&_size=0&_extra=facet_results&_extra=count&state=MC",
          "selected": false
        }
      ],
      "truncated": false
    }
  }
}
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Implement ?_extra and new API design for TableView 1219385669  
1112889800 https://github.com/simonw/datasette/issues/1727#issuecomment-1112889800 https://api.github.com/repos/simonw/datasette/issues/1727 IC_kwDOBm6k_c5CVVnI simonw 9599 2022-04-29T05:29:38Z 2022-04-29T05:29:38Z OWNER

OK, I just got the most incredible result with that!

I started up a container running bash like this, from my datasette checkout. I'm mapping port 8005 on my laptop to port 8001 inside the container because laptop port 8001 was already doing something else:

docker run -it --rm --name my-running-script -p 8005:8001 -v "$PWD":/usr/src/myapp \
  -w /usr/src/myapp nogil/python bash

Then in bash I ran the following commands to install Datasette and its dependencies:

pip install -e '.[test]'
pip install datasette-pretty-traces # For debug tracing

Then I started Datasette against my github.db database (from github-to-sqlite.dogsheep.net/github.db) like this:

datasette github.db -h 0.0.0.0 --setting trace_debug 1

I hit the following two URLs to compare the parallel v.s. not parallel implementations:

  • http://127.0.0.1:8005/github/issues?_facet=milestone&_facet=repo&_trace=1&_size=10
  • http://127.0.0.1:8005/github/issues?_facet=milestone&_facet=repo&_trace=1&_size=10&_noparallel=1

And... the parallel one beat the non-parallel one decisively, on multiple page refreshes!

Not parallel: 77ms

Parallel: 47ms

https://user-images.githubusercontent.com/9599/165889437-60d4200d-698a-4175-af23-7c03bb456e66.png

https://user-images.githubusercontent.com/9599/165889445-2dfb8676-d823-405e-aecb-ad28ec3043da.png

So yeah, I'm very confident this is a problem with the GIL. And I am absolutely stunned that @colesbury's fork ran Datasette (which has some reasonably tricky threading and async stuff going on) out of the box!

{
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Research: demonstrate if parallel SQL queries are worthwhile 1217759117  
1112879463 https://github.com/simonw/datasette/issues/1727#issuecomment-1112879463 https://api.github.com/repos/simonw/datasette/issues/1727 IC_kwDOBm6k_c5CVTFn simonw 9599 2022-04-29T05:03:58Z 2022-04-29T05:03:58Z OWNER

It would be really fun to try running this with the in-development nogil Python from https://github.com/colesbury/nogil

There's a Docker container for it: https://hub.docker.com/r/nogil/python

It suggests you can run something like this:

docker run -it --rm --name my-running-script -v "$PWD":/usr/src/myapp \
  -w /usr/src/myapp nogil/python python your-daemon-or-script.py
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Research: demonstrate if parallel SQL queries are worthwhile 1217759117  
1112878955 https://github.com/simonw/datasette/issues/1727#issuecomment-1112878955 https://api.github.com/repos/simonw/datasette/issues/1727 IC_kwDOBm6k_c5CVS9r simonw 9599 2022-04-29T05:02:40Z 2022-04-29T05:02:40Z OWNER

Here's a very useful (recent) article about how the GIL works and how to think about it: https://pythonspeed.com/articles/python-gil/ - via https://lobste.rs/s/9hj80j/when_python_can_t_thread_deep_dive_into_gil

From that article:

For example, let's consider an extension module written in C or Rust that lets you talk to a PostgreSQL database server.

Conceptually, handling a SQL query with this library will go through three steps:

  1. Deserialize from Python to the internal library representation. Since this will be reading Python objects, it needs to hold the GIL.
  2. Send the query to the database server, and wait for a response. This doesn't need the GIL.
  3. Convert the response into Python objects. This needs the GIL again.

As you can see, how much parallelism you can get depends on how much time is spent in each step. If the bulk of time is spent in step 2, you'll get parallelism there. But if, for example, you run a SELECT and get a large number of rows back, the library will need to create many Python objects, and step 3 will have to hold GIL for a while.

That explains what I'm seeing here. I'm pretty convinced now that the reason I'm not getting a performance boost from parallel queries is that there's more time spent in Python code assembling the results than in SQLite C code executing the query.
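
A quick way to see that split - a sketch (not a rigorous benchmark) comparing a row-heavy query, where Python object assembly dominates, against an aggregate that stays inside SQLite:

import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("create table t (n integer)")
conn.executemany("insert into t values (?)", ((i,) for i in range(1_000_000)))

start = time.perf_counter()
rows = conn.execute("select n from t").fetchall()   # builds 1M Python tuples
print("fetch all rows:", time.perf_counter() - start)

start = time.perf_counter()
total = conn.execute("select sum(n) from t").fetchone()   # stays in SQLite
print("aggregate:     ", time.perf_counter() - start)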

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Research: demonstrate if parallel SQL queries are worthwhile 1217759117  
1112734577 https://github.com/simonw/datasette/issues/1729#issuecomment-1112734577 https://api.github.com/repos/simonw/datasette/issues/1729 IC_kwDOBm6k_c5CUvtx simonw 9599 2022-04-28T23:08:42Z 2022-04-28T23:08:42Z OWNER

That prototype is a very small amount of code so far:

diff --git a/datasette/renderer.py b/datasette/renderer.py
index 4508949..b600e1b 100644
--- a/datasette/renderer.py
+++ b/datasette/renderer.py
@@ -28,6 +28,10 @@ def convert_specific_columns_to_json(rows, columns, json_cols):

 def json_renderer(args, data, view_name):
     """Render a response as JSON"""
+    from pprint import pprint
+
+    pprint(data)
+
     status_code = 200

     # Handle the _json= parameter which may modify data["rows"]
@@ -43,6 +47,41 @@ def json_renderer(args, data, view_name):
     if "rows" in data and not value_as_boolean(args.get("_json_infinity", "0")):
         data["rows"] = [remove_infinites(row) for row in data["rows"]]

+    # Start building the default JSON here
+    columns = data["columns"]
+    next_url = data.get("next_url")
+    output = {
+        "rows": [dict(zip(columns, row)) for row in data["rows"]],
+        "next": data["next"],
+        "next_url": next_url,
+    }
+
+    extras = set(args.getlist("_extra"))
+
+    extras_map = {
+        # _extra=   :    data[field]
+        "count": "filtered_table_rows_count",
+        "facet_results": "facet_results",
+        "suggested_facets": "suggested_facets",
+        "columns": "columns",
+        "primary_keys": "primary_keys",
+        "query_ms": "query_ms",
+        "query": "query",
+    }
+    for extra_key, data_key in extras_map.items():
+        if extra_key in extras:
+            output[extra_key] = data[data_key]
+
+    body = json.dumps(output, cls=CustomJSONEncoder)
+    content_type = "application/json; charset=utf-8"
+    headers = {}
+    if next_url:
+        headers["link"] = f'<{next_url}>; rel="next"'
+    return Response(
+        body, status=status_code, headers=headers, content_type=content_type
+    )
+
+
     # Deal with the _shape option
     shape = args.get("_shape", "arrays")
     # if there's an error, ignore the shape entirely
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Implement ?_extra and new API design for TableView 1219385669  
1112732563 https://github.com/simonw/datasette/issues/1729#issuecomment-1112732563 https://api.github.com/repos/simonw/datasette/issues/1729 IC_kwDOBm6k_c5CUvOT simonw 9599 2022-04-28T23:05:03Z 2022-04-28T23:05:03Z OWNER

OK, the prototype of this is looking really good - it's very pleasant to use.

http://127.0.0.1:8001/github_memory/issue_comments.json?_search=simon&_sort=id&_size=5&_extra=query_ms&_extra=count&_col=body returns this:

{
  "rows": [
    {
      "id": 338854988,
      "body": "    /database-name/table-name?name__contains=simon&sort=id+desc\r\n\r\nNote that if there's a column called \"sort\" you can still do sort__exact=blah\r\n\r\n"
    },
    {
      "id": 346427794,
      "body": "Thanks. There is a way to use pip to grab apsw, which also let's you configure it (flags to build extensions, use an internal sqlite, etc). Don't know how that works as a dependency for another package, though.\n\nOn November 22, 2017 11:38:06 AM EST, Simon Willison <notifications@github.com> wrote:\n>I have a solution for FTS already, but I'm interested in apsw as a\n>mechanism for allowing custom virtual tables to be written in Python\n>(pysqlite only lets you write custom functions)\n>\n>Not having PyPI support is pretty tough though. I'm planning a\n>plugin/extension system which would be ideal for things like an\n>optional apsw mode, but that's a lot harder if apsw isn't in PyPI.\n>\n>-- \n>You are receiving this because you authored the thread.\n>Reply to this email directly or view it on GitHub:\n>https://github.com/simonw/datasette/issues/144#issuecomment-346405660\n"
    },
    {
      "id": 348252037,
      "body": "WOW!\n\n--\nPaul Ford // (646) 369-7128 // @ftrain\n\nOn Thu, Nov 30, 2017 at 11:47 AM, Simon Willison <notifications@github.com>\nwrote:\n\n> Remaining work on this now lives in a milestone:\n> https://github.com/simonw/datasette/milestone/6\n>\n> —\n> You are receiving this because you were mentioned.\n> Reply to this email directly, view it on GitHub\n> <https://github.com/simonw/datasette/issues/153#issuecomment-348248406>,\n> or mute the thread\n> <https://github.com/notifications/unsubscribe-auth/AABPKHzaVPKwTOoHouK2aMUnM-mPnPk6ks5s7twzgaJpZM4Qq2zW>\n> .\n>\n"
    },
    {
      "id": 391141391,
      "body": "I'm going to clean this up for consistency tomorrow morning so hold off\nmerging until then please\n\nOn Tue, May 22, 2018 at 6:34 PM, Simon Willison <notifications@github.com>\nwrote:\n\n> Yeah let's try this without pysqlite3 and see if we still get the correct\n> version.\n>\n> —\n> You are receiving this because you authored the thread.\n> Reply to this email directly, view it on GitHub\n> <https://github.com/simonw/datasette/pull/280#issuecomment-391076458>, or mute\n> the thread\n> <https://github.com/notifications/unsubscribe-auth/AAihfMI-H6CBt-Py0xdBbH2xDK0KsjT2ks5t1EwYgaJpZM4UI_2m>\n> .\n>\n"
    },
    {
      "id": 391355030,
      "body": "No objections;\r\nIt's good to go @simonw\r\n\r\nOn Wed, 23 May 2018, 14:51 Simon Willison, <notifications@github.com> wrote:\r\n\r\n> @r4vi <https://github.com/r4vi> any objections to me merging this?\r\n>\r\n> —\r\n> You are receiving this because you were mentioned.\r\n> Reply to this email directly, view it on GitHub\r\n> <https://github.com/simonw/datasette/pull/280#issuecomment-391354237>, or mute\r\n> the thread\r\n> <https://github.com/notifications/unsubscribe-auth/AAihfM_2DN5WR2mkO-VK6ozDmkUQ4IMjks5t1WlcgaJpZM4UI_2m>\r\n> .\r\n>\r\n"
    }
  ],
  "next": "391355030,391355030",
  "next_url": "http://127.0.0.1:8001/github_memory/issue_comments.json?_search=simon&_size=5&_extra=query_ms&_extra=count&_col=body&_next=391355030%2C391355030&_sort=id",
  "count": 57,
  "query_ms": 21.780223003588617
}
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Implement ?_extra and new API design for TableView 1219385669  
1112730416 https://github.com/simonw/datasette/issues/1729#issuecomment-1112730416 https://api.github.com/repos/simonw/datasette/issues/1729 IC_kwDOBm6k_c5CUusw simonw 9599 2022-04-28T23:01:21Z 2022-04-28T23:01:21Z OWNER

I'm not sure what to do about the "truncated": true/false key.

It's not really relevant to table results, since they are paginated whether or not you ask for them to be.

It plays a role in query results, where you might run select * from table and get back 1000 results because Datasette truncates at that point rather than returning everything.

Adding it to every table result and always setting it to "truncated": false feels confusing.

I think I'm going to keep it exclusively in the default representation for the /db?sql=... query endpoint, and not return it at all for tables.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Implement ?_extra and new API design for TableView 1219385669  
1112721321 https://github.com/simonw/datasette/issues/1729#issuecomment-1112721321 https://api.github.com/repos/simonw/datasette/issues/1729 IC_kwDOBm6k_c5CUsep simonw 9599 2022-04-28T22:44:05Z 2022-04-28T22:44:14Z OWNER

I may be able to implement this mostly in the json_renderer() function: https://github.com/simonw/datasette/blob/94a3171b01fde5c52697aeeff052e3ad4bab5391/datasette/renderer.py#L29-L34

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Implement ?_extra and new API design for TableView 1219385669  
1112717745 https://github.com/simonw/datasette/issues/1729#issuecomment-1112717745 https://api.github.com/repos/simonw/datasette/issues/1729 IC_kwDOBm6k_c5CUrmx simonw 9599 2022-04-28T22:38:39Z 2022-04-28T22:39:05Z OWNER

(I remain keen on the idea of shipping a plugin that restores the old default API shape to people who have written pre-Datasette-1.0 code against it, but I'll tackle that much later. I really like how jQuery has a culture of doing this.)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Implement ?_extra and new API design for TableView 1219385669  
1112717210 https://github.com/simonw/datasette/issues/1729#issuecomment-1112717210 https://api.github.com/repos/simonw/datasette/issues/1729 IC_kwDOBm6k_c5CUrea simonw 9599 2022-04-28T22:37:37Z 2022-04-28T22:37:37Z OWNER

This means filtered_table_rows_count is going to become count. I had originally picked that terrible name to avoid confusion between the count of all rows in the table and the count of rows that were filtered.

I'll add ?_extra=table_count for getting back the full table count instead. I think count is clear enough!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Implement ?_extra and new API design for TableView 1219385669  
1112716611 https://github.com/simonw/datasette/issues/1729#issuecomment-1112716611 https://api.github.com/repos/simonw/datasette/issues/1729 IC_kwDOBm6k_c5CUrVD simonw 9599 2022-04-28T22:36:24Z 2022-04-28T22:36:24Z OWNER

Then I'm going to implement the following ?_extra= options:

  • ?_extra=facet_results - to see facet results
  • ?_extra=suggested_facets - for suggested facets
  • ?_extra=count - for the count of total rows
  • ?_extra=columns - for a list of column names
  • ?_extra=primary_keys - for a list of primary keys
  • ?_extra=query - a {"sql": "select ...", "params": {}} object

I thought about having ?_extra=facet_results returned automatically if the user specifies at least one ?_facet - but that doesn't work for default facets configured in metadata.json - how can the user opt out of those being returned? So I'm going to say you don't see facets at all if you don't include ?_extra=facet_results.

I'm tempted to add ?_extra=_all to return everything, but I can decide if that's a good idea later.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Implement ?_extra and new API design for TableView 1219385669  
1112713581 https://github.com/simonw/datasette/issues/1729#issuecomment-1112713581 https://api.github.com/repos/simonw/datasette/issues/1729 IC_kwDOBm6k_c5CUqlt simonw 9599 2022-04-28T22:31:11Z 2022-04-28T22:31:11Z OWNER

I'm going to change the default API response to look like this:

{
  "rows": [
    {
      "pk": 1,
      "created": "2019-01-14 08:00:00",
      "planet_int": 1,
      "on_earth": 1,
      "state": "CA",
      "_city_id": 1,
      "_neighborhood": "Mission",
      "tags": "[\"tag1\", \"tag2\"]",
      "complex_array": "[{\"foo\": \"bar\"}]",
      "distinct_some_null": "one",
      "n": "n1"
    },
    {
      "pk": 2,
      "created": "2019-01-14 08:00:00",
      "planet_int": 1,
      "on_earth": 1,
      "state": "CA",
      "_city_id": 1,
      "_neighborhood": "Dogpatch",
      "tags": "[\"tag1\", \"tag3\"]",
      "complex_array": "[]",
      "distinct_some_null": "two",
      "n": "n2"
    }
  ],
  "next": null,
  "next_url": null
}

Basically https://latest.datasette.io/fixtures/facetable.json?_shape=objects but with just the rows, next and next_url fields returned by default.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Implement ?_extra and new API design for TableView 1219385669  
1112711115 https://github.com/simonw/datasette/issues/1715#issuecomment-1112711115 https://api.github.com/repos/simonw/datasette/issues/1715 IC_kwDOBm6k_c5CUp_L simonw 9599 2022-04-28T22:26:56Z 2022-04-28T22:26:56Z OWNER

I'm not going to use asyncinject in this refactor - at least not until I really need it. My research in these issues has put me off the idea (in favour of asyncio.gather() or even not trying for parallel execution at all):

  • #1727

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Refactor TableView to use asyncinject 1212823665  
1112668411 https://github.com/simonw/datasette/issues/1727#issuecomment-1112668411 https://api.github.com/repos/simonw/datasette/issues/1727 IC_kwDOBm6k_c5CUfj7 simonw 9599 2022-04-28T21:25:34Z 2022-04-28T21:25:44Z OWNER

The two most promising theories at the moment, from here and Twitter and the SQLite forum, are:

  • SQLite is I/O bound - it generally only goes as fast as it can load data from disk. Multiple connections all competing for the same file on disk are going to end up blocked at the file system layer. But maybe this means in-memory databases will perform better?
  • It's the GIL. The sqlite3 C code may release the GIL, but the bits that do things like assembling Row objects to return still happen in Python, and that Python can only run on a single core.

A couple of ways to research the in-memory theory:

  • Use a RAM disk on macOS (or Linux). https://stackoverflow.com/a/2033417/6083 has instructions - short version:

    hdiutil attach -nomount ram://$((2 * 1024 * 100))
    diskutil eraseVolume HFS+ RAMDisk name-returned-by-previous-command (was /dev/disk2 when I tried it)
    cd /Volumes/RAMDisk
    cp ~/fixtures.db .

  • Copy Datasette databases into an in-memory database on startup. I built a new plugin to do that here: https://github.com/simonw/datasette-copy-to-memory

I need to do some more, better benchmarks using these different approaches.

https://twitter.com/laurencerowe/status/1519780174560169987 also suggests:

Maybe try:
1. Copy the sqlite file to /dev/shm and rerun (all in ram.)
2. Create a CTE which calculates Fibonacci or similar so you can test something completely cpu bound (only return max value or something to avoid crossing between sqlite/Python.)

I like that second idea a lot - I could use the mandelbrot example from https://www.sqlite.org/lang_with.html#outlandish_recursive_query_examples
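
A sketch of that second suggestion using Fibonacci - a CPU-bound recursive CTE returning a single row, so almost no time goes into building Python objects (the modulus keeps the integers from overflowing):

import sqlite3
import threading
import time

SQL = """
with recursive fib(n, a, b) as (
  select 1, 0, 1
  union all
  select n + 1, b, (a + b) % 2147483647 from fib where n < 3000000
)
select max(a) from fib
"""

def run():
    conn = sqlite3.connect(":memory:")   # one connection per thread
    conn.execute(SQL).fetchone()
    conn.close()

def timed(num_threads):
    threads = [threading.Thread(target=run) for _ in range(num_threads)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start

# if sqlite3 really releases the GIL for the whole query, two threads
# should take about as long as one - not twice as long
print("1 thread: ", timed(1))
print("2 threads:", timed(2))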

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Research: demonstrate if parallel SQL queries are worthwhile 1217759117  
1111726586 https://github.com/simonw/datasette/issues/1727#issuecomment-1111726586 https://api.github.com/repos/simonw/datasette/issues/1727 IC_kwDOBm6k_c5CQ5n6 simonw 9599 2022-04-28T04:17:16Z 2022-04-28T04:19:31Z OWNER

I could experiment with the await asyncio.run_in_executor(processpool_executor, fn) mechanism described in https://stackoverflow.com/a/29147750

Code examples: https://cs.github.com/?scopeName=All+repos&scope=&q=run_in_executor+ProcessPoolExecutor
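
A sketch of that mechanism applied to SQLite (the queries and github.db are placeholders; each worker process opens its own connection, per the fork warning above):

import asyncio
import sqlite3
from concurrent.futures import ProcessPoolExecutor

def run_query(db_path, sql):
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute(sql).fetchall()
    finally:
        conn.close()

async def main():
    loop = asyncio.get_running_loop()
    with ProcessPoolExecutor() as pool:
        counts = await asyncio.gather(
            loop.run_in_executor(pool, run_query, "github.db", "select count(*) from issues"),
            loop.run_in_executor(pool, run_query, "github.db", "select count(*) from issue_comments"),
        )
    print(counts)

if __name__ == "__main__":
    asyncio.run(main())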

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Research: demonstrate if parallel SQL queries are worthwhile 1217759117  
1111725638 https://github.com/simonw/datasette/issues/1727#issuecomment-1111725638 https://api.github.com/repos/simonw/datasette/issues/1727 IC_kwDOBm6k_c5CQ5ZG simonw 9599 2022-04-28T04:15:15Z 2022-04-28T04:15:15Z OWNER

Useful theory from Keith Medcalf https://sqlite.org/forum/forumpost/e363c69d3441172e

This is true, but the concurrency is limited to the execution which occurs with the GIL released (that is, in the native C sqlite3 library itself). Each row (for example) can be retrieved in parallel but "constructing the python return objects for each row" will be serialized (by the GIL).

That is to say that if you have two python threads each with their own connection, and each one is performing a select that returns 1,000,000 rows (let's say that is 25% of the candidates for each select) then the difference in execution time between executing two python threads in parallel vs a single serial thread will not be much different (if even detectable at all). In fact it is possible that the multiple-threaded version takes longer to run both queries to completion because of the increased contention over a shared resource (the GIL).

So maybe this is a GIL thing.

I should test with some expensive SQL queries (maybe big aggregations against large tables) and see if I can spot an improvement there.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Research: demonstrate if parallel SQL queries are worthwhile 1217759117  
1111714665 https://github.com/simonw/datasette/issues/1728#issuecomment-1111714665 https://api.github.com/repos/simonw/datasette/issues/1728 IC_kwDOBm6k_c5CQ2tp simonw 9599 2022-04-28T03:52:47Z 2022-04-28T03:52:58Z OWNER

Nice custom template/theme!

Yeah, for that I'd recommend hosting elsewhere - on a regular VPS (I use systemd like this: https://docs.datasette.io/en/stable/deploying.html#running-datasette-using-systemd ) or using Fly if you want to run containers without managing a full server.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Writable canned queries fail with useless non-error against immutable databases 1218133366  
1111708206 https://github.com/simonw/datasette/issues/1728#issuecomment-1111708206 https://api.github.com/repos/simonw/datasette/issues/1728 IC_kwDOBm6k_c5CQ1Iu simonw 9599 2022-04-28T03:38:56Z 2022-04-28T03:38:56Z OWNER

In terms of this bug, there are a few potential fixes:

  1. Detect the write to a immutable database and show the user a proper, meaningful error message in the red error box at the top of the page
  2. Don't allow the user to even submit the form - show a message saying that this canned query is unavailable because the database cannot be written to
  3. Don't even allow Datasette to start running at all - if there's a canned query configured in metadata.yml and the database it refers to is in -i immutable mode, throw an error on startup

I'm not keen on that last one because it would be frustrating if you couldn't launch Datasette just because you had an old canned query lying around in your metadata file.

So I'm leaning towards option 2.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Writable canned queries fail with useless non-error against immutable databases 1218133366  
1111707384 https://github.com/simonw/datasette/issues/1728#issuecomment-1111707384 https://api.github.com/repos/simonw/datasette/issues/1728 IC_kwDOBm6k_c5CQ074 simonw 9599 2022-04-28T03:36:46Z 2022-04-28T03:36:56Z OWNER

A more realistic solution (which I've been using on several of my own projects) is to keep the data itself in GitHub and encourage users to edit it there - using the GitHub web interface to edit YAML files or similar.

Needs your users to be comfortable hand-editing YAML though! You can at least guard against critical errors by having CI run tests against their YAML before deploying.

I have a dream of building a more friendly web forms interface which edits the YAML back on GitHub for the user, but that's just a concept at the moment.

Even more fun would be if a user-friendly form could submit PRs for review without the user having to know what a PR is!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Writable canned queries fail with useless non-error against immutable databases 1218133366  
1111706519 https://github.com/simonw/datasette/issues/1728#issuecomment-1111706519 https://api.github.com/repos/simonw/datasette/issues/1728 IC_kwDOBm6k_c5CQ0uX simonw 9599 2022-04-28T03:34:49Z 2022-04-28T03:34:49Z OWNER

I've wanted to do stuff like that on Cloud Run too. So far I've assumed that it's not feasible, but recently I've been wondering how hard it would be to have a small (like less than 100KB or so) Datasette instance which persists data to a backing GitHub repository such that when it starts up it can pull the latest copy and any time someone edits it can push their changes.

I'm still not sure it would work well on Cloud Run due to the uncertainty at what would happen if Cloud Run decided to boot up a second instance - but it's still an interesting thought exercise.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Writable canned queries fail with useless non-error against immutable databases 1218133366  
1111705069 https://github.com/simonw/datasette/issues/1728#issuecomment-1111705069 https://api.github.com/repos/simonw/datasette/issues/1728 IC_kwDOBm6k_c5CQ0Xt simonw 9599 2022-04-28T03:31:33Z 2022-04-28T03:31:33Z OWNER

Confirmed - this is a bug where immutable databases fail to show a useful error if you write to them with a canned query.

Steps to reproduce:

echo '
databases:
  writable:
    queries:
      add_name:
        sql: insert into names(name) values (:name)
        write: true
' > write-metadata.yml
echo '{"name": "Simon"}' | sqlite-utils insert writable.db names -
datasette writable.db -m write-metadata.yml

Then visit http://127.0.0.1:8001/writable/add_name - adding names works.

Now do this instead:

datasette -i writable.db -m write-metadata.yml

And I'm getting a broken error instead.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Writable canned queries fail with useless non-error against immutable databases 1218133366  
1111699175 https://github.com/simonw/datasette/issues/1727#issuecomment-1111699175 https://api.github.com/repos/simonw/datasette/issues/1727 IC_kwDOBm6k_c5CQy7n simonw 9599 2022-04-28T03:19:48Z 2022-04-28T03:20:08Z OWNER

I ran py-spy and then hammered refresh a bunch of times on the http://127.0.0.1:8856/github/commits?_facet=repo&_facet=committer&_trace=1&_noparallel= page - it generated this SVG profile for me.

The area on the right is the threads running the DB queries:

Interactive version here: https://static.simonwillison.net/static/2022/datasette-parallel-profile.svg

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Research: demonstrate if parallel SQL queries are worthwhile 1217759117  
1111698307 https://github.com/simonw/datasette/issues/1728#issuecomment-1111698307 https://api.github.com/repos/simonw/datasette/issues/1728 IC_kwDOBm6k_c5CQyuD simonw 9599 2022-04-28T03:18:02Z 2022-04-28T03:18:02Z OWNER

If the behaviour you are seeing is because the database is running in immutable mode then that's a bug - you should get a useful error message instead!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Writable canned queries fail with useless non-error against immutable databases 1218133366  
1111697985 https://github.com/simonw/datasette/issues/1728#issuecomment-1111697985 https://api.github.com/repos/simonw/datasette/issues/1728 IC_kwDOBm6k_c5CQypB simonw 9599 2022-04-28T03:17:20Z 2022-04-28T03:17:20Z OWNER

How did you deploy to Cloud Run?

datasette publish cloudrun defaults to running databases there in -i immutable mode, because if you managed to change a file on disk on Cloud Run those changes would be lost the next time your container restarted there.

That's why I upgraded datasette-publish-fly to provide a way of working with their volumes support - they're the best option I know of right now for running Datasette in a container with a persistent volume that can accept writes: https://simonwillison.net/2022/Feb/15/fly-volumes/

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Writable canned queries fail with useless non-error against immutable databases 1218133366  
1111683539 https://github.com/simonw/datasette/issues/1727#issuecomment-1111683539 https://api.github.com/repos/simonw/datasette/issues/1727 IC_kwDOBm6k_c5CQvHT simonw 9599 2022-04-28T02:47:57Z 2022-04-28T02:47:57Z OWNER

Maybe this is the Python GIL after all?

I've been hoping that the GIL won't be an issue because the sqlite3 module releases the GIL for the duration of the execution of a SQL query - see https://github.com/python/cpython/blob/f348154c8f8a9c254503306c59d6779d4d09b3a9/Modules/_sqlite/cursor.c#L749-L759

So I've been hoping this means that SQLite code itself can run concurrently on multiple cores even when Python threads cannot.

But maybe I'm misunderstanding how that works?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Research: demonstrate if parallel SQL queries are worthwhile 1217759117  
1111681513 https://github.com/simonw/datasette/issues/1727#issuecomment-1111681513 https://api.github.com/repos/simonw/datasette/issues/1727 IC_kwDOBm6k_c5CQunp simonw 9599 2022-04-28T02:44:26Z 2022-04-28T02:44:26Z OWNER

I could try py-spy top, which I previously used here:
- https://github.com/simonw/datasette/issues/1673

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Research: demonstrate if parallel SQL queries are worthwhile 1217759117  
1111661331 https://github.com/simonw/datasette/issues/1727#issuecomment-1111661331 https://api.github.com/repos/simonw/datasette/issues/1727 IC_kwDOBm6k_c5CQpsT simonw 9599 2022-04-28T02:07:31Z 2022-04-28T02:07:31Z OWNER

Asked on the SQLite forum about this here: https://sqlite.org/forum/forumpost/ffbfa9f38e

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Research: demonstrate if parallel SQL queries are worthwhile 1217759117  
1111602802 https://github.com/simonw/datasette/issues/1727#issuecomment-1111602802 https://api.github.com/repos/simonw/datasette/issues/1727 IC_kwDOBm6k_c5CQbZy simonw 9599 2022-04-28T00:21:35Z 2022-04-28T00:21:35Z OWNER

Tried this but I'm getting back an empty JSON array of traces at the bottom of the page most of the time (intermittently it works correctly):

diff --git a/datasette/database.py b/datasette/database.py
index ba594a8..d7f9172 100644
--- a/datasette/database.py
+++ b/datasette/database.py
@@ -7,7 +7,7 @@ import sys
 import threading
 import uuid

-from .tracer import trace
+from .tracer import trace, trace_child_tasks
 from .utils import (
     detect_fts,
     detect_primary_keys,
@@ -207,30 +207,31 @@ class Database:
                 time_limit_ms = custom_time_limit

             with sqlite_timelimit(conn, time_limit_ms):
-                try:
-                    cursor = conn.cursor()
-                    cursor.execute(sql, params if params is not None else {})
-                    max_returned_rows = self.ds.max_returned_rows
-                    if max_returned_rows == page_size:
-                        max_returned_rows += 1
-                    if max_returned_rows and truncate:
-                        rows = cursor.fetchmany(max_returned_rows + 1)
-                        truncated = len(rows) > max_returned_rows
-                        rows = rows[:max_returned_rows]
-                    else:
-                        rows = cursor.fetchall()
-                        truncated = False
-                except (sqlite3.OperationalError, sqlite3.DatabaseError) as e:
-                    if e.args == ("interrupted",):
-                        raise QueryInterrupted(e, sql, params)
-                    if log_sql_errors:
-                        sys.stderr.write(
-                            "ERROR: conn={}, sql = {}, params = {}: {}\n".format(
-                                conn, repr(sql), params, e
+                with trace("sql", database=self.name, sql=sql.strip(), params=params):
+                    try:
+                        cursor = conn.cursor()
+                        cursor.execute(sql, params if params is not None else {})
+                        max_returned_rows = self.ds.max_returned_rows
+                        if max_returned_rows == page_size:
+                            max_returned_rows += 1
+                        if max_returned_rows and truncate:
+                            rows = cursor.fetchmany(max_returned_rows + 1)
+                            truncated = len(rows) > max_returned_rows
+                            rows = rows[:max_returned_rows]
+                        else:
+                            rows = cursor.fetchall()
+                            truncated = False
+                    except (sqlite3.OperationalError, sqlite3.DatabaseError) as e:
+                        if e.args == ("interrupted",):
+                            raise QueryInterrupted(e, sql, params)
+                        if log_sql_errors:
+                            sys.stderr.write(
+                                "ERROR: conn={}, sql = {}, params = {}: {}\n".format(
+                                    conn, repr(sql), params, e
+                                )
                             )
-                        )
-                        sys.stderr.flush()
-                    raise
+                            sys.stderr.flush()
+                        raise

             if truncate:
                 return Results(rows, truncated, cursor.description)
@@ -238,9 +239,8 @@ class Database:
             else:
                 return Results(rows, False, cursor.description)

-        with trace("sql", database=self.name, sql=sql.strip(), params=params):
-            results = await self.execute_fn(sql_operation_in_thread)
-        return results
+        with trace_child_tasks():
+            return await self.execute_fn(sql_operation_in_thread)

     @property
     def size(self):
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Research: demonstrate if parallel SQL queries are worthwhile 1217759117  
1111597176 https://github.com/simonw/datasette/issues/1727#issuecomment-1111597176 https://api.github.com/repos/simonw/datasette/issues/1727 IC_kwDOBm6k_c5CQaB4 simonw 9599 2022-04-28T00:11:44Z 2022-04-28T00:11:44Z OWNER

Though it would be interesting to also have the trace reveal how much time is spent in the functions that wrap that core SQL - the stuff that is being measured at the moment.

I have a hunch that this could help solve the over-arching performance mystery.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Research: demonstrate if parallel SQL queries are worthwhile 1217759117  
1111595319 https://github.com/simonw/datasette/issues/1727#issuecomment-1111595319 https://api.github.com/repos/simonw/datasette/issues/1727 IC_kwDOBm6k_c5CQZk3 simonw 9599 2022-04-28T00:09:45Z 2022-04-28T00:11:01Z OWNER

Here's where read queries are instrumented: https://github.com/simonw/datasette/blob/7a6654a253dee243518dc542ce4c06dbb0d0801d/datasette/database.py#L241-L242

So the instrumentation is actually capturing quite a bit of Python activity before it gets to SQLite:

https://github.com/simonw/datasette/blob/7a6654a253dee243518dc542ce4c06dbb0d0801d/datasette/database.py#L179-L190

And then:

https://github.com/simonw/datasette/blob/7a6654a253dee243518dc542ce4c06dbb0d0801d/datasette/database.py#L204-L233

Ideally I'd like that trace() block to wrap just the cursor.execute() and cursor.fetchmany(...) or cursor.fetchall() calls.
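
To show just the placement, a runnable sketch with a stand-in trace() - the real one lives in datasette.tracer:

import sqlite3
import time
from contextlib import contextmanager

@contextmanager
def trace(label, **info):
    # stand-in for datasette.tracer.trace, purely for illustration
    start = time.perf_counter()
    yield
    print(label, info, "%.2fms" % ((time.perf_counter() - start) * 1000))

conn = sqlite3.connect(":memory:")
sql = "select 1"
cursor = conn.cursor()
with trace("sql", sql=sql.strip()):   # wraps only the SQLite calls
    cursor.execute(sql)
    rows = cursor.fetchmany(101)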

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Research: demonstrate if parallel SQL queries are worthwhile 1217759117  
1111558204 https://github.com/simonw/datasette/issues/1727#issuecomment-1111558204 https://api.github.com/repos/simonw/datasette/issues/1727 IC_kwDOBm6k_c5CQQg8 simonw 9599 2022-04-27T22:58:39Z 2022-04-27T22:58:39Z OWNER

I should check my timing mechanism. Am I capturing the time taken just in SQLite or does it include time spent in Python crossing between async and threaded world and waiting for a thread pool worker to become available?

That could explain the longer query times.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Research: demonstrate if parallel SQL queries are worthwhile 1217759117  
1111553029 https://github.com/simonw/datasette/issues/1727#issuecomment-1111553029 https://api.github.com/repos/simonw/datasette/issues/1727 IC_kwDOBm6k_c5CQPQF simonw 9599 2022-04-27T22:48:21Z 2022-04-27T22:48:21Z OWNER

I wonder if it would be worth exploring multiprocessing here.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Research: demonstrate if parallel SQL queries are worthwhile 1217759117  

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
, [performed_via_github_app] TEXT);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);