181 rows where author_association = "CONTRIBUTOR" sorted by updated_at descending

View and edit SQL

Suggested facets: reactions, created_at (date), updated_at (date)

author_association

  • CONTRIBUTOR · 181
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions issue performed_via_github_app
754619930 https://github.com/simonw/datasette/issues/1167#issuecomment-754619930 https://api.github.com/repos/simonw/datasette/issues/1167 MDEyOklzc3VlQ29tbWVudDc1NDYxOTkzMA== benpickles 3637 2021-01-05T12:57:57Z 2021-01-05T12:57:57Z CONTRIBUTOR

Not sure where exactly to put the actual docs (presumably somewhere in docs/contributing.rst) but I've made a slight change to make it easier to run locally (copying the approach in excalidraw): https://github.com/simonw/datasette/compare/main...benpickles:prettier-docs

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add Prettier to contributing documentation 777145954  
647922203 https://github.com/simonw/datasette/issues/859#issuecomment-647922203 https://api.github.com/repos/simonw/datasette/issues/859 MDEyOklzc3VlQ29tbWVudDY0NzkyMjIwMw== abdusco 3243482 2020-06-23T05:44:58Z 2021-01-05T08:22:43Z CONTRIBUTOR

I'm seeing the problem on database page. Index page and table page runs quite fast.

  • Tables have <10 columns (id, url, title, body_html, date, author, meta (for keeping unstructured json)). I've added index on date columns (using sqlite-utils) in addition to the index present on id columns.
  • All tables have FTS enabled on text and varchar columns (title, body_html etc) to speed up searching.
  • There are couple of tables related with foreign keys (think a thread in a forum and posts in that thread, related with thread_id)
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Database page loads too slowly with many large tables (due to table counts) 642572841  
754007242 https://github.com/simonw/datasette/issues/1169#issuecomment-754007242 https://api.github.com/repos/simonw/datasette/issues/1169 MDEyOklzc3VlQ29tbWVudDc1NDAwNzI0Mg== benpickles 3637 2021-01-04T14:29:57Z 2021-01-04T14:29:57Z CONTRIBUTOR

I somewhat share your reluctance to add a package.json to seemingly every project out there but ultimately if they're project dependencies it's important they're managed within the codebase.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Prettier package not actually being cached 777677671  
754004715 https://github.com/simonw/datasette/pull/1170#issuecomment-754004715 https://api.github.com/repos/simonw/datasette/issues/1170 MDEyOklzc3VlQ29tbWVudDc1NDAwNDcxNQ== benpickles 3637 2021-01-04T14:25:44Z 2021-01-04T14:25:44Z CONTRIBUTOR

I was going to re-add the filter to only run Prettier when there have been changes in datasette/static but that would mean it wouldn't run when the package is updated. That plus the fact that the last run of the job took only 8 seconds is why I decided not to re-add the filter.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Install Prettier via package.json 778126516  
753531657 https://github.com/simonw/datasette/issues/1012#issuecomment-753531657 https://api.github.com/repos/simonw/datasette/issues/1012 MDEyOklzc3VlQ29tbWVudDc1MzUzMTY1Nw== bollwyvl 45380 2021-01-02T21:25:36Z 2021-01-02T21:25:36Z CONTRIBUTOR

Actually, on more research, I found out this is handled by the trove-classifiers package now, so it's just a one-liner pr instead of fire-up-a-docker-container-and-do-some-migrations

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
For 1.0 update trove classifier in setup.py 718540751  
752098906 https://github.com/simonw/datasette/issues/417#issuecomment-752098906 https://api.github.com/repos/simonw/datasette/issues/417 MDEyOklzc3VlQ29tbWVudDc1MjA5ODkwNg== psychemedia 82988 2020-12-29T14:34:30Z 2020-12-29T14:34:50Z CONTRIBUTOR

FWIW, I had a look at watchdog for a datasette powered Jupyter notebook search tool: https://github.com/ouseful-testing/nbsearch/blob/main/nbsearch/nbwatchdog.py

Not a production thing, just an experiment trying to explore what might be possible...

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Datasette Library 421546944  
750389683 https://github.com/simonw/datasette/pull/1158#issuecomment-750389683 https://api.github.com/repos/simonw/datasette/issues/1158 MDEyOklzc3VlQ29tbWVudDc1MDM4OTY4Mw== eumiro 6774676 2020-12-23T17:02:50Z 2020-12-23T17:02:50Z CONTRIBUTOR

The dict/set suggestion comes from pyupgrade --py36-plus, but then had to black the change.

The rest comes from PyCharm's Inspect code function. I reviewed all the suggestions and fixed a thing or two, such as leading/trailing spaces in the docstrings or turned around the chained conditions.

Then I tried to convert all os.path/glob/open to Path, but there were some local test issues, so I'll have to start over in smaller chunks if you want to have that too.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Modernize code to Python 3.6+ 773913793  
748562330 https://github.com/dogsheep/dogsheep-photos/pull/31#issuecomment-748562330 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/31 MDEyOklzc3VlQ29tbWVudDc0ODU2MjMzMA== RhetTbull 41546558 2020-12-20T04:45:08Z 2020-12-20T04:45:08Z CONTRIBUTOR
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Update for Big Sur 771511344  
748562288 https://github.com/dogsheep/dogsheep-photos/issues/15#issuecomment-748562288 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/15 MDEyOklzc3VlQ29tbWVudDc0ODU2MjI4OA== RhetTbull 41546558 2020-12-20T04:44:22Z 2020-12-20T04:44:22Z CONTRIBUTOR

@nickvazz @simonw I opened a PR that replaces the SQL for ZCOMPUTEDASSETATTRIBUTES to use osxphotos which now exposes all this data and has been updated for Big Sur. I did regression tests to confirm the extracted data is identical, with one exception which should not affect operation: the old code pulled data from ZCOMPUTEDASSETATTRIBUTES for missing photos while the main loop ignores missing photos and does not add them to apple_photos. The new code does not add rows to the apple_photos_scores table for missing photos.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Expose scores from ZCOMPUTEDASSETATTRIBUTES 612151767  
748436779 https://github.com/dogsheep/dogsheep-photos/issues/15#issuecomment-748436779 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/15 MDEyOklzc3VlQ29tbWVudDc0ODQzNjc3OQ== RhetTbull 41546558 2020-12-19T07:49:00Z 2020-12-19T07:49:00Z CONTRIBUTOR

@nickvazz ZGENERICASSET changed to ZASSET in Big Sur. Here's a list of other changes to the schema in Big Sur: https://github.com/RhetTbull/osxphotos/wiki/Changes-in-Photos-6---Big-Sur

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Expose scores from ZCOMPUTEDASSETATTRIBUTES 612151767  
748305976 https://github.com/simonw/datasette/issues/493#issuecomment-748305976 https://api.github.com/repos/simonw/datasette/issues/493 MDEyOklzc3VlQ29tbWVudDc0ODMwNTk3Ng== jefftriplett 50527 2020-12-18T20:34:39Z 2020-12-18T20:34:39Z CONTRIBUTOR

I can't keep up with the renaming contexts, but I like having the ability to run datasette+ datasette-ripgrep against different configs:

datasette serve --metadata=./metadata.json

I have one for all of my code and one per client who has lots of code. So as long as I can point to datasette to something, it's easy to work with.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Rename metadata.json to config.json 449886319  
738907852 https://github.com/simonw/datasette/pull/1130#issuecomment-738907852 https://api.github.com/repos/simonw/datasette/issues/1130 MDEyOklzc3VlQ29tbWVudDczODkwNzg1Mg== abdusco 3243482 2020-12-04T17:22:29Z 2020-12-04T17:31:25Z CONTRIBUTOR

EDIT: I misunderstood the problem. This seems like a fix better suited for Safari. But I don't have any Apple device to test it.

body {
  min-height: 100vh;
  min-height: -webkit-fill-available;
}
html {
  height: -webkit-fill-available;
}

https://css-tricks.com/css-fix-for-100vh-in-mobile-webkit/


It's actually not that difficult to fix.
Well, this is actually a workaround to keep viewport in place.

I usually put a transition (forgot to do it here) that keeps page from resizing.

.container {
  min-height: 100vh;
  transition: height 10000s steps(0);
}

steps() function prevents excessive layout calculations, and lets the page snap back into place (10000s ~= 3h later) in a single step.
This fix also prevents page from jumping around when the keyboard pops up and down.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Fix footer not sticking to bottom in short pages 756876238  
736322290 https://github.com/simonw/datasette/issues/1111#issuecomment-736322290 https://api.github.com/repos/simonw/datasette/issues/1111 MDEyOklzc3VlQ29tbWVudDczNjMyMjI5MA== abdusco 3243482 2020-12-01T08:54:47Z 2020-12-01T08:54:47Z CONTRIBUTOR

Somewhat related: https://github.com/simonw/datasette/issues/859
I fixed the issue with forking and disabling the counts for hidden tables.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Accessing a database's `.json` is slow for very large SQLite files 751195017  
735281577 https://github.com/simonw/datasette/issues/493#issuecomment-735281577 https://api.github.com/repos/simonw/datasette/issues/493 MDEyOklzc3VlQ29tbWVudDczNTI4MTU3Nw== jefftriplett 50527 2020-11-28T19:39:53Z 2020-11-28T19:39:53Z CONTRIBUTOR

I was confused by --config and I tried passing the json from datasette-ripgrep into config.json just as a wild guess.

A short term solution might be pointing out in plugins that their snippet json can go in metadata.json at least makes it easier to search for config options or to know where to start if someone is new.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Rename metadata.json to config.json 449886319  
735279355 https://github.com/simonw/datasette/pull/1112#issuecomment-735279355 https://api.github.com/repos/simonw/datasette/issues/1112 MDEyOklzc3VlQ29tbWVudDczNTI3OTM1NQ== jefftriplett 50527 2020-11-28T19:21:09Z 2020-11-28T19:21:09Z CONTRIBUTOR

(Even more annoying is that I see my editor leaked an extra delete space at the end of the line. I'm happy to rebuild this to be less annoying, but you probably don't want the changelog update either way)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Fix --metadata doc usage 752749485  
720354227 https://github.com/simonw/datasette/issues/838#issuecomment-720354227 https://api.github.com/repos/simonw/datasette/issues/838 MDEyOklzc3VlQ29tbWVudDcyMDM1NDIyNw== psychemedia 82988 2020-11-02T09:33:58Z 2020-11-02T09:33:58Z CONTRIBUTOR

Thanks; just a note that the datasette.urls.static(path) and datasette.urls.static_plugins(plugin_name, path) items both seem to be repeated and appear in the docs twice?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Incorrect URLs when served behind a proxy with base_url set 637395097  
718528252 https://github.com/simonw/datasette/pull/1049#issuecomment-718528252 https://api.github.com/repos/simonw/datasette/issues/1049 MDEyOklzc3VlQ29tbWVudDcxODUyODI1Mg== psychemedia 82988 2020-10-29T09:20:34Z 2020-10-29T09:20:34Z CONTRIBUTOR

That workaround is probably fine. I was trying to work out whether there might be other situations where a pre-external package load might be useful but couldn't offhand bring any other examples to mind. The static plugins option also looks interesting.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add template block prior to extra URL loaders 729017519  
717359145 https://github.com/simonw/sqlite-utils/pull/189#issuecomment-717359145 https://api.github.com/repos/simonw/sqlite-utils/issues/189 MDEyOklzc3VlQ29tbWVudDcxNzM1OTE0NQ== adamwolf 35681 2020-10-27T16:20:32Z 2020-10-27T16:20:32Z CONTRIBUTOR

No problem. I added a test. Let me know if it looks sufficient or if you want me to to tweak something!

If you don't mind, would you tag this PR as "hacktoberfest-accepted"? If you do mind, no problem and I'm sorry for asking :) My kiddos like the shirts.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Allow iterables other than Lists in m2m records 729818242  
716237524 https://github.com/simonw/datasette/pull/1043#issuecomment-716237524 https://api.github.com/repos/simonw/datasette/issues/1043 MDEyOklzc3VlQ29tbWVudDcxNjIzNzUyNA== bollwyvl 45380 2020-10-26T00:14:57Z 2020-10-26T00:14:57Z CONTRIBUTOR

Sorry, I was out of the loop this weekend. The missing sdists were in some the datasette-* plugins... i'll capture my findings more concretely in one spot when i have a chance...

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Include LICENSE in sdist 727915394  
716123598 https://github.com/simonw/datasette/issues/838#issuecomment-716123598 https://api.github.com/repos/simonw/datasette/issues/838 MDEyOklzc3VlQ29tbWVudDcxNjEyMzU5OA== psychemedia 82988 2020-10-25T10:20:12Z 2020-10-25T10:53:24Z CONTRIBUTOR

I'm trying to run something behind a MyBinder proxy, but seem to have something set up incorrectly and not sure what the fix is?

I'm starting datasette with jupyter-server-proxy setup:

# __init__.py
def setup_nbsearch():

    return {
        "command": [
            "datasette",
            "serve",
            f"{_NBSEARCH_DB_PATH}",
            "-p",
            "{port}",
            "--config",
            "base_url:{base_url}nbsearch/"
        ],
        "absolute_url": True,
        # The following needs a the labextension installing.
        # eg in postBuild: jupyter labextension install jupyterlab-server-proxy
        "launcher_entry": {
            "enabled": True,
            "title": "nbsearch",
        },
    }

where the base_url gets automatically populated by the server-proxy. I define the loaders as:

# __init__.py
from datasette import hookimpl

@hookimpl
def extra_css_urls(database, table, columns, view_name, datasette):
    return [
        "/-/static-plugins/nbsearch/prism.css",
        "/-/static-plugins/nbsearch/nbsearch.css",
    ]

but these seem to also need a base_url prefix set somehow?

Currently, the generated HTML loads properly but internal links are incorrect; eg they take the form <link rel="stylesheet" href="/-/static-plugins/nbsearch/prism.css"> which resolves to eg https://notebooks.gesis.org/hub/-/static-plugins/nbsearch/prism.css rather than required URL of form https://notebooks.gesis.org/binder/jupyter/user/ouseful-testing-nbsearch-0fx1mx67/nbsearch/-/static-plugins/nbsearch/prism.css.

The main css is loaded correctly: <link rel="stylesheet" href="/binder/jupyter/user/ouseful-testing-nbsearch-0fx1mx67/nbsearch/-/static/app.css?404439">

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Incorrect URLs when served behind a proxy with base_url set 637395097  
716066000 https://github.com/simonw/datasette/issues/1033#issuecomment-716066000 https://api.github.com/repos/simonw/datasette/issues/1033 MDEyOklzc3VlQ29tbWVudDcxNjA2NjAwMA== psychemedia 82988 2020-10-24T22:58:33Z 2020-10-24T22:58:33Z CONTRIBUTOR

From the docs, I note:

datasette.urls.instance()
Returns the URL to the Datasette instance root page. This is usually "/"

What about the proxy case? Eg if I am using jupyter-server-proxy on a MyBinder or local Jupyter notebook server site, https://example.com:PORT/weirdpath/datasette, what does datasette.urls.instance() refer to?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
datasette.urls.static_plugins(...) method 725099777  
714908859 https://github.com/simonw/datasette/issues/1012#issuecomment-714908859 https://api.github.com/repos/simonw/datasette/issues/1012 MDEyOklzc3VlQ29tbWVudDcxNDkwODg1OQ== bollwyvl 45380 2020-10-23T04:49:20Z 2020-10-23T04:49:20Z CONTRIBUTOR

Good luck on 1.0! It may also be worth lobbying for a Framework::Datasette::1.0 classifier. This would be a nice way to allow the ecosystem to self-document a bit more discoverably.

I was surprised to see the PR for Framework::Jupyter is a... database migration! Of course, there may be more workflow to it!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
For 1.0 update trove classifier in setup.py 718540751  
714657366 https://github.com/simonw/datasette/issues/1033#issuecomment-714657366 https://api.github.com/repos/simonw/datasette/issues/1033 MDEyOklzc3VlQ29tbWVudDcxNDY1NzM2Ng== psychemedia 82988 2020-10-22T17:51:29Z 2020-10-22T17:51:29Z CONTRIBUTOR

How does /-/static relate to current guidance docs around static regarding the --static option and metadata formulations such as "extra_js_urls": [ "/static/app.js"] (I've not managed to get this to work in a Jupyter server proxied set up; the datasette / jupyter server proxy repo may provide a useful test example, eg via MyBinder, for folk to crib from?)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
datasette.urls.static_plugins(...) method 725099777  
708520800 https://github.com/simonw/datasette/issues/1019#issuecomment-708520800 https://api.github.com/repos/simonw/datasette/issues/1019 MDEyOklzc3VlQ29tbWVudDcwODUyMDgwMA== jsfenfen 639012 2020-10-14T16:37:19Z 2020-10-14T16:37:19Z CONTRIBUTOR

🎉 Thanks so much @simonw ! 🎉

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
"Edit SQL" button on canned queries 721050815  
707326192 https://github.com/dogsheep/swarm-to-sqlite/pull/10#issuecomment-707326192 https://api.github.com/repos/dogsheep/swarm-to-sqlite/issues/10 MDEyOklzc3VlQ29tbWVudDcwNzMyNjE5Mg== mattiaborsoi 29426418 2020-10-12T20:20:02Z 2020-10-12T20:20:02Z CONTRIBUTOR

This closes issue #8

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Update utils.py to fix sqlite3.OperationalError 719637258  
704503719 https://github.com/dogsheep/github-to-sqlite/pull/48#issuecomment-704503719 https://api.github.com/repos/dogsheep/github-to-sqlite/issues/48 MDEyOklzc3VlQ29tbWVudDcwNDUwMzcxOQ== adamjonas 755825 2020-10-06T19:26:59Z 2020-10-06T19:26:59Z CONTRIBUTOR

ref #46

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add pull requests 681228542  
688573964 https://github.com/simonw/sqlite-utils/pull/146#issuecomment-688573964 https://api.github.com/repos/simonw/sqlite-utils/issues/146 MDEyOklzc3VlQ29tbWVudDY4ODU3Mzk2NA== simonwiles 96218 2020-09-08T01:55:07Z 2020-09-08T01:55:07Z CONTRIBUTOR

Okay, I've rewritten this PR to preserve the batching behaviour but still fix #145, and rebased the branch to account for the db.execute() api change. It's not terribly sophisticated -- if it attempts to insert a batch which has too many variables, the exception is caught, the batch is split in two and each half is inserted separately, and then it carries on as before with the same batch_size. In the edge case where this gets triggered, subsequent batches will all be inserted in two groups too if they continue to have the same number of columns (which is presumably reasonably likely). Do you reckon this is acceptable when set against the awkwardness of recalculating the batch_size on the fly?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Handle case where subsequent records (after first batch) include extra columns 688668680  
688481317 https://github.com/simonw/sqlite-utils/pull/146#issuecomment-688481317 https://api.github.com/repos/simonw/sqlite-utils/issues/146 MDEyOklzc3VlQ29tbWVudDY4ODQ4MTMxNw== simonwiles 96218 2020-09-07T19:18:55Z 2020-09-07T19:18:55Z CONTRIBUTOR

Just force-pushed to update d042f9c with more formatting changes to satisfy black==20.8b1 and pass the GitHub Actions "Test" workflow.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Handle case where subsequent records (after first batch) include extra columns 688668680  
688479163 https://github.com/simonw/sqlite-utils/pull/146#issuecomment-688479163 https://api.github.com/repos/simonw/sqlite-utils/issues/146 MDEyOklzc3VlQ29tbWVudDY4ODQ3OTE2Mw== simonwiles 96218 2020-09-07T19:10:33Z 2020-09-07T19:11:57Z CONTRIBUTOR

@simonw -- I've gone ahead updated the documentation to reflect the changes introduced in this PR. IMO it's ready to merge now.

In writing the documentation changes, I begin to wonder about the value and role of batch_size at all, tbh. May I assume it was originally intended to prevent using the entire row set to determine columns and column types, and that this was a performance consideration? If so, this PR entirely undermines its purpose. I've been passing in excess of 500,000 rows at a time to insert_all() with these changes and although I'm sure the performance difference is measurable it's not really noticeable; given #145, I don't know that any performance advantages outweigh the problems doing it this way removes. What do you think about just dropping the argument and defaulting to the maximum batch_size permissible given SQLITE_MAX_VARS? Are there other reasons one might want to restrict batch_size that I've overlooked? I could open a new issue to discuss/implement this.

Of course the documentation will need to change again too if/when something is done about #147.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Handle case where subsequent records (after first batch) include extra columns 688668680  
686061028 https://github.com/simonw/datasette/pull/952#issuecomment-686061028 https://api.github.com/repos/simonw/datasette/issues/952 MDEyOklzc3VlQ29tbWVudDY4NjA2MTAyOA== dependabot-preview[bot] 27856297 2020-09-02T22:26:14Z 2020-09-02T22:26:14Z CONTRIBUTOR

Looks like black is up-to-date now, so this is no longer needed.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Update black requirement from ~=19.10b0 to >=19.10,<21.0 687245650  
683382252 https://github.com/simonw/sqlite-utils/issues/145#issuecomment-683382252 https://api.github.com/repos/simonw/sqlite-utils/issues/145 MDEyOklzc3VlQ29tbWVudDY4MzM4MjI1Mg== simonwiles 96218 2020-08-30T06:27:25Z 2020-08-30T06:27:52Z CONTRIBUTOR

Note: had to adjust the test above because trying to exhaust a SQLITE_MAX_VARIABLE_NUMBER of 250000 in 99 records requires 2526 columns, and trips the "Rows can have a maximum of {} columns".format(SQLITE_MAX_VARS) check even before it trips the default SQLITE_MAX_COLUMN value (2000).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Bug when first record contains fewer columns than subsequent records 688659182  
682815377 https://github.com/simonw/sqlite-utils/issues/139#issuecomment-682815377 https://api.github.com/repos/simonw/sqlite-utils/issues/139 MDEyOklzc3VlQ29tbWVudDY4MjgxNTM3Nw== simonwiles 96218 2020-08-28T16:14:58Z 2020-08-28T16:14:58Z CONTRIBUTOR

Thanks! And yeah, I had updating the docs on my list too :) Will try to get to it this afternoon (budgeting time is fraught with uncertainty at the moment!).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
insert_all(..., alter=True) should work for new columns introduced after the first 100 records 686978131  
682182178 https://github.com/simonw/sqlite-utils/issues/139#issuecomment-682182178 https://api.github.com/repos/simonw/sqlite-utils/issues/139 MDEyOklzc3VlQ29tbWVudDY4MjE4MjE3OA== simonwiles 96218 2020-08-27T20:46:18Z 2020-08-27T20:46:18Z CONTRIBUTOR

I tried changing the batch_size argument to the total number of records, but it seems only to effect the number of rows that are committed at a time, and has no influence on this problem.

So the reason for this is that the batch_size for import is limited (of necessity) here: https://github.com/simonw/sqlite-utils/blob/main/sqlite_utils/db.py#L1048

With regard to the issue of ignoring columns, however, I made a fork and hacked a temporary fix that looks like this:
https://github.com/simonwiles/sqlite-utils/commit/3901f43c6a712a1a3efc340b5b8d8fd0cbe8ee63

It doesn't seem to affect performance enormously (but I've not tested it thoroughly), and it now does what I need (and would expect, tbh), but it now fails the test here:
https://github.com/simonw/sqlite-utils/blob/main/tests/test_create.py#L710-L716

The existence of this test suggests that insert_all() is behaving as intended, of course. It seems odd to me that this would be a desirable default behaviour (let alone the only behaviour), and its not very prominently flagged-up, either.

@simonw is this something you'd be willing to look at a PR for? I assume you wouldn't want to change the default behaviour at this point, but perhaps an option could be provided, or at least a bit more of a warning in the docs. Are there oversights in the implementation that I've made?

Would be grateful for your thoughts! Thanks!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
insert_all(..., alter=True) should work for new columns introduced after the first 100 records 686978131  
661524006 https://github.com/simonw/datasette/issues/456#issuecomment-661524006 https://api.github.com/repos/simonw/datasette/issues/456 MDEyOklzc3VlQ29tbWVudDY2MTUyNDAwNg== abeyerpath 32467826 2020-07-21T01:15:07Z 2020-07-21T01:15:07Z CONTRIBUTOR

Bumping this, as the previous fix is passing the wrong type, and not actually addressing the issue...

The exclude argument needs an iterable of packages instead of a single string (but since str is iterable, it's currently excluding packages t, e, and s.)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Installing installs the tests package 442327592  
655898722 https://github.com/simonw/sqlite-utils/issues/121#issuecomment-655898722 https://api.github.com/repos/simonw/sqlite-utils/issues/121 MDEyOklzc3VlQ29tbWVudDY1NTg5ODcyMg== tsibley 79913 2020-07-09T04:53:08Z 2020-07-09T04:53:08Z CONTRIBUTOR

Yep, I agree that makes more sense for backwards compat and more casual use cases. I think it should be possible for the Database/Queryable methods to DTRT based on seeing if it's within a context-manager-managed transaction.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Improved (and better documented) support for transactions 652961907  
655652679 https://github.com/simonw/sqlite-utils/issues/121#issuecomment-655652679 https://api.github.com/repos/simonw/sqlite-utils/issues/121 MDEyOklzc3VlQ29tbWVudDY1NTY1MjY3OQ== tsibley 79913 2020-07-08T17:24:46Z 2020-07-08T17:24:46Z CONTRIBUTOR

Better transaction handling would be really great. Some of my thoughts on implementing better transaction discipline are in https://github.com/simonw/sqlite-utils/pull/118#issuecomment-655239728.

My preferences:

  • Each CLI command should operate in a single transaction so that either the whole thing succeeds or the whole thing is rolled back. This avoids partially completed operations when an error occurs part way through processing. Partially completed operations are typically much harder to recovery from gracefully and may cause inconsistent data states.

  • The Python API should be transaction-agnostic and rely on the caller to coordinate transactions. Only the caller knows how individual insert, create, update, etc operations/methods should be bundled conceptually into transactions. When the caller is the CLI, for example, that bundling would be at the CLI command-level. Other callers might want to break up operations into multiple transactions. Transactions are usually most useful when controlled at the application-level (like logging configuration) instead of the library level. The library needs to provide an API that's conducive to transaction use, though.

  • The Python API should provide a context manager to provide consistent transactions handling with more useful defaults than Python's sqlite3 module. The latter issues implicit BEGIN statements by default for most DML (INSERT, UPDATE, DELETE, … but not SELECT, I believe), but not DDL (CREATE TABLE, DROP TABLE, CREATE VIEW, …). Notably, the sqlite3 module doesn't issue the implicit BEGIN until the first DML statement. It does not issue it when entering the with conn block, like other DBAPI2-compatible modules do. The with conn block for sqlite3 only arranges to commit or rollback an existing transaction when exiting. Including DDL and SELECTs in transactions is important for operation consistency, though. There are several existing bugs.python.org tickets about this and future changes are in the works, but sqlite-utils can provide its own API sooner. sqlite-utils's Database class could itself be a context manager (built on the sqlite3 connection context manager) which additionally issues an explicit BEGIN when entering. This would then let Python API callers do something like:

db = sqlite_utils.Database(path)

with db: # ← BEGIN issued here by Database.__enter__
    db.insert(…)
    db.create_view(…)
# ← COMMIT/ROLLBACK issue here by sqlite3.connection.__exit__
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Improved (and better documented) support for transactions 652961907  
655643078 https://github.com/simonw/sqlite-utils/pull/118#issuecomment-655643078 https://api.github.com/repos/simonw/sqlite-utils/issues/118 MDEyOklzc3VlQ29tbWVudDY1NTY0MzA3OA== tsibley 79913 2020-07-08T17:05:59Z 2020-07-08T17:05:59Z CONTRIBUTOR

The only thing missing from this PR is updates to the documentation.

Ah, yes, thanks for this reminder! I've repushed with doc bits added.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add insert --truncate option 651844316  
655239728 https://github.com/simonw/sqlite-utils/pull/118#issuecomment-655239728 https://api.github.com/repos/simonw/sqlite-utils/issues/118 MDEyOklzc3VlQ29tbWVudDY1NTIzOTcyOA== tsibley 79913 2020-07-08T02:16:42Z 2020-07-08T02:16:42Z CONTRIBUTOR

I fixed my original oops by moving the DELETE FROM $table out of the chunking loop and repushed. I think this change can be considered in isolation from issues around transactions, which I discuss next.

I wanted to make the DELETE + INSERT happen all in the same transaction so it was robust, but that was more complicated than I expected. The transaction handling in the Database/Table classes isn't systematic, and this poses big hurdles to making Table.insert_all (or other operations) consistent and robust in the face of errors.

For example, I wanted to do this (whitespace ignored in diff, so indentation change not highlighted):

diff --git a/sqlite_utils/db.py b/sqlite_utils/db.py
index d6b9ecf..4107ceb 100644
--- a/sqlite_utils/db.py
+++ b/sqlite_utils/db.py
@@ -1028,6 +1028,11 @@ class Table(Queryable):
         batch_size = max(1, min(batch_size, SQLITE_MAX_VARS // num_columns))
         self.last_rowid = None
         self.last_pk = None
+        with self.db.conn:
+            # Explicit BEGIN is necessary because Python's sqlite3 doesn't
+            # issue implicit BEGINs for DDL, only DML.  We mix DDL and DML
+            # below and might execute DDL first, e.g. for table creation.
+            self.db.conn.execute("BEGIN")
             if truncate and self.exists():
                 self.db.conn.execute("DELETE FROM [{}];".format(self.name))
             for chunk in chunks(itertools.chain([first_record], records), batch_size):
@@ -1038,7 +1043,11 @@ class Table(Queryable):
                         # Use the first batch to derive the table names
                         column_types = suggest_column_types(chunk)
                         column_types.update(columns or {})
-                    self.create(
+                        # Not self.create() because that is wrapped in its own
+                        # transaction and Python's sqlite3 doesn't support
+                        # nested transactions.
+                        self.db.create_table(
+                            self.name,
                             column_types,
                             pk,
                             foreign_keys,
@@ -1139,7 +1148,6 @@ class Table(Queryable):
                     flat_values = list(itertools.chain(*values))
                     queries_and_params = [(sql, flat_values)]

-            with self.db.conn:
                 for query, params in queries_and_params:
                     try:
                         result = self.db.conn.execute(query, params)

but that fails in tests because other methods call insert/upsert/insert_all/upsert_all in the middle of their transactions, so the BEGIN statement throws an error (no nested transactions allowed).

Stepping back, it would be nice to make the transaction handling systematic and predictable. One way to do this is to make the sqlite_utils/db.py code generally not begin or commit any transactions, and require the caller to do that instead. This lets the caller mix and match the Python API calls into transactions as appropriate (which is impossible for the API methods themselves to fully determine). Then, make sqlite_utils/cli.py begin and commit a transaction in each @cli.command function, making each command robust and consistent in the face of errors. The big change here, and why I didn't just submit a patch, is that it dramatically changes the Python API to require callers to begin a transaction rather than just immediately calling methods.

There is also the caveat that for each transaction, an explicit BEGIN is also necessary so that DDL as well as DML (as well as SELECTs) are consistent and rolled back on error. There are several bugs.python.org discussions around this particular problem of DDL and some plans to make it better and consistent with DBAPI2, eventually. In the meantime, the sqlite-utils Database class could be a context manager which supports the incantations necessary to do proper transactions. This would still be a Python API change for callers but wouldn't expose them to the weirdness of the sqlite3's default transaction handling.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add insert --truncate option 651844316  
655052451 https://github.com/simonw/sqlite-utils/pull/118#issuecomment-655052451 https://api.github.com/repos/simonw/sqlite-utils/issues/118 MDEyOklzc3VlQ29tbWVudDY1NTA1MjQ1MQ== tsibley 79913 2020-07-07T18:45:23Z 2020-07-07T18:45:23Z CONTRIBUTOR

Ah, I see the problem. The truncate is inside a loop I didn't realize was there.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add insert --truncate option 651844316  
655018966 https://github.com/simonw/sqlite-utils/pull/118#issuecomment-655018966 https://api.github.com/repos/simonw/sqlite-utils/issues/118 MDEyOklzc3VlQ29tbWVudDY1NTAxODk2Ng== tsibley 79913 2020-07-07T17:41:06Z 2020-07-07T17:41:06Z CONTRIBUTOR

Hmm, while tests pass, this may not work as intended on larger datasets. Looking into it.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add insert --truncate option 651844316  
653002499 https://github.com/simonw/datasette/issues/889#issuecomment-653002499 https://api.github.com/repos/simonw/datasette/issues/889 MDEyOklzc3VlQ29tbWVudDY1MzAwMjQ5OQ== amjith 49260 2020-07-02T13:22:13Z 2020-07-02T13:22:13Z CONTRIBUTOR

I was able to narrow this down to the fact that lifespan protocol is turned on.

I see the workaround you've used here: https://github.com/simonw/datasette-debug-asgi/commit/72d568d32a3159c763ce908c0b269736935c6987

If so, maybe it's time to update some of the asg_wrapper plugins.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
asgi_wrapper plugin hook is crashing at startup 649907676  
652990131 https://github.com/simonw/datasette/issues/889#issuecomment-652990131 https://api.github.com/repos/simonw/datasette/issues/889 MDEyOklzc3VlQ29tbWVudDY1Mjk5MDEzMQ== amjith 49260 2020-07-02T12:58:11Z 2020-07-02T13:00:18Z CONTRIBUTOR

FWIW, this error does NOT happen in datasette 0.45a4.

It only started on 0.45a5

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
asgi_wrapper plugin hook is crashing at startup 649907676  
652394742 https://github.com/simonw/datasette/pull/883#issuecomment-652394742 https://api.github.com/repos/simonw/datasette/issues/883 MDEyOklzc3VlQ29tbWVudDY1MjM5NDc0Mg== abdusco 3243482 2020-07-01T12:41:13Z 2020-07-01T12:41:13Z CONTRIBUTOR

Well tests need to be updated.

I need to get tests working on Windows.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Skip counting hidden tables 648749062  
652297139 https://github.com/simonw/datasette/pull/883#issuecomment-652297139 https://api.github.com/repos/simonw/datasette/issues/883 MDEyOklzc3VlQ29tbWVudDY1MjI5NzEzOQ== abdusco 3243482 2020-07-01T09:11:29Z 2020-07-01T09:11:29Z CONTRIBUTOR

Turns out we should include hidden tables in the result dict, or we're breaking tests. I've committed a refactor https://github.com/simonw/datasette/pull/883/commits/4f06e1bf6fbe4b73be770b87f610bf7c0e6e3ea7

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Skip counting hidden tables 648749062  
652255960 https://github.com/simonw/datasette/issues/877#issuecomment-652255960 https://api.github.com/repos/simonw/datasette/issues/877 MDEyOklzc3VlQ29tbWVudDY1MjI1NTk2MA== abdusco 3243482 2020-07-01T07:52:25Z 2020-07-01T08:10:00Z CONTRIBUTOR

I am calling the API from another origin, so injecting CSRF token into templates wouldn't work.

EDIT:

I'll try the new version, it sounds promising

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Consider dropping explicit CSRF protection entirely? 648421105  
652261382 https://github.com/simonw/datasette/issues/877#issuecomment-652261382 https://api.github.com/repos/simonw/datasette/issues/877 MDEyOklzc3VlQ29tbWVudDY1MjI2MTM4Mg== abdusco 3243482 2020-07-01T08:03:17Z 2020-07-01T08:03:23Z CONTRIBUTOR

Bearer tokens sound interesting. Where do tokens come from? An auth provider of my choosing? How do they get verified?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Consider dropping explicit CSRF protection entirely? 648421105  
652166115 https://github.com/simonw/datasette/issues/877#issuecomment-652166115 https://api.github.com/repos/simonw/datasette/issues/877 MDEyOklzc3VlQ29tbWVudDY1MjE2NjExNQ== abdusco 3243482 2020-07-01T03:28:07Z 2020-07-01T03:28:07Z CONTRIBUTOR

Does this mean custom routes get to expose endpoints accepting POST requests? I've tried earlier to add some POST endpoints, but requests were being rejected by Datasette due to CSRF

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Consider dropping explicit CSRF protection entirely? 648421105  
652160909 https://github.com/simonw/datasette/issues/859#issuecomment-652160909 https://api.github.com/repos/simonw/datasette/issues/859 MDEyOklzc3VlQ29tbWVudDY1MjE2MDkwOQ== abdusco 3243482 2020-07-01T03:09:32Z 2020-07-01T03:10:21Z CONTRIBUTOR

I've just realized Datasette tries to count hidden tables too. There are 5 visible tables, 25 hidden tables, which I haven't realize earlier to consider their effect. I've turned off counting for hidden tables to see if it has any effect.

What's the point of counting FTS tables?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Database page loads too slowly with many large tables (due to table counts) 642572841  
648669523 https://github.com/simonw/datasette/issues/859#issuecomment-648669523 https://api.github.com/repos/simonw/datasette/issues/859 MDEyOklzc3VlQ29tbWVudDY0ODY2OTUyMw== abdusco 3243482 2020-06-24T08:13:23Z 2020-06-24T10:30:36Z CONTRIBUTOR

I tried setting cache_size_kb=0 then cache_size_kb=100000, still getting this behavior. I even changed Database::table_counts and lowered time limit to 1

table_count = (
    await self.execute(
        "select count(*) from [{}]".format(table),
        custom_time_limit=1,
    )
).rows[0][0]
counts[table] = table_count

I feel like 10 seconds is a magic number, like a processing timeout and datasette gives up and returns the page.
Index page loads instantly, table page, query page, as well. But when I return to database page after some time, it loads in 10s.

EDIT:

It's always like 10 + 0.3s, like 10s wait and timeout then 300ms to render the page

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Database page loads too slowly with many large tables (due to table counts) 642572841  
648232645 https://github.com/simonw/datasette/issues/859#issuecomment-648232645 https://api.github.com/repos/simonw/datasette/issues/859 MDEyOklzc3VlQ29tbWVudDY0ODIzMjY0NQ== abdusco 3243482 2020-06-23T15:19:53Z 2020-06-23T15:19:53Z CONTRIBUTOR

The issue seems to appear sporadically, like when I return to database page after a while, during which some records have been added to the database.

I've just visited database, page first visit took ~10s, consecutive visits took 0.3s.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Database page loads too slowly with many large tables (due to table counts) 642572841  
647925594 https://github.com/simonw/datasette/issues/859#issuecomment-647925594 https://api.github.com/repos/simonw/datasette/issues/859 MDEyOklzc3VlQ29tbWVudDY0NzkyNTU5NA== abdusco 3243482 2020-06-23T05:55:21Z 2020-06-23T06:28:29Z CONTRIBUTOR

Hmm, not seeing the problem now.
I've removed the commented out sections in database.py and restarted the process. Database page now loads in <250ms.

I have couple of workers that check some pages regularly and scrape new content and save to the DB. Could it be that datasette tries to recount tables every time database size changes? Normally it keeps a count cache, but as DB gets updated so often (new content every 5 min or so) it's practically recounting every time I go to the database page?

EDIT:
It turns out it doesn't hold cache with mutable databases.

I'll update the issue with more findings and a better way to reproduce the problem if I encounter it again.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Database page loads too slowly with many large tables (due to table counts) 642572841  
647936117 https://github.com/simonw/datasette/issues/859#issuecomment-647936117 https://api.github.com/repos/simonw/datasette/issues/859 MDEyOklzc3VlQ29tbWVudDY0NzkzNjExNw== abdusco 3243482 2020-06-23T06:25:17Z 2020-06-23T06:25:17Z CONTRIBUTOR

sqlite-generate many-cols.db --tables 2 --rows 200000 --columns 50

Looks like that will take 35 minutes to run (it's not a particularly fast tool).

Try chunking write operations into batches every 1000 records or so.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Database page loads too slowly with many large tables (due to table counts) 642572841  
647935300 https://github.com/simonw/datasette/issues/859#issuecomment-647935300 https://api.github.com/repos/simonw/datasette/issues/859 MDEyOklzc3VlQ29tbWVudDY0NzkzNTMwMA== abdusco 3243482 2020-06-23T06:23:01Z 2020-06-23T06:23:01Z CONTRIBUTOR

You said "200k+, 50+ rows in a couple of tables" - does that mean 50+ columns? I'll try with larger numbers of columns and see what difference that makes.

Ah that was a typo, I meant 50k.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Database page loads too slowly with many large tables (due to table counts) 642572841  
647923666 https://github.com/simonw/datasette/issues/859#issuecomment-647923666 https://api.github.com/repos/simonw/datasette/issues/859 MDEyOklzc3VlQ29tbWVudDY0NzkyMzY2Ng== abdusco 3243482 2020-06-23T05:49:31Z 2020-06-23T05:49:31Z CONTRIBUTOR

I think I should mention that having FTS on all tables mean I have 5 visible, 25 hidden (FTS) tables displayed on database page.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Database page loads too slowly with many large tables (due to table counts) 642572841  
647194131 https://github.com/simonw/datasette/issues/859#issuecomment-647194131 https://api.github.com/repos/simonw/datasette/issues/859 MDEyOklzc3VlQ29tbWVudDY0NzE5NDEzMQ== abdusco 3243482 2020-06-21T23:15:54Z 2020-06-21T23:26:09Z CONTRIBUTOR

I'm not sure if table counts are to blame. There shouldn't be a ~3 orders of magnitude difference.

user@klein /a/w/scrapyard (master)> set sql "select count(*) from table_1; select count(*) from table_2; select count(*) from table_3;"
user@klein /a/w/scrapyard (master)> time sqlite3 scrapyard.db "$sql"
187489
46492
2229

________________________________________________________
Executed in   25.57 millis    fish           external
   usr time    3.55 millis    0.00 micros    3.55 millis
   sys time   22.42 millis  1123.00 micros   21.30 millis

but not letting datasette count the tables definitely helps.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Database page loads too slowly with many large tables (due to table counts) 642572841  
647135713 https://github.com/simonw/datasette/issues/859#issuecomment-647135713 https://api.github.com/repos/simonw/datasette/issues/859 MDEyOklzc3VlQ29tbWVudDY0NzEzNTcxMw== abdusco 3243482 2020-06-21T14:30:02Z 2020-06-21T14:30:02Z CONTRIBUTOR

Oops, the same method is called from both index and database pages. But removing select count queries speed up the page load quite a bit.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Database page loads too slowly with many large tables (due to table counts) 642572841  
645293374 https://github.com/simonw/datasette/issues/851#issuecomment-645293374 https://api.github.com/repos/simonw/datasette/issues/851 MDEyOklzc3VlQ29tbWVudDY0NTI5MzM3NA== abdusco 3243482 2020-06-17T10:32:02Z 2020-06-17T10:32:28Z CONTRIBUTOR

Welp, I'm an idiot.

Turns out I had a sneaky comma , after sql key:

... (:name, :url),

which tells sqlite to expect another values(...) list.

Correcting the SQL solved the issue.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Having trouble getting writable canned queries to work 640330278  
643709037 https://github.com/simonw/datasette/issues/691#issuecomment-643709037 https://api.github.com/repos/simonw/datasette/issues/691 MDEyOklzc3VlQ29tbWVudDY0MzcwOTAzNw== amjith 49260 2020-06-14T02:35:16Z 2020-06-14T02:35:16Z CONTRIBUTOR

The server should reload in the config_dir mode.

Ref: #848

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
--reload sould reload server if code in --plugins-dir changes 574021194  
632555800 https://github.com/simonw/datasette/issues/767#issuecomment-632555800 https://api.github.com/repos/simonw/datasette/issues/767 MDEyOklzc3VlQ29tbWVudDYzMjU1NTgwMA== rixx 2657547 2020-05-22T08:00:23Z 2020-05-22T08:00:23Z CONTRIBUTOR

That would be perfect!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Allow to specify a URL fragment for canned queries 620969465  
628405453 https://github.com/dogsheep/dogsheep-photos/issues/22#issuecomment-628405453 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/22 MDEyOklzc3VlQ29tbWVudDYyODQwNTQ1Mw== RhetTbull 41546558 2020-05-14T05:59:53Z 2020-05-14T05:59:53Z CONTRIBUTOR

I've added support for the above exif data to v0.28.17 of osxphotos. PhotoInfo.exif_info will return an ExifInfo dataclass object with the following properties:

    flash_fired: bool
    iso: int
    metering_mode: int
    sample_rate: int
    track_format: int
    white_balance: int
    aperture: float
    bit_rate: float
    duration: float
    exposure_bias: float
    focal_length: float
    fps: float
    latitude: float
    longitude: float
    shutter_speed: float
    camera_make: str
    camera_model: str
    codec: str
    lens_model: str

It's not all the EXIF data available in most files but is the data Photos deems important to save. Of course, you can get all the exif_data

Note: this only works in Photos 5. As best as I can tell, EXIF data is not stored in the database for earlier versions.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Try out ExifReader 615626118  
627007458 https://github.com/dogsheep/dogsheep-photos/issues/22#issuecomment-627007458 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/22 MDEyOklzc3VlQ29tbWVudDYyNzAwNzQ1OA== RhetTbull 41546558 2020-05-11T22:51:52Z 2020-05-11T22:52:26Z CONTRIBUTOR

I'm not familiar with ExifReader. I wrote my own wrapper around exiftool because I wanted a simple way to write EXIF data when exporting photos (e.g. writing out to PersonInImage and keywords to IPTC:Keywords) and the existing python packages like pyexiftool didn't do quite what I wanted. If all you're after is the camera and shot info, that's available in ZEXTENDEDATTRIBUTES table. I've got an open issue #11 to add this to osxphotos but it hasn't bubbled to the top of my backlog yet.

osxphotos will give you the location info: PhotoInfo.location returns a tuple of (lat, lon) though this info is in ZEXTENDEDATTRIBUTES too (though it might not be correct as I believe Photos creates this table at import and the user might have changed the location of a photo, e.g. if camera didn't have GPS).

CREATE TABLE ZEXTENDEDATTRIBUTES (
  Z_PK INTEGER PRIMARY KEY, Z_ENT INTEGER, 
  Z_OPT INTEGER, ZFLASHFIRED INTEGER, 
  ZISO INTEGER, ZMETERINGMODE INTEGER, 
  ZSAMPLERATE INTEGER, ZTRACKFORMAT INTEGER, 
  ZWHITEBALANCE INTEGER, ZASSET INTEGER, 
  ZAPERTURE FLOAT, ZBITRATE FLOAT, ZDURATION FLOAT, 
  ZEXPOSUREBIAS FLOAT, ZFOCALLENGTH FLOAT, 
  ZFPS FLOAT, ZLATITUDE FLOAT, ZLONGITUDE FLOAT, 
  ZSHUTTERSPEED FLOAT, ZCAMERAMAKE VARCHAR, 
  ZCAMERAMODEL VARCHAR, ZCODEC VARCHAR, 
  ZLENSMODEL VARCHAR
);
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Try out ExifReader 615626118  
626667235 https://github.com/dogsheep/dogsheep-photos/issues/22#issuecomment-626667235 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/22 MDEyOklzc3VlQ29tbWVudDYyNjY2NzIzNQ== RhetTbull 41546558 2020-05-11T12:20:34Z 2020-05-11T12:20:34Z CONTRIBUTOR

@simonw FYI, osxphotos includes a built in ExifTool class that uses exiftool to read and write exif data. It's not exposed yet in the docs because I really only use it right now in the osphotos command line interface to write tags when exporting. In v0.28.16 (just pushed) I added an ExifTool.as_dict() method which will give you a dict with all the exif tags in a file. For example:

import osxphotos
photos = osxphotos.PhotosDB().photos()
exiftool = osxphotos.exiftool.ExifTool(photos[0].path)
exifdata = exiftool.as_dict()
tags = exifdata["IPTC:Keywords"]

Not as elegant perhaps as a python only implementation because ExifTool has to make subprocess calls to an external tool but exiftool is by far the best tool available for reading and writing EXIF data and it does support HEIC.

As for implementation, ExifTool uses a singleton pattern so the first time you instantiate it, it spawns an IPC to exiftool but then keeps it open and uses the same process for any subsequent calls (even on different files).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Try out ExifReader 615626118  
626396379 https://github.com/dogsheep/dogsheep-photos/issues/21#issuecomment-626396379 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/21 MDEyOklzc3VlQ29tbWVudDYyNjM5NjM3OQ== RhetTbull 41546558 2020-05-10T22:01:48Z 2020-05-10T22:01:48Z CONTRIBUTOR

Frustrates me when package authors create a "drop in" replacement with the same import name...this kind of thing has bitten me more than once! Would've been nicer I think for bpylist2 to do "import bpylist2 as bpylist"

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
bpylist.archiver.CircularReference: archive has a cycle with uid(13) 615474990  
626395641 https://github.com/dogsheep/dogsheep-photos/issues/21#issuecomment-626395641 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/21 MDEyOklzc3VlQ29tbWVudDYyNjM5NTY0MQ== RhetTbull 41546558 2020-05-10T21:55:54Z 2020-05-10T21:55:54Z CONTRIBUTOR

Did removing old bpylist solve the original problem or do you still have a photo that throws circular reference?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
bpylist.archiver.CircularReference: archive has a cycle with uid(13) 615474990  
626395507 https://github.com/dogsheep/dogsheep-photos/issues/21#issuecomment-626395507 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/21 MDEyOklzc3VlQ29tbWVudDYyNjM5NTUwNw== RhetTbull 41546558 2020-05-10T21:54:45Z 2020-05-10T21:54:45Z CONTRIBUTOR

@simonw does Photos show valid reverse geolocation info? Are you sure you're using bpylist2 and not bpylist? They're both unfortunately imported as "bpylist" so if you somehow got the wrong (original bpylist) version installed, it could be the issue.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
bpylist.archiver.CircularReference: archive has a cycle with uid(13) 615474990  
626390317 https://github.com/dogsheep/dogsheep-photos/issues/21#issuecomment-626390317 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/21 MDEyOklzc3VlQ29tbWVudDYyNjM5MDMxNw== RhetTbull 41546558 2020-05-10T21:11:24Z 2020-05-10T21:50:58Z CONTRIBUTOR

Ugh....Yeah, I think easiest is to catch the exception and return no place as you suggest. This particular bit of code involves un-archiving a serialized NSKeyedArchiver which uses an object table and it is certainly possible to create a circular reference that way. Because this is happening in the decode, the circular reference must be in the original data. Does Photos show valid reverse geolocation info for the photo in question? If so, Photos may be doing something beyond a simple decode of the binary plist. For now, I'll push a patch to catch the exception.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
bpylist.archiver.CircularReference: archive has a cycle with uid(13) 615474990  
624284539 https://github.com/dogsheep/dogsheep-photos/issues/17#issuecomment-624284539 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/17 MDEyOklzc3VlQ29tbWVudDYyNDI4NDUzOQ== RhetTbull 41546558 2020-05-05T20:20:05Z 2020-05-05T20:20:05Z CONTRIBUTOR

FYI, I've got an issue to make osxphotos cross-platform but it's low on my priority list. About 90% of the functionality could be done cross-platform but right now the MacOS specific stuff is embedded throughout and would take some work. Though I try to minimize it, there's sprinklings of ObjC & Applescript throughout osxphotos.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Only install osxphotos if running on macOS 612860531  
623845014 https://github.com/dogsheep/dogsheep-photos/issues/16#issuecomment-623845014 https://api.github.com/repos/dogsheep/dogsheep-photos/issues/16 MDEyOklzc3VlQ29tbWVudDYyMzg0NTAxNA== RhetTbull 41546558 2020-05-05T03:55:14Z 2020-05-05T03:56:24Z CONTRIBUTOR

I'm traveling w/o access to my Mac so can't help with any code right now. I suspected ZSCENEIDENTIFIER was a foreign key into one of these psi.sqlite tables. But looks like you're on to something connecting groups to assets. As for the UUID, I think there's two ints because each is 64-bits but UUIDs are 128-bits. Thus they need to be combined to get the 128 bit UUID. You might be able to use Apple's NSUUID, for example, by wrapping with pyObjC. Here's one example of using this in PyObjC's test suite. Interesting it's stored this way instead of a UUIDString as in Photos.sqlite. Perhaps it for faster indexing.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Import machine-learning detected labels (dog, llama etc) from Apple Photos 612287234  
623463200 https://github.com/simonw/datasette/pull/730#issuecomment-623463200 https://api.github.com/repos/simonw/datasette/issues/730 MDEyOklzc3VlQ29tbWVudDYyMzQ2MzIwMA== dependabot-preview[bot] 27856297 2020-05-04T13:27:22Z 2020-05-04T13:27:22Z CONTRIBUTOR

Superseded by #753.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Update pytest-asyncio requirement from ~=0.10.0 to >=0.10,<0.12 604001627  
622599528 https://github.com/simonw/sqlite-utils/issues/103#issuecomment-622599528 https://api.github.com/repos/simonw/sqlite-utils/issues/103 MDEyOklzc3VlQ29tbWVudDYyMjU5OTUyOA== b0b5h4rp13 32605365 2020-05-01T22:49:12Z 2020-05-02T11:15:44Z CONTRIBUTOR

With SQLITE_MAX_VARS = 999, or even 899, This hits the problem with the batch rows causing a overflow (works fine if SQLITE_MAX_VARS = 799).

p.s. I have tried a few list of dicts to sqlite modules and this was the easiest to use/understand

------------- file begins ------------------
import sqlite_utils as su

data = [
{'tickerId': 913324382, 'exchangeId': 11, 'type': 2, 'secType': 61, 'regionId': 6, 'regionCode': 'US', 'currencyId': 247, 'name': 'CONSTELLATION B', 'symbol': 'STZ B', 'disSymbol': 'STZ-B', 'disExchangeCode': 'NYSE', 'exchangeCode': 'NYSE', 'listStatus': 1, 'template': 'stock', 'status': 'D', 'close': '163.13', 'change': '6.46', 'changeRatio': '0.0412', 'marketValue': '31180699895.63', 'volume': '417', 'turnoverRate': '0.0000'},
{'tickerId': 913323791, 'exchangeId': 11, 'type': 2, 'secType': 61, 'regionId': 6, 'regionCode': 'US', 'currencyId': 247, 'name': 'Molina Health', 'symbol': 'MOH', 'disSymbol': 'MOH', 'disExchangeCode': 'NYSE', 'exchangeCode': 'NYSE', 'listStatus': 1, 'template': 'stock', 'derivativeSupport': 1, 'status': 'D', 'close': '173.25', 'change': '9.28', 'changeRatio': '0.0566', 'pPrice': '173.25', 'pChange': '0.0000', 'pChRatio': '0.0000', 'marketValue': '10520341695.50', 'volume': '1281557', 'turnoverRate': '0.0202'},
{'tickerId': 913257501, 'exchangeId': 96, 'type': 2, 'secType': 61, 'regionId': 6, 'regionCode': 'US', 'currencyId': 247, 'name': 'Seattle Genetics', 'symbol': 'SGEN', 'disSymbol': 'SGEN', 'disExchangeCode': 'NASDAQ', 'exchangeCode': 'NSQ', 'listStatus': 1, 'template': 'stock', 'derivativeSupport': 1, 'status': 'A', 'close': '145.64', 'change': '8.41', 'changeRatio': '0.0613', 'pPrice': '146.45', 'pChange': '0.8100', 'pChRatio': '0.0056', 'marketValue': '25117961347.60', 'volume': '2791411', 'turnoverRate': '0.0162'},
{'tickerId': 925381971, 'exchangeId': 96, 'type': 2, 'secType': 61, 'regionId': 6, 'regionCode': 'US', 'currencyId': 247, 'name': 'Bandwidth', 'symbol': 'BAND', 'disSymbol': 'BAND', 'disExchangeCode': 'NASDAQ', 'exchangeCode': 'NSQ', 'listStatus': 1, 'template': 'stock', 'derivativeSupport': 1, 'status': 'D', 'close': '89.22', 'change': '7.66', 'changeRatio': '0.0939', 'pPrice': '89.00', 'pChange': '-0.2200', 'pChRatio': '-0.0025', 'marketValue': '2100025474.98', 'volume': '1508629', 'turnoverRate': '0.0641'},
{'tickerId': 913323935, 'exchangeId': 96, 'type': 2, 'secType': 61, 'regionId': 6, 'regionCode': 'US', 'currencyId': 247, 'name': 'Magellan Health', 'symbol': 'MGLN', 'disSymbol': 'MGLN', 'disExchangeCode': 'NASDAQ', 'exchangeCode': 'NSQ', 'listStatus': 1, 'template': 'stock', 'derivativeSupport': 1, 'status': 'A', 'close': '68.00', 'change': '7.27', 'changeRatio': '0.1197', 'pPrice': '68.00', 'pChange': '0.0000', 'pChRatio': '0.0000', 'marketValue': '1697894040.00', 'volume': '448919', 'turnoverRate': '0.0180'},
{'tickerId': 913254854, 'exchangeId': 11, 'type': 2, 'secType': 61, 'regionId': 6, 'regionCode': 'US', 'currencyId': 247, 'name': 'On Assignment', 'symbol': 'ASGN', 'disSymbol': 'ASGN', 'disExchangeCode': 'NYSE', 'exchangeCode': 'NYSE', 'listStatus': 1, 'template': 'stock', 'derivativeSupport': 1, 'status': 'A', 'close': '53.04', 'change': '6.59', 'changeRatio': '0.1419', 'pPrice': '53.04', 'pChange': '0.0000', 'pChRatio': '0.0000', 'marketValue': '2811120000.00', 'volume': '1339771', 'turnoverRate': '0.0253'},
{'tickerId': 913255732, 'exchangeId': 95, 'type': 2, 'secType': 61, 'regionId': 6, 'regionCode': 'US', 'currencyId': 247, 'name': 'Arcturus', 'symbol': 'ARCT', 'disSymbol': 'ARCT', 'disExchangeCode': 'NASDAQ', 'exchangeCode': 'NMS', 'listStatus': 1, 'template': 'stock', 'derivativeSupport': 1, 'status': 'A', 'close': '40.86', 'change': '6.36', 'changeRatio': '0.1843', 'pPrice': '42.60', 'pChange': '1.740', 'pChRatio': '0.0426', 'marketValue': '812021444.46', 'volume': '1577508', 'turnoverRate': '0.0794'},
{'tickerId': 913256616, 'exchangeId': 96, 'type': 2, 'secType': 61, 'regionId': 6, 'regionCode': 'US', 'currencyId': 247, 'name': 'DexCom', 'symbol': 'DXCM', 'disSymbol': 'DXCM', 'disExchangeCode': 'NASDAQ', 'exchangeCode': 'NSQ', 'listStatus': 1, 'template': 'stock', 'derivativeSupport': 1, 'status': 'A', 'close': '341.52', 'change': '6.32', 'changeRatio': '0.0189', 'pPrice': '340.00', 'pChange': '-1.5200', 'pChRatio': '-0.0045', 'marketValue': '31522296000.00', 'volume': '1008849', 'turnoverRate': '0.0109'},
{'tickerId': 913255108, 'exchangeId': 11, 'type': 2, 'secType': 61, 'regionId': 6, 'regionCode': 'US', 'currencyId': 247, 'name': 'Clorox', 'symbol': 'CLX', 'disSymbol': 'CLX', 'disExchangeCode': 'NYSE', 'exchangeCode': 'NYSE', 'listStatus': 1, 'template': 'stock', 'derivativeSupport': 1, 'status': 'A', 'close': '192.71', 'change': '6.27', 'changeRatio': '0.0336', 'pPrice': '192.95', 'pChange': '0.2400', 'pChRatio': '0.0012', 'marketValue': '24185773318.28', 'volume': '4996414', 'turnoverRate': '0.0398'},
{'tickerId': 925314627, 'exchangeId': 11, 'type': 2, 'secType': 61, 'regionId': 6, 'regionCode': 'US', 'currencyId': 247, 'name': 'FRANCO NEVADA', 'symbol': 'FNV', 'disSymbol': 'FNV', 'disExchangeCode': 'NYSE', 'exchangeCode': 'NYSE', 'listStatus': 1, 'template': 'stock', 'derivativeSupport': 1, 'status': 'A', 'close': '137.85', 'change': '5.64', 'changeRatio': '0.0427', 'pPrice': '138.50', 'pChange': '0.6500', 'pChRatio': '0.0047', 'marketValue': '26110405326.30', 'volume': '1047688', 'turnoverRate': '0.0055'},
{'tickerId': 913254955, 'exchangeId': 11, 'type': 2, 'secType': 61, 'regionId': 6, 'regionCode': 'US', 'currencyId': 247, 'name': 'Aon Plc', 'symbol': 'AON', 'disSymbol': 'AON', 'disExchangeCode': 'NYSE', 'exchangeCode': 'NYSE', 'listStatus': 1, 'template': 'stock', 'derivativeSupport': 1, 'status': 'A', 'close': '178.21', 'change': '5.54', 'changeRatio': '0.0321', 'pPrice': '178.21', 'pChange': '0.0000', 'pChRatio': '0.0000', 'marketValue': '41181209117.22', 'volume': '2026234', 'turnoverRate': '0.0088'},
{'tickerId': 913324105, 'exchangeId': 96, 'type': 2, 'secType': 61, 'regionId': 6, 'regionCode': 'US', 'currencyId': 247, 'name': 'Willis Towers', 'symbol': 'WLTW', 'disSymbol': 'WLTW', 'disExchangeCode': 'NASDAQ', 'exchangeCode': 'NSQ', 'listStatus': 1, 'template': 'stock', 'derivativeSupport': 1, 'status': 'D', 'close': '183.34', 'change': '5.05', 'changeRatio': '0.0283', 'pPrice': '183.34', 'pChange': '0.0000', 'pChRatio': '0.0000', 'marketValue': '23597461124.96', 'volume': '968943', 'turnoverRate': '0.0075'},
{'tickerId': 913254759, 'exchangeId': 11, 'type': 2, 'secType': 61, 'regionId': 6, 'regionCode': 'US', 'currencyId': 247, 'name': 'TELADOC HEALTH', 'symbol': 'TDOC', 'disSymbol': 'TDOC', 'disExchangeCode': 'NYSE', 'exchangeCode': 'NYSE', 'listStatus': 1, 'template': 'stock', 'derivativeSupport': 1, 'status': 'A', 'close': '169.43', 'change': '4.84', 'changeRatio': '0.0294', 'pPrice': '168.88', 'pChange': '-0.5500', 'pChRatio': '-0.0032', 'marketValue': '12614616858.38', 'volume': '2628946', 'turnoverRate': '0.0353'},
{'tickerId': 913255222, 'exchangeId': 11, 'type': 2, 'secType': 61, 'regionId': 6, 'regionCode': 'US', 'currencyId': 247, 'name': 'Emergent Bio', 'symbol': 'EBS', 'disSymbol': 'EBS', 'disExchangeCode': 'NYSE', 'exchangeCode': 'NYSE', 'listStatus': 1, 'template': 'stock', 'derivativeSupport': 1, 'status': 'D', 'close': '78.70', 'change': '4.75', 'changeRatio': '0.0642', 'pPrice': '78.40', 'pChange': '-0.3000', 'pChRatio': '-0.0038', 'marketValue': '4113368277.10', 'volume': '783804', 'turnoverRate': '0.0150'},
{'tickerId': 913323443, 'exchangeId': 96, 'type': 2, 'secType': 61, 'regionId': 6, 'regionCode': 'US', 'currencyId': 247, 'name': 'Pool', 'symbol': 'POOL', 'disSymbol': 'POOL', 'disExchangeCode': 'NASDAQ', 'exchangeCode': 'NSQ', 'listStatus': 1, 'template': 'stock', 'derivativeSupport': 1, 'status': 'A', 'close': '216.02', 'change': '4.36', 'changeRatio': '0.0206', 'pPrice': '216.02', 'pChange': '0.0000', 'pChRatio': '0.0000', 'marketValue': '8696077573.82', 'volume': '310837', 'turnoverRate': '0.0077'},
{'tickerId': 913257075, 'exchangeId': 96, 'type': 2, 'secType': 61, 'regionId': 6, 'regionCode': 'US', 'currencyId': 247, 'name': 'Masimo', 'symbol': 'MASI', 'disSymbol': 'MASI', 'disExchangeCode': 'NASDAQ', 'exchangeCode': 'NSQ', 'listStatus': 1, 'template': 'stock', 'derivativeSupport': 1, 'status': 'A', 'close': '218.00', 'change': '4.09', 'changeRatio': '0.0191', 'pPrice': '217.00', 'pChange': '-1.0000', 'pChRatio': '-0.0046', 'marketValue': '11797070000.00', 'volume': '542131', 'turnoverRate': '0.0100'},
{'tickerId': 913253761, 'exchangeId': 10, 'type': 2, 'secType': [62], 'regionId': 6, 'regionCode': 'US', 'currencyId': 247, 'name': 'Pope Resources', 'symbol': 'POPE', 'disSymbol': 'POPE', 'disExchangeCode': 'NASDAQ', 'exchangeCode': 'NAS', 'listStatus': 1, 'template': 'stock', 'status': 'D', 'close': '101.05', 'change': '3.95', 'changeRatio': '0.0407', 'pPrice': '99.90', 'pChange': '2.800', 'pChRatio': '0.0288', 'marketValue': '447370075.75', 'volume': '33138', 'turnoverRate': '0.0075'},
{'tickerId': 913323560, 'exchangeId': 96, 'type': 2, 'secType': 61, 'regionId': 6, 'regionCode': 'US', 'currencyId': 247, 'name': 'Seneca Foods', 'symbol': 'SENEB', 'disSymbol': 'SENEB', 'disExchangeCode': 'NASDAQ', 'exchangeCode': 'NSQ', 'listStatus': 1, 'template': 'stock', 'status': 'D', 'close': '40.04', 'change': '3.84', 'changeRatio': '0.1061', 'marketValue': '347950039.71', 'volume': '501'},
{'tickerId': 913324274, 'exchangeId': 11, 'type': 2, 'secType': 61, 'regionId': 6, 'regionCode': 'US', 'currencyId': 247, 'name': 'Resmed', 'symbol': 'RMD', 'disSymbol': 'RMD', 'disExchangeCode': 'NYSE', 'exchangeCode': 'NYSE', 'listStatus': 1, 'template': 'stock', 'derivativeSupport': 1, 'status': 'A', 'close': '159.07', 'change': '3.75', 'changeRatio': '0.0241', 'pPrice': '159.07', 'pChange': '0.0000', 'pChRatio': '0.0000', 'marketValue': '23004217759.29', 'volume': '1267075', 'turnoverRate': '0.0088'},
{'tickerId': 913323736, 'exchangeId': 96, 'type': 2, 'secType': 61, 'regionId': 6, 'regionCode': 'US', 'currencyId': 247, 'name': 'Vertex Pharms', 'symbol': 'VRTX', 'disSymbol': 'VRTX', 'disExchangeCode': 'NASDAQ', 'exchangeCode': 'NSQ', 'listStatus': 1, 'template': 'stock', 'derivativeSupport': 1, 'status': 'A', 'close': '254.90', 'change': '3.70', 'changeRatio': '0.0147', 'pPrice': '255.00', 'pChange': '0.1000', 'pChRatio': '0.0004', 'marketValue': '66062980780.10', 'volume': '1939843', 'turnoverRate': '0.0075'},
{'tickerId': 913323767, 'exchangeId': 11, 'type': 2, 'secType': 61, 'regionId': 6, 'regionCode': 'US', 'currencyId': 247, 'name': 'MCCORMICK VTG', 'symbol': 'MKC V', 'disSymbol': 'MKC-V', 'disExchangeCode': 'NYSE', 'exchangeCode': 'NYSE', 'listStatus': 1, 'template': 'stock', 'status': 'D', 'close': '159.99', 'change': '3.42', 'changeRatio': '0.0218', 'marketValue': '21262671000.00', 'volume': '432', 'turnoverRate': '0.0000'},
{'tickerId': 950118595, 'exchangeId': 96, 'type': 2, 'secType': 61, 'regionId': 6, 'regionCode': 'US', 'currencyId': 247, 'name': 'ZOOM VIDEO', 'symbol': 'ZM', 'disSymbol': 'ZM', 'disExchangeCode': 'NASDAQ', 'exchangeCode': 'NSQ', 'listStatus': 1, 'template': 'stock', 'derivativeSupport': 1, 'status': 'A', 'close': '138.56', 'change': '3.39', 'changeRatio': '0.0251', 'pPrice': '138.99', 'pChange': '0.4300', 'pChRatio': '0.0031', 'marketValue': '38620532420.16', 'volume': '13786017', 'turnoverRate': '0.0495'},
{'tickerId': 916040738, 'exchangeId': 11, 'type': 2, 'secType': 61, 'regionId': 6, 'regionCode': 'US', 'currencyId': 247, 'name': 'WHEATON PRECIOUS', 'symbol': 'WPM', 'disSymbol': 'WPM', 'disExchangeCode': 'NYSE', 'exchangeCode': 'NYSE', 'listStatus': 1, 'template': 'stock', 'derivativeSupport': 1, 'status': 'A', 'close': '41.10', 'change': '3.34', 'changeRatio': '0.0885', 'pPrice': '41.09', 'pChange': '-0.0100', 'pChRatio': '-0.0002', 'marketValue': '18404536146.30', 'volume': '5019137', 'turnoverRate': '0.0112'},
{'tickerId': 913257174, 'exchangeId': 96, 'type': 2, 'secType': 61, 'regionId': 6, 'regionCode': 'US', 'currencyId': 247, 'name': 'Royal Gold', 'symbol': 'RGLD', 'disSymbol': 'RGLD', 'disExchangeCode': 'NASDAQ', 'exchangeCode': 'NSQ', 'listStatus': 1, 'template': 'stock', 'derivativeSupport': 1, 'status': 'A', 'close': '125.86', 'change': '3.33', 'changeRatio': '0.0272', 'pPrice': '125.86', 'pChange': '0.0000', 'pChRatio': '0.0000', 'marketValue': '8253015011.08', 'volume': '853473', 'turnoverRate': '0.0130'},
{'tickerId': 913254394, 'exchangeId': 11, 'type': 2, 'secType': 61, 'regionId': 6, 'regionCode': 'US', 'currencyId': 247, 'name': 'Fortune Brand', 'symbol': 'FBHS', 'disSymbol': 'FBHS', 'disExchangeCode': 'NYSE', 'exchangeCode': 'NYSE', 'listStatus': 1, 'template': 'stock', 'derivativeSupport': 1, 'status': 'D', 'close': '51.50', 'change': '3.30', 'changeRatio': '0.0685', 'pPrice': '51.50', 'pChange': '0.0000', 'pChRatio': '0.0000', 'marketValue': '7194870278.50', 'volume': '3004021', 'turnoverRate': '0.0214'},
{'tickerId': 913323312, 'exchangeId': 96, 'type': 2, 'secType': 61, 'regionId': 6, 'regionCode': 'US', 'currencyId': 247, 'name': 'Liberty Global', 'symbol': 'LBTYK', 'disSymbol': 'LBTYK', 'disExchangeCode': 'NASDAQ', 'exchangeCode': 'NSQ', 'listStatus': 1, 'template': 'stock', 'derivativeSupport': 1, 'status': 'A', 'close': '21.49', 'change': '3.18', 'changeRatio': '0.1737', 'pPrice': '21.48', 'pChange': '-0.0100', 'pChRatio': '-0.0005', 'marketValue': '13594662302.41', 'volume': '19980228', 'turnoverRate': '0.0315'},
{'tickerId': 913323882, 'exchangeId': 96, 'type': 2, 'secType': 61, 'regionId': 6, 'regionCode': 'US', 'currencyId': 247, 'name': 'Preformed Line', 'symbol': 'PLPC', 'disSymbol': 'PLPC', 'disExchangeCode': 'NASDAQ', 'exchangeCode': 'NSQ', 'listStatus': 1, 'template': 'stock', 'status': 'D', 'close': '52.82', 'change': '3.14', 'changeRatio': '0.0632', 'pPrice': '52.10', 'pChange': '-0.7200', 'pChRatio': '-0.0136', 'marketValue': '264979981.20', 'volume': '9305', 'turnoverRate': '0.0018'},
{'tickerId': 913323248, 'exchangeId': 96, 'type': 2, 'secType': 61, 'regionId': 6, 'regionCode': 'US', 'currencyId': 247, 'name': 'Discovery', 'symbol': 'DISCB', 'disSymbol': 'DISCB', 'disExchangeCode': 'NASDAQ', 'exchangeCode': 'NSQ', 'listStatus': 1, 'template': 'stock', 'status': 'A', 'close': '57.95', 'change': '23.63', 'changeRatio': '0.6884', 'pPrice': '54.26', 'pChange': '-3.6900', 'pChRatio': '-0.0637', 'marketValue': '29362894177.95', 'volume': '218305', 'turnoverRate': '0.0004'},
{'tickerId': 913323930, 'exchangeId': 96, 'type': 2, 'secType': 61, 'regionId': 6, 'regionCode': 'US', 'currencyId': 247, 'name': 'MercadoLibre', 'symbol': 'MELI', 'disSymbol': 'MELI', 'disExchangeCode': 'NASDAQ', 'exchangeCode': 'NSQ', 'listStatus': 1, 'template': 'stock', 'derivativeSupport': 1, 'status': 'A', 'close': '605.52', 'change': '22.01', 'changeRatio': '0.0377', 'pPrice': '603.69', 'pChange': '-1.8300', 'pChRatio': '-0.0030', 'marketValue': '30226598045.28', 'volume': '699008', 'turnoverRate': '0.0140'},
{'tickerId': 913257170, 'exchangeId': 96, 'type': 2, 'secType': 61, 'regionId': 6, 'regionCode': 'US', 'currencyId': 247, 'name': 'Liberty Global', 'symbol': 'LBTYA', 'disSymbol': 'LBTYA', 'disExchangeCode': 'NASDAQ', 'exchangeCode': 'NSQ', 'listStatus': 1, 'template': 'stock', 'derivativeSupport': 1, 'status': 'A', 'close': '22.28', 'change': '2.86', 'changeRatio': '0.1473', 'pPrice': '22.29', 'pChange': '0.0100', 'pChRatio': '0.0004', 'marketValue': '14094419548.52', 'volume': '10534672', 'turnoverRate': '0.0167'},
{'tickerId': 913303991, 'exchangeId': 96, 'type': 2, 'secType': 61, 'regionId': 6, 'regionCode': 'US', 'currencyId': 247, 'name': 'Liberty Brodband', 'symbol': 'LBRDK', 'disSymbol': 'LBRDK', 'disExchangeCode': 'NASDAQ', 'exchangeCode': 'NSQ', 'listStatus': 1, 'template': 'stock', 'derivativeSupport': 1, 'status': 'D', 'close': '125.44', 'change': '2.76', 'changeRatio': '0.0225', 'pPrice': '125.44', 'pChange': '0.0000', 'pChRatio': '0.0000', 'marketValue': '22817900904.96', 'volume': '926177', 'turnoverRate': '0.0042'},
{'tickerId': 913257082, 'exchangeId': 96, 'type': 2, 'secType': 61, 'regionId': 6, 'regionCode': 'US', 'currencyId': 247, 'name': 'Helen of Troy', 'symbol': 'HELE', 'disSymbol': 'HELE', 'disExchangeCode': 'NASDAQ', 'exchangeCode': 'NSQ', 'listStatus': 1, 'template': 'stock', 'derivativeSupport': 1, 'status': 'D', 'close': '167.04', 'change': '2.76', 'changeRatio': '0.0168', 'pPrice': '167.04', 'pChange': '0.0000', 'pChRatio': '0.0000', 'marketValue': '4216707982.08', 'volume': '341465', 'turnoverRate': '0.0135'},
{'tickerId': 913256458, 'exchangeId': 96, 'type': 2, 'secType': 61, 'regionId': 6, 'regionCode': 'US', 'currencyId': 247, 'name': 'Forrester', 'symbol': 'FORR', 'disSymbol': 'FORR', 'disExchangeCode': 'NASDAQ', 'exchangeCode': 'NSQ', 'listStatus': 1, 'template': 'stock', 'derivativeSupport': 1, 'status': 'D', 'close': '33.88', 'change': '2.58', 'changeRatio': '0.0824', 'marketValue': '635419400.00', 'volume': '85115', 'turnoverRate': '0.0045'},
{'tickerId': 950158952, 'exchangeId': 95, 'type': 2, 'regionId': 6, 'regionCode': 'US', 'currencyId': 247, 'name': 'LYRA THERAPEUTICS, INC.', 'symbol': 'LYRA', 'disSymbol': 'LYRA', 'disExchangeCode': 'NASDAQ', 'exchangeCode': 'NMS', 'listStatus': 1, 'template': 'ipo', 'status': 'A', 'close': '18.56', 'change': '2.56', 'changeRatio': '0.1600', 'pPrice': '18.96', 'pChange': '0.4000', 'pChRatio': '0.0216', 'marketValue': '229705575.68', 'volume': '1738472', 'turnoverRate': '0.1405'},
{'tickerId': 913257570, 'exchangeId': 96, 'type': 2, 'secType': 61, 'regionId': 6, 'regionCode': 'US', 'currencyId': 247, 'name': 'Bio-Techne', 'symbol': 'TECH', 'disSymbol': 'TECH', 'disExchangeCode': 'NASDAQ', 'exchangeCode': 'NSQ', 'listStatus': 1, 'template': 'stock', 'derivativeSupport': 1, 'status': 'A', 'close': '227.54', 'change': '2.54', 'changeRatio': '0.0113', 'pPrice': '227.54', 'pChange': '0.0000', 'pChRatio': '0.0000', 'marketValue': '8726538309.18', 'volume': '497006', 'turnoverRate': '0.0130'},
{'tickerId': 913323246, 'exchangeId': 96, 'type': 2, 'secType': 61, 'regionId': 6, 'regionCode': 'US', 'currencyId': 247, 'name': 'Bel Fuse', 'symbol': 'BELFB', 'disSymbol': 'BELFB', 'disExchangeCode': 'NASDAQ', 'exchangeCode': 'NSQ', 'listStatus': 1, 'template': 'stock', 'derivativeSupport': 1, 'status': 'D', 'close': '9.99', 'change': '2.53', 'changeRatio': '0.3391', 'pPrice': '9.75', 'pChange': '-0.2400', 'pChRatio': '-0.0240', 'marketValue': '122562454.86', 'volume': '177634', 'turnoverRate': '0.0145'},
{'tickerId': 916040647, 'exchangeId': 11, 'type': 2, 'secType': 61, 'regionId': 6, 'regionCode': 'US', 'currencyId': 247, 'name': 'Agnico Eagle', 'symbol': 'AEM', 'disSymbol': 'AEM', 'disExchangeCode': 'NYSE', 'exchangeCode': 'NYSE', 'listStatus': 1, 'template': 'stock', 'derivativeSupport': 1, 'status': 'A', 'close': '61.20', 'change': '2.52', 'changeRatio': '0.0429', 'pPrice': '61.10', 'pChange': '-0.1000', 'pChRatio': '-0.0016', 'marketValue': '14739911553.60', 'volume': '2820765', 'turnoverRate': '0.0117'},
{'tickerId': 913303768, 'exchangeId': 12, 'type': 2, 'secType': 61, 'regionId': 6, 'regionCode': 'US', 'currencyId': 247, 'name': 'CHASE CORP', 'symbol': 'CCF', 'disSymbol': 'CCF', 'disExchangeCode': 'AMEX', 'exchangeCode': 'ASE', 'listStatus': 1, 'template': 'stock', 'status': 'D', 'close': '96.71', 'change': '2.45', 'changeRatio': '0.0260', 'marketValue': '916799598.60', 'volume': '29229', 'turnoverRate': '0.0031'},
{'tickerId': 913324557, 'exchangeId': 11, 'type': 2, 'secType': 61, 'regionId': 6, 'regionCode': 'US', 'currencyId': 247, 'name': 'Allergan', 'symbol': 'AGN', 'disSymbol': 'AGN', 'disExchangeCode': 'NYSE', 'exchangeCode': 'NYSE', 'listStatus': 1, 'template': 'stock', 'derivativeSupport': 1, 'status': 'A', 'close': '189.74', 'change': '2.40', 'changeRatio': '0.0128', 'pPrice': '189.76', 'pChange': '0.0200', 'pChRatio': '0.0001', 'marketValue': '62424842326.10', 'volume': '5787032', 'turnoverRate': '0.0176'},
{'tickerId': 913324566, 'exchangeId': 11, 'type': 2, 'secType': 61, 'regionId': 6, 'regionCode': 'US', 'currencyId': 247, 'name': 'West Pharm Svc', 'symbol': 'WST', 'disSymbol': 'WST', 'disExchangeCode': 'NYSE', 'exchangeCode': 'NYSE', 'listStatus': 1, 'template': 'stock', 'derivativeSupport': 1, 'status': 'D', 'close': '191.64', 'change': '2.38', 'changeRatio': '0.0126', 'pPrice': '191.64', 'pChange': '0.0000', 'pChRatio': '0.0000', 'marketValue': '14078267117.08', 'volume': '352460', 'turnoverRate': '0.0042'}
]

db = su.Database(f"overnight hold.db" )
db['active'].insert_all(data)

--------------- file ends ----------------------

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
sqlite3.OperationalError: too many SQL variables in insert_all when using rows with varying numbers of columns 610517472  
612216820 https://github.com/simonw/datasette/issues/236#issuecomment-612216820 https://api.github.com/repos/simonw/datasette/issues/236 MDEyOklzc3VlQ29tbWVudDYxMjIxNjgyMA== cldellow 193185 2020-04-10T21:03:38Z 2020-04-10T21:03:38Z CONTRIBUTOR

I made a repo at https://github.com/code402/datasette-lambda to demonstrate the idea, and scratch my personal itch for this.

The demo relies on some central authority having already published a public, reusable Lambda layer with Datasette & its dependencies. I think that differs from the other publish plugins which seem to mainly publish Dockerfiles that the host will interpret to install deps from a requirements.txt file.

I chose that approach because uvloop appears to be a dependency with native code that needs to be compiled for the target runtime environment. In this case, that's Amazon Linux 2. I'm not 100% clear on whether that's still required, because:

  • maybe uvloop is only needed for uvicorn, which the demo doesn't actually use since HTTP routing is handled by API Gateway
  • it seems like uvloop may be an optional, drop-in optimization for asyncio in any case (but I may be misreading this; I'm very much a Python noob)

If it's the case that uvloop is truly optional, then I think the publish plugin could do the packaging on the user's machine, regardless of what flavour of operating system they're on. That'd be a bit slower for the user, but would provide the most long-term flexibility in terms of supporting plugins.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
datasette publish lambda plugin 317001500  
608716819 https://github.com/simonw/datasette/issues/236#issuecomment-608716819 https://api.github.com/repos/simonw/datasette/issues/236 MDEyOklzc3VlQ29tbWVudDYwODcxNjgxOQ== cldellow 193185 2020-04-03T22:19:00Z 2020-04-03T22:19:00Z CONTRIBUTOR

Hi Simon,

I'm thinking of attempting this. Can you clarify some questions I have?

1) I assume the goal is to have a CORS-friendly HTTPS endpoint that hosts the datasette service + user's db.

2) If that's the goal, I think Lambda alone is insufficient. Lambda provides the compute fabric, but not the HTTP routing. You'd also need to add Application Load Balancer or API Gateway to provide an HTTP endpoint that routes to the lambda function.

Do you have a preference between ALB or API GW? ALB has better economics at scale, but has a minimum monthly cost. API GW has worse per-request economics, but scales to zero when no requests are happening.

3) Does Datasette have any native components, or is it all pure python? If it has native bits, they'll likely need to be recompiled to work on Amazon Linux 2.

4) There are a few disparate services that need to be wired together to expose a Python service securely to the web. If I was doing this outside of the datasette publish system, I'd use an AWS CloudFormation template. Even within datasette, I think it still makes sense to use a CloudFormation template and just have the publish plugin invoke it (via the standard aws cli) with user-specified parameters. Does that sound reasonable to you?

Thanks for your help!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
datasette publish lambda plugin 317001500  
604328163 https://github.com/simonw/datasette/issues/573#issuecomment-604328163 https://api.github.com/repos/simonw/datasette/issues/573 MDEyOklzc3VlQ29tbWVudDYwNDMyODE2Mw== psychemedia 82988 2020-03-26T09:41:30Z 2020-03-26T09:41:30Z CONTRIBUTOR
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Exposing Datasette via Jupyter-server-proxy 492153532  
590022164 https://github.com/simonw/datasette/pull/666#issuecomment-590022164 https://api.github.com/repos/simonw/datasette/issues/666 MDEyOklzc3VlQ29tbWVudDU5MDAyMjE2NA== kevindkeogh 13896256 2020-02-23T03:26:00Z 2020-02-23T03:26:00Z CONTRIBUTOR

It was very helpful for me, using it for a 15M row table. Added a test, happy to amend though!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Use inspect-file, if possible, for total row count 562085508  
586599424 https://github.com/simonw/datasette/issues/417#issuecomment-586599424 https://api.github.com/repos/simonw/datasette/issues/417 MDEyOklzc3VlQ29tbWVudDU4NjU5OTQyNA== psychemedia 82988 2020-02-15T15:12:19Z 2020-02-15T15:12:33Z CONTRIBUTOR

So could the polling support also allow you to call sqlite_utils to update a database with csv files? (Though I'm guessing you would only want to handle changed files? Do your scrapers check and cache csv datestamps/hashes?)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Datasette Library 421546944  
582106085 https://github.com/simonw/datasette/pull/653#issuecomment-582106085 https://api.github.com/repos/simonw/datasette/issues/653 MDEyOklzc3VlQ29tbWVudDU4MjEwNjA4NQ== jaywgraves 418191 2020-02-04T20:43:43Z 2020-02-04T20:43:43Z CONTRIBUTOR

but this also doesn't have to land at all if it doesn't match your use case.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
allow leading comments in SQL input field 541331755  
582105810 https://github.com/simonw/datasette/pull/653#issuecomment-582105810 https://api.github.com/repos/simonw/datasette/issues/653 MDEyOklzc3VlQ29tbWVudDU4MjEwNTgxMA== jaywgraves 418191 2020-02-04T20:43:01Z 2020-02-04T20:43:01Z CONTRIBUTOR

I think the existing code will be OK even if I strip the lines in the middle of a new line delimited string.

It's only used for the validation, SQLite handles the -- just fine and the whole SQL textarea still gets sent once it passes validation.

I can add your test case to my branch later this evening though.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
allow leading comments in SQL input field 541331755  
573389669 https://github.com/simonw/sqlite-utils/issues/74#issuecomment-573389669 https://api.github.com/repos/simonw/sqlite-utils/issues/74 MDEyOklzc3VlQ29tbWVudDU3MzM4OTY2OQ== jayvdb 15092 2020-01-12T07:21:17Z 2020-01-12T07:21:17Z CONTRIBUTOR

I guess there is some extra flag for CliRunner.invoke to check exitcode and raise the exception, or that should be an extra assert added.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Test failures on openSUSE 15.1: AssertionError: Explicit other_table and other_column 546073980  
573388052 https://github.com/simonw/sqlite-utils/issues/74#issuecomment-573388052 https://api.github.com/repos/simonw/sqlite-utils/issues/74 MDEyOklzc3VlQ29tbWVudDU3MzM4ODA1Mg== jayvdb 15092 2020-01-12T06:51:30Z 2020-01-12T06:51:30Z CONTRIBUTOR

Thanks. That showed me that there was a click cli runner error, and setting export LANG=en_US.UTF-8 fixed it.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Test failures on openSUSE 15.1: AssertionError: Explicit other_table and other_column 546073980  
567133734 https://github.com/simonw/datasette/issues/394#issuecomment-567133734 https://api.github.com/repos/simonw/datasette/issues/394 MDEyOklzc3VlQ29tbWVudDU2NzEzMzczNA== jsfenfen 639012 2019-12-18T17:33:23Z 2019-12-18T17:33:23Z CONTRIBUTOR

FWIW I did a dumb merge of the branch here: https://github.com/jsfenfen/datasette and it seemed to work in that I could run stuff at a subdirectory, but ended up abandoning it in favor of just posting a subdomain because getting the nginx configs right was making me crazy. I still would prefer posting at a subdirectory but the subdomain seems simpler at the moment.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
base_url configuration setting 396212021  
565755208 https://github.com/simonw/datasette/pull/644#issuecomment-565755208 https://api.github.com/repos/simonw/datasette/issues/644 MDEyOklzc3VlQ29tbWVudDU2NTc1NTIwOA== chris48s 6025893 2019-12-14T21:33:31Z 2019-12-14T21:33:31Z CONTRIBUTOR

Hi @simonw

Have you had a chance to look at this at all?

I'm going to have a chunk of time free next week so if there is additional work needed on this, that would be a particularly convenient time for me to revisit this.

Cheers

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Validate metadata json on startup 530513784  
559632608 https://github.com/simonw/datasette/issues/573#issuecomment-559632608 https://api.github.com/repos/simonw/datasette/issues/573 MDEyOklzc3VlQ29tbWVudDU1OTYzMjYwOA== psychemedia 82988 2019-11-29T01:43:38Z 2019-11-29T01:43:38Z CONTRIBUTOR

In passing, it looks like a start was made on a datasette Jupyter server extension in https://github.com/lucasdurand/jupyter-datasette although the build fails in MyBinder.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Exposing Datasette via Jupyter-server-proxy 492153532  
559207224 https://github.com/simonw/datasette/issues/642#issuecomment-559207224 https://api.github.com/repos/simonw/datasette/issues/642 MDEyOklzc3VlQ29tbWVudDU1OTIwNzIyNA== psychemedia 82988 2019-11-27T18:40:57Z 2019-11-27T18:41:07Z CONTRIBUTOR

Would cookie cutter approaches also work for creating various flavours of customised templates?

I need to try to create a couple of sites for myself to get a feel for what sorts of thing are easily doable, and what cribbable cookie cutter items might be. I'm guessing https://simonwillison.net/2019/Nov/25/niche-museums/ is a good place to start from?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Provide a cookiecutter template for creating new plugins 529429214  
558687342 https://github.com/simonw/datasette/issues/639#issuecomment-558687342 https://api.github.com/repos/simonw/datasette/issues/639 MDEyOklzc3VlQ29tbWVudDU1ODY4NzM0Mg== jacobian 21148 2019-11-26T15:40:00Z 2019-11-26T15:40:00Z CONTRIBUTOR

A bit of background: the reason heroku git:clone brings down an empty directory is because datasette publish heroku uses the builds API, rather than a git push, to release the app. I originally did this because it seemed like a lower bar than having a working git, but the downside is, as you found out, that tweaking the created app is hard.

So there's one option -- change datasette publish heroku to use git push instead of heroku builds:create.

@pkoppstein - what you suggested seems like it ought to work (you don't need maintenance mode, though). I'm not sure why it doesn't.

You could also look into using the slugs API to download the slug, change metadata.json, re-pack and re-upload the slug.

Ultimately though I think I think @simonw's idea of reading metadata.json from an external source might be better (#357). Reading from an alternate URL would be fine, or you could also just stuff the whole metadata.json into a Heroku config var, and write a plugin to read it from there.

Hope this helps a bit!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
updating metadata.json without recreating the app 527670799  
556749086 https://github.com/simonw/datasette/issues/394#issuecomment-556749086 https://api.github.com/repos/simonw/datasette/issues/394 MDEyOklzc3VlQ29tbWVudDU1Njc0OTA4Ng== jsfenfen 639012 2019-11-21T01:15:34Z 2019-11-21T01:21:45Z CONTRIBUTOR

Hey @simonw is the url_prefix config option available in another branch, it looks like you've written some tests for it above? In 0.32 I get "url_prefix is not a valid option". I think this would be really helpful!

This would be really handy for proxying datasette in another domain's subdirectory I believe this will allow folks to run upstream authentication, but the links break if the url_prefix doesn't match.

I'd prefer not to host a proxied version of datasette on a subdomain (e.g. datasette.myurl.com b/c then I gotta worry about sharing authorization cookies with the subdomain, which I just assume not do, but...)

Edit: I see the wip-url-prefix branch, I may try with that https://github.com/simonw/datasette/commit/8da2db4b71096b19e7a9ef1929369b8483d448bf

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
base_url configuration setting 396212021  
549246007 https://github.com/simonw/datasette/pull/602#issuecomment-549246007 https://api.github.com/repos/simonw/datasette/issues/602 MDEyOklzc3VlQ29tbWVudDU0OTI0NjAwNw== rixx 2657547 2019-11-04T07:29:33Z 2019-11-04T07:29:33Z CONTRIBUTOR

Not sure – I'm always a bit weirded out when elements that I clicked disappear on me.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Offer to format readonly SQL 509535510  
544214418 https://github.com/simonw/datasette/pull/601#issuecomment-544214418 https://api.github.com/repos/simonw/datasette/issues/601 MDEyOklzc3VlQ29tbWVudDU0NDIxNDQxOA== rixx 2657547 2019-10-20T02:29:49Z 2019-10-20T02:29:49Z CONTRIBUTOR

Submitted in #602!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Don't auto-format SQL on page load 509340359  
544008944 https://github.com/simonw/datasette/pull/601#issuecomment-544008944 https://api.github.com/repos/simonw/datasette/issues/601 MDEyOklzc3VlQ29tbWVudDU0NDAwODk0NA== rixx 2657547 2019-10-18T23:40:48Z 2019-10-18T23:40:48Z CONTRIBUTOR

The only negative impact that comes to mind is that now you have no way to get the read-only query to be formatted nicely, I think, so maybe a second PR adding the formatting functionality even to the read-only page would be good?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Don't auto-format SQL on page load 509340359  
544008463 https://github.com/simonw/datasette/pull/601#issuecomment-544008463 https://api.github.com/repos/simonw/datasette/issues/601 MDEyOklzc3VlQ29tbWVudDU0NDAwODQ2Mw== rixx 2657547 2019-10-18T23:39:21Z 2019-10-18T23:39:21Z CONTRIBUTOR

That looks right, and I completely agree with the intent.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Don't auto-format SQL on page load 509340359  
541587823 https://github.com/simonw/datasette/pull/590#issuecomment-541587823 https://api.github.com/repos/simonw/datasette/issues/590 MDEyOklzc3VlQ29tbWVudDU0MTU4NzgyMw== rixx 2657547 2019-10-14T09:58:23Z 2019-10-14T09:58:23Z CONTRIBUTOR

Added tests.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Handle spaces in DB names 505818256  
541562581 https://github.com/simonw/datasette/pull/590#issuecomment-541562581 https://api.github.com/repos/simonw/datasette/issues/590 MDEyOklzc3VlQ29tbWVudDU0MTU2MjU4MQ== rixx 2657547 2019-10-14T08:57:46Z 2019-10-14T08:57:46Z CONTRIBUTOR

Ah, thank you – I saw the need for unit tests but wasn't sure what the best way to add one would be.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Handle spaces in DB names 505818256  
541119038 https://github.com/simonw/datasette/issues/512#issuecomment-541119038 https://api.github.com/repos/simonw/datasette/issues/512 MDEyOklzc3VlQ29tbWVudDU0MTExOTAzOA== rixx 2657547 2019-10-11T15:49:13Z 2019-10-11T15:49:13Z CONTRIBUTOR

How open are you to changing the config variable names (with appropriate deprecation, of course)? "about_url_text", "license_url_text" etc might be better suited to convey that these are just meant as basically URL titles.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
"about" parameter in metadata does not appear when alone 457147936  
541118904 https://github.com/simonw/datasette/issues/507#issuecomment-541118904 https://api.github.com/repos/simonw/datasette/issues/507 MDEyOklzc3VlQ29tbWVudDU0MTExODkwNA== rixx 2657547 2019-10-11T15:48:49Z 2019-10-11T15:48:49Z CONTRIBUTOR

Headless Chrome and Firefox via Selenium are a solid choice in my experience. You may be interested in how pretix and pretalx solve this problem: They use pytest to create those screenshots on release to make sure they are up to date. See this writeup and this repo.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Every datasette plugin on the ecosystem page should have a screenshot 455852801  
541052329 https://github.com/simonw/datasette/issues/585#issuecomment-541052329 https://api.github.com/repos/simonw/datasette/issues/585 MDEyOklzc3VlQ29tbWVudDU0MTA1MjMyOQ== rixx 2657547 2019-10-11T12:53:51Z 2019-10-11T12:53:51Z CONTRIBUTOR

I think this would be good, yeah – currently, databases are explicitly sorted by name in the IndexView, we could just remove that part (and use an OrderedDict for consistency, I suppose)?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Databases on index page should display in order they were passed to "datasette serve"? 503217375  
533818697 https://github.com/simonw/sqlite-utils/issues/61#issuecomment-533818697 https://api.github.com/repos/simonw/sqlite-utils/issues/61 MDEyOklzc3VlQ29tbWVudDUzMzgxODY5Nw== amjith 49260 2019-09-21T18:09:01Z 2019-09-21T18:09:28Z CONTRIBUTOR

@witeshadow The library version doesn't have helpers around CSV (at least not from what I can see in the code).

But here's a snippet that makes it easy to insert from CSV using the library.

import csv
from sqlite_utils import Database

# CSV Reader

csv_file = open("filename.csv")   # open the csv file.
reader = csv.reader(csv_file)  # Create a CSV reader
headers = next(reader)   # First line is the header
docs = (dict(zip(headers, row)) for row in reader)

# Now you can use the `sqlite_utils` library. 

db = Database("my_database.db")
db["table_name"].insert_all(docs)

This snippet is adapted from reading the CLI source code on how it implements the csv option.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
importing CSV to SQLite as library 491219910  
527211047 https://github.com/simonw/sqlite-utils/pull/57#issuecomment-527211047 https://api.github.com/repos/simonw/sqlite-utils/issues/57 MDEyOklzc3VlQ29tbWVudDUyNzIxMTA0Nw== amjith 49260 2019-09-02T17:30:43Z 2019-09-02T17:30:43Z CONTRIBUTOR

I have merged the other PR (#56) into this one.

I have incorporated your suggestions. Cheers!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add triggers while enabling FTS 487987958  
527209840 https://github.com/simonw/sqlite-utils/pull/56#issuecomment-527209840 https://api.github.com/repos/simonw/sqlite-utils/issues/56 MDEyOklzc3VlQ29tbWVudDUyNzIwOTg0MA== amjith 49260 2019-09-02T17:23:21Z 2019-09-02T17:23:21Z CONTRIBUTOR

I have updated the other PR with the changes from this one and added tests. I have also changed the escaping from double quotes to brackets.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Escape the table name in populate_fts and search. 487847945  
510730200 https://github.com/simonw/datasette/issues/511#issuecomment-510730200 https://api.github.com/repos/simonw/datasette/issues/511 MDEyOklzc3VlQ29tbWVudDUxMDczMDIwMA== abdusco 3243482 2019-07-12T03:23:22Z 2019-07-12T03:23:22Z CONTRIBUTOR

@simonw yes it works fine on Windows, but test suite doesn't run properly, for that I had to use WSL

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Get Datasette working on Windows, including CI 456578474  
509629331 https://github.com/simonw/datasette/pull/554#issuecomment-509629331 https://api.github.com/repos/simonw/datasette/issues/554 MDEyOklzc3VlQ29tbWVudDUwOTYyOTMzMQ== abdusco 3243482 2019-07-09T12:51:35Z 2019-07-09T12:51:35Z CONTRIBUTOR

I wanted to add a test for it too, but I've realized it's impossible to test a server process as we cannot get its exit code.

# tests/test_cli.py
def test_static_mounts_on_windows():
    if sys.platform != "win32":
        return
    runner = CliRunner()
    result = runner.invoke(
        cli, ["serve", "--static", r"s:C:\\"]
    )
    assert result.exit_code == 0
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Fix static mounts using relative paths and prevent traversal exploits 465728430  
509618339 https://github.com/simonw/datasette/pull/554#issuecomment-509618339 https://api.github.com/repos/simonw/datasette/issues/554 MDEyOklzc3VlQ29tbWVudDUwOTYxODMzOQ== abdusco 3243482 2019-07-09T12:16:32Z 2019-07-09T12:16:32Z CONTRIBUTOR

I've also added another fix for using static mounts with absolute paths on Windows.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Fix static mounts using relative paths and prevent traversal exploits 465728430  

Next page

Advanced export

JSON shape: default, array, newline-delimited, object

CSV options:

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
, [performed_via_github_app] TEXT);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);