issue_comments

23 rows where "created_at" is on date 2021-02-18 sorted by updated_at descending

issue 7

  • Support cross-database joins 8
  • --crossdb option for joining across databases 8
  • --port option should validate port is between 0 and 65535 2
  • Race condition errors in new refresh_schemas() mechanism 2
  • Feature Request: Gmail 1
  • Vega charts are plotted only for rows on the visible page, cluster maps only for rows in the remaining pages 1
  • "datasette publish cloudrun" cannot publish files with spaces in their name 1

user 5

  • simonw 19
  • Btibert3 1
  • Kabouik 1
  • rayvoelker 1
  • codecov[bot] 1

author_association 2

  • OWNER 19
  • NONE 4
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions issue performed_via_github_app
781670827 https://github.com/simonw/datasette/issues/283#issuecomment-781670827 https://api.github.com/repos/simonw/datasette/issues/283 MDEyOklzc3VlQ29tbWVudDc4MTY3MDgyNw== simonw 9599 2021-02-18T22:16:46Z 2021-02-18T22:16:46Z OWNER

Demo is now live here: https://latest.datasette.io/_memory

The documentation is at https://docs.datasette.io/en/latest/sql_queries.html#cross-database-queries - it links to this example query: https://latest.datasette.io/_memory?sql=select%0D%0A++%27fixtures%27+as+database%2C+*%0D%0Afrom%0D%0A++%5Bfixtures%5D.sqlite_master%0D%0Aunion%0D%0Aselect%0D%0A++%27extra_database%27+as+database%2C+*%0D%0Afrom%0D%0A++%5Bextra_database%5D.sqlite_master

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Support cross-database joins 325958506  
781599929 https://github.com/simonw/datasette/pull/1232#issuecomment-781599929 https://api.github.com/repos/simonw/datasette/issues/1232 MDEyOklzc3VlQ29tbWVudDc4MTU5OTkyOQ== codecov[bot] 22429695 2021-02-18T19:59:54Z 2021-02-18T22:06:42Z NONE

Codecov Report

Merging #1232 (8876499) into main (4df548e) will increase coverage by 0.03%. The diff coverage is 100.00%.

```diff
@@            Coverage Diff             @@
##             main    #1232      +/-   ##
==========================================
+ Coverage   91.42%   91.46%   +0.03%
==========================================
  Files          32       32
  Lines        3955     3970      +15
==========================================
+ Hits         3616     3631      +15
  Misses        339      339
```

| Impacted Files | Coverage Δ | |
|---|---|---|
| datasette/app.py | 95.68% <100.00%> (+0.06%) | :arrow_up: |
| datasette/cli.py | 76.62% <100.00%> (+0.36%) | :arrow_up: |
| datasette/views/database.py | 97.19% <100.00%> (+0.01%) | :arrow_up: |


Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 4df548e...8876499. Read the comment docs.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
--crossdb option for joining across databases 811407131  
781665560 https://github.com/simonw/datasette/issues/283#issuecomment-781665560 https://api.github.com/repos/simonw/datasette/issues/283 MDEyOklzc3VlQ29tbWVudDc4MTY2NTU2MA== simonw 9599 2021-02-18T22:06:14Z 2021-02-18T22:06:14Z OWNER

The implementation in #1232 is ready to land. It's the simplest-thing-that-could-possibly-work: you can run datasette one.db two.db three.db --crossdb and then use the /_memory page to run joins across tables from multiple databases.

It only works on the first 10 databases that were passed to the command-line. This means that if you have a Datasette instance with hundreds of attached databases (see Datasette Library) this won't be particularly useful for you.

So... a better, future version of this feature would be one that lets you join across databases on command - maybe by hitting /_memory?attach=db1&attach=db2 to get a special connection.

Also worth noting: plugins that implement the prepare_connection() hook can attach additional databases - so if you need better, customized support for this one way to handle that would be with a custom plugin.
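
To make the idea concrete, here is a minimal sketch of a cross-database join using only Python's standard `sqlite3` module. It is not Datasette's implementation: the `attach_all()` helper, the file names, and the `authors`/`books` tables are all assumptions made up for illustration. It attaches each file to an in-memory connection under an alias and then joins across the aliases.

```python
import os
import sqlite3
import tempfile


def attach_all(paths):
    """Hypothetical helper: open an in-memory connection and ATTACH each file."""
    conn = sqlite3.connect(":memory:")
    for path in paths:
        # Use the file name without its extension as the schema alias.
        alias = os.path.splitext(os.path.basename(path))[0]
        # The path is bound as a parameter; the alias is an identifier,
        # so it is bracket-quoted instead of bound.
        conn.execute("ATTACH DATABASE ? AS [{}]".format(alias), (path,))
    return conn


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        one = os.path.join(tmp, "one.db")
        two = os.path.join(tmp, "two.db")

        # Illustrative data: authors live in one.db, books in two.db.
        db = sqlite3.connect(one)
        db.execute("create table authors (id integer primary key, name text)")
        db.execute("insert into authors values (1, 'Ada')")
        db.commit()
        db.close()

        db = sqlite3.connect(two)
        db.execute("create table books (author_id integer, title text)")
        db.execute("insert into books values (1, 'Notes')")
        db.commit()
        db.close()

        conn = attach_all([one, two])
        rows = conn.execute(
            "select a.name, b.title from [one].authors a "
            "join [two].books b on b.author_id = a.id"
        ).fetchall()
        conn.close()
        return rows


result = demo()
```

Prefixing each table with its alias (`[one].authors`, `[two].books`) is what lets a single query span both files.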

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Support cross-database joins 325958506  
781651283 https://github.com/simonw/datasette/pull/1232#issuecomment-781651283 https://api.github.com/repos/simonw/datasette/issues/1232 MDEyOklzc3VlQ29tbWVudDc4MTY1MTI4Mw== simonw 9599 2021-02-18T21:37:55Z 2021-02-18T21:37:55Z OWNER

UI listing the attached tables:

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
--crossdb option for joining across databases 811407131  
781641728 https://github.com/simonw/datasette/pull/1232#issuecomment-781641728 https://api.github.com/repos/simonw/datasette/issues/1232 MDEyOklzc3VlQ29tbWVudDc4MTY0MTcyOA== simonw 9599 2021-02-18T21:19:34Z 2021-02-18T21:19:34Z OWNER

I tested the demo deployment like this:

    datasette publish cloudrun fixtures.db extra_database.db \
        -m fixtures.json \
        --plugins-dir=plugins \
        --branch=crossdb \
        --extra-options="--setting template_debug 1 --crossdb" \
        --install=pysqlite3-binary \
        --service=datasette-latest-crossdb

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
--crossdb option for joining across databases 811407131  
781637292 https://github.com/simonw/datasette/pull/1232#issuecomment-781637292 https://api.github.com/repos/simonw/datasette/issues/1232 MDEyOklzc3VlQ29tbWVudDc4MTYzNzI5Mg== simonw 9599 2021-02-18T21:11:31Z 2021-02-18T21:11:31Z OWNER

Due to bug #1233 I'm going to publish the additional database as extra_database.db rather than extra database.db as it is used in the tests.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
--crossdb option for joining across databases 811407131  
781636590 https://github.com/simonw/datasette/issues/1233#issuecomment-781636590 https://api.github.com/repos/simonw/datasette/issues/1233 MDEyOklzc3VlQ29tbWVudDc4MTYzNjU5MA== simonw 9599 2021-02-18T21:10:08Z 2021-02-18T21:10:08Z OWNER

I think the bug is here: https://github.com/simonw/datasette/blob/640ac7071b73111ba4423812cd683756e0e1936b/datasette/utils/__init__.py#L349-L373

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
"datasette publish cloudrun" cannot publish files with spaces in their name 811458446  
781634819 https://github.com/simonw/datasette/pull/1232#issuecomment-781634819 https://api.github.com/repos/simonw/datasette/issues/1232 MDEyOklzc3VlQ29tbWVudDc4MTYzNDgxOQ== simonw 9599 2021-02-18T21:06:43Z 2021-02-18T21:06:43Z OWNER

I'll document this option on https://docs.datasette.io/en/stable/sql_queries.html

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
--crossdb option for joining across databases 811407131  
781629841 https://github.com/simonw/datasette/pull/1232#issuecomment-781629841 https://api.github.com/repos/simonw/datasette/issues/1232 MDEyOklzc3VlQ29tbWVudDc4MTYyOTg0MQ== simonw 9599 2021-02-18T20:57:23Z 2021-02-18T20:57:23Z OWNER

The new warning looks like this:

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
--crossdb option for joining across databases 811407131  
781598585 https://github.com/simonw/datasette/pull/1232#issuecomment-781598585 https://api.github.com/repos/simonw/datasette/issues/1232 MDEyOklzc3VlQ29tbWVudDc4MTU5ODU4NQ== simonw 9599 2021-02-18T19:57:30Z 2021-02-18T19:57:30Z OWNER

It would also be neat if https://latest.datasette.io/ had multiple databases attached in order to demonstrate this feature.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
--crossdb option for joining across databases 811407131  
781594632 https://github.com/simonw/datasette/pull/1232#issuecomment-781594632 https://api.github.com/repos/simonw/datasette/issues/1232 MDEyOklzc3VlQ29tbWVudDc4MTU5NDYzMg== simonw 9599 2021-02-18T19:50:21Z 2021-02-18T19:50:21Z OWNER

It would be neat if the /_memory page showed a list of attached databases, to indicate that the --crossdb option is working and give people links to click to start running queries.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
--crossdb option for joining across databases 811407131  
781593169 https://github.com/simonw/datasette/issues/283#issuecomment-781593169 https://api.github.com/repos/simonw/datasette/issues/283 MDEyOklzc3VlQ29tbWVudDc4MTU5MzE2OQ== simonw 9599 2021-02-18T19:47:34Z 2021-02-18T19:47:34Z OWNER

I have a working version now, moving development to a pull request.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Support cross-database joins 325958506  
781591015 https://github.com/simonw/datasette/issues/283#issuecomment-781591015 https://api.github.com/repos/simonw/datasette/issues/283 MDEyOklzc3VlQ29tbWVudDc4MTU5MTAxNQ== simonw 9599 2021-02-18T19:44:02Z 2021-02-18T19:44:02Z OWNER

For the moment I'm going to hard-code a SQLITE_LIMIT_ATTACHED=10 constant and only attach the first 10 databases to the _memory connection.
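
A rough sketch of what "only attach the first 10" could look like. The `split_attachable()` helper and the example names are hypothetical, not code from the pull request; only the `SQLITE_LIMIT_ATTACHED=10` constant comes from the comment above.

```python
SQLITE_LIMIT_ATTACHED = 10


def split_attachable(names, limit=SQLITE_LIMIT_ATTACHED):
    """Hypothetical helper: return (attached, skipped) database names."""
    # _memory is the connection doing the attaching, so it is excluded.
    candidates = [name for name in names if name != "_memory"]
    return candidates[:limit], candidates[limit:]


names = ["_memory"] + ["db{}".format(i) for i in range(12)]
attached, skipped = split_attachable(names)
```

Returning the skipped names as well makes it easy to warn about databases that were left out.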

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Support cross-database joins 325958506  
781574786 https://github.com/simonw/datasette/issues/283#issuecomment-781574786 https://api.github.com/repos/simonw/datasette/issues/283 MDEyOklzc3VlQ29tbWVudDc4MTU3NDc4Ng== simonw 9599 2021-02-18T19:15:37Z 2021-02-18T19:15:37Z OWNER

select * from pragma_database_list(); is useful - shows all attached databases for the current connection.
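
The same query can be run from Python's `sqlite3` module. This is a small sketch; the `attached_databases()` helper and the `extra` alias are made up for illustration.

```python
import sqlite3


def attached_databases(conn):
    """Return (name, file) pairs for every database on this connection."""
    # pragma_database_list() yields rows of (seq, name, file).
    return [
        (name, file)
        for _seq, name, file in conn.execute(
            "select * from pragma_database_list()"
        )
    ]


conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS extra")
names = [name for name, _file in attached_databases(conn)]
conn.close()
```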

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Support cross-database joins 325958506  
781573676 https://github.com/simonw/datasette/issues/283#issuecomment-781573676 https://api.github.com/repos/simonw/datasette/issues/283 MDEyOklzc3VlQ29tbWVudDc4MTU3MzY3Ng== simonw 9599 2021-02-18T19:13:30Z 2021-02-18T19:13:30Z OWNER

It turns out SQLite defaults to a maximum of 10 attached databases. This can be increased using a compile-time constant, but even with that it cannot be more than 62: https://stackoverflow.com/questions/9845448/attach-limit-10
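
Because the limit is fixed when SQLite is compiled, one way to find out what a given build allows is to keep attaching in-memory databases until SQLite refuses. The sketch below does that; `probe_attach_limit()` and its `ceiling` argument are hypothetical names. On a default build it should report 10, but the value depends on how the SQLite library behind Python was compiled.

```python
import sqlite3


def probe_attach_limit(ceiling=130):
    """Count how many databases this SQLite build lets one connection attach."""
    conn = sqlite3.connect(":memory:")
    attached = 0
    try:
        for i in range(ceiling):
            # Each ':memory:' attachment is a separate empty database.
            conn.execute("ATTACH DATABASE ':memory:' AS db{}".format(i))
            attached += 1
    except sqlite3.OperationalError:
        # SQLite raises once the attached-database limit is reached.
        pass
    finally:
        conn.close()
    return attached


limit = probe_attach_limit()
```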

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Support cross-database joins 325958506  
781560989 https://github.com/simonw/datasette/issues/1231#issuecomment-781560989 https://api.github.com/repos/simonw/datasette/issues/1231 MDEyOklzc3VlQ29tbWVudDc4MTU2MDk4OQ== simonw 9599 2021-02-18T18:50:53Z 2021-02-18T18:50:53Z OWNER

Ideally I'd figure out a way to replicate this error in a concurrent unit test.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Race condition errors in new refresh_schemas() mechanism 811367257  
781560865 https://github.com/simonw/datasette/issues/1231#issuecomment-781560865 https://api.github.com/repos/simonw/datasette/issues/1231 MDEyOklzc3VlQ29tbWVudDc4MTU2MDg2NQ== simonw 9599 2021-02-18T18:50:38Z 2021-02-18T18:50:38Z OWNER

I started trying to use locks to resolve this but I've not figured out the right way to do that yet - here's my first experiment:
```diff
diff --git a/datasette/app.py b/datasette/app.py
index 9e15a16..1681c9d 100644
--- a/datasette/app.py
+++ b/datasette/app.py
@@ -217,6 +217,7 @@ class Datasette:
         self.inspect_data = inspect_data
         self.immutables = set(immutables or [])
         self.databases = collections.OrderedDict()
+        self._refresh_schemas_lock = threading.Lock()
         if memory or not self.files:
             self.add_database(Database(self, is_memory=True), name="_memory")
         # memory_name is a random string so that each Datasette instance gets its own
@@ -324,6 +325,13 @@ class Datasette:
         self.client = DatasetteClient(self)
 
     async def refresh_schemas(self):
+        return
+        if self._refresh_schemas_lock.locked():
+            return
+        with self._refresh_schemas_lock:
+            await self._refresh_schemas()
+
+    async def _refresh_schemas(self):
         internal_db = self.databases["_internal"]
         if not self.internal_db_created:
             await init_internal_db(internal_db)
```
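
For comparison, here is a standalone sketch of the same skip-if-already-running pattern using `asyncio.Lock` instead of `threading.Lock`. This is only one possible direction, not the fix that was adopted; the `SchemaRefresher` class, the `refresh_count` counter, and the sleep that stands in for real work are all assumptions for illustration.

```python
import asyncio


class SchemaRefresher:
    """Hypothetical stand-in for the object that owns refresh_schemas()."""

    def __init__(self):
        self._refresh_schemas_lock = asyncio.Lock()
        self.refresh_count = 0

    async def refresh_schemas(self):
        # If another task is already refreshing, skip instead of queueing.
        if self._refresh_schemas_lock.locked():
            return
        # No await happens between the check above and acquiring the lock,
        # so another task on the same event loop cannot slip in between.
        async with self._refresh_schemas_lock:
            await self._refresh_schemas()

    async def _refresh_schemas(self):
        await asyncio.sleep(0.01)  # placeholder for the real schema work
        self.refresh_count += 1


async def demo():
    refresher = SchemaRefresher()
    # Five concurrent callers: only the first should do the work.
    await asyncio.gather(*(refresher.refresh_schemas() for _ in range(5)))
    return refresher.refresh_count


count = asyncio.run(demo())
```

An `asyncio.Lock` only coordinates tasks on a single event loop, so this sketch does not cover work that runs in other threads.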
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Race condition errors in new refresh_schemas() mechanism 811367257  
781546512 https://github.com/simonw/datasette/issues/1226#issuecomment-781546512 https://api.github.com/repos/simonw/datasette/issues/1226 MDEyOklzc3VlQ29tbWVudDc4MTU0NjUxMg== simonw 9599 2021-02-18T18:26:19Z 2021-02-18T18:26:19Z OWNER

This broke CI: https://github.com/simonw/datasette/runs/1929355965?check_suite_focus=true

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
--port option should validate port is between 0 and 65535 808843401  
781530157 https://github.com/simonw/datasette/issues/1226#issuecomment-781530157 https://api.github.com/repos/simonw/datasette/issues/1226 MDEyOklzc3VlQ29tbWVudDc4MTUzMDE1Nw== simonw 9599 2021-02-18T18:00:15Z 2021-02-18T18:00:15Z OWNER

I can use click.IntRange(min=None, max=None) for this. https://click.palletsprojects.com/en/7.x/options/#ranges - inclusive on both edges.
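
To show what "inclusive on both edges" means for a port number, here is a dependency-free sketch that uses the standard library's `argparse` in place of click. It is a stand-in for illustration, not the change made to Datasette; `port_number()`, the `serve-sketch` program name, and the default value are hypothetical.

```python
import argparse


def port_number(value):
    """Parse a port, accepting 0 and 65535 themselves (inclusive bounds)."""
    port = int(value)
    if not 0 <= port <= 65535:
        raise argparse.ArgumentTypeError(
            "{} is not in the range 0 to 65535".format(port)
        )
    return port


parser = argparse.ArgumentParser(prog="serve-sketch")
parser.add_argument("--port", type=port_number, default=8001)

lowest = parser.parse_args(["--port", "0"]).port
highest = parser.parse_args(["--port", "65535"]).port

try:
    parser.parse_args(["--port", "65536"])
    rejected = False
except SystemExit:
    # argparse reports the error and exits when the type function rejects a value.
    rejected = True
```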

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
--port option should validate port is between 0 and 65535 808843401  
781451701 https://github.com/dogsheep/google-takeout-to-sqlite/issues/4#issuecomment-781451701 https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/4 MDEyOklzc3VlQ29tbWVudDc4MTQ1MTcwMQ== Btibert3 203343 2021-02-18T16:06:21Z 2021-02-18T16:06:21Z NONE

Awesome!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Feature Request: Gmail 778380836  
781330466 https://github.com/simonw/datasette/issues/1230#issuecomment-781330466 https://api.github.com/repos/simonw/datasette/issues/1230 MDEyOklzc3VlQ29tbWVudDc4MTMzMDQ2Ng== Kabouik 7107523 2021-02-18T13:06:22Z 2021-02-18T15:22:15Z NONE

[Edit] Oh, I just saw the "Load all" button under the cluster map as well as the setting to alter the max number of results. So I guess this issue is only about the Vega charts.

Note that datasette-cluster-map also seems to be limited to 998 displayed points: ![ss-2021-02-18_140548](https://user-images.githubusercontent.com/7107523/108361225-15fb2a80-71ea-11eb-9a19-d885e8513f55.png)
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Vega charts are plotted only for rows on the visible page, cluster maps only for rows in the remaining pages 811054000  
781077127 https://github.com/simonw/datasette/issues/283#issuecomment-781077127 https://api.github.com/repos/simonw/datasette/issues/283 MDEyOklzc3VlQ29tbWVudDc4MTA3NzEyNw== simonw 9599 2021-02-18T05:56:30Z 2021-02-18T05:57:34Z OWNER

I'm going to try prototyping the --crossdb option that causes /_memory to connect to all databases as a starting point and see how well that works.

{
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 1,
    "eyes": 0
}
Support cross-database joins 325958506  
780991910 https://github.com/simonw/datasette/issues/283#issuecomment-780991910 https://api.github.com/repos/simonw/datasette/issues/283 MDEyOklzc3VlQ29tbWVudDc4MDk5MTkxMA== rayvoelker 9308268 2021-02-18T02:13:56Z 2021-02-18T02:13:56Z NONE

I was going to ask you about this issue when we talk during your office hours this Friday, but was any support ever added for this cross-database joining?

I have a use case where it could be pretty neat to do analysis with this tool on time-specific databases built from snapshots:

https://ilsweb.cincinnatilibrary.org/collection-analysis/

and thanks again for such an amazing tool!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Support cross-database joins 325958506  

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id]),
   [performed_via_github_app] TEXT
);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);
Powered by Datasette · Queries took 576.521ms · About: github-to-sqlite