{"html_url": "https://github.com/simonw/datasette/issues/283#issuecomment-552140975", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/283", "id": 552140975, "node_id": "MDEyOklzc3VlQ29tbWVudDU1MjE0MDk3NQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2019-11-09T21:51:41Z", "updated_at": "2019-11-09T21:51:41Z", "author_association": "OWNER", "body": "It may turn out that we have to recommend NOT exposing a Datasette instance to the public with dozens of database files that has multi-db queries enabled - will need to load test to understand if this recommendation is needed or not.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 325958506, "label": "Support cross-database joins"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/283#issuecomment-552140870", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/283", "id": 552140870, "node_id": "MDEyOklzc3VlQ29tbWVudDU1MjE0MDg3MA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2019-11-09T21:49:51Z", "updated_at": "2019-11-09T21:49:51Z", "author_association": "OWNER", "body": "Better idea: if you run Datasette in cross-database joining mode, all connections start out as memory connections and then have new databases attached to them on-demand.\r\n\r\nAll table view queries will be automatically rewritten to start `SELECT db.table.one, db.table.two FROM db.table ...`", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 325958506, "label": "Support cross-database joins"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/283#issuecomment-537716955", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/283", "id": 537716955, "node_id": 
"MDEyOklzc3VlQ29tbWVudDUzNzcxNjk1NQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2019-10-02T23:02:15Z", "updated_at": "2019-10-02T23:02:15Z", "author_association": "OWNER", "body": "I've been thinking pretty hard about this as part of #569. My big concerns are:\r\n\r\n* If I'm caching and reusing connections I need to worry about the different combinations - if I have four databases do I cache separate connections for the (\"one\", \"two\") AND (\"two\", \"three\") AND (\"one\", \"three\") and so on pairs?\r\n* How does the API and interface deal with instances where you have a database connected as the primary and you want to ATTACH another database and talk to that as well?\r\n\r\nI think the best way to do this is to say that cross-database joins will only be available against the `:memory:` database. Maybe with an optional mode you can run like `datasette --crossdb` which causes every database to be `ATTACHd` to that connection with an alias so you can start running queries.\r\n\r\nIf this proves to be a problem when hundreds of files are attached to a Datasette Library instance (#417) then maybe cross database joins are handled (in that case) by the authenticated user selecting which ones to ?_attach= and detaching them at the end of the request. Also perhaps limit to joining across a maximum of 3 databases at once in this case.\r\n\r\nI can probably avoid the scariest negative consequences of cross-database joins by having them turned off by default for signed-out users. 
The datasette-on-my-laptop or authenticated Datasette Library cases can be opt-in and can be a little less locked down.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 325958506, "label": "Support cross-database joins"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/283#issuecomment-391768302", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/283", "id": 391768302, "node_id": "MDEyOklzc3VlQ29tbWVudDM5MTc2ODMwMg==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2018-05-24T16:00:05Z", "updated_at": "2018-05-24T16:00:05Z", "author_association": "OWNER", "body": "I like `/-/all-5de27e3` for this (with `/-/all` redirecting to the correct hash)", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 325958506, "label": "Support cross-database joins"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/283#issuecomment-391756841", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/283", "id": 391756841, "node_id": "MDEyOklzc3VlQ29tbWVudDM5MTc1Njg0MQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2018-05-24T15:27:42Z", "updated_at": "2018-05-24T15:27:42Z", "author_association": "OWNER", "body": "For an example query that pre-populates that textarea... 
maybe a UNION that pulls the first 10 rows from the first table of each of the first two databases?\r\n\r\n```\r\nselect * from (select rowid, actors from fivethirtyeight.[love-actually/love_actually_adjacencies] limit 10)\r\n union all\r\nselect * from (select rowid, city from [google-trends].[20150430_UKDebate] limit 10)\r\n```\r\n\r\nhttps://datasette-cross-database-joins-prototype.now.sh/memory?sql=select+*+from+%28select+rowid%2C+actors+from+fivethirtyeight.%5Blove-actually%2Flove_actually_adjacencies%5D+limit+10%29%0D%0A+++union+all%0D%0Aselect+*+from+%28select+rowid%2C+city+from+%5Bgoogle-trends%5D.%5B20150430_UKDebate%5D+limit+10%29", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 325958506, "label": "Support cross-database joins"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/283#issuecomment-391755300", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/283", "id": 391755300, "node_id": "MDEyOklzc3VlQ29tbWVudDM5MTc1NTMwMA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2018-05-24T15:23:37Z", "updated_at": "2018-05-24T15:23:37Z", "author_association": "OWNER", "body": "On the `/-all-5de27e3` page we can show the regular https://fivethirtyeight.datasettes.com/fivethirtyeight-5de27e3 interface but instead of the list of tables we can show a list of attached databases plus some help text showing how to construct a cross-database join.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 325958506, "label": "Support cross-database joins"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/283#issuecomment-391754506", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/283", "id": 391754506, 
"node_id": "MDEyOklzc3VlQ29tbWVudDM5MTc1NDUwNg==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2018-05-24T15:21:37Z", "updated_at": "2018-05-24T15:21:53Z", "author_association": "OWNER", "body": "Giving it `/all/` would be easier since that way the existing URL routes (including canned queries) would all work... but I would have to teach it NOT to expect a database content hash on that URL.\r\n\r\nOr maybe it should still have a content hash (to enable far-future cache expiry headers on query results) but the hash should be constructed out of all of the other database hashes concatenated together.\r\n\r\nThat way the URLs would be `/all-5de27e3` and `/all-5de27e3/canned-query-name`\r\n\r\nOnly downside: this would make it impossible to have a database file with the name `all.db`. I think that's probably an OK trade-off. You could turn the feature off with a config flag if you really want to use that filename (for whatever reason).\r\n\r\nHow about `/-all-5de27e3/` instead to avoid collisions?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 325958506, "label": "Support cross-database joins"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/283#issuecomment-391752882", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/283", "id": 391752882, "node_id": "MDEyOklzc3VlQ29tbWVudDM5MTc1Mjg4Mg==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2018-05-24T15:17:10Z", "updated_at": "2018-05-24T15:17:10Z", "author_association": "OWNER", "body": "Another option: give this the `/-/all` URL namespace.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 325958506, "label": "Support cross-database joins"}, "performed_via_github_app": null} {"html_url": 
"https://github.com/simonw/datasette/issues/283#issuecomment-391752629", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/283", "id": 391752629, "node_id": "MDEyOklzc3VlQ29tbWVudDM5MTc1MjYyOQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2018-05-24T15:16:25Z", "updated_at": "2018-05-24T15:16:25Z", "author_association": "OWNER", "body": "Should this support canned queries too? I think it should, though that raises interesting questions regarding their URL structure.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 325958506, "label": "Support cross-database joins"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/283#issuecomment-391752425", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/283", "id": 391752425, "node_id": "MDEyOklzc3VlQ29tbWVudDM5MTc1MjQyNQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2018-05-24T15:15:51Z", "updated_at": "2018-05-24T15:15:51Z", "author_association": "OWNER", "body": "This would make Datasette's SQL features a lot more instantly obvious to people who land on a homepage, which is probably a good thing.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 325958506, "label": "Support cross-database joins"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/283#issuecomment-391752218", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/283", "id": 391752218, "node_id": "MDEyOklzc3VlQ29tbWVudDM5MTc1MjIxOA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2018-05-24T15:15:19Z", "updated_at": "2018-05-24T15:15:19Z", "author_association": "OWNER", "body": "Most of the time Datasette is used with just a single database file. 
So maybe it makes sense for this option to be turned on by default and to ALWAYS be available on the Datasette instance homepage unless the user has explicitly disabled it.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 325958506, "label": "Support cross-database joins"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/283#issuecomment-391584112", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/283", "id": 391584112, "node_id": "MDEyOklzc3VlQ29tbWVudDM5MTU4NDExMg==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2018-05-24T04:26:29Z", "updated_at": "2018-05-24T04:30:50Z", "author_association": "OWNER", "body": "I built a very rough prototype of this to prove it could work. It's deployed here - and here's an example of a query that joins across two different databases:\r\n\r\nhttps://datasette-cross-database-joins-prototype.now.sh/memory?sql=select+fivethirtyeight.%5Blove-actually%2Flove_actually_adjacencies%5D.rowid%2C%0D%0Afivethirtyeight.%5Blove-actually%2Flove_actually_adjacencies%5D.actors%2C%0D%0A%5Bgoogle-trends%5D.%5B20150430_UKDebate%5D.city%0D%0Afrom+fivethirtyeight.%5Blove-actually%2Flove_actually_adjacencies%5D%0D%0Ajoin+%5Bgoogle-trends%5D.%5B20150430_UKDebate%5D%0D%0A++on+%5Bgoogle-trends%5D.%5B20150430_UKDebate%5D.rowid+%3D+fivethirtyeight.%5Blove-actually%2Flove_actually_adjacencies%5D.rowid\r\n\r\n```\r\nselect fivethirtyeight.[love-actually/love_actually_adjacencies].rowid,\r\nfivethirtyeight.[love-actually/love_actually_adjacencies].actors,\r\n[google-trends].[20150430_UKDebate].city\r\nfrom fivethirtyeight.[love-actually/love_actually_adjacencies]\r\njoin [google-trends].[20150430_UKDebate]\r\n on [google-trends].[20150430_UKDebate].rowid = fivethirtyeight.[love-actually/love_actually_adjacencies].rowid\r\n```\r\nI deployed it like this:\r\n\r\n 
datasette publish now --branch=cross-database-joins fivethirtyeight.db google-trends.db --name=datasette-cross-database-joins-prototype\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 325958506, "label": "Support cross-database joins"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/283#issuecomment-391584527", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/283", "id": 391584527, "node_id": "MDEyOklzc3VlQ29tbWVudDM5MTU4NDUyNw==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2018-05-24T04:29:40Z", "updated_at": "2018-05-24T04:29:40Z", "author_association": "OWNER", "body": "Rather than stealing the `/memory` namespace for this it would be nicer if these cross-database joins could be executed at the very top-level URL of the Datasette instance - `https://example.com/?sql=...`", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 325958506, "label": "Support cross-database joins"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/283#issuecomment-391584366", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/283", "id": 391584366, "node_id": "MDEyOklzc3VlQ29tbWVudDM5MTU4NDM2Ng==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2018-05-24T04:28:20Z", "updated_at": "2018-05-24T04:28:20Z", "author_association": "OWNER", "body": "I used some pretty ugly hacks, like faking an entire `.inspect()` block for the `:memory:` database just to get past the errors I was seeing. 
To ship this as a feature it will need quite a bit of code refactoring to make those hacks unnecessary.\r\n\r\nhttps://github.com/simonw/datasette/blob/7a3040f5782375373b2b66e5969bc2c49b3a6f0e/datasette/views/database.py#L18-L26", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 325958506, "label": "Support cross-database joins"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/283#issuecomment-391583528", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/283", "id": 391583528, "node_id": "MDEyOklzc3VlQ29tbWVudDM5MTU4MzUyOA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2018-05-24T04:21:49Z", "updated_at": "2018-05-24T04:21:49Z", "author_association": "OWNER", "body": "The challenge here is which database should be the \"default\" database. The first database attached to SQLite is treated as the default - if no database is specified in a query, that's the database that queries will be executed against.\r\n\r\nCurrently, each database URL in Datasette (e.g. https://san-francisco.datasettes.com/sf-film-locations-84594a7 vs. https://san-francisco.datasettes.com/sf-trees-ebc2ad9 ) gets its own independent connection, and all queries within that base URL run against that database.\r\n\r\nIf we're going to attach multiple databases to the same connection, how do we set which database gets to be the default?\r\n\r\nThe easiest thing to do here will be to have a special database (maybe which is turned off by default and can be enabled using `datasette serve --enable-cross-database-joins` or similar) which attaches to ALL the databases. 
Perhaps it starts as an in-memory database, maybe at `/memory`?\r\n\r\n\r\n\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 325958506, "label": "Support cross-database joins"}, "performed_via_github_app": null}