{"html_url": "https://github.com/simonw/datasette/issues/420#issuecomment-481310295", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/420", "id": 481310295, "node_id": "MDEyOklzc3VlQ29tbWVudDQ4MTMxMDI5NQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2019-04-09T15:50:52Z", "updated_at": "2019-04-09T15:50:52Z", "author_association": "OWNER", "body": "Efficient row counts are even more important for the `DatabaseView` and `IndexView` pages.\r\n\r\nThe row counts on those pages don't have to be precise, so one option is for me to calculate them and cache them occasionally. I could even have a dedicated thread which just does the counting?\r\n\r\nIn #422 I've figured out a mechanism for getting accurate or lower-bound counts within a time limit (accurate if possible, lower-bound otherwise).", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 421971339, "label": "Fix all the places that currently use .inspect() data"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/420#issuecomment-480556166", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/420", "id": 480556166, "node_id": "MDEyOklzc3VlQ29tbWVudDQ4MDU1NjE2Ng==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2019-04-07T03:35:59Z", "updated_at": "2019-04-07T03:48:14Z", "author_association": "OWNER", "body": "Still need to solve: `TableView.data()` - but this is the one with a row count in hence the need to solve #422 ", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 421971339, "label": "Fix all the places that currently use .inspect() data"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/420#issuecomment-480552387", "issue_url": 
"https://api.github.com/repos/simonw/datasette/issues/420", "id": 480552387, "node_id": "MDEyOklzc3VlQ29tbWVudDQ4MDU1MjM4Nw==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2019-04-07T02:06:20Z", "updated_at": "2019-04-07T02:06:20Z", "author_association": "OWNER", "body": "`expand_foreign_keys()` relies on the `.inspect()` command having automatically derived the `label_column` for a table, which it does using this code:\r\n\r\nhttps://github.com/simonw/datasette/blob/97331f3435ba1583a0f9dbcaffc25de8894cf1f8/datasette/inspect.py#L34-L42\r\n\r\nThis needs access to the column names for the table. I think we can drop this entirely in favour of a new utility function - and that function can incorporate the metadata check as well.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 421971339, "label": "Fix all the places that currently use .inspect() data"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/420#issuecomment-478393116", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/420", "id": 478393116, "node_id": "MDEyOklzc3VlQ29tbWVudDQ3ODM5MzExNg==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2019-03-31T22:52:48Z", "updated_at": "2019-03-31T22:52:48Z", "author_association": "OWNER", "body": "This means the `Datasette` class needs a new property, keeping track of all of the connected databases.\r\n\r\n```\r\nds.databases = {\r\n \"name_used_in_urls\": {\r\n \"type\": \"file\", # or \"memory\"\r\n \"path\": filepath, # or None if memory\r\n \"mutable\": True, # or False\r\n \"hash\": \"...\" # or None if mutable\r\n }\r\n}\r\n```\r\n\r\nMaybe these should be objects, not dictionaries.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 421971339, 
"label": "Fix all the places that currently use .inspect() data"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/420#issuecomment-478391708", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/420", "id": 478391708, "node_id": "MDEyOklzc3VlQ29tbWVudDQ3ODM5MTcwOA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2019-03-31T22:33:32Z", "updated_at": "2019-03-31T22:34:02Z", "author_association": "OWNER", "body": "Next I need to fix this:\r\n\r\nhttps://github.com/simonw/datasette/blob/0209a0a344503157351e625f0629b686961763c9/datasette/app.py#L420-L435\r\n\r\nGiven the name of the database (from the URL e.g. https://latest.datasette.io/fixtures) I need to figure out what name I used to cache the collection.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 421971339, "label": "Fix all the places that currently use .inspect() data"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/420#issuecomment-477636768", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/420", "id": 477636768, "node_id": "MDEyOklzc3VlQ29tbWVudDQ3NzYzNjc2OA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2019-03-28T15:09:27Z", "updated_at": "2019-03-28T15:09:27Z", "author_association": "OWNER", "body": "Even more tricky: `table_exists()` is currently a synchronous function. 
If it's going to be executing a SQL query it needs to become an async function.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 421971339, "label": "Fix all the places that currently use .inspect() data"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/420#issuecomment-477633354", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/420", "id": 477633354, "node_id": "MDEyOklzc3VlQ29tbWVudDQ3NzYzMzM1NA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2019-03-28T15:01:37Z", "updated_at": "2019-03-28T15:01:37Z", "author_association": "OWNER", "body": "I started looking at how I would implement `table_exists()` with a direct call that uses `sqlite-utils` to see if a table exists.\r\n\r\nhttps://github.com/simonw/datasette/blob/82fec6048148b58748040a7e2caa163387e982a3/datasette/app.py#L303-L304\r\n\r\n`sqlite-utils` needs access to the database connection - but the database connection itself is currently only available in code that runs in a thread inside the `.execute()` method:\r\n\r\nhttps://github.com/simonw/datasette/blob/82fec6048148b58748040a7e2caa163387e982a3/datasette/app.py#L413-L426\r\n\r\nSo I'm going to need to refactor this a bit. 
I think I need a way to say \"here is a function which needs access to the connection object for database named X - run that function in a thread, give it access to that connection and then give me back the result\".\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 421971339, "label": "Fix all the places that currently use .inspect() data"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/420#issuecomment-474407617", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/420", "id": 474407617, "node_id": "MDEyOklzc3VlQ29tbWVudDQ3NDQwNzYxNw==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2019-03-19T14:55:51Z", "updated_at": "2019-03-19T14:55:51Z", "author_association": "OWNER", "body": "A microbenchmark against `fivethirtyeight.db` (415 tables):\r\n\r\n In [1]: import sqlite3 \r\n In [2]: c = sqlite3.connect(\"fivethirtyeight.db\") \r\n In [3]: %timeit c.execute(\"select name from sqlite_master where type = 'table'\").fetchall() \r\n 283 \u00b5s \u00b1 12.3 \u00b5s per loop (mean \u00b1 std. dev. of 7 runs, 1000 loops each)\r\n In [4]: tables = [r[0] for r in c.execute(\"select name from sqlite_master where type = 'table'\").fetchall()] \r\n In [5]: len(tables) \r\n Out[5]: 415\r\n In [6]: %timeit [c.execute(\"pragma foreign_keys([{}])\".format(t)).fetchall() for t in tables] \r\n 1.81 ms \u00b1 161 \u00b5s per loop (mean \u00b1 std. dev. of 7 runs, 1000 loops each)\r\n\r\nSo running `pragma foreign_keys()` against 415 tables only takes 1.81ms. 
This is going to be fine.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 421971339, "label": "Fix all the places that currently use .inspect() data"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/420#issuecomment-474399630", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/420", "id": 474399630, "node_id": "MDEyOklzc3VlQ29tbWVudDQ3NDM5OTYzMA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2019-03-19T14:38:14Z", "updated_at": "2019-03-19T14:38:14Z", "author_association": "OWNER", "body": "Most of these can be replaced with relatively straight-forward direct introspection of the SQLite table.\r\n\r\nThe one exception is the incoming foreign keys: these can only be found by inspecting ALL of the other tables. \r\n\r\nThis requires running `PRAGMA foreign_key_list([table_name])` against every other table in the database. 
How expensive is doing this on a database with hundreds of tables?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 421971339, "label": "Fix all the places that currently use .inspect() data"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/420#issuecomment-474398127", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/420", "id": 474398127, "node_id": "MDEyOklzc3VlQ29tbWVudDQ3NDM5ODEyNw==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2019-03-19T14:34:55Z", "updated_at": "2019-03-19T14:34:55Z", "author_association": "OWNER", "body": "I systematically reviewed the codebase for things that `.inspect()` is used for:\r\n\r\nIn `app.py`:\r\n\r\n* `table_exists()` uses `table in self.inspect().get(database, {}).get(\"tables\")`\r\n* `.execute()` looks up the database name to get the `info[\"file\"]` (the correct filename with the `.db` extension)\r\n\r\nIn `cli.py`:\r\n\r\n* The `datasette inspect` command dumps it to JSON\r\n* `datasette skeleton` iterates over it\r\n* `datasette serve` calls it on startup (to populate the static cache of inspect data)\r\n\r\nIn `base.py`:\r\n\r\n* `.database_url(database)` calls it to look up the hash (if the `hash_urls` config is turned on)\r\n* `.resolve_db_name()` uses it to look up the hash\r\n\r\nIn `database.py`:\r\n\r\n* `DatabaseView` uses it to look up the list of tables and views to display, plus the size of the DB file in bytes\r\n* `DatabaseDownload` uses it to get the filepath for download\r\n\r\nIn `index.py`:\r\n\r\n* `IndexView` uses it _extensively_ - to loop through every database and every table. 
This would make a good starting point for the refactor.\r\n\r\nIn `table.py`:\r\n\r\n* `sortable_columns_for_table()` uses it to find the columns in a table\r\n* `expandable_columns()` uses it to find foreign keys\r\n* `expand_foreign_keys()` uses it to find foreign keys\r\n* `display_columns_and_rows()` uses it to find primary keys and foreign keys... but also has access to a `cursor.description` which it uses to list the columns\r\n* `TableView.data` uses it to look up columns and primary keys and the `table_rows_count` (used if the thing isn't a view) and probably a few more things; this method is huge!\r\n* `RowView.data` uses it for primary keys\r\n* `foreign_key_tables()` uses it for foreign keys\r\n\r\nIn the tests it's used by `test_api.test_inspect_json()` and by a couple of tests in `test_inspect`.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 421971339, "label": "Fix all the places that currently use .inspect() data"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/420#issuecomment-473744172", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/420", "id": 473744172, "node_id": "MDEyOklzc3VlQ29tbWVudDQ3Mzc0NDE3Mg==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2019-03-18T02:08:12Z", "updated_at": "2019-03-18T02:08:12Z", "author_association": "OWNER", "body": "Maybe this is a good opportunity to improve the introspection capabilities in [sqlite-utils](https://github.com/simonw/sqlite-utils) and add it as a dependency.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 421971339, "label": "Fix all the places that currently use .inspect() data"}, "performed_via_github_app": null} {"html_url": 
"https://github.com/simonw/datasette/issues/420#issuecomment-473726587", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/420", "id": 473726587, "node_id": "MDEyOklzc3VlQ29tbWVudDQ3MzcyNjU4Nw==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2019-03-17T23:29:22Z", "updated_at": "2019-03-17T23:29:22Z", "author_association": "OWNER", "body": "Needed for #419", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 421971339, "label": "Fix all the places that currently use .inspect() data"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/420#issuecomment-473713946", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/420", "id": 473713946, "node_id": "MDEyOklzc3VlQ29tbWVudDQ3MzcxMzk0Ng==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2019-03-17T20:56:38Z", "updated_at": "2019-03-17T20:58:17Z", "author_association": "OWNER", "body": "Some examples:\r\n\r\nhttps://github.com/simonw/datasette/blob/1f54e092306b208125f39d06712b02895eb75168/datasette/views/table.py#L34-L40\r\n\r\nhttps://github.com/simonw/datasette/blob/1f54e092306b208125f39d06712b02895eb75168/datasette/views/table.py#L45-L48\r\n\r\nhttps://github.com/simonw/datasette/blob/1f54e092306b208125f39d06712b02895eb75168/datasette/views/table.py#L62-L65\r\n\r\nhttps://github.com/simonw/datasette/blob/1f54e092306b208125f39d06712b02895eb75168/datasette/views/table.py#L112-L123\r\n\r\nhttps://github.com/simonw/datasette/blob/1f54e092306b208125f39d06712b02895eb75168/datasette/views/index.py#L11-L19\r\n\r\nhttps://github.com/simonw/datasette/blob/afe9aa3ae03c485c5d6652741438d09445a486c1/datasette/views/base.py#L143-L147\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 421971339, "label": 
"Fix all the places that currently use .inspect() data"}, "performed_via_github_app": null}