html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,issue,performed_via_github_app
https://github.com/simonw/sqlite-utils/issues/366#issuecomment-1009508865,https://api.github.com/repos/simonw/sqlite-utils/issues/366,1009508865,IC_kwDOCGYnMM48K-IB,9599,2022-01-11T01:08:51Z,2022-01-11T01:08:51Z,OWNER,"The Python methods are all done now, next step is the CLI options. I'll do those in a separate issue.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1096563265,
https://github.com/simonw/sqlite-utils/issues/366#issuecomment-1009288898,https://api.github.com/repos/simonw/sqlite-utils/issues/366,1009288898,IC_kwDOCGYnMM48KIbC,9599,2022-01-10T19:54:04Z,2022-01-10T19:54:04Z,OWNER,"Having browsed the API reference I think the methods that would benefit from an `analyze=True` parameter are:
- `db.create_index`
- `table.insert_all`
- `table.upsert_all`
- `table.delete_where`","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1096563265,
https://github.com/simonw/sqlite-utils/issues/366#issuecomment-1009285627,https://api.github.com/repos/simonw/sqlite-utils/issues/366,1009285627,IC_kwDOCGYnMM48KHn7,9599,2022-01-10T19:49:19Z,2022-01-10T19:51:25Z,OWNER,Documentation for those two new methods: https://sqlite-utils.datasette.io/en/latest/python-api.html#optimizing-index-usage-with-analyze,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1096563265,
https://github.com/simonw/sqlite-utils/issues/366#issuecomment-1009286373,https://api.github.com/repos/simonw/sqlite-utils/issues/366,1009286373,IC_kwDOCGYnMM48KHzl,9599,2022-01-10T19:50:22Z,2022-01-10T19:50:22Z,OWNER,"With respect to #365, I'm now thinking that having the ability to say ""... and then run ANALYZE"" could be useful for a bunch of Python methods. For example:
```python
db[""dogs""].insert_all(list_of_dogs, analyze=True)
db[""dogs""].create_index([""name""], analyze=True)
```
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1096563265,
https://github.com/simonw/sqlite-utils/issues/366#issuecomment-1009273525,https://api.github.com/repos/simonw/sqlite-utils/issues/366,1009273525,IC_kwDOCGYnMM48KEq1,9599,2022-01-10T19:32:39Z,2022-01-10T19:32:39Z,OWNER,"I'm going to implement the Python library methods based on the prototype:
```diff
commit 650f97a08f29a688c530e5f6c9eedc9269ed7bdc
Author: Simon Willison
Date: Sat Jan 8 13:34:01 2022 -0800
Initial prototype of .analyze(), refs #366
diff --git a/sqlite_utils/db.py b/sqlite_utils/db.py
index dfc4723..1348b4a 100644
--- a/sqlite_utils/db.py
+++ b/sqlite_utils/db.py
@@ -923,6 +923,13 @@ class Database:
""Run a SQLite ``VACUUM`` against the database.""
self.execute(""VACUUM;"")
+ def analyze(self, name=None):
+ ""Run ``ANALYZE`` against the entire database or a named table or index.""
+ sql = ""ANALYZE""
+ if name is not None:
+ sql += "" [{}]"".format(name)
+ self.execute(sql)
+
class Queryable:
def exists(self) -> bool:
@@ -2902,6 +2909,10 @@ class Table(Queryable):
)
return self
+ def analyze(self):
+ ""Run ANALYZE against this table""
+ self.db.analyze(self.name)
+
def analyze_column(
self, column: str, common_limit: int = 10, value_truncate=None, total_rows=None
) -> ""ColumnDetails"":
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1096563265,
https://github.com/simonw/sqlite-utils/issues/366#issuecomment-1008158616,https://api.github.com/repos/simonw/sqlite-utils/issues/366,1008158616,IC_kwDOCGYnMM48F0eY,9599,2022-01-08T21:35:32Z,2022-01-08T21:35:32Z,OWNER,"Built a prototype in a branch, see #367.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1096563265,
https://github.com/simonw/sqlite-utils/issues/366#issuecomment-1008157132,https://api.github.com/repos/simonw/sqlite-utils/issues/366,1008157132,IC_kwDOCGYnMM48F0HM,9599,2022-01-08T21:23:08Z,2022-01-08T21:25:05Z,OWNER,"Running `ANALYZE` creates a new visible table called `sqlite_stat1`: https://www.sqlite.org/fileformat.html#the_sqlite_stat1_table
This should be added to the default list of hidden tables in Datasette.
It looks something like this:
| tbl | idx | stat |
|---------------------------------|------------------------------------|-----------|
| _counts | sqlite_autoindex__counts_1 | 5 1 |
| global-power-plants_fts_config | global-power-plants_fts_config | 1 1 |
| global-power-plants_fts_docsize | | 33643 |
| global-power-plants_fts_idx | global-power-plants_fts_idx | 199 40 1 |
| global-power-plants_fts_data | | 136 |
| global-power-plants | ""global-power-plants_owner"" | 33643 4 |
| global-power-plants | ""global-power-plants_country_long"" | 33643 202 |
> In each such row, the sqlite_stat.stat column will be a string consisting of a list of integers followed by zero or more arguments. The first integer in this list is the approximate number of rows in the index. (The number of rows in the index is the same as the number of rows in the table, except for partial indexes.) The second integer is the approximate number of rows in the index that have the same value in the first column of the index. The third integer is the number number of rows in the index that have the same value for the first two columns. The N-th integer (for N>1) is the estimated average number of rows in the index which have the same value for the first N-1 columns. For a K-column index, there will be K+1 integers in the stat column. If the index is unique, then the last integer will be 1. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1096563265,
https://github.com/simonw/sqlite-utils/issues/366#issuecomment-1007641634,https://api.github.com/repos/simonw/sqlite-utils/issues/366,1007641634,IC_kwDOCGYnMM48D2Qi,9599,2022-01-07T18:35:35Z,2022-01-07T18:35:35Z,OWNER,"Since the existing CLI feature is this:
$ sqlite-utils analyze-tables github.db tags
I can add `sqlite-utils analyze` to reflect the Python library method.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1096563265,
https://github.com/simonw/sqlite-utils/issues/366#issuecomment-1007639860,https://api.github.com/repos/simonw/sqlite-utils/issues/366,1007639860,IC_kwDOCGYnMM48D100,9599,2022-01-07T18:32:59Z,2022-01-07T18:33:07Z,OWNER,"From the SQLite docs:
> If no arguments are given, all attached databases are analyzed. If a schema name is given as the argument, then all tables and indices in that one database are analyzed. If the argument is a table name, then only that table and the indices associated with that table are analyzed. If the argument is an index name, then only that one index is analyzed.
So I think this becomes two methods:
- `db.analyze()` calls analyze on the whole database
- `db.analyze(name_of_table_or_index)` for a specific named table or index
- `table.analyze()` is a shortcut for `db.analyze(table.name)`","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1096563265,
https://github.com/simonw/sqlite-utils/issues/366#issuecomment-1007637963,https://api.github.com/repos/simonw/sqlite-utils/issues/366,1007637963,IC_kwDOCGYnMM48D1XL,9599,2022-01-07T18:30:13Z,2022-01-07T18:30:13Z,OWNER,"Annoyingly I use the word ""analyze"" to mean something else in the CLI - for these features:
- #207
- #320
there's only one method with a similar name in the Python library though and that's this one:
https://github.com/simonw/sqlite-utils/blob/6e46b9913411682f3a3ec66f4d58886c1db8654b/sqlite_utils/db.py#L2904-L2906","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1096563265,