github
html_url | issue_url | id | node_id | user | created_at | updated_at | author_association | body | reactions | issue | performed_via_github_app |
---|---|---|---|---|---|---|---|---|---|---|---|
https://github.com/simonw/sqlite-utils/issues/365#issuecomment-1008166084 | https://api.github.com/repos/simonw/sqlite-utils/issues/365 | 1008166084 | IC_kwDOCGYnMM48F2TE | 536941 | 2022-01-08T22:32:47Z | 2022-01-08T22:32:47Z | CONTRIBUTOR | or using “ pragma optimize” | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1096558279 | |
https://github.com/simonw/sqlite-utils/issues/365#issuecomment-1008164786 | https://api.github.com/repos/simonw/sqlite-utils/issues/365 | 1008164786 | IC_kwDOCGYnMM48F1-y | 536941 | 2022-01-08T22:24:19Z | 2022-01-08T22:24:19Z | CONTRIBUTOR | the out-of-date scenario you describe could be addressed by automatically adding an analyze to the insert or convert commands if they implicate an index | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1096558279 | |
https://github.com/simonw/sqlite-utils/issues/365#issuecomment-1008164116 | https://api.github.com/repos/simonw/sqlite-utils/issues/365 | 1008164116 | IC_kwDOCGYnMM48F10U | 536941 | 2022-01-08T22:18:57Z | 2022-01-08T22:18:57Z | CONTRIBUTOR | the table with the query ran so bad was about 50k. i think the scenario should not be worse than no stats. i also did not know that sqlite was so different from postgres and needed an explicit analyze call. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1096558279 | |
https://github.com/simonw/sqlite-utils/issues/365#issuecomment-1008163050 | https://api.github.com/repos/simonw/sqlite-utils/issues/365 | 1008163050 | IC_kwDOCGYnMM48F1jq | 9599 | 2022-01-08T22:10:51Z | 2022-01-08T22:10:51Z | OWNER | Is there a downside to having a `sqlite_stat1` table if it has wildly incorrect statistics in it? Imagine the following sequence of events: - User imports a few records, creating the table, using `sqlite-utils insert` - User runs `sqlite-utils create-index ...` which also creates and populates the `sqlite_stat1` table - User runs `insert` again to populate several million new records The user now has a database file with several million records and a statistics table that is wildly out of date, having been populated when they only had a few. Will this result in surprisingly bad query performance compared to it that statistics table did not exist at all? If so, I lean much harder towards `ANALYZE` as a strictly opt-in optimization, maybe with the `--analyze` option added to `sqlite-utils insert` top to help users opt in to updating their statistics after running big inserts. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1096558279 | |
https://github.com/simonw/sqlite-utils/issues/365#issuecomment-1008161965 | https://api.github.com/repos/simonw/sqlite-utils/issues/365 | 1008161965 | IC_kwDOCGYnMM48F1St | 536941 | 2022-01-08T22:02:56Z | 2022-01-08T22:02:56Z | CONTRIBUTOR | for options 2 and 3, i would worry about discoverablity. in other db’s it is not necessary to explicitly call analyze for most indices. ie for postgres > The system regularly collects statistics on all of a table's columns. Newly-created non-expression indexes can immediately use these statistics to determine an index's usefulness. i suppose i would propose raising a warning if the stats table is created that explains what is going on and informs users about a —no-analyze argument. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1096558279 | |
https://github.com/simonw/sqlite-utils/issues/365#issuecomment-1008158357 | https://api.github.com/repos/simonw/sqlite-utils/issues/365 | 1008158357 | IC_kwDOCGYnMM48F0aV | 9599 | 2022-01-08T21:33:07Z | 2022-01-08T21:33:07Z | OWNER | The one thing that worries me a little bit about doing this by default is that it adds a surprising new table to the database - it may be confusing to users if they run `create-index` and their database suddenly has a new `sqlite_stat1` table, see https://github.com/simonw/sqlite-utils/issues/366#issuecomment-1008157132 Options here are: - Do it anyway. People can tolerate a surprise table appearing when they create an index. - Only run `ANALYZE` if the user says `sqlite-utils create-index ... --analyze` - Use the `--analyze` option, but also automatically run `ANALYZE` if they create an index and the database they are working with already has a `sqlite_stat1` table I'm currently leading towards that third option - @fgregg any thoughts? | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1096558279 |