{"html_url": "https://github.com/simonw/datasette/issues/268#issuecomment-880153069", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/268", "id": 880153069, "node_id": "MDEyOklzc3VlQ29tbWVudDg4MDE1MzA2OQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-07-14T19:31:00Z", "updated_at": "2021-07-14T19:31:00Z", "author_association": "OWNER", "body": "... though interestingly I can't replicate that error on `latest.datasette.io` - https://latest.datasette.io/fixtures/searchable?_search=park.&_searchmode=raw\r\n\r\nThat's running https://latest.datasette.io/-/versions SQLite 3.35.4 whereas https://www.niche-museums.com/-/versions is running 3.27.2 (the most recent version available with Vercel) - but there's nothing in the SQLite changelog between those two versions that suggests changes to how the FTS5 parser works. https://www.sqlite.org/changes.html", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 323718842, "label": "Mechanism for ranking results from SQLite full-text search"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/268#issuecomment-880150755", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/268", "id": 880150755, "node_id": "MDEyOklzc3VlQ29tbWVudDg4MDE1MDc1NQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-07-14T19:26:47Z", "updated_at": "2021-07-14T19:29:08Z", "author_association": "OWNER", "body": "> What are the side-effects of turning that on in the query string, or even by default as you suggested? I see that you stated in the docs... \"to ensure they do not cause any confusion for users who are not aware of them\", but I'm not sure what those could be.\r\n\r\nMainly that it's possible to generate SQL queries that crash with an error. This was the example that convinced me to default to escaping:\r\n\r\n- https://www.niche-museums.com/browse/museums?_search=park.&_searchmode=raw (returns `fts5: syntax error near \".\"`)\r\n\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 323718842, "label": "Mechanism for ranking results from SQLite full-text search"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/268#issuecomment-876721585", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/268", "id": 876721585, "node_id": "MDEyOklzc3VlQ29tbWVudDg3NjcyMTU4NQ==", "user": {"value": 9308268, "label": "rayvoelker"}, "created_at": "2021-07-08T20:22:17Z", "updated_at": "2021-07-08T20:22:17Z", "author_association": "NONE", "body": "I do like the idea of there being a option for turning that on by default so that you could use those terms in the default \"Search\" bar presented when you browse to a table where FTS has been enabled. Maybe even a small inline pop up with a short bit explaining the FTS feature and the keywords (e.g. case matters). What are the side-effects of turning that on in the query string, or even by default as you suggested? I see that you stated in the docs... \"to ensure they do not cause any confusion for users who are not aware of them\", but I'm not sure what those could be.\r\n\r\nIsn't it the case that those keywords are only picked up by sqlite in where you're using the MATCH clause?\r\n\r\nSeems like a really powerful feature (even though there are a lot of hurdles around setting it up in the sqlite db ... sqlite-utils makes that so simple by the way!)", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 323718842, "label": "Mechanism for ranking results from SQLite full-text search"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/268#issuecomment-876616414", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/268", "id": 876616414, "node_id": "MDEyOklzc3VlQ29tbWVudDg3NjYxNjQxNA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-07-08T17:29:04Z", "updated_at": "2021-07-08T17:29:04Z", "author_association": "OWNER", "body": "> I had setup a full text search on my instance of Datasette for title data for our public library, and was noticing that some of the features of the SQLite FTS weren't working as expected ... and maybe the issue is in the `escape_fts()` function\r\n\r\nThat's a deliberate feature (albeit controversial, see #759) - part of the main problem here is that it's easy to construct a SQLite full-text search string which results in a database error. This is a bad user-experience!\r\n\r\nYou can opt-in to raw SQL queries by appending `?_searchmode=raw` to the page, see https://docs.datasette.io/en/stable/full_text_search.html#advanced-sqlite-search-queries\r\n\r\nBut maybe there should be an option for turning that on by default without needing the query string?\r\n\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 323718842, "label": "Mechanism for ranking results from SQLite full-text search"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/268#issuecomment-876428348", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/268", "id": 876428348, "node_id": "MDEyOklzc3VlQ29tbWVudDg3NjQyODM0OA==", "user": {"value": 9308268, "label": "rayvoelker"}, "created_at": "2021-07-08T13:13:12Z", "updated_at": "2021-07-08T13:13:12Z", "author_association": "NONE", "body": "I had setup a full text search on my instance of Datasette for title data for our public library, and was noticing that some of the features of the SQLite FTS weren't working as expected ... and maybe the issue is in the `escape_fts()` function\r\n\r\n![image](https://user-images.githubusercontent.com/9308268/124925900-f1ea8b00-dfca-11eb-895e-59cc083d6524.png)\r\nvs removing the function...\r\n![image](https://user-images.githubusercontent.com/9308268/124925971-0464c480-dfcb-11eb-8fbf-8e9b5d6e0861.png)\r\n\r\nAlso, on the issue of sorting by rank by default .. perhaps something like this could work for the baked-in default SQL query for Datasette?\r\n![image](https://user-images.githubusercontent.com/9308268/124927191-5a863780-dfcc-11eb-9908-3f63577d5ff5.png)\r\n\r\n[link to the above search in my instance of Datasette](https://ilsweb.cincinnatilibrary.org/collection-analysis/current_collection-87a9011?sql=with+fts_search+as+%28%0D%0A++select%0D%0A++rowid%2C%0D%0A++rank%0D%0A++++from%0D%0A++++++bib_fts%0D%0A++++where%0D%0A++++++bib_fts+match+%3Asearch%0D%0A%29%0D%0A%0D%0Aselect%0D%0A++%0D%0A++bib_record_num%2C%0D%0A++creation_date%2C%0D%0A++record_last_updated%2C%0D%0A++isbn%2C%0D%0A++best_author%2C%0D%0A++best_title%2C%0D%0A++publisher%2C%0D%0A++publish_year%2C%0D%0A++bib_level_callnumber%2C%0D%0A++indexed_subjects%0D%0Afrom%0D%0A++fts_search%0D%0A++join+bib+on+bib.rowid+%3D+fts_search.rowid%0D%0A++%0D%0Aorder+by%0D%0Arank%0D%0A&search=black+death+NOT+fiction)", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 323718842, "label": "Mechanism for ranking results from SQLite full-text search"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/268#issuecomment-790257263", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/268", "id": 790257263, "node_id": "MDEyOklzc3VlQ29tbWVudDc5MDI1NzI2Mw==", "user": {"value": 649467, "label": "mhalle"}, "created_at": "2021-03-04T03:20:23Z", "updated_at": "2021-03-04T03:20:23Z", "author_association": "NONE", "body": "It's kind of an ugly hack, but you can try out what using the fts5 table as an actual datasette-accessible table looks like without changing any datasette code by creating yet another view on top of the fts5 table:\r\n\r\n`create view proxyview as select *, rank, table_fts as fts from table_fts;`\r\n\r\nThat's now visible from datasette, just like any other view, but you can use `fts match escape_fts(search_string) order by rank`.\r\n\r\nThis is only good as a proof of concept because you're inefficiently going from view -> fts5 external content table -> view -> data table. However, it does show it works.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 323718842, "label": "Mechanism for ranking results from SQLite full-text search"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/268#issuecomment-789409126", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/268", "id": 789409126, "node_id": "MDEyOklzc3VlQ29tbWVudDc4OTQwOTEyNg==", "user": {"value": 649467, "label": "mhalle"}, "created_at": "2021-03-03T03:57:15Z", "updated_at": "2021-03-03T03:58:40Z", "author_association": "NONE", "body": "In FTS5, I think doing an FTS search is actually much easier than doing a join against the main table like datasette does now. In fact, FTS5 external content tables provide a transparent interface back to the original table or view.\r\n\r\nHere's what I'm currently doing:\r\n* build a view that joins whatever tables I want and rename the columns to non-joiny names (e.g, `chapter.name AS chapter_name` in the view where needed)\r\n* Create an FTS5 table with `content=\"viewname\"`\r\n* As described in the \"external content tables\" section (https://www.sqlite.org/fts5.html#external_content_tables), sql queries can be made directly to the FTS table, which behind the covers makes select calls to the content table when the content of the original columns are needed.\r\n* In addition, you get \"rank\" and \"bm25()\" available to you when you select on the _fts table.\r\n\r\nUnfortunately, datasette doesn't currently seem happy being coerced into doing a real query on an fts5 table. This works:\r\n```select col1, col2, col3 from table_fts where coll1=\"value\" and table_fts match escape_fts(\"search term\") order by rank```\r\n\r\nBut this doesn't work in the datasette SQL query interface:\r\n```select col1, col2, col3 from table_fts where coll1=\"value\" and table_fts match escape_fts(:search) order by rank``` (the \"search\" input text field doesn't show up)\r\n\r\nFor what datasette is doing right now, I think you could just use contentless fts5 tables (`content=\"\"`), since all you care about is the rowid since all you're doing a subselect to get the rowid anyway. In fts5, that's just a contentless table.\r\n\r\nI guess if you want to follow this suggestion, you'd need a somewhat different code path for fts5.\r\n\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 323718842, "label": "Mechanism for ranking results from SQLite full-text search"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/268#issuecomment-726419027", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/268", "id": 726419027, "node_id": "MDEyOklzc3VlQ29tbWVudDcyNjQxOTAyNw==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-11-13T00:09:04Z", "updated_at": "2020-11-13T00:09:04Z", "author_association": "OWNER", "body": "Part of the challenge here is that this is the first time the `TableView` will have had a complete rewrite of the SQL it is going to execute. That SQL is currently constructed here: https://github.com/simonw/datasette/blob/5eb8e9bf250b26e30b017d39a392c33973997656/datasette/views/table.py#L628-L636", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 323718842, "label": "Mechanism for ranking results from SQLite full-text search"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/268#issuecomment-723740546", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/268", "id": 723740546, "node_id": "MDEyOklzc3VlQ29tbWVudDcyMzc0MDU0Ng==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-11-09T04:01:50Z", "updated_at": "2020-11-09T04:01:50Z", "author_association": "OWNER", "body": "I should depend on `sqlite-fts4` - I'm doing that in `sqlite-utils` now and it works great: https://github.com/simonw/sqlite-utils/issues/198", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 323718842, "label": "Mechanism for ranking results from SQLite full-text search"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/268#issuecomment-721896822", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/268", "id": 721896822, "node_id": "MDEyOklzc3VlQ29tbWVudDcyMTg5NjgyMg==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-11-04T18:23:29Z", "updated_at": "2020-11-04T18:23:29Z", "author_association": "OWNER", "body": "Worth noting that joining to get the rank works for FTS5 but not for FTS4 - see comment here: https://github.com/simonw/sqlite-utils/issues/192#issuecomment-721420539\r\n\r\nEasiest solution would be to only support sort-by-rank for FTS5 tables. Alternative would be to depend on https://github.com/simonw/sqlite-fts4", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 323718842, "label": "Mechanism for ranking results from SQLite full-text search"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/268#issuecomment-675725464", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/268", "id": 675725464, "node_id": "MDEyOklzc3VlQ29tbWVudDY3NTcyNTQ2NA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-08-18T21:18:07Z", "updated_at": "2020-08-18T21:18:35Z", "author_association": "OWNER", "body": "I want this on the table page - but that means that the table page will need to run a slightly more complex query since it needs access to a `rank` column to sort by - which it gets from running a join.\r\n\r\nBUT... that join needs to be constructed in a way that keeps existing filters, `?_where=` clauses etc intact.\r\n\r\nHere's a prototype using SQLite CTEs: https://register-of-members-interests.datasettes.com/regmem?sql=with+original+as+%28select+rowid%2C+*+from+items%29%0D%0Aselect%0D%0A++original.*%2C%0D%0A++items_fts.rank+as+items_fts_rank%0D%0Afrom%0D%0A++original+join+items_fts+on+original.rowid+%3D+items_fts.rowid%0D%0Awhere%0D%0A++items_fts+match+escape_fts%28%3Asearch%29%0D%0Aorder+by+items_fts_rank+desc+limit+10&search=hotel\r\n\r\n```sql\r\nwith original as (\r\n select\r\n rowid,\r\n *\r\n from\r\n items\r\n)\r\nselect\r\n original.*,\r\n items_fts.rank as items_fts_rank\r\nfrom\r\n original\r\n join items_fts on original.rowid = items_fts.rowid\r\nwhere\r\n items_fts match escape_fts(:search)\r\norder by\r\n items_fts_rank desc\r\nlimit\r\n 10\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 323718842, "label": "Mechanism for ranking results from SQLite full-text search"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/268#issuecomment-504880796", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/268", "id": 504880796, "node_id": "MDEyOklzc3VlQ29tbWVudDUwNDg4MDc5Ng==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2019-06-24T06:47:23Z", "updated_at": "2019-06-24T06:47:23Z", "author_association": "OWNER", "body": "I did a bunch of research relevant to this a while ago: https://simonwillison.net/2019/Jan/7/exploring-search-relevance-algorithms-sqlite/", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 323718842, "label": "Mechanism for ranking results from SQLite full-text search"}, "performed_via_github_app": null}