html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,issue,performed_via_github_app https://github.com/simonw/sqlite-utils/issues/196#issuecomment-722055104,https://api.github.com/repos/simonw/sqlite-utils/issues/196,722055104,MDEyOklzc3VlQ29tbWVudDcyMjA1NTEwNA==,9599,2020-11-05T00:47:34Z,2020-11-05T00:47:34Z,OWNER,"This is surprisingly difficult. I need to parse the `CREATE VIRTUAL TABLE` statement, which will look something like this: ```sql CREATE VIRTUAL TABLE ""global-power-plants_fts"" USING FTS5 (""name"", content=""global-power-plants"") ``` The problem is I need to be able to handle various different quoting formats for the table name (`mytable` v.s. `""mytable""` v.s. `[mytable]`) plus I need to look out for `CREATE TABLE IF NOT EXISTS`.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",736520310, https://github.com/simonw/sqlite-utils/issues/196#issuecomment-722055291,https://api.github.com/repos/simonw/sqlite-utils/issues/196,722055291,MDEyOklzc3VlQ29tbWVudDcyMjA1NTI5MQ==,9599,2020-11-05T00:48:10Z,2020-11-05T00:48:10Z,OWNER,This is blocking landing `.search()` in #195,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",736520310, https://github.com/simonw/sqlite-utils/issues/196#issuecomment-722056576,https://api.github.com/repos/simonw/sqlite-utils/issues/196,722056576,MDEyOklzc3VlQ29tbWVudDcyMjA1NjU3Ng==,9599,2020-11-05T00:52:42Z,2020-11-05T00:52:42Z,OWNER,"I could use a parsing library like https://parsy.readthedocs.io/en/latest/tutorial.html for this - or `pyparsing` which has a SQLite example here: https://github.com/pyparsing/pyparsing/blob/master/examples/select_parser.py I'd rather not add a new dependency for this though so I'm going to see if I can get something that's good-enough just using a regular expression.","{""total_count"": 0, ""+1"": 
0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",736520310, https://github.com/simonw/sqlite-utils/issues/196#issuecomment-722057392,https://api.github.com/repos/simonw/sqlite-utils/issues/196,722057392,MDEyOklzc3VlQ29tbWVudDcyMjA1NzM5Mg==,9599,2020-11-05T00:55:31Z,2020-11-05T00:55:51Z,OWNER,"https://sqlite.org/lang_keywords.html says: > There are four ways of quoting keywords in SQLite: > > **'keyword'** A keyword in single quotes is a string literal. > **""keyword""** A keyword in double-quotes is an identifier. > **[keyword]** A keyword enclosed in square brackets is an identifier. This is not standard SQL. This quoting mechanism is used by MS Access and SQL Server and is included in SQLite for compatibility. > **\`keyword\`** A keyword enclosed in grave accents (ASCII code 96) is an identifier. This is not standard SQL. This quoting mechanism is used by MySQL and is included in SQLite for compatibility.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",736520310, https://github.com/simonw/sqlite-utils/issues/196#issuecomment-722057923,https://api.github.com/repos/simonw/sqlite-utils/issues/196,722057923,MDEyOklzc3VlQ29tbWVudDcyMjA1NzkyMw==,9599,2020-11-05T00:57:22Z,2020-11-05T00:57:22Z,OWNER,"Then https://sqlite.org/lang_expr.html#literal_values_constants_ says: > A string constant is formed by enclosing the string in single quotes ('). A single quote within the string can be encoded by putting two single quotes in a row - as in Pascal. C-style escapes using the backslash character are not supported because they are not standard SQL. 
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",736520310, https://github.com/simonw/sqlite-utils/issues/196#issuecomment-722058598,https://api.github.com/repos/simonw/sqlite-utils/issues/196,722058598,MDEyOklzc3VlQ29tbWVudDcyMjA1ODU5OA==,9599,2020-11-05T00:59:58Z,2020-11-05T00:59:58Z,OWNER,"That two-in-a-row thing works for `""` too: https://latest.datasette.io/fixtures?sql=select+%22foo%22%2C+%27bar%27%2C+%22foo%22%22and%22%2C+%27bar%27%27and%27 ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",736520310, https://github.com/simonw/sqlite-utils/issues/196#issuecomment-722062082,https://api.github.com/repos/simonw/sqlite-utils/issues/196,722062082,MDEyOklzc3VlQ29tbWVudDcyMjA2MjA4Mg==,9599,2020-11-05T01:10:51Z,2020-11-05T01:10:51Z,OWNER,"I confirmed all four of these are valid syntax for creating tables: ``` ~ % sqlite3 tmp.db SQLite version 3.28.0 2019-04-15 14:49:49 Enter "".help"" for usage hints. 
sqlite> create table 'foo''and' (id int); sqlite> create table ""bar""""and"" (id int); sqlite> create table [baz] (id int); sqlite> create table `bant` (id int); sqlite> .schema CREATE TABLE IF NOT EXISTS 'foo''and' (id int); CREATE TABLE IF NOT EXISTS ""bar""""and"" (id int); CREATE TABLE [baz] (id int); CREATE TABLE `bant` (id int); sqlite> select * from sqlite_master; table|foo'and|foo'and|2|CREATE TABLE 'foo''and' (id int) table|bar""and|bar""and|3|CREATE TABLE ""bar""""and"" (id int) table|baz|baz|4|CREATE TABLE [baz] (id int) table|bant|bant|5|CREATE TABLE `bant` (id int) ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",736520310, https://github.com/simonw/sqlite-utils/issues/196#issuecomment-722062449,https://api.github.com/repos/simonw/sqlite-utils/issues/196,722062449,MDEyOklzc3VlQ29tbWVudDcyMjA2MjQ0OQ==,9599,2020-11-05T01:12:14Z,2020-11-05T01:12:14Z,OWNER,"Good news: I don't think I have to deal with `foo.tablename`, because that doesn't get reflected in the `sqlite_master` table: ``` sqlite> attach 'foo.db' as foo; sqlite> create table foo.`bant` (id int); sqlite> select * from foo.sqlite_master; table|bant|bant|2|CREATE TABLE `bant` (id int) ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",736520310, https://github.com/simonw/sqlite-utils/issues/196#issuecomment-722064258,https://api.github.com/repos/simonw/sqlite-utils/issues/196,722064258,MDEyOklzc3VlQ29tbWVudDcyMjA2NDI1OA==,9599,2020-11-05T01:18:07Z,2020-11-05T01:21:31Z,OWNER,"``` In [8]: r = re.compile(r""""""'[^']*(?:''[^']*)*'"""""") In [9]: r.match(""'fo'o'"") Out[9]: <re.Match object; span=(0, 4), match=""'fo'""> In [10]: r.match(""'fo''o'"") Out[10]: <re.Match object; span=(0, 7), match=""'fo''o'""> ``` `'[^']*(?:''[^']*)*'` This matches a single quote, then 0+ not-single-quotes, then 0+ repetitions of (a doubled single quote followed by 0+ not-single-quotes), then a closing single quote. 
This uses the unrolling-the-loop technique described here: http://www.softec.lu/site/RegularExpressions/UnrollingTheLoop","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",736520310, https://github.com/simonw/sqlite-utils/issues/196#issuecomment-722070569,https://api.github.com/repos/simonw/sqlite-utils/issues/196,722070569,MDEyOklzc3VlQ29tbWVudDcyMjA3MDU2OQ==,9599,2020-11-05T01:38:40Z,2020-11-05T01:38:40Z,OWNER,"I'm going to try `re.VERBOSE` to see if I can make this readable with comments. https://docs.python.org/3/howto/regex.html ```python charref = re.compile(r"""""" &[#] # Start of a numeric entity reference ( 0[0-7]+ # Octal form | [0-9]+ # Decimal form | x[0-9a-fA-F]+ # Hexadecimal form ) ; # Trailing semicolon """""", re.VERBOSE) ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",736520310, https://github.com/simonw/sqlite-utils/issues/196#issuecomment-722078286,https://api.github.com/repos/simonw/sqlite-utils/issues/196,722078286,MDEyOklzc3VlQ29tbWVudDcyMjA3ODI4Ng==,9599,2020-11-05T02:04:18Z,2020-11-05T02:04:18Z,OWNER,"I think this might be it: ```python create_virtual_table_re = re.compile(r"""""" \s*CREATE\s+VIRTUAL\s+TABLE\s+ # CREATE VIRTUAL TABLE ( '(?P<squoted_table>[^']*(?:''[^']*)*)' | # single quoted name ""(?P<dquoted_table>[^""]*(?:""""[^""]*)*)"" | # double quoted name `(?P<backtick_table>[^`]+)` | # `backtick` quoted name \[(?P<squarequoted_table>[^\]]+)\] # [...] quoted name ) \s+(IF\s+NOT\s+EXISTS\s+)? 
# IF NOT EXISTS (optional) USING\s+(?P<using>\w+) """""", re.VERBOSE | re.IGNORECASE) ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",736520310, https://github.com/simonw/sqlite-utils/issues/196#issuecomment-722078361,https://api.github.com/repos/simonw/sqlite-utils/issues/196,722078361,MDEyOklzc3VlQ29tbWVudDcyMjA3ODM2MQ==,9599,2020-11-05T02:04:33Z,2020-11-05T02:04:33Z,OWNER,Next step: lots of unit tests.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",736520310, https://github.com/simonw/sqlite-utils/issues/196#issuecomment-722082497,https://api.github.com/repos/simonw/sqlite-utils/issues/196,722082497,MDEyOklzc3VlQ29tbWVudDcyMjA4MjQ5Nw==,9599,2020-11-05T02:18:08Z,2020-11-05T02:18:08Z,OWNER,"I'm missing the case where a table has no quotes around it at all - `create virtual table foo using fts5` So I need to know how to create a regex for a SQLite identifier. https://www.sqlite.org/draft/tokenreq.html seems to be the only available documentation for that. > > ### Identifier tokens > > Identifiers follow the usual rules with the exception that SQLite allows the dollar-sign symbol in the interior of an identifier. The dollar-sign is for compatibility with Microsoft SQL-Server and is not part of the SQL standard. > > > **H41130:** SQLite shall recognize as an ID token any sequence of characters that begins with an ALPHABETIC character and continue with zero or more ALPHANUMERIC characters and/or ""$"" (u0024) characters and which is not a keyword token. > > Identifiers can be arbitrary character strings within square brackets. This feature is also for compatibility with Microsoft SQL-Server and not a part of the SQL standard. 
> > > **H41140:** SQLite shall recognize as an ID token any sequence of non-zero characters that begins with ""["" (u005b) and continuing through the first ""]"" (u005d) character. > > The standard way of quoting SQL identifiers is to use double-quotes. > > > **H41150:** SQLite shall recognize as an ID token any sequence of characters that begins with a double-quote (u0022), is followed by zero or more non-zero characters and/or pairs of double-quotes (u0022) and terminates with a double-quote (u0022) that is not part of a pair. > > MySQL allows identifiers to be quoted using the grave accent character. SQLite supports this for interoperability. > > > **H41160:** SQLite shall recognize as an ID token any sequence of characters that begins with a grave accent (u0060), is followed by zero or more non-zero characters and/or pairs of grave accents (u0060) and terminates with a grave accent (u0022) that is not part of a pair.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",736520310, https://github.com/simonw/sqlite-utils/issues/196#issuecomment-722082759,https://api.github.com/repos/simonw/sqlite-utils/issues/196,722082759,MDEyOklzc3VlQ29tbWVudDcyMjA4Mjc1OQ==,9599,2020-11-05T02:18:58Z,2020-11-05T02:18:58Z,OWNER,"More from that document, describing `ALPHANUMERIC`: > **ALPHABETIC** > > Any of the characters in the range u0041 through u005a (letters ""A"" through ""Z"") or in the range u0061 through u007a (letters ""a"" through ""z"") or the character u005f (""_"") or any other character larger than u007f. 
> > **NUMERIC** > > Any of the characters in the range u0030 through u0039 (digits ""0"" through ""9"") > > **ALPHANUMERIC** > > Any character which is either ALPHABETIC or NUMERIC","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",736520310, https://github.com/simonw/sqlite-utils/issues/196#issuecomment-722082874,https://api.github.com/repos/simonw/sqlite-utils/issues/196,722082874,MDEyOklzc3VlQ29tbWVudDcyMjA4Mjg3NA==,9599,2020-11-05T02:19:18Z,2020-11-05T02:19:18Z,OWNER,"""any other character larger than u007f."" Need to figure that out!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",736520310, https://github.com/simonw/sqlite-utils/issues/196#issuecomment-722083527,https://api.github.com/repos/simonw/sqlite-utils/issues/196,722083527,MDEyOklzc3VlQ29tbWVudDcyMjA4MzUyNw==,9599,2020-11-05T02:21:26Z,2020-11-05T02:21:26Z,OWNER,I think that's `\u0080-\uFFFF` in regex range speak.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",736520310, https://github.com/simonw/sqlite-utils/issues/196#issuecomment-722084213,https://api.github.com/repos/simonw/sqlite-utils/issues/196,722084213,MDEyOklzc3VlQ29tbWVudDcyMjA4NDIxMw==,9599,2020-11-05T02:23:37Z,2020-11-05T02:23:37Z,OWNER,"So... 
ALPHABETIC: `[\u0041-\u005a\u0061-\u007a\u0080-\uffff\u005f]` NUMERIC: `[\u0030-\u0039]`","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",736520310, https://github.com/simonw/sqlite-utils/issues/196#issuecomment-722084593,https://api.github.com/repos/simonw/sqlite-utils/issues/196,722084593,MDEyOklzc3VlQ29tbWVudDcyMjA4NDU5Mw==,9599,2020-11-05T02:24:47Z,2020-11-05T02:24:47Z,OWNER,"And an identifier ""begins with an ALPHABETIC character and continue with zero or more ALPHANUMERIC characters and/or ""$"" (u0024) characters"" So... [\u0041-\u005a\u0061-\u007a\u0080-\uffff\u005f][\u0041-\u005a\u0061-\u007a\u0080-\uffff\u005f\u0030-\u0039\u0024]*","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",736520310, https://github.com/simonw/sqlite-utils/issues/196#issuecomment-722086105,https://api.github.com/repos/simonw/sqlite-utils/issues/196,722086105,MDEyOklzc3VlQ29tbWVudDcyMjA4NjEwNQ==,9599,2020-11-05T02:29:50Z,2020-11-05T03:39:58Z,OWNER,"The finished monster: ```python _virtual_table_using_re = re.compile(r"""""" ^ # Start of string \s*CREATE\s+VIRTUAL\s+TABLE\s+ # CREATE VIRTUAL TABLE ( '(?P<squoted_table>[^']*(?:''[^']*)*)' | # single quoted name ""(?P<dquoted_table>[^""]*(?:""""[^""]*)*)"" | # double quoted name `(?P<backtick_table>[^`]+)` | # `backtick` quoted name \[(?P<squarequoted_table>[^\]]+)\] | # [...] quoted name (?P<identifier> # SQLite non-quoted identifier [A-Za-z_\u0080-\uffff] # \u0080-\uffff = ""any character larger than u007f"" [A-Za-z_\u0080-\uffff0-9\$]* # zero-or-more alphanumeric or $ ) ) \s+(IF\s+NOT\s+EXISTS\s+)? # IF NOT EXISTS (optional) USING\s+(?P<using>\w+) # e.g. USING FTS5 """""", re.VERBOSE | re.IGNORECASE) ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",736520310,
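The regular expressions discussed in the comments above are easier to follow when laid out over several lines and run against a few statements. What follows is a minimal sketch, not the project's actual implementation. The named groups (`squoted_table`, `dquoted_table`, `backtick_table`, `squarequoted_table`, `identifier`, `using`) are labels assumed for this illustration, and the helper `virtual_table_using` is hypothetical. The optional `IF NOT EXISTS` fragment is kept where the thread's pattern places it.

```python
import re

# Single-quoted SQL string in which '' is an escaped quote
# (the "unrolled loop" form from the thread).
single_quoted_re = re.compile(r"'[^']*(?:''[^']*)*'")

# Group names below are assumed labels for this sketch.
virtual_table_using_re = re.compile(
    r"""
    ^\s*CREATE\s+VIRTUAL\s+TABLE\s+                 # CREATE VIRTUAL TABLE
    (
        '(?P<squoted_table>[^']*(?:''[^']*)*)'  |   # single quoted name
        "(?P<dquoted_table>[^"]*(?:""[^"]*)*)"  |   # double quoted name
        `(?P<backtick_table>[^`]+)`             |   # backtick quoted name
        \[(?P<squarequoted_table>[^\]]+)\]      |   # square bracket quoted name
        (?P<identifier>                             # unquoted identifier
            [A-Za-z_\u0080-\uffff]                  # first character
            [A-Za-z_\u0080-\uffff0-9\$]*            # zero or more further characters
        )
    )
    \s+(IF\s+NOT\s+EXISTS\s+)?                      # optional, as placed in the thread
    USING\s+(?P<using>\w+)                          # e.g. USING FTS5
    """,
    re.VERBOSE | re.IGNORECASE,
)


def virtual_table_using(sql):
    """Hypothetical helper: return (table_name, module) or None if no match."""
    match = virtual_table_using_re.match(sql)
    if match is None:
        return None
    groups = match.groupdict()
    if groups["squoted_table"] is not None:
        # Undo the doubled-quote escaping inside '...'
        name = groups["squoted_table"].replace("''", "'")
    elif groups["dquoted_table"] is not None:
        # Undo the doubled-quote escaping inside "..."
        name = groups["dquoted_table"].replace('""', '"')
    else:
        name = (
            groups["backtick_table"]
            or groups["squarequoted_table"]
            or groups["identifier"]
        )
    return name, groups["using"].upper()
```

Calling `virtual_table_using("create virtual table foo using fts5")` exercises the unquoted-identifier branch, while a plain `CREATE TABLE` statement falls through to `None`. Note that `single_quoted_re.match("'fo'o'")` still succeeds, but only on the leading `'fo'`, because `match()` anchors at the start of the string and not at the end.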