
issue_comments


7 rows where "created_at" is on date 2021-08-25, sorted by updated_at descending

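For reference, the page's query can be reconstructed approximately as follows (the exact SQL Datasette generates for a created_at date filter may differ):

```sql
select *
from issue_comments
where date(created_at) = '2021-08-25'
order by updated_at desc;
```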




Facet summaries

issue (4 distinct values)
  • Database page loads too slowly with many large tables (due to table counts): 3
  • xml.etree.ElementTree.ParseError: not well-formed (invalid token): 2
  • Remove underscore from search mode parameter name: 1
  • `table.convert()` method should clean up after itself: 1

author_association (3 distinct values)
  • OWNER: 3
  • CONTRIBUTOR: 2
  • MEMBER: 2

user (2 distinct values)
  • simonw: 5
  • brandonrobertz: 2

905904540 · brandonrobertz (user 2670795) · CONTRIBUTOR · created 2021-08-25T21:59:14Z · updated 2021-08-25T21:59:55Z
https://github.com/simonw/datasette/issues/859#issuecomment-905904540
Issue: Database page loads too slowly with many large tables (due to table counts) (642572841)

I ran two tests: one with 1,000 DBs of 5-30 MB each, and a second with 20 multi-gigabyte DBs. For the second test, I created the databases with the sqlite-generate loop shown below.

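As a script, with comments added (options exactly as in the original one-liner):

```bash
# Create 20 test databases of increasing size with sqlite-generate:
# db$i.db gets ${i}00 tables (100, 200, ... 2000), each table with
# 100-2,000 rows and 5-100 columns, and no primary or foreign keys.
for i in {1..20}; do
    sqlite-generate db$i.db --tables ${i}00 --rows 100,2000 \
        --columns 5,100 --pks 0 --fks 0
done
```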

This was for deciding whether to use lots of small DBs or to group things into a smaller number of bigger DBs. The second strategy wins.

By simply persisting the _internal DB to disk, I was able to avoid most of the performance issues I was experiencing previously. (To do this, I changed `datasette/internal_db.py:init_internal_db` to create its tables with IF NOT EXISTS, and changed the _internal DB instantiation in `datasette/app.py:Datasette.__init__` to an on-disk path with `is_mutable=True`.) Super rough, but the pages now load, so I can continue testing ideas.

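A rough sketch of the kind of patch being described (names are taken from the comment; the exact code in datasette/internal_db.py and datasette/app.py varies by Datasette version, and the schema here is abbreviated):

```python
# datasette/internal_db.py -- make init_internal_db idempotent, so an
# existing on-disk _internal database can be reopened across restarts.
async def init_internal_db(db):
    await db.execute_write_script(
        """
        CREATE TABLE IF NOT EXISTS databases (
            database_name TEXT PRIMARY KEY,
            path TEXT,
            is_memory INTEGER,
            schema_version INTEGER
        );
        -- ...remaining _internal tables, likewise IF NOT EXISTS...
        """
    )

# datasette/app.py, inside Datasette.__init__ -- instantiate _internal
# against a file on disk instead of an in-memory database:
self._internal = Database(self, path="internal.db", is_mutable=True)
```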

Reactions: none.

905900807 · simonw (user 9599) · OWNER · created 2021-08-25T21:51:10Z · updated 2021-08-25T21:51:10Z
https://github.com/simonw/datasette/issues/859#issuecomment-905900807
Issue: Database page loads too slowly with many large tables (due to table counts) (642572841)

10-20 minutes to populate _internal! How many databases and tables is that for?

I may have to rethink the _internal mechanism entirely. One possible alternative would be for the Datasette homepage to just show a list of available databases (maybe only if there are more than X connected) and then load in their metadata only the first time they are accessed.
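
A sketch of that lazy approach (hypothetical helper names, not Datasette code):

```python
# Hypothetical sketch: record each database's schema in _internal only
# the first time that database is actually accessed.
_populated = set()

async def ensure_internal_metadata(datasette, db_name):
    if db_name in _populated:
        return
    db = datasette.databases[db_name]
    # populate_internal_tables is an assumed helper that writes this
    # database's table/column/index metadata into _internal.
    await populate_internal_tables(datasette, db)
    _populated.add(db_name)
```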

I need to get my own stress-testing rig set up for this.

Reactions: none.

905899177 · brandonrobertz (user 2670795) · CONTRIBUTOR · created 2021-08-25T21:48:00Z · updated 2021-08-25T21:48:00Z
https://github.com/simonw/datasette/issues/859#issuecomment-905899177
Issue: Database page loads too slowly with many large tables (due to table counts) (642572841)

Upon first stab, there are two issues here:

  • DB/table/row counts (as discussed above). This isn't too bad if the DBs are actually above the max-limit check.
  • Populating the internal DB. On first load of a giant set of DBs, it can take 10-20 minutes to populate. Altering Datasette to persist the internal DB to disk vastly improves this, but I'm sure it will cause problems elsewhere.

Reactions: none.

905886797 · simonw (user 9599) · OWNER · created 2021-08-25T21:25:18Z · updated 2021-08-25T21:25:18Z
https://github.com/simonw/sqlite-utils/issues/323#issuecomment-905886797
Issue: `table.convert()` method should clean up after itself (979627285)

As far as I can tell, the Python sqlite3 module doesn't have a mechanism for de-registering a custom SQL function.

This means that if I implement a mechanism whereby each call to `.convert()` registers a new SQL function with a random suffix (`convert_value_23424()`, for example), those functions will stay registered. If `.convert()` is called a large number of times, the number of obsolete custom function registrations will grow without bound.

For that reason, I'm going to wontfix this issue.
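
As a self-contained illustration of the pattern being described (table name and cleanup callable invented for the example):

```python
import secrets
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col TEXT)")
conn.execute("INSERT INTO t VALUES ('  padded  ')")

# Each convert-style call would register a fresh, uniquely named
# SQL function backed by the supplied Python callable...
fn_name = f"convert_value_{secrets.token_hex(4)}"
conn.create_function(fn_name, 1, lambda value: value.strip())
conn.execute(f"UPDATE t SET col = {fn_name}(col)")

# ...and sqlite3 documents no way to unregister it afterwards, so
# repeated calls accumulate registrations without bound.
```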

Reactions: none.

905206234 · simonw (user 9599) · MEMBER · created 2021-08-25T05:58:42Z · updated 2021-08-25T05:58:42Z
https://github.com/dogsheep/evernote-to-sqlite/issues/13#issuecomment-905206234
Issue: xml.etree.ElementTree.ParseError: not well-formed (invalid token) (978743426)

https://github.com/dogsheep/evernote-to-sqlite/blob/36a466f142e5bad52719851c2fbda0c05cd35b99/evernote_to_sqlite/utils.py#L34-L42

Not sure why I was round-tripping the `content_xml` like that. I will try not doing that.

Reactions: none.

905203570 · simonw (user 9599) · MEMBER · created 2021-08-25T05:51:22Z · updated 2021-08-25T05:53:27Z
https://github.com/dogsheep/evernote-to-sqlite/issues/13#issuecomment-905203570
Issue: xml.etree.ElementTree.ParseError: not well-formed (invalid token) (978743426)

The debugger showed me that it broke on a string that looked like this:

```xml
<en-note>
Q3 2018 Reflection & Development
...
```

Yeah, that is not valid XML!

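The failure is easy to reproduce; the unescaped & is the invalid token (a minimal illustration, not the evernote-to-sqlite code):

```python
import xml.etree.ElementTree as ET

# An unescaped "&" makes this content not well-formed XML, so parsing
# raises the ParseError this issue is about.
ET.fromstring("<en-note>Q3 2018 Reflection & Development</en-note>")
# xml.etree.ElementTree.ParseError: not well-formed (invalid token)
```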

Reactions: none.

905097468 · simonw (user 9599) · OWNER · created 2021-08-25T01:28:53Z · updated 2021-08-25T01:28:53Z
https://github.com/simonw/datasette/pull/1447#issuecomment-905097468
Issue: Remove underscore from search mode parameter name (978614898)

Good catch, thanks!

Reactions: none.



CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id]),
   [performed_via_github_app] TEXT
);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);