issue_comments

3,884 rows sorted by updated_at descending

id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions issue performed_via_github_app
701708072 https://github.com/simonw/datasette/issues/984#issuecomment-701708072 https://api.github.com/repos/simonw/datasette/issues/984 MDEyOklzc3VlQ29tbWVudDcwMTcwODA3Mg== simonw 9599 2020-10-01T00:01:36Z 2020-10-01T00:01:36Z OWNER

Column action menus are the cog icons on https://latest.datasette.io/fixtures/facetable

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Review accessibility of new column action menus 712368432  
701707116 https://github.com/simonw/datasette/issues/981#issuecomment-701707116 https://api.github.com/repos/simonw/datasette/issues/981 MDEyOklzc3VlQ29tbWVudDcwMTcwNzExNg== simonw 9599 2020-09-30T23:58:17Z 2020-09-30T23:58:17Z OWNER

Now live at https://latest.datasette.io/fixtures/facetable

https://user-images.githubusercontent.com/9599/94751748-1f205380-033e-11eb-940b-bb4ee7728d31.png

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Action menu for table columns 711627628  
701704716 https://github.com/simonw/datasette/issues/981#issuecomment-701704716 https://api.github.com/repos/simonw/datasette/issues/981 MDEyOklzc3VlQ29tbWVudDcwMTcwNDcxNg== simonw 9599 2020-09-30T23:48:36Z 2020-09-30T23:48:36Z OWNER

Since this menu doesn't provide new functionality, I'm going to ignore the fact that it doesn't exist on portrait mobile view for the moment. Likewise, I'm going to skip making it accessible for the moment since lacking accessibility doesn't prevent functionality from being accessed - the menu-less experience currently works the same as the portrait mobile experience.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Action menu for table columns 711627628  
701703472 https://github.com/simonw/datasette/issues/981#issuecomment-701703472 https://api.github.com/repos/simonw/datasette/issues/981 MDEyOklzc3VlQ29tbWVudDcwMTcwMzQ3Mg== simonw 9599 2020-09-30T23:43:30Z 2020-09-30T23:43:30Z OWNER

I'm going to go based just on the visible values on the current page. I think that's good enough, and it avoids the complexity involved in doing a server-side check for blank values.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Action menu for table columns 711627628  
701697918 https://github.com/simonw/datasette/issues/981#issuecomment-701697918 https://api.github.com/repos/simonw/datasette/issues/981 MDEyOklzc3VlQ29tbWVudDcwMTY5NzkxOA== simonw 9599 2020-09-30T23:24:14Z 2020-09-30T23:25:28Z OWNER

I could provide the "Show non-blank values" option only for columns where there are blank values visible on the current page.

This could be a bit confusing though, since the absence of that option could suggest that there are no blank values at all when that's actually not true.

One option: run a separate fetch() call that figures out if any of the columns contain blank values, which gets a bit of extra time to execute. Only show the "Show non-blank values" option in the menu once that has returned.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Action menu for table columns 711627628  
701679729 https://github.com/simonw/datasette/issues/981#issuecomment-701679729 https://api.github.com/repos/simonw/datasette/issues/981 MDEyOklzc3VlQ29tbWVudDcwMTY3OTcyOQ== simonw 9599 2020-09-30T22:26:33Z 2020-09-30T22:26:33Z OWNER

Bug: https://latest.datasette.io/fixtures/sortable?_sort=sortable

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Action menu for table columns 711627628  
701678659 https://github.com/simonw/sqlite-utils/issues/183#issuecomment-701678659 https://api.github.com/repos/simonw/sqlite-utils/issues/183 MDEyOklzc3VlQ29tbWVudDcwMTY3ODY1OQ== simonw 9599 2020-09-30T22:23:36Z 2020-09-30T22:23:36Z OWNER

It just ran and didn't find anything. I'll leave it enabled for a while but I may turn it off again; it could be that this kind of Python library isn't really what it's useful for.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Try out GitHub code scanning 712316959  
701668526 https://github.com/simonw/datasette/issues/981#issuecomment-701668526 https://api.github.com/repos/simonw/datasette/issues/981 MDEyOklzc3VlQ29tbWVudDcwMTY2ODUyNg== simonw 9599 2020-09-30T21:57:22Z 2020-09-30T21:57:22Z OWNER

A bunch of things to fix:

  • It clobbers existing querystring parameters - it needs to leave these alone (but replace the current sort order)
  • Facet option should not show up if you are already faceting by that column
  • There's no way to close the menu once it has opened!
  • Accessibility: SVG icon doesn't even have an alt attribute yet. Should use ARIA when the thing appears.

It's also not visible on mobile, need to think about how that will work.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Action menu for table columns 711627628  
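The first item in the list above (leave existing querystring parameters alone, but replace the current sort order) can be sketched roughly like this. It is a Python illustration only, not the menu's actual JavaScript; the helper name with_sort is hypothetical, and it assumes the sort order is carried by the _sort and _sort_desc parameters.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: these are the only parameters that describe the sort order.
SORT_KEYS = {"_sort", "_sort_desc"}


def with_sort(url, column, descending=False):
    """Return url sorted by column, keeping every non-sort parameter."""
    parts = urlsplit(url)
    # Keep existing parameters, dropping only the current sort order.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in SORT_KEYS
    ]
    kept.append(("_sort_desc" if descending else "_sort", column))
    return urlunsplit(parts._replace(query=urlencode(kept)))


print(with_sort("/fixtures/facetable?_facet=state&_sort=pk", "city_id", descending=True))
```

Working on a list of pairs rather than a dict means repeated parameters, such as several _facet values, survive untouched.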
701664306 https://github.com/simonw/datasette/issues/981#issuecomment-701664306 https://api.github.com/repos/simonw/datasette/issues/981 MDEyOklzc3VlQ29tbWVudDcwMTY2NDMwNg== simonw 9599 2020-09-30T21:47:08Z 2020-09-30T21:47:08Z OWNER

The arrow icon didn't make sense because I already have a triangle icon showing sort order. I'm trying a cog icon instead:

https://user-images.githubusercontent.com/9599/94743176-ce9ffa80-032b-11eb-8f4d-da3962984a4f.png

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Action menu for table columns 711627628  
701659197 https://github.com/simonw/datasette/issues/981#issuecomment-701659197 https://api.github.com/repos/simonw/datasette/issues/981 MDEyOklzc3VlQ29tbWVudDcwMTY1OTE5Nw== simonw 9599 2020-09-30T21:34:44Z 2020-09-30T21:34:44Z OWNER

Showing "facet by this" on the primary key column doesn't make sense.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Action menu for table columns 711627628  
701642448 https://github.com/simonw/datasette/issues/981#issuecomment-701642448 https://api.github.com/repos/simonw/datasette/issues/981 MDEyOklzc3VlQ29tbWVudDcwMTY0MjQ0OA== simonw 9599 2020-09-30T20:59:33Z 2020-09-30T20:59:33Z OWNER

I think I've got everything I need to implement this now.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Action menu for table columns 711627628  
701629984 https://github.com/simonw/datasette/issues/983#issuecomment-701629984 https://api.github.com/repos/simonw/datasette/issues/983 MDEyOklzc3VlQ29tbWVudDcwMTYyOTk4NA== simonw 9599 2020-09-30T20:34:43Z 2020-09-30T20:34:43Z OWNER

I had a look around and there isn't an obvious pluggy equivalent in JavaScript world at the moment. Lots of frameworks like jQuery and Vue have their own custom plugin mechanisms.

https://github.com/rekit/js-plugin is a simple standalone plugin mechanism. Not quite as full-featured as Pluggy though - in particular I like how Pluggy supports multiple plugins returning results for the same hook that get concatenated into a list of results.

https://css-tricks.com/designing-a-javascript-plugin-system/ has some ideas.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
JavaScript plugin hooks mechanism similar to pluggy 712260429  
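The behaviour described above, where several plugins answer the same hook and their results are collected into one list, can be sketched in a few lines. This is a toy illustration in Python, not Pluggy's API and not a proposed JavaScript design; HookRegistry and the column_actions hook name are hypothetical.

```python
from collections import defaultdict


class HookRegistry:
    """Toy registry: many plugins can implement the same named hook."""

    def __init__(self):
        self._hooks = defaultdict(list)

    def register(self, hook_name, func):
        self._hooks[hook_name].append(func)

    def call(self, hook_name, **kwargs):
        # Call every implementation and gather the results into a list,
        # skipping implementations that return None.
        results = []
        for func in self._hooks[hook_name]:
            result = func(**kwargs)
            if result is not None:
                results.append(result)
        return results


registry = HookRegistry()
registry.register("column_actions", lambda column: "Sort by " + column)
registry.register("column_actions", lambda column: None)
registry.register("column_actions", lambda column: "Facet by " + column)
actions = registry.call("column_actions", column="state")
```

The caller never needs to know how many plugins are installed; it just gets a list with one entry per plugin that had something to contribute.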
701627158 https://github.com/simonw/sqlite-utils/pull/178#issuecomment-701627158 https://api.github.com/repos/simonw/sqlite-utils/issues/178 MDEyOklzc3VlQ29tbWVudDcwMTYyNzE1OA== simonw 9599 2020-09-30T20:29:11Z 2020-09-30T20:29:11Z OWNER

Thanks for the fix!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Update README.md 709043182  
701626134 https://github.com/simonw/sqlite-utils/issues/182#issuecomment-701626134 https://api.github.com/repos/simonw/sqlite-utils/issues/182 MDEyOklzc3VlQ29tbWVudDcwMTYyNjEzNA== simonw 9599 2020-09-30T20:27:09Z 2020-09-30T20:27:42Z OWNER

It looks like http://maps.natalian.org/data.txt is encoded as latin-1, but sqlite-utils assumes utf-8 and hence breaks.

It would be worth improving the error message here. I could also add a --encoding latin-1 option to sqlite-utils insert to help in consuming files that are stored in charsets other than utf-8.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Better handling of encodings other than utf-8 for "sqlite-utils insert" 711649325  
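A small sketch of the failure mode described above: bytes written as latin-1 fail to decode as utf-8, but parse fine once the right encoding is named. This is not how sqlite-utils is implemented; read_rows and the sample data are made up for illustration.

```python
import csv
import io


def read_rows(raw_bytes, encoding="utf-8"):
    """Decode raw bytes with the given encoding and parse them as CSV."""
    text = raw_bytes.decode(encoding)
    return list(csv.DictReader(io.StringIO(text)))


# Sample data containing characters that latin-1 stores as single bytes.
raw = "name,city\nJosé,Málaga\n".encode("latin-1")

try:
    read_rows(raw)  # assumes utf-8, like the default described above
except UnicodeDecodeError as error:
    print("utf-8 decoding failed:", error.reason)

rows = read_rows(raw, encoding="latin-1")
```

Catching UnicodeDecodeError at this point is also where a friendlier error message could suggest trying a different encoding.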
701616922 https://github.com/simonw/datasette/issues/981#issuecomment-701616922 https://api.github.com/repos/simonw/datasette/issues/981 MDEyOklzc3VlQ29tbWVudDcwMTYxNjkyMg== simonw 9599 2020-09-30T20:08:02Z 2020-09-30T20:08:02Z OWNER

It would be neat to provide a JavaScript plugin hook that plugins can use to add their own options to this menu. No idea what that would look like though.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Action menu for table columns 711627628  
701157010 https://github.com/simonw/datasette/issues/981#issuecomment-701157010 https://api.github.com/repos/simonw/datasette/issues/981 MDEyOklzc3VlQ29tbWVudDcwMTE1NzAxMA== simonw 9599 2020-09-30T05:00:42Z 2020-09-30T20:06:59Z OWNER

Maybe use this as the icon:

<svg width="100px" height="100px" viewBox="0 0 255 255">
  <polygon points="0,64 128,191 255,64"/>
</svg>

https://user-images.githubusercontent.com/9599/94734258-d22c8500-031d-11eb-982b-82cde4b1920f.png

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Action menu for table columns 711627628  
701615291 https://github.com/simonw/datasette/issues/981#issuecomment-701615291 https://api.github.com/repos/simonw/datasette/issues/981 MDEyOklzc3VlQ29tbWVudDcwMTYxNTI5MQ== simonw 9599 2020-09-30T20:04:34Z 2020-09-30T20:05:37Z OWNER

Another potential action:

  • Show rows where this is not blank (equivalent to is not blank filter)

This could be displayed conditionally based on if the column is detected to have any blank rows in it?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Action menu for table columns 711627628  
701585695 https://github.com/simonw/datasette/issues/982#issuecomment-701585695 https://api.github.com/repos/simonw/datasette/issues/982 MDEyOklzc3VlQ29tbWVudDcwMTU4NTY5NQ== simonw 9599 2020-09-30T19:06:29Z 2020-09-30T19:06:29Z OWNER

This is a little related to the error display issue #619 in that both will require some reworking of how the code is structured.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
SQL editor should allow execution of write queries, if you have permission 712202333  
701585075 https://github.com/simonw/datasette/issues/982#issuecomment-701585075 https://api.github.com/repos/simonw/datasette/issues/982 MDEyOklzc3VlQ29tbWVudDcwMTU4NTA3NQ== simonw 9599 2020-09-30T19:05:11Z 2020-09-30T19:05:11Z OWNER

The form needs to switch from GET to POST if the query is a write query. JavaScript can handle this based on the checkbox. If a user does not have JavaScript, submitting the form will cause the form method to be changed to POST and the form to be redisplayed.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
SQL editor should allow execution of write queries, if you have permission 712202333  
701153543 https://github.com/simonw/datasette/issues/981#issuecomment-701153543 https://api.github.com/repos/simonw/datasette/issues/981 MDEyOklzc3VlQ29tbWVudDcwMTE1MzU0Mw== simonw 9599 2020-09-30T04:46:05Z 2020-09-30T05:11:45Z OWNER

Prototype:

<style>
body {
font-family: helvetica;
}
.dropdown-menu {
  display: inline-flex;
  border: 1px solid #ccc;
  border-radius: 4px;
  line-height: 1.4;
  font-size: 12px;
  box-shadow: 2px 2px 2px #aaa;
}
.dropdown-menu ul,
.dropdown-menu li {
  list-style-type: none;
  margin: 0;
  padding: 0;
}
.dropdown-menu li {
  border-bottom: 1px solid #ccc;
}
.dropdown-menu li:last-child {
  border: none;
}
.dropdown-menu a:link,
.dropdown-menu a:visited,
.dropdown-menu a:hover,
.dropdown-menu a:focus,
.dropdown-menu a:active {
  text-decoration: none;
  display: block;
  padding: 4px 8px 2px 8px;
  color: #222;
  background-color: #fff;
}
.dropdown-menu a:hover {
  background-color: #eee;
}
.hook {
  display: block;
  position: absolute;
  top: 3px;
  left: 12px;
  width: 0; 
  height: 0; 
  border-left: 5px solid transparent;
  border-right: 5px solid transparent;
  border-bottom: 5px solid #ccc;
}
</style>
<div class="dropdown-menu">
<div class="hook"></div>
<ul>
  <li><a href="#">Sort descending</a></li>
  <li><a href="#">Sort ascending</a></li>
  <li><a href="#">Facet by this</a></li>
  <li><a href="#">Count by this</a></li>
</ul>
</div>

https://user-images.githubusercontent.com/9599/94643784-7241ca00-029c-11eb-8554-863fcc255352.png

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Action menu for table columns 711627628  
701153982 https://github.com/simonw/datasette/issues/981#issuecomment-701153982 https://api.github.com/repos/simonw/datasette/issues/981 MDEyOklzc3VlQ29tbWVudDcwMTE1Mzk4Mg== simonw 9599 2020-09-30T04:47:54Z 2020-09-30T04:47:54Z OWNER

I think the accessible way to do this is with absolute positioning - have a menu icon in the <th> which, when clicked, causes the dropdown menu to appear as an absolutely positioned <div> that is not located within the DOM hierarchy of the <th> itself but is positioned to show up in the correct place.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Action menu for table columns 711627628  
701153822 https://github.com/simonw/datasette/issues/981#issuecomment-701153822 https://api.github.com/repos/simonw/datasette/issues/981 MDEyOklzc3VlQ29tbWVudDcwMTE1MzgyMg== simonw 9599 2020-09-30T04:47:10Z 2020-09-30T04:47:10Z OWNER

A future version could have nested side menus that expand out and let you do things like "calculate sum/avg for this column against this-other-column".

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Action menu for table columns 711627628  
701153600 https://github.com/simonw/datasette/issues/981#issuecomment-701153600 https://api.github.com/repos/simonw/datasette/issues/981 MDEyOklzc3VlQ29tbWVudDcwMTE1MzYwMA== simonw 9599 2020-09-30T04:46:18Z 2020-09-30T04:46:18Z OWNER

More options:

<ul>
  <li><a href="#">Sort descending</a></li>
  <li><a href="#">Sort ascending</a></li>
  <li><a href="#">Facet by this</a></li>
  <li><a href="#">Unique values</a></li>
  <li><a href="#">Count unique values</a></li>
</ul>
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Action menu for table columns 711627628  
700929721 https://github.com/simonw/datasette/issues/980#issuecomment-700929721 https://api.github.com/repos/simonw/datasette/issues/980 MDEyOklzc3VlQ29tbWVudDcwMDkyOTcyMQ== simonw 9599 2020-09-29T19:21:50Z 2020-09-29T19:21:50Z OWNER

That fixed it: https://latest-with-plugins.datasette.io/fixtures?sql=select%0D%0A++dateutil_rrule(%27FREQ%3DHOURLY%3BCOUNT%3D5%27)%2C%0D%0A++dateutil_rrule_date(%0D%0A++++%27FREQ%3DDAILY%3BCOUNT%3D3%27%2C%0D%0A++++%271st+jan+2020%27%0D%0A++)%3B

https://user-images.githubusercontent.com/9599/94605680-4d266a80-024e-11eb-8818-bc8bd7958df4.png

<style>
@media only screen and (max-width: 576px) {

    .rows-and-columns td:nth-of-type(1):before { content: "dateutil_rrule(\000027FREQ=HOURLY;COUNT=5\000027)"; }

    .rows-and-columns td:nth-of-type(2):before { content: "dateutil_rrule_date(\00000A    \000027FREQ=DAILY;COUNT=3\000027,\00000A    \0000271st jan 2020\000027\00000A  )"; }

}
</style>
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Another rendering glitch with column headers on mobile 710819020  
700490225 https://github.com/simonw/datasette/issues/980#issuecomment-700490225 https://api.github.com/repos/simonw/datasette/issues/980 MDEyOklzc3VlQ29tbWVudDcwMDQ5MDIyNQ== simonw 9599 2020-09-29T06:53:37Z 2020-09-29T06:53:37Z OWNER

This time it's because there are newlines in the column header:

<style>
@media only screen and (max-width: 576px) {

    .rows-and-columns td:nth-of-type(1):before { content: "dateutil_rrule(\000027FREQ=HOURLY;COUNT=5\000027)"; }

    .rows-and-columns td:nth-of-type(2):before { content: "dateutil_rrule_date(
\00000A    \000027FREQ=DAILY;COUNT=3\000027,
\00000A    \0000271st jan 2020\000027
\00000A  )"; }

}
</style>

Those need to be escaped somehow.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Another rendering glitch with column headers on mobile 710819020  
700343373 https://github.com/simonw/datasette/issues/979#issuecomment-700343373 https://api.github.com/repos/simonw/datasette/issues/979 MDEyOklzc3VlQ29tbWVudDcwMDM0MzM3Mw== simonw 9599 2020-09-28T23:56:27Z 2020-09-28T23:56:27Z OWNER

This would benefit https://github.com/simonw/datasette-import-table - which currently ignores the CREATE TABLE and derives the schema by inserting rows.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Default table view JSON should include CREATE TABLE 710650633  
700343229 https://github.com/simonw/datasette/issues/979#issuecomment-700343229 https://api.github.com/repos/simonw/datasette/issues/979 MDEyOklzc3VlQ29tbWVudDcwMDM0MzIyOQ== simonw 9599 2020-09-28T23:55:55Z 2020-09-28T23:55:55Z OWNER

Here's the code that adds it to the HTML context: https://github.com/simonw/datasette/blob/c11383e6284e000b2641569457efa16ac9e0d6ae/datasette/views/table.py#L835-L837

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Default table view JSON should include CREATE TABLE 710650633  
700320480 https://github.com/simonw/datasette/issues/978#issuecomment-700320480 https://api.github.com/repos/simonw/datasette/issues/978 MDEyOklzc3VlQ29tbWVudDcwMDMyMDQ4MA== simonw 9599 2020-09-28T22:39:18Z 2020-09-28T22:39:18Z OWNER
def escape_css_string(s):
    return _css_re.sub(lambda m: "\\" + ("{:X}".format(ord(m.group())).zfill(6)), s)

That fixes it:
https://user-images.githubusercontent.com/9599/94493173-c23b6680-01a0-11eb-9468-e972c51b015c.png

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Rendering glitch with column headings on mobile 710506708  
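The escape_css_string function quoted above depends on a _css_re pattern that isn't shown in the comment. Here is a self-contained sketch that can be run on its own. The character class used for _css_re is an assumption, chosen to cover the quote, apostrophe and newline escapes that appear elsewhere in this thread; the real pattern in datasette/utils may differ.

```python
import re

# Assumption: escape quotes, apostrophes, newlines and backslashes.
_css_re = re.compile(r"""['"\n\\]""")


def escape_css_string(s):
    # Replace each matched character with a backslash followed by its
    # code point as six zero-padded uppercase hex digits.
    return _css_re.sub(
        lambda m: "\\" + "{:X}".format(ord(m.group())).zfill(6), s
    )


escaped = escape_css_string('dateutil_parse("10 october 2020 3pm")')
print(escaped)
```

Six digits is the longest hex escape CSS allows, so a digit that follows the escape in the original text can no longer be absorbed into it.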
700319656 https://github.com/simonw/datasette/issues/978#issuecomment-700319656 https://api.github.com/repos/simonw/datasette/issues/978 MDEyOklzc3VlQ29tbWVudDcwMDMxOTY1Ng== simonw 9599 2020-09-28T22:36:44Z 2020-09-28T22:36:44Z OWNER

Weirdly even those leading 0s don't fix it:

https://user-images.githubusercontent.com/9599/94492937-44775b00-01a0-11eb-9c5f-e991af620404.png

But... padding to six characters does! See https://www.w3.org/International/questions/qa-escapes

https://user-images.githubusercontent.com/9599/94492988-61139300-01a0-11eb-8304-bffe448c7d2b.png

In [32]: print('\\' + "{:X}".format(ord('"')).zfill(6))
\000022
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Rendering glitch with column headings on mobile 710506708  
700317760 https://github.com/simonw/datasette/issues/978#issuecomment-700317760 https://api.github.com/repos/simonw/datasette/issues/978 MDEyOklzc3VlQ29tbWVudDcwMDMxNzc2MA== simonw 9599 2020-09-28T22:30:25Z 2020-09-28T22:30:25Z OWNER
print('\\' + "{:X}".format(ord('"')).zfill(4))
\0022
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Rendering glitch with column headings on mobile 710506708  
700316511 https://github.com/simonw/datasette/issues/978#issuecomment-700316511 https://api.github.com/repos/simonw/datasette/issues/978 MDEyOklzc3VlQ29tbWVudDcwMDMxNjUxMQ== simonw 9599 2020-09-28T22:26:38Z 2020-09-28T22:26:38Z OWNER

The fix may be to use \0022 instead of \22.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Rendering glitch with column headings on mobile 710506708  
700314509 https://github.com/simonw/datasette/issues/978#issuecomment-700314509 https://api.github.com/repos/simonw/datasette/issues/978 MDEyOklzc3VlQ29tbWVudDcwMDMxNDUwOQ== simonw 9599 2020-09-28T22:20:51Z 2020-09-28T22:20:51Z OWNER

Here's the HTML for the broken example above:

<style>
@media only screen and (max-width: 576px) {

    .rows-and-columns td:nth-of-type(1):before { content: "dateutil_parse(\2210 october 2020 3pm\22)"; }

    .rows-and-columns td:nth-of-type(2):before { content: "dateutil_easter(\222020\22)"; }

    .rows-and-columns td:nth-of-type(3):before { content: "dateutil_parse_fuzzy(\22This is due 10 september\22)"; }

    .rows-and-columns td:nth-of-type(4):before { content: "dateutil_parse(\221/2/2020\22)"; }

    .rows-and-columns td:nth-of-type(5):before { content: "dateutil_parse(\222020-03-04\22)"; }

    .rows-and-columns td:nth-of-type(6):before { content: "dateutil_parse_dayfirst(\222020-03-04\22)"; }

    .rows-and-columns td:nth-of-type(7):before { content: "dateutil_easter(2020)"; }

}
</style>

The glitch affects the ones where the quote is followed by digits.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Rendering glitch with column headings on mobile 710506708  
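Why the glitch only affects quotes followed by digits: a CSS hex escape is a backslash plus one to six hex digits, so any digits that follow a short escape like \22 get read as part of it. The decoder below is a simplified sketch of that rule written for illustration (decode_css_escapes is a hypothetical helper, not part of Datasette or of any CSS library).

```python
import re

# Simplified rule: a backslash, one to six hex digits, then one optional
# whitespace character that ends the escape and is consumed with it.
_hex_escape = re.compile(r"\\([0-9A-Fa-f]{1,6})\s?")


def _decode(match):
    code_point = int(match.group(1), 16)
    # Out-of-range, zero and surrogate values become U+FFFD.
    if code_point == 0 or code_point > 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        return "\ufffd"
    return chr(code_point)


def decode_css_escapes(css_string):
    """Decode hex escapes roughly the way a CSS parser reads them."""
    return _hex_escape.sub(_decode, css_string)


short = decode_css_escapes(r"dateutil_parse(\2210 october 2020 3pm\22)")
four = decode_css_escapes(r"dateutil_parse(\002210 october 2020 3pm\0022)")
padded = decode_css_escapes(r"dateutil_parse(\00002210 october 2020 3pm\000022)")
```

In the short form the parser reads \2210 as a single character (U+2210) and swallows the following space. Padding to four digits does not help, because \002210 is still six hex digits for the same code point. Only the six-digit form \000022 stops before the 10.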
700313836 https://github.com/simonw/datasette/issues/978#issuecomment-700313836 https://api.github.com/repos/simonw/datasette/issues/978 MDEyOklzc3VlQ29tbWVudDcwMDMxMzgzNg== simonw 9599 2020-09-28T22:19:05Z 2020-09-28T22:19:05Z OWNER

Looks like a bug in this function: https://github.com/simonw/datasette/blob/1f021c37110fc9019b0ef70062c28c335e568ae2/datasette/utils/__init__.py#L269-L274

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Rendering glitch with column headings on mobile 710506708  
700012161 https://github.com/simonw/datasette/pull/977#issuecomment-700012161 https://api.github.com/repos/simonw/datasette/issues/977 MDEyOklzc3VlQ29tbWVudDcwMDAxMjE2MQ== codecov[bot] 22429695 2020-09-28T13:37:44Z 2020-09-28T13:37:44Z NONE

Codecov Report

Merging #977 into main will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##             main     #977   +/-   ##
=======================================
  Coverage   84.27%   84.27%           
=======================================
  Files          28       28           
  Lines        3847     3847           
=======================================
  Hits         3242     3242           
  Misses        605      605           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9a6d0dc...5c01344. Read the comment docs.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Update pytest requirement from <6.1.0,>=5.2.2 to >=5.2.2,<6.2.0 710269200  
699762881 https://github.com/simonw/sqlite-utils/issues/181#issuecomment-699762881 https://api.github.com/repos/simonw/sqlite-utils/issues/181 MDEyOklzc3VlQ29tbWVudDY5OTc2Mjg4MQ== simonw 9599 2020-09-28T04:29:23Z 2020-09-28T04:29:23Z OWNER

Relevant code: https://github.com/simonw/sqlite-utils/blob/94fc62857ee2655a21d85f6dae84b67bbfa5956d/sqlite_utils/db.py#L331-L367

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
pk=["id"] should have same effect as pk="id" 709920027  
699718788 https://github.com/simonw/sqlite-utils/issues/180#issuecomment-699718788 https://api.github.com/repos/simonw/sqlite-utils/issues/180 MDEyOklzc3VlQ29tbWVudDY5OTcxODc4OA== simonw 9599 2020-09-28T01:11:45Z 2020-09-28T01:11:45Z OWNER

https://hypothesis.readthedocs.io/

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Try running some tests using Hypothesis 709861194  
699690034 https://github.com/simonw/datasette/issues/858#issuecomment-699690034 https://api.github.com/repos/simonw/datasette/issues/858 MDEyOklzc3VlQ29tbWVudDY5OTY5MDAzNA== smithdc1 39445562 2020-09-27T21:23:04Z 2020-09-27T21:23:04Z NONE

Hi Simon,

Thanks so much for all your work on datasette, it's an excellent project and I wish you all the best with it. I particularly enjoyed your talk at the Django London Meetup a short while back.

I've been trying to publish to Heroku from Windows 10 and I was running into this error. I'm not sure why it can't be run without shell=True on Windows but this seems to help. With this change, I am able to publish if I pass in a name to the publish command. When a name is not passed the default of datasette is used and therefore this line here fails (as datasette at heroku already exists) and causes the recursion error mentioned above.

https://github.com/simonw/datasette/blob/9a6d0dce282e7fb58c5610e24c74098c923abfdc/datasette/publish/heroku.py#L126

I tried to write a patch for this but I am really struggling with being on Windows (many of the tests seem to fail anyway?), and my lack of knowledge of Mock, so sorry for this. Hope this is of some help.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
publish heroku does not work on Windows 10 642388564  
699524671 https://github.com/simonw/sqlite-utils/issues/179#issuecomment-699524671 https://api.github.com/repos/simonw/sqlite-utils/issues/179 MDEyOklzc3VlQ29tbWVudDY5OTUyNDY3MQ== simonw 9599 2020-09-26T17:31:23Z 2020-09-27T20:31:50Z OWNER

SQL query for detecting integers:

select
  'contains_non_integer' as result
from
  mytable
where
  cast(cast(mycolumn AS INTEGER) AS TEXT) != mycolumn
limit
  1

This returns a single row (containing 'contains_non_integer') as soon as it comes across a value that is not an integer - so it short-circuits quickly on TEXT columns with non-integers in them.

If everything in the column is an integer it will scan the whole thing before returning no rows.

More extensive demo:

select
  value,
  cast(cast(value AS INTEGER) AS TEXT) = value as is_valid_int
from
  (
    select
      '1' as value
    union
    select
      '1.1' as value
    union
    select
      'dog' as value
    union
    select
      null as value
  )

https://latest.datasette.io/fixtures?sql=select%0D%0A++value%2C%0D%0A++cast%28cast%28value+AS+INTEGER%29+AS+TEXT%29+%3D+value+as+is_valid_int%0D%0Afrom%0D%0A++%28%0D%0A++++select%0D%0A++++++%271%27+as+value%0D%0A++++union%0D%0A++++select%0D%0A++++++%271.1%27+as+value%0D%0A++++union%0D%0A++++select%0D%0A++++++%27dog%27+as+value%0D%0A++++union%0D%0A++++select%0D%0A++++++null+as+value%0D%0A++%29

<table>
  <thead>
    <tr> <th>value</th> <th>is_valid_int</th> </tr>
  </thead>
  <tbody>
    <tr> <td> </td> <td> </td> </tr>
    <tr> <td>1</td> <td>1</td> </tr>
    <tr> <td>1.1</td> <td>0</td> </tr>
    <tr> <td>dog</td> <td>0</td> </tr>
  </tbody>
</table>
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
sqlite-utils transform/insert --detect-types 709577625  
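A sketch of how the short-circuiting integer check above could be wrapped up and tried against an in-memory database using Python's sqlite3 module. The function name column_is_all_integers and the sample table are hypothetical; this is not how sqlite-utils implements type detection.

```python
import sqlite3


def column_is_all_integers(conn, table, column):
    """Return True if no value fails the integer round-trip check."""
    # Identifiers cannot be bound as parameters, so they are quoted by hand
    # here; only use this with trusted table and column names.
    sql = (
        'select 1 from "{table}" '
        'where cast(cast("{column}" as integer) as text) != "{column}" '
        "limit 1"
    ).format(table=table, column=column)
    # limit 1 lets SQLite stop at the first non-integer value it finds.
    return conn.execute(sql).fetchone() is None


conn = sqlite3.connect(":memory:")
conn.execute("create table mytable (all_ints text, mixed text)")
conn.executemany(
    "insert into mytable values (?, ?)",
    [("1", "1"), ("2", "1.1"), ("3", "dog")],
)
print(column_is_all_integers(conn, "mytable", "all_ints"))
print(column_is_all_integers(conn, "mytable", "mixed"))
```

Null values never satisfy the != comparison, so they are ignored rather than counted as non-integers. The round trip is strict about formatting: a value such as '01' does not survive it and would count as a non-integer.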
699684535 https://github.com/simonw/sqlite-utils/issues/179#issuecomment-699684535 https://api.github.com/repos/simonw/sqlite-utils/issues/179 MDEyOklzc3VlQ29tbWVudDY5OTY4NDUzNQ== simonw 9599 2020-09-27T20:30:31Z 2020-09-27T20:30:31Z OWNER

This recipe looks like it might be the way to detect floats:

select
  value,
  cast(cast(value AS REAL) AS TEXT) in (value, value || '.0') as is_valid_float
from
  (
    select
      '1' as value
    union
    select
      '1.1' as value
    union
    select
      'dog' as value
    union
    select
      null as value
  )

Demo: https://latest.datasette.io/fixtures?sql=select%0D%0A++value%2C%0D%0A++cast%28cast%28value+AS+REAL%29+AS+TEXT%29+in+%28value%2C+value+%7C%7C+%27.0%27%29+as+is_valid_float%0D%0Afrom%0D%0A++%28%0D%0A++++select%0D%0A++++++%271%27+as+value%0D%0A++++union%0D%0A++++select%0D%0A++++++%271.1%27+as+value%0D%0A++++union%0D%0A++++select%0D%0A++++++%27dog%27+as+value%0D%0A++++union%0D%0A++++select%0D%0A++++++null+as+value%0D%0A++%29

<table>
  <thead>
    <tr> <th>value</th> <th>is_valid_float</th> </tr>
  </thead>
  <tbody>
    <tr> <td> </td> <td> </td> </tr>
    <tr> <td>1</td> <td>1</td> </tr>
    <tr> <td>1.1</td> <td>1</td> </tr>
    <tr> <td>dog</td> <td>0</td> </tr>
  </tbody>
</table>
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
sqlite-utils transform/insert --detect-types 709577625  
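The float recipe above can be checked locally with Python's sqlite3 module. This sketch runs the same expression over a small table; the table name candidates and the extra '1.10' sample value are additions for illustration, not part of the original demo.

```python
import sqlite3

FLOAT_CHECK = """
select
  value,
  cast(cast(value as real) as text) in (value, value || '.0') as is_valid_float
from candidates
"""

conn = sqlite3.connect(":memory:")
conn.execute("create table candidates (value text)")
conn.executemany(
    "insert into candidates values (?)",
    [("1",), ("1.1",), ("1.10",), ("dog",), (None,)],
)
# Map each input value to the result of the float check.
results = dict(conn.execute(FLOAT_CHECK).fetchall())
print(results)
```

The extra '1.10' value shows one limit of the recipe: it round-trips to '1.1', so a number written with a trailing zero is reported as not a float. Whether that matters depends on the data being imported.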
699526149 https://github.com/simonw/sqlite-utils/issues/179#issuecomment-699526149 https://api.github.com/repos/simonw/sqlite-utils/issues/179 MDEyOklzc3VlQ29tbWVudDY5OTUyNjE0OQ== simonw 9599 2020-09-26T17:43:28Z 2020-09-26T17:43:28Z OWNER

Posed a question about this on the SQLite forum here: https://sqlite.org/forum/forumpost/ab0dcd66ef

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
sqlite-utils transform/insert --detect-types 709577625  
698626768 https://github.com/simonw/sqlite-utils/issues/138#issuecomment-698626768 https://api.github.com/repos/simonw/sqlite-utils/issues/138 MDEyOklzc3VlQ29tbWVudDY5ODYyNjc2OA== simonw 9599 2020-09-24T22:46:56Z 2020-09-24T22:46:56Z OWNER

Yeah this works fine, added a new confirmatory test.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
extracts= doesn't configure foreign keys 684118950  
698578959 https://github.com/simonw/sqlite-utils/issues/173#issuecomment-698578959 https://api.github.com/repos/simonw/sqlite-utils/issues/173 MDEyOklzc3VlQ29tbWVudDY5ODU3ODk1OQ== simonw 9599 2020-09-24T20:44:35Z 2020-09-24T20:50:19Z OWNER

I'm using a click.File() at the moment: https://github.com/simonw/sqlite-utils/blob/5a63b9e88c5887432eb1d7df39f304ea55038437/sqlite_utils/cli.py#L496

I'll need to change that to be something that I can easily measure progress through. Also I should change its name - json_file is a bad name when it sometimes handles csv or tsv instead.

It looks like the argument provided by click.File doesn't provide a way to read the size of the file, so I need to switch that out for a file path instead. https://click.palletsprojects.com/en/7.x/api/#click.Path

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Progress bar for sqlite-utils insert 707478649  
698579389 https://github.com/simonw/sqlite-utils/issues/173#issuecomment-698579389 https://api.github.com/repos/simonw/sqlite-utils/issues/173 MDEyOklzc3VlQ29tbWVudDY5ODU3OTM4OQ== simonw 9599 2020-09-24T20:45:29Z 2020-09-24T20:45:29Z OWNER

Relevant code: https://github.com/simonw/sqlite-utils/blob/5a63b9e88c5887432eb1d7df39f304ea55038437/sqlite_utils/cli.py#L550-L560

Changing that to track progress through NL-JSON, CSV and TSV shouldn't be too hard.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Progress bar for sqlite-utils insert 707478649  
698577508 https://github.com/simonw/sqlite-utils/issues/173#issuecomment-698577508 https://api.github.com/repos/simonw/sqlite-utils/issues/173 MDEyOklzc3VlQ29tbWVudDY5ODU3NzUwOA== simonw 9599 2020-09-24T20:41:18Z 2020-09-24T20:41:18Z OWNER

I know how to build this for CSV and TSV - I can read them via a file wrapper that counts how many bytes it has seen.

Not sure how to do it for JSON though. Maybe I could provide it just for newline-delimited JSON? Again I can measure progress based on how many bytes have been read.
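
A rough sketch of the byte-counting idea, assuming the input is opened in binary mode so that line lengths are real byte counts. The generator `lines_with_progress` and its `on_bytes` callback are hypothetical names, not sqlite-utils code, and a generator is used here in place of a full file wrapper class.

```python
import csv
import io


def lines_with_progress(binary_fp, on_bytes):
    """Yield decoded lines from a binary file, reporting bytes consumed."""
    for raw_line in binary_fp:
        on_bytes(len(raw_line))  # report progress before handing the line on
        yield raw_line.decode("utf-8")


data = b"id,name\n1,Cleo\n2,Pancakes\n"
total = len(data)  # for a file on disk this would be os.path.getsize(path)
chunks = []  # stands in for a progress bar's update() method

rows = list(csv.DictReader(lines_with_progress(io.BytesIO(data), chunks.append)))
```

The same generator should work for newline-delimited JSON by calling `json.loads()` on each yielded line. A single large JSON document has no equivalent line boundary to count against.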

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Progress bar for sqlite-utils insert 707478649  
698575545 https://github.com/simonw/sqlite-utils/issues/119#issuecomment-698575545 https://api.github.com/repos/simonw/sqlite-utils/issues/119 MDEyOklzc3VlQ29tbWVudDY5ODU3NTU0NQ== simonw 9599 2020-09-24T20:36:59Z 2020-09-24T20:36:59Z OWNER

This was implemented in #161.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Ability to remove a foreign key 652700770  
698572493 https://github.com/simonw/sqlite-utils/issues/176#issuecomment-698572493 https://api.github.com/repos/simonw/sqlite-utils/issues/176 MDEyOklzc3VlQ29tbWVudDY5ODU3MjQ5Mw== simonw 9599 2020-09-24T20:30:18Z 2020-09-24T20:30:18Z OWNER

Documentation: https://sqlite-utils.readthedocs.io/en/stable/cli.html#transforming-tables

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
sqlite-utils transform column order option 708293114  
698572264 https://github.com/simonw/sqlite-utils/issues/175#issuecomment-698572264 https://api.github.com/repos/simonw/sqlite-utils/issues/175 MDEyOklzc3VlQ29tbWVudDY5ODU3MjI2NA== simonw 9599 2020-09-24T20:29:48Z 2020-09-24T20:29:48Z OWNER

Documentation: https://sqlite-utils.readthedocs.io/en/stable/python-api.html#transforming-a-table

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add docs for .transform(column_order=) 708261775  
698488971 https://github.com/simonw/datasette/issues/976#issuecomment-698488971 https://api.github.com/repos/simonw/datasette/issues/976 MDEyOklzc3VlQ29tbWVudDY5ODQ4ODk3MQ== simonw 9599 2020-09-24T17:42:09Z 2020-09-24T17:42:35Z OWNER

This is complex enough new logic that it will need test coverage - specifically covering tables or databases with strange names.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Idea: -o could open to a more convenient location 708289783  
698444567 https://github.com/simonw/sqlite-utils/issues/177#issuecomment-698444567 https://api.github.com/repos/simonw/sqlite-utils/issues/177 MDEyOklzc3VlQ29tbWVudDY5ODQ0NDU2Nw== simonw 9599 2020-09-24T16:14:47Z 2020-09-24T16:14:47Z OWNER

This is a backwards incompatible change, so technically I should bump the major version to 3. I'm not going to do that, because the feature is brand new and the chance that anyone has written code or shell scripts that use it is vanishingly small.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Simplify .transform(drop_foreign_keys=) and sqlite-transform --drop-foreign-key 708301810  
698438043 https://github.com/simonw/sqlite-utils/issues/176#issuecomment-698438043 https://api.github.com/repos/simonw/sqlite-utils/issues/176 MDEyOklzc3VlQ29tbWVudDY5ODQzODA0Mw== simonw 9599 2020-09-24T16:02:55Z 2020-09-24T16:02:55Z OWNER

I think I'll call this option --column-order with a shortcut of -o.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
sqlite-utils transform column order option 708293114  
698434811 https://github.com/simonw/sqlite-utils/issues/175#issuecomment-698434811 https://api.github.com/repos/simonw/sqlite-utils/issues/175 MDEyOklzc3VlQ29tbWVudDY5ODQzNDgxMQ== simonw 9599 2020-09-24T15:57:17Z 2020-09-24T15:57:17Z OWNER

Landed that.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add docs for .transform(column_order=) 708261775  
698434236 https://github.com/simonw/datasette/issues/970#issuecomment-698434236 https://api.github.com/repos/simonw/datasette/issues/970 MDEyOklzc3VlQ29tbWVudDY5ODQzNDIzNg== simonw 9599 2020-09-24T15:56:18Z 2020-09-24T15:56:50Z OWNER

Idea: if there is a single database with a single table, this could open straight to /db/table. If there is a single database with multiple tables, it could open straight to /db.
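
One way that decision could be sketched, assuming the databases are available as a plain dict of database name to list of table names. That shape, the function name `start_path` and the use of `urllib.parse.quote` for escaping are illustrative assumptions, and Datasette's own URL encoding may differ.

```python
from urllib.parse import quote


def start_path(databases):
    """Pick the path to open in the browser.

    databases maps database name -> list of table names (hypothetical shape).
    """
    if len(databases) != 1:
        return "/"  # several databases (or none): open the index page
    ((db_name, tables),) = databases.items()
    db_path = "/" + quote(db_name, safe="")
    if len(tables) == 1:
        # Single database with a single table: go straight to the table
        return db_path + "/" + quote(tables[0], safe="")
    return db_path
```

Names are percent-encoded with `safe=""` so that a table called `a/b` does not turn into an extra path segment.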

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
request an "-o" option on "datasette server" to open the default browser at the running url 705108492  
698412692 https://github.com/simonw/sqlite-utils/issues/175#issuecomment-698412692 https://api.github.com/repos/simonw/sqlite-utils/issues/175 MDEyOklzc3VlQ29tbWVudDY5ODQxMjY5Mg== simonw 9599 2020-09-24T15:19:28Z 2020-09-24T15:19:28Z OWNER

Need to land #174 first.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add docs for .transform(column_order=) 708261775  
698400790 https://github.com/simonw/sqlite-utils/pull/174#issuecomment-698400790 https://api.github.com/repos/simonw/sqlite-utils/issues/174 MDEyOklzc3VlQ29tbWVudDY5ODQwMDc5MA== simonw 9599 2020-09-24T14:59:50Z 2020-09-24T14:59:50Z OWNER

For reusing the lookup table: I'm going to raise an error if a lookup table exists but without the correct columns. The caller can then add those columns and try again.
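
A small sketch of that check using the standard `sqlite3` module and `PRAGMA table_info`. The exception class and the function name are hypothetical and do not come from the sqlite-utils codebase.

```python
import sqlite3


class InvalidLookupTable(Exception):
    """Raised when an existing lookup table lacks the required columns."""


def check_lookup_table(conn, table, required_columns):
    """Return True if the table exists with every required column,
    False if it does not exist yet, and raise if columns are missing."""
    # The second field of each table_info row is the column name
    existing = {row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')}
    if not existing:
        return False  # no such table, so the caller is free to create it
    missing = sorted(set(required_columns) - existing)
    if missing:
        raise InvalidLookupTable(
            f"{table} already exists but is missing columns: {', '.join(missing)}"
        )
    return True
```

The caller can catch the error, add the missing columns and call the function again.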

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Much, much faster extract() implementation 707944044  
698184166 https://github.com/simonw/sqlite-utils/pull/174#issuecomment-698184166 https://api.github.com/repos/simonw/sqlite-utils/issues/174 MDEyOklzc3VlQ29tbWVudDY5ODE4NDE2Ng== simonw 9599 2020-09-24T08:01:07Z 2020-09-24T08:01:07Z OWNER

I may revert the now unnecessary undocumented tweaks to the .update() method made in 66d506587eba9f0715267d6560b97c1fa44cc781 as well.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Much, much faster extract() implementation 707944044  
698182656 https://github.com/simonw/sqlite-utils/pull/174#issuecomment-698182656 https://api.github.com/repos/simonw/sqlite-utils/issues/174 MDEyOklzc3VlQ29tbWVudDY5ODE4MjY1Ng== simonw 9599 2020-09-24T07:58:08Z 2020-09-24T07:58:08Z OWNER

The way the lookup table works here differs from the previous implementation. In the previous implementation the usage of .lookup() meant that an existing table would be modified to fit the new purpose. That no longer happens in this version. Need to make a design decision about how this should work.

It should definitely be possible to use an existing lookup table - imagine a database where several tables have a "Departments" column and we want to extract all of those values out to a single shared "Departments" table.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Much, much faster extract() implementation 707944044  
698182037 https://github.com/simonw/sqlite-utils/pull/174#issuecomment-698182037 https://api.github.com/repos/simonw/sqlite-utils/issues/174 MDEyOklzc3VlQ29tbWVudDY5ODE4MjAzNw== simonw 9599 2020-09-24T07:56:50Z 2020-09-24T07:56:50Z OWNER

I could also be a bit smarter about transaction handling. I think it may be possible to run this entire operation in a single transaction now.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Much, much faster extract() implementation 707944044  
698181478 https://github.com/simonw/sqlite-utils/pull/174#issuecomment-698181478 https://api.github.com/repos/simonw/sqlite-utils/issues/174 MDEyOklzc3VlQ29tbWVudDY5ODE4MTQ3OA== simonw 9599 2020-09-24T07:55:45Z 2020-09-24T07:55:45Z OWNER

import functools is no longer needed.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Much, much faster extract() implementation 707944044  
698180705 https://github.com/simonw/sqlite-utils/pull/174#issuecomment-698180705 https://api.github.com/repos/simonw/sqlite-utils/issues/174 MDEyOklzc3VlQ29tbWVudDY5ODE4MDcwNQ== simonw 9599 2020-09-24T07:54:10Z 2020-09-24T07:54:10Z OWNER

After running through the steps in https://simonwillison.net/2020/Sep/23/sqlite-utils-extract/ I get a table that looks like this:

https://user-images.githubusercontent.com/9599/94116875-666b8900-fe00-11ea-9e97-2b9ccbfeae29.png

The foreign key columns are all at the end of the table. It would be nicer if they were arranged in the same order as the columns they replaced.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Much, much faster extract() implementation 707944044  
698180113 https://github.com/simonw/sqlite-utils/pull/174#issuecomment-698180113 https://api.github.com/repos/simonw/sqlite-utils/issues/174 MDEyOklzc3VlQ29tbWVudDY5ODE4MDExMw== simonw 9599 2020-09-24T07:53:03Z 2020-09-24T07:53:03Z OWNER

This could do with a little bit more testing - I'm worried there may be column or table name edge cases that are not covered yet. I also need to remove the progress bar code since that no longer makes sense for this implementation.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Much, much faster extract() implementation 707944044  
698178101 https://github.com/simonw/sqlite-utils/issues/172#issuecomment-698178101 https://api.github.com/repos/simonw/sqlite-utils/issues/172 MDEyOklzc3VlQ29tbWVudDY5ODE3ODEwMQ== simonw 9599 2020-09-24T07:48:57Z 2020-09-24T07:49:20Z OWNER

I wonder if I could make this faster by separating it out into a few steps:

* Create the new lookup table with all of the distinct rows

* Add the blank foreign key column

* run a `UPDATE table SET blah_id = (select id from lookup where thang = table.thang)`

* Drop the value columns

My prototype of this knocked the time down from 10 minutes to 4 seconds, so I think the change is worth it!

% date
sqlite-utils extract salaries.db salaries \
   'Department Code' 'Department' \
  --table 'departments' \
  --fk-column 'department_id' \
  --rename 'Department Code' code \
  --rename 'Department' name
date
sqlite-utils extract salaries.db salaries \
   'Union Code' 'Union' \
  --table 'unions' \
  --fk-column 'union_id' \
  --rename 'Union Code' code \
  --rename 'Union' name
date
sqlite-utils extract salaries.db salaries \
   'Job Family Code' 'Job Family' \
  --table 'job_families' \
  --fk-column 'job_family_id' \
  --rename 'Job Family Code' code \
  --rename 'Job Family' name
date
sqlite-utils extract salaries.db salaries \
   'Job Code' 'Job' \
  --table 'jobs' \
  --fk-column 'job_id' \
  --rename 'Job Code' code \
  --rename 'Job' name
date
Thu Sep 24 00:48:16 PDT 2020

Thu Sep 24 00:48:20 PDT 2020

Thu Sep 24 00:48:24 PDT 2020

Thu Sep 24 00:48:28 PDT 2020

Thu Sep 24 00:48:32 PDT 2020
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Improve performance of extract operations 707427200  
698174957 https://github.com/simonw/datasette/issues/123#issuecomment-698174957 https://api.github.com/repos/simonw/datasette/issues/123 MDEyOklzc3VlQ29tbWVudDY5ODE3NDk1Nw== obra 45416 2020-09-24T07:42:05Z 2020-09-24T07:42:05Z NONE

Oh. Awesome.

On Thu, Sep 24, 2020 at 12:28:53AM -0700, Simon Willison wrote:

@obra there's a plugin for that! https://github.com/simonw/datasette-upload-csvs


{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Datasette serve should accept paths/URLs to CSVs and other file formats 275125561  
698168648 https://github.com/simonw/datasette/issues/123#issuecomment-698168648 https://api.github.com/repos/simonw/datasette/issues/123 MDEyOklzc3VlQ29tbWVudDY5ODE2ODY0OA== simonw 9599 2020-09-24T07:28:38Z 2020-09-24T07:28:38Z OWNER

@obra there's a plugin for that! https://github.com/simonw/datasette-upload-csvs

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Datasette serve should accept paths/URLs to CSVs and other file formats 275125561  
698110492 https://github.com/simonw/datasette/issues/974#issuecomment-698110492 https://api.github.com/repos/simonw/datasette/issues/974 MDEyOklzc3VlQ29tbWVudDY5ODExMDQ5Mg== simonw 9599 2020-09-24T04:50:56Z 2020-09-24T04:51:05Z OWNER

Come to think of it I've noticed that in the logs when it's running on my laptop, definitely worth fixing.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
static assets and favicon aren't cached by the browser 707849175  
698110186 https://github.com/simonw/datasette/issues/123#issuecomment-698110186 https://api.github.com/repos/simonw/datasette/issues/123 MDEyOklzc3VlQ29tbWVudDY5ODExMDE4Ng== obra 45416 2020-09-24T04:49:51Z 2020-09-24T04:49:51Z NONE

As a half-measure, I'd get value out of being able to upload a CSV and have datasette run csv-to-sqlite on it.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Datasette serve should accept paths/URLs to CSVs and other file formats 275125561  
698024773 https://github.com/simonw/datasette/issues/619#issuecomment-698024773 https://api.github.com/repos/simonw/datasette/issues/619 MDEyOklzc3VlQ29tbWVudDY5ODAyNDc3Mw== simonw 9599 2020-09-23T23:31:46Z 2020-09-23T23:31:46Z OWNER

I'm going to have to untangle Datasette's error handling a bit for this - currently the expectation is that exceptions will be handled at a higher level, but I need to rethink that to make it cleaner for views like the "execute custom SQL" view to add their own error handling (and still be able to return the correct HTTP status codes, even with custom pages).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
"Invalid SQL" page should let you edit the SQL 520655983  
697998045 https://github.com/simonw/datasette/issues/619#issuecomment-697998045 https://api.github.com/repos/simonw/datasette/issues/619 MDEyOklzc3VlQ29tbWVudDY5Nzk5ODA0NQ== simonw 9599 2020-09-23T22:09:06Z 2020-09-23T22:09:06Z OWNER

I'll add this to the successful JSON format:

{
  "ok": true,
  "error": null
}
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
"Invalid SQL" page should let you edit the SQL 520655983  
697995885 https://github.com/simonw/datasette/issues/619#issuecomment-697995885 https://api.github.com/repos/simonw/datasette/issues/619 MDEyOklzc3VlQ29tbWVudDY5Nzk5NTg4NQ== simonw 9599 2020-09-23T22:02:44Z 2020-09-23T22:08:28Z OWNER

So the JSON (still served with a 500 code) will look something like this:

{
  "ok": false,
  "status": 500,
  "database": "fixtures",
  "query_name": null,
  "rows": [],
  "truncated": false,
  "error": "Error message goes here",
  "columns": [],
  "query": {
    "sql": "the query that broke goes here",
    "params": {}
  },
  "private": false,
  "allow_execute_sql": true,
  "query_ms": 0.8716583251953125,
  "source": "tests/fixtures.py",
  "source_url": "https://github.com/simonw/datasette/blob/master/tests/fixtures.py",
  "license": "Apache License 2.0",
  "license_url": "https://github.com/simonw/datasette/blob/master/LICENSE"
}
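
A sketch of how a client might consume this shape, assuming the `ok`, `error` and `rows` keys end up exactly as proposed in this comment. The `QueryError` class and `rows_or_error` helper are hypothetical and are not part of Datasette.

```python
import json


class QueryError(Exception):
    """Raised when the JSON body reports a failed query."""


def rows_or_error(body):
    """Return the rows from a JSON response body, or raise on "ok": false."""
    payload = json.loads(body)
    if payload.get("ok") is False:
        raise QueryError(payload.get("error") or "Unknown error")
    return payload["rows"]
```

Checking `ok` in the body means the client does not have to rely on the HTTP status code alone.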
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
"Invalid SQL" page should let you edit the SQL 520655983  
697995303 https://github.com/simonw/datasette/issues/619#issuecomment-697995303 https://api.github.com/repos/simonw/datasette/issues/619 MDEyOklzc3VlQ29tbWVudDY5Nzk5NTMwMw== simonw 9599 2020-09-23T22:01:08Z 2020-09-23T22:01:08Z OWNER

This is a little tricky to solve, because of the location of the form and the need to return JSON as well as HTML. It would be weird if a JSON request came in and got back the standard output from https://latest.datasette.io/fixtures.json when they were expecting to get back JSON in the shape of https://latest.datasette.io/fixtures.json?sql=select%20*%20from%20sqlite_master

I'm going to return the HTML view that you would get for 0 results for a query - https://latest.datasette.io/fixtures?sql=select%201%20limit%200 - but with an error message.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
"Invalid SQL" page should let you edit the SQL 520655983  
697980061 https://github.com/simonw/datasette/issues/619#issuecomment-697980061 https://api.github.com/repos/simonw/datasette/issues/619 MDEyOklzc3VlQ29tbWVudDY5Nzk4MDA2MQ== simonw 9599 2020-09-23T21:22:42Z 2020-09-23T21:22:42Z OWNER

Yeah that sucks. Bumping this up the priority list.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
"Invalid SQL" page should let you edit the SQL 520655983  
697973420 https://github.com/simonw/datasette/issues/619#issuecomment-697973420 https://api.github.com/repos/simonw/datasette/issues/619 MDEyOklzc3VlQ29tbWVudDY5Nzk3MzQyMA== obra 45416 2020-09-23T21:07:58Z 2020-09-23T21:07:58Z NONE

I've just run into this after crafting a complex query and discovered that hitting back loses my query.

Even showing me the whole bad query would be a huge improvement over the current status quo.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
"Invalid SQL" page should let you edit the SQL 520655983  
697869886 https://github.com/simonw/sqlite-utils/issues/172#issuecomment-697869886 https://api.github.com/repos/simonw/sqlite-utils/issues/172 MDEyOklzc3VlQ29tbWVudDY5Nzg2OTg4Ng== simonw 9599 2020-09-23T18:45:30Z 2020-09-23T18:45:30Z OWNER

There's something to be said for making this operation pausable and resumable, especially if I'm going to make it available in a Datasette plugin at some point.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Improve performance of extract operations 707427200  
697866885 https://github.com/simonw/sqlite-utils/issues/172#issuecomment-697866885 https://api.github.com/repos/simonw/sqlite-utils/issues/172 MDEyOklzc3VlQ29tbWVudDY5Nzg2Njg4NQ== simonw 9599 2020-09-23T18:43:37Z 2020-09-23T18:43:37Z OWNER

Also what would happen if the table had new rows added to it while that command was running?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Improve performance of extract operations 707427200  
697863116 https://github.com/simonw/sqlite-utils/issues/172#issuecomment-697863116 https://api.github.com/repos/simonw/sqlite-utils/issues/172 MDEyOklzc3VlQ29tbWVudDY5Nzg2MzExNg== simonw 9599 2020-09-23T18:41:06Z 2020-09-23T18:41:06Z OWNER

The problem with this approach is that it's not compatible with progress bars, but if it's many times faster it's worth it.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Improve performance of extract operations 707427200  
697859772 https://github.com/simonw/sqlite-utils/issues/172#issuecomment-697859772 https://api.github.com/repos/simonw/sqlite-utils/issues/172 MDEyOklzc3VlQ29tbWVudDY5Nzg1OTc3Mg== simonw 9599 2020-09-23T18:38:43Z 2020-09-23T18:38:52Z OWNER

I wonder if I could make this faster by separating it out into a few steps:
- Create the new lookup table with all of the distinct rows
- Add the blank foreign key column
- run a UPDATE table SET blah_id = (select id from lookup where thang = table.thang)
- Drop the value columns
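
The four steps above can be sketched with the standard `sqlite3` module. This is an illustration under assumptions, not the sqlite-utils implementation: it handles a single column, the function name `extract_column` is hypothetical, and the final step rebuilds the table with `CREATE TABLE ... AS SELECT`, which does not preserve column types or constraints.

```python
import sqlite3


def extract_column(conn, table, column, lookup, fk_column):
    """Move the distinct values of one column into a lookup table."""
    with conn:  # commit on success, roll back if any statement fails
        # 1. Create the lookup table with all of the distinct values
        conn.execute(
            f'CREATE TABLE "{lookup}" (id INTEGER PRIMARY KEY, "{column}" TEXT UNIQUE)'
        )
        conn.execute(
            f'INSERT INTO "{lookup}" ("{column}") '
            f'SELECT DISTINCT "{column}" FROM "{table}" WHERE "{column}" IS NOT NULL'
        )
        # 2. Add the blank foreign key column
        conn.execute(f'ALTER TABLE "{table}" ADD COLUMN "{fk_column}" INTEGER')
        # 3. Populate it with a single UPDATE and a correlated subquery
        conn.execute(
            f'UPDATE "{table}" SET "{fk_column}" = ('
            f'SELECT id FROM "{lookup}" '
            f'WHERE "{lookup}"."{column}" = "{table}"."{column}")'
        )
        # 4. Drop the value column by rebuilding the table without it
        keep = [
            row[1]
            for row in conn.execute(f'PRAGMA table_info("{table}")')
            if row[1] != column
        ]
        cols = ", ".join(f'"{c}"' for c in keep)
        conn.execute(f'CREATE TABLE "{table}_new" AS SELECT {cols} FROM "{table}"')
        conn.execute(f'DROP TABLE "{table}"')
        conn.execute(f'ALTER TABLE "{table}_new" RENAME TO "{table}"')
```

Rows whose value is NULL keep a NULL foreign key, because the `=` comparison in step 3 never matches NULL. The `UNIQUE` constraint gives the lookup column an index, which keeps the correlated subquery from scanning the lookup table for every row.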

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Improve performance of extract operations 707427200  
697835956 https://github.com/simonw/sqlite-utils/issues/172#issuecomment-697835956 https://api.github.com/repos/simonw/sqlite-utils/issues/172 MDEyOklzc3VlQ29tbWVudDY5NzgzNTk1Ng== simonw 9599 2020-09-23T18:22:49Z 2020-09-23T18:22:49Z OWNER

I ran sudo py-spy top -p 123 against the process while it was running, and most of the time is definitely spent in .update():

Total Samples 1000
GIL: 0.00%, Active: 90.00%, Threads: 1

  %Own   %Total  OwnTime  TotalTime  Function (filename:line)                                                                                                                                  
 38.00%  38.00%    3.85s     3.85s   update (sqlite_utils/db.py:1283)
 27.00%  27.00%    2.12s     2.12s   execute (sqlite_utils/db.py:161)
 10.00%  10.00%   0.890s    0.890s   execute (sqlite_utils/db.py:163)
 10.00%  17.00%   0.870s     1.54s   columns (sqlite_utils/db.py:553)
  0.00%   0.00%   0.110s    0.210s   <listcomp> (sqlite_utils/db.py:554)
  0.00%   3.00%   0.100s    0.320s   table_names (sqlite_utils/db.py:191)
  0.00%   0.00%   0.100s    0.100s   __new__ (<string>:1)
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Improve performance of extract operations 707427200  
697577646 https://github.com/simonw/sqlite-utils/issues/173#issuecomment-697577646 https://api.github.com/repos/simonw/sqlite-utils/issues/173 MDEyOklzc3VlQ29tbWVudDY5NzU3NzY0Ng== simonw 9599 2020-09-23T15:48:51Z 2020-09-23T15:48:51Z OWNER

This can only work when it's reading from a file, not when it's reading from standard input.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Progress bar for sqlite-utils insert 707478649  
697545290 https://github.com/simonw/datasette/issues/111#issuecomment-697545290 https://api.github.com/repos/simonw/datasette/issues/111 MDEyOklzc3VlQ29tbWVudDY5NzU0NTI5MA== simonw 9599 2020-09-23T15:29:11Z 2020-09-23T15:29:11Z OWNER

This is still a good idea.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add “last_updated” to metadata 274615452  
697473247 https://github.com/simonw/sqlite-utils/issues/172#issuecomment-697473247 https://api.github.com/repos/simonw/sqlite-utils/issues/172 MDEyOklzc3VlQ29tbWVudDY5NzQ3MzI0Nw== simonw 9599 2020-09-23T14:45:13Z 2020-09-23T14:45:13Z OWNER

lookup_table.lookup(lookups) is doing a SQL lookup. This could be cached in-memory, maybe with a LRU cache, to avoid looking up the primary key for records that we have recently used.

The .update() method it is calling first does a get() and then does a SQL UPDATE ... WHERE:

https://github.com/simonw/sqlite-utils/blob/1ebffe1dbeaed7311e5b61ed988f4cd701e84808/sqlite_utils/db.py#L1244-L1264

Batching those updates may have an effect. Or finding a way to skip the .get() since we already know we have a valid record.
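
A minimal sketch of the in-memory caching idea using `functools.lru_cache` from the standard library. The factory name `make_cached_lookup` and the single-column lookup are assumptions made for illustration; the real `.lookup()` method takes a dictionary of values.

```python
import functools
import sqlite3


def make_cached_lookup(conn, table, column, maxsize=1024):
    """Return a function mapping a value to its primary key.

    Rows are created on first sight, and recent answers are served from an
    LRU cache instead of a SQL query.
    """

    @functools.lru_cache(maxsize=maxsize)
    def lookup(value):
        row = conn.execute(
            f'SELECT id FROM "{table}" WHERE "{column}" = ?', (value,)
        ).fetchone()
        if row is not None:
            return row[0]
        cursor = conn.execute(
            f'INSERT INTO "{table}" ("{column}") VALUES (?)', (value,)
        )
        return cursor.lastrowid

    return lookup
```

Each distinct value costs at most one SELECT and one INSERT while it stays in the cache, and `lookup.cache_info()` reports the hit rate.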

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Improve performance of extract operations 707427200  
697467833 https://github.com/simonw/sqlite-utils/issues/172#issuecomment-697467833 https://api.github.com/repos/simonw/sqlite-utils/issues/172 MDEyOklzc3VlQ29tbWVudDY5NzQ2NzgzMw== simonw 9599 2020-09-23T14:42:03Z 2020-09-23T14:42:03Z OWNER

Here's the loop that's taking the time: https://github.com/simonw/sqlite-utils/blob/1ebffe1dbeaed7311e5b61ed988f4cd701e84808/sqlite_utils/db.py#L892-L897

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Improve performance of extract operations 707427200  
697466497 https://github.com/simonw/sqlite-utils/issues/172#issuecomment-697466497 https://api.github.com/repos/simonw/sqlite-utils/issues/172 MDEyOklzc3VlQ29tbWVudDY5NzQ2NjQ5Nw== simonw 9599 2020-09-23T14:41:17Z 2020-09-23T14:41:17Z OWNER

Steps to produce that database:

curl -o salaries.csv 'https://data.sfgov.org/api/views/88g8-5mnd/rows.csv?accessType=DOWNLOAD'
sqlite-utils insert salaries.db salaries salaries.csv --csv
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Improve performance of extract operations 707427200  
697073465 https://github.com/simonw/datasette/issues/970#issuecomment-697073465 https://api.github.com/repos/simonw/datasette/issues/970 MDEyOklzc3VlQ29tbWVudDY5NzA3MzQ2NQ== secretGeek 2861690 2020-09-23T01:49:05Z 2020-09-23T01:49:05Z NONE

Oh wow oh wow. Thanks so much Simon. In an astoundingly rough week, this is a shining jewel. 🤣

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
request an "-o" option on "datasette server" to open the default browser at the running url 705108492  
697047591 https://github.com/simonw/sqlite-utils/issues/170#issuecomment-697047591 https://api.github.com/repos/simonw/sqlite-utils/issues/170 MDEyOklzc3VlQ29tbWVudDY5NzA0NzU5MQ== simonw 9599 2020-09-23T00:14:52Z 2020-09-23T00:14:52Z OWNER

- 4824775: @db.register_function decorator, closes #162
- 987dd12: table.transform() method - closes #114
- f8e10df: Keyword only arguments for transform(). Also renamed columns= to types=. Closes #165

Commits on Sep 22, 2020

- 752d261: Implemented sqlite-utils transform command, closes #164
- f29f682: Applied Black
- f855379: table.extract() method, refs #42
- c755f28: Docstring for sqlite-utils transform
- c3210f2: Added table.extract(rename=) option, refs #42
- 317071a: Applied Black
- 7178231: New .rows_where(select=) argument
- 2db6c5b: table.extract() now works with rowid tables, refs #42
- 55cf928: sqlite-utils extract, closes #42
- 5c4d58d: Progress bar for "sqlite-utils extract", closes #169
- Fixed PRAGMA foreign_keys handling for .transform, closes #167

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Release notes for 2.20 706768798  
697037974 https://github.com/simonw/sqlite-utils/issues/42#issuecomment-697037974 https://api.github.com/repos/simonw/sqlite-utils/issues/42 MDEyOklzc3VlQ29tbWVudDY5NzAzNzk3NA== simonw 9599 2020-09-22T23:39:31Z 2020-09-22T23:39:31Z OWNER

Documentation for sqlite-utils extract: https://sqlite-utils.readthedocs.io/en/latest/cli.html#extracting-columns-into-a-separate-table

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.extract(...) method and "sqlite-utils extract" command 470345929  
697031174 https://github.com/simonw/sqlite-utils/issues/42#issuecomment-697031174 https://api.github.com/repos/simonw/sqlite-utils/issues/42 MDEyOklzc3VlQ29tbWVudDY5NzAzMTE3NA== simonw 9599 2020-09-22T23:16:00Z 2020-09-22T23:16:00Z OWNER

Trying this demo again:

wget 'https://raw.githubusercontent.com/wri/global-power-plant-database/master/output_database/global_power_plant_database.csv'
sqlite-utils insert global.db power_plants global_power_plant_database.csv --csv
sqlite-utils extract global.db power_plants country country_long --table countries --rename country_long name

It worked!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.extract(...) method and "sqlite-utils extract" command 470345929  
697025403 https://github.com/simonw/sqlite-utils/issues/42#issuecomment-697025403 https://api.github.com/repos/simonw/sqlite-utils/issues/42 MDEyOklzc3VlQ29tbWVudDY5NzAyNTQwMw== simonw 9599 2020-09-22T22:57:53Z 2020-09-22T22:57:53Z OWNER

The documentation for the .extract() method is here: https://sqlite-utils.readthedocs.io/en/latest/python-api.html#extracting-columns-into-a-separate-table

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.extract(...) method and "sqlite-utils extract" command 470345929  
697019944 https://github.com/simonw/sqlite-utils/issues/42#issuecomment-697019944 https://api.github.com/repos/simonw/sqlite-utils/issues/42 MDEyOklzc3VlQ29tbWVudDY5NzAxOTk0NA== simonw 9599 2020-09-22T22:40:00Z 2020-09-22T22:40:00Z OWNER

I tried out the prototype of the CLI on the Global Power Plants data:

wget 'https://raw.githubusercontent.com/wri/global-power-plant-database/master/output_database/global_power_plant_database.csv'
sqlite-utils insert global.db power_plants global_power_plant_database.csv --csv
sqlite-utils extract global.db power_plants country country_long

This threw an error because rowid columns are not yet supported. I fixed that like so:

sqlite-utils transform global.db power_plants --rename rowid id
sqlite-utils extract global.db power_plants country country_long

That worked! But it didn't play great with Datasette, because the resulting extracted table had columns country and country_long, and neither of those is called name, value or title.

Based on this I need to add rowid table support AND I need to implement the proposed rename= argument for renaming columns on their way into the new table.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.extract(...) method and "sqlite-utils extract" command 470345929  
697013681 https://github.com/simonw/sqlite-utils/issues/42#issuecomment-697013681 https://api.github.com/repos/simonw/sqlite-utils/issues/42 MDEyOklzc3VlQ29tbWVudDY5NzAxMzY4MQ== simonw 9599 2020-09-22T22:22:49Z 2020-09-22T22:22:49Z OWNER

The command-line version of this needs to accept a table and one or more columns, plus optional --table and --fk-column options.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.extract(...) method and "sqlite-utils extract" command 470345929  
697012111 https://github.com/simonw/sqlite-utils/issues/42#issuecomment-697012111 https://api.github.com/repos/simonw/sqlite-utils/issues/42 MDEyOklzc3VlQ29tbWVudDY5NzAxMjExMQ== simonw 9599 2020-09-22T22:18:13Z 2020-09-22T22:18:13Z OWNER

Here's how I'm generating the examples for the documentation:

In [2]: import sqlite_utils

In [3]: db = sqlite_utils.Database(memory=True)

In [4]: db["Trees"].insert({"id": 1, "TreeAddress": "52 Vine St", "CommonName": "Palm", "LatinName": "foo"}, pk="id")
Out[4]: <Table Trees (id, TreeAddress, CommonName, LatinName)>

In [5]: db["Trees"].extract(["CommonName", "LatinName"], table="Species", fk_column="species_id")

In [6]: print(db["Trees"].schema)
CREATE TABLE "Trees" (
   [id] INTEGER PRIMARY KEY,
   [TreeAddress] TEXT,
   [species_id] INTEGER,
   FOREIGN KEY(species_id) REFERENCES Species(id)
)

In [7]: print(db["Species"].schema)
CREATE TABLE [Species] (
   [id] INTEGER PRIMARY KEY,
   [CommonName] TEXT,
   [LatinName] TEXT
)
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.extract(...) method and "sqlite-utils extract" command 470345929  
696987925 https://github.com/simonw/sqlite-utils/issues/42#issuecomment-696987925 https://api.github.com/repos/simonw/sqlite-utils/issues/42 MDEyOklzc3VlQ29tbWVudDY5Njk4NzkyNQ== simonw 9599 2020-09-22T21:19:04Z 2020-09-22T21:19:04Z OWNER

Need to make sure this works correctly for rowid tables.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.extract(...) method and "sqlite-utils extract" command 470345929  
696987257 https://github.com/simonw/sqlite-utils/issues/42#issuecomment-696987257 https://api.github.com/repos/simonw/sqlite-utils/issues/42 MDEyOklzc3VlQ29tbWVudDY5Njk4NzI1Nw== simonw 9599 2020-09-22T21:17:34Z 2020-09-22T21:17:34Z OWNER

What to do if the table already exists? The .lookup() function already knows how to modify an existing table to create the correct constraints etc, so I'll rely on that mechanism.
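
To illustrate the "get or create" behaviour being relied on here, this is a minimal sketch using only the standard-library sqlite3 module. It is not the sqlite-utils implementation of .lookup(); the function name, the single-column signature and the index naming are assumptions made for the example.

```python
import sqlite3


def lookup(conn, table, column, value):
    """Return the integer ID for value in table, creating the row if needed.

    Hypothetical helper: works whether or not the table already exists.
    """
    # Create the lookup table if it is missing
    conn.execute(
        f"CREATE TABLE IF NOT EXISTS [{table}] "
        f"(id INTEGER PRIMARY KEY, [{column}] TEXT)"
    )
    # Add the unique constraint an existing table may be lacking
    # (this would fail if the existing table already held duplicates)
    conn.execute(
        f"CREATE UNIQUE INDEX IF NOT EXISTS [idx_{table}_{column}] "
        f"ON [{table}] ([{column}])"
    )
    # Insert only if the value is not already present
    conn.execute(
        f"INSERT OR IGNORE INTO [{table}] ([{column}]) VALUES (?)", (value,)
    )
    row = conn.execute(
        f"SELECT id FROM [{table}] WHERE [{column}] = ?", (value,)
    ).fetchone()
    return row[0]


conn = sqlite3.connect(":memory:")
# A pre-existing table with no unique constraint
conn.execute("CREATE TABLE species (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO species (name) VALUES ('Palm')")

palm_id = lookup(conn, "species", "name", "Palm")
oak_id = lookup(conn, "species", "name", "Oak")
palm_again = lookup(conn, "species", "name", "Palm")
```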

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.extract(...) method and "sqlite-utils extract" command 470345929  
696980709 https://github.com/simonw/sqlite-utils/issues/42#issuecomment-696980709 https://api.github.com/repos/simonw/sqlite-utils/issues/42 MDEyOklzc3VlQ29tbWVudDY5Njk4MDcwOQ== simonw 9599 2020-09-22T21:05:07Z 2020-09-22T21:05:07Z OWNER

So .extract() probably takes a batch_size= argument too, which defaults to maybe 1000.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.extract(...) method and "sqlite-utils extract" command 470345929  
696980503 https://github.com/simonw/sqlite-utils/issues/42#issuecomment-696980503 https://api.github.com/repos/simonw/sqlite-utils/issues/42 MDEyOklzc3VlQ29tbWVudDY5Njk4MDUwMw== simonw 9599 2020-09-22T21:04:45Z 2020-09-22T21:04:45Z OWNER

table.extract() can take an optional progress= argument: a callback used to report progress, called after each batch with (num_done, total). It will also be called once with (0, total) at the start so that progress bars can be initialized. The command-line progress bar will use this.
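
The callback protocol described here can be sketched in plain Python. This is only an illustration of the calling convention, not code from sqlite-utils; process_in_batches and handle_batch are hypothetical names, and the batch size of 1000 follows the tentative default mentioned in this thread.

```python
def process_in_batches(rows, handle_batch, batch_size=1000, progress=None):
    """Process rows in batches, reporting progress as (num_done, total)."""
    total = len(rows)
    if progress is not None:
        # Initial call so a progress bar can be set up before any work happens
        progress(0, total)
    num_done = 0
    for start in range(0, total, batch_size):
        batch = rows[start:start + batch_size]
        handle_batch(batch)
        num_done += len(batch)
        if progress is not None:
            # One call after each completed batch
            progress(num_done, total)
    return num_done


calls = []
process_in_batches(
    list(range(2500)),
    handle_batch=lambda batch: None,  # stand-in for the real per-batch work
    progress=lambda done, total: calls.append((done, total)),
)
```

With 2,500 rows and the default batch size, the callback is invoked four times: once at the start and once after each of the three batches.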

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.extract(...) method and "sqlite-utils extract" command 470345929  
696979626 https://github.com/simonw/sqlite-utils/issues/42#issuecomment-696979626 https://api.github.com/repos/simonw/sqlite-utils/issues/42 MDEyOklzc3VlQ29tbWVudDY5Njk3OTYyNg== simonw 9599 2020-09-22T21:03:11Z 2020-09-22T21:03:11Z OWNER

And if you want to rename some of the columns in the new table:

db["trees"].extract(["common_name", "latin_name"], table="species", rename={"common_name": "name"})
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.extract(...) method and "sqlite-utils extract" command 470345929  
696979168 https://github.com/simonw/sqlite-utils/issues/42#issuecomment-696979168 https://api.github.com/repos/simonw/sqlite-utils/issues/42 MDEyOklzc3VlQ29tbWVudDY5Njk3OTE2OA== simonw 9599 2020-09-22T21:02:24Z 2020-09-22T21:02:24Z OWNER

In Python it looks like this:

# Simple case - the species column is replaced by species_id, pointing to a new species table
db["trees"].extract("species")

# Setting a custom table
db["trees"].extract("species", table="Species")

# Custom foreign key column on trees
db["trees"].extract("species", fk_column="species")

# Extracting multiple columns
db["trees"].extract(["common_name", "latin_name"])
# (this creates a lookup table called common_name_latin_name ref'd by common_name_latin_name_id)

# Or with explicit table (fk_column here defaults to species_id because of the table name)
db["trees"].extract(["common_name", "latin_name"], table="species")
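
The default naming described in the comments above could be derived roughly as follows. This is a sketch of those rules as stated in this comment only; default_extract_names is a hypothetical helper, not part of sqlite-utils, and the released behaviour may differ.

```python
def default_extract_names(columns, table=None, fk_column=None):
    """Return (table, fk_column) using the defaults described in the comment."""
    if isinstance(columns, str):
        columns = [columns]
    # The lookup table defaults to the column names joined with underscores
    table = table or "_".join(columns)
    # The foreign key column defaults to <table>_id
    fk_column = fk_column or f"{table}_id"
    return table, fk_column


simple = default_extract_names("species")
custom_fk = default_extract_names("species", fk_column="species")
multi = default_extract_names(["common_name", "latin_name"])
explicit_table = default_extract_names(["common_name", "latin_name"], table="species")
```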
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.extract(...) method and "sqlite-utils extract" command 470345929  
696976678 https://github.com/simonw/sqlite-utils/issues/42#issuecomment-696976678 https://api.github.com/repos/simonw/sqlite-utils/issues/42 MDEyOklzc3VlQ29tbWVudDY5Njk3NjY3OA== simonw 9599 2020-09-22T20:57:57Z 2020-09-22T20:57:57Z OWNER

I think I understand the shape of this feature now. It lets you specify one or more columns on the source table which will be extracted into another table. It uses the .lookup() mechanism to populate that other table, which means each unique column value / pair / triple will be assigned an integer ID.

That integer ID gets written back into the first of the columns that are being extracted. A .transform() call then converts that column to an integer (and drops the additional columns). Finally we set up the new foreign key relationship.
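
The steps above can be sketched with the standard-library sqlite3 module. This is an illustration of the idea under simplifying assumptions (a single extracted column, a hand-written table rebuild standing in for .transform()), not the sqlite-utils implementation; the table and column names are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE trees (id INTEGER PRIMARY KEY, address TEXT, species TEXT);
    INSERT INTO trees (address, species) VALUES
        ('52 Vine St', 'Palm'),
        ('12 Oak Ave', 'Oak'),
        ('9 Elm Rd', 'Palm');

    -- 1. Lookup table: each unique value is assigned an integer ID
    CREATE TABLE species (id INTEGER PRIMARY KEY, value TEXT UNIQUE);
    INSERT INTO species (value)
        SELECT DISTINCT species FROM trees ORDER BY species;

    -- 2. Write the integer ID back into the original column
    UPDATE trees SET species = (
        SELECT id FROM species WHERE value = trees.species
    );

    -- 3. Rebuild the table so the column is an integer foreign key
    CREATE TABLE trees_new (
        id INTEGER PRIMARY KEY,
        address TEXT,
        species_id INTEGER REFERENCES species(id)
    );
    INSERT INTO trees_new SELECT id, address, species FROM trees;
    DROP TABLE trees;
    ALTER TABLE trees_new RENAME TO trees;
""")

rows = conn.execute("""
    SELECT trees.id, trees.address, species.value
    FROM trees JOIN species ON trees.species_id = species.id
    ORDER BY trees.id
""").fetchall()
species_count = conn.execute("SELECT COUNT(*) FROM species").fetchone()[0]
```

Joining back through species_id recovers the original text values, while the species table holds each distinct value only once.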

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.extract(...) method and "sqlite-utils extract" command 470345929  
696893774 https://github.com/simonw/sqlite-utils/issues/42#issuecomment-696893774 https://api.github.com/repos/simonw/sqlite-utils/issues/42 MDEyOklzc3VlQ29tbWVudDY5Njg5Mzc3NA== simonw 9599 2020-09-22T18:15:33Z 2020-09-22T18:15:33Z OWNER

I think the new foreign key column is called company_name_id by default in this example but can be customized by passing --fk-column=xxx

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.extract(...) method and "sqlite-utils extract" command 470345929  
696893244 https://github.com/simonw/sqlite-utils/issues/42#issuecomment-696893244 https://api.github.com/repos/simonw/sqlite-utils/issues/42 MDEyOklzc3VlQ29tbWVudDY5Njg5MzI0NA== simonw 9599 2020-09-22T18:14:33Z 2020-09-22T18:14:45Z OWNER

Thinking more about this one:

$ sqlite-utils extract my.db \
    dea_sales company_name company_address \
    --table companies

The goal here is to pull the company name and address pair out into a separate table.

Some questions:
- should this first verify that every company_name has just one company_address? I like the idea of a unique constraint on the created table for this.
- what should the foreign key column that gets added to the dea_sales table (pointing at companies) be called?
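
For the first question, one way to check whether every company_name maps to a single company_address is a GROUP BY query. The sketch below uses the standard-library sqlite3 module with made-up sample rows; it is an assumption about how such a check could look, not something sqlite-utils is stated to do.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dea_sales (company_name TEXT, company_address TEXT);
    INSERT INTO dea_sales VALUES
        ('Acme', '1 Main St'),
        ('Acme', '1 Main St'),
        ('Globex', '5 High St'),
        ('Globex', '7 Low Rd');
""")

# Company names that appear with more than one distinct address
conflicts = conn.execute("""
    SELECT company_name, COUNT(DISTINCT company_address)
    FROM dea_sales
    GROUP BY company_name
    HAVING COUNT(DISTINCT company_address) > 1
    ORDER BY company_name
""").fetchall()
```

An empty result would mean the name and address pair can be extracted with a unique constraint on company_name alone.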

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.extract(...) method and "sqlite-utils extract" command 470345929  
513262013 https://github.com/simonw/sqlite-utils/issues/42#issuecomment-513262013 https://api.github.com/repos/simonw/sqlite-utils/issues/42 MDEyOklzc3VlQ29tbWVudDUxMzI2MjAxMw== simonw 9599 2019-07-19T14:58:23Z 2020-09-22T18:12:11Z OWNER

CLI design idea:

$ sqlite-utils extract my.db \
    dea_sales company_name

Here we just specify the original table and column - the new extracted table will automatically be called "company_name" and will have "id" and "value" columns, by default.

To set a custom extract table:

$ sqlite-utils extract my.db \
    dea_sales company_name \
    --table companies

And for extracting multiple columns and renaming them on the created table, maybe something like this:

$ sqlite-utils extract my.db \
    dea_sales company_name company_address \
    --table companies \
    --column company_name name \
    --column company_address address
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.extract(...) method and "sqlite-utils extract" command 470345929  
696800410 https://github.com/simonw/datasette/issues/973#issuecomment-696800410 https://api.github.com/repos/simonw/datasette/issues/973 MDEyOklzc3VlQ29tbWVudDY5NjgwMDQxMA== simonw 9599 2020-09-22T15:35:28Z 2020-09-22T15:35:28Z OWNER

Confirmed in local dev:

% datasette fixtures.db --inspect-file inspect.json
Traceback (most recent call last):
  File "/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/bin/datasette", line 11, in <module>
    load_entry_point('datasette', 'console_scripts', 'datasette')()
  File "/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/Users/simon/Dropbox/Development/datasette/datasette/cli.py", line 406, in serve
    inspect_data = json.load(open(inspect_file))
TypeError: 'bool' object is not callable
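
The comment does not state the cause. One way to get this error on a line like json.load(open(inspect_file)), assumed here purely for illustration, is a boolean parameter named open shadowing the built-in function inside the same scope. The function names below are hypothetical and are not Datasette's code.

```python
import json
import os
import tempfile


def serve_broken(inspect_file, open=False):
    # Inside this function `open` is the boolean parameter, not the built-in
    return json.load(open(inspect_file))


def serve_fixed(inspect_file, open_browser=False):
    # Renaming the parameter leaves the built-in open() available
    with open(inspect_file) as fp:
        return json.load(fp)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "inspect.json")
    with open(path, "w") as fp:
        json.dump({"fixtures": {}}, fp)

    try:
        serve_broken(path)
        error_message = None
    except TypeError as ex:
        error_message = str(ex)

    fixed_result = serve_fixed(path)
```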
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
'bool' object is not callable error 706486323  

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
, [performed_via_github_app] TEXT);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);
Powered by Datasette · Query took 1094.617ms · About: github-to-sqlite