issue_comments

9 rows where "created_at" is on date 2020-02-24 and issue = 570101428 sorted by updated_at descending

All 9 comments are by simonw (author_association OWNER) on issue 570101428: .execute_write() and .execute_write_fn() methods on Database.
id html_url issue_url node_id user created_at updated_at author_association body reactions issue performed_via_github_app
590608228 https://github.com/simonw/datasette/pull/683#issuecomment-590608228 https://api.github.com/repos/simonw/datasette/issues/683 MDEyOklzc3VlQ29tbWVudDU5MDYwODIyOA== simonw 9599 2020-02-24T23:52:35Z 2020-02-24T23:52:35Z OWNER

I'm going to punt on the ability to introspect the write queue and poll for completion using a UUID for the moment. Can add those later.

590607385 https://github.com/simonw/datasette/pull/683#issuecomment-590607385 https://api.github.com/repos/simonw/datasette/issues/683 MDEyOklzc3VlQ29tbWVudDU5MDYwNzM4NQ== simonw 9599 2020-02-24T23:49:37Z 2020-02-24T23:49:37Z OWNER

Here's the `upload_csv.py` plugin file I've been playing with:

```python
from datasette import hookimpl
from starlette.responses import PlainTextResponse, HTMLResponse
from starlette.endpoints import HTTPEndpoint
import csv as csv_std
import codecs
import sqlite_utils


class UploadApp(HTTPEndpoint):
    def __init__(self, scope, receive, send, datasette):
        self.datasette = datasette
        super().__init__(scope, receive, send)

    def get_database(self):
        # For the moment just use the first one that's not immutable
        mutable = [db for db in self.datasette.databases.values() if db.is_mutable]
        return mutable[0]

    async def get(self, request):
        return HTMLResponse(
            await self.datasette.render_template(
                "upload_csv.html", {"database_name": self.get_database().name}
            )
        )

    async def post(self, request):
        formdata = await request.form()
        csv = formdata["csv"]
        # csv.file is a SpooledTemporaryFile, I can read it directly
        filename = csv.filename
        # TODO: Support other encodings:
        reader = csv_std.reader(codecs.iterdecode(csv.file, "utf-8"))
        headers = next(reader)
        docs = (dict(zip(headers, row)) for row in reader)
        if filename.endswith(".csv"):
            filename = filename[:-4]
        # Import data into a table of that name using sqlite-utils
        db = self.get_database()

        def fn(conn):
            writable_conn = sqlite_utils.Database(db.path)
            writable_conn[filename].insert_all(docs, alter=True)
            return writable_conn[filename].count

        # Without block=True we may attempt 'select count(*) from ...'
        # before the table has been created by the write thread
        count = await db.execute_write_fn(fn, block=True)

        return HTMLResponse(
            await self.datasette.render_template(
                "upload_csv_done.html",
                {
                    "database": self.get_database().name,
                    "table": filename,
                    "num_docs": count,
                },
            )
        )


@hookimpl
def asgi_wrapper(datasette):
    def wrap_with_asgi_auth(app):
        async def wrapped_app(scope, receive, send):
            if scope["path"] == "/-/upload-csv":
                await UploadApp(scope, receive, send, datasette)
            else:
                await app(scope, receive, send)

        return wrapped_app

    return wrap_with_asgi_auth
```

I also dropped copies of the two template files from https://github.com/simonw/datasette-upload-csvs/tree/699e6ca591f36264bfc8e590d877e6852f274beb/datasette_upload_csvs/templates into my `write-templates/` directory.
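The import step in the plugin above boils down to: parse the CSV headers, insert the rows in one write, then count them. Here is a minimal standalone sketch of that step using only the standard library (no Datasette or sqlite-utils; `import_csv` is a hypothetical helper, and Datasette's real `execute_write_fn` would queue it onto a dedicated write thread rather than calling it directly):

```python
import csv
import io
import sqlite3

def import_csv(conn, table, fileobj):
    # Read the header row, then stream the remaining rows as dicts
    reader = csv.reader(fileobj)
    headers = next(reader)
    docs = (dict(zip(headers, row)) for row in reader)
    # Create a table with one TEXT column per CSV header
    conn.execute(
        "CREATE TABLE IF NOT EXISTS {} ({})".format(
            table, ", ".join("[{}]".format(h) for h in headers)
        )
    )
    conn.executemany(
        "INSERT INTO {} VALUES ({})".format(table, ", ".join("?" * len(headers))),
        (tuple(d[h] for h in headers) for d in docs),
    )
    conn.commit()
    # Return the row count, like the plugin's fn() does
    return conn.execute("SELECT count(*) FROM {}".format(table)).fetchone()[0]

conn = sqlite3.connect(":memory:")
count = import_csv(conn, "pets", io.StringIO("name,age\nCleo,5\nPancakes,4\n"))
print(count)  # 2
```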

590606825 https://github.com/simonw/datasette/pull/683#issuecomment-590606825 https://api.github.com/repos/simonw/datasette/issues/683 MDEyOklzc3VlQ29tbWVudDU5MDYwNjgyNQ== simonw 9599 2020-02-24T23:47:38Z 2020-02-24T23:47:38Z OWNER

Another demo plugin: `delete_table.py`

```python
from datasette import hookimpl
from datasette.utils import escape_sqlite
from starlette.responses import HTMLResponse
from starlette.endpoints import HTTPEndpoint


class DeleteTableApp(HTTPEndpoint):
    def __init__(self, scope, receive, send, datasette):
        self.datasette = datasette
        super().__init__(scope, receive, send)

    async def post(self, request):
        formdata = await request.form()
        database = formdata["database"]
        db = self.datasette.databases[database]
        await db.execute_write("drop table {}".format(escape_sqlite(formdata["table"])))
        return HTMLResponse("Table has been deleted.")


@hookimpl
def asgi_wrapper(datasette):
    def wrap_with_asgi_auth(app):
        async def wrapped_app(scope, receive, send):
            if scope["path"] == "/-/delete-table":
                await DeleteTableApp(scope, receive, send, datasette)
            else:
                await app(scope, receive, send)

        return wrapped_app

    return wrap_with_asgi_auth
```

Then I saved this as `table.html` in the `write-templates/` directory:

```html+django
{% extends "default:table.html" %}

{% block content %}
<form action="/-/delete-table" method="POST">
</form>
{{ super() }}
{% endblock %}
```

(Needs CSRF protection added.)

I ran Datasette like this:

```
$ datasette --plugins-dir=write-plugins/ data.db --template-dir=write-templates/
```

Result: I can delete tables!
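The `escape_sqlite()` call matters here because a table name cannot be passed as a `?` parameter; it has to be interpolated into the SQL, so it must be escaped as an identifier. A stdlib-only sketch of the idea (this `quote_identifier` is a hypothetical stand-in using standard SQL double-quote escaping, not Datasette's actual `escape_sqlite` implementation, which brackets reserved words):

```python
import sqlite3

def quote_identifier(name):
    # Double any embedded double quotes, then wrap the whole name
    return '"{}"'.format(name.replace('"', '""'))

conn = sqlite3.connect(":memory:")
# Even a hostile-looking table name is treated as a plain identifier:
conn.execute("CREATE TABLE {} (id INTEGER)".format(quote_identifier("drop table")))
conn.execute("DROP TABLE {}".format(quote_identifier("drop table")))
```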

590599257 https://github.com/simonw/datasette/pull/683#issuecomment-590599257 https://api.github.com/repos/simonw/datasette/issues/683 MDEyOklzc3VlQ29tbWVudDU5MDU5OTI1Nw== simonw 9599 2020-02-24T23:21:56Z 2020-02-24T23:22:35Z OWNER

Also: are UUIDs really necessary here or could I use a simpler form of task identifier? Like an in-memory counter variable that starts at 0 and increments every time this instance of Datasette issues a new task ID?

The neat thing about UUIDs is that I don't have to worry if there are multiple Datasette instances accepting writes behind a load balancer. That seems pretty unlikely (especially considering SQLite databases encourage only one process to be writing at a time)... but I am experimenting with PostgreSQL support in #670 so it's probably worth ensuring these task IDs really are globally unique.

I'm going to stick with UUIDs. They're short-lived enough that their size doesn't really matter.
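The two task-ID schemes weighed above can be compared in a few lines. The counter is simpler but only unique within one process; `uuid4()` IDs stay unique across multiple instances behind a load balancer:

```python
import itertools
import uuid

# Per-process counter: simple, but two Datasette instances would
# both hand out 0, 1, 2, ...
counter = itertools.count()
counter_ids = [next(counter) for _ in range(3)]  # [0, 1, 2]

# Random UUIDs: globally unique with effectively no coordination
uuid_ids = [str(uuid.uuid4()) for _ in range(3)]
assert len(set(uuid_ids)) == 3
```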

590598689 https://github.com/simonw/datasette/pull/683#issuecomment-590598689 https://api.github.com/repos/simonw/datasette/issues/683 MDEyOklzc3VlQ29tbWVudDU5MDU5ODY4OQ== simonw 9599 2020-02-24T23:20:11Z 2020-02-24T23:20:11Z OWNER

I think with block=True it makes sense to return the return value of the function that was executed. Without it, all I really need to do is return the UUID so something could theoretically poll for completion later on.

But is it weird having a function that returns different types depending on whether you passed block=True or not? Should they be differently named functions?

I'm OK with the block=True pattern changing the return value, I think.
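The trade-off being discussed is between one method whose return type depends on a flag, or two separately named methods. A hypothetical sketch of the single-method shape (names and class are illustrative, not Datasette's actual implementation; the enqueue step is elided):

```python
import uuid

class WriteQueue:
    def execute_write_fn(self, fn, block=False):
        if block:
            # Caller waits and gets the function's own return value
            return fn()
        # Fire-and-forget: hand back an ID the caller could poll with
        task_id = str(uuid.uuid4())
        # ...enqueue fn for the write thread here...
        return task_id

q = WriteQueue()
result = q.execute_write_fn(lambda: 42, block=True)  # -> 42
task_id = q.execute_write_fn(lambda: 42)             # -> a UUID string
```

The alternative would be two methods, e.g. a blocking one and a fire-and-forget one, each with a single fixed return type.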

590598248 https://github.com/simonw/datasette/pull/683#issuecomment-590598248 https://api.github.com/repos/simonw/datasette/issues/683 MDEyOklzc3VlQ29tbWVudDU5MDU5ODI0OA== simonw 9599 2020-02-24T23:18:50Z 2020-02-24T23:18:50Z OWNER

I'm not convinced by the return value of the .execute_write_fn() method:

https://github.com/simonw/datasette/blob/ab2348280206bde1390b931ae89d372c2f74b87e/datasette/database.py#L79-L83

Do I really need that WriteResponse class or can I do something nicer?

590593120 https://github.com/simonw/datasette/pull/683#issuecomment-590593120 https://api.github.com/repos/simonw/datasette/issues/683 MDEyOklzc3VlQ29tbWVudDU5MDU5MzEyMA== simonw 9599 2020-02-24T23:02:30Z 2020-02-24T23:02:30Z OWNER

I'm going to muck around with a couple more demo plugins - in particular one derived from datasette-upload-csvs - to make sure I'm comfortable with this API - then add a couple of tests and merge it with documentation that warns "this is still an experimental feature and may change".

590592581 https://github.com/simonw/datasette/pull/683#issuecomment-590592581 https://api.github.com/repos/simonw/datasette/issues/683 MDEyOklzc3VlQ29tbWVudDU5MDU5MjU4MQ== simonw 9599 2020-02-24T23:00:44Z 2020-02-24T23:01:09Z OWNER

I've been testing this out by running one-off demo plugins. I saved the following in a file called `write-plugins/log_asgi.py` (it's a hacked around copy of asgi-log-to-sqlite) and then ran `datasette data.db --plugins-dir=write-plugins/`:

```python
from datasette import hookimpl
import sqlite_utils
import time


class AsgiLogToSqliteViaWriteQueue:
    lookup_columns = (
        "path",
        "user_agent",
        "referer",
        "accept_language",
        "content_type",
        "query_string",
    )

    def __init__(self, app, db):
        self.app = app
        self.db = db
        self._tables_ensured = False

    async def ensure_tables(self):
        def _ensure_tables(conn):
            db = sqlite_utils.Database(conn)
            for column in self.lookup_columns:
                table = "{}s".format(column)
                if not db[table].exists():
                    db[table].create({"id": int, "name": str}, pk="id")
            if "requests" not in db.table_names():
                db["requests"].create(
                    {
                        "start": float,
                        "method": str,
                        "path": int,
                        "query_string": int,
                        "user_agent": int,
                        "referer": int,
                        "accept_language": int,
                        "http_status": int,
                        "content_type": int,
                        "client_ip": str,
                        "duration": float,
                        "body_size": int,
                    },
                    foreign_keys=self.lookup_columns,
                )

        await self.db.execute_write_fn(_ensure_tables)

    async def __call__(self, scope, receive, send):
        if not self._tables_ensured:
            self._tables_ensured = True
            await self.ensure_tables()

        response_headers = []
        body_size = 0
        http_status = None

        async def wrapped_send(message):
            nonlocal body_size, response_headers, http_status
            if message["type"] == "http.response.start":
                response_headers = message["headers"]
                http_status = message["status"]

            if message["type"] == "http.response.body":
                body_size += len(message["body"])

            await send(message)

        start = time.time()
        await self.app(scope, receive, wrapped_send)
        end = time.time()

        path = str(scope["path"])
        query_string = None
        if scope.get("query_string"):
            query_string = "?{}".format(scope["query_string"].decode("utf8"))

        request_headers = dict(scope.get("headers") or [])

        referer = header(request_headers, "referer")
        user_agent = header(request_headers, "user-agent")
        accept_language = header(request_headers, "accept-language")

        content_type = header(dict(response_headers), "content-type")

        def _log_to_database(conn):
            db = sqlite_utils.Database(conn)
            db["requests"].insert(
                {
                    "start": start,
                    "method": scope["method"],
                    "path": lookup(db, "paths", path),
                    "query_string": lookup(db, "query_strings", query_string),
                    "user_agent": lookup(db, "user_agents", user_agent),
                    "referer": lookup(db, "referers", referer),
                    "accept_language": lookup(db, "accept_languages", accept_language),
                    "http_status": http_status,
                    "content_type": lookup(db, "content_types", content_type),
                    "client_ip": scope.get("client", (None, None))[0],
                    "duration": end - start,
                    "body_size": body_size,
                },
                alter=True,
                foreign_keys=self.lookup_columns,
            )

        await self.db.execute_write_fn(_log_to_database)


def header(d, name):
    return d.get(name.encode("utf8"), b"").decode("utf8") or None


def lookup(db, table, value):
    return db[table].lookup({"name": value}) if value else None


@hookimpl
def asgi_wrapper(datasette):
    def wrap_with_class(app):
        return AsgiLogToSqliteViaWriteQueue(
            app, next(iter(datasette.databases.values()))
        )

    return wrap_with_class
```
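The `lookup()` helper above relies on sqlite-utils' get-or-create behavior: each distinct string is stored once in a lookup table and referenced by integer id from `requests`. A stdlib-only sketch of the same idea (this `lookup` is a hypothetical stand-in, not sqlite-utils' `.lookup()` implementation):

```python
import sqlite3

def lookup(conn, table, value):
    # None values stay NULL rather than getting a lookup row
    if value is None:
        return None
    conn.execute(
        "CREATE TABLE IF NOT EXISTS {} "
        "(id INTEGER PRIMARY KEY, name TEXT UNIQUE)".format(table)
    )
    # Insert only if the value is not already present...
    conn.execute("INSERT OR IGNORE INTO {} (name) VALUES (?)".format(table), (value,))
    # ...then return the id of the (new or existing) row
    return conn.execute(
        "SELECT id FROM {} WHERE name = ?".format(table), (value,)
    ).fetchone()[0]

conn = sqlite3.connect(":memory:")
a = lookup(conn, "user_agents", "Mozilla/5.0")
b = lookup(conn, "user_agents", "Mozilla/5.0")
assert a == b  # same row reused, not duplicated
```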

590518182 https://github.com/simonw/datasette/pull/683#issuecomment-590518182 https://api.github.com/repos/simonw/datasette/issues/683 MDEyOklzc3VlQ29tbWVudDU5MDUxODE4Mg== simonw 9599 2020-02-24T19:53:12Z 2020-02-24T19:53:12Z OWNER

Next steps are from comment https://github.com/simonw/datasette/issues/682#issuecomment-590517338

I'm going to move ahead without needing that ability though. I figure SQLite writes are fast, and plugins can be trusted to keep their writes fast. So I'm going to support either fire-and-forget writes (they get added to the queue and a task ID is returned) or the option to block awaiting the completion of the write (using Janus), and let callers decide which version they want. I may add optional timeouts some time in the future.

I am going to make both execute_write() and execute_write_fn() awaitable functions though, for consistency with .execute() and to give me flexibility to change how they work in the future.

I'll also add a block=True option to both of them which causes the function to wait for the write to be successfully executed - defaults to False (fire-and-forget mode).
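The design described here, with a single write thread consuming a queue and `block=True` awaiting the result, can be sketched with the standard library alone (a hedged illustration of the pattern, not Datasette's actual code, which uses Janus; error handling is elided):

```python
import asyncio
import queue
import threading
import uuid

class WriteThread:
    def __init__(self):
        self.queue = queue.Queue()
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        # The single dedicated write thread: all writes happen here
        while True:
            fn, future, loop = self.queue.get()
            result = fn()
            if future is not None:
                # Hand the result back to the event loop thread
                loop.call_soon_threadsafe(future.set_result, result)

    async def execute_write_fn(self, fn, block=False):
        loop = asyncio.get_running_loop()
        if not block:
            # Fire-and-forget: enqueue and return a task ID immediately
            self.queue.put((fn, None, None))
            return str(uuid.uuid4())
        # block=True: await a future the write thread will complete
        future = loop.create_future()
        self.queue.put((fn, future, loop))
        return await future

async def demo():
    writes = WriteThread()
    return await writes.execute_write_fn(lambda: "written", block=True)

print(asyncio.run(demo()))  # written
```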


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
, [performed_via_github_app] TEXT);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);
Powered by Datasette · Queries took 63.619ms · About: github-to-sqlite