issue_comments
12 rows where author_association = "OWNER" and issue = 1426001541 sorted by updated_at descending
This data as json, CSV (advanced)
Suggested facets: created_at (date), updated_at (date)
issue 1
- API for bulk inserting records into a table · 12 ✖
id | html_url | issue_url | node_id | user | created_at | updated_at ▲ | author_association | body | reactions | issue | performed_via_github_app |
---|---|---|---|---|---|---|---|---|---|---|---|
1313128913 | https://github.com/simonw/datasette/issues/1866#issuecomment-1313128913 | https://api.github.com/repos/simonw/datasette/issues/1866 | IC_kwDOBm6k_c5ORMHR | simonw 9599 | 2022-11-14T05:48:22Z | 2022-11-14T05:48:22Z | OWNER | I changed my mind about the |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
API for bulk inserting records into a table 1426001541 | |
1295200988 | https://github.com/simonw/datasette/issues/1866#issuecomment-1295200988 | https://api.github.com/repos/simonw/datasette/issues/1866 | IC_kwDOBm6k_c5NMzLc | simonw 9599 | 2022-10-28T16:29:55Z | 2022-10-28T16:29:55Z | OWNER | I wonder if there's something clever I could do here within a transaction? Start a transaction. Write out a temporary in-memory table with all of the existing primary keys in the table. Run the bulk insert. Then run I don't think that's going to work well for large tables. I'm going to go with not returning inserted rows by default, unless you pass a special option requesting that. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
API for bulk inserting records into a table 1426001541 | |
1294316640 | https://github.com/simonw/datasette/issues/1866#issuecomment-1294316640 | https://api.github.com/repos/simonw/datasette/issues/1866 | IC_kwDOBm6k_c5NJbRg | simonw 9599 | 2022-10-28T01:51:40Z | 2022-10-28T01:51:40Z | OWNER | This needs to support the following: - Rows do not include a primary key - one is assigned by the database - Rows provide their own primary key, any clashes are errors - Rows provide their own primary key, clashes are silently ignored - Rows provide their own primary key, replacing any existing records |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
API for bulk inserting records into a table 1426001541 | |
1294306071 | https://github.com/simonw/datasette/issues/1866#issuecomment-1294306071 | https://api.github.com/repos/simonw/datasette/issues/1866 | IC_kwDOBm6k_c5NJYsX | simonw 9599 | 2022-10-28T01:37:14Z | 2022-10-28T01:37:59Z | OWNER | Quick crude benchmark: ```python import sqlite3 db = sqlite3.connect(":memory:") def create_table(db, name): db.execute(f"create table {name} (id integer primary key, title text)") create_table(db, "single") create_table(db, "multi") create_table(db, "bulk") def insert_singles(titles): inserted = [] for title in titles: cursor = db.execute(f"insert into single (title) values (?)", [title]) inserted.append((cursor.lastrowid, title)) return inserted def insert_many(titles): db.executemany(f"insert into multi (title) values (?)", ((t,) for t in titles)) def insert_bulk(titles): db.execute("insert into bulk (title) values {}".format( ", ".join("(?)" for _ in titles) ), titles) titles = ["title {}".format(i) for i in range(1, 10001)]
In [13]: %timeit insert_many(titles) 12 ms ± 520 µs per loop (mean ± std. dev. of 7 runs, 100 loops each) In [12]: %timeit insert_bulk(titles) 2.59 ms ± 25 µs per loop (mean ± std. dev. of 7 runs, 100 loops each) ``` So the bulk insert really is a lot faster - 3ms compared to 24ms for single inserts, so ~8x faster. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
API for bulk inserting records into a table 1426001541 | |
1294296767 | https://github.com/simonw/datasette/issues/1866#issuecomment-1294296767 | https://api.github.com/repos/simonw/datasette/issues/1866 | IC_kwDOBm6k_c5NJWa_ | simonw 9599 | 2022-10-28T01:22:25Z | 2022-10-28T01:23:09Z | OWNER | Nasty catch on this one: I wanted to return the IDs of the freshly inserted rows. But... the SQLite itself added a Two options then:
That third option might be the way to go here. I should benchmark first to figure out how much of a difference this actually makes. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
API for bulk inserting records into a table 1426001541 | |
1294282263 | https://github.com/simonw/datasette/issues/1866#issuecomment-1294282263 | https://api.github.com/repos/simonw/datasette/issues/1866 | IC_kwDOBm6k_c5NJS4X | simonw 9599 | 2022-10-28T01:00:42Z | 2022-10-28T01:00:42Z | OWNER | I'm going to set the limit at 1,000 rows inserted at a time. I'll make this configurable using a new |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
API for bulk inserting records into a table 1426001541 | |
1293893789 | https://github.com/simonw/datasette/issues/1866#issuecomment-1293893789 | https://api.github.com/repos/simonw/datasette/issues/1866 | IC_kwDOBm6k_c5NH0Cd | simonw 9599 | 2022-10-27T18:13:00Z | 2022-10-27T18:13:00Z | OWNER | If people care about that kind of thing they could always push all of their inserts to a table called |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
API for bulk inserting records into a table 1426001541 | |
1293892818 | https://github.com/simonw/datasette/issues/1866#issuecomment-1293892818 | https://api.github.com/repos/simonw/datasette/issues/1866 | IC_kwDOBm6k_c5NHzzS | simonw 9599 | 2022-10-27T18:12:02Z | 2022-10-27T18:12:02Z | OWNER | There's one catch with batched inserts: if your CLI tool fails half way through you could end up with a partially populated table - since a bunch of batches will have succeeded first. I think that's OK. In the future I may want to come up with a way to run multiple batches of inserts inside a single transaction, but I can ignore that for the first release of this feature. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
API for bulk inserting records into a table 1426001541 | |
1293891876 | https://github.com/simonw/datasette/issues/1866#issuecomment-1293891876 | https://api.github.com/repos/simonw/datasette/issues/1866 | IC_kwDOBm6k_c5NHzkk | simonw 9599 | 2022-10-27T18:11:05Z | 2022-10-27T18:11:05Z | OWNER | Likewise for newline-delimited JSON. While it's tempting to want to accept that as an ingest format (because it's nice to generate and stream) I think it's better to have a client application that can turn a stream of newline-delimited JSON into batched JSON inserts. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
API for bulk inserting records into a table 1426001541 | |
1293891191 | https://github.com/simonw/datasette/issues/1866#issuecomment-1293891191 | https://api.github.com/repos/simonw/datasette/issues/1866 | IC_kwDOBm6k_c5NHzZ3 | simonw 9599 | 2022-10-27T18:10:22Z | 2022-10-27T18:10:22Z | OWNER | So for the moment I'm just going to concentrate on the JSON API. I can consider CSV variants later on, or as plugins, or both. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
API for bulk inserting records into a table 1426001541 | |
1293890684 | https://github.com/simonw/datasette/issues/1866#issuecomment-1293890684 | https://api.github.com/repos/simonw/datasette/issues/1866 | IC_kwDOBm6k_c5NHzR8 | simonw 9599 | 2022-10-27T18:09:52Z | 2022-10-27T18:09:52Z | OWNER | Should this API accept CSV/TSV etc in addition to JSON? I'm torn on this one. My initial instinct is that it should not - and there should instead be a Datasette client library / CLI tool you can use that knows how to turn CSV into batches of JSON calls for when you want to upload a CSV file. I don't think the usability of |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
API for bulk inserting records into a table 1426001541 | |
1293887808 | https://github.com/simonw/datasette/issues/1866#issuecomment-1293887808 | https://api.github.com/repos/simonw/datasette/issues/1866 | IC_kwDOBm6k_c5NHylA | simonw 9599 | 2022-10-27T18:07:02Z | 2022-10-27T18:07:02Z | OWNER | Error handling is really important here. What should happen if you submit 100 records and one of them has some kind of validation error? How should that error be reported back to you? I'm inclined to say that it defaults to all-or-nothing in a transaction - but there should be a |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
API for bulk inserting records into a table 1426001541 |
Advanced export
JSON shape: default, array, newline-delimited, object
CREATE TABLE [issue_comments] ( [html_url] TEXT, [issue_url] TEXT, [id] INTEGER PRIMARY KEY, [node_id] TEXT, [user] INTEGER REFERENCES [users]([id]), [created_at] TEXT, [updated_at] TEXT, [author_association] TEXT, [body] TEXT, [reactions] TEXT, [issue] INTEGER REFERENCES [issues]([id]) , [performed_via_github_app] TEXT); CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]); CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);
user 1