html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,issue,performed_via_github_app https://github.com/simonw/sqlite-utils/issues/491#issuecomment-1256858763,https://api.github.com/repos/simonw/sqlite-utils/issues/491,1256858763,IC_kwDOCGYnMM5K6iSL,7908073,2022-09-24T04:50:59Z,2022-09-24T04:52:08Z,CONTRIBUTOR,"Instead of outputting binary data to stdout the interface might be better like this ``` sqlite-utils merge animals.db cats.db dogs.db ``` similar to `zip`, `ogr2ogr`, etc Actually I think this might already be possible within `ogr2ogr`. I don't believe spatial data is a requirement though it might add an `ogc_id` column or something ``` cp cats.db animals.db ogr2ogr -append animals.db dogs.db ogr2ogr -append animals.db another.db ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1383646615, https://github.com/simonw/sqlite-utils/issues/491#issuecomment-1258449887,https://api.github.com/repos/simonw/sqlite-utils/issues/491,1258449887,IC_kwDOCGYnMM5LAmvf,9599,2022-09-26T18:35:50Z,2022-09-26T18:35:50Z,OWNER,"This is a really interesting idea. I'm nervous about needing to set the rules for how duplicate tables should be merged though. This feels like a complex topic - one where there isn't necessarily an obviously ""correct"" way of doing it, but where different problems that people are solving might need different merging approaches. Likewise, merging isn't just a database-to-database thing at that point - I could see a need for merging two tables using similar rules to those used for merging two databases. So I think I'd want to have some good concrete use-cases in mind before trying to design how something like this should work. Will leave this thread open for people to drop those in!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1383646615, https://github.com/simonw/sqlite-utils/issues/491#issuecomment-1258450447,https://api.github.com/repos/simonw/sqlite-utils/issues/491,1258450447,IC_kwDOCGYnMM5LAm4P,9599,2022-09-26T18:36:23Z,2022-09-26T18:36:23Z,OWNER,This is also the kind of feature that would need to express itself in both the Python library and the CLI utility.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1383646615, https://github.com/simonw/sqlite-utils/issues/491#issuecomment-1258508215,https://api.github.com/repos/simonw/sqlite-utils/issues/491,1258508215,IC_kwDOCGYnMM5LA0-3,25778,2022-09-26T19:22:14Z,2022-09-26T19:22:14Z,CONTRIBUTOR,"This might be fairly straightforward using SQLite's backup utility: https://docs.python.org/3/library/sqlite3.html#sqlite3.Connection.backup ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1383646615, https://github.com/simonw/sqlite-utils/issues/491#issuecomment-1258697384,https://api.github.com/repos/simonw/sqlite-utils/issues/491,1258697384,IC_kwDOCGYnMM5LBjKo,9599,2022-09-26T22:12:45Z,2022-09-26T22:12:45Z,OWNER,That feels like a slightly different command to me - maybe `sqlite-utils backup data.db data-backup.db`? It doesn't have any of the mechanics for merging tables together. Could be a useful feature separately though.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1383646615, https://github.com/simonw/sqlite-utils/issues/491#issuecomment-1258712931,https://api.github.com/repos/simonw/sqlite-utils/issues/491,1258712931,IC_kwDOCGYnMM5LBm9j,25778,2022-09-26T22:31:58Z,2022-09-26T22:31:58Z,CONTRIBUTOR,"Right. The backup command will copy tables completely, but in the case of conflicting table names, the destination gets overwritten silently. That might not be what you want here. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1383646615, https://github.com/simonw/sqlite-utils/issues/491#issuecomment-1264218914,https://api.github.com/repos/simonw/sqlite-utils/issues/491,1264218914,IC_kwDOCGYnMM5LWnMi,7908073,2022-10-01T03:18:36Z,2023-06-14T22:14:24Z,CONTRIBUTOR,"> some good concrete use-cases in mind I actually found myself wanting something like this the past couple days. The use-case was databases with slightly different schema but same table names. here is a full script: ``` import argparse from pathlib import Path from sqlite_utils import Database def connect(args, conn=None, **kwargs) -> Database: db = Database(conn or args.database, **kwargs) with db.conn: db.conn.execute(""PRAGMA main.cache_size = 8000"") return db def parse_args() -> argparse.Namespace: parser = argparse.ArgumentParser() parser.add_argument(""database"") parser.add_argument(""dbs_folder"") parser.add_argument(""--db"", ""-db"", help=argparse.SUPPRESS) parser.add_argument(""--verbose"", ""-v"", action=""count"", default=0) args = parser.parse_args() if args.db: args.database = args.db Path(args.database).touch() args.db = connect(args) return args def merge_db(args, source_db): source_db = str(Path(source_db).resolve()) s_db = connect(argparse.Namespace(database=source_db, verbose = args.verbose)) for table in s_db.table_names(): data = s_db[table].rows args.db[table].insert_all(data, alter=True, replace=True) args.db.conn.commit() def merge_directory(): args = parse_args() source_dbs = list(Path(args.dbs_folder).glob('*.db')) for s_db in source_dbs: merge_db(args, s_db) if __name__ == '__main__': merge_directory() ``` edit: I've made some improvements to this and put it on PyPI: ``` $ pip install xklb $ lb merge-db -h usage: library merge-dbs DEST_DB SOURCE_DB ... [--only-target-columns] [--only-new-rows] [--upsert] [--pk PK ...] [--table TABLE ...] Merge-DBs will insert new rows from source dbs to target db, table by table. If primary key(s) are provided, and there is an existing row with the same PK, the default action is to delete the existing row and insert the new row replacing all existing fields. Upsert mode will update matching PK rows such that if a source row has a NULL field and the destination row has a value then the value will be preserved instead of changed to the source row's NULL value. Ignore mode (--only-new-rows) will insert only rows which don't already exist in the destination db Test first by using temp databases as the destination db. Try out different modes / flags until you are satisfied with the behavior of the program library merge-dbs --pk path (mktemp --suffix .db) tv.db movies.db Merge database data and tables library merge-dbs --upsert --pk path video.db tv.db movies.db library merge-dbs --only-target-columns --only-new-rows --table media,playlists --pk path audio-fts.db audio.db library merge-dbs --pk id --only-tables subreddits reddit/81_New_Music.db audio.db library merge-dbs --only-new-rows --pk subreddit,path --only-tables reddit_posts reddit/81_New_Music.db audio.db -v positional arguments: database source_dbs ``` Also if you want to dedupe a table based on a ""business key"" which isn't explicitly your primary key(s) you can run this: ``` $ lb dedupe-db -h usage: library dedupe-dbs DATABASE TABLE --bk BUSINESS_KEYS [--pk PRIMARY_KEYS] [--only-columns COLUMNS] Dedupe your database (not to be confused with the dedupe subcommand) It should not need to be said but *backup* your database before trying this tool! Dedupe-DB will help remove duplicate rows based on non-primary-key business keys library dedupe-db ./video.db media --bk path If --primary-keys is not provided table metadata primary keys will be used If --only-columns is not provided all non-primary and non-business key columns will be upserted positional arguments: database table options: -h, --help show this help message and exit --skip-0 --only-columns ONLY_COLUMNS Comma separated column names to upsert --primary-keys PRIMARY_KEYS, --pk PRIMARY_KEYS Comma separated primary keys --business-keys BUSINESS_KEYS, --bk BUSINESS_KEYS Comma separated business keys ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1383646615,