# sqlx-record Batch Operations Skill

Guide to `insert_many()` and `upsert()` for efficient bulk operations.
## Triggers
- "batch insert", "bulk insert"
- "insert many", "insert_many"
- "upsert", "insert or update"
- "on conflict", "on duplicate key"
## Overview

sqlx-record provides efficient batch operations:

- `insert_many()` - Insert multiple records in a single query
- `upsert()` - Insert or update on primary key conflict
## insert_many()

Insert multiple entities in a single SQL statement:

```rust
pub async fn insert_many(executor, entities: &[Self]) -> Result<Vec<PkType>, Error>
```
### Usage

```rust
use sqlx_record::prelude::*;

let users = vec![
    User { id: new_uuid(), name: "Alice".into(), email: "alice@example.com".into() },
    User { id: new_uuid(), name: "Bob".into(), email: "bob@example.com".into() },
    User { id: new_uuid(), name: "Carol".into(), email: "carol@example.com".into() },
];

// Insert all in a single query
let ids = User::insert_many(&pool, &users).await?;
println!("Inserted {} users", ids.len());
```
### SQL Generated

```sql
-- MySQL
INSERT INTO users (id, name, email) VALUES (?, ?, ?), (?, ?, ?), (?, ?, ?)

-- PostgreSQL
INSERT INTO users (id, name, email) VALUES ($1, $2, $3), ($4, $5, $6), ($7, $8, $9)

-- SQLite
INSERT INTO users (id, name, email) VALUES (?, ?, ?), (?, ?, ?), (?, ?, ?)
```
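The shape of the generated statement can be sketched in plain Rust. This is an illustration of the multi-row VALUES clause shown above, not sqlx-record's actual query builder; `build_insert_sql` is a hypothetical helper.

```rust
// Illustrative sketch: build a MySQL/SQLite-style multi-row VALUES clause.
// Not sqlx-record internals; just mirrors the SQL shown above.
fn build_insert_sql(table: &str, columns: &[&str], rows: usize) -> String {
    // One "(?, ?, ?)" group per row, one "?" per column
    let placeholders = vec!["?"; columns.len()].join(", ");
    let values = vec![format!("({placeholders})"); rows].join(", ");
    format!("INSERT INTO {} ({}) VALUES {}", table, columns.join(", "), values)
}

fn main() {
    let sql = build_insert_sql("users", &["id", "name", "email"], 3);
    println!("{sql}");
    // INSERT INTO users (id, name, email) VALUES (?, ?, ?), (?, ?, ?), (?, ?, ?)
}
```

The single statement is what makes the operation one round-trip regardless of row count.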
### Benefits

- Single round-trip to the database
- Much faster than N individual inserts
- Atomic: all rows succeed or all fail
### Limitations

- Entity must implement `Clone` (for collecting PKs)
- An empty slice returns an empty vec without a database call
- Very large batches may hit database limits (split into chunks if needed)
### Chunked Insert

For very large datasets:

```rust
const BATCH_SIZE: usize = 1000;

async fn insert_large_dataset(pool: &Pool, users: Vec<User>) -> Result<Vec<Uuid>, sqlx::Error> {
    let mut all_ids = Vec::with_capacity(users.len());
    for chunk in users.chunks(BATCH_SIZE) {
        let ids = User::insert_many(pool, chunk).await?;
        all_ids.extend(ids);
    }
    Ok(all_ids)
}
```
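The "database limits" mentioned above are usually bind-parameter caps, and a safe `BATCH_SIZE` can be derived from them. As a point of reference, PostgreSQL allows at most 65535 bind parameters per statement; other databases differ (older SQLite builds default to 999), so treat the numbers here as an example, not a sqlx-record guarantee.

```rust
// Sketch: derive a safe batch size from a database's bind-parameter limit.
// Each inserted row consumes one parameter per column, so:
//   max rows = parameter limit / columns per row
fn max_batch_size(param_limit: usize, columns_per_row: usize) -> usize {
    param_limit / columns_per_row
}

fn main() {
    // A 3-column table under PostgreSQL's 65535-parameter limit:
    println!("{}", max_batch_size(65535, 3)); // 21845 rows per statement
}
```

In practice a smaller round number like 1000 keeps statements comfortably under every backend's limit.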
## upsert() / insert_or_update()

Insert a new record, or update it if the primary key already exists:

```rust
pub async fn upsert(&self, executor) -> Result<PkType, Error>
pub async fn insert_or_update(&self, executor) -> Result<PkType, Error> // alias
```
### Usage

```rust
let user = User {
    id: existing_or_new_id,
    name: "Alice".into(),
    email: "alice@example.com".into(),
};

// Insert if new, update if exists
user.upsert(&pool).await?;

// Or using the alias
user.insert_or_update(&pool).await?;
```
### SQL Generated

```sql
-- MySQL
INSERT INTO users (id, name, email) VALUES (?, ?, ?)
ON DUPLICATE KEY UPDATE name = VALUES(name), email = VALUES(email)

-- PostgreSQL
INSERT INTO users (id, name, email) VALUES ($1, $2, $3)
ON CONFLICT (id) DO UPDATE SET name = EXCLUDED.name, email = EXCLUDED.email

-- SQLite
INSERT INTO users (id, name, email) VALUES (?, ?, ?)
ON CONFLICT(id) DO UPDATE SET name = excluded.name, email = excluded.email
```
### Use Cases

- **Sync external data**: Import records that may already exist
- **Idempotent operations**: Safe to retry without creating duplicates
- **Cache refresh**: Update cached records atomically
### Examples

#### Sync Products

```rust
async fn sync_products(pool: &Pool, external_products: Vec<ExternalProduct>) -> Result<(), sqlx::Error> {
    for ext in external_products {
        let product = Product {
            id: ext.id, // Use external ID as PK
            name: ext.name,
            price: ext.price,
            updated_at: chrono::Utc::now().timestamp_millis(),
        };
        product.upsert(pool).await?;
    }
    Ok(())
}
```
#### Idempotent Event Processing

```rust
async fn process_event(pool: &Pool, event: Event) -> Result<(), sqlx::Error> {
    let record = ProcessedEvent {
        id: event.id, // Event ID as PK - prevents duplicates
        event_type: event.event_type,
        payload: event.payload,
        processed_at: chrono::Utc::now().timestamp_millis(),
    };
    // Safe to call multiple times - won't create duplicates
    record.upsert(pool).await?;
    Ok(())
}
```
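The idempotency guarantee can be modeled without a database: an upsert behaves like a map insert keyed by the primary key, so a retried operation replaces the same row instead of adding a second one. The `ProcessedEvent` shape and in-memory store below are illustrative, not sqlx-record types.

```rust
use std::collections::HashMap;

// In-memory model of upsert semantics: the primary key decides whether a
// row is inserted or replaced. A real table behaves analogously.
#[derive(Clone, Debug, PartialEq)]
struct ProcessedEvent {
    id: u64,
    event_type: String,
}

fn upsert(store: &mut HashMap<u64, ProcessedEvent>, rec: ProcessedEvent) {
    store.insert(rec.id, rec); // insert-or-replace keyed by PK
}

fn main() {
    let mut store = HashMap::new();
    let rec = ProcessedEvent { id: 42, event_type: "order.created".into() };
    upsert(&mut store, rec.clone());
    upsert(&mut store, rec); // retry: no duplicate is created
    println!("{}", store.len()); // 1
}
```

This is why using a natural unique ID (like the event ID above) as the primary key makes retries safe.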
#### With Transaction

```rust
use sqlx_record::transaction;

transaction!(&pool, |tx| {
    // Upsert multiple records atomically
    for item in items {
        item.upsert(&mut *tx).await?;
    }
    Ok::<_, sqlx::Error>(())
}).await?;
```
## Comparison

| Operation | Behavior on Existing PK | SQL Efficiency |
|---|---|---|
| `insert()` | Error (duplicate key) | Single row |
| `insert_many()` | Error (duplicate key) | Multiple rows, single query |
| `upsert()` | Updates all non-PK fields | Single row |
## Notes

- `upsert()` updates ALL non-PK fields, not just changed ones
- The primary key must be properly indexed (usually automatic)
- For partial updates, use `insert()` + `update_by_id()` with a conflict check
- `insert_many()` requires that all entities have unique PKs among themselves
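The first note above has a practical consequence worth seeing concretely: because an upsert replaces every non-PK field, an entity built with some fields left at defaults silently erases the stored values for those fields. The sketch below models this with an in-memory map; the `User` shape is hypothetical.

```rust
use std::collections::HashMap;

// Sketch of the "upsert overwrites every non-PK field" caveat.
#[derive(Clone, Debug, PartialEq)]
struct User {
    id: u64,
    name: String,
    email: String,
}

fn upsert(store: &mut HashMap<u64, User>, u: User) {
    store.insert(u.id, u); // every non-PK field is replaced, not merged
}

fn main() {
    let mut store = HashMap::new();
    upsert(&mut store, User { id: 1, name: "Alice".into(), email: "alice@example.com".into() });
    // A later upsert that only "meant" to change the name also resets email:
    upsert(&mut store, User { id: 1, name: "Alicia".into(), email: String::new() });
    println!("{:?}", store[&1].email); // "" - the stored email is gone
}
```

This is the scenario where a targeted `update_by_id()` on the changed fields is the safer choice.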