# sqlx-record Batch Operations Skill

Guide to `insert_many()` and `upsert()` for efficient bulk operations.

## Triggers

- "batch insert", "bulk insert"
- "insert many", "insert_many"
- "upsert", "insert or update"
- "on conflict", "on duplicate key"

## Overview

`sqlx-record` provides efficient batch operations:

- `insert_many()` - Insert multiple records in a single query
- `upsert()` - Insert or update on primary key conflict
## insert_many()

Insert multiple entities in a single SQL statement:

```rust
pub async fn insert_many(executor, entities: &[Self]) -> Result<Vec<PkType>, Error>
```

### Usage

```rust
use sqlx_record::prelude::*;

let users = vec![
    User { id: new_uuid(), name: "Alice".into(), email: "alice@example.com".into() },
    User { id: new_uuid(), name: "Bob".into(), email: "bob@example.com".into() },
    User { id: new_uuid(), name: "Carol".into(), email: "carol@example.com".into() },
];

// Insert all in a single query
let ids = User::insert_many(&pool, &users).await?;

println!("Inserted {} users", ids.len());
```

### SQL Generated

```sql
-- MySQL
INSERT INTO users (id, name, email) VALUES (?, ?, ?), (?, ?, ?), (?, ?, ?)

-- PostgreSQL
INSERT INTO users (id, name, email) VALUES ($1, $2, $3), ($4, $5, $6), ($7, $8, $9)

-- SQLite
INSERT INTO users (id, name, email) VALUES (?, ?, ?), (?, ?, ?), (?, ?, ?)
```
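
The multi-row `VALUES` clause above can be sketched in plain Rust. This is an illustration of the placeholder pattern only, not `sqlx-record`'s actual code generation; `values_clause` is a hypothetical helper:

```rust
// Hypothetical sketch of how a multi-row VALUES clause is built.
// `pg` selects numbered `$n` placeholders (PostgreSQL) vs `?` (MySQL/SQLite).
fn values_clause(rows: usize, cols: usize, pg: bool) -> String {
    let mut out = String::new();
    let mut n = 1;
    for r in 0..rows {
        if r > 0 {
            out.push_str(", ");
        }
        out.push('(');
        for c in 0..cols {
            if c > 0 {
                out.push_str(", ");
            }
            if pg {
                out.push_str(&format!("${n}"));
                n += 1;
            } else {
                out.push('?');
            }
        }
        out.push(')');
    }
    out
}

fn main() {
    // MySQL/SQLite style: same placeholder repeated
    assert_eq!(values_clause(2, 3, false), "(?, ?, ?), (?, ?, ?)");
    // PostgreSQL style: placeholders numbered across all rows
    assert_eq!(values_clause(2, 3, true), "($1, $2, $3), ($4, $5, $6)");
    println!(
        "INSERT INTO users (id, name, email) VALUES {}",
        values_clause(3, 3, true)
    );
}
```

The key detail is that PostgreSQL placeholders are numbered continuously across rows, which is why the parameter count grows with batch size.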

### Benefits

- Single round-trip to the database
- Much faster than N individual inserts
- Atomic - all succeed or all fail

### Limitations

- Entity must implement `Clone` (for collecting PKs)
- Empty slice returns an empty vec without a database call
- Very large batches may hit database limits (split into chunks if needed)
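
The "database limits" above are usually bind-parameter caps; for example, PostgreSQL's wire protocol allows at most 65535 parameters per statement. A small helper (hypothetical, not part of `sqlx-record`) can turn such a cap into a safe chunk size:

```rust
// Hypothetical helper (not part of sqlx-record): derive a safe chunk size
// from a backend's bind-parameter cap and the number of columns per row.
fn max_rows_per_chunk(param_limit: usize, cols_per_row: usize) -> usize {
    param_limit / cols_per_row
}

fn main() {
    // A 3-column table under PostgreSQL's 65535-parameter cap:
    assert_eq!(max_rows_per_chunk(65535, 3), 21845);
    println!("max rows per chunk: {}", max_rows_per_chunk(65535, 3));
}
```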
### Chunked Insert

For very large datasets:

```rust
const BATCH_SIZE: usize = 1000;

async fn insert_large_dataset(pool: &Pool, users: Vec<User>) -> Result<Vec<Uuid>, sqlx::Error> {
    let mut all_ids = Vec::with_capacity(users.len());

    for chunk in users.chunks(BATCH_SIZE) {
        let ids = User::insert_many(pool, chunk).await?;
        all_ids.extend(ids);
    }

    Ok(all_ids)
}
```

## upsert() / insert_or_update()

Insert a new record, or update it if the primary key already exists:

```rust
pub async fn upsert(&self, executor) -> Result<PkType, Error>
pub async fn insert_or_update(&self, executor) -> Result<PkType, Error> // alias
```
### Usage

```rust
let user = User {
    id: existing_or_new_id,
    name: "Alice".into(),
    email: "alice@example.com".into(),
};

// Insert if new, update if exists
user.upsert(&pool).await?;

// Or using the alias
user.insert_or_update(&pool).await?;
```

### SQL Generated

```sql
-- MySQL
INSERT INTO users (id, name, email) VALUES (?, ?, ?)
ON DUPLICATE KEY UPDATE name = VALUES(name), email = VALUES(email)

-- PostgreSQL
INSERT INTO users (id, name, email) VALUES ($1, $2, $3)
ON CONFLICT (id) DO UPDATE SET name = EXCLUDED.name, email = EXCLUDED.email

-- SQLite
INSERT INTO users (id, name, email) VALUES (?, ?, ?)
ON CONFLICT(id) DO UPDATE SET name = excluded.name, email = excluded.email
```
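
The per-backend conflict clauses can likewise be sketched as plain string building. This is an illustration only; `conflict_clause` is a hypothetical helper and the library's real code generation may differ:

```rust
// Hypothetical sketch: build the per-backend conflict clause that updates
// every non-PK column from the incoming row.
fn conflict_clause(pk: &str, cols: &[&str], mysql: bool) -> String {
    if mysql {
        // MySQL references the incoming values via VALUES(col)
        let set: Vec<String> = cols.iter().map(|c| format!("{c} = VALUES({c})")).collect();
        format!("ON DUPLICATE KEY UPDATE {}", set.join(", "))
    } else {
        // PostgreSQL and SQLite share ON CONFLICT ... DO UPDATE with EXCLUDED
        let set: Vec<String> = cols.iter().map(|c| format!("{c} = EXCLUDED.{c}")).collect();
        format!("ON CONFLICT ({pk}) DO UPDATE SET {}", set.join(", "))
    }
}

fn main() {
    assert_eq!(
        conflict_clause("id", &["name", "email"], true),
        "ON DUPLICATE KEY UPDATE name = VALUES(name), email = VALUES(email)"
    );
    assert_eq!(
        conflict_clause("id", &["name", "email"], false),
        "ON CONFLICT (id) DO UPDATE SET name = EXCLUDED.name, email = EXCLUDED.email"
    );
}
```

Note that MySQL names no conflict target (it uses any unique key), while PostgreSQL and SQLite require the conflicting column list explicitly.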

### Use Cases

1. **Sync external data**: Import data that may already exist
2. **Idempotent operations**: Safe to retry without duplicates
3. **Cache refresh**: Update cached records atomically
### Examples

#### Sync Products

```rust
async fn sync_products(pool: &Pool, external_products: Vec<ExternalProduct>) -> Result<(), sqlx::Error> {
    for ext in external_products {
        let product = Product {
            id: ext.id, // Use external ID as PK
            name: ext.name,
            price: ext.price,
            updated_at: chrono::Utc::now().timestamp_millis(),
        };
        product.upsert(pool).await?;
    }
    Ok(())
}
```

#### Idempotent Event Processing

```rust
async fn process_event(pool: &Pool, event: Event) -> Result<(), sqlx::Error> {
    let record = ProcessedEvent {
        id: event.id, // Event ID as PK - prevents duplicates
        event_type: event.event_type,
        payload: event.payload,
        processed_at: chrono::Utc::now().timestamp_millis(),
    };

    // Safe to call multiple times - won't create duplicates
    record.upsert(pool).await?;
    Ok(())
}
```

#### With Transaction

```rust
use sqlx_record::transaction;

transaction!(&pool, |tx| {
    // Upsert multiple records atomically
    for item in items {
        item.upsert(&mut *tx).await?;
    }
    Ok::<_, sqlx::Error>(())
}).await?;
```

## Comparison

| Operation | Behavior on Existing PK | SQL Efficiency |
|-----------|-------------------------|----------------|
| `insert()` | Error (duplicate key) | Single row |
| `insert_many()` | Error (duplicate key) | Multiple rows, single query |
| `upsert()` | Updates all non-PK fields | Single row |
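
The table's semantics can be modeled with an in-memory map keyed by PK. This is a toy stand-in for the database, assuming nothing about `sqlx-record`'s internals; `insert` and `upsert` here are local functions, not the library's:

```rust
use std::collections::HashMap;

// Toy model of the comparison table: insert fails on an existing PK,
// upsert overwrites all non-PK fields instead.
fn insert(store: &mut HashMap<u64, String>, id: u64, v: &str) -> Result<(), String> {
    if store.contains_key(&id) {
        return Err(format!("duplicate key: {id}"));
    }
    store.insert(id, v.to_string());
    Ok(())
}

fn upsert(store: &mut HashMap<u64, String>, id: u64, v: &str) {
    store.insert(id, v.to_string());
}

fn main() {
    let mut store = HashMap::new();
    assert!(insert(&mut store, 1, "Alice").is_ok());
    assert!(insert(&mut store, 1, "Alice again").is_err()); // duplicate key error
    upsert(&mut store, 1, "Alice v2"); // updates instead of failing
    assert_eq!(store[&1], "Alice v2");
    assert_eq!(store.len(), 1); // still exactly one record
}
```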

## Notes

- `upsert()` updates ALL non-PK fields, not just changed ones
- Primary key must be properly indexed (usually automatic)
- For partial updates, use `insert()` + `update_by_id()` with conflict check
- `insert_many()` requires all entities have unique PKs among themselves