Table Format Maintenance & Operations Last updated: May 29, 2026

Iceberg Spark Procedure expire_snapshots

A Spark SQL procedure in Apache Iceberg used to remove expired snapshots and physically delete their unreferenced data and delete files.



The expire_snapshots procedure is a maintenance operation, run through Spark SQL, that removes snapshots older than a specified retention threshold. When a snapshot is expired, it is removed from the table's metadata and can no longer be used for time travel or rollback. Any data files or delete files that are referenced only by expired snapshots are then physically deleted from storage; files still referenced by a retained snapshot are left in place.
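The rule for which files get deleted can be illustrated with a small model. This is a minimal sketch, not Iceberg's implementation: the function name files_to_delete, the dictionary layout, and the file names are all hypothetical, and real tables track files through manifests rather than plain sets.

```python
def files_to_delete(snapshots, expired_ids):
    """Return the files referenced only by expired snapshots.

    snapshots: dict mapping a snapshot id to the set of file paths it references
               (hypothetical layout, for illustration only).
    expired_ids: set of snapshot ids that are being expired.
    """
    retained_files = set()
    expired_files = set()
    for snap_id, files in snapshots.items():
        if snap_id in expired_ids:
            expired_files |= files
        else:
            retained_files |= files
    # A file is safe to delete only if no retained snapshot still references it.
    return expired_files - retained_files


# Three snapshots that share some files with their neighbors.
snapshots = {
    1: {"a.parquet", "b.parquet"},
    2: {"b.parquet", "c.parquet"},
    3: {"c.parquet", "d.parquet"},
}

print(sorted(files_to_delete(snapshots, {1})))  # ['a.parquet']
```

Expiring snapshot 1 frees only a.parquet, because b.parquet is still referenced by snapshot 2.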

Syntax and Parameters

The procedure's arguments control which snapshots are removed. The older_than argument sets the timestamp cutoff, and retain_last keeps a minimum number of the most recent snapshots even when they are older than that cutoff:

/* Expire snapshots older than May 1st, 2026, while preserving at least 5 snapshots */
CALL prod.system.expire_snapshots(
    table => 'db.web_logs',
    older_than => TIMESTAMP '2026-05-01 00:00:00.000',
    retain_last => 5
);
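How older_than and retain_last interact can be sketched in a few lines of Python. This is a simplified model that assumes a single linear snapshot history; the function select_expired and the sample history are hypothetical and do not reflect Iceberg's internal code, which also accounts for branches and tags.

```python
from datetime import datetime


def select_expired(snapshots, older_than, retain_last=1):
    """Return ids of snapshots that would be expired, in ascending order.

    snapshots: list of (snapshot_id, committed_at) tuples (hypothetical layout).
    older_than: datetime cutoff; only snapshots committed before it are candidates.
    retain_last: number of most recent snapshots that are always kept.
    """
    newest_first = sorted(snapshots, key=lambda s: s[1], reverse=True)
    protected = {snap_id for snap_id, _ in newest_first[:retain_last]}
    return sorted(
        snap_id
        for snap_id, committed_at in newest_first
        if committed_at < older_than and snap_id not in protected
    )


# Seven snapshots committed every two days from April 20 to May 2, 2026.
history = [
    (1, datetime(2026, 4, 20)),
    (2, datetime(2026, 4, 22)),
    (3, datetime(2026, 4, 24)),
    (4, datetime(2026, 4, 26)),
    (5, datetime(2026, 4, 28)),
    (6, datetime(2026, 4, 30)),
    (7, datetime(2026, 5, 2)),
]
cutoff = datetime(2026, 5, 1)

print(select_expired(history, cutoff, retain_last=5))  # [1, 2]
```

Six snapshots are older than the cutoff, but four of them are among the five most recent, so only snapshots 1 and 2 are selected.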

Operational Importance

Running snapshot expiration periodically is essential to control storage costs and keep query planning fast.
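One way to run the procedure on a schedule is to compute a rolling cutoff and generate the CALL statement from it. The sketch below only builds the SQL string; the helper build_expire_call and the 28-day retention window are assumptions for illustration, and the identifiers are interpolated without escaping, so it presumes trusted inputs.

```python
from datetime import datetime, timedelta


def build_expire_call(catalog, table, retention_days, retain_last, now):
    """Build an expire_snapshots CALL statement with a rolling cutoff.

    Hypothetical helper: values are interpolated as-is, so pass trusted inputs only.
    """
    cutoff = now - timedelta(days=retention_days)
    # Format the cutoff with millisecond precision, e.g. 2026-05-01 00:00:00.000
    cutoff_literal = (
        cutoff.strftime("%Y-%m-%d %H:%M:%S.") + f"{cutoff.microsecond // 1000:03d}"
    )
    return (
        f"CALL {catalog}.system.expire_snapshots("
        f"table => '{table}', "
        f"older_than => TIMESTAMP '{cutoff_literal}', "
        f"retain_last => {retain_last})"
    )


stmt = build_expire_call("prod", "db.web_logs", 28, 5, datetime(2026, 5, 29))
print(stmt)
```

In a scheduled Spark job, the resulting string could be passed to spark.sql(stmt), assuming a SparkSession named spark that is configured with the prod catalog.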

