Kafka Tiered Storage: Infinite Retention Without Infinite Disks
Configure Kafka tiered storage with S3 to reduce storage costs 3-9x. Broker config, common errors, and performance tradeoffs.

Keeping 90 days of Kafka data costs 3x what it should. You're paying for fast SSDs to store data nobody reads.
Tiered storage fixes this by offloading old segments to object storage. Recent data stays on local disks; historical data moves to S3. Introduced as early access in Kafka 3.6 (KIP-405), it became production-ready in Kafka 3.9.
> We needed 7-year retention for SOX compliance. Traditional Kafka would've cost $600K/month. With tiered storage, we're at $58K. Same data, same durability.
>
> Data Architect at a financial services firm
The Cost Math
S3 Standard: $0.023/GB-month. EBS SSD: $0.08/GB-month. But Kafka replicates data 3x across broker disks, so with EBS you pay for every byte three times. S3 handles durability internally, so you pay for storage once (plus GET requests and data transfer for consumer reads).
| Storage Type | Per GB-Month | After 3x Replication | 10 TB Monthly |
|---|---|---|---|
| EBS SSD (gp3) | $0.08/GB | $0.24/GB | $2,400 |
| S3 Standard | $0.023/GB | $0.023/GB | $230 |
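The table's figures can be reproduced in a few lines. This is a quick sketch using the listed prices; the replication factor of 3 and the 10 TB (10,000 GB) figure are the article's assumptions, and S3 request/transfer charges are excluded.

```python
REPLICATION_FACTOR = 3       # Kafka's typical replication on broker disks
EBS_GP3_PER_GB = 0.08        # $/GB-month, from the table
S3_STANDARD_PER_GB = 0.023   # $/GB-month, from the table

def monthly_cost(gb: float, price_per_gb: float, replicas: int = 1) -> float:
    """Storage cost only; excludes S3 GET-request and data-transfer charges."""
    return gb * price_per_gb * replicas

gb = 10_000  # 10 TB
ebs = monthly_cost(gb, EBS_GP3_PER_GB, REPLICATION_FACTOR)  # -> 2400.0
s3 = monthly_cost(gb, S3_STANDARD_PER_GB)                   # -> 230.0
print(f"EBS: ${ebs:,.0f}/month, S3: ${s3:,.0f}/month, ratio {ebs / s3:.1f}x")
```

Note the storage-only gap is larger than the headline 3-9x; request and transfer costs, plus the local hot tier you still run, narrow it in practice.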
How It Works
Traditional Kafka stores everything on broker disks. With tiered storage:
- Local tier: Fast SSDs. Holds recent data (hours to days).
- Remote tier: S3/GCS/Azure. Holds historical data (weeks to years).
When a segment rolls (closes), the broker uploads it to remote storage. Once the segment has been uploaded and is older than local.retention.ms, the local copy is deleted.
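That lifecycle can be sketched as a small predicate: a local segment is only eligible for deletion once it has rolled, has been copied to the remote tier, and has aged past the local retention window. This is an illustrative model, not Kafka's actual code; the field names are made up.

```python
from dataclasses import dataclass

@dataclass
class Segment:
    closed: bool    # rolled (no longer the active segment)
    uploaded: bool  # safely copied to the remote tier
    age_ms: int     # time since the segment's last record

def local_copy_deletable(seg: Segment, local_retention_ms: int) -> bool:
    # A segment must be rolled AND already in remote storage before
    # local retention is allowed to remove the on-disk copy.
    return seg.closed and seg.uploaded and seg.age_ms > local_retention_ms

active = Segment(closed=False, uploaded=False, age_ms=3_600_000)
archived = Segment(closed=True, uploaded=True, age_ms=172_800_000)  # 2 days old

print(local_copy_deletable(active, 86_400_000))    # False: still the active segment
print(local_copy_deletable(archived, 86_400_000))  # True: rolled, uploaded, past 1 day
```

The key safety property is the `uploaded` check: local data is never dropped before it is durable remotely.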
Enabling Tiered Storage
Apache Kafka ships the RemoteStorageManager interface but no implementation. You need a plugin like Aiven's tiered-storage-for-apache-kafka.
# server.properties
remote.log.storage.system.enable=true
remote.log.storage.manager.class.name=io.aiven.kafka.tieredstorage.RemoteStorageManager
remote.log.storage.manager.class.path=/opt/kafka/libs/tiered-storage/*
# S3 config
rsm.config.storage.backend.class=io.aiven.kafka.tieredstorage.storage.s3.S3Storage
rsm.config.storage.s3.bucket.name=kafka-tiered-storage-prod
rsm.config.storage.s3.region=us-east-1
# Required for metadata manager
remote.log.metadata.manager.listener.name=PLAINTEXT
Then enable per topic (you can also configure these settings through a topic management UI):
kafka-topics.sh --bootstrap-server localhost:9092 \
--create --topic events-archive \
--partitions 12 \
--config remote.storage.enable=true \
--config local.retention.ms=86400000 \
--config retention.ms=7776000000
# local = 1 day on disk, total = 90 days including remote
Two Retention Settings
| Setting | Applies To |
|---|---|
| local.retention.ms | Time on local disk |
| retention.ms | Total retention (local + remote) |
Local copies are deleted after local.retention.ms; remote copies are deleted when retention.ms expires.
Common Errors
"Tiered Storage functionality is disabled"
You enabled remote.storage.enable=true on a topic but the broker lacks remote.log.storage.system.enable=true.
"No RemoteStorageManager configured"
Kafka has no built-in S3 connector. Install a plugin and set remote.log.storage.manager.class.name.
Consumer reads timing out
Remote fetches take longer. Increase consumer timeouts:
fetch.max.wait.ms=5000
request.timeout.ms=60000
Limitations
- Compacted topics not supported
- JBOD not supported
- No easy disable (must migrate and delete topic)
- Sequential remote fetches (parallel fetches planned for future releases)
Recovery Time Improvement
This is where tiered storage shines:
| Operation | Without | With |
|---|---|---|
| Disk failure recovery | 230 min | 2 min |
| Partition reassignment | 3 hours | 12 min |
| Cluster scale-up | 13+ hours | 1 hour |
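The improvement follows from how much data has to move: a replacement broker only re-replicates the local tier, since remote segments are already durable in S3. A rough back-of-envelope model (the throughput and data sizes here are illustrative assumptions, not the measurements behind the table):

```python
def recovery_minutes(data_gb: float, throughput_gb_per_min: float) -> float:
    # Time to re-replicate a failed broker's on-disk data to a replacement.
    return data_gb / throughput_gb_per_min

NET = 7.5  # GB/min effective replication throughput; assumed, not measured

full_disk = recovery_minutes(2_000, NET)  # 2 TB: 90 days of data on local disk
local_only = recovery_minutes(20, NET)    # 20 GB: ~1 day hot tier with tiered storage
print(f"{full_disk:.0f} min vs {local_only:.1f} min")
```

Whatever the exact throughput, the ratio tracks the ratio of local data sizes, which is why shrinking the local tier shrinks recovery time by roughly the same factor.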
For teams running multi-terabyte clusters with retention beyond a few days, tiered storage is worth the configuration effort.
Book a demo to see how Conduktor Console provides unified visibility into local and remote storage metrics.