Partition Count: The Decision You Can't Undo

How to choose Kafka partition count. Sizing formulas, overhead costs, and why decreasing partitions requires topic migration.

Stéphane Derosiaux · March 1, 2024

Kafka lets you increase partitions anytime. It does not let you decrease them.

This asymmetry makes partition count one of the most consequential decisions in your architecture. Get it wrong, and you're either leaving performance on the table or facing a painful migration.

We defaulted to 100 partitions per topic "for future scale" on a cluster processing 10 MB/s. Two years later, we're still migrating topics to right-sized partition counts.

Platform Engineer at a logistics company

Why It's Irreversible

Kafka stores messages by partition, and the default partitioner maps each key to a partition by hashing it modulo the partition count. Decreasing the count would require redistributing existing messages, and the re-mapping breaks per-key ordering guarantees.

# Key "user-123" hashes to partition 2 with 6 partitions
hash("user-123") % 6 = 2

# Same key hashes to partition 0 with 4 partitions
hash("user-123") % 4 = 0

# Messages split across partitions. Ordering lost.

The only way to reduce partitions: create a new topic, migrate data, switch producers/consumers. A multi-day operation for production topics.
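
Going the other direction is trivial. A minimal sketch, assuming a topic named my-topic and a broker at localhost:9092:

# Increasing the partition count is a one-line --alter, but existing keys will map to
# different partitions afterward, so keyed ordering is disrupted from that point on
kafka-topics.sh --bootstrap-server localhost:9092 --alter --topic my-topic --partitions 12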

The Sizing Formula

Partitions = max(Target_Throughput / Producer_Rate, Target_Throughput / Consumer_Rate)

Example: You need 1 GB/s. Producers achieve 100 MB/s per partition. Consumers process 50 MB/s per partition.

Producer: 1000 / 100 = 10 partitions
Consumer: 1000 / 50 = 20 partitions
Result: 20 partitions minimum
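
The same arithmetic as a quick shell check, using the placeholder rates from the example above:

# All rates in MB/s; substitute your measured per-partition values
TARGET=1000; PRODUCER_RATE=100; CONSUMER_RATE=50
BY_PRODUCER=$(( TARGET / PRODUCER_RATE ))   # 10
BY_CONSUMER=$(( TARGET / CONSUMER_RATE ))   # 20
echo "Minimum partitions: $(( BY_PRODUCER > BY_CONSUMER ? BY_PRODUCER : BY_CONSUMER ))"   # 20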

Test your actual throughput:

kafka-producer-perf-test.sh --topic test --num-records 1000000 --record-size 1024 --throughput -1 --producer-props bootstrap.servers=localhost:9092 acks=all
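
And the consumer side with the companion tool; topic name and message count are placeholders:

# Older releases take --broker-list instead of --bootstrap-server
kafka-consumer-perf-test.sh --topic test --messages 1000000 --bootstrap-server localhost:9092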

Parallelism Ceiling

Each partition can have at most one consumer per group:

Max parallel consumers = Number of partitions

6 partitions, 10 consumers → 4 sit idle. Plan for growth: if you expect 20 consumers, start with at least 20 partitions. Monitoring partition distribution helps identify these imbalances before they cause issues.
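
A quick way to spot idle members, assuming a group named my-group:

# --members lists each consumer and its number of assigned partitions; zero means idle
kafka-consumer-groups.sh --bootstrap-server localhost:9092 --describe --group my-group --members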

Hidden Costs of Over-Provisioning

Each partition requires:

  • 10 MB pre-allocated index files
  • Open file handles (3-6 per partition)
  • Metadata in the controller

File handles: 5,000 partitions per broker means 15,000-30,000 open file handles. Most Linux distributions default to 1,024 per process. Set ulimit -n to 100,000 or higher for brokers.
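
Two quick checks on a broker host; this sketch assumes a single Kafka process and a systemd-managed service:

# Open file descriptors currently used by the broker, and its configured limit
ls /proc/$(pgrep -f kafka.Kafka)/fd | wc -l
grep "open files" /proc/$(pgrep -f kafka.Kafka)/limits

# For systemd deployments, raise the limit persistently in a drop-in (path is an assumption):
#   [Service]
#   LimitNOFILE=100000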

Controller failover:

Partitions    Approx. Failover Time
1,000         ~2 seconds
10,000        ~20 seconds
100,000       ~3+ minutes

Limits:

Scope          Soft Limit    Hard Limit
Per broker     2,000         4,000
Per cluster    50,000        200,000
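
To see where a cluster stands against these limits, count partitions from the topic metadata; the broker address is a placeholder:

# Total partitions in the cluster
kafka-topics.sh --bootstrap-server localhost:9092 --describe | grep -c "Leader:"
# Partition leaders per broker
kafka-topics.sh --bootstrap-server localhost:9092 --describe | grep -o "Leader: [0-9]*" | sort | uniq -c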

Rebalancing Impact

Consumer rebalances redistribute partitions. More partitions = longer rebalance duration.

# Enable cooperative rebalancing (Kafka 2.4+)
partition.assignment.strategy=org.apache.kafka.clients.consumer.CooperativeStickyAssignor

For stable groups, static membership prevents rebalances on restart:

group.instance.id=consumer-1
session.timeout.ms=300000
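
A usage sketch: give each instance its own properties file containing a unique group.instance.id (file and topic names here are placeholders):

kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic my-topic --group my-group --consumer.config consumer-1.properties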

Practical Guidelines

Small cluster (< 6 brokers): 3 × Broker_Count per topic

Large cluster (> 12 brokers): 2 × Broker_Count per topic

Avoid prime numbers. Use 6, 8, 12, 16, 24, 48—divisible by common consumer counts.

When You've Over-Partitioned

Signs: controller elections taking > 30 seconds, rebalances taking > 60 seconds, under-replicated partitions (URPs) spiking during rolling restarts.
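
Checking for URPs directly, assuming a reachable bootstrap server:

# Lists only partitions whose in-sync replica set is smaller than the full replica set
kafka-topics.sh --bootstrap-server localhost:9092 --describe --under-replicated-partitions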

Mitigation: Add brokers, upgrade to KRaft, tune controller settings.

Full migration (last resort):

kafka-topics.sh --bootstrap-server localhost:9092 --create --topic my-topic-v2 --partitions 12
# Mirror data, switch producers, drain old topic, switch consumers, delete old topic

When in doubt, slightly over-provision. Adding partitions is easy; removing them is not.

Book a demo to see how Conduktor Console shows partition distribution and consumer lag.