Field-Level Encryption in Kafka: Beyond TLS

TLS protects data in transit, but brokers still see plaintext. Field-level encryption protects PII at rest and enables crypto shredding.

Stéphane Derosiaux · September 7, 2024

TLS encrypts data between your producers, brokers, and consumers. It doesn't protect your data once it reaches the broker.

Kafka brokers see every message in plaintext. Administrators with filesystem access can read your topics. Heap dumps expose sensitive fields. If your cluster handles PII or financial data, TLS leaves significant gaps.

"We passed our SOC2 audit only after implementing field-level encryption. The auditor explicitly asked about data at rest on brokers—TLS wasn't enough."

Security Engineer at a healthcare company

Where TLS Falls Short

Producer → [TLS encrypted] → Broker (plaintext) → [TLS encrypted] → Consumer
                                   ↓
                            Disk (plaintext)
                            Backups (plaintext)
Vector                        TLS Protection
Network eavesdropping         Yes
Broker filesystem access      No
Kafka admin reading topics    No
Backup systems                No
Heap dump analysis            No

Disk encryption (AWS MSK, Confluent Cloud) helps but doesn't protect against administrators or the cloud provider itself.

When Field-Level Encryption Is Required

Regulatory compliance: GDPR, HIPAA, and PCI-DSS increasingly expect field-level controls, not just transport encryption.

Multi-tenant access: Your analytics team needs order totals. Your support team needs emails. Neither should see the other's data. Same topic, different views based on decryption keys.

GDPR deletion: Kafka's append-only log makes traditional deletion impractical. Crypto shredding solves this—encrypt with per-user keys, delete the key on request. The ciphertext becomes unreadable.

Implementation Options

Client-side encryption: Producers encrypt before sending. Every application needs encryption logic and key access.
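As a sketch of what that per-application logic looks like, the snippet below encrypts one field of a record before it would be handed to a producer. This is illustrative only: the SHA-256 keystream cipher is a stand-in for AES-256-GCM (which would come from a library such as `cryptography`), and the nonce/ciphertext envelope layout is an assumption, not a standard format.

```python
import hashlib
import os

def _keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy stream cipher: XOR with a SHA-256-derived keystream.
    Stand-in for AES-256-GCM; do not use for real data."""
    out = bytearray()
    counter = 0
    while len(out) < len(data):
        block = hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        out.extend(block)
        counter += 1
    return bytes(x ^ y for x, y in zip(data, out))

def encrypt_fields(record: dict, fields: list, key: bytes) -> dict:
    """Encrypt only the selected fields; everything else stays plaintext."""
    enc = dict(record)
    for f in fields:
        nonce = os.urandom(12)  # fresh nonce per field per message
        ct = _keystream_xor(key, nonce, str(enc[f]).encode())
        enc[f] = {"nonce": nonce.hex(), "ciphertext": ct.hex()}
    return enc

def decrypt_fields(record: dict, fields: list, key: bytes) -> dict:
    dec = dict(record)
    for f in fields:
        nonce = bytes.fromhex(dec[f]["nonce"])
        ct = bytes.fromhex(dec[f]["ciphertext"])
        dec[f] = _keystream_xor(key, nonce, ct).decode()
    return dec

key = os.urandom(32)
plain = {"customer_id": "12345", "ssn": "078-05-1120", "total": 99.5}
wire = encrypt_fields(plain, ["ssn"], key)  # what the broker would store
assert wire["ssn"] != plain["ssn"]
assert decrypt_fields(wire, ["ssn"], key)["ssn"] == plain["ssn"]
```

The consumer reverses the transform with the same key. Note what the drawback above means in practice: this code, plus key distribution, has to ship in every producer and consumer.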

SMT-based encryption: Kafka Connect transforms encrypt data in pipelines. Doesn't protect direct producer/consumer applications.

Proxy-layer encryption: A proxy between clients and brokers handles encryption transparently. Zero application changes. See how to implement encryption with a proxy-based approach.

# Proxy configuration example
encryption:
  topic: customers
  fields:
    - fieldName: ssn
      keySecretId: pii-key
      algorithm: AES256_GCM

Approach      Pros                               Cons
Client-side   Broker never sees plaintext        Every app needs logic
SMT           Works with Connect pipelines       Doesn't cover all access
Proxy         Zero code changes, central policy  Additional infrastructure

Deterministic vs Probabilistic

Probabilistic (recommended): Same plaintext encrypts to different ciphertext. Maximum security. Can't search encrypted fields.

Deterministic: Same plaintext, same ciphertext. Enables joins on encrypted fields. Weaker security—allows frequency analysis.

Use deterministic only when you must join or dedupe on encrypted data.
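The trade-off can be shown in a few lines. This is a toy sketch, not production cryptography: the random-nonce keystream stands in for AES-GCM, and a keyed HMAC stands in for a deterministic AEAD mode such as AES-SIV.

```python
import hashlib
import hmac
import os

def probabilistic_encrypt(plaintext: bytes, key: bytes) -> bytes:
    """Random nonce: same plaintext yields a different ciphertext each call.
    (Sketch only; handles plaintexts up to 32 bytes.)"""
    nonce = os.urandom(12)
    stream = hashlib.sha256(key + nonce).digest()
    ct = bytes(p ^ s for p, s in zip(plaintext, stream))
    return nonce + ct

def deterministic_token(plaintext: bytes, key: bytes) -> bytes:
    """Keyed hash: same plaintext always yields the same token,
    so equality (joins, dedupe) survives encryption."""
    return hmac.new(key, plaintext, hashlib.sha256).digest()

KEY = os.urandom(32)
email = b"alice@example.com"

a = probabilistic_encrypt(email, KEY)
b = probabilistic_encrypt(email, KEY)
assert a != b  # can't match records by ciphertext: frequency analysis blocked

t1 = deterministic_token(email, KEY)
t2 = deterministic_token(email, KEY)
assert t1 == t2  # joinable, but repeated values are visible as repeats
```

The two asserts are exactly the trade-off above: probabilistic ciphertexts defeat frequency analysis but break equality, deterministic tokens preserve equality but leak repetition.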

Crypto Shredding for GDPR

User 12345 requests deletion
      ↓
Delete encryption key for User 12345
      ↓
All messages for User 12345 are now unreadable
(Topic remains intact, no partition rewrite)

The data technically exists but is permanently inaccessible without the key.
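A minimal sketch of the key-store side of this flow, assuming a hypothetical per-user `KeyStore` and the same kind of toy keystream cipher as above (a real system would use AES-GCM with keys held in a KMS):

```python
import hashlib
import os

class KeyStore:
    """Per-user key registry: deleting a key 'shreds' that user's ciphertext."""
    def __init__(self):
        self._keys = {}

    def key_for(self, user_id: str) -> bytes:
        return self._keys.setdefault(user_id, os.urandom(32))

    def shred(self, user_id: str) -> None:
        # GDPR deletion request: drop only the key; the topic is untouched.
        self._keys.pop(user_id, None)

    def decrypt(self, user_id: str, nonce: bytes, ct: bytes) -> bytes:
        key = self._keys[user_id]  # raises KeyError after shredding
        stream = hashlib.sha256(key + nonce).digest()
        return bytes(c ^ s for c, s in zip(ct, stream))

def encrypt(store: KeyStore, user_id: str, plaintext: bytes):
    """Sketch cipher; handles plaintexts up to 32 bytes."""
    nonce = os.urandom(12)
    stream = hashlib.sha256(store.key_for(user_id) + nonce).digest()
    return nonce, bytes(p ^ s for p, s in zip(plaintext, stream))

store = KeyStore()
nonce, ct = encrypt(store, "12345", b"ssn=078-05-1120")
assert store.decrypt("12345", nonce, ct) == b"ssn=078-05-1120"

store.shred("12345")  # user 12345 requests deletion
try:
    store.decrypt("12345", nonce, ct)
    shredded = False
except KeyError:
    shredded = True
assert shredded  # ciphertext still sits on the topic, but is unreadable
```

No partition is rewritten and no offset moves; the only mutation is in the key store.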

Field-level encryption isn't optional for Kafka clusters handling sensitive data. The question is whether you implement it in every application, in pipelines, or centrally.

Book a demo to see how Conduktor Gateway provides proxy-layer encryption with policy-based decryption.