
Crypto Shredding for Kafka: GDPR-Compliant Data Deletion

Crypto shredding is a data deletion technique that makes encrypted data permanently unreadable by destroying the encryption keys. Unlike physical deletion, the ciphertext remains in Kafka but becomes cryptographically meaningless without its key, satisfying GDPR's "right to be forgotten" without modifying immutable logs.

Kafka's append-only architecture creates a fundamental tension with data privacy regulations. When a user exercises their "right to be forgotten" under GDPR, you cannot simply delete their records from an immutable log. The data persists in topic partitions, replicas, and consumer state stores.

Crypto shredding resolves this by making data unreadable rather than physically deleting it. Encrypt each user's data with their own key, then destroy the key when deletion is required. The ciphertext remains in Kafka, but without the key, it becomes meaningless bytes, cryptographically indistinguishable from random noise.

This approach satisfies GDPR Article 17 (Right to Erasure) requirements while preserving Kafka's architectural integrity. The data is effectively "deleted" from a privacy perspective: no one can access it, even with full access to the underlying storage.

How Crypto Shredding Works

Crypto shredding combines field-level encryption with per-user key management. The architecture uses envelope encryption with three key types.

Figure: Crypto Shredding Architecture with Envelope Encryption

Key terminology:

  • KEK (Key Encryption Key): Master key stored in your external KMS (AWS KMS, HashiCorp Vault, Azure Key Vault). Never leaves the KMS.
  • DEK (Data Encryption Key): Per-user key generated by Gateway that encrypts actual record fields.
  • EDEK (Encrypted DEK): The DEK encrypted by the KEK. Stored safely because it cannot be decrypted without the KEK.

The encryption flow:

  1. Gateway intercepts records on produce
  2. Extracts a unique identifier (e.g., userId) to determine the key ID
  3. Generates or retrieves a DEK for that user
  4. Encrypts specified fields (e.g., email, visa) with the DEK
  5. Stores the EDEK in a dedicated Kafka topic (Encryption Keys Store)
  6. Produces the encrypted record to the target topic

When crypto shredding is triggered, you tombstone the EDEK record in the keys store. Without the EDEK, the DEK can no longer be recovered, so Gateway cannot decrypt the data, and neither can anyone else.
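The relationship between the three key types can be illustrated with a small in-memory model. This is a sketch under loud assumptions, not Gateway's implementation: `toy_cipher` is a deliberately insecure XOR stand-in for a real cipher such as AES-GCM, the `key_store` dict stands in for the Encryption Keys Store topic, and the function names are hypothetical.

```python
import hashlib
import secrets


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256-derived keystream.

    TOY ONLY: stands in for a real cipher (e.g. AES-GCM) and is NOT secure.
    Applying it twice with the same key returns the original data.
    """
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


# KEK: in a real deployment this lives only inside the external KMS.
kek = secrets.token_bytes(32)

# Stand-in for the Encryption Keys Store topic: key ID -> EDEK.
key_store = {}


def key_id_for(user_id: str) -> str:
    return f"gateway-kms://user-{user_id}"


def encrypt_field(user_id: str, plaintext: str) -> bytes:
    key_id = key_id_for(user_id)
    if key_id not in key_store:
        dek = secrets.token_bytes(32)             # new per-user DEK
        key_store[key_id] = toy_cipher(kek, dek)  # persist only the EDEK
    dek = toy_cipher(kek, key_store[key_id])      # unwrap the EDEK with the KEK
    return toy_cipher(dek, plaintext.encode())


def decrypt_field(user_id: str, ciphertext: bytes):
    edek = key_store.get(key_id_for(user_id))
    if edek is None:
        return None                               # key shredded: nothing to decrypt with
    dek = toy_cipher(kek, edek)
    return toy_cipher(dek, ciphertext).decode()


def shred(user_id: str) -> None:
    # Equivalent in spirit to tombstoning the EDEK record.
    key_store.pop(key_id_for(user_id), None)
```

In this model, calling `shred("101")` leaves every ciphertext for user 101 in place, yet `decrypt_field` can no longer recover it, while other users' fields still decrypt.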

Gateway Configuration for Crypto Shredding

Conduktor Gateway implements crypto shredding through the gateway-kms:// key scheme. This delegates key storage to a dedicated Kafka topic while using your external KMS for the master key.

Encryption Interceptor

# Per-user encryption keys for crypto shredding
pluginClass: io.conduktor.gateway.interceptor.EncryptPlugin
config:
  topic: customers
  kmsConfig:
    vault:
      uri: http://vault:8200
      token: ${VAULT_TOKEN}
    gateway:
      masterKeyId: vault-kms://vault:8200/transit/keys/master-key
  fields:
    - fieldName: email
      keySecretId: gateway-kms://user-{{record.value.userId}}
    - fieldName: visa
      keySecretId: gateway-kms://user-{{record.value.userId}}

Key points:

  • gateway-kms://user-{{record.value.userId}}: Creates a unique DEK per user via mustache templating
  • masterKeyId: References the KEK in your external KMS (Vault, AWS, Azure, or GCP)
  • Fields not listed remain unencrypted for querying
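To make the templating concrete, here is a minimal sketch of how a key ID could be derived from a record. It assumes placeholders of the form `{{record.value.<path>}}` only; the `resolve_key_id` function and its regular expression are hypothetical and do not describe how Gateway actually parses templates.

```python
import re

# Matches {{record.value.someField}} or nested paths like {{record.value.a.b}}.
PLACEHOLDER = re.compile(r"\{\{\s*record\.value\.([A-Za-z0-9_.]+)\s*\}\}")


def resolve_key_id(template: str, record_value: dict) -> str:
    """Substitute each placeholder with the matching field of the record value."""

    def lookup(match: re.Match) -> str:
        node = record_value
        for part in match.group(1).split("."):
            node = node[part]  # raises KeyError if the field is missing
        return str(node)

    return PLACEHOLDER.sub(lookup, template)


record = {"userId": 101, "email": "alice@example.com", "visa": "#1234"}
key_id = resolve_key_id("gateway-kms://user-{{record.value.userId}}", record)
```

Because both fields in the configuration above use the same template, `email` and `visa` for a given user resolve to the same key ID and therefore share one DEK.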

Decryption Interceptor

pluginClass: io.conduktor.gateway.interceptor.DecryptPlugin
config:
  topic: customers
  kmsConfig:
    vault:
      uri: http://vault:8200
      token: ${VAULT_TOKEN}
    gateway:
      masterKeyId: vault-kms://vault:8200/transit/keys/master-key

If a key has been shredded, Gateway handles the affected records according to the configured errorPolicy, for example by returning the data still encrypted. For complete configuration, see Gateway encryption documentation.

The Crypto Shredding Process

When a user requests data deletion, you perform crypto shredding by tombstoning their EDEK in the Encryption Keys Store topic.

Figure: Crypto Shredding Deletion Process

Step-by-Step Deletion

  1. Identify the key ID: For a user with userId: 101, the key ID is gateway-kms://user-101
  2. Find all EDEKs: Scan _conduktor_gateway_encryption_keys for records matching this key ID. Due to distributed processing, multiple EDEKs may exist with different UUIDs:
{"algorithm":"AES128_GCM","keyId":"gateway-kms://user-101","uuid":"abc-123"}
{"algorithm":"AES128_GCM","keyId":"gateway-kms://user-101","uuid":"def-456"}
  3. Tombstone each EDEK: Produce a null-value record for each key:
echo '{"algorithm":"AES128_GCM","keyId":"gateway-kms://user-101","uuid":"abc-123"}|NULL' | \
kafka-console-producer \
  --bootstrap-server gateway:6969 \
  --topic _conduktor_gateway_encryption_keys \
  --property "parse.key=true" \
  --property "key.separator=|" \
  --property "null.marker=NULL"
  4. Verify: Consume from the original topic. The shredded user's data now returns encrypted (unreadable), while other users' data decrypts normally.
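When a user has several EDEKs, steps 2 and 3 can be scripted. The sketch below assumes you have already dumped the record keys of the keys store topic as one JSON string per line (how you dump them is up to you); the `tombstone_lines` helper is hypothetical. It emits input lines in the `key|NULL` form used by the console producer command above.

```python
import json


def tombstone_lines(key_records, key_id, separator="|", null_marker="NULL"):
    """Build console-producer input lines that tombstone every EDEK of key_id.

    key_records: iterable of raw record keys (JSON strings) from the keys store.
    The raw key text is reused verbatim, because a tombstone only applies to a
    record whose key is byte-identical to the original.
    """
    lines = []
    for raw in key_records:
        raw = raw.rstrip("\n")
        if not raw:
            continue
        if json.loads(raw).get("keyId") != key_id:
            continue  # exact match only: user-101 must not match user-1010
        if separator in raw:
            raise ValueError(f"separator {separator!r} occurs in key: {raw}")
        lines.append(f"{raw}{separator}{null_marker}")
    return lines


dumped_keys = [
    '{"algorithm":"AES128_GCM","keyId":"gateway-kms://user-101","uuid":"abc-123"}',
    '{"algorithm":"AES128_GCM","keyId":"gateway-kms://user-1010","uuid":"zzz-999"}',
    '{"algorithm":"AES128_GCM","keyId":"gateway-kms://user-101","uuid":"def-456"}',
]
to_produce = tombstone_lines(dumped_keys, "gateway-kms://user-101")
```

Each resulting line can be piped to kafka-console-producer with the same parse.key, key.separator, and null.marker properties shown in step 3.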

Important Considerations

  • Tombstone all UUIDs: Multiple Gateway nodes may create duplicate EDEKs for the same user. Ensure all are tombstoned.
  • New records still encrypt: Shredding only affects historical data. New records for the same user would create a new DEK.
  • Topic compaction timing: The tombstoned records eventually compact away, but timing depends on the topic's compaction settings (for example, min.cleanable.dirty.ratio and max.compaction.lag.ms).

For a hands-on tutorial, see Configure crypto shredding.

Why Crypto Shredding vs Alternatives

Compared to Tombstone Records

Standard Kafka tombstones (key=userId, value=null) only mark logical deletion. Consumers must implement filtering logic, and downstream systems may already have copies of the data. With crypto shredding, the data itself becomes unreadable everywhere.

Compared to Log Compaction

Log compaction eventually removes older records with the same key, but:

  • No guarantees on timing
  • Replicas and backups may retain data
  • Downstream consumers may have persistent copies

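Crypto shredding takes effect as soon as the key is destroyed and does not depend on compaction timing: without the key, every copy of the ciphertext is unreadable.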

Compared to Pseudonymization

Pseudonymization (storing PII separately with tokens) requires managing a separate PII store and complex join logic. Crypto shredding keeps data in Kafka's natural structure while enabling compliant deletion.

Production Considerations

Cost and Performance

Gateway KMS significantly reduces KMS costs for high-volume scenarios:

  • Single KEK: Only one master key in your external KMS
  • Many DEKs: Per-user keys stored as encrypted records in Kafka
  • Local caching: DEKs are cached in Gateway memory with configurable TTL

Without Gateway KMS, storing millions of per-user keys directly in AWS KMS or Vault would be cost-prohibitive and create a performance bottleneck.
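The effect of local caching can be sketched with a small TTL cache. This is an illustrative model only: the `DekCache` class, its `loader` callback (standing in for "fetch the EDEK and unwrap it through the KMS"), and the injectable `clock` are assumptions, not Gateway internals.

```python
import time


class DekCache:
    """Cache DEKs in memory and reload them once their TTL has elapsed."""

    def __init__(self, loader, ttl_seconds, clock=time.monotonic):
        self._loader = loader        # expensive call: unwrap the EDEK via the KMS
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so expiry can be tested
        self._entries = {}           # key ID -> (dek, expires_at)

    def get(self, key_id):
        now = self._clock()
        entry = self._entries.get(key_id)
        if entry is not None and now < entry[1]:
            return entry[0]          # cache hit: no KMS round trip
        dek = self._loader(key_id)
        self._entries[key_id] = (dek, now + self._ttl)
        return dek
```

In a design like this, the KMS is contacted at most once per key per TTL window. It also implies that a cached DEK could stay usable until its entry expires, so the TTL is worth considering when you reason about how quickly shredding takes effect; whether Gateway behaves exactly this way is not covered here.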

Key Security

The master key (KEK) is the crown jewel:

  • Stored only in your external KMS (Vault, AWS, Azure, GCP)
  • Never exposed to Gateway
  • Rotate per your organization's policy
  • If compromised, all data protected by that KEK is at risk

DEKs are protected by:

  • Encryption with the KEK before storage
  • Compacted Kafka topic with access controls
  • Gateway authentication requirements

Compliance Audit Trail

Maintain records of:

  • When crypto shredding was requested (deletion request timestamp)
  • When tombstones were produced (execution timestamp)
  • Which key IDs were shredded
  • Verification that encrypted data is returned post-shredding

This audit trail demonstrates GDPR compliance during regulatory review.
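One possible shape for such an audit entry is sketched below. The `ShredAuditRecord` class and its field names are hypothetical; adapt them to whatever audit system you already use.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


def utc_now_iso() -> str:
    return datetime.now(timezone.utc).isoformat()


@dataclass(frozen=True)
class ShredAuditRecord:
    key_id: str                 # which key ID was shredded
    requested_at: str           # deletion request timestamp (ISO 8601)
    tombstoned_uuids: tuple     # every EDEK UUID that was tombstoned
    verified_encrypted: bool    # post-shredding check returned encrypted data
    executed_at: str = field(default_factory=utc_now_iso)  # execution timestamp

    def to_json(self) -> str:
        return json.dumps(asdict(self), sort_keys=True)


entry = ShredAuditRecord(
    key_id="gateway-kms://user-101",
    requested_at="2026-01-10T09:00:00+00:00",  # example value only
    tombstoned_uuids=("abc-123", "def-456"),
    verified_encrypted=True,
)
audit_line = entry.to_json()
```

Serializing each entry as one JSON line makes it straightforward to append to a log file or produce to a dedicated audit topic.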

Error Handling

Configure the decryption interceptor's errorPolicy:

  • return_encrypted: Returns unreadable ciphertext (supports crypto shredding)
  • fail_fetch: Fails the consumer request
  • crypto_shred_safe_fail_fetch: Fails the consumer request on decryption errors, except when the key is not found (the crypto shredding case)

For most GDPR scenarios, return_encrypted is appropriate: it allows consumers to continue processing while making shredded data permanently inaccessible.

Written by Stéphane Derosiaux · Last updated February 18, 2026