How to deploy Conduktor Gateway?

Conduktor Gateway sits between your Kafka clusters and your applications and speaks the Apache Kafka protocol. We review different ways to deploy it, with their pros and cons.

François Teychené

June 1, 2023

"Cool but how do I deploy this thing?!"

This is the first question you ask yourself when you have a new toy to play with. Yes, you want to reinvent your Kafka infrastructure with Conduktor Gateway and protect your infra from clever developers... who sometimes make typos or are unaware of some behaviors that happen in real life.

Let's recap what Conduktor Gateway brings, then talk about various deployment strategies: Load Balancer, SNI routing, joy!

What is Conduktor Gateway?

Conduktor Gateway is your Swiss Army knife when it comes to Kafka.

It aims to provide additional tools and various capabilities to your existing Apache Kafka deployments:

  • Safeguard: Enforces best practices and protects your Kafka infrastructure from outages (just like that)

  • Chaos: Simulates real-world scenarios to ensure application resilience

  • Policy: Enforces rules all the developers will need to follow if they want to do something on your Kafka. If they don't follow the rules, they don't pass

  • Encryption: Ensures secure transmission of sensitive messages and encryption at-rest (with field level encryption)

  • Multi-tenancy: Reduces costs by having many teams/projects, fully isolated, co-existing on a single cluster

  • RBAC: Resolves Apache Kafka's ACL flaws

  • Audit: Know who is doing what/when/how with Kafka and the applications

  • Cold Storage: Send very large messages in Kafka without impacting the Kafka infrastructure

  • Cache: Reduces networking costs, latency and improves performance

  • Lineage: Adds lineage information to data flows

Conduktor Gateway is deployed between your Kafka cluster and your applications: it acts like Kafka, listens on :9092, and speaks the Apache Kafka protocol. (It's not a basic HTTP proxy, faaar from it!)

To understand the various deployment options, let's see how your infrastructure looks without Conduktor Gateway.

Without Conduktor Gateway

Let's describe the networking flow between Kafka clients and Kafka clusters.

  1. The producer initiates a connection with a random broker (the famous "bootstrap server") and sends a Metadata request.

  2. The broker responds to the producer with the cluster information, indicating that this Kafka cluster consists of 3 brokers:

  • Broker1 at port :9092

  • Broker2 at port :9092

  • Broker3 at port :9092

  3. The producer proceeds to send data to some topics by sending pieces of data to Broker1:9092, Broker2:9092, and Broker3:9092.

That's how Kafka works, no magic here!
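The flow above can be mimicked with a toy model. This is only an illustrative sketch, not real client code: the `metadata_request` and `route` helpers and the partition-to-leader table are hypothetical. It relies on one Kafka fact the steps imply: a producer sends each record to the broker that leads the target partition, which is why data ends up going to all three brokers.

```python
# Toy model of Kafka client bootstrapping. All names and data are illustrative.
BROKERS = {1: "broker1:9092", 2: "broker2:9092", 3: "broker3:9092"}

# Hypothetical leader assignment: partition number -> broker id.
PARTITION_LEADERS = {0: 1, 1: 2, 2: 3}


def metadata_request(bootstrap_server: str) -> dict:
    """Any broker can answer a Metadata request with the full cluster view."""
    if bootstrap_server not in BROKERS.values():
        raise ConnectionError(f"unknown broker {bootstrap_server}")
    return {"brokers": dict(BROKERS), "leaders": dict(PARTITION_LEADERS)}


def route(metadata: dict, partition: int) -> str:
    """Return the address the producer connects to for this partition."""
    leader_id = metadata["leaders"][partition]
    return metadata["brokers"][leader_id]


# Step 1-2: bootstrap against any broker and learn the whole cluster.
metadata = metadata_request("broker2:9092")

# Step 3: each partition's data goes directly to its leader.
targets = [route(metadata, p) for p in sorted(metadata["leaders"])]
print(targets)
```

The point to notice is that the bootstrap address is only used for discovery: after the first response, the client talks to whatever addresses the cluster advertised. This is the behavior Conduktor Gateway has to account for in every deployment below.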

Deployment with a Load Balancer

Now, we want Conduktor Gateway to get all the fancy benefits it's bringing! As you don't want to create a Single Point of Failure (SPoF) in case it fails for some reason (or maintenance, upgrades, etc.), you're going to deploy more than one.

The guidance is to:

  • Deploy many Conduktor Gateway instances (at least 3)

    You want to ensure High Availability in case of failure and ensure you are not impacting the overall performance.

  • Deploy a TCP Load Balancer in front of the Conduktor Gateway instances

    e.g. Amazon ELB, Google Cloud LB, Azure LB, HAProxy.

  • Set Conduktor Gateways' advertised.listener to your Load Balancer IP or hostname

    This is necessary for your applications to always connect to the LB and never to your Gateways directly (that would defeat the purpose of what we're trying to achieve)

  • Update your applications (Producers and Consumers) to use the Load Balancer IP or hostname as their bootstrap.servers.

And voilà! You now have a flexible, resilient, efficient, and scalable deployment of Conduktor Gateway, no Single-Point-of-Failure, you're good to go.

Let's talk about the networking flow now!

  1. The Kafka producer sends a Metadata request to loadbalancer:9092

  2. The LB forwards the Metadata request to a random Conduktor Gateway (CG), let's say gateway1:9092

  3. Gateway1 responds with the cluster information to the LB, indicating that:

  • its port 9092 is assigned for broker1:9092

  • its port 9093 is assigned for broker2:9092

  • its port 9094 is assigned for broker3:9092

  • This port-to-broker mapping allows the Gateway to determine the intended target broker when called on a specific port: each Gateway port maps to a single broker. See the section below about SNI to avoid getting tangled up in this port mess!

  4. The LB then relays the cluster information to the producer

  5. The producer sends data to loadbalancer:9093

  6. The LB selects a random CG and forwards the data, like gateway3:9093

  7. Gateway3 sends the data to broker2:9092, because its port :9093 is mapped to broker2
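The port-to-broker mapping in this flow can be sketched in a few lines. This is a minimal illustration, not Conduktor Gateway's actual implementation: `PORT_TO_BROKER`, `advertised_brokers`, and `target_broker` are hypothetical names, and the table simply restates the mapping listed above.

```python
# Hypothetical sketch of the port-to-broker mapping described above.
LB_HOST = "loadbalancer"

# Every Gateway instance shares the same table: listening port -> real broker.
PORT_TO_BROKER = {
    9092: "broker1:9092",
    9093: "broker2:9092",
    9094: "broker3:9092",
}


def advertised_brokers() -> list[str]:
    """Addresses returned to clients: each broker appears as one LB port."""
    return [f"{LB_HOST}:{port}" for port in sorted(PORT_TO_BROKER)]


def target_broker(listening_port: int) -> str:
    """Resolve the real broker from the port the connection arrived on."""
    return PORT_TO_BROKER[listening_port]


print(advertised_brokers())
print(target_broker(9093))
```

Because every Gateway holds the same table, it does not matter which instance the LB picks: a connection arriving on a given port always reaches the same broker.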

Your Kafka applications configuration would look like:

bootstrap.servers=loadbalancer:9092,loadbalancer:9093,loadbalancer:9094
security.protocol=SSL
ssl.truststore.location=/var/private/ssl/kafka.client.truststore.jks
ssl.truststore.password=changeit
ssl.keystore.location=/var/private/ssl/kafka.client.keystore.jks
ssl.keystore.password=changeit

When using Conduktor Gateway with a Load Balancer, if a Conduktor Gateway instance fails, the following process occurs:

  • The client (producer or consumer) will retry the failed operation

  • The Load Balancer will redirect the request to another healthy Conduktor Gateway instance (it's its job)

  • The client will automatically resume seamlessly, no business is stopped, the whole system is still available and performance follows
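A small simulation may help picture this failover. It is a sketch under simplifying assumptions: the `LoadBalancer` class, its health tracking, and `send_with_retry` are hypothetical, and a real LB detects failures through its own health checks rather than through client traffic.

```python
import random


class LoadBalancer:
    """Toy TCP load balancer that only routes to gateways it believes healthy."""

    def __init__(self, gateways):
        self.healthy = {gateway: True for gateway in gateways}

    def mark_down(self, gateway):
        self.healthy[gateway] = False

    def pick(self):
        candidates = [g for g, ok in self.healthy.items() if ok]
        if not candidates:
            raise ConnectionError("no healthy gateway available")
        return random.choice(candidates)


def send_with_retry(lb, crashed, retries=3):
    """Client-side view: retry until the LB hands us a gateway that is alive.

    `crashed` holds gateways that are down but may not be detected yet.
    """
    for _ in range(retries):
        gateway = lb.pick()
        if gateway in crashed:
            # The connection fails; the LB notices and stops routing there.
            lb.mark_down(gateway)
            continue
        return gateway
    raise ConnectionError("all retries failed")


lb = LoadBalancer(["gateway1:9092", "gateway2:9092", "gateway3:9092"])
served_by = send_with_retry(lb, crashed={"gateway1:9092"})
print(served_by)
```

With one crashed instance out of three, the client needs at most one retry: once the failed Gateway is marked down, every later pick lands on a healthy one.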

Pros:

  • Simple deployment: Setting up a LB with multiple Conduktor Gateway instances linked is a common operation for Ops

  • Single Load Balancer: One LB simplifies management and reduces complexity for auditing as well as client configuration

Cons:

  • Multiple ports management: on your LB, you will require as many ports as there are brokers in your Kafka cluster, which might increase the management overhead and complexity

  • Load Balancer aversion: Some users or organizations may prefer not to use load balancers due to specific requirements, policies, or personal preferences

Deployment without Load Balancer

Let's try another way, no LB! The trick is to do exactly as Kafka itself does (there is no LB in Kafka, right?).

  • Deploy many Conduktor Gateway instances (at least 3)

  • Each Conduktor Gateway instance should open as many ports as there are brokers in the underlying Kafka cluster

  • Configure each Conduktor Gateway instance to send a heartbeat to the Kafka cluster. This maintains an up-to-date view of the active/healthy Conduktor Gateway instances

  • When returning broker information in the Metadata response, each Gateway should sequentially select live Conduktor Gateways

Your Kafka applications configuration would be linked directly to the Conduktor Gateways, like:

bootstrap.servers=gateway1:9093,gateway2:9093,gateway3:9093
security.protocol=SSL
ssl.truststore.location=/var/private/ssl/kafka.client.truststore.jks
ssl.truststore.password=changeit
ssl.keystore.location=/var/private/ssl/kafka.client.keystore.jks
ssl.keystore.password=changeit

The networking flow would look like:

  1. The producer sends a Metadata request to gateway1:9092

  2. Gateway1 forwards this Metadata request to broker1:9092, which responds with the normal brokers information: broker1:9092, broker2:9092, broker3:9092

  3. Gateway1 returns the Metadata results as gateway1:9092, gateway2:9093, gateway3:9094 (notice the translation)

  4. Later, the producer sends some data to gateway2:9093 for instance

  5. gateway2:9093 forwards the data to broker2:9092
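The translation in step 3 can be sketched as follows. This is one plausible reading of "sequentially select", namely round-robin over the Gateways currently known to be alive; `translate_metadata` and its parameters are hypothetical names, not the Gateway's real API.

```python
def translate_metadata(brokers, alive_gateways, base_port=9092):
    """Rewrite broker addresses into gateway addresses.

    The i-th broker gets the i-th port (base_port + i), and alive gateways
    are assigned in turn. Returns {advertised gateway address: real broker}.
    """
    if not alive_gateways:
        raise ConnectionError("no alive gateway to advertise")
    mapping = {}
    for index, broker in enumerate(brokers):
        gateway = alive_gateways[index % len(alive_gateways)]
        mapping[f"{gateway}:{base_port + index}"] = broker
    return mapping


brokers = ["broker1:9092", "broker2:9092", "broker3:9092"]

# All three gateways alive: reproduces the translation from the flow above.
all_alive = translate_metadata(brokers, ["gateway1", "gateway2", "gateway3"])
print(all_alive)

# gateway2 missed its heartbeat: the remaining gateways cover every broker.
degraded = translate_metadata(brokers, ["gateway1", "gateway3"])
print(degraded)
```

Since each Gateway opens one port per broker, the port alone still identifies the target broker, so any alive Gateway can stand in for a failed one.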

If a Conduktor Gateway fails during this process, the Kafka client will automatically retry its operations using the remaining Gateways.

This is why we must always set multiple addresses in bootstrap.servers!

The client may select a different Conduktor Gateway based on the updated metadata, allowing for seamless recovery and continuation of operations.

Pros:

  • Just Conduktor Gateway to deploy, no LB, no fancy things!

Cons:

  • Removing/Adding Gateways may require a client configuration update and restart

Time to be serious: Deployment with SNI routing

SNI stands for Server Name Indication. It is an extension to the TLS (Transport Layer Security) protocol that allows a client to indicate the hostname of the server it wants to connect to during the initial handshake.

This allows the server to determine which certificate to present based on the requested hostname. This enables the hosting of multiple secure "resources" on a single IP address, improving resource utilization and making SSL/TLS deployments more flexible.

When using SSL/TLS to connect to Kafka brokers, beyond the certificate checks, we can extract useful information: the Server Name Indication (SNI). By doing so, we remove the constraint we had above: no more need to open as many ports as there are brokers. SNI helps us by automatically giving us what we need!

The guidance is the same as above with the deployment relying on a LB, except for the Gateway configuration:

  • Deploy many Conduktor Gateway instances (at least 3)

  • Each Conduktor Gateway instance will open a single port (with TLS), to use the SNI magic

  • Deploy a TCP Load Balancer in front of the Conduktor Gateway instances (as above)

  • Set Conduktor Gateways' advertised.listener to your Load Balancer IP or hostname

  • Update your applications (Producers and Consumers) to use the Load Balancer IP or hostname as their bootstrap.servers

Your Kafka applications configuration would look like:

bootstrap.servers=loadbalancer:9092
security.protocol=SSL
ssl.truststore.location=/var/private/ssl/kafka.client.truststore.jks
ssl.truststore.password=changeit
ssl.keystore.location=/var/private/ssl/kafka.client.keystore.jks
ssl.keystore.password=changeit

The networking flow would look like above with the LB, except for the tricky SNI part:

  1. The Kafka producer sends a Metadata request to loadbalancer:9092

  2. The LB forwards the Metadata request to a random Conduktor Gateway (CG), let's say gateway1:9092

  3. Gateway1 responds with the cluster information to the LB while altering the advertised.listener using subdomains: broker1.loadbalancer:9092, broker2.loadbalancer:9092, broker3.loadbalancer:9092

  • This subdomain-to-broker mapping allows the Gateway to determine the intended target broker: each subdomain maps to a single broker

  • Each of the brokerX.loadbalancer addresses resolves to the LB

  4. The LB then relays the cluster information to the producer

  5. The producer sends data, for instance, to broker2.loadbalancer:9092, which resolves to the LB

  6. The LB selects a random CG and forwards the data, like gateway3:9092 (equipped with the SNI routing)

  • The LB also automatically forwards the TCP stream it received to Gateway3. This stream carries the TLS handshake, whose SNI names broker2.loadbalancer, along with the encrypted messages

  7. From the SNI extraction (the forwarded TCP stream), Gateway3 knows which broker is targeted and sends the data to it
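The subdomain-to-broker lookup at the heart of this flow can be sketched as below. It is a simplified illustration: `broker_for_sni` and the tables are hypothetical, and treating the bare LB hostname as "bootstrap, any broker will do" is an assumption made for the example.

```python
# Hypothetical sketch of SNI-based routing inside a Gateway.
LB_HOST = "loadbalancer"

# Subdomain label -> real broker address.
BROKERS = {
    "broker1": "broker1:9092",
    "broker2": "broker2:9092",
    "broker3": "broker3:9092",
}


def advertised_brokers(port=9092):
    """Addresses returned in the Metadata response: one subdomain per broker."""
    return [f"{label}.{LB_HOST}:{port}" for label in sorted(BROKERS)]


def broker_for_sni(server_name):
    """Map the hostname from the TLS handshake to a broker.

    Returns None for the bare LB hostname (assumed to be a bootstrap
    connection that any broker can serve).
    """
    if server_name == LB_HOST:
        return None
    suffix = "." + LB_HOST
    if not server_name.endswith(suffix):
        raise ValueError(f"unexpected SNI hostname: {server_name}")
    label = server_name[: -len(suffix)]
    if label not in BROKERS:
        raise ValueError(f"no broker mapped to: {server_name}")
    return BROKERS[label]


print(advertised_brokers())
print(broker_for_sni("broker2.loadbalancer"))
```

In a Python TLS server, the requested hostname would typically be obtained through `ssl.SSLContext.sni_callback`; the sketch starts from the point where that name is already known. All Gateways listen on the same single port, and the hostname alone carries the routing decision.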

Pros:

  • Single port needed: compared to above, this deployment tremendously simplifies operations and security by only requiring a single port for each Conduktor Gateway instance, making it easier to manage and secure the network connections

Cons:

  • TLS support required: SNI routing is only supported for TLS, meaning that all connections between clients, gateways, and the Load Balancer must be encrypted using TLS. This is not a massive disadvantage to be honest: more security is always appreciated!

  • Managing certificates might introduce some operational overhead (you don't want expired certificates in production!)

  • Load balancer aversion

Summary

Conduktor Gateway is versatile when it comes to its deployment. After all, it's a 'simple' application following the classic rules: run multiple instances of it for high-availability and resiliency purposes. The only trick is related to the port management, which depends on the size of your Kafka clusters.

We recommend SNI for organizations that want the highest level of security with the fewest operational constraints. Let's use modern features to simplify our lives!