We would like to store data in Kafka using exactly-once
semantics in order to avoid message duplication.
The producer is configured with the following properties:
spring.kafka.producer.properties.acks=all
spring.kafka.producer.properties.enable.idempotence=true
Kafka topic description:
Topic: topicName  PartitionCount: 1  ReplicationFactor: 1  Configs:
    Topic: topicName  Partition: 0  Leader: 1  Replicas: 1  Isr: 1
Integration test:
@Test
void exactlyOnceTest() {
    kafkaTemplate.send("topicName", "key", "data");
    kafkaTemplate.send("topicName", "key", "data");
    kafkaTemplate.send("topicName", "key", "data");
}
Our expectation is that only one message should be stored in Kafka, but the actual result is 3 messages.
How can I make exactly-once semantics work with Kafka?
What is missing in my configuration?
Answer
Exactly-once semantics does not work that way.
The idempotent producer is there to avoid duplicate or out-of-order records when a send fails partway through and the producer retries it.
Consider the following scenario: you send a message to a topic and your producer client waits for an acknowledgment from the broker. The message is written to Kafka, but a network error occurs and the acknowledgment never reaches the producer client. The producer then retries internally, so the same message is sent to the broker again.
If you did not enable idempotence, the broker writes the message a second time and sends you an acknowledgment, so you get duplicate messages in the topic.
If you enabled idempotence, the broker recognizes that this is a producer retry of a message already written to the topic and just sends the acknowledgment, so there is no duplicate in the topic.
In your test you simply produce 3 messages that happen to have the same key and value. They are 3 independent sends, not retries of one send, so you end up with 3 messages in the topic.
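To make the difference between a retry and a separate send concrete, here is a minimal sketch in plain Java. It is a toy model, not Kafka's actual broker code: SimulatedPartition, its append method, and the producer id 42 are hypothetical names made up for illustration. It only assumes the general idea that an idempotent producer tags each record with a producer id and an increasing sequence number, and that the broker skips a record whose sequence number it has already appended.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy stand-in for one topic partition on a broker (hypothetical, simplified).
class SimulatedPartition {
    private final List<String> log = new ArrayList<>();
    // Highest sequence number appended so far, per producer id.
    private final Map<Long, Integer> lastSequence = new HashMap<>();
    private final boolean idempotenceEnabled;

    SimulatedPartition(boolean idempotenceEnabled) {
        this.idempotenceEnabled = idempotenceEnabled;
    }

    // Returns true if the record was appended, false if it was recognized
    // as a retry and only acknowledged.
    boolean append(long producerId, int sequence, String value) {
        if (idempotenceEnabled) {
            int last = lastSequence.getOrDefault(producerId, -1);
            if (sequence <= last) {
                return false; // already written: acknowledge without appending
            }
            lastSequence.put(producerId, sequence);
        }
        log.add(value);
        return true;
    }

    int size() {
        return log.size();
    }
}

class IdempotenceDemo {
    public static void main(String[] args) {
        SimulatedPartition partition = new SimulatedPartition(true);
        long producerId = 42L; // assumed id, assigned by the broker in real Kafka

        // One logical send whose acknowledgment was lost: the retry reuses
        // the same sequence number, so it is not written twice.
        partition.append(producerId, 0, "data");
        partition.append(producerId, 0, "data");
        System.out.println(partition.size()); // 1

        // Three separate send() calls: each one gets a new sequence number,
        // so all three are written even though key and value are identical.
        partition.append(producerId, 1, "data");
        partition.append(producerId, 2, "data");
        partition.append(producerId, 3, "data");
        System.out.println(partition.size()); // 4
    }
}
```

The deduplication keys on the sequence number, never on the message content, which is why three identical payloads sent by three calls all land in the topic.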
For your information, the Apache Kafka project tests everything it adds very thoroughly in order to avoid breaking changes, and it is very stable. You can see how the idempotent producer feature was tested at this link