version conflict, current version [2] is different than the one provided [1]

I have a Kafka topic and a Spark application. The Spark application reads data from the Kafka topic, pre-aggregates it, and stores it in Elasticsearch. Sounds simple, right?

Everything works as expected, but the minute I set the “spark.cores” property to something other than 1, I start getting

version conflict, current version [2] is different than the one provided [1]

After researching a bit, I think the error occurs because multiple cores can hold the same document at the same time; when one core finishes its part of the aggregation and tries to write back to the document, it hits this error.

TBH, I am a bit surprised by this behaviour because I thought Spark and ES would handle this on their own. This leads me to believe that maybe there is something wrong with my approach.

How can I fix this? Is there some sort of “synchronized” or “lock” concept that I need to follow?

Cheers!


Answer

I would like to answer my own question. In my use case, I was updating a document counter, so all I had to do was retry whenever a conflict arose, because I only needed to aggregate my counter.

My use case was roughly this:

For many uses of partial update, it doesn’t matter that a document has been changed. For instance, if two processes are both incrementing the page-view counter, it doesn’t matter in which order it happens; if a conflict occurs, the only thing we need to do is reattempt the update.

This can be done automatically by setting the retry_on_conflict parameter to the number of times that update should retry before failing; it defaults to 0.
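To make that concrete, here is a minimal sketch of a counter increment at the update-API level, using the Java high-level REST client from Scala. The page-stats index, page-42 id, and views field are made-up names for the example, and your client version may expose a slightly different API:

```scala
import org.apache.http.HttpHost
import org.elasticsearch.action.update.UpdateRequest
import org.elasticsearch.client.{RequestOptions, RestClient, RestHighLevelClient}
import org.elasticsearch.script.Script

object CounterUpdateSketch {
  def main(args: Array[String]): Unit = {
    val client = new RestHighLevelClient(
      RestClient.builder(new HttpHost("localhost", 9200, "http")))

    // Increment the page-view counter with a scripted partial update.
    // retryOnConflict(3) tells Elasticsearch to re-fetch the document and
    // retry the update up to 3 times on a version conflict instead of failing.
    val request = new UpdateRequest("page-stats", "page-42")
      .script(new Script("ctx._source.views += 1"))
      .retryOnConflict(3)

    client.update(request, RequestOptions.DEFAULT)
    client.close()
  }
}
```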

Thanks to Willis and this blog, I was able to configure the Elasticsearch settings, and now I am not having any problems at all.
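On the Spark side, assuming the job writes through the elasticsearch-hadoop (elasticsearch-spark) connector, the same knob is exposed as the es.update.retry.on.conflict setting. A minimal sketch of how the write could be configured (the page-stats index and the page_id column are placeholders):

```scala
import org.apache.spark.sql.{DataFrame, SaveMode}

// Sketch only: assumes a pre-aggregated DataFrame with a "page_id" key column.
def writeCounters(preAggregated: DataFrame): Unit = {
  preAggregated.write
    .format("org.elasticsearch.spark.sql")
    .option("es.nodes", "localhost:9200")
    .option("es.mapping.id", "page_id")           // route each row to an existing document _id
    .option("es.write.operation", "upsert")       // update existing documents, create missing ones
    .option("es.update.retry.on.conflict", "3")   // retry version conflicts instead of failing the task
    .mode(SaveMode.Append)
    .save("page-stats")
}
```

With upsert writes plus a non-zero retry count, concurrent tasks that land on the same document no longer abort the job; the conflicting update is simply retried.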
