I’ve built an Avro schema and stored it in the Confluent Schema Registry (version 5.5.1) running locally. However, the schema actually being used to serialize the record seems to be different from the one I expect. The schema definition is several hundred lines long, so I’m sharing a very pared-down version here that reflects how it is structured:
[ { "name": "AddressBase", "type": "record", "namespace": "com.namespace", "fields": [ { "name": "line1", "type": "string" } ] }, { "name": "Address", "type": "record", "namespace": "com.namespace", "fields": [ { "name": "addressBase", "type": "AddressBase" } ] }, { "name": "SchemaName", "fields": [ { "name": "agency", "type": { "fields": [ { "name": "code", "type": "string" }, { "name": "name", "type": "string" }, { "name": "currentMailingAddress", "type": "Address" } ], "name": "Agency", "type": "record" } } ], "namespace": "com.namespace", "type": "record" } ]
Here are the steps I’ve taken to reproduce the problem:
1. Saved the schema in the Schema Registry; this became version 2 of the schema for the topic’s subject.
2. Built the local class files from that same schema.
3. Created a POJO with the appropriate values populated.
4. Ran the producer to store the serialized object on Kafka, with auto.register.schemas set to “false”.
5. Received a “schema not found” error (truncated):

   ```
   Caused by: io.confluent.kafka.schemaregistry.client.rest.exceptions.RestClientException: Schema not found; error code: 40403
   ```

6. Set auto.register.schemas to “true”.
7. Ran the producer again, serializing a new record.
8. This time the message was stored successfully, but a new schema version was registered, and the subject is now on version 3 (see the sketch for inspecting the subject’s versions after this list).
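To confirm what the registry actually held at each step, the subject’s registered versions can be checked programmatically. This is a minimal sketch, assuming the default TopicNameStrategy and a placeholder topic name (my-topic); it uses the CachedSchemaRegistryClient that ships with the Confluent serializers:

```java
import io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient;
import io.confluent.kafka.schemaregistry.client.SchemaMetadata;

public class InspectSubject {
    public static void main(String[] args) throws Exception {
        CachedSchemaRegistryClient client =
                new CachedSchemaRegistryClient("http://localhost:8081", 100);

        // With the default TopicNameStrategy, the value subject is "<topic>-value".
        String subject = "my-topic-value";

        // Lists the versions registered so far (e.g. ending in 2 before the
        // auto-registration in step 6, and in 3 afterwards).
        System.out.println("Versions: " + client.getAllVersions(subject));

        // Compare this against the schema the generated class reports.
        SchemaMetadata latest = client.getLatestSchemaMetadata(subject);
        System.out.println("Latest (v" + latest.getVersion() + "): " + latest.getSchema());
    }
}
```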
I’ve searched quite a bit but am at a loss as to why this happens. Is there anything in particular I should double-check in my schema definition that could cause this behavior?
Answer
After working through this issue some more, I discovered that the following two serializer settings are needed for this to work properly:
```
auto.register.schemas=false
use.latest.version=true
```
It’s that second setting that takes care of matching the record being written, whose schema has the nested types expanded inline, with the appropriate registered schema: instead of looking up the schema derived from the POJO, the serializer uses the latest version registered under the subject. After adding the version flag to my configuration (shown in the sketch below), I was able to test and verify that this now works properly. More information is available on the linked page.
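For reference, here is a minimal sketch of a producer configuration with both settings applied (the bootstrap server, registry URL, and class name are placeholders, not from my actual setup):

```java
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.StringSerializer;

import io.confluent.kafka.serializers.KafkaAvroSerializer;

public class ProducerConfigSketch {
    public static KafkaProducer<String, Object> buildProducer() {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, KafkaAvroSerializer.class);
        props.put("schema.registry.url", "http://localhost:8081");

        // The two settings that resolved the issue: don't try to register the
        // POJO-derived schema, and serialize against the latest version
        // already registered under the subject instead.
        props.put("auto.register.schemas", false);
        props.put("use.latest.version", true);

        return new KafkaProducer<>(props);
    }
}
```

With auto.register.schemas left at its default of true, the serializer registers whatever schema it derives from the POJO, which is what produced the unexpected version 3 in the steps above.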
Additionally, as suggested in the comments, writing the schema in Avro IDL and then generating the schema files with the Avro Tools appears to get around having to set those two configuration values, since the generated schema can be used for serdes as-is (a sketch of the IDL equivalent follows). It is this approach that has worked best for me so far.
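For illustration, here is roughly what the pared-down schema from the question might look like in IDL (the protocol name AgencyProtocol is arbitrary; it only serves as a container for the record definitions):

```
@namespace("com.namespace")
protocol AgencyProtocol {
  record AddressBase {
    string line1;
  }

  record Address {
    AddressBase addressBase;
  }

  record Agency {
    string code;
    string name;
    Address currentMailingAddress;
  }

  record SchemaName {
    Agency agency;
  }
}
```

Running the idl2schemata command from the Avro Tools (java -jar avro-tools.jar idl2schemata agency.avdl out/) emits one .avsc file per record, with referenced types expanded inline at first use, which matches the shape the serializer derives from the generated classes.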