Q: What is Schema Registry and why is Avro commonly used with Kafka?
Answer:
The Problem: Schema Evolution
In a microservices architecture, producers and consumers are developed by different teams and deployed at different times. What happens when the producer changes the message format (adds a field, renames one, changes a type)?
Without schema management, the consumer breaks because it can't deserialize the new format.
Schema Registry
The Confluent Schema Registry is a centralized service that stores and manages schemas for Kafka message keys and values. It ensures that producers and consumers agree on the data format.
```
Producer → Schema Registry: "Here's my schema, give me an ID"
Schema Registry → Producer:  "Schema ID: 42"
Producer → Kafka:            [Schema ID: 42] + [Serialized Data]
...
Consumer ← Kafka:            [Schema ID: 42] + [Serialized Data]
Consumer → Schema Registry:  "What schema is ID 42?"
Schema Registry → Consumer:  Returns the schema
Consumer:                    Deserializes data using the schema
```
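On the wire, Confluent's serializers implement the "[Schema ID] + [Serialized Data]" framing above as a 5-byte header: a magic byte (0) followed by the schema ID as a 4-byte big-endian integer, then the Avro payload. A minimal sketch of that framing (the `frame_message`/`parse_message` helpers and the payload bytes are illustrative, not a real client API):

```python
import struct

MAGIC_BYTE = 0  # Confluent wire format: 1 magic byte + 4-byte big-endian schema ID


def frame_message(schema_id: int, payload: bytes) -> bytes:
    """Prefix an Avro payload with the Confluent wire-format header."""
    return struct.pack(">bI", MAGIC_BYTE, schema_id) + payload


def parse_message(message: bytes) -> tuple[int, bytes]:
    """Split a framed message back into (schema_id, payload)."""
    magic, schema_id = struct.unpack(">bI", message[:5])
    if magic != MAGIC_BYTE:
        raise ValueError("not Confluent wire format")
    return schema_id, message[5:]


framed = frame_message(42, b"\x06abc")      # hypothetical Avro-encoded bytes
schema_id, payload = parse_message(framed)
print(schema_id, payload)                   # 42 b'\x06abc'
```

This is why the consumer only needs the 4-byte ID to fetch (and cache) the full schema from the registry — the schema itself never travels inside each message.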
Why Avro?
Apache Avro is a binary serialization format that is the dominant choice for Kafka messages. It pairs perfectly with Schema Registry.
Avro schema example:
```json
{
  "type": "record",
  "name": "OrderEvent",
  "namespace": "com.example",
  "fields": [
    {"name": "orderId", "type": "string"},
    {"name": "amount", "type": "double"},
    {"name": "currency", "type": "string", "default": "USD"},
    {"name": "timestamp", "type": "long"}
  ]
}
```
Why not JSON?
| Feature | JSON | Avro | Protobuf |
|---|---|---|---|
| Size | Large (text + keys) | Compact (binary, no keys) | Compact (binary) |
| Schema | None (schema-less) | Required | Required |
| Speed | Slow (parsing text) | Fast (binary) | Fast (binary) |
| Schema evolution | Manual | Built-in | Built-in |
| Human readable | ✅ Yes | ❌ No | ❌ No |
Avro messages are often 50-70% smaller than JSON because the payload carries no field names — only the values, encoded in the order the writer's schema defines. The field names live in the schema, which is fetched from the registry rather than repeated in every message.
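The size difference is easy to demonstrate. The sketch below hand-rolls the Avro binary encoding of the `OrderEvent` record above (zig-zag varints for lengths and longs, 8 little-endian bytes for a double — per the Avro spec) and compares it to the same record as JSON; the `encode_order` helper is illustrative, not a real Avro library call:

```python
import json
import struct


def zigzag_varint(n: int) -> bytes:
    """Avro encodes int/long as zig-zag base-128 varints."""
    z = (n << 1) ^ (n >> 63)
    out = bytearray()
    while True:
        b = z & 0x7F
        z >>= 7
        if z:
            out.append(b | 0x80)
        else:
            out.append(b)
            return bytes(out)


def encode_order(order: dict) -> bytes:
    """Avro binary encoding of OrderEvent: values only, in schema
    field order — no field names on the wire."""
    s = order["orderId"].encode()
    c = order["currency"].encode()
    return (zigzag_varint(len(s)) + s
            + struct.pack("<d", order["amount"])  # Avro double: 8 bytes, little-endian
            + zigzag_varint(len(c)) + c
            + zigzag_varint(order["timestamp"]))


order = {"orderId": "A-1001", "amount": 42.5,
         "currency": "USD", "timestamp": 1700000000000}
avro_size = len(encode_order(order))
json_size = len(json.dumps(order).encode())
print(avro_size, json_size)  # 25 vs 84 bytes here — roughly 70% smaller
```

Real producers also add the 5-byte wire-format header, which barely dents the savings.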
Compatibility Modes
Schema Registry enforces compatibility rules when a schema evolves:
| Mode | Rule |
|---|---|
| BACKWARD (default) | New schema can read old data. Allows: deleting fields, adding fields with defaults. |
| FORWARD | Old schema can read new data. Allows: adding fields, deleting fields that have defaults. |
| FULL | Both backward and forward compatible. |
| NONE | No compatibility checks. |
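The heart of the BACKWARD check can be sketched in a few lines: the new schema may read old data only if every field it adds carries a default. This is a simplified illustration (field dicts instead of real Avro schemas; type-promotion rules are ignored), not the registry's actual implementation:

```python
def is_backward_compatible(old_fields: dict, new_fields: dict) -> bool:
    """BACKWARD rule sketch: new schema can read old data iff every
    field added by the new schema has a default value.

    Each argument maps field name -> spec, e.g. {"type": "string"}
    or {"type": "string", "default": "USD"}.
    """
    for name, spec in new_fields.items():
        if name not in old_fields and "default" not in spec:
            return False  # new required field: old records can't supply it
    return True  # deleted fields are fine: the new reader simply skips them


v1 = {"orderId": {"type": "string"}, "amount": {"type": "double"}}
v2 = {**v1, "currency": {"type": "string", "default": "USD"}}   # has a default
v3 = {**v1, "customerId": {"type": "string"}}                   # no default

print(is_backward_compatible(v1, v2))  # True  — registry would accept v2
print(is_backward_compatible(v1, v3))  # False — registry would reject v3
```

With BACKWARD mode, the registry runs a check like this on every schema registration and rejects the incompatible version before any producer can ship it.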
Example of a backward-compatible change:
```
// v1
{"name": "orderId", "type": "string"}
{"name": "amount", "type": "double"}

// v2 (backward compatible: the new field has a default)
{"name": "orderId", "type": "string"}
{"name": "amount", "type": "double"}
{"name": "currency", "type": "string", "default": "USD"} // ← NEW
```
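What makes this work at read time is Avro's schema resolution: a consumer on v2 fills any field missing from a v1-written record with the default from its own (reader) schema, and drops fields it doesn't know. A toy illustration of that rule — plain dicts standing in for real Avro decoding:

```python
def read_with_schema(record: dict, reader_fields: dict) -> dict:
    """Sketch of Avro schema resolution: fields missing from the
    writer's record are filled from the reader schema's defaults;
    unknown writer fields are dropped."""
    out = {}
    for name, spec in reader_fields.items():
        if name in record:
            out[name] = record[name]
        elif "default" in spec:
            out[name] = spec["default"]
        else:
            raise ValueError(f"field {name!r} missing and has no default")
    return out


v2_fields = {
    "orderId": {"type": "string"},
    "amount": {"type": "double"},
    "currency": {"type": "string", "default": "USD"},
}
old_record = {"orderId": "A-1001", "amount": 42.5}  # written with v1
print(read_with_schema(old_record, v2_fields))
# {'orderId': 'A-1001', 'amount': 42.5, 'currency': 'USD'}
```

Had `currency` lacked a default, the v2 consumer would fail on every v1 record — exactly the breakage the registry's BACKWARD check exists to prevent.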
> [!TIP]
> In interviews, mentioning Avro + Schema Registry together shows you understand production Kafka. The key insight: Schema Registry acts as a contract between services, preventing breaking changes from deploying to production.