Config Wars - Chapter 2: JSON Schema
Welcome to “Choosing a Schema for Your Robotics Stack!” This series is designed to help scaling robotics teams understand why schemas are useful for their configuration management and navigate which schema is best suited for their stack.
In this blog, we’ll do a deep dive into JSON Schema, a popular choice for schemas in robotics. We’ll discuss its origin, features, and some of its pros and cons.
History / Motivation:
JSON Schema is a standard for describing and validating the structure of JSON data. Think of it as a contract that defines what a valid JSON object looks like, specifying types, required fields, constraints, and structure.
Origins:
JSON Schema was introduced in 2010. At this time, JSON was rapidly becoming the de facto format for APIs. However, it had a growing problem. For all its benefits, there was no standard way to validate its structure.
JSON Schema filled this gap and gave JSON a similar schema definition to XML’s XSD.
Initial Use Case:
The first major use case was API request and response validation. JSON Schema lets developers validate incoming payloads. It was soon adopted in OpenAPI (at the time called Swagger) and became the backbone for API schemas and documentation.
Usage in Practice:
JSON Schema wasn’t originally designed for configs, but since it’s become the default option for validating JSON, it has begun to pop up in config systems.
Outside of that, here are some places we see JSON Schema in use.
OpenAPI: Uses JSON Schema to describe request and response payloads. If you’ve written an OpenAPI spec, you’ve used JSON Schema.
Docker Compose: Uses embedded schemas to validate service definitions and flag misconfigured keys.
AsyncAPI: Builds on JSON Schema to define the structure of events and messages.
Kubernetes Custom Resource Definitions (CRDs): Kubernetes uses an OpenAPI v3 schema, which is based on JSON Schema, to validate resource specs and reject invalid YAML before the resource is applied to the cluster.
Community / Ecosystem:
JSON Schema has one of the most active and mature ecosystems of any schema format. It’s maintained by the JSON Schema organization, with regular drafts and updates. 2020-12 is the current draft.
Tooling:
The community and tooling around JSON Schema are huge. You’ll find production-grade libraries for nearly every language:
JavaScript: ajv
Python: jsonschema
Go: gojsonschema
Rust: jsonschema, schemafy
C++: json-schema-validator
It also supports code generation with tools like:
quicktype: generates idiomatic types for many languages
datamodel-code-generator: for Python, generating Pydantic models
schemars: for Rust via Serde (note that it works in the other direction, generating JSON Schema from Rust types)
Finally, it has editor support for VS Code, IntelliJ, and other editors to power autocompletion, validation, and inline documentation.
Something to be careful about: JSON Schema doesn’t have an official validator or CLI. This means you have to use a third-party library. Please ensure the library is active and well-maintained before committing to it in production!
Another consideration is that YAML is considered a second-class citizen in JSON Schema. While you can apply schemas to YAML (we often do internally, as you see with the examples we share in our blogs), the experience is less robust than using JSON.
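One way to apply a schema to YAML is to parse the YAML into plain Python objects first and validate those. The sketch below assumes the PyYAML and jsonschema packages are installed; the schema, the helper name validate_yaml, and the field values are illustrative only.

```python
import yaml  # PyYAML (third-party)
from jsonschema import ValidationError, validate  # third-party

# Illustrative schema; the field names are assumptions for this example.
SCHEMA = {
    "type": "object",
    "properties": {
        "robot_id": {"type": "string"},
        "enabled": {"type": "boolean"},
    },
    "required": ["robot_id", "enabled"],
}

YAML_TEXT = """
robot_id: robot_07
enabled: true
"""


def validate_yaml(text: str) -> dict:
    """Parse YAML into plain Python objects, then validate them against SCHEMA."""
    data = yaml.safe_load(text)
    validate(instance=data, schema=SCHEMA)  # raises ValidationError on failure
    return data


config = validate_yaml(YAML_TEXT)
```

Errors are reported against the parsed data rather than YAML line numbers, and YAML's implicit typing can surprise you (an unquoted robot_id of 7 parses as an integer and fails the string check). Both are part of why the experience feels less robust than with JSON.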
Feature Evaluation:
Validation:
JSON Schema was initially built for validation, so naturally, it is strong here.
It gives you a declarative way to define rules like:
Required fields
Type checks (string, number, boolean, etc.)
Enum constraints (status must be one of ["idle", "active", "error"])
Min/max for numbers, length limits for arrays
Regex patterns for strings
Nesting for objects-of-objects
Validation happens at runtime via a separate library like ajv or json-schema-validator. This means that there are no compile-time guarantees. You’re responsible for calling the validator while loading the config into your app.
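It’s strong for predictable, bounded config checks; however, it lacks the expressive power for computed defaults or general logic. A conditional rule such as “if motor_type is servo, then gear_ratio must be set” is possible with the if/then/else keywords (draft-07 and later), but such rules get verbose quickly, and nothing in a schema can compute one value from another.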
Code Generation:
JSON Schema makes it easy to generate typed models in your language of choice from your schema.
You can generate:
Python classes with datamodel-code-generator: A codegen tool that converts JSON Schema, OpenAPI, or raw JSON samples into Pydantic model classes.
Rust structs with schemafy: Takes a JSON Schema and generates Rust types that implement Serde traits. Great when you want strongly typed config parsing from schema definitions.
C++ classes with quicktype: Generates idiomatic C++ classes with nlohmann::json integration.
Here’s an example of going from JSON Schema to a Python class using datamodel-code-generator:
JSON Schema
{
    "type": "object",
    "properties": {
        "robot_id": { "type": "string" },
        "enabled": { "type": "boolean" },
        "max_velocity": { "type": "number", "minimum": 0 },
        "sensors": {
            "type": "array",
            "items": { "type": "string" }
        }
    },
    "required": ["robot_id", "enabled"]
}
Here’s the generated Python class:
from typing import List, Optional

from pydantic import BaseModel, Field


class RobotConfig(BaseModel):
    robot_id: str
    enabled: bool
    max_velocity: Optional[float] = Field(default=None, ge=0)
    sensors: List[str] = Field(default_factory=list)
This model:
Enforces types and constraints at runtime
Sets the minimum on max_velocity
Defaults sensors to an empty list if not provided
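A quick sketch of what those behaviors look like in use, assuming Pydantic is installed. The class is repeated so the snippet runs on its own, and the robot ID value is made up.

```python
from typing import List, Optional

from pydantic import BaseModel, Field, ValidationError


class RobotConfig(BaseModel):
    robot_id: str
    enabled: bool
    max_velocity: Optional[float] = Field(default=None, ge=0)
    sensors: List[str] = Field(default_factory=list)


# Omitted optional fields fall back to their defaults.
cfg = RobotConfig(robot_id="robot_07", enabled=True)
print(cfg.sensors)       # []
print(cfg.max_velocity)  # None

# A negative velocity violates the ge=0 constraint and is rejected.
try:
    RobotConfig(robot_id="robot_07", enabled=True, max_velocity=-1.0)
    rejected = False
except ValidationError:
    rejected = True
print(rejected)  # True
```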
Composability / Overrides:
JSON Schema doesn’t handle this well natively. It’s a validation spec, so it was never designed to support overrides or schema extensions.
However, there are some workarounds:
Modular schemas using $ref: lets you break large schemas into smaller parts and reuse definitions. Note that $ref is reference only, so you can’t override or patch fields.
Compose constraints from multiple schemas with allOf, anyOf, or oneOf.
Use Jsonnet or Jinja to merge and perform overrides at build time.
To use something like Jsonnet, you’ll need a shared base config plus specific overrides for your robot, which gives you a workflow like the following:
# Step 1: Compose or override the config
jsonnet robot_07.jsonnet > robot_07.json
# Step 2: Validate the result
jsonschema -i robot_07.json base.schema.json
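For teams that would rather stay in one language, the compose step could also be approximated in Python. This is a minimal sketch, not how Jsonnet works internally: deep_merge, the kd field, and the config values are hypothetical, and the merged result would still be handed to a validator afterwards.

```python
from copy import deepcopy


def deep_merge(base: dict, override: dict) -> dict:
    """Return a new dict where override wins; nested dicts merge recursively."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars and lists are replaced wholesale, not combined.
            merged[key] = deepcopy(value)
    return merged


# Shared base config plus per-robot overrides (made-up values).
BASE = {"enabled": True, "max_velocity": 1.0, "controller": {"kp": 0.05, "kd": 0.0}}
ROBOT_07 = {"robot_id": "robot_07", "max_velocity": 1.5, "controller": {"kp": 0.1}}

robot_07_config = deep_merge(BASE, ROBOT_07)
print(robot_07_config["controller"])  # {'kp': 0.1, 'kd': 0.0}
```

Replacing lists wholesale is a design choice in this sketch; whether lists should be replaced or appended is exactly the kind of policy JSON Schema has no opinion on.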
Overall, composability and overrides are clunky with JSON Schema.
Note: Check out Miru to see how we make overrides easy w/ JSON Schema 🙂
Templating / Logic / Computation:
Since JSON Schema is declarative and static, it cannot compute derived fields, generate values conditionally, or loop through a list to generate nested objects. Its if/then/else keywords only select which validation rules apply; they never produce data.
Any logic must be handled outside of JSON Schema (using a templating engine like Jsonnet, Jinja, or even CUE).
This limits its usefulness where config values are dependent on each other.
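As a sketch of what handling the logic elsewhere can mean in practice, dependent values can be computed in ordinary code before the result is validated. The helper with_derived_fields and the field names wheel_radius_m and max_wheel_rpm are illustrative assumptions.

```python
import math


def with_derived_fields(config: dict) -> dict:
    """Return a copy of config with max_wheel_rpm computed from other fields."""
    out = dict(config)
    # Linear speed v (m/s) on a wheel of radius r (m):
    # revolutions per second = v / (2 * pi * r), then * 60 for rpm.
    revs_per_second = out["max_velocity"] / (2 * math.pi * out["wheel_radius_m"])
    out["max_wheel_rpm"] = round(revs_per_second * 60, 1)
    return out


config = with_derived_fields({"max_velocity": 1.5, "wheel_radius_m": 0.1})
print(config["max_wheel_rpm"])
```

The schema can then check that max_wheel_rpm is a non-negative number, but the relationship between the three fields lives only in this code.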
Self-Documentation / Readability:
For JSON and YAML enthusiasts, JSON Schema is a familiar format.
You can embed metadata directly into the schema:
description: explain what a field is and how it’s used
examples: show valid values or common patterns
default: suggest a fallback value (not enforced)
title: optional short label for fields
Here’s what it looks like:
{
    "type": "object",
    "properties": {
        "kp": {
            "type": "number",
            "description": "Proportional gain for motor controller (unit: N*m/rad)",
            "default": 0.05,
            "examples": [0.05, 0.1]
        }
    }
}
This metadata makes it easy for folks to get up to speed when reading the config for the first time.
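Because default is only an annotation, validators generally won’t fill it in for you. If you want those fallbacks applied, one option is a small helper that copies them in at load time. A sketch that handles top-level properties only; apply_defaults is a hypothetical helper, not part of any library.

```python
from copy import deepcopy

SCHEMA = {
    "type": "object",
    "properties": {
        "kp": {
            "type": "number",
            "description": "Proportional gain for motor controller (unit: N*m/rad)",
            "default": 0.05,
            "examples": [0.05, 0.1],
        }
    },
}


def apply_defaults(schema: dict, config: dict) -> dict:
    """Return a copy of config with missing top-level keys filled from `default`."""
    filled = deepcopy(config)
    for name, subschema in schema.get("properties", {}).items():
        if name not in filled and "default" in subschema:
            filled[name] = deepcopy(subschema["default"])
    return filled


print(apply_defaults(SCHEMA, {}))           # {'kp': 0.05}
print(apply_defaults(SCHEMA, {"kp": 0.2}))  # {'kp': 0.2}
```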
But for readability, JSON Schema can sometimes be a mixed bag:
For simple schemas: readable and easy to follow.
For large schemas: becomes deeply nested, verbose, and hard to write by hand.
Using allOf + $ref + if/then/else can quickly make the file hard to parse.
Summary
JSON Schema is one of the most widely adopted schema languages today. Originally built for API validation, it has since found its way into configuration systems across web, cloud, and now, robotics infrastructure.
It has a long list of strengths: a mature ecosystem, great third-party validation libraries, and broad tooling support across languages.
It does lack some functionality with regard to overrides, templating, and logic. Teams usually pair it with an external templating engine like Jsonnet or Jinja to fill these gaps.
If your team is already using JSON or YAML and wants strong validation and wide compatibility, JSON Schema is a solid choice.
In the next post, we’ll explore CUE, a schema language purpose-built for configs, with strong validation, expressive logic, and native support for overrides. I’ll meet you in Chapter 3!
Config Wars Series Index
Config Wars - Chapter 3: CUE [Coming Soon]