Config Wars - Chapter 4: Protocol Buffers (Protobuf)
Welcome to “Choosing a Schema for Your Robotics Stack!” This series is designed to help scaling robotics teams understand why schemas are useful for their configuration management and navigate which schema is best suited for their stack.
In this blog, we’ll do a deep dive into Protocol Buffers (Protobuf), a data serialization format turned popular choice for schemas in robotics. We’ll discuss its origin, features, and some of its pros and cons.
History / Motivation:
Protobuf was created by Google in the early 2000s. The team needed a compact, fast, and language-agnostic way to serialize structured data. The performance offered by XML and JSON wasn’t cutting it, both in terms of size and speed.
To keep up with their rocket ship growth, Google needed to efficiently pass structured messages across thousands of internal services written in various languages.
Protobuf was created to solve these problems:
Efficient cross-language serialization. They needed to encode and decode structured data across systems and languages (C++, Java, Python, Go, etc.).
Smaller payload sizes. Protobuf produces a binary format that is smaller and faster to parse than XML or JSON.
Schema guarantees. Data schemas are defined in .proto files, allowing for type-safe communication between services.
Forward/backward compatibility. With services constantly changing, schemas need to evolve safely over time. With Protobuf, you can add, rename, or retire fields without breaking old binaries, as long as existing field numbers are never changed or reused.
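The compatibility guarantee comes from the wire format: every field is tagged with its number and a wire type, so a reader can step over fields it doesn’t recognize. Below is a minimal, hand-rolled sketch of that idea, not Protobuf’s actual parser. The names `OldCameraConfig` and `parse_old_camera_config` are hypothetical, and the reader is assumed to know only one int32 field, numbered 2.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Read one base-128 varint starting at `pos`, advancing `pos` past it.
// Returns std::nullopt if the input ends before the varint does.
std::optional<std::uint64_t> read_varint(const Bytes& buf, std::size_t& pos) {
    std::uint64_t result = 0;
    for (int shift = 0; shift < 64 && pos < buf.size(); shift += 7) {
        const std::uint8_t byte = buf[pos++];
        result |= static_cast<std::uint64_t>(byte & 0x7F) << shift;
        if ((byte & 0x80) == 0) return result;  // high bit clear: last byte
    }
    return std::nullopt;
}

// Hypothetical "old" reader that only knows resolution_width (field 2).
struct OldCameraConfig {
    std::int32_t resolution_width = 0;
};

std::optional<OldCameraConfig> parse_old_camera_config(const Bytes& buf) {
    OldCameraConfig cfg;
    std::size_t pos = 0;
    while (pos < buf.size()) {
        const auto key = read_varint(buf, pos);
        if (!key) return std::nullopt;
        const std::uint64_t field_number = *key >> 3;
        const std::uint64_t wire_type = *key & 0x7;

        std::uint64_t skip = 0;  // payload bytes to step over for this field
        if (wire_type == 0) {         // varint
            const auto value = read_varint(buf, pos);
            if (!value) return std::nullopt;
            if (field_number == 2) {
                cfg.resolution_width = static_cast<std::int32_t>(*value);
            }  // any other varint field is unknown to this reader: ignore it
        } else if (wire_type == 1) {  // fixed 64-bit
            skip = 8;
        } else if (wire_type == 5) {  // fixed 32-bit
            skip = 4;
        } else if (wire_type == 2) {  // length-delimited (strings, nested messages)
            const auto length = read_varint(buf, pos);
            if (!length) return std::nullopt;
            skip = *length;
        } else {
            return std::nullopt;      // group wire types are not handled in this sketch
        }
        if (skip > buf.size() - pos) return std::nullopt;  // truncated payload
        pos += static_cast<std::size_t>(skip);
    }
    return cfg;
}
```

A message written by a newer service that added, say, fields 5 and 6 still parses here: the old reader picks out field 2 and skips the rest. This is also why field numbers must never be reused for a different meaning.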
Protobuf wasn’t designed for configuration, but over the years, it has been co-opted for this use case in environments where binary formats and strong typing matter. It’s become popular in the world of embedded systems and robotics. That’s great news for us!
Usage in Practice:
Protobuf operates in the infrastructure layer of many distributed systems:
APIs & microservices: Especially with gRPC, which uses Protobuf as its IDL (Interface Definition Language) and serialization format.
Internal service communication: For large-scale systems, Protobuf is often used to reduce payload size and parsing overhead.
Embedded systems: Robotics, IoT, and anywhere else bandwidth is constrained.
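To make the payload-size point concrete, here is a minimal sketch of how the Protobuf wire format encodes a single integer field: a key (field number plus wire type) followed by a base-128 varint. The helpers `append_varint` and `encode_int32_field` are hypothetical and hand-rolled for illustration; real code would rely on the classes generated by protoc.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Append `value` as a base-128 varint: 7 bits per byte, least-significant
// group first, with the high bit set on every byte except the last.
void append_varint(std::vector<std::uint8_t>& out, std::uint64_t value) {
    while (value >= 0x80) {
        out.push_back(static_cast<std::uint8_t>((value & 0x7F) | 0x80));
        value >>= 7;
    }
    out.push_back(static_cast<std::uint8_t>(value));
}

// Encode one int32 field (hypothetical helper). The key is
// (field_number << 3) | wire_type, where wire type 0 means "varint".
std::vector<std::uint8_t> encode_int32_field(std::uint32_t field_number,
                                             std::int32_t value) {
    // Valid field numbers run from 1 to 2^29 - 1.
    assert(field_number >= 1 && field_number <= 536870911u);
    std::vector<std::uint8_t> out;
    append_varint(out, (static_cast<std::uint64_t>(field_number) << 3) | 0u);
    // Negative int32 values are sign-extended to 64 bits on the wire.
    append_varint(out, static_cast<std::uint64_t>(static_cast<std::int64_t>(value)));
    return out;
}
```

Calling `encode_int32_field(2, 640)` yields three bytes (`0x10 0x80 0x05`), whereas the JSON text `"resolution_width":640` takes 22 bytes. The trade-off is that negative int32 values cost ten payload bytes in this encoding, which is why Protobuf also offers the `sint32` type for fields that are often negative.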
Protobuf wasn’t built for configuration, but many teams use .proto files as config schemas when they:
Want typed configs and are okay with compiling the schema before runtime
Want to control the full pipeline (generating configs at build time and deploying them as binaries)
Community / Ecosystem:
Protobuf is still led by Google and has a mature, production-grade ecosystem. It has been open source since 2008, so you’ll find wrappers, plugins, and tooling for every major language.
Tooling:
IDE Support
VS Code: Extensions for syntax highlighting, autocomplete, and linting.
IntelliJ, GoLand, etc.: Built-in .proto support for major languages.
CLI Tools
protoc: The compiler that turns .proto files into code.
buf: A CLI from Buf (a third-party tool, not part of Protobuf itself) that allows for:
Linting
Breaking change detection
Schema versioning
Dependency management
protovalidate: Adds runtime validation rules. Useful since native Protobuf doesn’t have strong validation.
Feature Evaluation:
Validation Model:
Protobuf’s native validation model is minimal. It enforces types but doesn’t support constraints like:
Min/max values
String length
Regex patterns
Cross-field logic
For real validation, many teams use protovalidate, a library from the Buf team that lets you define rich constraints directly in your .proto files.
Example with protovalidate:
syntax = "proto3";

import "buf/validate/validate.proto";

message JointLimits {
  string joint_name = 1 [(buf.validate.field).string.min_len = 1];
  float min_position = 2;
  float max_position = 3 [(buf.validate.field).float.gt = 0];
  // Ensure max_position > min_position in application logic
}
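This snippet ensures:
joint_name can’t be an empty string.
max_position must be > 0.
The max_position > min_position rule is not captured by these field-level constraints. You can enforce it in application logic, as the comment suggests, or with a cross-field rule written as a CEL expression, which protovalidate also supports.
protovalidate evaluates these rules at runtime and unlocks most of the validation capability of both JSON Schema and CUE.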
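The cross-field check mentioned in the schema comment can be sketched in application code. This is a minimal illustration, not protovalidate’s API: the plain `JointLimits` struct below is a hypothetical stand-in for the protoc-generated class, and `validate` is a made-up helper name.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical plain struct mirroring the JointLimits message; in a real
// project these values would come from the protoc-generated class.
struct JointLimits {
    std::string joint_name;
    float min_position = 0.0f;
    float max_position = 0.0f;
};

// Collect every violated rule instead of stopping at the first one,
// so a bad config can be fixed in a single pass.
std::vector<std::string> validate(const JointLimits& limits) {
    std::vector<std::string> errors;
    if (limits.joint_name.empty()) {
        errors.push_back("joint_name must not be empty");
    }
    // Written as !(a > b) so that NaN values are rejected too.
    if (!(limits.max_position > 0.0f)) {
        errors.push_back("max_position must be > 0");
    }
    if (!(limits.max_position > limits.min_position)) {
        errors.push_back("max_position must be > min_position");
    }
    return errors;
}
```

Running a check like this once at config load time, and refusing to start on any error, keeps invalid limits from ever reaching the control loop.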
Code Generation:
Protobuf has excellent code generation. You define your schema once in a .proto file, and you can easily generate typed, structured code in your language of choice.
Let’s say you have a schema for camera parameters on a robot:
syntax = "proto3";

message CameraConfig {
  float focal_length = 1;
  int32 resolution_width = 2;
  int32 resolution_height = 3;
  bool auto_exposure = 4;
}
You can generate a C++ class using the protoc compiler:
protoc --cpp_out=./generated camera.proto
This generates a camera.pb.h file and a camera.pb.cc file. Here’s a snippet from the former:
// Simplified for readability; the real generated header is much longer.
class CameraConfig : public ::google::protobuf::Message {
 public:
  CameraConfig();
  virtual ~CameraConfig();

  // Getters
  float focal_length() const;
  int32_t resolution_width() const;
  int32_t resolution_height() const;
  bool auto_exposure() const;

  // Setters
  void set_focal_length(float value);
  void set_resolution_width(int32_t value);
  void set_resolution_height(int32_t value);
  void set_auto_exposure(bool value);

  // Serialization (inherited from the Message base classes)
  bool SerializeToString(std::string* output) const;
  bool ParseFromString(const std::string& data);
  // ...
};
This gives you:
A single source of truth for shared config structures
Type-safe config access across firmware, host software, and cloud services
Compile-time breakage detection. If a field is removed or renamed, you get a compile-time error in every place it’s used
Protobuf’s codegen is the best in the business.
Composability / Overrides:
Protobuf is not naturally composable in the way that CUE is. It’s more of a strict schema definition than a configuration language.
That said, you can build some modularity using nested messages:
message Resolution {
  int32 width = 1;
  int32 height = 2;
}

message CameraConfig {
  Resolution resolution = 1;
  float focal_length = 2;
}
This lets you define reusable components and assemble them into larger schemas.
On the override side, oneof can give you some basic functionality by letting a message carry at most one of several alternative fields.
However, you can’t do hierarchical overrides (base config → fleet config → individual robot config) in the way that you might want to do once you have a fleet of production robots. You also won’t be able to implement any conditional logic.
To get around this, teams either build their override logic into their application code, or use Protobuf strictly for static configs, and use another schema language for more dynamic, runtime configs.
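As a rough illustration of the first workaround, here is a minimal sketch of layered overrides in application code. Everything in it is an assumption made for the example: `CameraOverrides` is a hypothetical struct in which every field is optional, and `apply_overrides` is a made-up helper, not part of Protobuf.

```cpp
#include <cassert>
#include <optional>

// Hypothetical config layer: every field is optional so that a layer can
// say "no opinion" and let a lower-priority layer's value show through.
struct CameraOverrides {
    std::optional<float> focal_length;
    std::optional<int> resolution_width;
    std::optional<int> resolution_height;
    std::optional<bool> auto_exposure;
};

// Return `base` with every field that `overlay` sets replaced by the
// overlay's value. has_value() is spelled out so that an override of
// `false` is not mistaken for "unset".
CameraOverrides apply_overrides(CameraOverrides base, const CameraOverrides& overlay) {
    if (overlay.focal_length.has_value()) base.focal_length = overlay.focal_length;
    if (overlay.resolution_width.has_value()) base.resolution_width = overlay.resolution_width;
    if (overlay.resolution_height.has_value()) base.resolution_height = overlay.resolution_height;
    if (overlay.auto_exposure.has_value()) base.auto_exposure = overlay.auto_exposure;
    return base;
}
```

The base → fleet → robot chain then becomes `apply_overrides(apply_overrides(base, fleet), robot)`, with later layers winning. Generated C++ messages also expose a `MergeFrom` method that overwrites singular fields set in the source message; whether it fits depends on how your schema tracks field presence, so treat that as something to verify against the Protobuf documentation rather than a drop-in replacement.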
Templating / Logic / Computation:
Protobuf has no support for:
logic, conditionals, or computed fields.
templating, variables, or macros.
expressing constraints like “field A must equal field B + 1”.
Which means that it’s very poor at templating!
The schema is static and must be compiled ahead of time. If you want something more dynamic, you’ll need to pair it with a templating engine like Jinja to generate config files that conform to your .proto schema.
Self-Documentation / Human Readability:
Protobuf schemas are readable if you know the syntax, but they aren’t designed to double as documentation. Comments are supported, but there’s no built-in support for markdown, and structured metadata or annotations require defining custom options.
// Configuration for the main robot arm camera
message CameraConfig {
  // Effective focal length in millimeters
  float focal_length = 1;
  // Image width in pixels
  int32 resolution_width = 2;
  // Image height in pixels
  int32 resolution_height = 3;
  // Whether to enable automatic exposure control
  bool auto_exposure = 4;
}
You can describe each field, but the schema can’t explain itself beyond these comments. Of course, once you serialize it into a binary blob, it is certainly not human readable.
Summary:
Protobuf is a high-performance, cross-language serialization format that’s been widely adopted in infrastructure, embedded systems, and now, robotics.
It offers best-in-class code generation and excellent support for static typing and schema evolution. With plugins like protovalidate, you can also add runtime validation to match the flexibility of other schema languages.
The glaring downside with Protobuf is that it wasn’t designed for configuration. It lacks native support for overrides, templating, and logic. You’ll need to handle inheritance and composability manually or with external tooling.
If you need compact, versioned, type-safe configs that compile down to a binary and integrate with firmware or message-passing systems, Protobuf is a strong choice.
In the next post, we’ll wrap up the series with takeaways. We’ll compare these schema languages side by side, cover some honorable mentions, and share some insight on which language we chose to support first at Miru!
It’s going to be a fun one. See you in Chapter 5.
Config Wars Series Index
Config Wars - Chapter 5: What to Choose? [Coming Soon]