Dozer takes an opinionated, horizontal approach that cuts across several product categories. It provides modules and functionality comparable to streaming databases, caches, search engines, and API generation tools.


Key Entities


A Connection describes a connection to a single data store. One Connection can have multiple Sources. Typically, you describe the connection details and credentials in the configuration section.

Connectors are implemented in the dozer-ingestion module.


Each Source essentially describes one unique table with a name and schema.
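The relationship between a Connection and its Sources can be sketched as a small data model. This is only an illustration of the one-to-many relationship described above: the class names, fields, and the `add_source` helper are assumptions made for the example and are not Dozer's configuration schema or API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Source:
    """One table exposed by a connection: a unique name plus a schema."""
    name: str
    schema: Dict[str, str]  # column name -> type name (illustrative types)


@dataclass
class Connection:
    """Connection details for a single data store, shared by its sources."""
    name: str
    config: Dict[str, str]  # hypothetical keys; real credentials live in config
    sources: List[Source] = field(default_factory=list)

    def add_source(self, source: Source) -> None:
        # Source names must be unique so each one maps to exactly one table.
        if any(existing.name == source.name for existing in self.sources):
            raise ValueError(f"duplicate source name: {source.name}")
        self.sources.append(source)


# One connection, several sources (tables) read through it.
pg = Connection(name="pg_main", config={"host": "localhost", "port": "5432"})
pg.add_source(Source("users", {"id": "int", "email": "string"}))
pg.add_source(Source("orders", {"id": "int", "user_id": "int", "total": "decimal"}))
print([source.name for source in pg.sources])
```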


Each Endpoint describes one API endpoint that is deployed when Dozer is running. See the configuration reference for details.

Every Endpoint attaches REST and gRPC API routes to a Cache Reader instance. Every Endpoint also creates a Sink in the pipeline, where a Cache Writer is initialized.
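A minimal sketch of this wiring, assuming a plain dictionary stands in for the storage layer: the Endpoint creates one reader for the API routes and one Sink holding the writer, both backed by the same store. All class names, method names, and route strings here are hypothetical and chosen only to illustrate the read and write paths.

```python
class CacheWriter:
    """Write side of the cache; owned by the pipeline sink."""

    def __init__(self, store):
        self._store = store

    def put(self, key, record):
        self._store[key] = record


class CacheReader:
    """Read side of the cache; used by the API routes."""

    def __init__(self, store):
        self._store = store

    def get(self, key):
        return self._store.get(key)


class Sink:
    """Terminal pipeline node that forwards records to a cache writer."""

    def __init__(self, writer):
        self._writer = writer

    def process(self, key, record):
        self._writer.put(key, record)


class Endpoint:
    def __init__(self, name, path):
        self.name = name
        self.path = path
        store = {}  # stand-in for the storage layer
        self.reader = CacheReader(store)
        self.sink = Sink(CacheWriter(store))

    def routes(self):
        # Hypothetical route table: both protocols resolve to the same reader.
        return {
            "rest": (f"GET {self.path}/{{key}}", self.reader.get),
            "grpc": (f"{self.name}.Get", self.reader.get),
        }


users = Endpoint(name="users", path="/users")
users.sink.process(1, {"id": 1, "email": "a@example.com"})  # pipeline side
rest_route, rest_handler = users.routes()["rest"]           # API side
print(rest_route, rest_handler(1))
```

Keeping the writer behind the Sink and the reader behind the routes means the API never mutates cached data; only the pipeline does.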


Dozer instantiates a data pipeline, which is essentially a DAG (directed acyclic graph). The pipeline contains sources, processors and sinks.

  • Every source explained above acts as a pipeline source.
  • SQL is transformed into a collection of several processors.
  • A Sink is initialized for each Endpoint.

Pipeline and DAG construction is implemented under dozer-core.
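To make the DAG idea concrete, the sketch below models a pipeline as a mapping from each node to the upstream nodes it depends on, then derives a valid execution order with Python's standard `graphlib`. The node names and the `execution_order` helper are invented for the example; this is not how dozer-core represents or schedules its graph.

```python
from graphlib import CycleError, TopologicalSorter


def execution_order(dag):
    """Return the nodes in an order where every node follows its inputs."""
    try:
        return list(TopologicalSorter(dag).static_order())
    except CycleError as err:
        # A cycle means the graph is not a DAG and cannot be run as a pipeline.
        raise ValueError("pipeline is not a DAG") from err


# node -> set of upstream nodes it consumes from (hypothetical pipeline)
pipeline = {
    "source:users": set(),
    "source:orders": set(),
    "processor:join": {"source:users", "source:orders"},  # derived from SQL
    "processor:aggregate": {"processor:join"},            # derived from SQL
    "sink:user_totals": {"processor:aggregate"},          # one per Endpoint
}

order = execution_order(pipeline)
print(order)
```

Sources always come before the processors that read them, and the sink comes last because everything else feeds into it.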


The cache interface exposes methods to insert, update, delete and query records. The cache also creates secondary and full-text indexes for fast lookups and queries. A Cache Writer is initialized within a Sink, and data is processed and committed in bulk as part of the pipeline. Both the REST and gRPC API servers initialize Cache Readers and interact with data stored in the storage layer.

Cache Reader also supports authorization based on properties.

This is implemented under dozer-cache.
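The following is a simplified in-memory sketch of the ideas above: a cache with insert, update, delete and query operations, a secondary index per chosen field, and a writer that buffers operations and commits them in bulk. The `Cache` and `CacheWriter` classes, their method signatures, and the `batch_size` parameter are assumptions for illustration; they do not mirror the dozer-cache interface, and full-text indexing is left out.

```python
from collections import defaultdict


class Cache:
    def __init__(self, indexed_fields):
        self._records = {}
        # One secondary index per field: field value -> set of primary keys.
        self._indexes = {name: defaultdict(set) for name in indexed_fields}

    def _index(self, key, record):
        for name, index in self._indexes.items():
            if name in record:
                index[record[name]].add(key)

    def _unindex(self, key, record):
        for name, index in self._indexes.items():
            if name in record:
                index[record[name]].discard(key)

    def insert(self, key, record):
        if key in self._records:
            raise KeyError(f"duplicate key: {key}")
        self._records[key] = dict(record)
        self._index(key, record)

    def update(self, key, record):
        self._unindex(key, self._records[key])
        self._records[key] = dict(record)
        self._index(key, record)

    def delete(self, key):
        self._unindex(key, self._records.pop(key))

    def query(self, field, value):
        if field in self._indexes:  # fast path: secondary index lookup
            keys = self._indexes[field].get(value, set())
            return [self._records[key] for key in sorted(keys)]
        # Slow path: full scan when the field is not indexed.
        return [r for _, r in sorted(self._records.items()) if r.get(field) == value]


class CacheWriter:
    """Buffers operations and applies them to the cache in batches."""

    def __init__(self, cache, batch_size=2):
        self._cache = cache
        self._batch_size = batch_size
        self._pending = []

    def write(self, op, key, record=None):
        self._pending.append((op, key, record))
        if len(self._pending) >= self._batch_size:
            self.commit()

    def commit(self):
        for op, key, record in self._pending:
            if op == "delete":
                self._cache.delete(key)
            else:  # "insert" or "update"
                getattr(self._cache, op)(key, record)
        self._pending.clear()


cache = Cache(indexed_fields=["city"])
writer = CacheWriter(cache, batch_size=2)
writer.write("insert", 1, {"name": "Ada", "city": "London"})   # buffered
writer.write("insert", 2, {"name": "Alan", "city": "London"})  # batch full, commits
writer.write("update", 1, {"name": "Ada", "city": "Paris"})    # buffered
writer.commit()                                                # flush the rest
print(cache.query("city", "London"))
```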


JWT tokens with narrowed-down permissions to access data can be generated using APIs. Permissions can be scoped per Endpoint or even based on document properties.