Welcome to this week's update on Dozer! We are excited to share the latest developments and the progress we have made. Here are the updates for this week.
Release v0.1.13
Dozer v0.1.13 is available. Check out the release notes here.
Insert and Update Conflict Resolution #1267
Dozer now supports conflict resolution while writing data to the sinks. Depending on the type of data, developers can control how the app behaves when a conflict occurs, for example by choosing a stricter option when consistency and accuracy matter far more than speed and estimates.
endpoints:
  - name: data_api
    conflict_resolution:
      # Options: nothing | update | panic
      on_insert: update
      # Options: nothing | upsert | panic
      on_update: upsert
      # Options: nothing | panic
      on_delete: nothing
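To make the options concrete, here is a minimal sketch of how such policies could behave against a keyed in-memory store. The names (`OnInsert`, `OnUpdate`, `OnDelete`, `Sink`, `Conflict`) are hypothetical stand-ins, not Dozer's internal types. The sketch assumes that an insert conflicts when the key already exists, that an update or delete conflicts when the key is missing, and it models `panic` as a returned error rather than an abort.

```rust
use std::collections::HashMap;

// Hypothetical policy enums mirroring the YAML options; not Dozer's real types.
#[derive(Clone, Copy, Debug, PartialEq)]
enum OnInsert { Nothing, Update, Panic }
#[derive(Clone, Copy, Debug, PartialEq)]
enum OnUpdate { Nothing, Upsert, Panic }
#[derive(Clone, Copy, Debug, PartialEq)]
enum OnDelete { Nothing, Panic }

// Error returned where the `panic` option is selected (assumption: no real abort).
#[derive(Debug, PartialEq)]
struct Conflict(String);

struct Sink {
    rows: HashMap<u64, String>,
    on_insert: OnInsert,
    on_update: OnUpdate,
    on_delete: OnDelete,
}

impl Sink {
    fn new(on_insert: OnInsert, on_update: OnUpdate, on_delete: OnDelete) -> Self {
        Sink { rows: HashMap::new(), on_insert, on_update, on_delete }
    }

    // Assumed conflict: the key already exists.
    fn insert(&mut self, key: u64, value: &str) -> Result<(), Conflict> {
        if self.rows.contains_key(&key) {
            match self.on_insert {
                OnInsert::Nothing => return Ok(()), // keep the existing row
                OnInsert::Update => {}              // fall through and overwrite
                OnInsert::Panic => {
                    return Err(Conflict(format!("insert: key {} exists", key)))
                }
            }
        }
        self.rows.insert(key, value.to_string());
        Ok(())
    }

    // Assumed conflict: the key does not exist.
    fn update(&mut self, key: u64, value: &str) -> Result<(), Conflict> {
        if !self.rows.contains_key(&key) {
            match self.on_update {
                OnUpdate::Nothing => return Ok(()), // drop the update
                OnUpdate::Upsert => {}              // fall through and insert
                OnUpdate::Panic => {
                    return Err(Conflict(format!("update: key {} missing", key)))
                }
            }
        }
        self.rows.insert(key, value.to_string());
        Ok(())
    }

    // Assumed conflict: the key does not exist.
    fn delete(&mut self, key: u64) -> Result<(), Conflict> {
        if self.rows.remove(&key).is_none() && self.on_delete == OnDelete::Panic {
            return Err(Conflict(format!("delete: key {} missing", key)));
        }
        Ok(())
    }
}
```

With the settings from the config above (`update`, `upsert`, `nothing`), a second insert for the same key overwrites the row, an update for an unknown key creates it, and a delete for an unknown key is silently ignored.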
Parallelized Joins #1180
Performance improved on the order of 4x to 5x.
We have simplified and optimized the Join implementation, which resulted in a significant performance boost. When the query has a single source, the Processor is simply bypassed, since no operation on the record is necessary in this case.
When the SQL contains one or more JOIN operators, one Product Processor is created for each join and the processors are connected.
For example:
SELECT name, department.name as dep, salary
FROM user
JOIN department ON user.department_id = department.id
JOIN country ON user.country_id = country.id;
This query is converted to a pipeline with two Product Processors, one for each JOIN.
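The rule "one Product Processor per JOIN, none for a single source" can be sketched as a small plan builder. This is only an illustration: `Stage`, `plan`, and `product_count` are hypothetical names, and the sketch assumes the processors are chained left to right, which the post does not spell out.

```rust
// Hypothetical plan node; the names are illustrative, not Dozer's internal types.
#[derive(Debug, PartialEq)]
enum Stage {
    // A plain source table, passed through without any join processing.
    Source(String),
    // One product stage per JOIN: combines everything so far (`left`) with `right`.
    Product { left: Box<Stage>, right: String },
}

// Builds a left-deep chain. With no JOIN the source is returned untouched,
// which corresponds to bypassing the processor entirely.
fn plan(from: &str, joins: &[&str]) -> Stage {
    joins
        .iter()
        .fold(Stage::Source(from.to_string()), |left, right| Stage::Product {
            left: Box::new(left),
            right: right.to_string(),
        })
}

// Counts the product stages in a plan.
fn product_count(stage: &Stage) -> usize {
    match stage {
        Stage::Source(_) => 0,
        Stage::Product { left, .. } => 1 + product_count(left),
    }
}
```

For the query above, `plan("user", &["department", "country"])` yields two nested product stages, while `plan("user", &[])` is just the source.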

Testing Strategy
Our focus has been introducing a number of test cases to increase the stability of Dozer.
Data type tests for connectors
Populate an external data source with all the data types the connector supports, and Dozer will automatically check that every conversion works correctly. Put the data-populating code in DataReadyConnectorTest::new and you are done!
pub trait DataReadyConnectorTest: Send + Sized + 'static {
    type Connector: Connector;

    fn new() -> (Self, Self::Connector);
}
For example, the local storage connector implements it like this:
pub struct LocalStorageObjectStoreConnectorTest {
    _temp_dir: TempDir,
}

impl DataReadyConnectorTest for LocalStorageObjectStoreConnectorTest {
    type Connector = ObjectStoreConnector<LocalStorage>;

    fn new() -> (Self, Self::Connector) {
        let record_batch = record_batch_with_all_supported_data_types();
        let (temp_dir, connector) = create_connector("sample".to_string(), &record_batch);
        (
            Self {
                _temp_dir: temp_dir,
            },
            connector,
        )
    }
}
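The benefit of this trait shape is that a check can be written once and reused for every connector fixture. Below is a self-contained sketch of that pattern. The `Connector` trait here is a stand-in reduced to a single hypothetical method (`read_all`), and `MemoryConnector`, `MemoryConnectorTest`, and `run_data_ready_test` are invented for illustration; none of them are Dozer's actual API.

```rust
// Stand-in for Dozer's `Connector` trait, reduced to one hypothetical method.
trait Connector {
    fn read_all(&self) -> Vec<String>;
}

// Same shape as the trait shown above.
trait DataReadyConnectorTest: Send + Sized + 'static {
    type Connector: Connector;

    fn new() -> (Self, Self::Connector);
}

// Hypothetical in-memory connector used only for this sketch.
struct MemoryConnector {
    rows: Vec<String>,
}

impl Connector for MemoryConnector {
    fn read_all(&self) -> Vec<String> {
        self.rows.clone()
    }
}

// Fixture type: it would own any resource that must outlive the test.
struct MemoryConnectorTest;

impl DataReadyConnectorTest for MemoryConnectorTest {
    type Connector = MemoryConnector;

    fn new() -> (Self, Self::Connector) {
        // The data-populating code lives here, as described above.
        let rows = vec!["int: 1".to_string(), "text: a".to_string()];
        (MemoryConnectorTest, MemoryConnector { rows })
    }
}

// A generic check written once and reused for every fixture type.
fn run_data_ready_test<T: DataReadyConnectorTest>() -> usize {
    // Bind the fixture so it stays alive while the connector is read.
    let (_fixture, connector) = T::new();
    connector.read_all().len()
}
```

Returning the fixture alongside the connector presumably serves the same purpose as the `_temp_dir` field in the local storage example: resources such as a temporary directory are kept alive for as long as the test holds the fixture.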
Ingestion tests for connectors
Test whether a connector ingests data as expected. Implement InsertOnlyConnectorTest (and optionally CudConnectorTest) to test the most common connector methods used in Dozer. The test suite simulates a full run of Dozer to make sure the tested connector ingests and outputs data correctly.
For example, PostgresConnectorTest implements CudConnectorTest by executing SQL against the Postgres database.
impl CudConnectorTest for PostgresConnectorTest {
    fn start_cud(&self, operations: Vec<Operation>) {
        ...
        std::thread::spawn(move || {
            for operation in operations {
                client
                    .batch_execute(&operation_to_sql(
                        schema_name.as_deref(),
                        &table_name,
                        &operation,
                        &schema,
                    ))
                    .unwrap();
            }
        });
    }
}
As long as a connector passes this test suite, Dozer can guarantee data integrity when using that connector. The local storage and Postgres connectors have passed the test suite.
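The idea of replaying create, update, and delete operations as SQL statements can be sketched as follows. This is not Dozer's `operation_to_sql`; `Op`, `quote`, and `op_to_sql` are hypothetical, and the sketch assumes a fixed two-column table (`id`, `name`) to stay short.

```rust
// Simplified stand-in for a change operation on a two-column table (id, name).
enum Op {
    Insert { id: i64, name: String },
    Update { id: i64, name: String },
    Delete { id: i64 },
}

// Doubles single quotes so a text value becomes a valid SQL string literal.
fn quote(text: &str) -> String {
    format!("'{}'", text.replace('\'', "''"))
}

// Renders one operation as one SQL statement. The table name is inserted as-is,
// so this is only suitable for trusted, test-controlled input.
fn op_to_sql(table: &str, op: &Op) -> String {
    match op {
        Op::Insert { id, name } => format!(
            "INSERT INTO {} (id, name) VALUES ({}, {});",
            table,
            id,
            quote(name)
        ),
        Op::Update { id, name } => format!(
            "UPDATE {} SET name = {} WHERE id = {};",
            table,
            quote(name),
            id
        ),
        Op::Delete { id } => format!("DELETE FROM {} WHERE id = {};", table, id),
    }
}
```

Each generated statement could then be sent to the database one at a time, in the same order as the operations, as the loop in the Postgres example does.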
Integration Tests for Dozer Samples
We've added an integration test for each of the samples, so they won't break unexpectedly! See SQL integration tests #1282.
Prop Tests
We have complemented our unit tests with a range of prop tests. Read more about prop tests here.
We have included various data type tests using the following approach (#1245):
proptest!(ProptestConfig::with_cases(1000), |(a in ".*", b in ".*")| {
    // Tests
});
Other Improvements & Fixes
Local Storage test #1290: this PR adds the mechanism needed to set up a local storage connector in e2e tests, and adds a new e2e test based on dozer-samples.
DataReadyConnectorTest #1296
Postgres test #1299
Graceful handling of gRPC API errors #1289
Add NY taxi sample to e2e tests #1263
Add Postgres connector sample to e2e tests #1278
Changelog
https://github.com/getdozer/dozer/compare/v0.1.12...v0.1.13
Contact us
- GitHub: https://github.com/getdozer/dozer
- Discord: https://discord.com/invite/3eWXBgJaEQ
- Twitter: https://twitter.com/GetDozer
- LinkedIn: https://www.linkedin.com/company/getdozer/