
This week at Dozer #1


Starting this week, we will share our progress through a weekly blog series. Please reach out to us with any feedback, or if you would like to see anything in particular. This first post is a mega update: we released several features that had been in development.

Dozer can now be installed with Homebrew or from a .deb package.

# Mac
brew tap getdozer/dozer
brew install dozer

# Ubuntu
curl -sLO \
&& sudo dpkg -i dozer-linux-x86_64.deb

Release v.0.1.11

Dozer v.0.1.11 is available. Check out the release notes here.

Additional DateTime operations

Based on feedback, we extended Dozer's date and time capabilities, making it easier to manipulate and work with dates.

Extract date part from date and time #1178

select extract(timezone from last_update) from actor;
select extract(year from last_update) from actor;
select extract(hour from last_update) from actor;
select extract(month from last_update) from actor;
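For readers less familiar with EXTRACT, here is a minimal Python sketch of what each query above is expected to return for a single value. It is not Dozer code: the `extract` helper and the sample `last_update` value are made up for illustration, and treating `timezone` as the UTC offset in seconds follows PostgreSQL's convention, which is an assumption about Dozer's behavior.

```python
from datetime import datetime, timedelta, timezone


def extract(part: str, ts: datetime) -> int:
    """Return one field of a timestamp, mirroring SQL's EXTRACT(part FROM ts)."""
    if part == "timezone":
        # Assumption: UTC offset in seconds (PostgreSQL convention).
        offset = ts.utcoffset()
        return int(offset.total_seconds()) if offset is not None else 0
    fields = {
        "year": ts.year,
        "month": ts.month,
        "day": ts.day,
        "hour": ts.hour,
        "minute": ts.minute,
        "second": ts.second,
    }
    return fields[part]


# Hypothetical sample value for the actor.last_update column.
last_update = datetime(2023, 3, 14, 9, 26, 53,
                       tzinfo=timezone(timedelta(hours=5, minutes=30)))

for part in ("timezone", "year", "hour", "month"):
    print(part, extract(part, last_update))
```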

Timestamp difference #1074

Duration is extracted as i64 when timestamps are subtracted.

select (date1 - date2) from logs;
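As a rough illustration of "duration as an integer", the Python sketch below subtracts two timestamps and converts the result to a whole number of units. The unit Dozer uses for the i64 value is not stated in this post, so the millisecond default here is purely an assumption, and `timestamp_diff` is a hypothetical helper, not a Dozer function.

```python
from datetime import datetime, timedelta


def timestamp_diff(a: datetime, b: datetime,
                   unit: timedelta = timedelta(milliseconds=1)) -> int:
    """Return (a - b) as a whole number of `unit`s (floor division)."""
    # Assumption: milliseconds by default; the unit Dozer uses is not stated here.
    return (a - b) // unit


# Hypothetical values for the date1 and date2 columns.
date1 = datetime(2023, 3, 14, 9, 0, 0)
date2 = datetime(2023, 3, 14, 8, 59, 30)

print(timestamp_diff(date1, date2))  # 30 seconds expressed in milliseconds
```

The result is negative when the first timestamp is earlier than the second.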

Window Capability #1175 Alpha

Now Dozer supports window capabilities as an alpha feature.

Hopping Window

SELECT taxi_id, completed_at, window_start, window_end
FROM HOP('taxi_trips', 'completed_at', '1 MINUTE', '2 MINUTES');

Tumbling Window

SELECT taxi_id, completed_at, window_start, window_end
FROM TUMBLE ('taxi_trips', 'completed_at', '2 MINUTES');

Refer to issue #893 for further information.
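To show how the two window types differ, here is a small Python sketch that computes `window_start` and `window_end` for a single `completed_at` value. It assumes the conventional streaming-SQL semantics: windows are aligned to the Unix epoch, intervals are half-open, and HOP takes the hop size before the window size. Dozer's actual alignment and argument order may differ, and the `tumble` and `hop` functions below are illustrative helpers only.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def tumble(ts: datetime, size: timedelta):
    """Return the single [start, end) window of length `size` that contains ts."""
    start = ts - (ts - EPOCH) % size
    return (start, start + size)


def hop(ts: datetime, hop_size: timedelta, size: timedelta):
    """Return every [start, end) window of length `size`, one starting
    every `hop_size`, that contains ts."""
    # Latest window start at or before ts, then walk backwards one hop at a time.
    start = ts - (ts - EPOCH) % hop_size
    windows = []
    while start + size > ts:
        windows.append((start, start + size))
        start -= hop_size
    return sorted(windows)


# Hypothetical completed_at value for one taxi trip.
completed_at = datetime(2023, 3, 14, 9, 26, 53, tzinfo=timezone.utc)

# Tumbling, 2 minutes: exactly one window, 09:26:00 to 09:28:00.
print(tumble(completed_at, timedelta(minutes=2)))

# Hopping, 1-minute hop and 2-minute size: two overlapping windows,
# 09:25:00 to 09:27:00 and 09:26:00 to 09:28:00.
print(hop(completed_at, timedelta(minutes=1), timedelta(minutes=2)))
```

Under these assumptions, a tumbling window places each row in exactly one window, while a hopping window emits the row once per overlapping window (size divided by hop, here two).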

Performance Improvements

We have been able to increase performance by an order of magnitude and have introduced several enhancements.

  • Pipeline indexes in memory #1084
  • Remove nested txns in LMDB #1084
  • Optimize cache insertion and query when schema is append only #1192

Performance improvements are an ongoing effort, and we will be sharing more in the coming weeks.

Schema Evolution v1 Alpha

Dozer's main use case is to be a very fast cache + API layer, so, just like Redis, data is not retained between restarts. We are currently testing blue/green cache functionality, where API upgrades are seamless with zero downtime.

  • Blue Green Cache #1061

  • Automatic Switch based on number of records #1092
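The general blue/green idea can be sketched in a few lines of Python. This is a toy model, not Dozer's implementation: the `BlueGreenCache` class, its method names, and the switch rule (switch once the standby cache holds at least as many records as the active one) are all assumptions made for illustration.

```python
class BlueGreenCache:
    """Serve reads from the active cache while a replacement is built alongside."""

    def __init__(self):
        self.active = {}      # answers all queries
        self.standby = None   # populated only during an upgrade

    def begin_upgrade(self):
        """Start building a fresh cache without touching the active one."""
        self.standby = {}

    def ingest(self, key, value):
        if self.standby is None:
            self.active[key] = value
            return
        self.standby[key] = value
        # Hypothetical rule: switch once the standby has caught up by record count.
        if len(self.standby) >= len(self.active):
            self.active, self.standby = self.standby, None

    def get(self, key):
        return self.active.get(key)


cache = BlueGreenCache()
cache.ingest("a", 1)
cache.ingest("b", 2)

cache.begin_upgrade()
cache.ingest("a", 10)   # standby has 1 of 2 records; reads still hit the old cache
before_switch = cache.get("a")
cache.ingest("b", 20)   # standby has caught up, so the caches are swapped
after_switch = cache.get("a")
```

Readers never observe a half-built cache: queries are answered from the old data until the swap, which is a single reference assignment.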

Ingest using Arrow Format

The Arrow format is commonly used in data analysis and for cross-language support. Developers can now easily ingest data in Arrow format (e.g. from Polars or Pandas). This is our initial integration with Arrow. We love where the Arrow format is heading, so stay tuned for further updates.

  • Implement arrow format for grpc ingestion #1087

Deltalake Connector Alpha

Dozer now has a connector for Deltalake, built on the deltalake-rs Rust library.

Other Minor Updates

  • Show sources ingestion progress #1079

  • Parallelized ingestion of Postgres snapshot data #1094

What's Next?

  • Robust Data Type Handling & Stability
  • Performance improvements
  • Build various samples to showcase Dozer

New Contributors

We are happy to see growing interest in Dozer. We welcome any contributions and are very thankful to everyone helping our community grow.

  • @readall made their first contribution in #1122
  • @hoangnh93 made their first contribution in #1035

