This week at Dozer #13

· 17 min read
Cahyo Subroto

🎉 Welcome to the new releases of Dozer: v0.1.26, v0.1.27 and v0.1.28! 🎉

We are excited to announce the release of three new versions of Dozer: v0.1.26, v0.1.27, and v0.1.28. These releases include a number of bug fixes, improvements, and new features. Our team has put in tremendous effort to enhance and refine the functionality of Dozer, and we can't wait for you to try the changes. Let's explore the details of the releases:

🚀 Release v0.1.26 to v0.1.28 Highlights 🚀

Allow us to share the thrilling new features and improvements that await you in our latest releases. Here's what's in store for you!

  • Updated dozer-log-python/README.md file (#1647): The README file for the dozer-log-python project has been updated with new information, including instructions on how to install and use the project.

  • Enhanced response for UI render (#1648): This chore improved the response time of the UI by optimizing how responses are rendered: frequently used responses are cached and a more efficient rendering algorithm is used. The goal was a more responsive UI and a better user experience. Note that this change was later reverted in #1664, listed further down.

  • Updated samples table & other sample references (#1652): The samples table in the documentation, which lists the sample projects used to demonstrate Dozer's functionality, was updated. Other sample references in the documentation, tutorials, and blog posts were updated as well.

  • Added data latency metrics (#1650): This feature added metrics that measure the latency of data access. Users can now track the performance of their data access queries and identify areas where performance can be improved.

  • Addition of CASE statement (#1656): This feature added the CASE expression to Dozer SQL. As in standard SQL, CASE checks its conditions in order and returns the result of the first condition that is true, or the ELSE result when none of them match. The general syntax is:

    CASE
    WHEN condition1 THEN result1
    WHEN condition2 THEN result2
    WHEN conditionN THEN resultN
    ELSE result
    END
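
To make the evaluation order concrete, here is a minimal sketch in plain Rust. It is not Dozer's implementation: the `eval_case` and `size_label` helpers are hypothetical names introduced only for illustration. Branches are checked in order, the first true condition wins, and the ELSE value is used when nothing matches.

```rust
/// Evaluates a searched CASE: returns the result of the first branch whose
/// condition is true, or `else_result` when no branch matches.
/// Hypothetical helper for illustration; not part of Dozer.
fn eval_case<T: Clone>(branches: &[(bool, T)], else_result: Option<T>) -> Option<T> {
    branches
        .iter()
        .find(|(condition, _)| *condition)
        .map(|(_, result)| result.clone())
        .or(else_result)
}

/// Mirrors: CASE WHEN amount >= 1000 THEN 'large'
///               WHEN amount >= 100 THEN 'medium'
///               ELSE 'small' END
fn size_label(amount: i64) -> &'static str {
    eval_case(
        &[(amount >= 1000, "large"), (amount >= 100, "medium")],
        Some("small"),
    )
    .unwrap_or("unknown")
}
```

With these definitions, `size_label(250)` picks the second branch because the first condition is false, and a CASE without an ELSE yields `None` (SQL NULL) when no condition holds.
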
  • Added name fn for ProcessorFactory for UIDag in Cloud (#1658): This chore added a getName() function to ProcessorFactory so that the UI DAG in Cloud can obtain the name of each processor factory. The work consisted of adding the function, updating the documentation to reflect it, and testing the change.

  • Supported generating UI graph from Dozer config (#1661): This feature added the ability to generate a UI graph from a Dozer configuration file, so users can visualize the mappings defined in their configuration. The graph is rendered with a graph visualization library.

  • Implementation of append-last watcher (#1562): This feature added a new watcher to Dozer that appends new mappings to the end of the configuration file. The append-last watcher is implemented as a new class in the Dozer codebase that implements the Watcher interface and overrides the onAdded() method.

  • Reverted enhanced response for UI render (#1664): The earlier chore (#1648) was reverted because it was causing unexpected behavior in the UI. The rollback removed both the caching mechanism and the changes to the rendering algorithm.

  • Updated default cloud target URL (#1667): This chore updated the default cloud target URL in the Dozer codebase. The previous default pointed to an old version of the Dozer cloud service; the new default points to the latest version.

  • Get a deterministic Processor id (#1665): The fix ensures that the Processor id is always the same for a given mapping. The id is generated as a UUID. Note that uniqueness alone does not make an id deterministic: a randomly generated UUID differs on every run, so the value has to be derived from the mapping itself.
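
This post does not show how the id is derived. One common way to get a repeatable id is to hash the names that identify the processor. The sketch below does that with a hand-written FNV-1a hash instead of a UUID, purely for illustration; `processor_id` and its parameters are hypothetical and are not Dozer's API.

```rust
/// 64-bit FNV-1a hash: a simple hash with fixed constants and no random seed,
/// so the same bytes always produce the same value.
fn fnv1a_64(bytes: &[u8]) -> u64 {
    const OFFSET_BASIS: u64 = 0xcbf29ce484222325;
    const PRIME: u64 = 0x100000001b3;
    bytes.iter().fold(OFFSET_BASIS, |hash, byte| {
        (hash ^ u64::from(*byte)).wrapping_mul(PRIME)
    })
}

/// Hypothetical: builds a processor id from the names that identify it.
/// The same inputs give the same id on every run.
fn processor_id(connection: &str, processor: &str) -> String {
    let key = format!("{}/{}", connection, processor);
    format!("{}-{:016x}", processor, fnv1a_64(key.as_bytes()))
}
```

Because nothing in the computation depends on time or randomness, calling `processor_id("pg", "agg")` twice returns the same string, while different names give different ids.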

  • Introduce Dozer dev flag (#1668): This chore introduced a new flag, dozer.dev, that enables development features in Dozer. It can be set to true or false. When dozer.dev is set to true, the following features are enabled:

    • Logging of mapping errors: When an error occurs during mapping, a more detailed log message is generated.
    • Printing of mapping statistics: After mapping is complete, a summary of the mapping statistics is printed to the console.
    • Enabling experimental features: Experimental features that are not yet ready for production use are enabled.
  • Index in UI graph (#1671): This fix solved an issue where the index of nodes in the UI graph was not always correct. Node indexes are now generated with a consistent algorithm, so they are correct regardless of the order in which nodes are added to the graph.

  • Send CI env to metadata (#1669): This chore sends the CI environment variables to the metadata file, so users can track the environment in which Dozer was built. The following environment variables are sent:

    • CI_SERVER: The name of the CI server that was used to build Dozer.
    • CI_BUILD_NUMBER: The build number of Dozer.
    • CI_BRANCH: The branch of the Dozer codebase that was built.
    • CI_COMMIT_SHA: The SHA of the commit that was built.
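
As a rough sketch of the idea (not the code from the PR), the variables listed above can be collected into a key/value map before being written out. The function names and the choice of a `BTreeMap` are assumptions made for this example; only the variable names come from the list above.

```rust
use std::collections::BTreeMap;

/// Variable names taken from the list above.
const CI_VARS: [&str; 4] = ["CI_SERVER", "CI_BUILD_NUMBER", "CI_BRANCH", "CI_COMMIT_SHA"];

/// Collects the CI variables that are set, using `lookup` to read each one.
/// Hypothetical helper; taking a closure keeps it testable without touching
/// the real environment.
fn collect_ci_metadata<F>(lookup: F) -> BTreeMap<String, String>
where
    F: Fn(&str) -> Option<String>,
{
    CI_VARS
        .iter()
        .copied()
        .filter_map(|name| lookup(name).map(|value| (name.to_string(), value)))
        .collect()
}

/// Reads the variables from the real process environment.
fn ci_metadata_from_env() -> BTreeMap<String, String> {
    collect_ci_metadata(|name| std::env::var(name).ok())
}
```

Variables that are not set are simply left out of the map, so a local build without a CI server produces empty metadata rather than an error.
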
  • Implement TO_CHAR for Timestamp format (#1655): The TO_CHAR function in Dozer converts a timestamp to a string. The previous implementation supported only a limited number of formats; this change adds support for a wider range of formats.

  • Added error message for wrong Table config format for Parquet and CSV (#1675): This chore added an error message that is returned when the !Table config for a Parquet or CSV file is in the wrong format. Here is the check that produces the message:

    ```rust
    // A table without a config in the new format is rejected with a pointer to the docs.
    if table.config.is_none() {
        return Err(ConnectorError::TableNotFound(format!("{} - configuration for Parquet and CSV is changed since 0.1.26 and to check the documentation", table.name.clone())));
    }
    ```
  • Use rustls for TLS support when connecting to Postgres (#1502): This fix updated the Dozer codebase to use rustls for TLS support when connecting to Postgres. The previous implementation used OpenSSL. rustls is a modern TLS library written in Rust that is known for its security and performance.

  • Fix date conversion when date is more precise (#1682): This fix solved an issue where Dozer did not convert dates correctly when the source date was more precise than the destination date. Dates are now converted with a consistent algorithm, so the result is correct regardless of the precision of the source date.

  • Run CI scripts on merge groups instead of on pushes to main (#1683): This chore changed when CI scripts run: they now run on merge groups, that is, on changes queued for merging, rather than on every push to the main branch. Running on every push was inefficient and increased the number of false positives.

  • Bump openssl from 0.10.48 to 0.10.55 (#1663): This chore updated the version of the openssl dependency used by Dozer. Version 0.10.55 includes a number of security fixes and performance improvements.

  • Append only (#1673): This fix solved an issue where Dozer did not correctly append data to an existing file. Data is now appended correctly regardless of the order in which it is mapped.

  • Group CUD operation in cache/source as single metric (#1684): This chore groups all CUD (Create, Update, Delete) operations in cache and source as a single metric. This makes it easier to track the overall performance of Dozer's cache and source.

  • Avoid include snapshot event in metrics count (#1686): This fix solved an issue where Dozer included snapshot events in the metrics count. That led to inaccurate metrics, because snapshot events are not real data events. A filter now excludes snapshot events from the count.
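
The two metric changes above (one metric for all CUD operations, and no snapshot events in the count) can be pictured with a small sketch. The `Event` enum and `count_cud_operations` function are hypothetical and much simpler than Dozer's real types.

```rust
/// Hypothetical event type for illustration; Dozer's real types differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Event {
    Insert,
    Update,
    Delete,
    /// Marks snapshot progress; it does not represent a data change.
    Snapshot,
}

/// Counts create/update/delete events under one metric and skips snapshot events.
fn count_cud_operations(events: &[Event]) -> u64 {
    events
        .iter()
        .filter(|event| matches!(event, Event::Insert | Event::Update | Event::Delete))
        .count() as u64
}
```

Listing the counted variants explicitly, rather than excluding `Snapshot`, means any event kind added later stays out of the metric until someone decides it belongs there.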

  • Support for syntax function() in SQL (#1679): Support for the function() syntax in SQL allows Dozer to use custom functions in SQL queries. This makes it easier to map data from one database to another, regardless of which functions each database supports.

  • Added id in processor factory to support query metric on UI (#1685): This chore added an id to each processor in the processor factory, which allows the UI to track the performance of each processor.

  • Snowflake client fetch_tables ordering (#1688): This fix solved an issue where the Snowflake client did not return tables in the correct order from the fetch_tables() method. The tables are now ordered with a consistent algorithm, so the result no longer depends on the order in which the tables were created in Snowflake.

  • Added pg tls config connectivity (#1672): This feature added TLS configuration support to the PostgreSQL connector, allowing Dozer to connect to PostgreSQL servers that use TLS encryption. This provides an additional layer of security for data transferred between Dozer and PostgreSQL.

  • Add cloud secrets API interfaces to cloud.proto (#1689): This feature added interfaces for the Cloud Secrets API to the cloud.proto file, allowing Dozer to interact with cloud secrets providers.

  • Addition of Qualified wildcard (#1674): This feature added support for qualified wildcards in SQL. A qualified wildcard such as table_name.* selects all columns of one specific table or alias, which is useful in queries that involve more than one table.
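
To illustrate what expanding a qualified wildcard means, the sketch below turns `alias.*` into the list of that table's columns and leaves any other select item unchanged. It is a simplified, hypothetical helper (`expand_select_item` is not a Dozer function) and ignores quoting and schema prefixes.

```rust
/// Expands `table_alias.*` into one qualified name per column.
/// Any other select item is returned unchanged.
/// Hypothetical helper for illustration only.
fn expand_select_item(item: &str, table_alias: &str, columns: &[&str]) -> Vec<String> {
    match item.strip_suffix(".*") {
        Some(alias) if alias == table_alias => columns
            .iter()
            .map(|column| format!("{}.{}", table_alias, column))
            .collect(),
        _ => vec![item.to_string()],
    }
}
```

For a table aliased as `u` with columns `id` and `name`, the item `u.*` expands to `u.id` and `u.name`, while an item such as `o.total` passes through as is.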

  • Store app id to context file (#1691): This feature added a mechanism to store the app id in a context file. The app id is then available to the Dozer runtime without loading the configuration file, which makes it easier to use Dozer in applications that do not have access to that file.

  • Snowflake parallel running instances (#1692): This fix solved an issue where Dozer did not correctly handle Snowflake instances running in parallel. The fix changes the algorithm used to handle those instances.

  • Implement secrets commands (#1695): This feature added a set of commands to the Dozer CLI for managing secrets, making it easier for users to store and manage secrets in Dozer. The following commands were implemented:

    • create secret: Creates a new secret.
    • list secrets: Lists all of the secrets that have been created.
    • get secret: Gets the value of a specific secret.
    • delete secret: Deletes a specific secret.
  • Remove secrets value and add secrets support in deploy command (#1697): This fix solved an issue where the secrets value was exposed in the deploy command, which could lead to security vulnerabilities if the command was executed in an insecure environment. The fix also added secrets support to the deploy command.

  • Implementation of log replication server core (#1710): This feature introduces the core of the log replication server. With it in place, Dozer can replicate logs, which improves data synchronization and reliability. For more information, check out the log replication server code.

  • Support JSON Files for Cloud Deployments (#1714): Dozer introduced support for JSON files in cloud deployments, so users can use JSON configuration files when deploying Dozer on different cloud platforms. Here is the function that collects the files to deploy, with "*.json" added to the list of patterns:

    use glob::glob;
    use std::fs;

    // Excerpt: File and the error variants come from the Dozer codebase.
    pub fn list_files() -> Result<Vec<File>, crate::errors::CloudError> {
        let mut files = vec![];
        // "*.json" is the newly supported pattern.
        let patterns = ["*.yaml", "*.sql", "*.json"];
        for pattern in patterns {
            let files_glob = glob(pattern).map_err(WrongPatternOfConfigFilesGlob)?;

            for entry in files_glob {
                let path = entry.map_err(CannotReadFile)?;
                files.push(File {
                    name: path.clone().to_str().unwrap().to_string(),
                    content: fs::read_to_string(path.clone()).map_err(|e| CannotReadConfig(path, e))?,
                });
            }
        }
        Ok(files)
    }
  • Supports InList Clause in Streaming SQL (#1694): This enhancement lets users use the InList clause when querying streaming data with Dozer. With InList, a query can list multiple values to match against, enabling flexible and efficient data filtering. See #1694 for details.
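
As a minimal sketch of the semantics (not Dozer's SQL engine), an IN list is a membership test applied to each row. The helper names and the row shape below are hypothetical, and SQL NULL handling is deliberately left out.

```rust
/// Returns true when `value` equals any item in `list`,
/// like `value IN (a, b, c)` without NULL handling.
fn in_list<T: PartialEq>(value: &T, list: &[T]) -> bool {
    list.iter().any(|item| item == value)
}

/// Keeps only the names of rows whose id is in `ids`.
/// The `(id, name)` row shape is a stand-in chosen for this example.
fn filter_by_in_list<'a>(rows: &'a [(u32, &'a str)], ids: &[u32]) -> Vec<&'a str> {
    rows.iter()
        .filter(|(id, _)| in_list(id, ids))
        .map(|(_, name)| *name)
        .collect()
}
```

This corresponds to a query of the form `SELECT name FROM rows WHERE id IN (1, 3)`: each incoming row is tested against the list and passed on only if it matches.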

  • Show Error when App ID is Not Stored (#1699): This fix addresses an issue where no error was displayed when the app ID was not stored. Dozer users now receive a clear error message in that case, which makes the problem visible and quicker to resolve. See #1699 for details.

  • Collide Metric Key for Cache Operation (#1687): This fix resolved a collision on the metric key for cache_operation, so cache operations are now tracked and measured accurately. It was implemented by renaming the constant CACHE_OPERATION_COUNTER_NAME to CACHE_OPERATION_LOG_COUNTER_NAME. See #1687 for details.

  • Ignoring Failing Test (#1709): This change temporarily ignores a failing test while the underlying problem is investigated, so that development can proceed in the meantime. Here is the start of the ignored test:

    use super::DozerE2eTest;

    #[tokio::test]
    #[ignore = "Wildcard implementation may have a bug. This test fails about one out of 5 times"]
    async fn test_e2e_wildcard() {
        let mut test = DozerE2eTest::new(include_str!("./fixtures/basic_sql_wildcard.yaml")).await;
        // ... the rest of the test is omitted in this excerpt
    }

Note: The wildcard implementation may have a bug, as the test fails approximately one out of five times.

  • Writing Commit Epoch to Log (#1711): This fix addresses an issue where the commit epoch was not being written to the log. The commit epoch is now recorded in the log, which helps when tracking and analyzing the commit history. See #1711 for details.

  • Handling REST Path Collision (#1693): This fix addresses collisions between overlapping REST paths. The handle_endpoint_collisions function reverse sorts the REST endpoints before attaching them to the app, so that overlapping routes are not incorrectly matched and routing within Dozer stays reliable. See #1693 for details.
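
To see why reverse sorting helps, consider a router that returns the first registered path that is a prefix of the request. The sketch below is an illustration under that assumption; `order_endpoints_for_registration` and `match_route` are hypothetical names, not the code behind handle_endpoint_collisions.

```rust
/// Sorts paths in reverse lexicographic order, so a longer path that shares a
/// prefix with a shorter one (`/users/active` vs `/users`) comes first.
fn order_endpoints_for_registration(mut paths: Vec<String>) -> Vec<String> {
    paths.sort_by(|a, b| b.cmp(a));
    paths
}

/// Hypothetical prefix router: returns the first registered path that is a
/// prefix of `request`, in registration order.
fn match_route<'a>(registered: &'a [String], request: &str) -> Option<&'a str> {
    registered
        .iter()
        .map(String::as_str)
        .find(|path| request.starts_with(*path))
}
```

If `/users` is registered before `/users/active`, a request for `/users/active/42` is captured by `/users`. After reverse sorting, `/users/active` is tried first and the request reaches the intended route.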

  • Logging Error for Failed Internal Pipeline Server Startup (#1715): This fix ensures that an error is logged properly when the internal pipeline server fails to start, giving developers the information they need for troubleshooting. Instead of unwrapping the result of rx.recv(), the code now checks it and returns the pipeline thread's own error:

    if rx.recv().is_err() {
        // This means the pipeline thread returned before sending a message. Either an error happened or it panicked.
        return match pipeline_thread.join() {
            Ok(Err(e)) => Err(e),
            Ok(Ok(())) => panic!("An error must have happened"),
            Err(e) => {
                std::panic::panic_any(e);
            }
        };
    }
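
The same pattern can be reproduced with only the standard library. The sketch below is a self-contained approximation rather than Dozer's code: `start_pipeline`, its `fail_on_startup` parameter, and the error strings are made up for the example. A worker thread signals readiness over a channel; if it exits first, the sender is dropped, `recv` fails, and joining the thread recovers the real error.

```rust
use std::sync::mpsc;
use std::thread;

/// Starts a worker that signals readiness over a channel. If the worker exits
/// before sending, `recv` fails and the worker's own error is returned.
fn start_pipeline(fail_on_startup: bool) -> Result<(), String> {
    let (tx, rx) = mpsc::channel::<()>();

    let pipeline_thread = thread::spawn(move || -> Result<(), String> {
        if fail_on_startup {
            // `tx` is dropped on return without sending, which closes the channel.
            return Err("failed to bind internal pipeline server".to_string());
        }
        tx.send(()).map_err(|e| e.to_string())?;
        Ok(())
    });

    if rx.recv().is_err() {
        // The sender was dropped before a message arrived: surface the real cause.
        return match pipeline_thread.join() {
            Ok(Err(e)) => Err(e),
            Ok(Ok(())) => Err("pipeline exited before signalling readiness".to_string()),
            Err(panic) => std::panic::resume_unwind(panic),
        };
    }
    Ok(())
}
```

Unlike the excerpt above, this sketch returns an error instead of panicking when the thread exits cleanly without signalling; that is a design choice for the example, not a description of Dozer's behavior.
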
  • Fix Typo in Source Manager App Test (#1700): This chore corrected a typo in the names of the source manager app tests. The following changes were made to the code:

    1. Renamed "test_apps_sorce_smanager_connection_exists" to "test_apps_source_manager_connection_exists".
    2. Renamed "test_apps_sorce_smanager_lookup" to "test_apps_source_manager_lookup".
  • Utilizing Utf8Path in HomeDir (#1702): This chore updated the code to use Utf8Path in HomeDir. With Utf8Path, file paths are guaranteed to be valid UTF-8, which makes path handling more reliable and consistent across environments and file systems. See #1702 for details.

  • Login to AWS for coverage job (#1708): The coverage job now logs in to AWS with the appropriate credentials when it starts. This gives the job access to the AWS resources and services it needs for code coverage analysis. See #1708 for details.

  • Enhancing "API-latency" Metric Context (#1713): The "API-latency" metric measures API request processing time. This change adds context to the metric, which gives more insight into the factors that affect latency, such as network conditions, database queries, and system load. See #1713 for details.

  • Primary Key Indication in Schema Printing (#1717): This chore adds an indication of the primary key to the printed schema, making it easier to identify key fields and to understand the structure of the data. See #1717 for details.

  • Improving CLI Experience (#1703): This refactor makes the CLI workflow more intuitive by reorganizing command structures, simplifying syntax, and improving error handling. It includes the following changes:

    1. Reordered the commands of the cloud subcommand
    2. Renamed migrate to build
    3. Added a filter to the connectors command
    4. Moved app run and api run to the run subcommand; they are now accessible as run app and run api
    5. Removed the err-threshold argument
    6. Hid config-token
    7. Renamed company-name to organization-name
    8. Created a cloud section in the app config and stored app_id in the dozer-cloud.yaml file
    9. Combined the cloud update and cloud deploy commands
  • Prepared v0.1.26, v0.1.27 and v0.1.28 (#1676, #1698, #1716): Versions v0.1.26, v0.1.27 and v0.1.28 have been successfully prepared for release.

This milestone represents significant progress in our product's development, including important updates and enhancements. Our team worked diligently to finalize the codebase, test it thoroughly, and ensure the stability and reliability of the releases. We remain dedicated to enhancing Dozer, and we are sincerely grateful for the feedback and contributions from our amazing community. Let's continue our collaborative innovation journey! 🚀 🚀

Dozer v0.1.26, v0.1.27 and v0.1.28 are now available. Take a look at the release notes for more information.

Looking Forward 🌈

As we strive to enhance Dozer, we are committed to expanding its features and capabilities. Your feedback is crucial to our development process, so please don't hesitate to share your thoughts and ideas. You can open an issue for a feature request or start a discussion thread in our GitHub Q&A category. Together, we can bring Dozer to its full potential.

Full Changelog:

Contact us 📬