Announcing Azure Data Lake Storage Gen 2 Collector and Databricks Collector Lineage and Jobs

We’re excited to announce new enhancements to data.world’s Databricks Collector and a brand new Collector for Azure Data Lake Storage Gen 2! With the help of these additional metadata harvesting and lineage capabilities, you can now get more detailed insights into your data than ever before.

Our Databricks Collector allows you to quickly and easily collect metadata from your Databricks environment into data.world. Now, with the addition of Jobs harvesting and lineage capabilities, you can get a deeper understanding of where your data is coming from, how it’s being used, and what insights you can discover.

Our new Jobs harvesting feature allows you to collect additional information about your workflows, such as creator, description, success, schedule, and more. This lets you better understand how and why your data was transformed.

The new lineage capabilities let you track your data’s journey, from its source all the way through its transformations. This means you can easily trace your data’s history, identify potential bottlenecks or sources of errors, and quickly gain an understanding of how your data has changed over time.

Our Azure Data Lake Storage Gen 2 Collector allows you to bring insights about your data storage layer into data.world. With this Collector, you can efficiently harvest metadata about Blobs and Containers, including the owner, last modified, path, and more. This information is vital for understanding your underlying data, leading to more trust and confidence in your data-driven decision-making.

You can learn more about these Features in our Databricks documentation and our Azure Data Lake Storage documentation. Both these Collectors are Tier 2 for Enterprise Customers.

An image showing am Blob from ADLS in the data.world platform

An example of ADLS Blob metadata in the data.world platform