data.world January Product Launch

We are excited to announce the launch of new features and the latest improvements:

  • Cloud Collectors - configure and run collectors hosted by data.world NEW
  • Support for Snowflake Data Quality - collect and catalog Snowflake Data Metric Functions (DMFs) NEW
  • Bulk operations UX improvements - streamlined bulk enrichment workflow IMPROVED
  • Enrichment and discovery UX improvements - more context and default sorting  IMPROVED

Read the sections below for full details on each new feature!


NEW Introducing: Cloud Collectors!

We are excited to announce the launch of Cloud Collectors, the newest way to collect metadata on data.world!

Now, you can configure and run collectors that are hosted by data.world with just a few clicks! This feature not only provides a no-code way to start bringing metadata into your catalogs faster, it also has robust functionality around scheduling and monitoring to make setup more transparent and seamless. If you have cloud-accessible data sources that you're ready to bring into your catalog, this feature is for you!

👩‍💼 How can I use Cloud Collectors?

Users with Admin access will see a new option in the collector setup wizard that says "Cloud."

Once you enter your source information, you will be able to set a custom name for your collector configuration, and set a schedule for how frequently the collector should run.

After a collector completes, you will see the metadata and resource types that were collected, as well as the source information you entered while setting up the collector. If a collector run failed, this view also shows what might have gone wrong. You can cancel a run from here as well.

You can view all of the collectors you have set up, whether they are collectors that you host or Cloud Collectors, on the Metadata Collection tab. From here, you can view, edit, and delete collector configurations. And if you're setting up multiple collectors for one source with the same credentials, try the "Duplicate Configuration" button to quickly set all of them up.

For a full list of supported sources and more details on the feature, please refer to the documentation here.


NEW Announcing support for Snowflake Data Quality

We are thrilled to introduce an exciting addition to our existing Snowflake collector – support for Snowflake’s brand-new Data Quality feature, currently available in private preview. This enhancement empowers users to elevate their data quality assessment to new levels.

Key Features:

📊 Collect and catalog Snowflake Data Metric Functions (DMFs): Users can now measure the quality of their data using Snowflake’s powerful "data metric functions" (DMFs) and catalog this context with data.world. Example DMFs include Null Count, Unique Count, and Freshness – providing comprehensive insights into the health of your data.

🔍 Find and understand data quality metrics: The DMFs and observations (recorded metrics) are seamlessly integrated into resource pages on your data.world platform and are also presented as Hoots associated with Snowflake tables and views. This user-friendly interface makes it easy for individuals across your organization to discover and understand data quality metrics effortlessly.
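
To make the example metrics above more concrete, here is a minimal Python sketch of what Null Count, Unique Count, and Freshness measure on a single column. This is not Snowflake's implementation or the collector's code. The function names and sample data are hypothetical, and counting only non-null values as "unique" is an assumption of this sketch.

```python
from datetime import datetime, timezone

# Hypothetical helpers that illustrate the idea behind three example metrics.
# They operate on plain Python lists, not on Snowflake tables.

def null_count(values):
    """Number of missing (None) entries in a column."""
    return sum(1 for v in values if v is None)

def unique_count(values):
    """Number of distinct non-null entries (an assumption of this sketch)."""
    return len({v for v in values if v is not None})

def freshness_seconds(timestamps, now):
    """Seconds elapsed since the most recent timestamp in the column."""
    return (now - max(timestamps)).total_seconds()

# Sample column values, invented for illustration only.
emails = ["a@example.com", None, "b@example.com", "a@example.com"]
updated_at = [
    datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
    datetime(2024, 1, 1, 12, 30, tzinfo=timezone.utc),
]
now = datetime(2024, 1, 1, 13, 0, tzinfo=timezone.utc)

print(null_count(emails))
print(unique_count(emails))
print(freshness_seconds(updated_at, now))
```

In the catalog, the recorded values of metrics like these are the "observations" shown on resource pages and in Hoots.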

Why Snowflake Data Quality & data.world?

🌐 Compliance & Consistency: In today's data-driven landscape, ensuring compliance and consistency is paramount. This Data Quality feature & integration help you meet these standards by offering real-time insights into critical data metrics.

🔒 Build Trust: Trust is the foundation of effective data utilization. This Data Quality integration helps users trust their data by bringing metrics related to freshness, blank values, and inaccuracies to the catalog and everyday tools, such as Tableau and Power BI, via Hoots.

Who Benefits?

👩‍💼 Data Stewards, Engineers, Admins: Empower your data stewards and technical teams by providing them with a tool that gives immediate insights into the current state of their data based on specific metrics.

🚨 Data Consumers: With Hoots, you can identify and take swift action on tables and views that require attention, ensuring data quality monitoring is seamlessly integrated with considerations for cost, consistency, and performance.

Experience a new era of data quality and reliability with data.world’s support for Snowflake's Data Quality today!

Note: Snowflake Data Quality is an enhancement within the existing Snowflake collector, and is currently available to Snowflake Private Preview customers. ❄️🚀

Using Hoots, users can quickly see data quality issues, like duplicate data, and easily fix the errors.


Improvements and Enhancements

IMPROVED Improvements to Bulk Operations UX

Bulk operations are a crucial part of keeping a catalog updated and accurate. We're excited to announce some improvements that will streamline and accelerate bulk operations such as bulk editing tags and attributes and bulk moving resources between collections.

First, we have consolidated these operations into a single menu for each place you can initiate a bulk operation (the Glossary tab, the Resources tab, and the Collection Contains tab). Now you can Quick edit, Add resources to collections, and Export/Import resources from all three locations.

Next, we've added the granular selection experience that previously existed only in Quick edit to the Export/Import spreadsheet flow as well. This is available on all three entry points (Glossary, Resources, Collections), which should significantly reduce the time it takes to make changes via the spreadsheet option.

Finally, we've simplified and clarified the experience around moving resources between collections. Previously this experience only existed within the Quick edit flow, but now you can select 'Add to Collections' or 'Move or Add Collections' to access this functionality. From the Glossary and Resources tabs, you'll be able to add resources to one or multiple collections. From the Collection tab (example below), you'll be able to add resources to one or multiple collections, or move resources from one or all collections to one or multiple collections.

With these improvements, administrators and curators will be able to perform bulk operations on resources much more quickly. For more information, please refer to the documentation for bulk editing resources here and for bulk editing glossary here.

IMPROVED Added context in various search experiences

The suggested search dropdown now has more context, including the list of collections, owning Organization or User profile, and more. We’ve also added more context to the search experience when a user is relating one resource to another. This added context makes it easier to see and understand what has already been added.

IMPROVED Default sorting improvement + column index sort

We’ve provided a default sort experience that makes scanning the related, contained, and column resources faster. We also added column index as a sort option so users can understand the original column order from the database.

IMPROVED Expansion of the Summary field

The Summary field is now available on all resource types out of the box, with no configuration required.

IMPROVED Rich Text Editing without Markdown

Multi-line fields on catalog resources support Rich Text for more engaging and understandable content, and now these fields can be edited in a What-You-See-Is-What-You-Get (WYSIWYG) user experience rather than users having to create and edit content using Markdown.

Markdown editing is still available for users who prefer it, but now more data owners and users can create compelling rich text content.

Introducing Three New Collectors: Azure Data Factory, DynamoDB, and Teradata

We’re excited to announce three new collectors: Azure Data Factory (ADF), DynamoDB, and Teradata. These collectors gather metadata from these systems and seamlessly bring it into our data.world platform. This metadata helps both technical and non-technical users with discovering and understanding their data quickly, governing their data with greater context, and increasing trust in data by providing information about data health and transformations. 

Azure Data Factory Collector: Detailed Data Tracking

The ADF Collector helps users understand how their data was moved or transformed, the format changes it underwent, and its migration journey, building a foundation of trust. This collector fetches metadata for Factory, Pipeline, Activity, Linked Service, Dataset, Dataflow, Trigger, Integration Runtime, and Global Parameter within Azure Data Factory. It also provides column-level Lineage, showing how data moves between ADF Datasets and connected sources like Snowflake, Databricks, S3, and ADLS. This helps users understand data movements and transformations, increasing trust. It also allows monitoring of pipelines for health checks, boosting confidence in data integrity and reliability.

DynamoDB Collector: Simplified Discovery

Our DynamoDB Collector helps users discover and understand DynamoDB resources. This collector captures deep metadata for Tables and Streams. It’s useful for both technical users managing DynamoDB resources and non-technical users exploring metadata and understanding how they can use DynamoDB resources through an intuitive interface. Technical users will appreciate getting insight into DynamoDB resources, including Tables, Keys, Indexes, and more. 

Teradata Collector: Comprehensive Data Insight

The Teradata Collector allows users to see a holistic view of all their Teradata assets to help them manage and discover their data. This collector covers metadata for Database, Table, SQL Procedures, User Defined Functions, View, External Procedures, Triggers, User Defined Methods, and User Defined Types. It also offers Profiling and Lineage, showcasing column-level lineage between views and sourced columns, plus lineage for stored procedures. Users can track ownership and freshness of Databases and Tables, which helps understand data quality. Users can also see metadata about how the data was queried via SQL procedures, user defined functions and methods, and triggers, boosting trust in data products.

Start Exploring Today

These collectors enhance how data users explore, discover, understand, and trust their data. Whether you're a tech pro or not, these tools make navigating metadata easier and help teams become more data-driven. Dive into your data world with these new collectors, and embark on a journey of empowered decision-making.

Happy exploring, The data.world Team


🌟 Sleek and Simple: Latest Governance UI Upgrades Are Here! 🌟

Hey there, Governance users!

We're super excited to roll out some cool new updates to our Governance product. This December, we're not just adding fancy features – we're transforming your daily governance tasks to make them smoother, easier, and yes, even a bit more enjoyable. Let's dive into the cool new changes that await you!

1. Your Automations, Neatly Organized! 📂

We know managing a bunch of automations can be overwhelming. So, we've introduced neat new tabs: Active, Pending Activation, and Archived. This means no more sifting through inactive automations to find what you need. Let these tabs keep your screen tidy and your mind clear. 🧠✨

✅ Active: View and manage automations that are currently in use.

⏳ Pending Activation: Keep track of automations that are configured but not yet activated. When you create a new automation and don't turn it on, it lands here!

❌ Archived: Access your inactive automations without cluttering your main screen.

(See our Docs)


2. No Guessing in Automation Setup 🛠️

Setting up automations should be clear and straightforward. To ensure this, we've added clear visual cues: No more wondering if your automation will start immediately after saving. We've made it crystal clear that 'save and continue' means just that.


  COMING SOON! Enhanced Automation Summary View 👀

Forget about forgetting! Our upcoming summary page improvements mean you won't have to scratch your head trying to recall the details of your automations. This is a lifesaver, especially for our governance admins who set and forget their automations. This enhancement will soon allow you to:

✅ See all key details of your automations at a glance.

✅ Better manage and remember your automation settings.

✅ Enjoy a more intuitive interface, particularly beneficial for Core and Premium admins.

🗣️ Your Voice Matters!

Your feedback is what shapes our future updates. We love hearing from you, so let us know what you think about these new features, or any ideas you have for what comes next.

 🎁 Cheers to Easier Governing and Happy Holidays! 🎁

The data.world Governance Team

Notable improvements: we're streamlining the user experience.

Release notes: Enjoy an even more nimble and intuitive catalog with our latest UX enhancements. Our aim? To streamline your workflow and cut down on unnecessary clicks, duplicates, and possible errors. Here is a list of some of our latest updates:

  1. Duplicate resource warning: Now, if you're about to create a duplicate resource, an immediate warning pops up in the user interface (UI). Think of it as your very own virtual time-saver, spotting any resource with the same name and type within your catalog and stopping you in your tracks, reducing redundancy and promoting effective use of resources.

  2. Context-rich glossary results: Over on the glossary tab, you'll notice more context-rich results. Icons, type, and collection names now accompany your title and description, making it easier than ever to find, understand, and use the terms in your glossary.


  3. Dynamic filtering on the organization page: Say goodbye to typing in your whole query and hitting enter. Enjoy our new dynamic filtering feature for types, glossaries, and collections within an organization page. Say hello to a smart and efficient data catalog experience.
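
The duplicate resource warning in item 1 matches on name and type. A minimal sketch of that kind of check is shown below. It is not data.world's implementation: the `find_duplicate` function, the dictionary layout, and the exact-match comparison are all assumptions made for illustration.

```python
# Hypothetical sketch of a "same name and type" duplicate check.
# The resource layout and exact-match rule are assumptions, not the product's logic.

def find_duplicate(existing_resources, name, resource_type):
    """Return the first existing resource with the same name and type, else None."""
    for resource in existing_resources:
        if resource["name"] == name and resource["type"] == resource_type:
            return resource
    return None

# Sample catalog contents, invented for illustration only.
catalog = [
    {"name": "Customer", "type": "Business Term"},
    {"name": "Customer", "type": "Table"},
    {"name": "Orders", "type": "Table"},
]

match = find_duplicate(catalog, "Customer", "Table")
if match is not None:
    print("Warning: a resource with this name and type already exists.")

# Same name but a different type is not treated as a duplicate in this sketch.
no_match = find_duplicate(catalog, "Orders", "Business Term")
```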

These updates aim to enhance your experience and productivity. For more details and the full list of release notes, please visit docs.data.world.

Offline Editing for Columns by Table


Catalog administrators and stewards can now more easily operate on Column resources in bulk with the spreadsheet export/import flow. Previously, Columns could only be selected for bulk operations at the Collection level. Now, users can export all Columns of a parent Table, edit attributes in the spreadsheet, and upload changes.

The new functionality is accessible to users with the correct access via the 'Columns' tab on a parent Table's resource page (example shown below).

For more information, please refer to the documentation.

Granular Filtering, Search, and Selection for Bulk Operations

We are thrilled to announce new functionality for the Quick Edit feature to support all of the bulk operations necessary to keep catalogs fresh and accurate. 

Previously for Quick Edit, users could only filter a set of resources by Resource Type. But now users can leverage search facets, advanced filtering, and text search capabilities available in other parts of the data.world platform. Users can also perform multiple searches and apply multiple filters to continually add resources to the selection without restarting each time. This will streamline bulk operations by allowing users to more seamlessly select the exact set of resources intended for bulk enrichment and editing.

These capabilities are now available wherever Quick Edit lives: Glossary, Resources, and Collections. They will appear once you select either "Quick Edit" for Glossary, or the "Edit Multiple Resources" entry point for Resources and Collections, shown below:

In a future release, we will enable these capabilities for the Bulk Upload/Edit feature as well, making offline editing more targeted and effective.

For more information, please refer to the documentation for Glossary Quick Edit, Resources Quick Edit, and Collections Quick Edit.

A list of notable enhancements across the data catalog!

We're excited to introduce some powerful improvements and enhancements. Here's a list of our latest releases to the enterprise data catalog:

1) Archie Bots - description generator enhancement

Archie Bots can now effortlessly describe all types of catalog resources, including custom resources. This improvement saves you time enriching your catalog, improving discoverability and understandability. You can read more about Archie Bots here.

2) Improvements to UX and increased max character count of descriptions

Enjoy getting wordy! We've increased the maximum character count of the Description field to 5000, allowing for more comprehensive and detailed information. We've also included markdown support in the hover-over view of descriptions and increased the view window size in search results.

3) Improvements to the search and navigation of Glossary terms

Users can now quickly filter by the first letter, making it easier to locate and manage terms. We've also made improvements to how special characters are sorted in the glossary, ensuring a more intuitive and organized experience. 

4) Now you can query the catalog layers

Customers can now query the layers of the graph using a named graph called :current. This feature federates your source data and catalog enrichments into one queryable graph, simplifying data exploration across catalog layers and allowing for easier exploration and analysis of your data assets. You can read more about the catalog layers and how to query them here.

We hope these enhancements empower you to make the most of your enterprise data catalog. Stay tuned for more exciting updates in the future!

Improvements to Metadata Collectors Page and Collector Wizard

To make collector setup faster and easier for catalog administrators, the Metadata Collectors page and Command Builder Wizard now support saving, editing, and deleting collector configurations for on-premise collectors.

New functionality in this release includes:

  • Collectors configured from the UI will be saved and viewable, even before collectors are run. Previously, collector configurations were not saved for later use. 
  • Collector configurations can be edited and deleted.
  • Users can give collector configurations custom names.
  • A new table, “Catalog metadata sources,” shows all collectors that are bringing metadata into the catalog.

For more information, refer to the documentation here.

Launching Sigma and InfluxDB Collectors


🚀 Exciting News: Launching Two New Metadata Collectors on data.world! 🚀

Today, we are excited to announce the release of two new metadata collectors on data.world: the Sigma Collector and the InfluxDB Collector. These tools are designed to simplify and supercharge your data integration and management capabilities.

🔍 Why you'll love the Sigma Collector:

  • For Sigma users, governance can sometimes be a challenge. We’re here to help! Now, with data.world, you can gain clear visibility into your Sigma workbooks, files, datasets, and more, ensuring data protection, control, and traceability.
  • Business decision-makers can now effortlessly visualize and integrate Sigma's KPIs and metrics with data from other sources, leading to informed decisions.
  • Data analysts will have improved data trust levels. How? Metadata for Sigma workbooks and elements (e.g., titles, descriptions, authors, last updated info) is now easily accessible within data.world, giving quick and deep insight.
  • Gain valuable insight into data quality and health via a Sigma Hoot, our latest DataOps feature. Read more about Hoots in this Whatsnew Post from August.

🌟 Sigma Collector Features:

  • Metadata Harvested: Workspace, Workbook, Folder, Dataset, Connection, Tag, Grant, Member, Team, and Data Element.
  • Lineage: Capture inter-system lineage within Sigma, connecting to tables, datasets, and other data elements in the workbook.

An example of Sigma Workbook metadata inside the data.world platform, such as who created and last updated the workbook, permissions, and Lineage.

🔍 How the InfluxDB Collector helps you:

  • Enhance your visualization dashboards! With InfluxDB being a pivotal data source for Grafana, it’s essential to understand the intricate Influx-Grafana relationship. We make that possible.
  • Dive deep into metadata about buckets, tasks, and measurement columns, empowering users to easily locate and monitor time series data, coupled with insights on its processing.

🌟 InfluxDB Collector Features:

  • Metadata Harvested: Bucket, Measurement Schema, Measurement Column, Task, Label, Organization, and Telegraf Configuration.

An example of metadata from an InfluxDB Task inside the data.world platform, including the expression, last run status, and task status.

In a world where data is continuously evolving, it’s crucial to have the right tools for discoverability, integration, and governance. With our new collectors, we aim to make your journey smoother, more insightful, and more powerful.

Sigma, a Tier 1 collector, and InfluxDB, a Tier 2 collector, are both available immediately for Enterprise Customers. Read the documentation for full details:

Hoots and BB Bots available for enterprise customers

Note – in January 2024, BB Bots were renamed Sentry Bots

Release Notes – August 17, 2023

Announcing the launch of Hoots and BB Bots, the latest in our set of DataOps application features, free to all tiers of our enterprise catalog customers.

What problems do Hoots and BB Bots solve? Hoots bring the relevant information from the catalog to your data-consuming teams (analysts, scientists, executives, etc.) and provide simple communication and timely updates about data quality and freshness via BB Bots. Together, these features increase communication and trust and save your data engineering team valuable time otherwise spent re-answering the same data questions across your data-consuming teams.

What is a Hoot? A Hoot surfaces important context about your data – including data quality and usage information – directly to the applications being used to make data-driven decisions. This saves data producers time that would otherwise be spent answering questions about the state of the data and ensures that data consumers have the context they need to use data confidently.


How do Hoots work? Hoots are simple trust badges that turn green, red, or yellow depending on the health status of your data pipeline. Hoots are configured from the catalog and added to your web-based data product to inform users of health status and more information that is fed automatically from the catalog and automated monitors called BB Bots.


What is a BB Bot? BB Bots are automated monitors that change the status color of the Hoots, providing a trust signal to end-users and allowing data engineers more time to investigate issues and less time answering and re-answering questions.

How do BB Bots work? BB Bots monitor the data.world Data Catalog Platform and other orchestration and observability tools, like Airflow, Monte Carlo, dbt, and Matillion. BB Bots automate the communication of data quality and health status and surface this information to the Hoot where it can provide important context alongside other information from the catalog, like definitions, lineage, owner, and policies. All of this information is surfaced in the Hoot that lives on the applications that data consumers are using, like Looker, Power BI, and Tableau.
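
As a rough illustration of how monitor results could drive a Hoot's color, here is a minimal Python sketch. It is not the product's actual logic. The `Status` enum, the `hoot_color` function, and the rule used here (any failure means red, any warning means yellow, otherwise green) are assumptions made for this example.

```python
from enum import Enum

# Hypothetical monitor result states. The names are assumptions for this sketch.
class Status(Enum):
    OK = "ok"
    WARN = "warn"
    FAIL = "fail"

def hoot_color(statuses):
    """Map a list of monitor results to a badge color.

    Assumed rule: any FAIL gives red, else any WARN gives yellow, else green.
    An empty list is treated as green in this sketch.
    """
    if any(s is Status.FAIL for s in statuses):
        return "red"
    if any(s is Status.WARN for s in statuses):
        return "yellow"
    return "green"

# Sample results from three imagined monitors on one pipeline.
results = [Status.OK, Status.WARN, Status.OK]
print(hoot_color(results))
```

With a rule like this, a single failing monitor is enough to turn the badge red, which matches the idea of the Hoot as a quick trust signal for data consumers.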


To find out how to configure a Hoot, you can read more about these features in our product documentation and enroll in the DataOps and BB Bots course available at data.world University.
