data.world Product What's New?

data.world September Product Launch

2024-09-18T17:55:04.582Z

The September release of data.world is live, and is jam-packed with features and improvements that will make catalog teams and users more effective! We’ve got a major improvement to the Search UX, support for SCIM to improve access management, and new collectors to continue making your data.world catalog the center of your organization’s data ecosystem.

Read on to learn about these and other exciting new features!

Simplified Search Experience

We’re excited to announce that our new and improved search experience is now generally available! The new experience has been in public preview since mid-July. Drawing on preview feedback and months of user research, we’ve made significant updates to ensure a more intuitive and streamlined search experience.

With a cleaner interface and new features, users can now find what they need faster and with fewer clicks. Enjoy faster discovery through quick organization scoping; build, save, and share custom filters; pin and order your own top facets; preview the resource without leaving search; view important context like resource hierarchy and custom metadata; and utilize inline advanced features without leaving your current search.

Why the change? We listened to your input and designed a new search experience that eliminates noise and complexity, making it easier for all users, from newcomers to experienced catalog pros, to enjoy faster, more efficient searching.

You can learn more about these improvements in our product documentation and our Searching and Exploring Data course in data.world University.

Notable UI and UX improvements

Relating Resources: Resource relationships are important. We’ve made small but meaningful improvements to the user experience for relating resources to one another. Users will now have more room and more context as they search and relate resources.

Providing Feedback: We want to hear directly from end-users so that we can provide an even better product. We’ve built a simple in-app feedback form that allows us to get direct feedback from users.

Look and Feel: We’ve updated our typography, modernized actions and components, and provided more space to absorb information. This is the first in a series of user interface improvements we’ll be rolling out over the coming months.

SCIM - Microsoft Entra (Private Preview)

Imagine never worrying about managing user access again! We are excited to introduce support for SCIM in data.world, to streamline the process of managing catalog, organization and resource access and permissions.

SCIM (System for Cross-domain Identity Management) makes user management automatic, ensuring the users in your enterprise always have the right access at the right time, by syncing user details and group membership between your identity provider and data.world. From onboarding to offboarding, every detail – job title, email, and permissions – stays perfectly synced. Why does this matter? Because separate data entry and manual errors can lead to security risks, manual effort, and frustration. With SCIM, you’re not just saving time; you're enhancing security and efficiency, protecting your business, and making sure nothing slips through the cracks. It’s peace of mind, effortlessly.
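For readers new to SCIM, the sketch below shows the kind of request bodies an identity provider typically sends under the SCIM 2.0 standard (RFC 7643 and RFC 7644). The schema URNs and attribute names come from the standard. The helper names and sample values are illustrative assumptions, and this announcement does not state which attributes data.world actually consumes.

```python
import json

# Schema URNs defined by the SCIM 2.0 standard (RFC 7643 / RFC 7644).
USER_SCHEMA = "urn:ietf:params:scim:schemas:core:2.0:User"
PATCH_SCHEMA = "urn:ietf:params:scim:api:messages:2.0:PatchOp"


def scim_user(user_name: str, email: str, title: str) -> dict:
    """Build a SCIM 2.0 User resource, as sent when a user is onboarded."""
    return {
        "schemas": [USER_SCHEMA],
        "userName": user_name,
        "title": title,
        "emails": [{"value": email, "primary": True}],
        "active": True,
    }


def scim_deactivate() -> dict:
    """Build a SCIM 2.0 PatchOp body that deactivates a user (offboarding)."""
    return {
        "schemas": [PATCH_SCHEMA],
        "Operations": [{"op": "replace", "path": "active", "value": False}],
    }


if __name__ == "__main__":
    # Hypothetical sample values, for illustration only.
    user = scim_user("jdoe@example.com", "jdoe@example.com", "Data Steward")
    print(json.dumps(user, indent=2))
    print(json.dumps(scim_deactivate(), indent=2))
```

The identity provider remains the source of truth: a change to the user there is pushed as one of these bodies, so no one has to repeat the edit by hand in the catalog.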

This Private Preview release of SCIM functionality supports Microsoft Entra (a.k.a. Azure Active Directory) as an identity provider. We will be adding Okta support in the near future.

If you’re interested in learning more about SCIM support in data.world, or accessing the Private Preview, reach out to your Customer Success Director.

Qlik Sense Cloud Collector (Private Preview)

The Qlik Sense Cloud Collector catalogs metadata from Qlik Sense Cloud, helping maintain a comprehensive inventory of Qlik Sense Cloud assets, facilitating better governance, discovery, and utilization of data across your organization.

This collector harvests metadata for apps, visualizations, sheets, fields, measures and more. For more information, see the product documentation.

An example collection from Qlik Sense

Informatica Cloud Data Integration (CDI) Collector (Private Preview)

The Informatica Cloud Data Integration (CDI) Collector catalogs metadata from CDI, helping maintain a comprehensive inventory of CDI assets, facilitating better governance, discovery, and utilization of data across your organization.

This collector harvests metadata for jobs, mappings, mapping tasks, and more. For more information, see the product documentation.

An example collection from Informatica Cloud Data Integration

Amazon QuickSight Collector (GA)

The Amazon QuickSight Collector, previously in Private Preview, is now generally available! The QuickSight Collector catalogs metadata from QuickSight, helping maintain a comprehensive inventory of QuickSight assets, facilitating better governance, discovery, and utilization of data across your organization.

This collector harvests metadata for Analyses, Dashboards, Datasets, Data Sources, and Folders. Additionally, lineage metadata is harvested between QuickSight Analyses, Datasets, and Data Sources when the data source is a relational database, S3, or Athena. For more information, see the product documentation.

An example of a QuickSight Dashboard

Search API + Syntax improvements

Public search API endpoints and all UI search interfaces now support advanced syntax to target specific metadata fields when constructing complex searches. The following keywords and partial match syntax are now supported:

title:"Sales Order"
Searches for exact matches on the title field for the value “Sales Order”

title:"*sales"
Searches for matches with titles containing the term “sales”.
The leading '*' character specifies a partial or fuzzy match.

“description” and “summary” can also be used as keywords to target specific fields.

Custom metadata can be searched in this manner with the following syntax:

metadata:"Submitted by:Tim Gasper"
Searches for exact matches on the “Submitted by” field with the value “Tim Gasper”

metadata:"Steward:*Juan"
Searches for matches on the “Steward” field that contain the term “Juan”

When using the public search APIs, specific fields can be targeted for exact and partial matching using the property object and the desired field IRI:

{
    "owner": "democorp",
    "property": {
        "https://democorp.linked.data.world/d/ddw-catalogs/steward": "*Gasper"
    }
}
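As a rough illustration of the syntax above, the following sketch assembles field-targeted search terms and a request body shaped like the example. The function names are hypothetical helpers, not part of any data.world client library, and the sketch only builds strings and dictionaries; it does not call the API.

```python
import json


def field_query(field: str, value: str, partial: bool = False) -> str:
    """Return a field-targeted term, e.g. title:"Sales Order"."""
    prefix = "*" if partial else ""  # a leading '*' requests a partial match
    return f'{field}:"{prefix}{value}"'


def metadata_query(field_label: str, value: str, partial: bool = False) -> str:
    """Return a custom-metadata term, e.g. metadata:"Steward:*Juan"."""
    prefix = "*" if partial else ""
    return f'metadata:"{field_label}:{prefix}{value}"'


def property_payload(owner: str, field_iri: str, value: str,
                     partial: bool = False) -> dict:
    """Return a request body that targets one field IRI via the property object."""
    prefix = "*" if partial else ""
    return {"owner": owner, "property": {field_iri: f"{prefix}{value}"}}


if __name__ == "__main__":
    print(field_query("title", "Sales Order"))
    print(metadata_query("Steward", "Juan", partial=True))
    body = property_payload(
        "democorp",
        "https://democorp.linked.data.world/d/ddw-catalogs/steward",
        "Gasper",
        partial=True,
    )
    print(json.dumps(body, indent=4))
```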

data.world August Product Launch

2024-08-21T22:06:45.335Z

The August release of data.world brings a number of new and improved product capabilities, including an improved user interface for resource creation, real-time metadata sync with Databricks, a new metadata field that improves understanding of where catalog resources come from, and enhancements to Microsoft, Salesforce and Databricks collectors.

Also available now is an exciting improvement to our AI Context Engine™ that helps provide explainable answers from your structured data.

Read on to learn about these exciting new features!

Active Directory authentication for Microsoft Collectors

The SQL Server, SQL Server Reporting Services, Power BI Report Server, and SQL Server Integration Services Collectors now support Active Directory domain credentials with the NTLM authentication type, allowing these collectors to connect securely using Active Directory-managed authentication.

Salesforce Collector

The all-new Salesforce Collector catalogs rich metadata from Salesforce, helping maintain a comprehensive inventory of Salesforce assets, facilitating better governance, discovery, and utilization of data across your organization.

The new version of the collector now harvests metadata for objects, fields, dashboards, and reports directly via Salesforce APIs.

An example collection from Salesforce

Databricks Collector harvests lineage to Amazon S3 and ADLS Gen2

The Databricks Collector now harvests External Locations, allowing users to understand cross-system lineage between Databricks, Amazon S3, and Azure Data Lake Storage Gen2.

AI Context Engine - new "detailed answer" endpoint

The new "detailed answer" endpoint in AI Context Engine works similarly to the existing "Answer Tool" endpoint, but it returns much more information, including:

answer - Textual response, same as before
result - raw data & schema (frictionless data format)
sparql - SPARQL query
sql - the SQL query
targetSql - SQL queries executed against target systems
terms - Business terms that were used to generate the query
ontologyUsed - The parts of the ontology that were used to generate this response
evidence - the "thoughts" that were generated during the run (same as what you might see in the debug chat tool, Archimedes)

In contrast, "tool" endpoints are simpler, returning only the response in order to integrate seamlessly with other LLMs (e.g., OpenAI).

In the future, this and other "detailed answer" endpoints in AI Context Engine will return additional information and evidence as we further deliver on accurate, explainable and governed answers from your structured data.

Link: https://developer.data.world/reference/callanswer
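The field list above could be modeled on the client side roughly as follows. This is a minimal sketch: the key names come from the list, but the value types are not documented here, so most are left as Any, and the class and method names are hypothetical. Consult the linked reference for the actual response schema.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class DetailedAnswer:
    """Container for the keys listed for the "detailed answer" response."""
    answer: Optional[str] = None   # textual response
    result: Any = None             # raw data and schema
    sparql: Optional[str] = None   # SPARQL query
    sql: Optional[str] = None      # SQL query
    target_sql: Any = None         # SQL executed against target systems
    terms: Any = None              # business terms used to generate the query
    ontology_used: Any = None      # parts of the ontology that were used
    evidence: Any = None           # "thoughts" generated during the run

    @classmethod
    def from_dict(cls, body: dict) -> "DetailedAnswer":
        """Map the camelCase response keys onto attributes; missing keys become None."""
        return cls(
            answer=body.get("answer"),
            result=body.get("result"),
            sparql=body.get("sparql"),
            sql=body.get("sql"),
            target_sql=body.get("targetSql"),
            terms=body.get("terms"),
            ontology_used=body.get("ontologyUsed"),
            evidence=body.get("evidence"),
        )


if __name__ == "__main__":
    # Made-up placeholder body, not real endpoint output.
    parsed = DetailedAnswer.from_dict({"answer": "placeholder", "sql": "SELECT 1"})
    print(parsed.answer, parsed.sql, parsed.evidence)
```

Reading keys with `get` keeps the parser tolerant of fields being added or omitted, which matters given that the endpoint is expected to return more information over time.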

Source System metadata field

Source System is a new default field that consistently describes the system from which the catalog record metadata was sourced (e.g. Tableau). This field will be used to improve discovery by allowing types to be organized by Source as well as helping to differentiate between ambiguous resource type names (e.g. Dataset). Read more about how to extend and configure this field for your custom types and collectors here.

For more information see our product documentation.

Databricks Publisher real-time updates

This feature allows automatic triggering of Databricks Publisher automation whenever a Databricks Column or Table description is updated in the data.world catalog via the UI or public API. This ensures real-time synchronization of metadata between data.world and Databricks, eliminating the need for manual updates.

Databricks Publisher (announced last month) is currently in Beta, so to get access, reach out to your Customer Success Manager.

Improved UX for resource creation

We’ve enhanced the UI for creating new resources in data.world. Now, when users create a resource, a new multi-step wizard flow replaces the old, small pop-up modal.

This new approach makes resource creation much easier for a wider variety of users - thanks for your feedback on this important aspect of the catalog user experience!

data.world July Product Launch

2024-07-24T13:02:00.000Z

The July release of data.world is here, with starter kits to get AI Context Engine™ up and running quickly, enhancements to collect more metadata for popular collectors, the first release of Databricks Publisher to keep metadata in sync between data.world and Databricks, and an opt-in preview of an improved search experience within data.world.

Read on to learn about these exciting new capabilities, available now!

AI Context Engine™ Starter Kits

To help customers quickly utilize the AI Context Engine, we now offer three starter kits:

Key Benefits

Quick Start: Each starter kit provides all necessary components to get up and running quickly.
Customization: Clients can modify the source code to suit their specific needs.
Ongoing Updates: Access the latest features and improvements by updating from GitHub.
No Custom LLM Required: The starter kits enable direct AICE API calls for basic Q&A on structured data.

Availability and Support

The starter kits are available to all AICE customers at the links above. While they are provided as-is and unsupported, they offer a robust foundation for developing custom applications.

Enhancements to Power BI, dbt, and Denodo Collectors

This month, we’re excited to announce enhancements to our top collectors to harvest more metadata and lineage relationships.

Power BI Collectors enhancements

The Power BI Service and Power BI Gov Collectors have been updated to harvest preview images for Power BI reports, allowing users to preview reports in data.world before they navigate to them in Power BI.

dbt Collector enhancements

The dbt Core Collector has been updated to streamline the setup of multiple dbt Core Collector instances by allowing users to specify multiple run_results.json files in a single run. Additionally, the dbt Core and dbt Cloud Collectors now support Azure Synapse as a target database.

Denodo Collector enhancements

The Denodo Collector has been updated to harvest lineage between Denodo resources, and cross-system lineage between Denodo and Power BI is now supported.

Announcing the Beta Release of Databricks Publisher!

We are excited to introduce the first version of the Databricks Publisher, launching in Beta on July 23. This new feature allows you to write back Databricks column and table metadata for individual resources with just the push of a button, streamlining your metadata governance processes.

What’s New?

The Databricks Publisher enables seamless synchronization of metadata, ensuring that annotations by subject matter experts and data stewards in data.world are reflected back in Databricks. This enhancement benefits analysts and data scientists by providing well-governed, meaningful metadata directly within their Databricks environment.

But that’s not all! We have more exciting features in the pipeline:

Real-Time Updates: Soon, you won’t need to manually sync changes. Any saved changes will automatically update in Databricks.
Tag Writeback: This upcoming feature will extend the functionality to include tags, further enhancing your governance capabilities.

Why This Matters

Synchronizing metadata between data.world and Databricks creates a seamless governance workflow. This integration supports a broader persona model where end users and governance professionals utilize data.world directly, while technical analysts and engineers use platforms like Databricks for discovery and data work. By bridging these environments, we’re making it easier for your teams to collaborate and access the data they need.

To get access to this Beta feature, reach out to your Customer Success Manager. We can’t wait for you to experience the benefits of this new feature and look forward to your feedback as we continue to enhance our offerings.

Stay tuned for more updates, and thank you for being a valued customer!

Public preview of new search experience now available

We're happy to announce the public preview of our new search experience! As part of our continuous effort to improve your experience with our platform, we've made significant upgrades to the way you search for data in our product. For the next month, we invite you to opt in to the new preview, where you can interact with these new features before they go live.

Our enhanced search function now offers a preview of key resource elements, allowing you to quickly differentiate results and jump to related resources or explore resource lineage without leaving your search. Some much-loved additions include refined and streamlined filters to help you find exactly what you need, more efficiently. You can now search for filters, select more than one, and use advanced filter operations (ANY, ALL, NONE). Organization scoping has never been smoother – stay within your org scope, and sort regardless of your scope.
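One conventional reading of the ANY, ALL, and NONE filter operations is sketched below using set logic. The exact semantics in data.world are not spelled out in this announcement, so treat the function and its behavior as an assumption made for illustration.

```python
def matches(resource_values: set, selected: set, operation: str) -> bool:
    """Check one resource against the selected filter values.

    ANY:  the resource has at least one selected value.
    ALL:  the resource has every selected value.
    NONE: the resource has none of the selected values.
    """
    if operation == "ANY":
        return bool(resource_values & selected)
    if operation == "ALL":
        return selected <= resource_values
    if operation == "NONE":
        return not (resource_values & selected)
    raise ValueError(f"Unknown filter operation: {operation}")


if __name__ == "__main__":
    # Hypothetical tag values on a single resource.
    tags = {"finance", "certified"}
    for op in ("ANY", "ALL", "NONE"):
        print(op, matches(tags, {"finance", "pii"}, op))
```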

We’ve made the experience cleaner and more user-friendly, cutting down on overwhelming filter options and obscure filter values, and enabling a customizable order of filters – pin your favorite or most-used ones to the top for ease. Plus, you can also build and save your own filters for future and repeated use.

These changes stem from the invaluable feedback we received from you, our valued customers and users. You asked for a simpler, more manageable experience, the ability to preview more types, as well as personal customizations like saved searches and re-ordered facets – and we listened! We’re incredibly excited for you to experience these advancements. You can read more about what we’ve changed in our documentation portal.

The interface invites you to share your feedback directly within the app, or you can reach out to your Customer Success Manager to voice your thoughts. We look forward to hearing from you!

data.world June Product Launch

2024-06-28T22:04:19.057Z

The June release of data.world is here – featuring a new collector for Microsoft SSIS, enhancements to collect more metadata for Databricks and Power BI, the introduction of versioning for Governance Automations, and a useful set of new security and management features for catalog admins and data stewards.

Read on to learn about these exciting new capabilities, available now!

Enhancements to Databricks and Power BI Collectors, and a new SQL Server Integration Services Collector

This month, we’re excited to announce improvements to some of our most frequently used collectors, and the new SQL Server Integration Services collector.

The new SQL Server Integration Services Collector is available in Private Preview. Contact your Customer Success Director to learn more about how to participate in the program.

SQL Server Integration Services Collector in Private Preview

The SQL Server Integration Services (SSIS) Collector catalogs metadata from SSIS, helping maintain a comprehensive inventory of SSIS assets, facilitating better governance, discovery, and utilization of data across your organization.

This collector harvests metadata for projects, packages, control flow/data flow executables, and much more.

An example collection from SSIS

Power BI Service Collector enhancements

The Power BI Service Collector has been updated to harvest Power BI Measures (including lineage) and lineage between calculated columns. Additionally, the collector now harvests lineage from Power BI Query statements and supports lineage between upstream data sources configured using ODBC connections.

Databricks Collector performance enhancements and external locations

A number of performance improvements were made to the Databricks Collector when harvesting tags on Databricks tables and lineage metadata from Unity Catalog. Users may see up to 80% improvement depending on the shape/weight of their Databricks instance. The Databricks Collector has also been updated to harvest external locations (Azure Data Lake Storage Gen2 and Amazon S3).

Start exploring today

These new collector updates help users understand where the data in these reports is sourced from, facilitating troubleshooting for analysts and increasing trust for business end users. Learn more in our documentation.

Versioning for Governance Automations

This new functionality allows admins of Governance Automations in data.world to:

Edit Existing Automations: Modify your current automations directly within the system, ensuring they meet your evolving needs.
Maintain Task Integrity: Existing tasks, both claimed and unclaimed, stay connected and unaffected by changes.
Future-Proof Configurations: Any new runs initiated post-edit will use the updated configuration.
Stay Updated: Edited automations automatically update to the latest template version, incorporating the newest features and capabilities.

Enjoy the additional flexibility and power of Editable Automations, available today!

Data Exfiltration Controls

Enterprise customers can now configure instance-wide policies for downloading dataset content from the platform. This suite of controls also adds support for restricting who can create personal access tokens for the data.world public API.

Learn more about this feature by visiting our documentation portal.

Organization Browse Card Wizard

Organization admins can now design, build, link resources, and edit the organization level browse card through a visual wizard in the UI. This feature is compatible with more advanced Browse Card configuration options such as automations.

Learn more about this new feature by visiting the documentation portal.

Administrative Functions for Discussion Topics

Resource admins now have the ability to manage Discussion Topics on the Discussions tab of a resource to help maintain highly useful and accurate content. An admin can delete discussion topics as well as edit the title of an existing topic.

Learn more about this new feature by visiting the documentation portal.

data.world May Product Launch

2024-05-22T13:30:00.789Z

The May release of data.world has something for everyone – user experience improvements for navigation and understanding, a new source of technical lineage, big performance improvements across multiple collectors, and a set of powerful and time-saving capabilities for the admins and program teams building and managing their catalog experience.

Read on to learn about these exciting new features!

New relationships summary and browsing

In our continued effort to streamline the user experience, we've rolled out a few changes to our catalog metadata resource details pages. These adjustments have been carefully designed to save valuable time and provide the information needed to understand and navigate related resources. Stay tuned for more updates to these pages in the coming months!

Collector performance improvements and Oracle lineage

This month, we’re excited to announce improvements to the overall runtime for a number of our collectors and an update to the Oracle collector for harvesting lineage metadata.

We are also excited to announce that both the Amazon DynamoDB Collector and Azure Data Factory Collector are now generally available.

Performance improvements across collectors

A number of performance improvements were made to collectors to improve their overall runtime and reduce memory usage. Customers may see up to 90% improvement depending on the collector and shape/weight of their database/data warehouse.

The improved collectors include Snowflake, Redshift, Databricks, Denodo, Oracle, PostgreSQL, Teradata, MySQL, Db2, Netezza, SQL Server, dbt Core, and dbt Cloud.

Oracle collector harvests lineage metadata

The Oracle collector now harvests lineage relationships from Oracle views, stored procedures, and functions. With this new metadata, users can now visualize and query for how data is moved within Oracle and other technologies.

Start exploring today

These new collector updates help users catalog their sources faster, facilitate troubleshooting for analysts, and increase trust for business end users. Learn more about what is supported in our documentation:

Oracle Collector documentation

Organization Details Public API

We’ve added a utility endpoint to our public API to surface organization details such as extended description and avatar for use in integration development. Visit the developer portal to learn more.

Catalog Resource Public API

We're delighted to announce an updated suite of API endpoints focused on flexible catalog management.

We’ve seen wide adoption of advanced catalog features like custom resources, relationships and integrations. Our public API now has full support for our “catalog anything” mission and brings the flexibility of the knowledge graph to the initiatives you are working on, both on and off the data.world platform.

Visit our interactive developer portal to learn more and try out the new functionality.

In-App Technical Reference

As a companion to the Catalog Management APIs, we’ve added a new in-app reference to each resource page that provides in-depth information about the ontology and configuration details for the resource. The Technical Reference page can be found by navigating to the “Settings” tab of any resource and clicking “Technical Reference” in the left navigation menu.

Use this reference as a starting point for taking full advantage of the power of the knowledge graph through SPARQL queries and our Public API.

The reference provides details about the supported relationships, metadata fields, selection values, asset statuses and type inheritance for the resource.

For a comprehensive introduction to the Technical Reference, visit our documentation portal.

User Management Utilities

Administrators with the Instance Admin role now have expanded capabilities for managing active users of the platform. Administrators can now self-serve on deactivating users when someone leaves the company or should no longer have access to the system.

We’ve also enabled instance admins to promote other users to the role without needing to do so through data.world support.

Access Audit Utility

Administrators are often asked to troubleshoot access issues and confirm that access has been appropriately revoked when users change roles or need help. The user management portal includes a quick reference utility to check the access level a user has to any resource on the platform.

You can learn more about both of these features in our documentation.

data.world April Product Launch

2024-04-17T04:01:10.960Z

April brings multiple new metadata collection capabilities to data.world, including Collector enhancements for Snowflake and Databricks, and a new Collector for Amazon Managed Streaming for Kafka.

Read on to learn about these exciting new features!

Catalog Snowflake Streamlit Apps, Databricks Tags, and Amazon Managed Streaming for Kafka (MSK) assets

We’re excited to announce updates to the Snowflake and Databricks collectors that harvest more metadata, along with new collector support for Amazon Managed Streaming for Kafka (MSK). These updates gather more metadata from these systems and seamlessly bring it into our data.world platform. This metadata helps both technical and non-technical users discover and understand their data quickly, govern their data with greater context, and increase trust in data by providing information about data health and transformations.

All new features are generally available.

The Snowflake Collector harvests metadata from Streamlit in Snowflake

The Snowflake Collector now catalogs metadata from Streamlit in Snowflake, facilitating better governance, discovery, and utilization of Streamlit apps across your organization.

The metadata harvested for Streamlit apps includes comments, owners, creation date, and root location. From data.world, users can discover apps and navigate directly to the app in Snowflake.

An example of a Streamlit app

Databricks Collector harvests Databricks tags

The Databricks Collector now catalogs tags from Databricks catalogs, schemas, tables, and columns. Tags are used in Databricks to simplify the search and discovery of data assets. With these tags now in data.world, users can quickly discover data assets in Databricks. For instance, product teams can now build their data products in Databricks and identify them in data.world.

An example of Databricks Tags

Amazon Managed Streaming for Kafka (MSK) Collector

The new Amazon Managed Streaming for Kafka (MSK) Collector catalogs metadata from Amazon MSK, helping maintain a comprehensive inventory of MSK assets, facilitating better governance, discovery, and utilization of data across your organization.

This collector harvests metadata for clusters, brokers, topics, consumers, and producers.

An example collection from Amazon MSK

data.world March Product Launch

2024-03-20T14:07:27.255Z

March brings a host of new capabilities to data.world, including a new Snowflake integration for Tag Syncing, two new collectors (Power BI Report Server, Amazon QuickSight), a highly-requested interface improvement to better understand relationships, and a Chrome Extension for Hoots.

Read on to learn about these exciting new features!

Snowflake Tag Sync Automation [beta]

This automation allows users to edit and create new Snowflake tags within the data.world platform and then sync those Snowflake tags between Snowflake and data.world.

Key Features:

Easily edit and create new Snowflake tags using data.world’s simple user interface
Sync edited/new Snowflake tags back to Snowflake with the push of a button
Display Snowflake tags in a new section titled "Snowflake Tags" on resource pages
data.world becomes the source of truth for Snowflake tags when this automation is enabled

Why integrate your Snowflake tags in data.world?

Creating and editing tags is a breeze within data.world’s UI. Snowflake tags are powerful governance tools that allow users to apply policies, control access, and discover resources. Giving users an easier way to create and edit tags via data.world makes it easier to govern Snowflake resources.

Inside the data.world platform, you can view Snowflake Tags on Snowflake resource pages, like this Column page. You can also Sync the tags back to Snowflake with a simple push of a button.

The Snowflake Tag Sync Automation is currently in beta and is available as part of the Data Governance Premium offering. If you are interested in this feature, please reach out to your Customer Success Director and they will help enable the feature for you. You can read our product documentation here for full details.

Relationships as Fields

You asked, we listened. Our latest improvement is crafted from the desire to streamline the enrichment experience, making it easier to build, manage, and see important relationships. This capability allows metadata fields to be built using custom relationships between resource types, providing a more intelligent way to manage metadata and inspire users to build relationships. For example, if you cataloged your Teams and Data Products, you might want to create a relationship to show which teams govern which products (screenshot below). Your users might see and navigate to Team resources via Data Products, or vice-versa.

These new types of fields help users see, understand, and navigate relationships and show the knowledge graph at work. This enhancement complements our flexible architecture that allows you to build custom Types, Fields and Relationships - deciding what to catalog and how best your users might want to navigate the related resources. This feature focuses on the philosophy that there are many kinds of relationships - some of which have an "attribute-like" utility, rather than just a related object.

You can read more about how you might use this feature in our documentation. This is now available to all enterprise customers using either MDP or Catalog Toolkit for configuration.

We hope you enjoy the new opportunities this enhancement brings to your catalog.

New collectors for Power BI Report Server and Amazon QuickSight

We’re excited to announce two new collectors: Power BI Report Server Collector and Amazon QuickSight Collector, which gather metadata from these systems and seamlessly bring it into the data.world platform. This metadata helps both technical and non-technical users discover and understand their data quickly, govern their data with greater context, and increase trust in data by providing information about data health and transformations.

Both new collectors are available in Private Preview. Please contact your Customer Success Director if you are interested in participating in a Private Preview program. More information is available in our product documentation: Power BI Report Server collector, Amazon QuickSight collector.

Hoots Browser Extension for Google Chrome

The data.world extension for Google Chrome is now available in the Chrome Web Store. It automatically displays Hoots badges on the data products where your organization’s users are using data and making decisions.

Now the valuable data trust signals, related glossary terms and additional context from your data.world catalog Hoots configuration are more easily displayed on your BI & analytics applications. With the Hoots Browser Extension capability, Hoots can be shown on Tableau, Power BI, Looker, and any other web-based application, and there’s no integration required to embed the Hoot display.

More information about the Hoots Browser Extension is available in our product documentation: Using Google Chrome Extension for Hoots.

data.world February Product Launch

2024-02-20T23:39:58.749Z

February brings several new features to data.world, including a new Premium Governance Automation, a new metadata collector for Netezza, and read-only configuration for default fields. Read on to learn about these exciting new features!

New Governance Automation: Premium Metadata Completeness

What is the Premium Metadata Completeness automation? The premium version of our metadata completeness automation checks resources for missing metadata, identifies incomplete information, and lets you assign tasks to your team to update and complete this information efficiently. This gives you automated insights into the completeness of metadata across your resources, enabling better data tracking and quality control.

How does this automation work? If the automation identifies any resources that lack essential metadata, it generates a task for each of these resources. The user group associated with the automation receives in-app notifications regarding these tasks, and can then act upon the resources to provide the missing metadata.

How is this automation different from the Metadata Completeness Automation? The Metadata Completeness Automation solely generates a report listing incomplete resources. In contrast, the Premium Metadata Completeness Automation not only produces the report but also creates a list of actionable tasks for the identified incomplete resources that can be assigned to user groups. It then alerts the authorized users and directs them to the resource pages, prompting them to fill in the missing information.
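The difference between the two automations can be sketched in a few lines of Python. This is only an illustration of the idea described above, not data.world's implementation: the required field names, the Task structure, and the group name are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical set of fields treated as essential; a real automation
# would take this from its own configuration.
REQUIRED_FIELDS = ("title", "description", "owner")


@dataclass
class Task:
    resource_id: str
    missing_fields: list
    assigned_group: str


def find_missing(resource: dict) -> list:
    """Return the required fields that are absent or blank on a resource."""
    return [f for f in REQUIRED_FIELDS if not str(resource.get(f) or "").strip()]


def build_report(resources: list) -> list:
    """Report-only behaviour: list the ids of incomplete resources."""
    return [r["id"] for r in resources if find_missing(r)]


def build_tasks(resources: list, group: str) -> list:
    """Premium-style behaviour: one task per incomplete resource, assigned to a group."""
    return [
        Task(r["id"], find_missing(r), group)
        for r in resources
        if find_missing(r)
    ]


# Illustrative sample data (not taken from a real catalog).
resources = [
    {"id": "orders", "title": "Orders", "description": "", "owner": "sales"},
    {"id": "customers", "title": "Customers", "description": "CRM export", "owner": "crm"},
]
report = build_report(resources)
tasks = build_tasks(resources, group="data-stewards")
```

Here the report lists only the "orders" resource, and the task for it names "description" as the field a member of the assigned group needs to fill in.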

How will this automation help me?

As an admin, I can now delegate tasks effortlessly from within the Reports, monitor progress over time, and quickly spot issues of incomplete resources.

As a data owner, I can now take action promptly because I can see task-relevant guidance to help me understand what metadata is missing or incomplete and how to fix it. Also, all my tasks are found in one place so I can quickly manage the next steps.

As a data consumer, I have increased trust in my data because now I have instant visibility into completeness. Incomplete resources are automatically flagged for me, increasing my understanding and trust of the resource.

The Premium Metadata Completeness Automation is available only to customers that have purchased the Data Governance Premium tier. You can read more about the Premium Metadata Completeness Automation in the documentation here: https://docs.data.world/en/229853-premium-metadata-completeness-automation.html

An example of a Metadata Completeness Report.

Launching a new and improved collector for Netezza

We have an exciting update to our previous Netezza collector. Now this collector harvests much more metadata, including Lineage.

You can use this collector to harvest metadata from Netezza Performance Server. It collects metadata for database information, schemas, tables, views, materialized views, functions, stored procedures, and columns. It also supports Lineage for views, materialized views, and procedures. You can read more about the new features here: https://docs.data.world/en/197948-about-the-netezza-collector.html

Netezza is a Tier 3 collector, and available to Enterprise Customers.

An example of Lineage from the Netezza Collector.

Read-only configuration for default fields

Enterprise catalog customers can now configure custom metadata fields and out-of-the-box fields including Title, Description, Summary, Tags, and Status as read-only. This configuration option prevents contributors from editing fields through the UI or through catalog APIs and should be used to identify fields that are updated through automations and collectors.

Read-only fields can be configured through Catalog Toolkit or custom profiles.

An example of Read Only fields in the UI.

data.world January Product Launch

2024-01-24T17:05:07.266Z

We are excited to announce the launch of new features and latest improvements:

Cloud Collectors - configure and run collectors hosted by data.world NEW
Support for Snowflake Data Quality - collect and catalog Snowflake Data Metric Functions (DMFs) NEW
Bulk operations UX improvements - streamlined bulk enrichment workflow IMPROVED
Enrichment and discovery UX improvements - more context and default sorting IMPROVED

Read the sections below for full details on each new feature!

NEW Introducing: Cloud Collectors!

We are excited to announce the launch of Cloud Collectors, the newest way to collect metadata on data.world!

Now, you can configure and run collectors that are hosted by data.world with just a few clicks! This feature not only provides a no-code way to start bringing metadata into your catalogs faster, it also has robust functionality around scheduling and monitoring to make setup more transparent and seamless. If you have cloud-accessible data sources that you're ready to bring into your catalog, this feature is for you!

👩‍💼 How can I use Cloud Collectors?

Users with Admin access will see a new option in the collector setup wizard that says "Cloud."

Once you enter your source information, you will be able to set a custom name for your collector configuration, and set a schedule for how frequently the collector should run.

After a collector completes, you will see the metadata and resource types that were collected, as well as the source information you entered while setting up the collector. Here you will also find what might have gone wrong if the collector run failed, and you'll have the ability to cancel the run as well.

You can view all of the collectors you have set up, whether they are from collectors that you host or Cloud Collectors, on the Metadata Collection tab. From here, you can view, edit, and delete collector configurations. And if you're setting up multiple collectors for one source with the same credentials, try the "Duplicate Configuration" button to quickly set all of them up.

For a full list of supported sources and more details on the feature, please refer to the documentation here.

NEW Announcing support for Snowflake Data Quality

We are thrilled to introduce an exciting addition to our existing Snowflake collector – support for Snowflake’s brand-new Data Quality feature, currently available in private preview. This enhancement empowers users to elevate their data quality assessment to new levels.

Key Features:

📊 Collect and catalog Snowflake Data Metric Functions (DMFs): Users can now measure the quality of their data using Snowflake’s powerful "data metric functions" (DMFs) and catalog this context with data.world. Example DMFs include Null Count, Unique Count, and Freshness – providing comprehensive insights into the health of your data.

🔍 Find and understand data quality metrics: The DMFs and observations (recorded metrics) are seamlessly integrated into resource pages on your data.world platform and are also presented as Hoots associated with Snowflake tables and views. This user-friendly interface makes it easy for individuals across your organization to discover and understand data quality metrics effortlessly.
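To make the example metrics concrete, here is a small Python sketch of what Null Count, Unique Count, and Freshness measure. It is not Snowflake's implementation and does not call Snowflake; the function names and sample values are illustrative, and it assumes null values are excluded from the unique count.

```python
from datetime import datetime, timezone


def null_count(values):
    """Number of missing (None) values in a column."""
    return sum(1 for v in values if v is None)


def unique_count(values):
    """Number of distinct non-null values in a column (assumed definition)."""
    return len({v for v in values if v is not None})


def freshness_seconds(timestamps, now):
    """Seconds elapsed between `now` and the most recent timestamp."""
    return (now - max(timestamps)).total_seconds()


# Illustrative column values and load timestamps.
emails = ["a@example.com", None, "b@example.com", "a@example.com"]
loaded_at = [
    datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
    datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc),
]
now = datetime(2024, 1, 2, 13, 0, tzinfo=timezone.utc)

# Recorded metric values of this kind are what the text calls "observations".
observations = {
    "null_count": null_count(emails),
    "unique_count": unique_count(emails),
    "freshness_seconds": freshness_seconds(loaded_at, now),
}
```

For this sample, the column has one null, two distinct values, and data that is one hour old.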

Why Snowflake Data Quality & data.world?

🌐 Compliance & Consistency: In today's data-driven landscape, ensuring compliance and consistency is paramount. This Data Quality feature & integration help you meet these standards by offering real-time insights into critical data metrics.

🔒 Build Trust: Trust is the foundation of effective data utilization. This Data Quality integration helps users to trust their data by bringing metrics related to freshness, blank values, and inaccuracies to the catalog and everyday tools, such as Tableau and Power BI, via Hoots.

Who Benefits?

👩‍💼 Data Stewards, Engineers, Admins: Empower your data stewards and technical teams by providing them with a tool that gives immediate insights into the current state of their data based on specific metrics.

🚨 Data Consumers: With Hoots, you can identify and take swift action on tables and views that require attention, ensuring data quality monitoring is seamlessly integrated with considerations for cost, consistency, and performance.

Experience a new era of data quality and reliability with data.world’s support for Snowflake's Data Quality today!

Note: Snowflake Data Quality is an enhancement within the existing Snowflake collector, and is currently available to Snowflake Private Preview customers. ❄️🚀

Using Hoots, users can quickly see data quality issues, like duplicate data, and easily fix the errors.

Improvements and Enhancements

IMPROVED Improvements to Bulk Operations UX

Bulk operations are a crucial part of keeping a catalog updated and accurate. We're excited to announce some improvements that will streamline and accelerate bulk operations such as bulk editing tags and attributes and bulk moving resources between collections.

First, we have consolidated these operations into a single menu for each place you can initiate a bulk operation (the Glossary tab, the Resources tab, and the Collection Contains tab). Now you can Quick edit, Add resources to collections, and Export/Import resources from all three locations.

Next, we've added the granular selection experience that previously existed only in Quick edit to the Export/Import spreadsheet flow as well. This is available on all three entry points (Glossary, Resources, Collections), which should significantly reduce the time it takes to make changes via the spreadsheet option.

Finally, we've simplified and clarified the experience around moving resources between collections. Previously this experience only existed within the Quick edit flow, but now you can select 'Add to Collections' or 'Move or Add Collections' to access this functionality. From the Glossary and Resources tab, you'll be able to add resources to one or multiple collections, and from the Collection tab (example below), you'll be able to add resources to one or multiple collections, or move resources from one or all collections to one or multiple collections.

With these improvements, administrators and curators will be able to perform bulk operations on resources much more quickly. For more information, please refer to the documentation for bulk editing resources here and for bulk editing glossary here.

IMPROVED Added context in various search experiences

The suggested search dropdown now has more context, including the list of collections, owning Organization or User profile, and more. We’ve also added more context to the search experience when a user is relating one resource to another. This added context makes it easier to see and understand what has already been added.

IMPROVED Default sorting improvement + column index sort

We’ve provided a default sort experience that makes scanning the related, contained, and column resources faster. We also added column index as a sort option so users can understand the original column order from the database.

IMPROVED Expansion of the Summary field

The Summary field is now available out-of-the-box on all resource types, with no configuration required.

IMPROVED Rich Text Editing without Markdown

Multi-line fields on catalog resources support Rich Text for more engaging and understandable content, and now these fields can be edited in a What-You-See-Is-What-You-Get (WYSIWYG) user experience rather than users having to create and edit content using Markdown.

Markdown editing is still available for users that prefer it, but now more data owners and users can create compelling rich text content.

Introducing Three New Collectors: Azure Data Factory, DynamoDB, and Teradata

2023-12-19T16:30:00.000Z

We’re excited to announce three new collectors: Azure Data Factory (ADF), DynamoDB, and Teradata. These collectors gather metadata from these systems and seamlessly bring it into our data.world platform. This metadata helps both technical and non-technical users with discovering and understanding their data quickly, governing their data with greater context, and increasing trust in data by providing information about data health and transformations.

Azure Data Factory Collector: Detailed Data Tracking

The ADF Collector allows users to understand how their data was moved or transformed, the format changes it underwent, and its migration journey to build a foundation of trust. This collector fetches metadata for Factory, Pipeline, Activity, Linked Service, Dataset, Dataflow, Trigger, Integration Runtime, and Global Parameter within Azure Data Factory. It also provides column-level Lineage, showing how data moves between ADF Datasets and connected sources like Snowflake, Databricks, S3, and ADLS. This helps users understand data movements and transformations, increasing trust. It also allows monitoring of pipelines for health checks, boosting confidence in data integrity and reliability.

DynamoDB Collector: Simplified Discovery

Our DynamoDB Collector helps users discover and understand DynamoDB resources. This collector captures deep metadata for Tables and Streams. It’s useful for both technical users managing DynamoDB resources and non-technical users exploring metadata and understanding how they can use DynamoDB resources through an intuitive interface. Technical users will appreciate getting insight into DynamoDB resources, including Tables, Keys, Indexes, and more.

Teradata Collector: Comprehensive Data Insight

The Teradata Collector allows users to see a holistic view of all their Teradata assets to help them manage and discover their data. This collector covers metadata for Database, Table, SQL Procedures, User Defined Functions, View, External Procedures, Triggers, User Defined Methods, and User Defined Types. It also offers Profiling and Lineage, showcasing column-level lineage between views and sourced columns, plus lineage for stored procedures. Users can track ownership and freshness of Databases and Tables, which helps understand data quality. Users can also see metadata about how the data was queried via SQL procedures, user defined functions and methods, and triggers, boosting trust in data products.

Start Exploring Today

These collectors enhance how data users explore, discover, understand, and trust their data. Whether you're a tech pro or not, these tools make navigating metadata easier and help teams become more data-driven. Dive into your data world with these new collectors, and embark on a journey of empowered decision-making.

Happy exploring, The data.world Team

🌟 Sleek and Simple: Latest Governance UI Upgrades Are Here! 🌟

2023-12-19T15:26:00.000Z

Hey there, Governance users!

We're super excited to roll out some cool new updates to our Governance product. This December, we're not just adding fancy features – we're transforming your daily governance tasks to make them smoother, easier, and yes, even a bit more enjoyable. Let's dive into the cool new changes that await you!

1. Your Automations, Neatly Organized! 📂

We know managing a bunch of automations can be overwhelming. So, we've introduced neat new tabs: Active, Pending Activation, and Archived. This means no more sifting through inactive automations to find what you need.
Let these tabs keep your screen tidy and your mind clear. 🧠✨

✅ Active: View and manage automations that are currently in use.

⏳ Pending Activation: Keep track of automations that are configured but not yet activated. When you create a new automation and don't turn it on, it lands here!

❌ Archived: Access your inactive automations without cluttering your main screen.

(See our Docs)

2. No Guessing in Automation Setup 🛠️

Setting up automations should be clear and straightforward. To ensure this, we've added clear visual cues: No more wondering if your automation will start immediately after saving. We've made it crystal clear that 'save and continue' means just that.

COMING SOON! Enhanced Automation Summary View 👀

Forget about forgetting! Our upcoming summary page improvements mean you won't have to scratch your head trying to recall the details of your automations. This is a lifesaver, especially for our governance admins who set and forget their automations. This enhancement will soon allow you to:

✅ See all key details of your automations at a glance.

✅ Better manage and remember your automation settings.

✅ Enjoy a more intuitive interface, particularly beneficial for Core and Premium admins.

🗣️ Your Voice Matters!

Your feedback is what shapes our future updates. We love hearing from you, so let us know what you think about these new features, or any ideas you have for what comes next.

🎁 Cheers to Easier Governing and Happy Holidays! 🎁

The data.world Governance Team

Notable improvements: we're streamlining the user experience.

2023-12-18T21:16:27.387Z

Release notes: Enjoy an even more nimble and intuitive catalog with our latest UX enhancements. Our aim? To streamline your workflow and cut down on unnecessary clicks, duplicates, and possible errors. Here is a list of some of our latest updates:

Duplicate resource warning: Now, if you're about to create a duplicate resource, an immediate warning pops up in the user interface (UI). Think of it as your very own virtual time-saver, spotting any resource with the same name and type within your catalog and stopping you in your tracks, reducing redundancy and promoting effective use of resources.
Context-rich glossary results: Over on the glossary tab, you'll notice more context-rich results. Icons, type, and collection names now accompany your title and description, making it easier than ever to find, understand and use the terms in your glossary. This offers more context that makes the glossary more engaging and easier to navigate.
Dynamic filtering on the organization page: Say goodbye to typing in your whole query and hitting enter. Enjoy our new dynamic filtering feature for types, glossaries, and collections within an organization page. Say hello to a smart and efficient data catalog experience.
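The duplicate resource check described above (same name and same type) can be sketched as follows. This is a minimal illustration, not the product's logic: the function name and dictionary keys are hypothetical, and treating names as case-insensitive with surrounding whitespace ignored is an assumption.

```python
def find_duplicate(existing, name, resource_type):
    """Return the first existing resource with the same name and type, or None.

    Assumption: names are compared case-insensitively and ignoring
    leading/trailing whitespace; types must match exactly.
    """
    key = (name.strip().casefold(), resource_type)
    for resource in existing:
        if (resource["name"].strip().casefold(), resource["type"]) == key:
            return resource
    return None


# Illustrative catalog contents.
catalog = [
    {"name": "Revenue", "type": "Glossary Term"},
    {"name": "Revenue", "type": "Table"},
]

# Same name and type as an existing resource: a warning would be shown.
clash = find_duplicate(catalog, " revenue ", "Glossary Term")

# Same name but a different type: no warning.
no_clash = find_duplicate(catalog, "Revenue", "Column")
```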

These updates aim to enhance your experience and productivity. For more details and a full list of release notes, please visit docs.data.world.

Offline Editing for Columns by Table

2023-11-06T21:10:00.000Z

Catalog administrators and stewards can now more easily operate on Column resources in bulk with the spreadsheet export/import flow. Previously, Columns could only be selected for bulk operations at the Collection level. Now, users can export all Columns of a parent Table, edit attributes in the spreadsheet, and upload changes.

The new functionality is accessible for users with correct access via the 'Columns' tab on a parent Table's resource page (example shown below).

For more information, please refer to the documentation.

Granular Filtering, Search, and Selection for Bulk Operations

2023-10-17T21:00:00.000Z

We are thrilled to announce new functionality for the Quick Edit feature to support all of the bulk operations necessary to keep catalogs fresh and accurate.

Previously for Quick Edit, users could only filter a set of resources by Resource Type. But now users can leverage search facets, advanced filtering, and text search capabilities available in other parts of the data.world platform. Users can also perform multiple searches and apply multiple filters to continually add resources to the selection without restarting each time. This will streamline bulk operations by allowing users to more seamlessly select the exact set of resources intended for bulk enrichment and editing.

These capabilities are now available wherever Quick Edit lives: Glossary, Resources, and Collections. They will appear once you select either "Quick Edit" for Glossary, or the "Edit Multiple Resources" entry point for Resources and Collections, shown below:

In a future release, we will enable these capabilities for the Bulk Upload/Edit feature as well, making offline editing more targeted and effective.

For more information, please refer to the documentation for Glossary Quick Edit, Resources Quick Edit, and Collections Quick Edit.

A list of notable enhancements across the data catalog!

2023-09-27T23:32:36.830Z

We're excited to introduce some powerful improvements and enhancements. Here's a list of our latest releases to the enterprise data catalog:

1) Archie Bots - description generator enhancement

Archie Bots can now effortlessly describe all types of catalog resources, including custom resources. This improvement saves you time enriching your catalog, improving discoverability and understandability. You can read more about Archie Bots here.

2) Improvements to UX and increased max character count of descriptions

Enjoy getting wordy! We've increased the maximum character count of the Description field to 5000, allowing for more comprehensive and detailed information. We've also included markdown support in the hover-over view of descriptions and increased the view window size in search results.

3) Improvements to the search and navigation of Glossary terms

Users can now quickly filter by the first letter, making it easier to locate and manage terms. We've also made improvements to how special characters are sorted in the glossary, ensuring a more intuitive and organized experience.

4) Now you can query the catalog layers

Customers can now query the layers of the graph using a named graph called :current. This feature federates your source data and catalog enrichments into one queryable graph, simplifying data exploration across catalog layers and allowing for easier exploration and analysis of your data assets. You can read more about the catalog layers and how to query them here.
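As a rough illustration, a query against the :current named graph might be built like this. The sketch only assembles SPARQL text in Python and does not send it anywhere. The helper name and variable names are hypothetical, and it assumes the `:` prefix is resolved by the environment in which the query runs; see the documentation for the exact form to use.

```python
def build_current_graph_query(limit: int = 10) -> str:
    """Build a SPARQL query that reads triples from the :current named graph.

    Assumption: the ':' prefix is already declared wherever the query runs.
    """
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    return (
        "SELECT ?subject ?predicate ?object\n"
        "WHERE {\n"
        "  GRAPH :current {\n"
        "    ?subject ?predicate ?object .\n"
        "  }\n"
        "}\n"
        f"LIMIT {limit}"
    )


query = build_current_graph_query(limit=5)
print(query)
```

The GRAPH clause is the standard SPARQL way to scope a pattern to a named graph, so every triple pattern inside it is matched against the combined catalog layers rather than a single layer.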

We hope these enhancements empower you to make the most of your enterprise data catalog. Stay tuned for more exciting updates in the future!

Improvements to Metadata Collectors Page and Collector Wizard

2023-09-19T21:00:00.574Z

To make collector setup faster and easier for catalog administrators, the Metadata Collectors page and Command Builder Wizard now support saving, editing, and deleting collector configurations for on-premise collectors.

New functionality in this release includes:

Collectors configured from the UI will be saved and viewable, even before collectors are run. Previously, collector configurations were not saved for later use.
Collector configurations can be edited and deleted.
Users can give collector configurations custom names.
New table “Catalog metadata sources” shows all collectors that are bringing metadata into the catalog.

For more information, refer to the documentation here.

Launching Sigma and InfluxDB Collectors

2023-09-18T18:36:09.438Z

🚀 Exciting News: Launching Two New Metadata Collectors on data.world! 🚀

Today, we are excited to announce the release of two new metadata collectors on data.world: the Sigma Collector and the InfluxDB Collector. These tools are designed to simplify and supercharge your data integration and management capabilities.

🔍 Why you'll love the Sigma Collector:

For Sigma users, governance can sometimes be a challenge. We’re here to help! Now, with data.world, you can gain clear visibility into your Sigma workbooks, files, datasets, and more, ensuring data protection, control, and traceability.
Business decision-makers can now effortlessly visualize and integrate Sigma's KPIs and metrics with data from other sources, leading to informed decisions.
Data analysts will have improved data trust levels. How? Metadata for Sigma workbooks and elements (e.g., titles, descriptions, authors, last updated info) are now easily accessible within data.world, giving quick and deep insight.
Gain valuable insight into data quality and health via a Sigma Hoot, our latest DataOps feature. Read more about Hoots in this What's New post from August.

🌟 Sigma Collector Features:

Metadata Harvested: Workspace, Workbook, Folder, Dataset, Connection, Tag, Grant, Member, Team, and Data Element.
Lineage: Capture inter-system lineage within Sigma, connecting to tables, datasets, and other data elements in the workbook.

An example of Sigma Workbook metadata inside the data.world platform, including who created and last updated the workbook, permissions, and Lineage.

🔍 How the InfluxDB Collector helps you:

Enhance your visualization dashboards! With InfluxDB being a pivotal data source for Grafana, it’s essential to understand the intricate Influx-Grafana relationship. We make that possible.
Dive deep into metadata about buckets, tasks, and measurement columns, empowering users to easily locate and monitor time series data, coupled with insights on its processing.

🌟 InfluxDB Collector Features:

Metadata Harvested: Bucket, Measurement Schema, Measurement Column, Task, Label, Organization, and Telegraf Configuration.

An example of metadata from an InfluxDB Task inside the data.world platform, including the expression, last run status, and task status.

In a world where data is continuously evolving, it’s crucial to have the right tools for discoverability, integration, and governance. With our new collectors, we aim to make your journey smoother, more insightful, and more powerful.

Sigma, a Tier 1 collector, and InfluxDB, a Tier 2 collector, are both available immediately for Enterprise Customers. Read the documentation for full details:

Hoots and BB Bots available for enterprise customers

2023-08-17T16:41:54.995Z

Note – in January 2024, BB Bots were renamed Sentry Bots

Release Notes – August 17, 2023

Announcing the launch of Hoots and BB Bots, the latest in our set of DataOps application features, free to all tiers of our enterprise catalog customers.

What problems do Hoots and BB Bots solve? Hoots bring the relevant information from the catalog to your data-consuming teams (analysts, scientists, executives, etc.) and provide simple communication and timely updates about data quality and freshness via BB Bots. Together, these features increase communication and trust and save your data engineering team valuable time in reanswering the same data questions across your data-consuming teams.

What is a Hoot? A Hoot surfaces important context about your data – including data quality and usage information – directly to the applications being used to make data-driven decisions. This saves data producers time that would otherwise be spent answering questions about the state of the data and ensures that data consumers have the context they need to use data confidently.

How do Hoots work? Hoots are simple trust badges that turn green, red, or yellow depending on the health status of your data pipeline. Hoots are configured from the catalog and added to your web-based data product to inform users of health status and more information that is fed automatically from the catalog and automated monitors called BB Bots.

What is a BB Bot? BB Bots are automated monitors that change the status color of the Hoots, providing a trust signal to end-users and allowing data engineers more time to investigate issues and less time answering and re-answering questions.

How do BB Bots work? BB Bots monitor the data.world Data Catalog Platform and other orchestration and observability tools, like Airflow, Monte Carlo, dbt and Matillion. BB Bots automate the communication of data quality and health status and surface this information to the Hoot where it can provide important context alongside other information from the catalog, like definitions, lineage, owner, and policies. All of this information is surfaced in the Hoot that lives on the applications that data consumers are using, like Looker, Power BI, and Tableau.
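One way to picture how monitor results could drive a Hoot's badge color is a "worst status wins" rule. The sketch below is an assumption for illustration only: the text above says BB Bots change the Hoot's status color, but the aggregation rule, the names used here, and the default of green when no monitor has reported are all hypothetical.

```python
from enum import Enum


class Status(Enum):
    GREEN = "green"
    YELLOW = "yellow"
    RED = "red"


# Assumed severity ordering: red outranks yellow, which outranks green.
_SEVERITY = {Status.GREEN: 0, Status.YELLOW: 1, Status.RED: 2}


def hoot_status(bot_results):
    """Return the most severe status reported by any monitor.

    Assumption: with no monitor results, the badge defaults to green.
    """
    return max(bot_results, key=_SEVERITY.__getitem__, default=Status.GREEN)


# Illustrative results from three monitors watching one pipeline.
badge = hoot_status([Status.GREEN, Status.YELLOW, Status.GREEN])
print(badge.value)
```

Under this rule a single failing monitor is enough to turn the badge red, which matches the goal of giving data consumers a conservative trust signal.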

To find out how to configure a Hoot, you can read more about these features in our product documentation and enroll in the DataOps and BB Bots course available at data.world University.

SQL Server Reporting Services (SSRS) support for metadata collection is now Live!

2023-08-15T19:55:00.000Z

Announcing our newest metadata collector - SQL Server Reporting Services (SSRS)! This collector is designed to provide you with an effective solution for extracting metadata from your SSRS environment into your data.world catalog. Our integration facilitates the automated extraction, organization, and presentation of specific metadata elements from your SSRS system. You'll gain valuable insights into your datasets, data sources, folders, KPIs, reports, and linked reports – all within your easily navigable catalog.

With the SSRS collector, you can:

Learn more about your reports and data, including who created a report or dataset and when they were last updated, helping you understand and trust your data
See the lineage of which datasets were used in a report, allowing you a comprehensive view of the data flowing into a report
Keep track of KPIs from SSRS and integrate them with business metrics from other source systems, all within one easy-to-use catalog, leading to better data-informed decisions

Are you ready to unlock the potential of your SQL Server Reporting Services? You can read more about how this collector works and all it harvests in the documentation. This collector is Tier 2 for Enterprise customers, and is available in dwcc version 2.151 and later.

An example of metadata from an SSRS Report, including Lineage:

Announcing Enhanced Email Notification Options

2023-08-15T19:45:34.853Z

Visit your notifications settings page to customize the transactional emails you receive from data.world.

You can choose to:

Turn off all non-essential email communications
Unsubscribe from a category of email notifications
Customize which digests you receive
Customize dataset and project activity notifications

Learn more