data.world April Product Release: Smarter Workflows and Deeper Lineage

We’re thrilled to share this month’s updates. Whether you’re governing metadata at scale, filling gaps in lineage, or looking for a more intuitive user experience, these enhancements are built to help your team move faster and collaborate better. Here’s what’s new, what’s improved, and what’s just around the corner.

🔥 What’s New?

New Premium Workflow Automation: Suggest Changes

Say hello to smarter, scalable governance workflows. Our new Suggest Change Workflow is an automation designed to help our Governance Premium customers with more complex workflows. It allows users to propose metadata changes that route through configurable approval groups. Once approved, updates are either instantly applied or, for source-owned fields, sent as ServiceNow tickets—ensuring updates are made at the system of origin. This is a game-changer for teams who want to empower metadata contributors without compromising control or compliance. Find out more in the product documentation.

📊 New Dashboards – Clarity and Insight at Your Fingertips

In March, we rolled out Governance Dashboards in the Admin Portal—an interactive, visual way for instance administrators to monitor platform activity, analyze engagement trends, and surface meaningful insights across their data.world environment. These dashboards were built to address the growing need for visibility into how users are interacting with data, what they’re searching for, and which resources are being used (or overlooked). You can read more about these dashboards here.

SAP HANA Collector with Lineage

We’ve expanded our enterprise-grade coverage with a brand-new, native SAP HANA collector—now with lineage! Previously, customers had to rely on the Generic JDBC collector. With this upgrade, teams can now trace data movement to where it originated in their MDM pipeline, supporting better impact analysis, governance, and audit-readiness. Enterprise teams using SAP HANA will get a more complete picture of the entire data pipeline with this new collector.

Marquez/OpenLineage Collector

We’re supercharging our support for OpenLineage via Marquez. This collector lets customers easily send lineage metadata to data.world using the OpenLineage standard, which is increasingly supported by tools like Airflow, dbt, and Spark. It’s perfect for teams with complex, fast-moving stacks who need lineage coverage now—no custom integration or roadmap waiting required. Visit our documentation here.
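For readers new to the standard, here is a minimal sketch of an OpenLineage run event, the kind of payload that OpenLineage-enabled tools emit and that Marquez stores. The namespace, job and dataset names, the producer URL, and the spec version in `schemaURL` are illustrative assumptions. How the data.world collector reads events from Marquez is not shown.

```python
import json
import uuid
from datetime import datetime, timezone


def build_run_event(job_name, inputs, outputs, event_type="COMPLETE"):
    """Build a minimal OpenLineage RunEvent as a plain dict."""
    namespace = "example-namespace"  # hypothetical namespace
    return {
        "eventType": event_type,
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "run": {"runId": str(uuid.uuid4())},
        "job": {"namespace": namespace, "name": job_name},
        "inputs": [{"namespace": namespace, "name": n} for n in inputs],
        "outputs": [{"namespace": namespace, "name": n} for n in outputs],
        "producer": "https://example.com/my-pipeline",  # hypothetical producer
        # Spec version is an assumption; use the one your tools emit.
        "schemaURL": "https://openlineage.io/spec/2-0-2/OpenLineage.json#/$defs/RunEvent",
    }


# Hypothetical job and dataset names, for illustration only.
event = build_run_event("daily_orders_load", ["raw.orders"], ["analytics.orders"])
print(json.dumps(event, indent=2))
```

The `inputs` and `outputs` arrays are what let a lineage consumer connect a job to the datasets it reads and writes.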

💫 What’s Improved?

New updates to the Resource Page redesign (in Public Preview)

Our redesigned Resource Page continues to evolve—and it’s looking better than ever. We’ve rolled out a sleek new relationship preview component, enhanced collapse/expand behavior, and squashed bugs to improve polish and performance. It’s a smoother experience that makes exploring resources and their relationships feel quick and easy. To take advantage of the new layout, some customers might consider moving their metadata fields to different sections. We plan to fully release this redesign in May. Read here to understand more about the rollout plan and consult the documentation if you’d like to understand which sections moved.

New resource page layout redesign screenshot that illustrates new layout

currentUser keyword added to advanced search syntax

Finding “your stuff” just got easier. The new currentUser search syntax lets users instantly surface resources they own, have favorited, or are connected to, without needing to remember complex filters. Catalog Admins can use the new keyword in the Browse Card to link end users to personalized search results. It’s a small feature with big impact for day-to-day navigation and catalog engagement. Read more to find out how to use this syntax to find resources that are attributed to you or for which you are responsible.

Example: `metadata:"Steward:currentUser" AND NOT has:description`
Returns personalized results: resources that list the current user as the steward and are missing a description.
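
As a rough illustration, a search like this one can be assembled from smaller clauses. The helper names below are hypothetical. They only build the query string and do not call any data.world API.

```python
def metadata_clause(field, value):
    """Format a custom-metadata clause such as metadata:"Steward:currentUser"."""
    return f'metadata:"{field}:{value}"'


def missing(field):
    """Format a clause that matches resources lacking the given field."""
    return f"NOT has:{field}"


def all_of(*clauses):
    """Join clauses with AND."""
    return " AND ".join(clauses)


query = all_of(metadata_clause("Steward", "currentUser"), missing("description"))
print(query)  # metadata:"Steward:currentUser" AND NOT has:description
```

The resulting string can be pasted into the search bar or used as the target of a Browse Card link.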

🔮 What’s Coming Next?

📥 Microsoft Fabric Collector: We’re adding a new collector for Microsoft Fabric workspaces, including Power BI artifacts—expanding coverage for customers in the Microsoft ecosystem.

🎯 GA Launch of the New Resource Page: The redesigned Resource Page and relationship grids will move from public preview to general availability (GA) on May 20th, so users will no longer need to opt in to the new experience.

⚡️ data.world March Product Launch: AI, Usability, and Lineage, Supercharged

This month’s release is packed with powerful updates designed to unlock more from your catalog. From the launch of Archie Chat, your AI-powered assistant for instant answers, to a new design and functionality for our Resource pages, we’re bringing faster workflows and deeper insights to every user. We’ve also rolled out new and enhanced collectors—from Amazon DMS to AWS Glue, ADF, and beyond—giving data teams the visibility they need to govern modern, complex data stacks. And coming soon: new Governance Dashboards that bring actionable insights, helping you measure engagement and optimize your data strategy.

🔥 What’s New?

💬 Archie Chat – Your AI-Powered Data Assistant is Live

Say hello to Archie Chat, the intelligent assistant that turns your data catalog into an interactive knowledge hub. Now available in Public Preview, Archie Chat answers questions and helps you navigate the catalog instantly, drawing on context from your own data resources, glossary terms, and our product documentation.

Data consumers can use natural language to ask questions like “Tell me what you know about marketing campaign attribution” or “What does ACV mean?”, and get quick, reliable answers from your catalog. Stewards and admins can ask things like, “What’s the fastest way to add lots of relationships at once?” and get immediate, actionable guidance—no digging through docs required.

Quick Tip: Archie is now in Public Preview and free to try. Reach out to your Customer Success representative to get started today!

✨ Resource Page Redesign – Usability Reimagined

The Resource Page redesign and its new functionality are here. We've overhauled the resource page experience, creating a more intuitive layout, providing edit history, and adding other powerful new features that put speed and insight at your fingertips:

Streamlined layout and design for faster understanding and navigation
Rich activity feed that captures your data's story
Inline editing capabilities
New preview options
Enhanced filtering options

The new Activity feed is now generally available and you can read more about it here. The redesigned Resource Overview page and tabs are in Public Preview and users can read more about that here. One click is all it takes to transform your workflow.

🔄 Amazon Data Migration Service Collector – Lineage Through the Cloud Migration Journey

Our new Amazon DMS Collector brings visibility into your cloud migration pipelines by capturing lineage from AWS Database Migration Service (DMS) jobs. Whether you’re migrating on-prem databases to the cloud or using DMS for change data capture (CDC), this collector gives you line-of-sight traceability into how data moves across systems. Now, you can govern these migration flows with confidence, maintain compliance, and ensure your catalog reflects your evolving data architecture.

💫 What’s Improved?

🧬 AWS Glue Collector – Deeper Metadata, Stronger Lineage

We’ve supercharged the AWS Glue Collector to bring in even richer metadata and more complete lineage. The collector now captures detailed metadata from Glue Data Catalog tables—including file types, sizes, serializers, and deserializers—as well as improved job metadata. Even better, it now builds lineage to S3 objects, providing full traceability back to the ultimate source. This enhancement empowers more robust impact analysis and stronger governance across your AWS data landscape.

🔗 Azure Data Factory Collector – Smarter Lineage for Parameterized Pipelines

The ADF Collector just got an upgrade: it now captures lineage from parameterized datasets, surfacing both upstream and downstream references in complex, dynamic pipelines. This gives data teams clearer visibility into data movement across ADF workflows—especially in cases where pipeline logic changes based on parameters—resulting in more accurate lineage, improved trust, and tighter control over your data flows.

🔧 Public API Enhancements – Create & Manage Discussions Programmatically

Now you can create and manage discussions directly through the public API, making it easier to capture critical conversations wherever they happen. Whether integrating with ticketing systems, BI tools, or custom workflows, this enhancement turns your catalog into a true collaboration hub—bridging the gap between data context and team communication. By embedding conversations directly into the metadata layer, you unlock smarter decision-making and greater operational efficiency.

🔮 What’s Next?

📊 Coming Soon - New Governance Dashboards

Later this week, we’re launching a new suite of Governance Dashboards in the Admin Portal, your command center for understanding how your data catalog is being used. These interactive, visual dashboards give instance administrators the tools they need to monitor platform activity, analyze user engagement, and uncover trends across your data.world environment. From search behavior to resource metrics, usage, and daily active users, you will have the clarity to spot what’s working, what’s being overlooked, and where to invest next.

Built with ready-to-use views and flexible filters, these dashboards empower teams to make faster, smarter decisions about content strategy, user enablement, and platform optimization. This feature is exclusively available to users with Instance Administrator permissions in private instances and single-tenant deployments.

🔄 Coming Soon – New Marquez / OpenLineage Collector

Our upcoming Marquez Collector brings support for the OpenLineage standard, enabling teams to capture lineage from custom pipelines built with Python, PySpark, SQL, and more. By pulling lineage directly from the Marquez metadata store, this collector helps you document and govern bespoke or proprietary data pipelines—even when no native connector exists.

🏢 Coming Soon – New SAP HANA Collector

We’re expanding our enterprise coverage with a new SAP HANA Collector, designed to harvest standard database metadata—and even some lineage—from SAP’s powerful data warehousing platform. This collector will help customers bring critical SAP assets into the catalog for better visibility, governance, and reuse.

Ready to unlock more from your data? Contact your Customer Success representative to explore these groundbreaking updates and unlock the full potential of data.world, the simpler and smarter catalog.

data.world January Product Launch

January Releases: Expanding Connectivity, Simplifying Setup, and Enhancing Lineage Exploration

This month, we’re delivering powerful updates to enhance metadata collection, streamline setup, and improve lineage exploration. We’re expanding our connectivity with new on-premise collectors for Apache Airflow and Qlik Talend, enabling deeper metadata harvesting and lineage tracking for critical ETL and workflow automation tools. People field configuration is now more intuitive, allowing user accounts to be dynamically selected for ownership and stewardship, reducing manual setup and improving governance. Finally, our new public API endpoint for lineage querying makes it easier for customers to customize lineage exploration with flexible queries and standardized outputs. These updates help teams work smarter, get to insights faster, and build on top of our platform with greater ease. 🚀

Support for Airflow and Talend on-premise Collection

By the end of January, the data.world collector integrations will include new collectors for Apache Airflow and Qlik Talend Data Integration (the on-premise version of Qlik’s Talend product). Airflow is an open source workflow automation tool that many enterprises use to schedule and manage data engineering and analytical tasks. The new collector will harvest metadata about these workflows–called Directed Acyclic Graphs–and the tasks contained within them. Qlik Talend is a data integration product that facilitates extract, transform, and load (ETL) processes; the new collector will identify sources and targets of these processes and harvest lineage relationships representing the flow of data between them.

These collectors will initially be available as on-premise collectors only, but will also be available as cloud collectors in early February.

Streamlined Setup of People Fields

Configuring ownership and stewardship just got easier! In addition to supporting people as collected resources, customers can now use their user accounts to populate people fields. This update streamlines setup, giving teams an intuitive way to get started quickly, ensuring seamless attribution and governance from the start, and helping end users connect with the right people. You can read the documentation for this feature here.

Screenshot of people field search and select

Resource Lineage Support in Public API

We’re making it easier than ever to programmatically explore lineage with our new Catalog Lineage Public API endpoint! This update provides flexible query options, allowing customers to tailor lineage exploration to their needs and build lineage-based tooling, automations, and integrations. This is a win for all lineage customers looking for deeper insights and more intuitive ways to navigate their data relationships.

UX Changes Coming Soon

Activity feed for Resources

Soon, we’ll be introducing a new Activity tab on Resources, Glossary and Collection objects that shows edit history and other activity in the UI. This will make it easier for users to quickly understand how resources have been updated and changed. An announcement of the release will follow soon.

Resource page redesign

Along with a new activity feed, we'll be introducing a newly designed details page that offers more intuitive navigation, better use of whitespace, configurable relationship tabs, inline editing, and other features that will make resources easier to understand and scan as well as easier to enrich and edit. An announcement of this release will follow this quarter.

data.world December Product Launch

It's time to announce our December releases. This month we've fully launched our SCIM integration, added new features to enhance browse, automation, and bulk editing, and introduced new data and some bug fixes in our metrics datasets.

Read on to learn about these new features and improvements!

SCIM now Generally Available

📢 SCIM (System for Cross-domain Identity Management) Integration is now in General Availability!

Managing user identities and group access just got smarter. data.world now supports SCIM for Okta and Azure Active Directory—and it’s available to all customers! 🎉

Automate user provisioning, keep group memberships up-to-date, and effortlessly manage Push Groups. Say goodbye to manual access updates and hello to streamlined identity management at scale. See the product documentation for more information.

Ready to try it out? 🚀

Enable SCIM and simplify user management today. Reach out to get started!

New Features

Tabbed browse card

Catalog teams have asked for more control over the landing and home page experience in order to organize content and serve multiple personas to help guide discovery. We now support multiple named and tabbed browse cards so that admin users can configure a landing experience for various purposes, personas, teams, domains, or more. Tabbed browse cards are supported for UI and MDP configuration of both site-wide and individual catalog org profile pages. CTK-configured browse cards do not support tabbed browse cards at this time.

Governance: Automation History MVP

Automation History MVP provides Catalog Admins with the ability to view a history of automation runs, including the status of both past and current executions. This feature improves observability, allowing admins to determine when automations last ran and whether they were successful.

Bulk Editing Support for Columns

Bulk Editing Support for Columns enables Catalog teams to efficiently edit columns at the table level. From the Columns tab on a table page, users can perform all bulk editing actions previously available only at the collection level. This enhancement streamlines the curation process, making it easier to manage and refine columns within individual tables.

Updates to Metrics Datasets

We’re excited to share updates to key tables in the metrics and audit datasets delivered as part of your Data Catalog Team org. These changes fix bugs and add features to these tables.

Updated Tables and Key Changes

1. Daily_dwec_asset_facts (available in baseplatformdata, platform-analytics, and platform-data private Snowflake listings) and Resources_live_metadata_assets_activity_by_day (available in ddw-metrics dataset and ddw-metrics private Snowflake listings)

The ASSET column has been renamed to rdf_type for improved clarity. This column now uniformly holds the value “ALL” due to a proliferation of types in customer data catalogs, making aggregation by rdf_type not directly usable. For granular measures by resource type, use daily_catalog_resources_pages_facts.

2. daily_catalog_resources_pages_facts

Resource creation, updates, and deletions are now derived from audit events for more reliable insights. However, human-readable names are not available on audit events, so the resourcename column will remain NULL until specific UI actions are performed in the catalog (e.g., resource view, edit, delete). Additionally, the resourcetype column now includes an array of type IRIs, enabling more detailed categorization.
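
Queries and scripts that read daily_catalog_resources_pages_facts may need to tolerate both changes. The sketch below assumes rows have already been loaded as dictionaries, with resourcename set to `None` when NULL and resourcetype as a list of IRI strings. The sample rows and IRIs are placeholders, not real catalog types.

```python
from collections import Counter


def display_name(row):
    """Return the resource name, or a placeholder while it is still NULL."""
    return row["resourcename"] or "(name not yet available)"


def count_by_type(rows):
    """Count rows per type IRI; a row with several types counts once per type."""
    counts = Counter()
    for row in rows:
        for type_iri in row["resourcetype"] or []:
            counts[type_iri] += 1
    return counts


# Hypothetical sample rows for illustration only.
rows = [
    {"resourcename": None, "resourcetype": ["https://example.org/Table"]},
    {
        "resourcename": "Orders",
        "resourcetype": ["https://example.org/Table", "https://example.org/Asset"],
    },
]

for row in rows:
    print(display_name(row), row["resourcetype"])
print(count_by_type(rows))
```

Because one resource can carry several type IRIs, per-type counts can add up to more than the number of rows.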

Looking Ahead

We are actively working to reintroduce aggregation by rdf_type through new categorization techniques and will provide updates as progress is made.

For more details on these changes please see the release notes for our metrics datasets here.

As always, please reach out to your Customer Success team if you have questions or want to provide feedback.

data.world October Product Launch

The October release of data.world brings a wide variety of new capabilities and improvements across the platform – read on to learn more about the GA of Databricks Publisher, new collectors for MongoDB and Alteryx, Okta support in SCIM, the GA of the improved search experience, and more!

Additionally, we highlight some changes made to the data.world Open Data Community to improve privacy and preserve the quality of open data and the user experience.

Databricks Publisher Premium Automation

We’re excited to announce the GA launch of the Databricks Publisher Premium Automation! This new feature allows users to seamlessly publish metadata from data.world to Databricks, simplifying the process of managing and synchronizing key data attributes. Specifically, users can now automatically publish table and column descriptions from data.world to Databricks and push selected metadata attributes as Databricks tags. Whether you prefer manual updates or fully automated syncing, this automation ensures that metadata remains consistent between platforms, reducing manual effort and improving data integrity. With data.world now acting as the source of truth, your metadata stays up-to-date across systems effortlessly.

For more information, see the product documentation.

New MongoDB and Alteryx Collectors

This month, we’re excited to announce new MongoDB and Alteryx Collectors, both available in Private Preview. If you’re interested in early access to either of these new collectors, please reach out to your Customer Service Director.

MongoDB Collector

The MongoDB Collector catalogs metadata from MongoDB, helping maintain a comprehensive inventory of MongoDB assets, facilitating better governance, discovery, and utilization of data across your organization.

This collector harvests metadata for MongoDB databases, collections, views, indexes and more.

An example collection from MongoDB

Alteryx Collector

The Alteryx Collector catalogs metadata from Alteryx, helping maintain a comprehensive inventory of Alteryx assets, facilitating better governance, discovery, and utilization of data across your organization.

This collector harvests metadata for workflows, workflow nodes, workflow jobs, connections, schedules and more.

An example collection from Alteryx

Improved lineage for SQL Server

The SQL Server Collector now collects additional lineage relationships that SQL parsing did not previously capture. It does this using built-in SQL Server functions that describe relationships between objects (such as, in some cases, the columns and tables referenced by views or stored procedures).

For more information and detail, see the description of lineage collected by the SQL Server Collector in the product documentation.

Support for Okta in SCIM

The active Private Preview of SCIM (System for Cross-domain Identity Management) now supports Okta in addition to Microsoft Entra ID, allowing customers who use Okta as their enterprise identity provider to have automated management of users and groups in data.world.

If you are interested in being part of the SCIM Private Preview, or just want to learn more, please reach out to your Customer Service Director.

Webhook Authorization enhancement

Webhooks now support an optional authorization key parameter to help consuming applications verify the origin and permissions for an incoming webhook. Learn more
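
On the receiving side, a consuming application might check the key before trusting a payload. This is a minimal sketch under stated assumptions: the `Authorization` header name, the way the key is delivered, and the `EXPECTED_KEY` value are all hypothetical, so consult the webhook documentation for the actual format.

```python
import hmac

EXPECTED_KEY = "replace-with-your-configured-key"  # hypothetical secret


def is_authorized(headers, expected_key=EXPECTED_KEY):
    """Return True when the incoming request carries the expected key.

    Assumes the key arrives in an "Authorization" header (an assumption).
    """
    received = headers.get("Authorization", "")
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(received.encode(), expected_key.encode())


print(is_authorized({"Authorization": EXPECTED_KEY}))
print(is_authorized({"Authorization": "something-else"}))
```

Requests that fail the check can be rejected before any payload processing happens.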

Collection Details in Technical Reference

By popular request, the Technical Reference page for catalog resources now includes details about the collections the resource belongs to. Learn more

Relative Time Advanced Search Syntax

Create powerful saved searches for resources by updated and created dates using three new relative time options:

`created:today`
`updated:yesterday`
`created:{last 30 days}`

See the product documentation on creating advanced searches to learn more.

UX Improvements

Adding the new search experience to Organizations: Our new search experience has been a big hit with users. It’s faster, cleaner, and provides more advanced features in the UI. We've fully retired the classic experience and brought the new search features to the Resources, Glossary and Collection landing pages.

Coming Soon! Advanced relationship editing: We're adding new improvements that make it easier to find the right resources and add or remove more than one relationship at a time.

Coming Soon! More look-and-feel updates: Next up in our work to update and modernize our UI, we'll be swapping the old default avatars to a newer color palette and default avatar design that utilizes letters. This change will also provide a more accessible experience as it gives users the ability to distinguish users and organizations using letters.

Changes to data.world Open Data Community

data.world Open Data Community profiles, datasets, and projects now behind a login wall: To better control the privacy of our users and to protect the effectiveness of the content on our active Open Data Community, we have made the decision to restrict access to profiles, datasets, and projects to account holders. It is always free to join our open data community.

data.world Open Data Community commenting restrictions: Commenting is now restricted to contributors on datasets and projects in the data.world Open Data Community. Organizations can enable comments on their public datasets through organization settings. This feature is not available or enforced for Enterprise customers on private instance or VPC deployments.

data.world September Product Launch

The September release of data.world is live, and is jam-packed with features and improvements that will make catalog teams and users more effective! We’ve got a major improvement to the Search UX, support for SCIM to improve access management, and new collectors to continue making your data.world catalog the center of your organization’s data ecosystem.

Read on to learn about these and other exciting new features!

Simplified Search Experience

We’re excited to announce that our new and improved search experience is now generally available! The new experience has been in public preview since mid-July. Based on preview feedback and months of user research, we’ve made significant updates to ensure a more intuitive and streamlined search experience.

With a cleaner interface and new features, users can now find what they need faster and with fewer clicks. Enjoy faster discovery through quick organization scoping; build, save, and share custom filters; pin and order your own top facets; preview the resource without leaving search; view important context like resource hierarchy and custom metadata; and utilize inline advanced features without leaving your current search.

Why the change? We listened to your input and designed a new search experience that eliminates noise and complexity, making it easier for all users, from newcomers to experienced catalog pros, to enjoy faster, more efficient searching.

You can learn more about these improvements in our product documentation and our Searching and Exploring Data course in data.world University.

Notable UI and UX improvements

Relating Resources: Resource relationships are important. We’ve made small but meaningful improvements to the user experience for relating resources to one another. Users will now have more room and more context as they search and relate resources.

Providing Feedback: We want to hear directly from end-users so that we can provide an even better product. We’ve built a simple in-app feedback form that allows us to get direct feedback from users.

Look and Feel: We’ve updated our typography, modernized actions and components, and provided more space to absorb information. This is the first in a series of user interface improvements we’ll be rolling out over the coming months.

SCIM - Microsoft Entra (Private Preview)

Imagine never worrying about managing user access again! We are excited to introduce support for SCIM in data.world, to streamline the process of managing catalog, organization and resource access and permissions.

SCIM (System for Cross-domain Identity Management) makes user management automatic, ensuring the users in your enterprise always have the right access at the right time, by syncing user details and group membership between your identity provider and data.world. From onboarding to offboarding, every detail – job title, email, and permissions – stays perfectly synced. Why does this matter? Because separate data entry and manual errors can lead to security risks, manual effort, and frustration. With SCIM, you’re not just saving time; you're enhancing security and efficiency, protecting your business, and making sure nothing slips through the cracks. It’s peace of mind, effortlessly.

This Private Preview release of SCIM functionality supports Microsoft Entra (a.k.a. Azure Active Directory) as an identity provider. We will be adding Okta support in the near future.

If you’re interested in learning more about SCIM support in data.world, or accessing the Private Preview, reach out to your Customer Service Director.

Qlik Sense Cloud Collector (Private Preview)

The Qlik Sense Cloud Collector catalogs metadata from Qlik Sense Cloud, helping maintain a comprehensive inventory of Qlik Sense Cloud assets, facilitating better governance, discovery, and utilization of data across your organization.

This collector harvests metadata for apps, visualizations, sheets, fields, measures and more. For more information, see the product documentation.

An example collection from Qlik Sense

Informatica Cloud Data Integration (CDI) Collector (Private Preview)

The Informatica Cloud Data Integration (CDI) Collector catalogs metadata from CDI, helping maintain a comprehensive inventory of CDI assets, facilitating better governance, discovery, and utilization of data across your organization.

This collector harvests metadata for jobs, mappings, mapping tasks, and more. For more information, see the product documentation.

An example collection from Informatica Cloud Data Integration

Amazon QuickSight Collector (GA)

The Amazon QuickSight Collector, previously in Private Preview, is now generally available! The QuickSight Collector catalogs metadata from QuickSight, helping maintain a comprehensive inventory of QuickSight assets, facilitating better governance, discovery, and utilization of data across your organization.

This collector harvests metadata for Analyses, Dashboards, Datasets, Data Sources, and Folders. Additionally, lineage metadata is harvested between QuickSight Analyses, Datasets, and Data Sources when the data source is a relational database, S3, or Athena. For more information, see the product documentation.

An example of a QuickSight Dashboard

Search API + Syntax improvements

Public search API endpoints and all UI search interfaces now support advanced syntax to target specific metadata fields when constructing complex searches. The following keywords and partial match syntax are now supported:

`title:"Sales Order"`
Searches for exact matches on the title field for the value “Sales Order”.

`title:"*sales"`
Searches for matches with titles containing the term “sales”.
The leading '*' character specifies a partial or fuzzy match.

“description” and “summary” can also be used as keywords to target specific fields.

Custom metadata can be searched in this manner with the following syntax:

`metadata:"Submitted by:Tim Gasper"`
Searches for exact matches on the “Submitted by” field with the value “Tim Gasper”.

`metadata:"Steward:*Juan"`
Searches for matches on the “Steward” field that contain the term “Juan”.

When using the public search APIs, specific fields can be targeted for exact and partial matching using the property object and the desired field IRI:

{
    "owner": "democorp",
    "property": {
        "https://democorp.linked.data.world/d/ddw-catalogs/steward": "*Gasper"
    }
}
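
A request body like the one above can also be built programmatically. The sketch below only constructs the JSON body. The function name is hypothetical, and the endpoint URL and authentication are deliberately left out because they are not covered here.

```python
import json


def build_search_body(owner, field_iri, value, partial=False):
    """Build a search request body targeting one metadata field.

    A leading '*' requests a partial match, mirroring the syntax above.
    """
    pattern = f"*{value}" if partial else value
    return {"owner": owner, "property": {field_iri: pattern}}


body = build_search_body(
    "democorp",
    "https://democorp.linked.data.world/d/ddw-catalogs/steward",
    "Gasper",
    partial=True,
)
print(json.dumps(body, indent=4))
```

With `partial=False`, the value is passed through unchanged and is treated as an exact match.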

data.world August Product Launch

The August release of data.world brings a number of new and improved product capabilities, including an improved user interface for resource creation, real-time metadata sync with Databricks, a new metadata field to improve an understanding of where catalog resources come from, and enhancements to Microsoft, Salesforce and Databricks collectors.

Also available now is an exciting improvement to our AI Context Engine™ that helps provide explainable answers from your structured data.

Read on to learn about these exciting new features!

Active Directory authentication for Microsoft Collectors

The SQL Server, SQL Server Reporting Services, Power BI Report Server, and SQL Server Integration Services Collectors now support Active Directory domain credentials using the NTLM authentication type, allowing the collector to connect securely using Active Directory-managed authentication.

Salesforce Collector

The all-new Salesforce Collector catalogs rich metadata from Salesforce, helping maintain a comprehensive inventory of Salesforce assets, facilitating better governance, discovery, and utilization of data across your organization.

The new version of the collector now harvests metadata for objects, fields, dashboards, and reports directly via Salesforce APIs.

An example collection from Salesforce

Databricks Collector harvests lineage to Amazon S3 and ADLS Gen2

The Databricks Collector now harvests External Locations, allowing users to understand cross-system lineage between Databricks, Amazon S3, and Azure Data Lake Storage Gen2.

AI Context Engine - new "detailed answer" endpoint

The new "detailed answer" endpoint in AI Context Engine works similarly to the existing "Answer Tool" endpoint, but it returns much more information, including:

answer - Textual response, same as before
result - raw data & schema (frictionless data format)
sparql - SPARQL query
sql - the SQL query
targetSql - SQL queries executed against target systems
terms - Business terms that were used to generate the query
ontologyUsed - The parts of the ontology that were used to generate this response
evidence - the "thoughts" that were generated during the run (same as what you might see in the debug chat tool, Archimedes)

In contrast, "tool" endpoints are simpler, returning only the response in order to integrate seamlessly with other LLMs (e.g., OpenAI).
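
A minimal sketch of how a client might pick these fields out of a parsed response is shown below. The field names come from the list above; the helper name split_detailed_answer, the value types, and the sample payload are assumptions made for illustration, since the actual response schema is defined in the API reference.

```python
from typing import Any

# Field names as listed in the release note above.
DETAILED_ANSWER_FIELDS = (
    "answer", "result", "sparql", "sql",
    "targetSql", "terms", "ontologyUsed", "evidence",
)


def split_detailed_answer(payload: dict[str, Any]) -> tuple[dict[str, Any], list[str]]:
    """Return (listed fields present in the payload, listed fields that are absent)."""
    present = {name: payload[name] for name in DETAILED_ANSWER_FIELDS if name in payload}
    missing = [name for name in DETAILED_ANSWER_FIELDS if name not in payload]
    return present, missing


# Made-up payload for illustration only; real responses will differ.
sample = {
    "answer": "There were 42 orders last week.",
    "sql": "SELECT COUNT(*) FROM orders",
    "terms": ["Order"],
}
present, missing = split_detailed_answer(sample)
print(sorted(present))
print(missing)
```

Reporting the missing fields separately makes it easy to notice when a response carries less detail than expected.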

In the future, this and other "detailed answer" endpoints in AI Context Engine will return additional information and evidence as we further deliver on accurate, explainable and governed answers from your structured data.

Link: https://developer.data.world/reference/callanswer

Source System metadata field

Source System is a new default field that consistently describes the system from which the catalog record metadata was sourced (e.g. Tableau). This field will be used to improve discovery by allowing types to be organized by Source as well as helping to differentiate between ambiguous resource type names (e.g. Dataset). Read more about how to extend and configure this field for your custom types and collectors here.

For more information see our product documentation.

Databricks Publisher real-time updates

This feature allows automatic triggering of Databricks Publisher automation whenever a Databricks Column or Table description is updated in the data.world catalog via the UI or public API. This ensures real-time synchronization of metadata between data.world and Databricks, eliminating the need for manual updates.

Databricks Publisher (announced last month) is currently in Beta, so to get access, reach out to your Customer Success Manager.

Improved UX for resource creation

We’ve enhanced the UI for creating new resources in data.world. Now, when users create a resource, a multi-step wizard replaces the old, small pop-up modal.

This new approach makes resource creation much easier for a wider variety of users - thanks for your feedback on this important aspect of the catalog user experience!

data.world July Product Launch

The July release of data.world is here, with starter kits to get AI Context Engine™ up and running quickly, enhancements to collect more metadata for popular collectors, the first release of Databricks Publisher to keep metadata in sync between data.world and Databricks, and an opt-in preview of an improved search experience within data.world.

Read on to learn about these exciting new capabilities, available now!

AI Context Engine™ Starter Kits

To help customers quickly utilize the AI Context Engine, we now offer three starter kits:

Key Benefits

Quick Start: Each starter kit provides all necessary components to get up and running quickly.
Customization: Clients can modify the source code to suit their specific needs.
Ongoing Updates: Access the latest features and improvements by updating from GitHub.
No Custom LLM Required: The starter kits enable direct AICE API calls for basic Q&A on structured data.

Availability and Support

The starter kits are available to all AICE customers at the links above. While they are provided as-is and unsupported, they offer a robust foundation for developing custom applications.

Enhancements to Power BI, dbt, and Denodo Collectors

This month, we’re excited to announce enhancements to our top collectors to harvest more metadata and lineage relationships.

Power BI Collectors enhancements

The Power BI Service and Power BI Gov Collectors have been updated to harvest preview images for Power BI reports, allowing users to preview reports in data.world before they navigate to them in Power BI.

dbt Collector enhancements

The dbt Core Collector has been updated to streamline the setup of multiple dbt Core Collector instances: users can now specify multiple run_results.json files in a single run. Additionally, the dbt Core and dbt Cloud Collectors now support Azure Synapse as a target database.

Denodo Collector enhancements

The Denodo Collector has been updated to harvest lineage between Denodo resources, and cross-system lineage between Denodo and Power BI is now supported.

Announcing the Beta Release of Databricks Publisher!

We are excited to introduce the first version of the Databricks Publisher, launching in Beta on July 23. This new feature allows you to write back Databricks column and table metadata for individual resources with just the push of a button, streamlining your metadata governance processes.

What’s New?

The Databricks Publisher enables seamless synchronization of metadata, ensuring that annotations by subject matter experts and data stewards in data.world are reflected back in Databricks. This enhancement benefits analysts and data scientists by providing well-governed, meaningful metadata directly within their Databricks environment.

But that’s not all! We have more exciting features in the pipeline:

Real-Time Updates: Soon, you won’t need to manually sync changes. Any saved changes will automatically update in Databricks.
Tag Writeback: This upcoming feature will extend the functionality to include tags, further enhancing your governance capabilities.

Why This Matters

Synchronizing metadata between data.world and Databricks creates a seamless governance workflow. This integration supports a broader persona model where end users and governance professionals utilize data.world directly, while technical analysts and engineers use platforms like Databricks for discovery and data work. By bridging these environments, we’re making it easier for your teams to collaborate and access the data they need.

To get access to this Beta feature, reach out to your Customer Success Manager. We can’t wait for you to experience the benefits of this new feature and look forward to your feedback as we continue to enhance our offerings.

Stay tuned for more updates, and thank you for being a valued customer!

Public preview of new search experience now available

We're happy to announce the public preview of our new search experience! As part of our continuous effort to improve your experience with our platform, we've made significant upgrades to the way you search in data.world. For the next month, we invite you to opt in to the preview, where you can try these new features before they go live.

Our enhanced search now offers a preview of key resource elements, allowing you to quickly differentiate results and jump to related resources or explore resource lineage without leaving your search. Filters have been refined and streamlined to help you find exactly what you need more efficiently. You can now search for filters, select more than one, and use advanced filter operations (ANY, ALL, NONE). Organization scoping is smoother too: stay within your org scope, and sort regardless of your scope.
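
To make the three filter operations concrete, here is a small sketch of the conventional set semantics such operators usually carry. It is an assumption about how ANY, ALL, and NONE behave, not a description of data.world's implementation, and the function name matches and the sample tags are invented for illustration.

```python
def matches(resource_values: set[str], selected: set[str], operation: str) -> bool:
    """Apply an ANY / ALL / NONE filter to one resource's values for a facet."""
    if operation == "ANY":
        return bool(resource_values & selected)      # at least one selected value is present
    if operation == "ALL":
        return selected <= resource_values           # every selected value is present
    if operation == "NONE":
        return not (resource_values & selected)      # no selected value is present
    raise ValueError(f"unknown operation: {operation}")


tags = {"finance", "pii"}
print(matches(tags, {"pii", "marketing"}, "ANY"))    # True
print(matches(tags, {"pii", "marketing"}, "ALL"))    # False
print(matches(tags, {"marketing"}, "NONE"))          # True
```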

We’ve made the experience cleaner and more user-friendly, cutting down on overwhelming filter options and obscure filter values, and enabling a customizable order of filters – pin your favorite or most-used ones to the top for ease. Plus, you can also build and save your own filters for future and repeated use.

These changes stem from the invaluable feedback we received from you, our valued customers and users. You asked for a simpler, more manageable experience, the ability to preview more types, as well as personal customizations like saved searches and re-ordered facets – and we listened! We’re incredibly excited for you to experience these advancements. You can read more about what we’ve changed in our documentation portal.

The interface invites you to share your feedback directly within the app, or you can reach out to your Customer Success Manager to voice your thoughts. We look forward to hearing from you!

data.world June Product Launch

The June release of data.world is here, featuring a new collector for Microsoft SSIS, enhancements to collect more metadata for Databricks and Power BI, the introduction of versioning for Governance Automations, and a useful set of new security and management features for catalog admins and data stewards.

Read on to learn about these exciting new capabilities, available now!

Enhancements to Databricks and Power BI Collectors, and a new SQL Server Integration Services Collector

This month, we’re excited to announce improvements to some of our most frequently used collectors, and the new SQL Server Integration Services collector.

The new SQL Server Integration Services Collector is available in Private Preview. Contact your Customer Success Director to learn how to participate in the program.

SQL Server Integration Services Collector in Private Preview

The SQL Server Integration Services (SSIS) Collector catalogs metadata from SSIS, helping maintain a comprehensive inventory of SSIS assets, facilitating better governance, discovery, and utilization of data across your organization.

This collector harvests metadata for projects, packages, control flow/data flow executables, and much more.

An example collection from SSIS

Power BI Service Collector enhancements

The Power BI Service Collector has been updated to harvest Power BI Measures (including lineage) and lineage between calculated columns. Additionally, the collector now harvests lineage from Power BI Query statements and supports lineage between upstream data sources configured using ODBC connections.

Databricks Collector performance enhancements and external locations

A number of performance improvements were made to the Databricks Collector when harvesting tags on Databricks tables and lineage metadata from Unity Catalog. Users may see up to 80% improvement depending on the shape/weight of their Databricks instance. The Databricks Collector has also been updated to harvest external locations (Azure Data Lake Storage Gen2 and Amazon S3).

Start exploring today

These new collector updates help users understand where the data in these reports is sourced from, facilitating troubleshooting for analysts and increasing trust for business end users. Learn more in our documentation.

Versioning for Governance Automations

This new functionality allows admins of Governance Automations in data.world to:

Edit Existing Automations: Modify your current automations directly within the system, ensuring they meet your evolving needs.
Maintain Task Integrity: Existing tasks, both claimed and unclaimed, stay connected and unaffected by changes.
Future-Proof Configurations: Any new runs initiated post-edit will use the updated configuration.
Stay Updated: Edited automations automatically update to the latest template version, incorporating the newest features and capabilities.
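
One way to picture these guarantees is a toy model in which every edit appends a new configuration version and each run records the version that was current when it started. This is a conceptual sketch only; the class names Automation and Run, and the idea of pinning a version per run, are assumptions and do not describe how data.world stores automations.

```python
from dataclasses import dataclass, field


@dataclass
class Automation:
    """Toy model: each edit appends a new configuration version."""
    versions: list[dict] = field(default_factory=list)

    def edit(self, config: dict) -> int:
        self.versions.append(config)
        return len(self.versions) - 1  # index of the newly added version

    def start_run(self) -> "Run":
        # New runs always pick up the latest configuration.
        return Run(automation=self, version=len(self.versions) - 1)


@dataclass
class Run:
    automation: Automation
    version: int  # pinned when the run starts; later edits do not change it

    @property
    def config(self) -> dict:
        return self.automation.versions[self.version]


automation = Automation()
automation.edit({"approvers": ["stewards"]})
old_run = automation.start_run()

automation.edit({"approvers": ["stewards", "owners"]})
new_run = automation.start_run()

print(old_run.config)  # still the configuration the run started with
print(new_run.config)  # the updated configuration
```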

Enjoy the additional flexibility and power of Editable Automations, available today!

Data Exfiltration Controls

Enterprise customers can now configure instance-wide policies for downloading dataset content from the platform. This suite of controls also adds support for restricting who can create personal access tokens for the data.world public API.

Learn more about this feature by visiting our documentation portal.

Organization Browse Card Wizard

Organization admins can now design, build, link resources, and edit the organization level browse card through a visual wizard in the UI. This feature is compatible with more advanced Browse Card configuration options such as automations.

Learn more about this new feature by visiting the documentation portal.

Administrative Functions for Discussion Topics

Resource admins now have the ability to manage Discussion Topics on the Discussions tab of a resource to help maintain highly useful and accurate content. An admin can delete discussion topics as well as edit the title of an existing topic.

Learn more about this new feature by visiting the documentation portal.

data.world May Product Launch

The May release of data.world has something for everyone – user experience improvements for navigation and understanding, a new source of technical lineage, big performance improvements across multiple collectors, and a set of powerful and time-saving capabilities for the admins and program teams building and managing their catalog experience.

Read on to learn about these exciting new features!

New relationships summary and browsing

In our continued effort to streamline the user experience, we've rolled out a few changes to our catalog metadata resource details pages. These adjustments have been carefully designed to save valuable time and provide the information needed to understand and navigate related resources. Stay tuned for more updates to these pages in the coming months!

Collector performance improvements and Oracle lineage

This month, we’re excited to announce improvements to the overall runtime for a number of our collectors and an update to the Oracle collector for harvesting lineage metadata.

We are also excited to announce that both the Amazon DynamoDB Collector and Azure Data Factory Collector are now generally available.

Performance improvements across collectors

A number of performance improvements were made to collectors to improve their overall runtime and reduce memory usage. Customers may see up to 90% improvement depending on the collector and shape/weight of their database/data warehouse.

These collectors include Snowflake, Redshift, Databricks, Denodo, Oracle, PostgreSQL, Teradata, MySQL, Db2, Netezza, SQL Server, dbt Core, and dbt Cloud collectors.

Oracle collector harvests lineage metadata

The Oracle collector now harvests lineage relationships from Oracle views, stored procedures, and functions. With this new metadata, users can now visualize and query how data moves within Oracle and other technologies.

Start exploring today

These new collector updates help users catalog their sources faster, facilitate troubleshooting for analysts, and increase trust for business end users. Learn more about what is supported in our documentation:

Oracle Collector documentation

Organization Details Public API

We’ve added a utility endpoint to our public API to surface organization details such as extended description and avatar for use in integration development. Visit the developer portal to learn more.

Catalog Resource Public API

We're delighted to announce an updated suite of API endpoints focused on flexible catalog management.

We’ve seen wide adoption of advanced catalog features like custom resources, relationships and integrations. Our public API now has full support for our “catalog anything” mission and brings the flexibility of the knowledge graph to the initiatives you are working on, both on and off the data.world platform.

Visit our interactive developer portal to learn more and try out the new functionality.

In-App Technical Reference

As a companion to the Catalog Management APIs, we’ve added a new in-app reference to each resource page that provides in-depth information about the ontology and configuration details for the resource. The Technical Reference page can be found by navigating to the “Settings” tab of any resource and clicking “Technical Reference” in the left navigation menu.

Use this reference as a starting point for taking full advantage of the power of the knowledge graph through SPARQL queries and our Public API.

The reference provides details about the supported relationships, metadata fields, selection values, asset statuses and type inheritance for the resource.

For a comprehensive introduction to the Technical Reference, visit our documentation portal.

User Management Utilities

Administrators with the Instance Admin role now have expanded capabilities for managing active users of the platform. They can now deactivate users themselves when someone leaves the company or should no longer have access to the system.

We’ve also enabled instance admins to promote other users to the role without needing to do so through data.world support.

Access Audit Utility

Administrators are often asked to troubleshoot access issues and confirm that access has been appropriately revoked when users change roles or need help. The user management portal includes a quick reference utility to check the access level a user has to any resource on the platform.

You can learn more about both of these features in our documentation.
