DockerHub and metadata collector enhancement roundup

Our metadata collector (dwcc, aka the data.world Catalog Collector) is now available on DockerHub! Simply run docker pull datadotworld/dwcc:x.xx where x.xx is your desired version, and you're in business. It's that easy.

Other enhancements to the metadata collector:

  • Updated Domo collector to improve relationship modeling
  • Various Tableau & Manta collector fixes & enhancements
  • Denodo metadata collector support shifted to Denodo 8
  • --config-file option for metadata collector (Beta): We've heard your feedback asking for a simpler way to manage the configurations for your metadata collectors. In the near future, the config file will become the default way to set your parameters. Lots more info on this coming soon!

Enhanced metadata collector: Tableau

The data.world catalog collector (dwcc), version 2.25 and newer, now includes enhanced collection of Tableau metadata, leveraging the Tableau Metadata API. To collect the maximum available metadata, we recommend using Tableau user credentials with admin-level permissions. The latest version of the collector is also included automatically if you use Connection Manager.

New metadata includes expanded information from datasources, databases, fields, metrics, and many more inter-object relationships.


Bug roundup 🐞

In the last few weeks, we've made several minor bug fixes and enhancements. Here are some notable ones:

  • Improved help text for tags, including on pressing “Enter” to add tags
  • Improved empty state messaging for adding contributors to a dataset
  • Consistent use of timestamps in alerts and notifications
  • Navigation tabs on various pages are now keyboard-navigable (left and right arrow keys) for ease of browsing and improved accessibility
  • “Share” button directly opens “Grant access” modal
  • Consistent use of display name in emails
  • Text truncation fixed for filter bars and the project workbench
  • Fixed various layout, text, and navigational misalignments and inconsistencies

Glossary Inline Descriptions

Looking for a quick definition in your glossary? Want to see your full glossary in one view? The Glossary overview has been updated to include inline descriptions for your terms. Click on a term to view additional details and metadata.



Gra.fo: Share document via link

Need some quick feedback on your collaborative modeling project? Want to share your graphical view of the world with your customers or your team?

You can now share your Gra.fo model documents without requiring your audience to have a Gra.fo account or an individual invitation. Use the "Get link" option of the share menu to grant read-only access to your document to anyone you've shared the link with. 



Public Release: Search on custom metadata fields

One of the most powerful features of the data.world catalog is the ability to enrich your catalog resources with custom metadata that's unique to your business.

We are pleased to announce that we've rolled out the first of several improvements to support search matches on custom metadata fields for your catalog resources. 

This improvement expands the fields we match against for free text searches to include any custom metadata fields you have configured in your catalog as text or selection fields. This feature empowers your end users to search for resources by the terminology and categorizations that mean the most to your business.

In the example above, "verified by" and "data steward" are custom metadata fields defined in our catalog for tables. A search for sarah smart now yields matches where she is listed as the data steward or as the person who verified the data, in addition to any existing matching fields like owner.

Tip: You can perform more precise searches against custom metadata fields with our advanced search syntax. In the example above, a search for metadata:"data steward:sarah smart" will return filtered results where Sarah Smart is listed as the Data Steward. 
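
As a rough illustration of the syntax in the tip above, here is a minimal Python sketch that assembles a search term of the form metadata:"field:value". The helper name metadata_query is hypothetical and not part of any data.world library; the rules for escaping quotes inside a value are not covered here, so the sketch simply rejects them.

```python
def metadata_query(field: str, value: str) -> str:
    """Build an advanced-search term of the form metadata:"field:value".

    Hypothetical helper for illustration only; it just formats a string.
    """
    field = field.strip()
    value = value.strip()
    if not field or not value:
        raise ValueError("field and value must be non-empty")
    # Escaping rules for embedded quotes are unknown here, so refuse them.
    if '"' in field or '"' in value:
        raise ValueError("double quotes are not supported in this sketch")
    return f'metadata:"{field}:{value}"'


# Reproduces the search string from the tip above.
query = metadata_query("data steward", "sarah smart")
print(query)
```

The resulting string can be pasted into the search bar as-is.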

Look for upcoming releases to further support boolean and IRI-based metadata searches.

Coming Soon: Addressing timezone inconsistency

🚨 Default behavior change coming next week 🚨

We have recently discovered that when executing queries, there are some cases where our DATETIME columns contain timezone information, and other cases where they do not. This is primarily an issue that arises with columns containing date/time information in uploaded files (we do not see this with live tables). We have decided to address this inconsistency. Starting next week, query result columns of type DATETIME will no longer contain timezone information, while columns of type DATETIMESTAMP will always contain timezone information.

The impact of this change shouldn’t be significant, and most users will see no change. However, if you have queries across ingested data which aggregate on DATETIME columns, or do DATE_ADD() style calculations, you may notice differences in your results depending on your current timezone.

If you are impacted by this change, here are some ways to clarify your intent regarding timezones:

  1. CAST the resulting column to a DATETIMESTAMP to force timezones, or DATETIME to strip timezones (documentation)
  2. Use AT_TIME_ZONE() to explicitly state your timezone (documentation)
  3. Ensure that the table column type is set to be of type DATETIMESTAMP or DATETIME (documentation)

Note: If timezone information is desired, but not defined, UTC is assumed. 
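
To see why results can shift, here is a small Python sketch of the general idea. It is only an analogy: Python's naive and timezone-aware datetime values stand in loosely for DATETIME and DATETIMESTAMP, the sample timestamp and the UTC-06:00 offset are made up for illustration, and nothing here calls data.world or reproduces the exact behavior of CAST or AT_TIME_ZONE().

```python
from datetime import datetime, timedelta, timezone

# A date/time value as it might appear in an uploaded file: no zone attached.
# (Loosely analogous to a DATETIME column.)
naive = datetime(2021, 3, 1, 23, 30)

# When a zone is wanted but not defined, assume UTC.
# (Loosely analogous to forcing a timezone onto the value.)
as_utc = naive.replace(tzinfo=timezone.utc)

# Stating the zone explicitly instead; UTC-06:00 is an arbitrary example offset.
example_zone = timezone(timedelta(hours=-6))
as_example_zone = naive.replace(tzinfo=example_zone)

# Same wall-clock reading, different instants in time.
gap = as_example_zone - as_utc

# Grouping by calendar day depends on the zone: 23:30 at UTC-06:00
# already falls on the next day when expressed in UTC.
day_naive = naive.date()
day_in_utc = as_example_zone.astimezone(timezone.utc).date()

print(gap)         # difference between the two interpretations
print(day_naive)   # calendar day without zone information
print(day_in_utc)  # calendar day after converting to UTC
```

This is the kind of difference that can surface in aggregations or date arithmetic over ingested DATETIME columns, which is why stating the intended timezone explicitly is the safest option.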

Please contact support@data.world with any questions or concerns. As always, we’re happy to help.

Search Improvements: Rankings and Special Characters

Search is a core feature of data.world and is consistently a focus of our improvement efforts. This month, we've rolled out improvements to result rankings on searches that contain more than one word. This improvement provides better rankings for results with titles that exactly match the submitted search as well as several other ranking and relevance improvements. These changes also provide better support for searches that contain special characters such as ampersands, dashes, underscores, and slashes.


New Gra.fo API: Export document as TTL or OWL

Head over to the Gra.fo API Documentation to learn more about how to use the Gra.fo public API to export your model as TTL or OWL. The export API is designed for use with build scripts, version control tools, and integrations, or for uploading your model into other tools, like data.world.

Fun fact: Ever wonder why our adorable mascot, Sparkle, is an OWL? Now you know!

Looking for the API Documentation? We've added a link to the user menu in Gra.fo:

Gra.fo supports several additional export formats from the document page. These options can be found under the File menu.

