Highlights –
- The new capabilities added to Collibra’s Data Intelligence Cloud platform are intended to make it accessible to users of all skill levels.
- A new standalone offering, Data Quality and Observability in the Cloud, applies predictive analytics to data quality by proactively identifying data quality concerns across data sources using checks and rules generated by machine learning.
Collibra, a leading data catalog and governance provider, announced several new features for its Data Intelligence Cloud platform. It has also announced new standalone product offerings and integrations with Snowflake, Azure Data Factory, and Google Cloud Storage.
According to the company, the latest releases are designed to make it easier for more users across an organization to access data. The announcements were made at Data Citizens ’22, a conference Collibra is hosting to bring together business executives and data experts to discuss subjects such as data quality, catalogs, privacy, and governance.
Updated Data Intelligence Cloud
The new capabilities added to Collibra’s Data Intelligence Cloud platform are intended to make it more accessible to users regardless of their technical expertise. The first is a new data marketplace that offers a shopping-like experience for accessing an organization’s internal datasets. According to Laura Sellers, chief product officer at Collibra, the data marketplace is made “for the data consumer – the casual user of data who doesn’t need to know the ins and outs of the full data ecosystem.”
Users look for datasets through a Google-like search interface, and datasets are surfaced based on the employee’s domain and function within the company. Administrators can create marketplaces based on user roles, such as one designed exclusively for business analysts that surfaces Tableau, Power BI, or Looker reports approved for business use. Administrators can also create a marketplace based on data domain. For instance, a marketplace that caters to the marketing team would display only the datasets that belong to and are maintained by that department.
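The role-based and domain-based scoping described above can be pictured with a small sketch. This is not Collibra's API or data model: the `Asset` and `Marketplace` types, the role names, and the sample catalog are all hypothetical, and the example only illustrates how a marketplace might expose a filtered view of a shared catalog.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Asset:
    """A catalog entry (hypothetical shape, not Collibra's data model)."""
    name: str
    kind: str       # e.g. "report" or "dataset"
    domain: str     # owning data domain, e.g. "marketing"
    approved: bool  # whether the asset has been approved for use


@dataclass
class Marketplace:
    """A filtered storefront over the catalog for a set of user roles."""
    name: str
    roles: set = field(default_factory=set)
    domains: set = field(default_factory=set)  # empty set means any domain
    kinds: set = field(default_factory=set)    # empty set means any kind

    def visible_assets(self, role, catalog):
        """Return the approved assets this marketplace surfaces for a role."""
        if role not in self.roles:
            return []
        return [
            asset for asset in catalog
            if asset.approved
            and (not self.domains or asset.domain in self.domains)
            and (not self.kinds or asset.kind in self.kinds)
        ]


# Hypothetical sample catalog.
CATALOG = [
    Asset("Campaign ROI dashboard", "report", "marketing", True),
    Asset("Web traffic events", "dataset", "marketing", True),
    Asset("Draft churn model inputs", "dataset", "marketing", False),
    Asset("Quarterly revenue report", "report", "finance", True),
]

# Role-based marketplace: approved reports only, for business analysts.
analyst_market = Marketplace(
    "Analyst reports", roles={"business_analyst"}, kinds={"report"}
)

# Domain-based marketplace: everything approved in the marketing domain.
marketing_market = Marketplace(
    "Marketing data", roles={"marketer"}, domains={"marketing"}
)

analyst_view = [
    a.name for a in analyst_market.visible_assets("business_analyst", CATALOG)
]
marketing_view = [
    a.name for a in marketing_market.visible_assets("marketer", CATALOG)
]
```

In this sketch the unapproved draft dataset never appears in either view, and a user whose role is not attached to a marketplace sees nothing from it.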
Another feature of note is the Usage Analytics dashboard, which keeps track of platform users and the domains, communities, and data assets that are being used. To abide by local data privacy laws, access to user information may be restricted. Data views filtered by date period can be generated, showing information on which teams are utilizing the platform and which data assets are being used.
In one scenario, when a marketplace is set up for a company’s marketing team, administrators might use the Usage Analytics dashboard to monitor engagement with the marketplace, including which data assets get the most use and whether new users are engaging with the data. The marketing team may find this information helpful for grouping related data assets. Data about user engagement, or the lack of it, can be used to reallocate product licenses or to allocate resources to identifying and removing engagement barriers.
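The kind of date-filtered view described above amounts to a simple aggregation over usage events. The sketch below is illustrative only: the `UsageEvent` record, the function names, and the sample data are hypothetical and do not reflect how Collibra stores or exposes usage data.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class UsageEvent:
    """One interaction with a data asset (hypothetical record shape)."""
    user: str
    team: str
    asset: str
    day: date


def summarize_usage(events, start, end):
    """Aggregate events whose day falls within [start, end], inclusive."""
    in_range = [e for e in events if start <= e.day <= end]
    return {
        "views_per_asset": Counter(e.asset for e in in_range),
        "active_teams": sorted({e.team for e in in_range}),
        "active_users": sorted({e.user for e in in_range}),
    }


def inactive_users(licensed_users, summary):
    """Licensed users with no activity in the summarized period."""
    return sorted(set(licensed_users) - set(summary["active_users"]))


# Hypothetical sample events.
EVENTS = [
    UsageEvent("ana", "marketing", "Campaign ROI dashboard", date(2022, 11, 1)),
    UsageEvent("ben", "marketing", "Campaign ROI dashboard", date(2022, 11, 2)),
    UsageEvent("ana", "marketing", "Web traffic events", date(2022, 11, 3)),
    UsageEvent("cy", "finance", "Quarterly revenue report", date(2022, 10, 15)),
]

november = summarize_usage(EVENTS, date(2022, 11, 1), date(2022, 11, 30))
most_used = november["views_per_asset"].most_common(1)
idle = inactive_users(["ana", "ben", "cy", "dee"], november)
```

A list like `idle` is the sort of signal an administrator could use when deciding whether to reallocate licenses or investigate engagement barriers.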
The update also includes a redesigned Collibra homepage, which is meant to streamline the user experience by displaying recently popular items and offering suggestions based on browsing behavior.
Workflow Designer, which automates routine data administration operations such as granting users access to datasets and certifying new ones, is the last feature introduced to Collibra’s Data Intelligence Cloud platform. Integrated into the Collibra platform navigation, Workflow Designer is intended to make it easier for users to build processes. To create a workflow process, users can, for instance, “drag an icon from the left sidebar for launching a script, add it into the workflow process at the right moment, load the relevant script, etc.,” according to Sellers.
Additionally, Workflow Designer has a form editor that assists users in creating forms to collect the data needed for business processes. This includes specifying the data to be gathered, modifying the form layout, and establishing dependencies for form components. Users may now combine “a defined set of processes and forms that are utilized together to automate a process (e.g., approve access to a dataset)” using Workflow Designer’s new “Apps” functionality, according to Sellers. After the app has been developed and tested, it can be exported and deployed into any environment.
New standalone products
Beyond the platform features, Collibra announced new standalone offerings. The first is Collibra Protect, which enables no-code policy authoring and enforcement in Snowflake. Authored policies can limit use or access based on sensitivity level or business purpose. For example, a policy can state that third-party marketing data may be used only for marketing research, thus limiting access to that data to those responsible for conducting market research.
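The purpose-based rule in the example above can be expressed as a short check. Collibra Protect is described as a no-code product that enforces policies in Snowflake, so the following is not its interface; it is only a sketch of the underlying logic. The `PurposePolicy` type, the category and purpose names, and the default of treating categories without a policy as unrestricted are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PurposePolicy:
    """Restricts a data category to a set of business purposes (hypothetical)."""
    data_category: str
    allowed_purposes: frozenset


def is_access_allowed(policies, data_category, purpose):
    """Allow access only if every policy on the category permits the purpose.

    Assumption for this sketch: a category with no policy is unrestricted.
    """
    relevant = [p for p in policies if p.data_category == data_category]
    return all(purpose in p.allowed_purposes for p in relevant)


# Third-party marketing data may be used only for marketing research.
POLICIES = [
    PurposePolicy("third_party_marketing", frozenset({"marketing_research"})),
]

research_ok = is_access_allowed(
    POLICIES, "third_party_marketing", "marketing_research"
)
sales_ok = is_access_allowed(
    POLICIES, "third_party_marketing", "sales_outreach"
)
unrestricted_ok = is_access_allowed(
    POLICIES, "public_reference", "sales_outreach"
)
```

A stricter design would deny access to any category that has no policy attached; which default is appropriate depends on the organization's governance stance.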
The next solution, Data Quality and Observability in the Cloud, applies predictive analytics to data quality: it proactively identifies data quality concerns across data sources using checks and rules generated by machine learning. According to Sellers, the value-add of “predictive, continuous, self-service data quality” is that it both ensures business users can access high-quality data and frees up data experts to concentrate on higher-impact work. Sellers says users may “connect to more than 40 databases and file systems to scan data where it resides via pushdown or pull-up processing,” and Collibra Data Quality & Observability can be used with any cloud.
Data Quality Pushdown for Snowflake, which is presently in beta, is designed for Snowflake customers to “eliminate egress charges and dependencies on Spark compute while running their [data quality] jobs,” according to Sellers. Because the data is never read out of the Snowflake environment, the feature removes egress fees and helps with compliance with privacy laws. She says, “Pushdown is an alternative compute option for running a [data quality] job, where all processing … is submitted to the target data warehouse. To use pushdown, you can run a setup script that creates a dedicated Snowflake Virtual Warehouse and a service account user for DQ job runs. This designated service account user will need read access to all schemas covering the target data. Collibra will provide customers with a Snowflake Pushdown setup script [to] run to use this new feature.”
Strengthening connections
Collibra also included new integrations with Snowflake, Azure Data Factory, and Google Cloud Storage in this update. The integration with Snowflake offers end-to-end visibility of data stored in Snowflake Data Cloud, including column-level lineage and transformations. Sellers says the integration with Azure Data Factory “automatically harvests and stitches lineage from Azure Data Factory so that [users] can get a complete picture of data flow from source to destination.” The integration with Google Cloud Storage supports accessing, mapping, and ingesting metadata from buckets, folders, and files, so users can discover and manage Google Cloud Storage data within Collibra.
Greater access
Collibra says it wants to make data more available to users through its new platform capabilities, products, and integrations. Extending access to these capabilities should further democratize data within businesses, benefiting both technical and business users.