Simplifying the end user Data Catalog search

Romain Rigaux

May 1, 2018, 12:50:35 PM
to Hue-Users
Initially published on http://gethue.com/simplifying-the-end-user-data-catalog-search/

Data Catalog Search

Before typing any query to get insights, users need to find and explore the correct datasets. The usability of the Data Catalog search has been improved in each release since its introduction. It is accessible from the top bar of the interface and offers free text search of SQL tables, columns, tags and saved queries. This is particularly useful for quickly looking up a table among thousands or finding existing queries that already analyse a certain dataset.

In this iteration, the search provides more results directly via the ‘Show more’ link. Existing tags can be listed as facets simply by typing ‘tags:’.

Some example searches:

  • usage → Returns any table matching ‘usage’ in its name, description or tags.
  • type:view customer → Finds the customer view.
  • tax* tags:finance → Lists all the tables and views starting with ‘tax’ and tagged with ‘finance’.
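
To make the syntax above concrete, here is a minimal sketch of how such a search string could be split into free-text terms and ‘key:value’ facets, then matched against catalog entries. It is not Hue's actual implementation: the names `ParsedSearch`, `parse_search` and `matches`, and the dictionary shape of an entry, are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatch


@dataclass
class ParsedSearch:
    terms: list = field(default_factory=list)   # free-text words, may contain '*'
    facets: dict = field(default_factory=dict)  # e.g. {"type": ["view"]}


def parse_search(query):
    """Split a search string into free-text terms and 'key:value' facets."""
    parsed = ParsedSearch()
    for token in query.split():
        key, sep, value = token.partition(":")
        if sep and key:
            # A bare 'tags:' keeps an empty list, meaning "no filter yet".
            values = parsed.facets.setdefault(key.lower(), [])
            if value:
                values.append(value)
        else:
            parsed.terms.append(token)
    return parsed


def matches(entry, parsed):
    """Check one catalog entry (a plain dict, hypothetical shape) against a search."""
    wanted_types = parsed.facets.get("type")
    if wanted_types and entry["type"] not in wanted_types:
        return False
    # Every requested tag must be present on the entry.
    if any(tag not in entry["tags"] for tag in parsed.facets.get("tags", [])):
        return False
    haystack = [entry["name"], entry.get("description", "")] + entry["tags"]
    for term in parsed.terms:
        # Terms without a wildcard match anywhere; 'tax*' matches as a prefix.
        pattern = term.lower() if "*" in term else "*" + term.lower() + "*"
        if not any(fnmatch(text.lower(), pattern) for text in haystack):
            return False
    return True
```

For example, `parse_search("tax* tags:finance")` yields the term `tax*` and the facet `tags` with the value `finance`, so only entries whose name, description or tags start with ‘tax’ and that carry the ‘finance’ tag pass `matches`.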

Searching all the available queries or data in the cluster

Listing the possible tags to filter on. This also works for ‘types’.

Unification and Caching of all SQL metadata

The list of tables and their columns is displayed in multiple parts of the interface. This data is costly to fetch and comes from different sources. In this new version, the information is cached and reused by all the Hue components. Because the sources are diverse (e.g. Apache Hive, Cloudera Navigator, Cloudera Optimizer), their metadata is stored in a single object, which makes it easier and faster to display without caring about the underlying technical details.
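
As a rough illustration of the idea, the sketch below merges the metadata for one table from several sources into a single object and caches it so that later lookups do not hit the sources again. It is only a sketch under stated assumptions, not how Hue implements it: the `CatalogCache` class, the fetcher callables, the merged dictionary layout and the time-based expiry are all hypothetical.

```python
import time


class CatalogCache:
    """Merge metadata for one table from several sources and cache the result."""

    def __init__(self, fetchers, ttl_seconds=300, clock=time.monotonic):
        self._fetchers = fetchers  # {source name: callable(table) -> dict}
        self._ttl = ttl_seconds    # assumed expiry policy, purely illustrative
        self._clock = clock
        self._entries = {}         # table -> (expiry time, merged dict)

    def get(self, table):
        now = self._clock()
        cached = self._entries.get(table)
        if cached and cached[0] > now:
            return cached[1]       # still fresh: no source is contacted
        merged = {"name": table, "sources": []}
        for source, fetch in self._fetchers.items():
            merged.update(fetch(table))       # the costly call, done once per TTL
            merged["sources"].append(source)
        self._entries[table] = (now + self._ttl, merged)
        return merged
```

Any component that needs the columns or tags of a table would call `get()` on the shared cache and receive the same merged object, whichever source the individual fields came from.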

In addition to editing the tags of any SQL object (tables, views, columns…), which has been available since version one, table descriptions can now also be edited. This allows end users to document the metadata themselves, which was not possible until now because directly editing Hive comments requires admin Sentry privileges that are not granted to regular users in a secure cluster.

In the upcoming version, this information is also reused on the Catalog pages.

Showing all the common data now cached and unified for a slicker experience

As usual, thank you to all the project contributors, and to everyone sending feedback and participating on the hue-user list or @gethue!


r.m...@plenium.com

May 1, 2018, 1:24:23 PM
to Hue-Users
This is cool! Any chance we could get a search capability across all HDFS files, just like for tables, or does it already exist?

Romain Rigaux

May 2, 2018, 10:51:31 AM
to r.m...@plenium.com, Hue-Users
It is mostly implemented but not clean enough yet. It could also support S3 if configured. I created https://issues.cloudera.org/browse/HUE-8272 if you want to track it officially.

Geoff Duniam

May 6, 2018, 9:17:47 PM
to hue-...@cloudera.org
Hi

I'll be away from the office on leave from Monday 30 April returning Monday
7 May.

Thanks

Geoff
