
Teradata Loom® is an integrated big data solution for effective data and metadata management on Hadoop, enabling rapid analyst productivity by making it easy to find, access and understand data. Loom is the complete solution for getting the most from your data lake. Loom provides access to Hadoop data and metadata through an open REST API. Built on the API, the Loom Workbench is a simple browser-based UI for working with data and metadata in Hadoop.
Agile Data Preparation
Agile data preparation reduces the time required to prepare data for analysis and enables analysts to work more effectively with “big data”. Agile data preparation allows analysts to engage with the data up front and then iterate to the right schema or shape to meet the analytic requirement. Although Hadoop provides the fundamental data storage and processing engine, agile data preparation also calls for new kinds of tools for working with data.
Teradata Loom’s built-in Weaver tool provides data wrangling and fine-grained data preparation capabilities. Weaver enables highly exploratory, iterative interactions with datasets to quickly prepare the data for meaningful statistical analysis. Analysts and data scientists today spend 80% of their time finding and preparing data, time that ideally should be spent on the analysis itself. With Teradata Loom’s Weaver capabilities, these data professionals can spend more time analyzing the data rather than preparing it, thereby dramatically increasing their productivity.
Loom Weaver
Loom Weaver is designed to be an interactive solution for working with big data. Built on Loom’s metadata management capabilities, Weaver helps analysts prepare Hadoop-based data for analysis. Weaver allows users to sample tables registered in Loom, then execute any number of operations on the sample. Weaver supports operations on string, numeric and date-time columns. Once all of the operations have been specified on the sample data, users submit them to be executed over the full data in HDFS. Weaver generates the MapReduce jobs required to execute the operations. As part of Loom, the lineage of all Weaver transformations is automatically tracked and incorporated into the lineage graph. Weaver increases the productivity of analysts working in Hadoop by making it easier to find and prepare the right data faster.
Hadoop Metadata Management
The Hadoop ecosystem includes powerful tools for processing and analyzing large amounts of data. However, capabilities for managing metadata are limited. As a result, analysts struggle to find and understand the data, and enterprises are left to watch the data lake become a data swamp. This creates substantial risks, such as the inability to determine data origins, data context, semantic consistency and data lineage, which makes it very difficult for data analysts and the wider audience to find and work with the data in a seamless fashion. Consequently, organizations struggle to trust, understand and use this data effectively for their own purposes. It is critical to have an integrated data management solution in place to ensure rapid access to high-quality, high-integrity data. Teradata Loom’s Hadoop management capabilities empower all stakeholders to maximize the return on investment from a lower-cost, more powerful data platform.
Try Loom with CDH
As a convenience, we have pre-installed and configured Loom inside a Cloudera QuickStart VM, which you can get from the Loom Download Page. Make sure you are logged in to DevX and select the Teradata Loom with CDH option. Review the terms of the click-through license and, if you agree, allow the .ova file to download (the file is 4.7GB, so download time will depend on your bandwidth).
Once the .ova file has downloaded, you can use either VirtualBox or VMware Player/Fusion to open the appliance, following the instructions provided on the Cloudera QuickStart VM website.
In simple terms, this involves importing the .ova file as an appliance into, say, VirtualBox, and then hitting the Start button to power up the VM.
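For readers who prefer the command line, the import step can also be sketched with VirtualBox's VBoxManage tool. This is a minimal sketch, not part of the official instructions: the function name import_appliance and the example file path are hypothetical, and the actual .ova file name is whatever you downloaded.

```shell
# Sketch: import a downloaded .ova appliance into VirtualBox.
# import_appliance is a hypothetical helper name; pass the path of the
# .ova file you actually downloaded as the first argument.
import_appliance() {
    ova_file="$1"
    if [ ! -f "$ova_file" ]; then
        echo "OVA file not found: $ova_file" >&2
        return 1
    fi
    if ! command -v VBoxManage >/dev/null 2>&1; then
        echo "VBoxManage not found; is VirtualBox installed?" >&2
        return 2
    fi
    # Import the appliance using the settings stored in the .ova file.
    VBoxManage import "$ova_file" || return 3
    # List registered VMs to see the name given to the imported VM.
    VBoxManage list vms
}
```

After a call such as import_appliance ~/Downloads/your-file.ova, the VM can be powered up from the VirtualBox window or with VBoxManage startvm followed by the VM name shown in the list.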
Wait for the Cloudera QuickStart VM and Loom to fully initialise (on smaller VMs this can take 2 to 3 minutes). Note: You may need to manually start Loom by opening a Terminal window and executing the following commands:
cd /opt/loom-2.3.0-beta1
nohup bin/loom-server.sh &> /dev/null &
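A slightly more defensive variant of the commands above can be wrapped in a shell function that checks for the launch script before starting it. This is only a sketch: the function name start_loom is hypothetical, and the default directory is simply the path used in this article, which may differ for other Loom versions.

```shell
# Sketch: start the Loom server only if its launch script is present.
# start_loom is a hypothetical helper; the default install directory is
# the one used in this article and can be overridden with an argument.
start_loom() {
    loom_home="${1:-/opt/loom-2.3.0-beta1}"
    if [ ! -x "$loom_home/bin/loom-server.sh" ]; then
        echo "Launch script not found or not executable: $loom_home/bin/loom-server.sh" >&2
        return 1
    fi
    # Run from the install directory in the background, immune to hangups,
    # discarding both standard output and standard error.
    ( cd "$loom_home" && nohup bin/loom-server.sh >/dev/null 2>&1 & )
}
```

Calling start_loom with no argument uses the default path; passing a directory, as in start_loom /opt/some-other-loom-dir, targets a different install.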
Now that you have Loom with CDH running, you can proceed to Getting Started With Loom.