Using Microsoft Azure Data Lake Store with Apache Hive in CDH
Microsoft Azure Data Lake Store (ADLS) is a massively scalable distributed file system that can be accessed through an HDFS-compatible API. ADLS acts as a persistent storage layer for CDH clusters running on Azure. In contrast to Amazon S3, ADLS more closely resembles native HDFS behavior: it provides consistency, a file and directory structure, and POSIX-compliant ACLs. See the ADLS documentation for conceptual details.
CDH 5.11 and higher support using ADLS as a storage layer for MapReduce2 (MRv2 or YARN), Hive, Hive-on-Spark, Spark 2.1, and Spark 1.6. Comparable HBase support was added in CDH 5.12.
For information about using Hive with ADLS, see Configuring ADLS Connectivity for CDH.
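Because ADLS is reached through an HDFS-compatible API, a Hive table can in principle point at an ADLS path through an `adl://` URI once connectivity has been configured. The following is a minimal sketch, not a definitive procedure: it only builds the URI and the text of a `CREATE EXTERNAL TABLE` statement. The account name `myaccount`, the path `/warehouse/events`, the table name, its columns, and the two helper functions are all hypothetical and chosen purely for illustration.

```python
def adls_uri(account: str, path: str) -> str:
    """Build an adl:// URI for a path in an ADLS account.

    Assumes the standard ADLS host suffix azuredatalakestore.net.
    """
    return f"adl://{account}.azuredatalakestore.net/{path.lstrip('/')}"


def external_table_ddl(table: str, columns: dict, location: str) -> str:
    """Return HiveQL text for an external table stored at `location`.

    `columns` maps column names to Hive type names; the names used in the
    example below are illustrative only.
    """
    cols = ", ".join(f"{name} {hive_type}" for name, hive_type in columns.items())
    return (
        f"CREATE EXTERNAL TABLE {table} ({cols}) "
        f"STORED AS PARQUET "
        f"LOCATION '{location}'"
    )


# Hypothetical account and path; substitute your own values.
location = adls_uri("myaccount", "/warehouse/events")
ddl = external_table_ddl("events", {"id": "BIGINT", "payload": "STRING"}, location)
print(ddl)
```

The generated statement would then be run from a Hive client such as Beeline. It is expected to succeed only if the cluster already holds valid ADLS credentials, as described in Configuring ADLS Connectivity for CDH.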
Page generated May 18, 2018.
©2016 Cloudera, Inc. All rights reserved.