How to Enable S3 Cloud Storage in Hue
Cloudera S3 Connector in Cloudera Manager securely connects your CDH cluster to Amazon S3.
- C5.11 adds S3Guard for list consistency and support for IAM roles in Cloudera Manager.
- C5.10 connects Hue, Impala, and Navigator securely with the Cloudera S3 Connector Service.
- C5.9 adds support for Amazon S3 with plain-text credentials using Cloudera Manager safety valves.
Enable S3 in Hue with the S3 Connector Service
Method | Security | Required | Services |
---|---|---|---|
Secure Mode | High | Kerberos, Sentry | Hue, Impala, Navigator |
Unsecure Mode | Medium | | Hue, Impala, Navigator, Hive |
- Log on to Cloudera Manager.
- Select .
- Click Add Access Key Credentials or Add IAM Role-based Authentication.
Important: IAM Role-based Authentication is not fine-grained authentication. Also, to use it with Hue, configure the region in hue_safety_valve.ini (see step 11).
- Add any Name and enter your S3 credentials:
- To connect your AWS root user, add the Access Key ID and Secret Access Key for your root account.
- To connect an IAM user, add the Access Key ID and Secret Access Key for a read-only IAM account.
- If you have an Amazon DynamoDB database, check Enable S3Guard for consistent read operations.
Warning: Components writing data to S3 are constrained by the inherent Amazon S3 limitation known as "eventual consistency." This can lead to data loss when a Spark or Hive job writes output directly to S3. Cloudera recommends that you use S3Guard, or write to HDFS and then distcp to S3.
- Click Enable for <cluster name> to give Hue access to S3 and S3-backed tables. Impala must have permissions defined in Sentry.
- If using access keys, select Secure or Unsecure mode. Select Unsecure to use Hive.
- Click Continue (at Step 1) if your cluster passes validation. You are automatically taken to step 5.
- Click Continue (at Step 5) to restart Hive, Impala, Oozie, and Hue.
- When finished, click Home to see the S3 Connector.
Note: A gray status icon means the S3 Connector service was successfully added.
- If using S3 Signature Version 4 regions, include the region endpoint name in fs.s3a.endpoint.
- Select the S3 Connector Service.
- Select Configuration.
- Set Default S3 Endpoint with the region endpoint name.
Valid endpoint names are those listed in the Amazon S3 section of AWS Regions and Endpoints.
- Click Save Changes.
- Restart Hue: select and .
- If using IAM roles, set the region to us-east-1 (N. Virginia) in hue_safety_valve.ini.
Note: Configuring hue_safety_valve.ini is a temporary Hue workaround for CDH 5.10.
- Select .
- Filter by .
- Set Hue Service Advanced Configuration Snippet (Safety Valve) for hue_safety_valve.ini with the following:
[aws]
  [[aws_accounts]]
    [[[default]]]
      region=us-east-1
- Click Save Changes.
- Restart Hue: select and .
Note: The S3 Connector service is not added when you use IAM roles.
Related topics: How to Configure AWS Credentials and Configuring the Amazon S3 Connector.
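As a rough illustration of the endpoint step in this section, the Python sketch below derives a regional endpoint name from a region code. It assumes the common s3.<region>.amazonaws.com naming pattern (with a .com.cn suffix for China regions). The function name s3_endpoint_for_region is hypothetical and not part of Hue or Cloudera Manager. The Amazon S3 section of AWS Regions and Endpoints remains the authority for valid values.

```python
def s3_endpoint_for_region(region: str) -> str:
    """Return a candidate value for fs.s3a.endpoint (assumed naming pattern)."""
    region = region.strip().lower()
    if not region:
        raise ValueError("region must not be empty")
    # Assumption: China regions use the amazonaws.com.cn DNS suffix.
    suffix = "amazonaws.com.cn" if region.startswith("cn-") else "amazonaws.com"
    return f"s3.{region}.{suffix}"


if __name__ == "__main__":
    # Example: a Signature Version 4 region.
    print(s3_endpoint_for_region("eu-central-1"))
```

Check the result against the published endpoint list before entering it in Default S3 Endpoint.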
Enable S3 in Hue with Safety Valves
This section assumes an AWS account with access keys, but not necessarily a Kerberized cluster.
- Hue Service Advanced Configuration Snippet (Safety Valve) for hue_safety_valve.ini
- Cluster-wide Advanced Configuration Snippet (Safety Valve) for core-site.xml
- Hive Service Advanced Configuration Snippet (Safety Valve) for core-site.xml.
- Log on to Cloudera Manager and select .
- Select .
- Filter by .
- Set your S3 credentials in Hue Service Advanced Configuration Snippet (Safety Valve) for hue_safety_valve.ini:
Note: Store your credentials in a script that outputs to stdout. A security_token is optional.
[aws]
  [[aws_accounts]]
    [[[default]]]
      access_key_id_script=</path/to/access_key_script>
      secret_access_key_script=</path/to/secret_key_script>
      #security_token=<your AWS security token>
      allow_environment_credentials=false
      region=<your region, such as us-east-1>
For a proof-of-concept installation, you can add the IDs directly.
      access_key_id=<your_access_key_id>
      secret_access_key=<your_secret_access_key>
- Clear the scope filters and search on "core-site.xml".
- To enable the S3 Browser, set your S3 credentials in Cluster-wide Advanced Configuration Snippet (Safety Valve) for core-site.xml:
<property>
  <name>fs.s3a.access.key</name>
  <value>AWS access key ID</value>
</property>
<property>
  <name>fs.s3a.secret.key</name>
  <value>AWS secret key</value>
</property>
- To enable Hive with S3, set your S3 credentials in Hive Service Advanced Configuration Snippet (Safety Valve) for core-site.xml.
- Click Save Changes.
- If using S3 Signature Version 4 regions, include the region endpoint name in fs.s3a.endpoint.
- Select the S3 Connector Service.
- Select Configuration.
- Set Default S3 Endpoint with the region endpoint name.
Valid endpoint names are those listed in the Amazon S3 section of AWS Regions and Endpoints.
- Click Save Changes.
- Restart Hue: select and .
- Restart Hive: select and .
Related topics: Amazon Web Services (AWS) Security.
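The note above about storing credentials in a script that outputs to stdout could be satisfied by something like the following Python sketch. The helper name read_credential is hypothetical, as is keeping each credential in its own permission-restricted file and passing that file as a command-line argument. Adapt both to however your site stores secrets.

```python
#!/usr/bin/env python3
"""Print a single AWS credential to stdout (a sketch, not a Hue-provided tool)."""
import sys
from pathlib import Path


def read_credential(path: str) -> str:
    """Return the credential stored in `path`, without surrounding whitespace."""
    value = Path(path).read_text(encoding="utf-8").strip()
    if not value:
        raise ValueError(f"no credential found in {path}")
    return value


# Assumption: the credential file is passed as the only command-line argument.
if __name__ == "__main__" and len(sys.argv) == 2:
    # Write only the credential to stdout, with no extra text or newline.
    sys.stdout.write(read_credential(sys.argv[1]))
```

With this approach the keys stay out of hue_safety_valve.ini itself; restrict the credential files so that only the account running Hue can read them.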
Generate Access Keys in AWS
To integrate Hue with S3, you must have an Amazon Web Services (AWS) account, with access keys for either your root user or a read-only IAM user.
Root Account
- Create an AWS account and sign in to the AWS Console.
- Create access keys for this AWS root account:
- Expand the drop-down menu under your account name and select My Security Credentials.
- Click Continue to Security Credentials.
- Expand Access Keys (Access Key ID and Secret Access Key).
- Click Create New Access Key.
- Click Show Access Key or Download Key File. These are your AWS root credentials.
IAM Account
- Create two IAM groups (AWS admin and S3 Read-only):
Important: AWS requires that your first IAM group and associated user has administrator access.
- Go to the IAM service.
- Click Groups and Create New Group.
- Enter a name and click Next Step.
- Filter on "admin" and select the AdministratorAccess policy.
- Click Next Step and Create Group.
- Create a second group with AmazonS3ReadOnlyAccess.
- Create two IAM users and assign one to the admin policy and one to the S3 read policy.
- Click Users and Add User.
- Enter a name, and at a minimum, select Programmatic access.
- Click Next: Permissions.
- Select the group with administrator permissions.
- Click Next: Review and Create User.
- Create a second user and assign the group with S3 read-only access.
- Create access keys for your read-only IAM user:
- Click the name of your read-only IAM user.
- Click the Security Credentials tab.
- Click Create Access Key.
- Click Show Access Key or Download Key File. These are your IAM user credentials.
IAM Permissions Needed for Hue S3 Browser
In AWS, IAM files are used to create policies that control access to resources in a VPC. You can give IAM roles and permissions to your Hue servers to allow the Hue S3 browser to make API requests without the need to use or distribute AWS credentials (accessKey and secretAccessKey). For more information about IAM, see the AWS Identity and Access Management User Guide in the AWS documentation. For instructions on how to create an IAM role, see Creating a Role to Delegate Permissions to an AWS Service in the AWS documentation. For information about granting permission to Amazon S3 resources, see Managing Access Permissions to Your Amazon S3 Resources in the AWS documentation.
- Only the Hue servers need to have an IAM role applied to them to access S3 with the browser.
- The Hue S3 browser does not become available until the Amazon S3 connector service is added to the cluster.
- ListBucket on the Amazon S3 resource is necessary to drill down into that bucket, along with ListAllMyBuckets on all resources.
- These permissions do not give access to other private buckets in that AWS account, although public buckets are accessible.
- For the Hue S3 browser, your Hue servers require permissions for the following methods:
- s3:ListBucket
- s3:PutObject
- s3:GetObject
- s3:DeleteObject
- s3:PutObjectAcl
- s3:ListAllMyBuckets
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "S3BucketPermissions",
      "Effect": "Allow",
      "Action": [ "s3:ListBucket" ],
      "Resource": [ "arn:aws:s3:::<S3BucketARN>" ]
    },
    {
      "Sid": "S3ObjectPermissions",
      "Effect": "Allow",
      "Action": [ "s3:PutObject", "s3:GetObject", "s3:DeleteObject", "s3:PutObjectAcl" ],
      "Resource": [ "arn:aws:s3:::<S3BucketARN>/*" ]
    },
    {
      "Sid": "AllS3",
      "Effect": "Allow",
      "Action": [ "s3:ListAllMyBuckets" ],
      "Resource": "*"
    }
  ]
}
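To avoid hand-editing mistakes, the policy above can also be generated programmatically. The Python sketch below builds the same three statements for a given bucket and serializes them with the standard json module, which always emits valid JSON. The function name build_hue_s3_policy and the bucket name example-bucket are hypothetical.

```python
import json

# Actions taken from the permission list for the Hue S3 browser.
BUCKET_ACTIONS = ["s3:ListBucket"]
OBJECT_ACTIONS = ["s3:PutObject", "s3:GetObject", "s3:DeleteObject", "s3:PutObjectAcl"]
ACCOUNT_ACTIONS = ["s3:ListAllMyBuckets"]


def build_hue_s3_policy(bucket: str) -> dict:
    """Return an IAM policy document granting the Hue S3 browser access to `bucket`."""
    bucket_arn = f"arn:aws:s3:::{bucket}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {   # Bucket-level permission: list the bucket's contents.
                "Sid": "S3BucketPermissions",
                "Effect": "Allow",
                "Action": list(BUCKET_ACTIONS),
                "Resource": [bucket_arn],
            },
            {   # Object-level permissions: apply to every key in the bucket.
                "Sid": "S3ObjectPermissions",
                "Effect": "Allow",
                "Action": list(OBJECT_ACTIONS),
                "Resource": [f"{bucket_arn}/*"],
            },
            {   # Account-level permission: list bucket names.
                "Sid": "AllS3",
                "Effect": "Allow",
                "Action": list(ACCOUNT_ACTIONS),
                "Resource": "*",
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_hue_s3_policy("example-bucket"), indent=2))
```

The printed document can be pasted into the IAM policy editor when you create the role for your Hue servers.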
©2016 Cloudera, Inc. All rights reserved