Performing Disk Hot Swap for DataNodes
This section describes how to replace HDFS disks without shutting down a DataNode. This is referred to as hot swap.
Warning: Requirements and Limitations
- Hot swap is supported for CDH 5.4 and higher.
- Hot swap can only add disks with empty data directories.
- Removing a disk does not move the data off the disk, which can result in data loss.
- Do not perform hot swap on multiple hosts at the same time.
Performing Disk Hot Swap for DataNodes Using Cloudera Manager
Minimum Required Role: Cluster Administrator (also provided by Full Administrator)
- Configure data directories to remove the disk you are swapping out:
- Go to the HDFS service.
- Click the Instances tab.
- Click the affected DataNode.
- Click the Configuration tab.
- Select .
- Select .
- Change the value of the DataNode Data Directory property to remove the directories that are mount points for the disk you are removing.
Warning: Change the value of this property only for the specific DataNode instance where you are planning to hot swap the disk. Do not edit the role group value for this property. Doing so will cause data loss.
- Click Save Changes to commit the changes.
- Refresh the affected DataNode. Select .
- Remove the old disk and add the replacement disk.
- Change the value of the DataNode Data Directory property to add back the directories that are mount points for the disk you added.
- Click Save Changes to commit the changes.
- Refresh the affected DataNode. Select .
- Run the HDFS fsck utility to validate the health of HDFS.
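The final fsck step can be scripted so that a swap is only treated as complete when the report declares the filesystem healthy. The following is a minimal sketch, not an official tool: fsck_is_healthy is a hypothetical helper name, and it assumes the fsck report ends with a summary line containing "is HEALTHY".

```shell
#!/bin/sh
# Sketch: decide whether an fsck report declares the filesystem healthy.
# fsck_is_healthy is a hypothetical helper, not part of HDFS.
# It reads the report on stdin and succeeds only if a line says the
# checked path "is HEALTHY" (assumed wording of the fsck summary line).
fsck_is_healthy() {
    grep -q "is HEALTHY"
}

# Example use on a cluster node (not executed by this sketch):
#   sudo -u hdfs hdfs fsck / | fsck_is_healthy && echo "HDFS is healthy"
```

If the helper fails, read the full fsck report before taking any further action on the DataNode.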
Performing Disk Hot Swap for DataNodes Using the Command Line
Important:
Use these instructions to perform hot swap of disks in a cluster that is not managed by Cloudera Manager.
- Follow these command-line instructions on systems that do not use Cloudera Manager.
- This information applies specifically to CDH 5.15.0. See Cloudera Documentation for information specific to other releases.
To add and remove disks:
- If you are adding disks, format and mount them.
- Change the value of dfs.datanode.data.dir in hdfs-site.xml on the DataNode to reflect the directories that will be used from now on (add new mount points and remove obsolete ones). For more information, see the instructions for DataNodes under Configuring Local Storage Directories.
- Start the reconfiguration process:
- If Kerberos is enabled:
$ kinit -kt /path/to/hdfs.keytab hdfs/<fully.qualified.domain.name@YOUR-REALM.COM> && hdfs dfsadmin -reconfig datanode HOST:PORT start
- If Kerberos is not enabled:
$ sudo -u hdfs hdfs dfsadmin -reconfig datanode HOST:PORT start
To check on the progress of the reconfiguration, you can use the status option of the command; for example, if Kerberos is not enabled:
$ sudo -u hdfs hdfs dfsadmin -reconfig datanode HOST:PORT status
- Once the reconfiguration is complete, unmount any disks you have removed from the configuration.
- Run the HDFS fsck utility to validate the health of HDFS.
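The start and status commands in the procedure above can be combined into a polling loop, so that disks are unmounted only after the reconfiguration has finished. This is a sketch under stated assumptions: reconfig_finished and wait_for_reconfig are hypothetical helper names, the caller is assumed to be authorized to run dfsadmin already (for example, running as the hdfs user, or after kinit on a Kerberos-enabled cluster), and the status output is assumed to contain the text "finished at" once the task is done. Verify that wording on your release before relying on it.

```shell
#!/bin/sh
# Sketch: start a DataNode reconfiguration and wait for it to finish.

# reconfig_finished (hypothetical helper) reads the output of
# "hdfs dfsadmin -reconfig datanode HOST:PORT status" on stdin.
# ASSUMPTION: the text contains "finished at" once the task is done.
reconfig_finished() {
    grep -q "finished at"
}

# wait_for_reconfig HOST:PORT [POLL_SECONDS] [MAX_POLLS]
# Returns 0 when the reconfiguration is reported as finished,
# 1 if it could not be started or is still running after MAX_POLLS checks.
wait_for_reconfig() {
    target=$1
    interval=${2:-10}
    max_polls=${3:-60}

    # Start the reconfiguration; give up if the command itself fails.
    hdfs dfsadmin -reconfig datanode "$target" start || return 1

    polls=0
    while [ "$polls" -lt "$max_polls" ]; do
        if hdfs dfsadmin -reconfig datanode "$target" status | reconfig_finished; then
            return 0
        fi
        polls=$((polls + 1))
        sleep "$interval"
    done
    return 1
}
```

A call such as wait_for_reconfig HOST:PORT && echo "safe to unmount" would then gate the unmount step on the reported completion. The polling interval and retry limit are arbitrary defaults chosen for illustration.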
To perform maintenance on a disk:
- Change the value of dfs.datanode.data.dir in hdfs-site.xml on the DataNode to exclude the mount point directories that reside on the affected disk and reflect only the directories that will be used during the maintenance window. For more information, see the instructions for DataNodes under Configuring Local Storage Directories.
- Start the reconfiguration process:
- If Kerberos is enabled:
$ kinit -kt /path/to/hdfs.keytab hdfs/<fully.qualified.domain.name@YOUR-REALM.COM> && hdfs dfsadmin -reconfig datanode HOST:PORT start
- If Kerberos is not enabled:
$ sudo -u hdfs hdfs dfsadmin -reconfig datanode HOST:PORT start
To check on the progress of the reconfiguration, you can use the status option of the command; for example, if Kerberos is not enabled:
$ sudo -u hdfs hdfs dfsadmin -reconfig datanode HOST:PORT status
- Once the reconfiguration is complete, unmount the disk.
- Perform maintenance on the disk.
- Remount the disk.
- Change the value of dfs.datanode.data.dir again to reflect the original set of mount points.
- Repeat step 2 to start the reconfiguration process again.
- Run the HDFS fsck utility to validate the health of HDFS.
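The maintenance procedure above edits dfs.datanode.data.dir twice: once to exclude the affected mount point and once to restore it. Because the property is a comma-separated list of directories, the reduced value can be derived from the original rather than retyped. The following is an illustrative sketch only: remove_data_dir is a hypothetical helper, and the paths in the usage comment are example values, not taken from any real cluster.

```shell
#!/bin/sh
# remove_data_dir LIST DIR
# Prints LIST (a comma-separated directory list, as used by
# dfs.datanode.data.dir) with every entry that exactly equals DIR removed.
# Hypothetical helper for illustration; it does not edit hdfs-site.xml.
remove_data_dir() {
    printf '%s\n' "$1" |
        tr ',' '\n' |              # one directory per line
        grep -v -x -F -- "$2" |    # drop exact, whole-line matches only
        paste -s -d ',' -          # join the remaining lines with commas
}

# Example (paths are made-up values):
#   remove_data_dir "/data/1/dfs/dn,/data/2/dfs/dn,/data/3/dfs/dn" "/data/2/dfs/dn"
```

Matching whole entries rather than substrings means that removing one directory leaves other entries sharing its prefix untouched. Keep a copy of the original value so that it can be restored unchanged after maintenance.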
Page generated May 18, 2018.