Decommissioning DataNodes Using the Command Line
Decommissioning a DataNode excludes it from the cluster after its blocks have been replicated to other active DataNodes. To decommission a DataNode:
- Create a file named dfs.exclude in the HADOOP_CONF_DIR (default is /etc/hadoop/conf).
- Add the hostname of each DataNode to be decommissioned, one per line.
- Stop the TaskTracker on the DataNode to be decommissioned.
- Add the following property to hdfs-site.xml on the NameNode host:
<property>
  <name>dfs.hosts.exclude</name>
  <value>/etc/hadoop/conf/dfs.exclude</value>
</property>
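Taken together, the steps above might look like the following on the NameNode host. This is a minimal sketch: the hostnames datanode01.example.com and datanode02.example.com are placeholders for the hosts you are decommissioning, and hdfs dfsadmin -refreshNodes (the same command used in the section below) makes the NameNode re-read the exclude file.

  # Add the hosts to decommission, one per line (placeholder hostnames).
  $ echo "datanode01.example.com" >> /etc/hadoop/conf/dfs.exclude
  $ echo "datanode02.example.com" >> /etc/hadoop/conf/dfs.exclude
  # After adding the dfs.hosts.exclude property to hdfs-site.xml,
  # tell the NameNode to re-read the exclude file:
  $ hdfs dfsadmin -refreshNodes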
When a DataNode is marked for decommission, all of the blocks on that DataNode are marked as under-replicated. In the NameNode UI, under Decommissioning DataNodes, you can see the total number of under-replicated blocks; this number decreases over time as decommissioning progresses.
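If you prefer the command line to the NameNode UI, each DataNode's decommission state also appears in the output of hdfs dfsadmin -report. The exact report format varies by Hadoop/CDH release, so treat the filter below as an illustration only.

  # List each DataNode name together with its decommission status.
  $ hdfs dfsadmin -report | grep -E "Name:|Decommission Status"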
Cloudera recommends that you decommission no more than two DataNodes at one time.
Stopping the Decommissioning Process
To stop the decommissioning process for a DataNode using the command line:
- Remove the DataNode name from /etc/hadoop/conf/dfs.exclude.
- Run the command $ hdfs dfsadmin -refreshNodes.
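For example, assuming the host to restore is datanode01.example.com (a placeholder name), the two steps above might look like this:

  # Delete the host's line from the exclude file, then have the
  # NameNode re-read it:
  $ sed -i '/datanode01.example.com/d' /etc/hadoop/conf/dfs.exclude
  $ hdfs dfsadmin -refreshNodes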