Monitoring the Performance of HDFS Replications
You can monitor the progress of an HDFS replication schedule using performance data that you download as a CSV file from the Cloudera Manager Admin console. This file contains information about the files being replicated, the average throughput, and other details that can help diagnose performance issues during HDFS replications. You can view this performance data for running HDFS replication jobs and for completed jobs.
To view the performance data for a running HDFS replication job:
- Go to .
- Locate the schedule.
- Click Performance Report and select one of the following options:
  - HDFS Performance Summary – Download a summary report of the performance of the running replication job. An HDFS Performance Summary report includes the last performance sample for each mapper that is working on the replication job.
  - HDFS Performance Full – Download a full report of the performance of the running replication job. An HDFS Performance Full report includes all samples taken for all mappers during the full execution of the replication job.
- To view the data, import the file into a spreadsheet program such as Microsoft Excel.
To view the performance data for a completed HDFS replication job:
- Go to .
- Locate the schedule and click .
  The Replication History page for the replication schedule displays.
- Click to expand the display for this schedule.
- Click the Download CSV link and select one of the following options:
  - Listing – a list of files and directories copied during the replication job.
  - Status – full status report of files where the status of the replication is one of the following:
    - ERROR – An error occurred and the file was not copied.
    - DELETED – A deleted file.
    - SKIPPED – A file where the replication was skipped because it was up-to-date.
  - Error Status Only – full status report, filtered to show files with errors only.
  - Deleted Status Only – full status report, filtered to show deleted files only.
  - Skipped Status Only – full status report, filtered to show skipped files only.
  - Performance – summary performance report.
  - Full Performance – full performance report.
  See HDFS Performance Report Columns for a description of the data in the performance reports.
- To view the data, import the file into a spreadsheet program such as Microsoft Excel.
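As an alternative to a spreadsheet, the downloaded file can also be read with a short script. The following is a minimal sketch in Python, assuming that the first row of the CSV file holds the column names listed under HDFS Performance Report Columns. The sample data, host names, and the function name `read_performance_report` are made up for illustration and do not come from Cloudera Manager.

```python
import csv
import io


def read_performance_report(stream):
    """Return the rows of a performance report as a list of dicts keyed by column name."""
    return list(csv.DictReader(stream))


# Made-up sample standing in for a downloaded report (assumption: the header
# row uses the documented column names; a real file has more columns).
SAMPLE = (
    "Timestamp,Host,SrcFile,TgtFile,BytesCopiedPerFile,CurrThroughput\n"
    "t1,host-a.example.com,/src/a,/tgt/a,1048576,524288\n"
    "t2,host-b.example.com,/src/b,/tgt/b,2097152,262144\n"
)

rows = read_performance_report(io.StringIO(SAMPLE))
for row in rows:
    # CSV values are read as strings; convert numeric columns explicitly.
    print(row["Host"], row["SrcFile"], int(row["CurrThroughput"]))
```

To read a real download instead of the sample, open the file with `open(path, newline="")` and pass the file object to `read_performance_report`.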
The performance data is collected every two minutes. Therefore, no data is available during the initial execution of a replication job because not enough samples are available to estimate throughput and other reported data.
Performance Data Columns | Description |
---|---|
Timestamp | Time when the performance data was collected. |
Host | Name of the host where the YARN or MapReduce job was running. |
SrcFile | Name of the source file being copied by the MapReduce job. |
TgtFile | Name of the file to which the source file was being copied on the target. |
BytesCopiedPerFile | Number of bytes copied for the file currently being copied. |
TimeElapsedPerFile | Total time elapsed for this copy operation of the file currently being copied. |
CurrThroughput | Current throughput in bytes per second. |
AvgFileThroughput | Average throughput in bytes per second since the start of the file currently being copied. |
TotalSleepTime | Number of seconds the transfer was stalled due to throughput throttling. This is expected to be zero unless the throughput was throttled using the Maximum Bandwidth parameter for the replication schedule. (You configure this parameter on the Advanced tab when creating or editing a replication schedule.) |
AvgMapperThroughput | Average throughput for current mapper. This can include samples of throughput taken for various files copied by this mapper. |
BytesCopiedPerMapper | Total bytes copied by this MapReduce job. This can include multiple files. |
TimeElapsedPerMapper | Total time elapsed since this MapReduce job started copying files. |
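The columns above lend themselves to a quick per-host summary, for example to spot slow hosts or transfers that were throttled. The sketch below is one possible approach, not part of Cloudera Manager. It assumes the CSV header uses the column names from the table, and it groups samples by Host, which is only an approximation of per-mapper grouping because the table lists no mapper identifier. The function name `summarize_by_host` and the sample values are hypothetical.

```python
import csv
import io
from collections import defaultdict


def summarize_by_host(stream):
    """Summarize CurrThroughput (bytes/s) and TotalSleepTime (s) per host."""
    throughputs = defaultdict(list)
    max_sleep = defaultdict(float)
    for row in csv.DictReader(stream):
        host = row["Host"]
        throughputs[host].append(float(row["CurrThroughput"]))
        max_sleep[host] = max(max_sleep[host], float(row["TotalSleepTime"]))
    return {
        host: {
            "samples": len(values),
            "mean_curr_throughput": sum(values) / len(values),
            "max_sleep_seconds": max_sleep[host],
        }
        for host, values in throughputs.items()
    }


# Made-up sample limited to the columns used here (assumption: a real report
# contains these columns under the same names).
SAMPLE = (
    "Host,CurrThroughput,TotalSleepTime\n"
    "host-a,1000,0\n"
    "host-a,3000,0\n"
    "host-b,500,12\n"
)

summary = summarize_by_host(io.StringIO(SAMPLE))
for host, stats in summary.items():
    # A nonzero sleep time suggests the Maximum Bandwidth limit was reached.
    note = "throttled" if stats["max_sleep_seconds"] > 0 else "not throttled"
    print(host, stats["samples"], stats["mean_curr_throughput"], note)
```

A nonzero `max_sleep_seconds` for a host indicates that at least one sample on that host reported stalled time, which the table attributes to the Maximum Bandwidth setting.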
A sample CSV file, as presented in Excel, is shown here:
Note the following limitations:
- If you click the CSV download too soon after the replication job starts, Cloudera Manager returns an empty file or a CSV file that has column headers only, along with a message to try again later, when performance data has actually been collected.
- If you employ a proxy user with the form user@domain, performance data is not available through the links.
- If the replication job only replicates small files that can be transferred in less than a few minutes, no performance statistics are collected.
- For replication schedules that specify the Dynamic Replication Strategy, statistics regarding the last file transferred by a MapReduce job hide previous transfers performed by that MapReduce job.
- Only the last trace per MapReduce job is reported in the CSV file.
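Because a download made too early can be empty or contain only column headers, a script that processes these reports may want to check for that case before analyzing the data. The following is a rough heuristic sketch, not documented Cloudera Manager behavior: it assumes that any "try later" message, if it appears inside the file at all, does not have the same number of comma-separated fields as the header row. The function name `report_state` is hypothetical.

```python
import csv
import io


def report_state(text):
    """Classify downloaded report text as 'empty', 'header-only', or 'has-data'."""
    rows = [
        row
        for row in csv.reader(io.StringIO(text))
        if any(cell.strip() for cell in row)  # ignore blank lines
    ]
    if not rows:
        return "empty"
    header, body = rows[0], rows[1:]
    # Heuristic: count only rows whose field count matches the header, so a
    # stray one-field message line is not mistaken for a data sample.
    data_rows = [row for row in body if len(row) == len(header)]
    return "has-data" if data_rows else "header-only"


print(report_state(""))
print(report_state("Timestamp,Host\n"))
print(report_state("Timestamp,Host\nt1,host-a\n"))
```

If the result is not `has-data`, waiting for at least one more sampling interval before downloading again is a reasonable next step.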
©2016 Cloudera, Inc. All rights reserved.