This document tracks ongoing efforts to upgrade from Hadoop 2.x to Hadoop 3.x. Refer to the umbrella Jira HADOOP-15501 for the current status.
Upgrade Tests for HDFS/YARN
The following scenarios were tested while upgrading from Hadoop 2.8.4 to Hadoop 3.1.0:
Type | Component | Scenario | Issues Found | Resolution | Overall Status |
---|---|---|---|---|---|
EXPRESS/ROLLING UPGRADE | HDFS | Starting 3.1.0 NameNode/DataNode with a custom MetricsPlugin configured in hadoop-metrics2.properties | | Workaround applies only to EXPRESS UPGRADE: replace the MetricsPlugin implementation jars (e.g. HadoopTimelineMetricsSink) with jars recompiled against the package "org.apache.commons.configuration2" | |
EXPRESS UPGRADE | YARN | Starting Hadoop 3.1.0 YARN daemons | |||
ROLLING UPGRADE | HDFS | 3.1.0 NN started with rollingUpgrade, with the default policy configured for erasure coding | | Workaround not known | |
ROLLING UPGRADE | YARN | Starting 3.1.0 NMs in batches after starting the RM | | Fixed | |
EXPRESS/ROLLING UPGRADE | YARN | RM started with recovery enabled | | Fixed | |
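
The MetricsPlugin row above depends on knowing which sink jars still reference the older commons-configuration package rather than "org.apache.commons.configuration2". The following is a minimal sketch of one way to check this, assuming that compiled class files carry the referenced package names as plain slash-separated strings in their constant pool. The helper names `scan_jar` and `needs_recompile` are hypothetical and are not part of Hadoop.

```python
import io
import zipfile

# Package prefixes as they appear inside compiled .class files (slash-separated).
# The trailing slash keeps the old prefix from matching "configuration2/".
OLD_PREFIX = b"org/apache/commons/configuration/"
NEW_PREFIX = b"org/apache/commons/configuration2/"


def scan_jar(jar):
    """Return (old_refs, new_refs): class entries that mention each package.

    `jar` is a path or a binary file-like object accepted by zipfile.ZipFile.
    """
    old_refs, new_refs = [], []
    with zipfile.ZipFile(jar) as zf:
        for name in zf.namelist():
            if not name.endswith(".class"):
                continue  # only compiled classes are inspected
            data = zf.read(name)  # decompressed bytes of the class file
            if OLD_PREFIX in data:
                old_refs.append(name)
            if NEW_PREFIX in data:
                new_refs.append(name)
    return old_refs, new_refs


def needs_recompile(jar):
    """True if any class in the jar still references the old package."""
    old_refs, _ = scan_jar(jar)
    return bool(old_refs)
```

Calling `needs_recompile("/path/to/sink.jar")` (the path is a placeholder) gives a quick signal before starting the 3.1.0 daemons. A byte search of this kind is only a heuristic: it can flag a class that merely mentions the package name in a string literal, so a positive result is a prompt to inspect the jar, not proof of incompatibility.
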
Workloads
Application Type | Upgrade Type | Issues Found | Status | Overall Status |
---|---|---|---|---|
MR | EXPRESS/ROLLING UPGRADE | | Fixed | |
HIVE on TEZ | | Hive with older versions of Tez (0.7, 0.8.x) and a Hadoop 2 client ran into UT failures | Tez 0.10.0 will support Hadoop 3 | |
Spark 2.2/2.3 | | Spark 2.2/2.3 bundles a fork of an older Hive version (1.2) that does not work with Hadoop 3 | Community efforts to build and validate Spark with Hadoop 3 libraries are ongoing | IN-PROGRESS |
PIG | | | Hadoop 3 support is in progress in the community, targeted for PIG 0.18.0 (PIG-5253: Pig Hadoop 3 support) | IN-PROGRESS |
OOZIE | | Dependent on PIG support for Hadoop 3 | Hadoop 3 support is in progress in the community, targeted for OOZIE 5.1.0 (OOZIE-2973: Make sure Oozie works with Hadoop 3) | IN-PROGRESS |
MR with Native Task Optimization | | | Validation Pending | |
MR with Shared Cache Manager | | | Validation Pending | |