Hardware Recommendations for ODP
This page provides hardware sizing guidance for deploying an ODP cluster. These recommendations cover the typical three-tier topology (master nodes, worker nodes, edge node) and include specific guidance for hardware-sensitive components such as Kudu and Kafka.
Typical ODP Cluster Topology
An ODP cluster is generally organized into three categories of nodes:
| Node Type | Count (typical) | Purpose |
|---|---|---|
| Master nodes | 3–5 | Coordination services: NameNode, ResourceManager, HBase Master, ZooKeeper, Ambari Server, Ranger, Atlas, Knox |
| Worker nodes | 3–N | Data storage and processing: HDFS DataNode, YARN NodeManager, Impala daemon, Kafka broker, Kudu tablet server |
| Edge node | 1–2 | Client-facing: Hadoop clients, Knox gateway, NiFi, HiveServer2 client connections |
This separation of concerns ensures that master coordination services are not impacted by the resource consumption of data workloads running on worker nodes.
Master Node Recommendations
Master nodes host coordination services that must remain highly available and responsive. ODP supports NameNode HA and ResourceManager HA, which require a minimum of 2 master nodes for the HA pairs. A third master (or a dedicated quorum node) is required for ZooKeeper, which needs an odd-sized ensemble of at least 3 nodes to maintain quorum, and typically also hosts the HBase Master standby.
Recommended Specification
| Resource | Minimum | Recommended |
|---|---|---|
| CPU | 8 cores | 16–32 cores |
| RAM | 32 GB | 64–128 GB |
| OS / logs disk | 1x SSD 200 GB | 2x SSD 400 GB (RAID 1 or mirrored) |
| Network | 10 GbE | 25 GbE |
Notes:
- Use SSD for the operating system volume and log directories (`/var/log/`) to avoid disk I/O becoming a bottleneck during NameNode edit log flushes or Ambari agent activity
- The Ambari Server database (PostgreSQL, MySQL, or Oracle) should reside on a volume with low-latency I/O; SSD is strongly recommended
- Ranger and Atlas both benefit from dedicated heap and fast disk for their embedded or external Solr audit stores
Worker Node Recommendations
Worker nodes carry the bulk of the storage and compute load. Sizing depends heavily on your expected data volume, replication factor, and processing workloads (batch, interactive, streaming).
Recommended Specification
| Resource | Minimum | Recommended (medium cluster) |
|---|---|---|
| CPU | 8 cores | 16–24 cores |
| RAM | 32 GB | 64–256 GB |
| Data disks | 4x HDD 4 TB | 6–12x HDD 6–12 TB (JBOD, no RAID) |
| OS disk | 1x SSD 200 GB | 1x SSD 200 GB |
| Network | 10 GbE | 25 GbE |
Notes:
- HDFS DataNode data disks should be configured as JBOD (Just a Bunch of Disks) — do not use RAID for data disks. HDFS provides its own replication (default factor: 3) and hardware RAID is unnecessary and wasteful
- Size total raw worker storage as: `required usable storage × replication factor × 1.25` (for overhead)
- YARN NodeManager available memory should be set to total RAM minus OS overhead and any co-located service heap. A good starting point is `total RAM − 8 GB` for YARN containers
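The two sizing rules above can be expressed as a quick back-of-envelope calculation. This is a minimal sketch: the 1.25 overhead factor and the 8 GB OS reservation are the starting points given on this page, not fixed values, and the function names are illustrative.

```python
def raw_storage_tb(usable_tb, replication_factor=3, overhead=1.25):
    """Total raw disk capacity needed across all workers
    for a given usable capacity (HDFS replication + overhead)."""
    return usable_tb * replication_factor * overhead

def yarn_container_memory_gb(total_ram_gb, os_reserve_gb=8, service_heap_gb=0):
    """Starting point for the memory to hand to YARN containers:
    total RAM minus OS reservation and co-located service heaps."""
    return total_ram_gb - os_reserve_gb - service_heap_gb

# Example: 100 TB usable with default replication factor 3
print(raw_storage_tb(100))  # 375.0 TB raw

# Worker with 128 GB RAM, 16 GB reserved for a co-located Kafka broker
print(yarn_container_memory_gb(128, service_heap_gb=16))  # 104 GB
```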
Kudu — SSD Requirement
Kudu tablet servers require SSD storage for tablet data directories. Running Kudu on spinning HDDs results in severe performance degradation and is not supported in ODP.
If Kudu is deployed, worker nodes hosting Kudu tablet servers should have at minimum 2–4 NVMe or SATA SSDs dedicated to Kudu data directories, separate from HDFS data disks.
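As an illustration, a tablet server with dedicated SSDs might be configured as follows. The `--fs_wal_dir` and `--fs_data_dirs` flags are Kudu's standard storage settings; the mount paths are examples only.

```
# tserver.gflagfile — example paths, adjust to your SSD mounts
--fs_wal_dir=/ssd1/kudu/wal
--fs_data_dirs=/ssd1/kudu/data,/ssd2/kudu/data
```

Keeping these directories off the HDFS data disks avoids I/O contention between Kudu's random-access workload and HDFS's sequential scans.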
Kafka — Dedicated Disk Recommendation
Kafka brokers are I/O intensive and perform best with dedicated disks for Kafka log directories.
If Kafka is co-located with HDFS DataNodes on worker nodes (a common pattern for smaller clusters), configure Kafka log directories on separate disks from HDFS data directories. For large Kafka deployments, consider dedicated Kafka broker nodes with high-throughput SSDs or high-capacity HDDs.
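For example, when co-locating, Kafka's `log.dirs` setting in `server.properties` should point at mounts that HDFS does not use (the paths below are illustrative, assuming HDFS data directories live elsewhere, e.g. under `/data1`, `/data2`, …):

```properties
# server.properties — Kafka log directories on dedicated disks
log.dirs=/kafka-disk1/kafka-logs,/kafka-disk2/kafka-logs
```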
Edge Node Recommendations
The edge node serves as the entry point for users and applications connecting to the cluster. It hosts Hadoop client libraries, Knox (API gateway / SSO), and optionally NiFi for data ingestion pipelines.
Recommended Specification
| Resource | Minimum | Recommended |
|---|---|---|
| CPU | 4 cores | 8 cores |
| RAM | 16 GB | 32 GB |
| Disk | 1x SSD 200 GB | 1x SSD 500 GB |
| Network | 10 GbE | 10 GbE |
Notes:
- Knox handles TLS termination and Kerberos SPNEGO; a faster CPU reduces authentication latency for concurrent users
- If NiFi is deployed on the edge node, increase RAM to at least 64 GB and ensure sufficient local disk for NiFi repositories (FlowFile, Content, Provenance)
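If NiFi is deployed, its three repositories can be spread across separate local disks via `nifi.properties`; these are the standard repository directory properties, and the paths are examples only.

```properties
# nifi.properties — place repositories on separate disks where possible
nifi.flowfile.repository.directory=/nifi-disk1/flowfile_repository
nifi.content.repository.directory.default=/nifi-disk2/content_repository
nifi.provenance.repository.directory.default=/nifi-disk3/provenance_repository
```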
Network Requirements
| Requirement | Minimum | Recommended |
|---|---|---|
| Internal cluster bandwidth | 10 GbE | 25 GbE |
| Switch | Layer 2 (same VLAN) | Layer 3 with dedicated VLAN |
| Latency (intra-cluster) | < 1 ms | < 0.5 ms |
For clusters with more than 20 worker nodes, 25 GbE interconnects are strongly recommended to avoid network becoming a bottleneck for HDFS replication traffic and Spark shuffle operations.
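To see why bandwidth matters, here is a rough estimate of the replication traffic generated by HDFS writes. The sketch assumes standard pipeline replication, where each written block is forwarded over the network to the remaining replicas.

```python
def hdfs_replication_traffic_gbps(ingest_gbps, replication_factor=3):
    """Inter-node network traffic from HDFS pipeline replication:
    each block written is forwarded (replication_factor - 1) times."""
    return ingest_gbps * (replication_factor - 1)

# Ingesting at 10 Gbit/s cluster-wide generates roughly 20 Gbit/s of
# additional inter-node replication traffic on top of the client writes.
print(hdfs_replication_traffic_gbps(10))  # 20
```

Add Spark shuffle and rebalancing traffic on top of this, and 10 GbE links saturate quickly on larger clusters.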
See the Network Requirements page for DNS, NTP, firewall, and Kerberos connectivity details.