Worker Node Sizing
Worker nodes in an ODP cluster provide the storage and compute capacity for data workloads. Each worker node typically runs HDFS DataNode, YARN NodeManager, and optionally co-located services such as HBase RegionServer, Impala daemon, Kafka broker, or Kudu tablet server.
This page covers HDFS storage planning, YARN memory and vcore allocation, disk layout best practices, and Kafka co-location sizing.
HDFS Storage Planning
Replication Factor
HDFS stores each data block across multiple nodes for fault tolerance. The default replication factor is 3, meaning each block consumes 3x the logical data size in raw disk space.
Raw storage formula:
Required raw storage per node = (total logical data / number of data nodes) × replication factor × 1.25
The 1.25 factor accounts for HDFS overhead (block metadata, intermediate files, temporary job data, and free space to prevent DataNode disk full conditions).
Example:
- 100 TB logical data, 10 worker nodes, replication factor 3
- Per-node raw:
(100 TB / 10) × 3 × 1.25 = 37.5 TB raw disk per worker
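The storage formula above can be sketched as a small helper; the function name is illustrative, not part of any ODP tooling:

```python
def raw_storage_per_node_tb(logical_tb, num_nodes, replication=3, overhead=1.25):
    """Raw disk needed per worker node, including the 1.25 HDFS overhead factor."""
    return (logical_tb / num_nodes) * replication * overhead

# The worked example: 100 TB logical data, 10 workers, replication factor 3
print(raw_storage_per_node_tb(100, 10))  # 37.5
```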
Disk Configuration — JBOD Only
Do not use hardware RAID (RAID 5, RAID 6, RAID 10) for HDFS data disks. HDFS provides its own fault tolerance through replication. Hardware RAID adds cost and complexity without benefit, and can reduce throughput due to parity overhead.
Configure HDFS data disks as JBOD (Just a Bunch of Disks). Each disk is mounted independently (e.g., /data/disk1, /data/disk2, ..., /data/diskN) and listed in dfs.datanode.data.dir.
Example dfs.datanode.data.dir configuration for a node with 8 data disks:
/data/disk1/hdfs,/data/disk2/hdfs,/data/disk3/hdfs,/data/disk4/hdfs,
/data/disk5/hdfs,/data/disk6/hdfs,/data/disk7/hdfs,/data/disk8/hdfs
This allows HDFS to distribute blocks across all spindles, maximizing aggregate I/O throughput.
DataNode Heap
The DataNode JVM heap scales with the number of blocks managed per node. A commonly used formula is 1 GB per 1 million blocks stored on the node.
JVM heap example (hadoop-env):
HADOOP_DATANODE_OPTS="-Xms2g -Xmx4g -XX:+UseG1GC"
For most worker nodes in medium clusters, 4 GB DataNode heap is sufficient.
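A minimal sketch of the 1 GB per 1 million blocks rule; the 4 GB floor reflects the guidance above for medium clusters and is an assumption, not an HDFS default:

```python
def datanode_heap_gb(block_count, gb_per_million_blocks=1.0, floor_gb=4.0):
    """Estimate DataNode heap from block count (1 GB per 1M blocks rule of thumb)."""
    return max(floor_gb, block_count / 1_000_000 * gb_per_million_blocks)

print(datanode_heap_gb(2_500_000))  # 4.0  (floor applies below 4M blocks)
print(datanode_heap_gb(8_000_000))  # 8.0
```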
YARN Memory and vCore Allocation
YARN allocates container resources from the NodeManager's configured capacity. Proper sizing avoids both resource underutilization and OOM conditions.
Total Available Memory
The available YARN memory (yarn.nodemanager.resource.memory-mb) should be:
yarn.nodemanager.resource.memory-mb = total RAM - OS overhead - co-located services heap
Example for a 128 GB worker node with HBase RegionServer (16 GB heap) and DataNode (4 GB heap):
128 GB - 8 GB (OS) - 16 GB (HBase) - 4 GB (DataNode) - 4 GB (buffer) = 96 GB → 98304 MB
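The subtraction above can be expressed as a small sizing helper (function name and the default OS/buffer reservations mirror the example and are assumptions to adjust per node):

```python
def yarn_memory_mb(total_ram_gb, os_gb=8, colocated_heap_gb=0, buffer_gb=4):
    """yarn.nodemanager.resource.memory-mb per the formula above (inputs in GB)."""
    return (total_ram_gb - os_gb - colocated_heap_gb - buffer_gb) * 1024

# 128 GB node with HBase RegionServer (16 GB) and DataNode (4 GB) co-located:
print(yarn_memory_mb(128, colocated_heap_gb=16 + 4))  # 98304
```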
Total Available vCores
Set yarn.nodemanager.resource.cpu-vcores to the number of logical CPUs on the node minus overhead for OS and co-located services:
yarn.nodemanager.resource.cpu-vcores = total logical CPUs - 2 (OS) - [cores reserved for co-located services]
For a 24-core node: 24 - 2 = 22 vcores available to YARN.
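Taken together, the memory and vcore results from the examples above could be applied in yarn-site.xml as follows (the values shown assume the 128 GB, 24-core worker discussed here):

```xml
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>98304</value>
</property>
<property>
  <name>yarn.nodemanager.resource.cpu-vcores</name>
  <value>22</value>
</property>
```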
Container Memory Settings
| Parameter | Recommended value |
|---|---|
| yarn.scheduler.minimum-allocation-mb | 1024 MB (1 GB) |
| yarn.scheduler.maximum-allocation-mb | Equal to yarn.nodemanager.resource.memory-mb |
| yarn.scheduler.minimum-allocation-vcores | 1 |
| yarn.scheduler.maximum-allocation-vcores | Equal to yarn.nodemanager.resource.cpu-vcores |
MapReduce and Spark Memory Defaults
For MapReduce:
- mapreduce.map.memory.mb: 2048–4096 MB
- mapreduce.reduce.memory.mb: 4096–8192 MB
For Spark (per executor):
- spark.executor.memory: 4–16 GB depending on workload
- spark.executor.cores: 2–5 (avoid single-core executors for efficiency)
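As one illustrative combination within these ranges (not a universal default), a spark-defaults.conf entry might look like:

```
spark.executor.memory   8g
spark.executor.cores    4
```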
Disk Layout Best Practices
Separation of OS and Data Disks
Always use separate disks for the operating system (and service logs) versus HDFS data:
| Disk role | Recommended type | Mount points |
|---|---|---|
| OS + logs | SSD 200–500 GB | /, /var/log/ |
| HDFS data | HDD 4–12 TB each | /data/disk1 ... /data/diskN |
| Kudu data (if deployed) | NVMe / SATA SSD | /kudu/disk1 ... /kudu/diskN |
| Kafka logs (if deployed) | HDD or SSD | /kafka/disk1 ... /kafka/diskN |
HDFS Short-Circuit Reads
Enable HDFS short-circuit reads (dfs.client.read.shortcircuit=true) to allow co-located compute tasks (MapReduce, Spark, Impala) to read HDFS blocks directly from local disk via Unix domain socket, bypassing the DataNode network stack. This significantly reduces latency for local reads.
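Short-circuit reads require both the client flag and a Unix domain socket path in hdfs-site.xml; the socket path below is a commonly used location, adjust for your distribution:

```xml
<property>
  <name>dfs.client.read.shortcircuit</name>
  <value>true</value>
</property>
<property>
  <name>dfs.domain.socket.path</name>
  <value>/var/lib/hadoop-hdfs/dn_socket</value>
</property>
```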
Kafka Broker Sizing (Co-Located)
When Kafka brokers are co-located with HDFS DataNodes on worker nodes (acceptable for small to medium clusters), follow these guidelines:
| Resource | Guidance |
|---|---|
| Kafka heap | 4–8 GB (-Xms6g -Xmx6g) |
| Kafka log disks | Separate disks from HDFS data disks |
| Kafka log retention | Size per broker: partitions × retention size per partition |
| Network | Kafka is network-intensive; 10 GbE minimum, 25 GbE recommended for high-throughput topics |
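The retention-sizing row above (partitions × retention size per partition) can be sketched as a quick per-broker disk estimate; the 1.3 headroom factor for open segments and cleanup lag is an assumption, not a Kafka default:

```python
def broker_log_disk_tb(partitions_on_broker, retention_gb_per_partition, headroom=1.3):
    """Estimate Kafka log disk needed per broker, in TB (1 TB = 1024 GB)."""
    return partitions_on_broker * retention_gb_per_partition * headroom / 1024

# 2000 partition replicas on the broker, 5 GB retained per partition:
print(round(broker_log_disk_tb(2000, 5), 2))  # 12.7
```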
JVM heap example (kafka-env):
export KAFKA_HEAP_OPTS="-Xms6g -Xmx6g"
Key Kafka broker settings for co-located deployments:
- Set log.dirs to dedicated disks (not shared with HDFS or OS)
- Set num.io.threads and num.network.threads based on available cores (typically num.io.threads = 2 × number of data disks, num.network.threads = 3–8)
- Tune replica.fetch.max.bytes and message.max.bytes for your expected message sizes
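A server.properties fragment reflecting these guidelines might look like the following; the mount points match the disk layout table above, and the thread counts assume a broker with four dedicated Kafka disks:

```
log.dirs=/kafka/disk1,/kafka/disk2,/kafka/disk3,/kafka/disk4
num.io.threads=8
num.network.threads=5
```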
For large Kafka deployments (high message throughput or large retention), consider dedicated Kafka nodes rather than co-location with HDFS DataNodes to avoid I/O and network contention.
HBase RegionServer Sizing (Co-Located)
HBase RegionServers are typically co-located on worker nodes alongside HDFS DataNode and YARN NodeManager.
| Resource | Guidance |
|---|---|
| RegionServer heap | 16–32 GB |
| hfile.block.cache.size | 0.40 (40% of heap for read cache) |
| hbase.regionserver.global.memstore.size | 0.40 (40% of heap for write buffer) |
| Regions per RegionServer | 20–200 (start with 50 and tune based on workload) |
JVM heap example (hbase-env):
HBASE_REGIONSERVER_OPTS="-Xms16g -Xmx16g -XX:+UseG1GC -XX:MaxGCPauseMillis=100"
Use G1GC for HBase RegionServers to minimize GC pause times, which can cause region server timeouts and unnecessary region reassignment.
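The cache and memstore fractions from the table above would be set in hbase-site.xml; note that HBase requires their sum to leave at least 20% of heap free, so 0.40 + 0.40 is at the limit:

```xml
<property>
  <name>hfile.block.cache.size</name>
  <value>0.40</value>
</property>
<property>
  <name>hbase.regionserver.global.memstore.size</name>
  <value>0.40</value>
</property>
```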