Version: 1.3.1.0

Apache Iceberg in ODP

ODP provides end‑to‑end Apache Iceberg support across the data‑platform stack. Apache Iceberg is fully integrated into the ODP distribution starting with version 1.3.1.0. Iceberg tables can be created and queried through Hive and Spark. Access control is provided through Ranger, and tag-based governance is being completed as part of the Atlas integration.

What is integrated today

Storage & SQL engines

  • Hive 4.0.1: Iceberg tables are supported natively via the Hive Iceberg storage handler.
  • Spark 3.5.6: Spark SQL can read/write Iceberg tables using the Spark Iceberg catalog.
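For Spark to resolve Iceberg tables through the session catalog, the Iceberg runtime jar and catalog settings must be in place. A minimal configuration sketch using the property names from the Apache Iceberg Spark integration; the exact values depend on your ODP deployment:

```
# spark-defaults.conf (illustrative; ODP may preconfigure these)
spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions
spark.sql.catalog.spark_catalog=org.apache.iceberg.spark.SparkSessionCatalog
spark.sql.catalog.spark_catalog.type=hive
```

With `SparkSessionCatalog` wrapping the Hive Metastore, `spark_catalog` serves both Iceberg and non-Iceberg tables, which is why the examples below qualify table names with `spark_catalog`.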

Using Iceberg with Hive (Beeline)

Below is a minimal end‑to‑end example using Beeline and the Hive Iceberg storage handler.

-- Beeline
!connect jdbc:hive2://<hs2-host>:10000/default

CREATE DATABASE IF NOT EXISTS iceberg_demo;

CREATE TABLE iceberg_demo.sample_hive (
  id BIGINT,
  data STRING
)
STORED BY 'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler'
TBLPROPERTIES ('table_type'='ICEBERG');

INSERT INTO iceberg_demo.sample_hive VALUES (1, 'a'), (2, 'b');

SELECT * FROM iceberg_demo.sample_hive;
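Once the table exists, Iceberg-specific features are also reachable from Beeline. A sketch assuming Hive 4's Iceberg metadata-table and time-travel syntax is enabled in your deployment (the snapshot ID shown is a placeholder; take a real one from the history output):

```sql
-- Inspect the table's snapshot history via the Iceberg history metadata table
SELECT * FROM iceberg_demo.sample_hive.history;

-- Time travel to an earlier snapshot (1234567890 is a placeholder snapshot ID)
SELECT * FROM iceberg_demo.sample_hive FOR SYSTEM_VERSION AS OF 1234567890;
```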

Using Iceberg with Spark (spark‑shell, Scala)

// spark-shell
spark.sql("CREATE DATABASE IF NOT EXISTS iceberg_demo")

spark.sql(
"""
|CREATE TABLE spark_catalog.iceberg_demo.sample_spark (
|  id bigint,
|  data string
|) USING iceberg
""".stripMargin)

spark.sql("INSERT INTO spark_catalog.iceberg_demo.sample_spark VALUES (1,'a'),(2,'b')")
spark.sql("SELECT * FROM spark_catalog.iceberg_demo.sample_spark").show()
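The same session can also query Iceberg metadata tables, for example to verify that the insert above produced a snapshot. A sketch assuming the standard Iceberg Spark metadata-table syntax:

```sql
-- run via spark.sql("...").show() in the spark-shell session above
SELECT snapshot_id, operation, committed_at
FROM spark_catalog.iceberg_demo.sample_spark.snapshots;
```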

Using Iceberg with Spark (pyspark)

# pyspark
spark.sql("CREATE DATABASE IF NOT EXISTS iceberg_demo")

spark.sql("""
CREATE TABLE spark_catalog.iceberg_demo.sample_pyspark (
  id bigint,
  data string
) USING iceberg
""")

spark.sql("INSERT INTO spark_catalog.iceberg_demo.sample_pyspark VALUES (1,'a'),(2,'b')")
spark.sql("SELECT * FROM spark_catalog.iceberg_demo.sample_pyspark").show()
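Iceberg also supports in-place schema evolution from Spark SQL. A sketch assuming the Iceberg SQL extensions are enabled in the session (Iceberg applies these changes as metadata-only operations, without rewriting data files):

```sql
-- run via spark.sql("...") in the pyspark session above
ALTER TABLE spark_catalog.iceberg_demo.sample_pyspark ADD COLUMN ts timestamp;
ALTER TABLE spark_catalog.iceberg_demo.sample_pyspark RENAME COLUMN data TO payload;
```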