Version: 1.3.1.0

Apache Ranger Reference

Apache Ranger 2.6.0 is the centralised authorisation layer in ODP. It provides fine-grained access control across every data service in the cluster through a plugin architecture — no service-level ACL management required.

Ranger Architecture

┌─────────────────────────────────────────────────────────┐
│                      Ranger Admin                       │
│    (Web UI · REST API · Policy store · Audit config)    │
└────────────┬────────────────────────────────────────────┘
             │ Policy sync (pull, ~30 s interval)
    ┌────────▼──────────────────────────────────────┐
    │                Ranger Plugins                 │
    │   HDFS · Hive · HBase · Kafka · Knox · YARN   │
    │   Ozone · Impala · NiFi · …                   │
    └────────────────┬──────────────────────────────┘
                     │ Audit events
          ┌──────────▼───────────┐
          │    Audit backends    │
          │     Solr · HDFS      │
          └──────────────────────┘

  • Ranger Admin: Central web UI and REST API. Stores policies in a relational database (MySQL/PostgreSQL) and holds the users and groups that Ranger UserSync imports from LDAP.
  • Ranger Plugins: In-process agents embedded in each service (NameNode, HiveServer2, HBase Master, etc.). They cache policies locally and evaluate access decisions without a network round-trip to Admin.
  • Audit: Every access decision (allow or deny) is logged to Solr (for search/UI) and optionally to HDFS (for long-term retention).
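
The pull-and-cache behaviour described above can be sketched in a few lines of Python. This is a toy model, not Ranger's plugin code: the `PolicyCache` class, the `fake_admin` function, and the `policyVersion` / `policies` field names are assumptions made for illustration, and the 30-second default only mirrors the interval shown in the diagram.

```python
import time
from typing import Callable, Optional


class PolicyCache:
    """Toy model of a plugin-side policy cache (not Ranger's actual classes)."""

    def __init__(self, fetch: Callable[[int], Optional[dict]], interval_s: float = 30.0):
        # fetch(last_known_version) returns a new policy bundle, or None if unchanged.
        self._fetch = fetch
        self._interval_s = interval_s
        self._version = -1
        self._policies: list = []
        self._last_poll = float("-inf")

    def refresh_if_due(self, now: Optional[float] = None) -> bool:
        """Poll Admin at most once per interval; return True if policies changed."""
        now = time.monotonic() if now is None else now
        if now - self._last_poll < self._interval_s:
            return False
        self._last_poll = now
        bundle = self._fetch(self._version)
        if bundle is None:
            return False
        self._version = bundle["policyVersion"]
        self._policies = bundle["policies"]
        return True

    def policies(self) -> list:
        """Access checks read the local copy; no call to Admin happens here."""
        return self._policies


def fake_admin(last_known_version: int) -> Optional[dict]:
    # Stand-in for Ranger Admin: serves version 3 until the caller already has it.
    if last_known_version >= 3:
        return None
    return {"policyVersion": 3, "policies": [{"name": "etl-raw-access"}]}


cache = PolicyCache(fake_admin)
print(cache.refresh_if_due(now=0.0))   # first poll downloads the bundle
print(cache.refresh_if_due(now=10.0))  # still inside the polling window
print(cache.refresh_if_due(now=31.0))  # polls again, nothing new to load
print(cache.policies())
```

Because access checks only read the local copy, a plugin in this model keeps enforcing the last policies it downloaded even if Admin is briefly unreachable.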

Ranger 2.6.0 in ODP

Ranger is installed and configured by Ambari during cluster deployment. All Ranger configuration (database, LDAP, audit, plugin enablement) is managed through the Ambari UI under Services → Ranger.

Key deployment facts:

  • Ranger Admin runs on the host assigned the Ranger Admin role in Ambari.
  • The Ranger KMS (Key Management Server) is a separate service used for HDFS TDE — see Encryption in ODP.
  • All Ranger plugins are co-located with their respective service daemons.

Configuring Ranger Admin via Ambari

Navigate to Ambari → Services → Ranger → Configs.

Important settings:

| Config tab | Key parameter | Description |
| --- | --- | --- |
| Ranger Settings | ranger_admin_hosts | Host running Ranger Admin |
| Ranger Settings | ranger.jpa.jdbc.* | Database connection for policy store |
| Advanced ranger-ugsync-site | ranger.usersync.ldap.url | LDAP/FreeIPA URL for user sync |
| Advanced ranger-ugsync-site | ranger.usersync.ldap.binddn | Bind DN for LDAP queries |
| Ranger Audit | xasecure.audit.destination.solr | Enable Solr audit |
| Ranger Audit | xasecure.audit.destination.hdfs | Enable HDFS audit |

After making changes in Ambari, restart Ranger Admin and affected services.


Service Plugins

Each data service has a dedicated Ranger plugin. Enable plugins through Ambari — each service config page has a Ranger Plugin tab.

HDFS Plugin

Controls access to HDFS paths. Policies can be set on paths with wildcards (/data/*). Supports:

  • Read / Write / Execute permissions on directories and files.
  • Recursive policy application.

Resource: /data/raw/*
User: etl_user — Allow: Read, Write
Group: analysts — Allow: Read
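
To make the wildcard and recursive options concrete, here is a minimal Python sketch of how a requested path might be compared with a policy path. It uses shell-style matching from the standard library as a stand-in; Ranger's own path matcher is a separate implementation, and `path_matches` is a hypothetical helper, so treat the edge cases as assumptions.

```python
from fnmatch import fnmatchcase


def path_matches(policy_path: str, requested: str, recursive: bool = False) -> bool:
    """Return True if `requested` is covered by `policy_path`."""
    # Shell-style match: "*" stands for any run of characters.
    if fnmatchcase(requested, policy_path):
        return True
    if recursive:
        # A recursive policy also covers everything below the named directory.
        return requested.startswith(policy_path.rstrip("/") + "/")
    return False


print(path_matches("/data/raw/*", "/data/raw/2026/events.json"))
print(path_matches("/data/raw/*", "/data/curated/events.json"))
print(path_matches("/data/raw", "/data/raw/2026/events.json", recursive=True))
```

In this sketch a wildcard policy on /data/raw/* and a recursive policy on /data/raw cover the same files; the difference is that only the recursive form also matches the directory /data/raw itself.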

Hive Plugin

Controls access to Hive databases, tables, columns, and UDFs. Also provides Row Filter and Column Masking policies.

Resource: database=finance, table=transactions, column=*
User: finance_analyst — Allow: Select
Column masking: column=credit_card_number → MASK (show last 4)

HBase Plugin

Controls access to HBase tables, column families, and columns.

Resource: table=user_data, column-family=pii, column=*
User: app_service — Allow: Read, Write

Kafka Plugin

Controls access to Kafka topics with Publish, Consume, Create, Delete, Describe, and Idempotent Write permissions.

Resource: topic=events-*
User: kafka_producer — Allow: Publish
Group: analytics_team — Allow: Consume, Describe

Knox Plugin

Controls which users and groups can access services proxied through Knox topologies.

Resource: topology=default, service=WEBHDFS
Group: external_users — Allow: Allow

Ozone Plugin

Controls access to Ozone volumes, buckets, and keys.

Resource: volume=data, bucket=raw, key=*
User: ingest_user — Allow: Read, Write, Create, List

YARN Plugin

Controls access to YARN queues — which users/groups can submit jobs to which queues.

Resource: queue=root.production
Group: prod_users — Allow: Submit Application, Admin Queue

Impala Plugin

Controls access to Impala databases, tables, columns, and UDFs — mirrors the Hive plugin model.


Policy Types

Allow Policies

The default policy type. Grant explicit access to resources.

Policy type: Allow
Resource: /data/warehouse/hive
User: hive — Allow: Read, Write, Execute

Deny Policies

Explicitly deny access. Ranger evaluates deny conditions before allow conditions, so a matching deny applies even to users who also match an allow policy for the same resource. Use sparingly.

Policy type: Deny
Resource: /data/sensitive/*
Group: contractors — Deny: Read, Write
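
The interaction between allow and deny rules can be illustrated with a small evaluator. This is a simplified model under stated assumptions: the `Rule` class, the `decide` function, and the result labels are hypothetical, and real Ranger evaluation also handles exceptions, conditions, and policy priority, which are left out here.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase


@dataclass
class Rule:
    resource: str
    accesses: set
    users: set = field(default_factory=set)
    groups: set = field(default_factory=set)

    def applies(self, user, groups, access, path) -> bool:
        # A rule applies when the caller, the access type, and the path all match.
        who = user in self.users or bool(self.groups & set(groups))
        return who and access in self.accesses and fnmatchcase(path, self.resource)


def decide(user, groups, access, path, allow_rules, deny_rules) -> str:
    # Deny rules are checked first, so a matching deny wins over any allow.
    if any(r.applies(user, groups, access, path) for r in deny_rules):
        return "DENIED"
    if any(r.applies(user, groups, access, path) for r in allow_rules):
        return "ALLOWED"
    return "NOT_DETERMINED"  # no rule matched this request


allow = [Rule("/data/*", {"read", "write"}, users={"alice"})]
deny = [Rule("/data/sensitive/*", {"read", "write"}, groups={"contractors"})]

print(decide("alice", ["contractors"], "read", "/data/sensitive/payroll.csv", allow, deny))
print(decide("alice", ["contractors"], "read", "/data/raw/events.json", allow, deny))
print(decide("bob", [], "read", "/data/raw/events.json", allow, deny))
```

The first call shows why deny policies deserve care: alice has an allow rule that covers the path, but her membership in contractors still blocks the read.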

Row Filter Policies (Hive / Impala)

Automatically append a WHERE clause to queries. Transparent to the application.

Resource: database=hr, table=employees
User: regional_manager_west
Row filter: department = 'WEST'
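
For readers who manage policies through the REST API, the row filter above could be expressed as a request body along these lines. This is a sketch, not a verified payload for this ODP release: the `row_filter_policy` helper and the `hdp_hive` service name are hypothetical, and the field names (`policyType`, `rowFilterPolicyItems`, `rowFilterInfo`) follow the Ranger policy model as I understand it, so check them against a policy exported from your own Ranger Admin.

```python
import json


def row_filter_policy(service, database, table, users, filter_expr):
    """Build a row-filter policy body for the public v2 policy endpoint."""
    return {
        "service": service,
        "name": f"rowfilter-{database}-{table}",
        "policyType": 2,  # 0 = access, 1 = data masking, 2 = row filter
        "resources": {
            "database": {"values": [database]},
            "table": {"values": [table]},
        },
        "rowFilterPolicyItems": [{
            "users": list(users),
            "accesses": [{"type": "select", "isAllowed": True}],
            "rowFilterInfo": {"filterExpr": filter_expr},
        }],
        "isEnabled": True,
    }


policy = row_filter_policy(
    service="hdp_hive",  # hypothetical service name, replace with your own
    database="hr",
    table="employees",
    users=["regional_manager_west"],
    filter_expr="department = 'WEST'",
)
print(json.dumps(policy, indent=2))
```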

Column Masking Policies (Hive / Impala)

Replace sensitive column values at query time without modifying stored data.

| Masking type | Result |
| --- | --- |
| Redact | xxxx |
| Partial mask (show last 4) | ****1234 |
| Hash (SHA-256) | a94f... |
| Nullify | NULL |
| Custom expression | Any SQL expression |
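
When masking policies are created through the REST API, each masking type is selected by an identifier in the policy item. The sketch below maps the types listed above to the identifiers used by the stock Hive service definition, to the best of my knowledge; the `MASK_TYPES` keys and the `data_mask_item` helper are hypothetical, and the identifiers should be confirmed against the service definition in your cluster.

```python
# Short labels (hypothetical) mapped to assumed Ranger mask type identifiers.
MASK_TYPES = {
    "redact": "MASK",
    "show_last_4": "MASK_SHOW_LAST_4",
    "hash": "MASK_HASH",
    "nullify": "MASK_NULL",
    "custom": "CUSTOM",
}


def data_mask_item(users, label, value_expr=None):
    """Build one entry for the dataMaskPolicyItems list of a masking policy."""
    mask_type = MASK_TYPES[label]
    info = {"dataMaskType": mask_type}
    if mask_type == "CUSTOM":
        # A custom mask carries the SQL expression that replaces the column value.
        if not value_expr:
            raise ValueError("a custom mask needs a SQL expression")
        info["valueExpr"] = value_expr
    return {
        "users": list(users),
        "accesses": [{"type": "select", "isAllowed": True}],
        "dataMaskInfo": info,
    }


print(data_mask_item(["finance_analyst"], "show_last_4"))
print(data_mask_item(["finance_analyst"], "custom", "substr({col}, 1, 2)"))
```

The returned item belongs in the `dataMaskPolicyItems` list of a policy whose `policyType` is 1, with the masked column named in the policy's `resources`.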

Tag-Based Policies with Atlas Integration

Tag-based policies decouple authorisation from resource names — a policy applies to any resource that carries a specific Atlas classification tag, regardless of its name or location.

Workflow

  1. Create a classification in Atlas (e.g., PII, CONFIDENTIAL, GDPR).
  2. Apply the classification to Atlas entities (tables, columns, HDFS paths).
  3. In Ranger, create a Tag-based policy targeting the classification.
  4. The policy automatically applies to all resources carrying that tag across all engines (Hive, HDFS, HBase, etc.).

Example

Tag policy: classification=PII
All users — Deny: Select
Exception: Group=data_stewards — Allow: Select
Column masking for PII tag: Hash

Any Hive column, HDFS file, or HBase column family tagged PII in Atlas is automatically subject to this policy without creating per-resource Ranger rules.
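
The decoupling can be pictured with a toy lookup: one tag policy, several engines, no per-resource rules. Everything in this sketch is illustrative; the `ATLAS_TAGS` table, the resource names, and the `tag_decision` function are assumptions and do not reflect how Ranger and Atlas exchange tags internally.

```python
# Hypothetical Atlas classifications, keyed by (engine, resource).
ATLAS_TAGS = {
    ("hive", "finance.transactions.credit_card_number"): {"PII"},
    ("hdfs", "/data/raw/customers.csv"): {"PII", "GDPR"},
    ("hbase", "user_data:pii"): {"PII"},
    ("hive", "finance.transactions.amount"): set(),
}

# One tag policy: deny everyone on PII, except members of data_stewards.
TAG_POLICY = {"tag": "PII", "exception_groups": {"data_stewards"}}


def tag_decision(engine, resource, user_groups) -> str:
    tags = ATLAS_TAGS.get((engine, resource), set())
    if TAG_POLICY["tag"] not in tags:
        return "NOT_DETERMINED"  # untagged: resource-based policies decide
    if TAG_POLICY["exception_groups"] & set(user_groups):
        return "ALLOWED"
    return "DENIED"


for engine, resource in ATLAS_TAGS:
    print(engine, resource, tag_decision(engine, resource, ["analysts"]))
print(tag_decision("hdfs", "/data/raw/customers.csv", ["data_stewards"]))
```

Tagging a new column as PII in this model changes the decision for that column immediately, without touching the policy itself.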


Ranger Audit Configuration

Audit records capture: user, service, resource, access type, result (allowed/denied), timestamp, and client IP.

Ranger Admin includes an embedded Solr view for querying audit logs. Configure in Ambari under Ranger → Ranger Audit:

xasecure.audit.destination.solr = true
xasecure.audit.destination.solr.urls = http://ranger-solr-host:8983/solr/ranger_audits

HDFS Audit (long-term retention)

xasecure.audit.destination.hdfs = true
xasecure.audit.destination.hdfs.dir = hdfs:///ranger/audit

Audit files are stored as JSON lines, partitioned by service and date:

/ranger/audit/hdfs/20260408/hdfs_audit_20260408_000000.log.gz
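
Since the files are JSON lines, they are easy to post-process once copied out of HDFS. The sketch below counts denied requests per user. The field names `reqUser` and `result` (with 0 meaning denied) are assumptions based on the usual Ranger audit event layout, and both function names are hypothetical; inspect one line of your own audit files before relying on them.

```python
import gzip
import json
from collections import Counter
from typing import Iterable


def denied_by_user(lines: Iterable[str]) -> Counter:
    """Count denied audit events per requesting user."""
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        event = json.loads(line)
        if event.get("result") == 0:  # assumed encoding: 0 = denied, 1 = allowed
            counts[event.get("reqUser", "unknown")] += 1
    return counts


def read_audit_file(path: str) -> Counter:
    """Read a gzip-compressed audit file that was copied to local disk."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return denied_by_user(handle)


sample = [
    '{"reqUser": "etl_user", "resource": "/data/raw/a", "access": "read", "result": 1}',
    '{"reqUser": "contractor1", "resource": "/data/sensitive/b", "access": "read", "result": 0}',
    '{"reqUser": "contractor1", "resource": "/data/sensitive/c", "access": "write", "result": 0}',
]
print(denied_by_user(sample))
```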

Ranger REST API Overview

Ranger Admin exposes a REST API for programmatic policy management.

Authentication

# Basic auth (or Kerberos negotiate if cluster-secured)
RANGER_URL="https://ranger-admin-host:6182"
AUTH="-u admin:ranger-admin-password"

List policies for a service

curl -s $AUTH "$RANGER_URL/service/public/v2/api/policy?serviceName=hdp_hdfs" | python3 -m json.tool

Create a policy

curl -s -X POST $AUTH \
  -H "Content-Type: application/json" \
  -d '{
    "name": "etl-raw-access",
    "service": "hdp_hdfs",
    "isEnabled": true,
    "isAuditEnabled": true,
    "resources": {
      "path": {"values": ["/data/raw"], "isRecursive": true}
    },
    "policyItems": [{
      "users": ["etl_user"],
      "accesses": [
        {"type": "read", "isAllowed": true},
        {"type": "write", "isAllowed": true}
      ]
    }]
  }' \
  "$RANGER_URL/service/public/v2/api/policy"

Update a policy

curl -s -X PUT $AUTH \
-H "Content-Type: application/json" \
-d '<updated-policy-json>' \
"$RANGER_URL/service/public/v2/api/policy/<policy-id>"

Delete a policy

curl -s -X DELETE $AUTH "$RANGER_URL/service/public/v2/api/policy/<policy-id>"
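
The same calls can be prepared from Python with only the standard library. The sketch below builds the request objects without sending them, so it can be inspected safely. The `ranger_request` helper, the credentials, and the policy id 42 are hypothetical; the URL and endpoint paths are the ones used in the curl examples above.

```python
import base64
import json
import urllib.request

RANGER_URL = "https://ranger-admin-host:6182"


def ranger_request(method, path, user, password, payload=None):
    """Build (but do not send) an authenticated request to Ranger Admin."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    headers = {"Authorization": f"Basic {token}", "Accept": "application/json"}
    data = None
    if payload is not None:
        # JSON bodies are used for create (POST) and update (PUT).
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(RANGER_URL + path, data=data, headers=headers, method=method)


list_req = ranger_request(
    "GET", "/service/public/v2/api/policy?serviceName=hdp_hdfs", "admin", "ranger-admin-password"
)
delete_req = ranger_request(
    "DELETE", "/service/public/v2/api/policy/42", "admin", "ranger-admin-password"
)
print(list_req.get_method(), list_req.full_url)
print(delete_req.get_method(), delete_req.full_url)
```

Passing a request to `urllib.request.urlopen` sends it. On a cluster whose Ranger Admin uses a private CA, an `ssl.SSLContext` that trusts that CA would also be needed, which is left out of this sketch.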

User and Group Sync from LDAP / FreeIPA

Ranger User Sync (ranger-usersync) periodically imports users and groups from LDAP so that Ranger policies can reference LDAP group names directly.

Configure in Ambari under Ranger → Advanced ranger-ugsync-site:

ranger.usersync.enabled = true
ranger.usersync.source.impl.class = org.apache.ranger.ldapusersync.process.LdapUserGroupBuilder

# FreeIPA / 389-DS
ranger.usersync.ldap.url = ldap://ipa01.dev01.hadoop.clemlab.com:389
ranger.usersync.ldap.binddn = uid=ranger-sync,cn=users,cn=accounts,dc=dev01,dc=hadoop,dc=clemlab,dc=com
ranger.usersync.ldap.ldapbindpassword = <sync-password>
ranger.usersync.ldap.user.searchbase = cn=users,cn=accounts,dc=dev01,dc=hadoop,dc=clemlab,dc=com
ranger.usersync.ldap.user.objectclass = person
ranger.usersync.ldap.user.nameattribute = uid
ranger.usersync.group.searchbase = cn=groups,cn=accounts,dc=dev01,dc=hadoop,dc=clemlab,dc=com
ranger.usersync.group.objectclass = groupofnames
ranger.usersync.group.nameattribute = cn
ranger.usersync.group.memberattributename = member

# Sync interval (milliseconds; 60000 = 60 s)
ranger.usersync.sleeptimeinmillisbetweensynccycle = 60000

After configuration, restart Ranger UserSync from Ambari. Users and groups appear in the Ranger Admin UI under Settings → Users/Groups.
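
The FreeIPA DNs in the configuration above all derive from the realm's DNS domain, so they can be generated rather than typed by hand. A minimal sketch, assuming the standard FreeIPA tree layout (cn=users,cn=accounts and cn=groups,cn=accounts); both helper functions and the returned key names are hypothetical.

```python
def domain_to_base_dn(domain: str) -> str:
    """Turn a DNS domain into an LDAP base DN, one dc= component per label."""
    return ",".join(f"dc={part}" for part in domain.strip(".").split("."))


def freeipa_dns(domain: str, bind_uid: str) -> dict:
    """Derive the bind DN and search bases used by the user sync settings."""
    base = domain_to_base_dn(domain)
    return {
        "bind_dn": f"uid={bind_uid},cn=users,cn=accounts,{base}",
        "user_search_base": f"cn=users,cn=accounts,{base}",
        "group_search_base": f"cn=groups,cn=accounts,{base}",
    }


for key, value in freeipa_dns("dev01.hadoop.clemlab.com", "ranger-sync").items():
    print(f"{key} = {value}")
```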