Apache Ranger Reference
Apache Ranger 2.6.0 is the centralised authorisation layer in ODP. It provides fine-grained access control across every data service in the cluster through a plugin architecture — no service-level ACL management required.
Ranger Architecture
┌─────────────────────────────────────────────────────────┐
│ Ranger Admin │
│ (Web UI · REST API · Policy store · Audit config) │
└────────────┬────────────────────────────────────────────┘
│ Policy sync (pull, ~30 s interval)
┌────────▼──────────────────────────────────────┐
│ Ranger Plugins │
│ HDFS · Hive · HBase · Kafka · Knox · YARN │
│ Ozone · Impala · NiFi · … │
└────────────────┬──────────────────────────────┘
│ Audit events
┌──────────▼───────────┐
│ Audit backends │
│ Solr · HDFS │
└──────────────────────┘
- Ranger Admin: Central web UI and REST API. Stores policies in a relational database (MySQL/PostgreSQL). Manages user/group sync from LDAP.
- Ranger Plugins: In-process agents embedded in each service (NameNode, HiveServer2, HBase Master, etc.). They cache policies locally and evaluate access decisions without a network round-trip to Admin.
- Audit: Every access decision (allow or deny) is logged to Solr (for search/UI) and optionally to HDFS (for long-term retention).
Ranger 2.6.0 in ODP
Ranger is installed and configured by Ambari during cluster deployment. All Ranger configuration (database, LDAP, audit, plugin enablement) is managed through the Ambari UI under Services → Ranger.
Key deployment facts:
- Ranger Admin runs on the host assigned the Ranger Admin role in Ambari.
- The Ranger KMS (Key Management Server) is a separate service used for HDFS TDE — see Encryption in ODP.
- All Ranger plugins are co-located with their respective service daemons.
Configuring Ranger Admin via Ambari
Navigate to Ambari → Services → Ranger → Configs.
Important settings:
| Config tab | Key parameter | Description |
|---|---|---|
| Ranger Settings | ranger_admin_hosts | Host running Ranger Admin |
| Ranger Settings | ranger.jpa.jdbc.* | Database connection for policy store |
| Advanced ranger-ugsync-site | ranger.usersync.ldap.url | LDAP/FreeIPA URL for user sync |
| Advanced ranger-ugsync-site | ranger.usersync.ldap.binddn | Bind DN for LDAP queries |
| Ranger Audit | xasecure.audit.destination.solr | Enable Solr audit |
| Ranger Audit | xasecure.audit.destination.hdfs | Enable HDFS audit |
After making changes in Ambari, restart Ranger Admin and affected services.
Service Plugins
Each data service has a dedicated Ranger plugin. Enable plugins through Ambari — each service config page has a Ranger Plugin tab.
HDFS Plugin
Controls access to HDFS paths. Policies can be set on paths with wildcards (/data/*). Supports:
- Read / Write / Execute permissions on directories and files.
- Recursive policy application.
Resource: /data/raw/*
User: etl_user — Allow: Read, Write
Group: analysts — Allow: Read
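The plugin evaluates decisions like the one above against its locally cached policies. The following is a minimal illustrative sketch of that matching logic (wildcard path plus user/group allow items); the names and the `@`-prefix group convention are assumptions of this sketch, not Ranger internals.

```python
# Hypothetical sketch of how a cached HDFS path policy could be evaluated.
# Not Ranger's actual implementation; names are illustrative.
from fnmatch import fnmatch

POLICY = {
    "resource": "/data/raw/*",
    "allow": {
        "etl_user": {"read", "write"},
        "@analysts": {"read"},   # '@' marks a group in this sketch
    },
}

def is_allowed(user, groups, path, access):
    """True if the user, or one of their groups, matches an allow item."""
    if not fnmatch(path, POLICY["resource"]):
        return False
    granted = set(POLICY["allow"].get(user, set()))
    for g in groups:
        granted |= POLICY["allow"].get("@" + g, set())
    return access in granted

print(is_allowed("etl_user", [], "/data/raw/events.csv", "write"))    # True
print(is_allowed("alice", ["analysts"], "/data/raw/events.csv", "read"))   # True
print(is_allowed("alice", ["analysts"], "/data/raw/events.csv", "write"))  # False
```

Because evaluation runs against the local cache, an access check costs no network round-trip to Ranger Admin.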
Hive Plugin
Controls access to Hive databases, tables, columns, and UDFs. Also provides Row Filter and Column Masking policies.
Resource: database=finance, table=transactions, column=*
User: finance_analyst — Allow: Select
Column masking: column=credit_card_number → MASK (show last 4)
HBase Plugin
Controls access to HBase tables, column families, and columns.
Resource: table=user_data, column-family=pii, column=*
User: app_service — Allow: Read, Write
Kafka Plugin
Controls access to Kafka topics with Publish, Consume, Create, Delete, Describe, and Idempotent Write permissions.
Resource: topic=events-*
User: kafka_producer — Allow: Publish
Group: analytics_team — Allow: Consume, Describe
Knox Plugin
Controls which users and groups can access services proxied through Knox topologies.
Resource: topology=default, service=WEBHDFS
Group: external_users — Allow: Allow (the Knox service definition exposes a single access type, "Allow")
Ozone Plugin
Controls access to Ozone volumes, buckets, and keys.
Resource: volume=data, bucket=raw, key=*
User: ingest_user — Allow: Read, Write, Create, List
YARN Plugin
Controls access to YARN queues — which users/groups can submit jobs to which queues.
Resource: queue=root.production
Group: prod_users — Allow: Submit Application, Admin Queue
Impala Plugin
Controls access to Impala databases, tables, columns, and UDFs — mirrors the Hive plugin model.
Policy Types
Allow Policies
The default policy type. Grant explicit access to resources.
Policy type: Allow
Resource: /data/warehouse/hive
User: hive — Allow: Read, Write, Execute
Deny Policies
Explicitly deny access, overriding allow policies. Use sparingly — deny policies apply even to users who have an allow policy for the same resource.
Policy type: Deny
Resource: /data/sensitive/*
Group: contractors — Deny: Read, Write
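The precedence rule above can be sketched as follows. This is an illustration of the observable behaviour (deny wins, default-deny otherwise), not Ranger's internal evaluator.

```python
# Illustrative sketch: a matching deny item takes precedence over any
# matching allow item, and nothing matching means access is refused.
def evaluate(user_groups, allows, denies):
    """Return 'DENY' if any deny group matches, else 'ALLOW' on a match."""
    if user_groups & denies:
        return "DENY"      # deny overrides allow
    if user_groups & allows:
        return "ALLOW"
    return "DENY"          # default-deny when no policy matches

# A contractor who also belongs to an allowed group is still denied:
print(evaluate({"contractors", "analysts"},
               allows={"analysts"}, denies={"contractors"}))  # DENY
```

This is why deny policies should be used sparingly: they silently trump any allow policy a user may otherwise hold on the same resource.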
Row Filter Policies (Hive / Impala)
Automatically append a WHERE clause to queries. Transparent to the application.
Resource: database=hr, table=employees
User: regional_manager_west
Row filter: department = 'WEST'
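The net effect of the policy above can be pictured as a query rewrite. The string-level rewrite below is only an illustration of the observable behaviour (the engine actually injects the predicate into the query plan); the lookup table and function are assumptions of this sketch.

```python
# Illustrative only: show what a row-filter policy effectively does to
# a user's query. Real filtering happens inside the engine's planner.
ROW_FILTERS = {
    ("hr", "employees", "regional_manager_west"): "department = 'WEST'",
}

def apply_row_filter(db, table, user, query):
    predicate = ROW_FILTERS.get((db, table, user))
    if predicate is None:
        return query
    return f"SELECT * FROM ({query}) t WHERE {predicate}"

print(apply_row_filter("hr", "employees", "regional_manager_west",
                       "SELECT name, salary FROM employees"))
```

The user submits an ordinary query and simply never sees rows outside their filter; no application change is needed.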
Column Masking Policies (Hive / Impala)
Replace sensitive column values at query time without modifying stored data.
| Masking type | Result |
|---|---|
| Redact | xxxx |
| Partial mask — show last 4 | ****1234 |
| Hash (SHA-256) | a94f... |
| Nullify | NULL |
| Custom expression | Any SQL expression |
Tag-Based Policies with Atlas Integration
Tag-based policies decouple authorisation from resource names — a policy applies to any resource that carries a specific Atlas classification tag, regardless of its name or location.
Workflow
- Create a classification in Atlas (e.g., PII, CONFIDENTIAL, GDPR).
- Apply the classification to Atlas entities (tables, columns, HDFS paths).
- In Ranger, create a Tag-based policy targeting the classification.
- The policy automatically applies to all resources carrying that tag across all engines (Hive, HDFS, HBase, etc.).
Example
Tag policy: classification=PII
All users — Deny: Select
Exception: Group=data_stewards — Allow: Select
Column masking for PII tag: Hash
Any Hive column, HDFS file, or HBase column family tagged PII in Atlas is automatically subject to this policy without creating per-resource Ranger rules.
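The indirection is the key idea: the decision is keyed on the tag, not the resource name. A minimal sketch, with entirely illustrative resource names and a single deny-by-default PII rule with a group exception:

```python
# Sketch of tag-based lookup: resource -> Atlas tags -> tag policy.
# All names here are illustrative, not a real Atlas/Ranger schema.
ATLAS_TAGS = {
    "hive:finance.transactions.credit_card_number": {"PII"},
    "hdfs:/data/exports/customers.csv": {"PII"},
    "hive:sales.orders.order_id": set(),
}

TAG_POLICY = {"PII": {"allowed_groups": {"data_stewards"}}}

def can_select(resource, user_groups):
    """Deny tagged resources unless the user is in an excepted group."""
    tags = ATLAS_TAGS.get(resource, set())
    for tag in tags & TAG_POLICY.keys():
        if not (user_groups & TAG_POLICY[tag]["allowed_groups"]):
            return False
    return True

print(can_select("hdfs:/data/exports/customers.csv", {"analysts"}))       # False
print(can_select("hdfs:/data/exports/customers.csv", {"data_stewards"}))  # True
```

Renaming or moving a tagged resource changes nothing: as long as the Atlas classification follows the entity, the same policy applies.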
Ranger Audit Configuration
Audit records capture: user, service, resource, access type, result (allowed/denied), timestamp, and client IP.
Solr Audit (recommended for UI search)
The Ranger Admin Audit UI queries Solr for audit events, so the Solr destination should be enabled whenever searchable audit history is needed in the UI. Configure in Ambari under Ranger → Ranger Audit:
xasecure.audit.destination.solr = true
xasecure.audit.destination.solr.urls = http://ranger-solr-host:8983/solr/ranger_audits
HDFS Audit (long-term retention)
xasecure.audit.destination.hdfs = true
xasecure.audit.destination.hdfs.dir = hdfs:///ranger/audit
Audit files are stored as JSON lines, partitioned by service and date:
/ranger/audit/hdfs/20260408/hdfs_audit_20260408_000000.log.gz
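A single decompressed line from such a file can be parsed directly as JSON. The record below is a representative example; field names follow the common Ranger audit schema (`reqUser`, `access`, `result` with 1 = allowed), though the exact schema can vary by version.

```python
import json

# One representative audit event in JSON-lines form (illustrative values).
line = ('{"access": "READ", "result": 1, "reqUser": "etl_user", '
        '"resource": "/data/raw/events.csv", "repo": "hdp_hdfs", '
        '"evtTime": "2026-04-08 10:15:00.000", "cliIP": "10.0.0.12"}')

event = json.loads(line)
verdict = "allowed" if event["result"] == 1 else "denied"
print(f'{event["reqUser"]} {event["access"]} {event["resource"]}: {verdict}')
# → etl_user READ /data/raw/events.csv: allowed
```

Because each line is a self-contained JSON object, the HDFS audit trail is straightforward to post-process with Hive external tables or ad-hoc scripts for compliance reporting.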
Ranger REST API Overview
Ranger Admin exposes a REST API for programmatic policy management.
Authentication
# Basic auth (or Kerberos negotiate if cluster-secured)
RANGER_URL="https://ranger-admin-host:6182"
AUTH="-u admin:ranger-admin-password"
List policies for a service
curl -s $AUTH "$RANGER_URL/service/public/v2/api/policy?serviceName=hdp_hdfs" | python3 -m json.tool
Create a policy
curl -s -X POST $AUTH \
-H "Content-Type: application/json" \
-d '{
"name": "etl-raw-access",
"service": "hdp_hdfs",
"resources": {
"path": {"values": ["/data/raw"], "isRecursive": true}
},
"policyItems": [{
"users": ["etl_user"],
"accesses": [{"type": "read"}, {"type": "write"}],
"isAuditEnabled": true
}],
"isEnabled": true
}' \
"$RANGER_URL/service/public/v2/api/policy"
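When scripting policy creation, it can be less error-prone to build the request body programmatically than to hand-write JSON in a shell heredoc. A sketch of a small builder for the same payload (the helper name is an assumption; the field structure matches the curl example above):

```python
import json

# Hypothetical helper that assembles the policy body used in the curl
# example above; the resulting dict would be POSTed to
# /service/public/v2/api/policy with a client such as requests.
def hdfs_policy(name, service, path, users, accesses, recursive=True):
    return {
        "name": name,
        "service": service,
        "resources": {"path": {"values": [path], "isRecursive": recursive}},
        "policyItems": [{
            "users": users,
            "accesses": [{"type": a} for a in accesses],
            "isAuditEnabled": True,
        }],
        "isEnabled": True,
    }

body = hdfs_policy("etl-raw-access", "hdp_hdfs", "/data/raw",
                   ["etl_user"], ["read", "write"])
print(json.dumps(body, indent=2))
```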
Update a policy
curl -s -X PUT $AUTH \
-H "Content-Type: application/json" \
-d '<updated-policy-json>' \
"$RANGER_URL/service/public/v2/api/policy/<policy-id>"
Delete a policy
curl -s -X DELETE $AUTH "$RANGER_URL/service/public/v2/api/policy/<policy-id>"
User and Group Sync from LDAP / FreeIPA
Ranger User Sync (ranger-usersync) periodically imports users and groups from LDAP so that Ranger policies can reference LDAP group names directly.
Configure in Ambari under Ranger → Advanced ranger-ugsync-site:
ranger.usersync.enabled = true
ranger.usersync.source.impl.class = org.apache.ranger.ldapusersync.process.LdapUserGroupBuilder
# FreeIPA / 389-DS
ranger.usersync.ldap.url = ldap://ipa01.dev01.hadoop.clemlab.com:389
ranger.usersync.ldap.binddn = uid=ranger-sync,cn=users,cn=accounts,dc=dev01,dc=hadoop,dc=clemlab,dc=com
ranger.usersync.ldap.ldapbindpassword = <sync-password>
ranger.usersync.ldap.user.searchbase = cn=users,cn=accounts,dc=dev01,dc=hadoop,dc=clemlab,dc=com
ranger.usersync.ldap.user.objectclass = person
ranger.usersync.ldap.user.nameattribute = uid
ranger.usersync.ldap.group.searchbase = cn=groups,cn=accounts,dc=dev01,dc=hadoop,dc=clemlab,dc=com
ranger.usersync.ldap.group.objectclass = groupofnames
ranger.usersync.ldap.group.nameattribute = cn
ranger.usersync.ldap.group.memberattributename = member
# Sync interval in milliseconds (60000 = 1 minute)
ranger.usersync.sleeptimeinmillisbetweensynccycle = 60000
After configuration, restart Ranger UserSync from Ambari. Users and groups appear in the Ranger Admin UI under Settings → Users/Groups.