Version: 1.3.1.0

Apache Knox Reference

Apache Knox 2.1.0 is the perimeter gateway for ODP. It provides a single TLS endpoint for all REST and web UI access to the cluster, eliminating the need to expose individual service ports externally.

Knox Architecture

  External client (browser / curl / JDBC)
              │
              │ HTTPS :8443
              ▼
┌─────────────────────────┐
│ Knox Gateway            │
│                         │
│ Topology: default       │──► WebHDFS :50070
│ Topology: cdp-proxy     │──► Hive JDBC :10000
│ Topology: admin         │──► Ambari :8442
└─────────────────────────┘

Authentication (LDAP / Kerberos)
Authorisation (Ranger Knox plugin)
TLS termination

Knox operates through topology files — XML descriptors that define:

  • Which services are exposed through a given gateway path.
  • Which authentication provider to use.
  • Service URL mappings to backend hosts/ports.

Each topology is deployed as a WAR inside the Knox gateway and becomes available at https://<knox-host>:8443/gateway/<topology-name>/.
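
The URL pattern above can be sketched as a small Python helper. This is an illustration only: `knox_url` is a hypothetical function name, and the host and service paths are the placeholder values used elsewhere on this page.

```python
from urllib.parse import quote


def knox_url(host: str, topology: str, service_path: str = "", port: int = 8443) -> str:
    """Build https://<host>:<port>/gateway/<topology>/<service_path>."""
    # The topology name becomes a single path segment, so escape it fully.
    base = f"https://{host}:{port}/gateway/{quote(topology, safe='')}/"
    # Accept service paths with or without a leading slash.
    return base + service_path.lstrip("/")


print(knox_url("knox-host", "default", "webhdfs/v1/"))
print(knox_url("knox-host", "admin", "/ambari"))
```

Keeping the topology name as an explicit argument makes it obvious which topology (and therefore which authentication provider) a given client call goes through.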


Knox in ODP (Managed by Ambari)

Ambari manages Knox installation, configuration, and topology generation automatically. You do not need to write topology XML by hand for standard services.

Ambari-managed topology generation

When you enable Knox in Ambari and configure which services to proxy, Ambari generates the topology files and deploys them to Knox. Configuration is located under Ambari → Services → Knox → Configs.

Key Ambari Knox parameters:

Parameter                         Description
gateway.port                      Knox HTTPS port (default: 8443)
knox.topology.name                Default topology name (default)
gateway.hadoop.kerberos.secured   Enable Kerberos delegation token handling
knox.master.secret                Knox master secret (encrypted in Ambari credential store)

After modifying Knox configuration in Ambari, restart the Knox service. Ambari automatically regenerates and redeploys topology files.


Services Behind Knox

WebHDFS

Knox proxies WebHDFS REST calls to the NameNode. Clients authenticate to Knox (LDAP or Kerberos), and Knox uses its own service keytab to forward requests to HDFS.

# List HDFS root through Knox
curl -iku user:password \
"https://knox-host:8443/gateway/default/webhdfs/v1/?op=LISTSTATUS"

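The same call can be prepared from a script. The following is a minimal Python sketch, assuming HTTP Basic credentials as in the curl example; `webhdfs_request` is a hypothetical helper, not part of any Knox client library. It only builds the request object and does not contact the gateway.

```python
import base64
from urllib.parse import quote, urlencode
from urllib.request import Request


def webhdfs_request(host, topology, hdfs_path, op, user, password, port=8443):
    """Build (but do not send) a WebHDFS request routed through Knox."""
    query = urlencode({"op": op})
    url = f"https://{host}:{port}/gateway/{topology}/webhdfs/v1{quote(hdfs_path)}?{query}"
    # HTTP Basic authentication: base64("user:password"), as curl -u does.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return Request(url, headers={"Authorization": f"Basic {token}"})


req = webhdfs_request("knox-host", "default", "/", "LISTSTATUS", "user", "password")
print(req.full_url)
```

To send the request, pass it to `urllib.request.urlopen` together with an SSL context that trusts the Knox certificate, rather than disabling verification as `curl -k` does.
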
Hive JDBC (HiveServer2)

Knox proxies Hive JDBC connections. Use the Knox JDBC URL in Beeline:

beeline -u "jdbc:hive2://knox-host:8443/default;ssl=true;\
sslTrustStore=/path/to/truststore.jks;trustStorePassword=changeit;\
transportMode=http;httpPath=gateway/default/hive"
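
Because the JDBC URL is long and easy to mistype, it can help to assemble it programmatically. A minimal Python sketch follows; `knox_hive_jdbc_url` is a hypothetical helper, and the parameters are exactly those of the Beeline example above.

```python
def knox_hive_jdbc_url(host, topology, truststore, truststore_password,
                       database="default", port=8443):
    """Assemble the HiveServer2 JDBC URL for HTTP transport through Knox."""
    params = [
        "ssl=true",
        f"sslTrustStore={truststore}",
        f"trustStorePassword={truststore_password}",
        "transportMode=http",
        # httpPath carries the topology name, so it changes per topology.
        f"httpPath=gateway/{topology}/hive",
    ]
    return f"jdbc:hive2://{host}:{port}/{database};" + ";".join(params)


print(knox_hive_jdbc_url("knox-host", "default", "/path/to/truststore.jks", "changeit"))
```

Note that the `default` after the port is the Hive database, while the `default` inside `httpPath` is the Knox topology; the two are independent.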

YARN Resource Manager UI

https://knox-host:8443/gateway/default/yarn

Ambari UI

Knox can proxy the Ambari web UI, allowing admin access through the Knox gateway without exposing Ambari's port 8442:

https://knox-host:8443/gateway/admin/ambari

Other proxied services

ODP typically exposes the following additional services through Knox:

Service             Knox path
HDFS NameNode UI    /gateway/default/hdfs
MapReduce History   /gateway/default/jobhistory
HBase UI            /gateway/default/hbase
Oozie               /gateway/default/oozie
Zeppelin            /gateway/default/zeppelin
NiFi                /gateway/default/nifi

LDAP Authentication

Knox authenticates users against LDAP (or FreeIPA's built-in LDAP) before forwarding requests to backend services.

Ambari configures LDAP settings under Knox → Advanced topology:

<!-- Excerpt from topology; Ambari generates this -->
<provider>
  <role>authentication</role>
  <name>ShiroProvider</name>
  <enabled>true</enabled>
  <param>
    <name>sessionTimeout</name>
    <value>30</value>
  </param>
  <param>
    <name>main.ldapRealm</name>
    <value>org.apache.hadoop.gateway.shirorealm.KnoxLdapRealm</value>
  </param>
  <param>
    <name>main.ldapRealm.userDnTemplate</name>
    <value>uid={0},cn=users,cn=accounts,dc=dev01,dc=hadoop,dc=clemlab,dc=com</value>
  </param>
  <param>
    <name>main.ldapRealm.contextFactory.url</name>
    <value>ldap://ipa01.dev01.hadoop.clemlab.com:389</value>
  </param>
</provider>
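
The `userDnTemplate` value determines the DN that Knox binds with: the `{0}` placeholder is replaced by the login name. The Python sketch below illustrates that substitution only. `expand_user_dn` and the user `alice` are hypothetical, and the sketch does not escape special characters in DNs.

```python
def expand_user_dn(template: str, username: str) -> str:
    """Substitute the {0} placeholder of a userDnTemplate with the login name."""
    if "{0}" not in template:
        raise ValueError("userDnTemplate must contain the {0} placeholder")
    return template.replace("{0}", username)


TEMPLATE = "uid={0},cn=users,cn=accounts,dc=dev01,dc=hadoop,dc=clemlab,dc=com"

# "alice" is an example login name, not an account from this cluster.
print(expand_user_dn(TEMPLATE, "alice"))
```

When a login fails, comparing the expanded DN with the actual entry in the directory is a quick way to spot a wrong container (for example `cn=users,cn=accounts`) in the template.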

SSO Configuration

Knox SSO provides browser-based single sign-on. After a user authenticates once to Knox, subsequent requests to other Knox-proxied services reuse the Knox token without re-prompting for credentials.

Knox Token Service

Knox issues signed JSON Web Tokens (JWT) after successful LDAP or Kerberos authentication:

  1. Browser hits https://knox-host:8443/gateway/knoxsso/api/v1/websso.
  2. Knox redirects to the LDAP login page (or performs Kerberos SPNEGO).
  3. On success, Knox issues a signed JWT cookie (hadoop-jwt).
  4. The browser presents the JWT to all Knox-proxied UIs automatically.
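
When debugging the flow above, it can be useful to look inside the `hadoop-jwt` cookie. A JWT has three base64url-encoded parts separated by dots, and the middle part holds the claims. The Python sketch below decodes the claims without verifying the signature, so it is suitable for inspection only. The sample token and its claim values are made up for illustration.

```python
import base64
import json


def decode_jwt_claims(token: str) -> dict:
    """Return the claims of a JWT without verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected header.payload.signature")
    payload = parts[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload))


def _b64url(data: dict) -> str:
    """Encode a dict the way JWT segments are encoded (base64url, no padding)."""
    raw = json.dumps(data).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


# Hypothetical token; a real value would be copied from the browser cookie.
sample = ".".join([
    _b64url({"alg": "RS256"}),
    _b64url({"sub": "alice", "exp": 1700000000}),
    "signature-placeholder",
])
print(decode_jwt_claims(sample))
```

Checking the `exp` claim against the current time is often enough to explain an unexpected redirect back to the login page.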

Enable Knox SSO in Ambari under Knox → Advanced topology by adding the federation provider with JWTProvider.


CSRF Protection

Knox includes built-in CSRF (Cross-Site Request Forgery) protection. Browser-based clients must include a custom HTTP header:

X-Requested-By: Knox

Most Hadoop clients (Beeline, HDFS CLI via Knox) add this header automatically. If you write custom REST clients, add the header explicitly:

curl -iku user:password \
-H "X-Requested-By: Knox" \
-X PUT \
"https://knox-host:8443/gateway/default/webhdfs/v1/user/test?op=MKDIRS"

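A custom client written in Python would attach the header in the same way. This is a minimal sketch using only the standard library; `knox_write_request` is a hypothetical helper, and authentication is omitted to keep the focus on the CSRF header.

```python
from urllib.request import Request


def knox_write_request(url: str, method: str = "PUT") -> Request:
    """Build a state-changing request that carries the X-Requested-By header."""
    return Request(url, method=method, headers={"X-Requested-By": "Knox"})


req = knox_write_request(
    "https://knox-host:8443/gateway/default/webhdfs/v1/user/test?op=MKDIRS"
)
# urllib stores header names via str.capitalize(), hence the lookup key below.
# HTTP header names are case-insensitive, so the server sees the same header.
print(req.get_method(), req.get_header("X-requested-by"))
```
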
TLS/SSL Configuration

Knox terminates TLS for all incoming connections. In ODP, Ambari's Auto-TLS feature generates and distributes certificates for Knox automatically.

Manual certificate configuration

If not using Ambari Auto-TLS, place the Knox gateway keystore at:

/usr/odp/current/knox-server/data/security/keystores/gateway.jks

Configure the keystore path and the excluded TLS/SSL protocol versions in gateway-site.xml:

<property>
  <name>ssl.exclude.protocols</name>
  <value>SSLv3,TLSv1,TLSv1.1</value>
</property>
<property>
  <name>gateway.tls.keystore.path</name>
  <value>/usr/odp/current/knox-server/data/security/keystores/gateway.jks</value>
</property>
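
To confirm what a given gateway-site.xml actually contains, the properties can be read with a few lines of Python. The sketch below assumes the file uses the Hadoop-style `<configuration>` root element around the `<property>` entries shown above; the embedded XML is a copy of that excerpt, not a complete file.

```python
import xml.etree.ElementTree as ET

GATEWAY_SITE = """
<configuration>
  <property>
    <name>ssl.exclude.protocols</name>
    <value>SSLv3,TLSv1,TLSv1.1</value>
  </property>
  <property>
    <name>gateway.tls.keystore.path</name>
    <value>/usr/odp/current/knox-server/data/security/keystores/gateway.jks</value>
  </property>
</configuration>
"""


def read_properties(xml_text: str) -> dict:
    """Map each <property> name to its value."""
    root = ET.fromstring(xml_text.strip())
    return {p.findtext("name"): p.findtext("value") for p in root.iter("property")}


props = read_properties(GATEWAY_SITE)
excluded = [p.strip() for p in props["ssl.exclude.protocols"].split(",")]
print("keystore:", props["gateway.tls.keystore.path"])
print("excluded protocols:", excluded)
```

With SSLv3, TLSv1, and TLSv1.1 excluded, clients need to support TLS 1.2 or later to connect.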

Client trust

Clients connecting to Knox must trust the Knox certificate. Distribute the CA certificate or the self-signed gateway certificate to clients:

# Import Knox cert into Java truststore
keytool -import -alias knox-gateway \
-file /path/to/knox.crt \
-keystore /etc/pki/ca-trust/extracted/java/cacerts \
-storepass changeit -noprompt
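
Non-Java clients do not read the Java truststore and need the certificate supplied separately. As a sketch, a Python client could build its TLS context as follows; `knox_ssl_context` is a hypothetical helper, and the certificate path you pass in is your own PEM file. The minimum version of TLS 1.2 mirrors the protocol exclusions shown in the gateway-site.xml excerpt.

```python
import ssl
from typing import Optional


def knox_ssl_context(ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Create a verifying TLS context, optionally trusting an extra CA file."""
    ctx = ssl.create_default_context()  # verifies certificate chain and hostname
    if ca_file:
        # PEM file holding the CA certificate or the self-signed gateway certificate.
        ctx.load_verify_locations(cafile=ca_file)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


ctx = knox_ssl_context()
print(ctx.verify_mode == ssl.CERT_REQUIRED, ctx.check_hostname)
```

Pass the resulting context to `urllib.request.urlopen(request, context=ctx)` instead of disabling verification.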

Port Reference

Port   Protocol   Purpose
8443   HTTPS      Knox gateway: all service proxying and Knox SSO
8444   HTTPS      Knox Admin API (topology CRUD, service management)

The Knox Admin API on port 8444 allows programmatic topology deployment:

# List deployed topologies
curl -iku admin:admin-password \
"https://knox-host:8444/gateway/admin/api/v1/topologies"

Topology File Structure (Reference)

Ambari generates topology files automatically, but understanding their structure helps when troubleshooting.

<topology>
  <gateway>
    <provider>
      <role>authentication</role>
      <name>ShiroProvider</name>
      <enabled>true</enabled>
      <!-- LDAP params -->
    </provider>
    <provider>
      <role>authorization</role>
      <name>XASecurePDPKnox</name>
      <enabled>true</enabled>
    </provider>
    <provider>
      <role>identity-assertion</role>
      <name>Default</name>
      <enabled>true</enabled>
    </provider>
    <provider>
      <role>ha</role>
      <name>HaProvider</name>
      <enabled>true</enabled>
      <param><name>WEBHDFS</name><value>maxFailoverAttempts=3;failoverSleep=1000;maxRetryAttempts=300;retrySleep=1000;enabled=true</value></param>
    </provider>
  </gateway>

  <service>
    <role>WEBHDFS</role>
    <url>http://master01.dev01.hadoop.clemlab.com:50070/webhdfs</url>
    <url>http://master02.dev01.hadoop.clemlab.com:50070/webhdfs</url>
  </service>

  <service>
    <role>HIVE</role>
    <url>http://master02.dev01.hadoop.clemlab.com:10001/cliservice</url>
    <url>http://master03.dev01.hadoop.clemlab.com:10001/cliservice</url>
  </service>
</topology>

Topology files are stored at:

/usr/odp/current/knox-server/conf/topologies/<topology-name>.xml

Ambari writes to this path during Knox restarts. Do not edit these files manually — use Ambari to ensure changes survive service restarts.
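
For troubleshooting, a topology file can be inspected read-only without touching it. The Python sketch below lists each service with its backend URLs and splits the HaProvider setting string into key/value pairs. The embedded XML is a shortened copy of the reference topology above, and `list_services` and `ha_settings` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

TOPOLOGY = """<topology>
  <gateway>
    <provider>
      <role>ha</role>
      <name>HaProvider</name>
      <enabled>true</enabled>
      <param><name>WEBHDFS</name><value>maxFailoverAttempts=3;failoverSleep=1000;enabled=true</value></param>
    </provider>
  </gateway>
  <service>
    <role>WEBHDFS</role>
    <url>http://master01.dev01.hadoop.clemlab.com:50070/webhdfs</url>
    <url>http://master02.dev01.hadoop.clemlab.com:50070/webhdfs</url>
  </service>
</topology>"""


def list_services(root):
    """Map each service role to its list of backend URLs."""
    return {s.findtext("role"): [u.text for u in s.findall("url")]
            for s in root.findall("service")}


def ha_settings(root, role):
    """Parse the 'key=value;key=value' HA string for one service role."""
    for provider in root.findall("gateway/provider"):
        if provider.findtext("role") != "ha":
            continue
        for param in provider.findall("param"):
            if param.findtext("name") == role:
                items = param.findtext("value").split(";")
                return dict(item.split("=", 1) for item in items if item)
    return {}


root = ET.fromstring(TOPOLOGY)
print(list_services(root))
print(ha_settings(root, "WEBHDFS"))
```

To run this against a deployed topology, replace `ET.fromstring(TOPOLOGY)` with `ET.parse(path).getroot()`, where `path` points to the topology file location given above.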