Securing HDB and Isilon with Kerberos

The following sections describe how to configure MIT Kerberos 5 authentication with Pivotal HD and HAWQ.

Note: Pivotal recommends that you install HDP and integrate it with EMC Isilon without any security configuration before enabling security features. This helps to ensure that all services are installed and running correctly.

Steps for Securing HDB and Isilon with Kerberos

This section describes the high-level steps for configuring Kerberos authentication in your HDP environment when using EMC Isilon as the HDFS layer. Note that the steps differ depending on whether you use a standalone KDC or Kerberos with Active Directory.

If you are using a standalone KDC, follow these steps (in order) to configure Kerberos authentication:

If you are using Kerberos with Active Directory, follow these steps (in order) to configure Kerberos authentication:

Note: Make sure that you have configured NTP and that there is no time skew between the cluster hosts and EMC Isilon nodes.
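
A quick way to check for skew, assuming the hosts run ntpd (the commands below are illustrative, not part of the original procedure):

    # Show NTP peers and current offsets on each host
    ntpq -p

    # Compare wall-clock time across hosts; Kerberos typically rejects
    # requests when clocks differ by more than five minutes
    date -u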

Note: Make sure that forward and reverse DNS lookups are configured correctly, and that all cluster and EMC Isilon hosts resolve correctly. In DNS, you must configure each EMC Isilon IP address with a reverse PTR record that points to the correct SmartConnect zone name.
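
You can spot-check resolution with nslookup; the SmartConnect zone name and IP address below are placeholders for your environment:

    # Forward lookup of the SmartConnect zone name
    nslookup isilon-hdfs.vlan172.fe.example.com

    # Reverse lookup of an Isilon node IP address; the PTR record must
    # return the SmartConnect zone name
    nslookup 172.16.0.10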

Note: If you are using Kerberos with Active Directory, then all users must be created and managed via Active Directory.

Install Standalone KDC

This section outlines a simple stand-alone krb5 KDC setup.

These instructions were largely derived from Kerberos: The Definitive Guide by Jason Garman, O'Reilly, pages 53-62.

  1. Install the Kerberos packages (krb5-libs, krb5-workstation, and krb5-server) on the KDC host.
  2. Define your REALM in /etc/krb5.conf as shown below:

    • Set the kdc and admin_server variables to the resolvable hostname of the KDC host.
    • Set the default_realm to your REALM.

    In the following example, REALM was changed to VLAN172.FE.EXAMPLE.COM and the admin server and KDC host were changed to eng-dca-mdw.vlan172.fe.example.com:

    [logging]
     default = FILE:/var/log/krb5libs.log
     kdc = FILE:/var/log/krb5kdc.log
     admin_server = FILE:/var/log/kadmind.log
    
    [libdefaults]
     default_realm = VLAN172.FE.EXAMPLE.COM
     dns_lookup_realm = false
     dns_lookup_kdc = false
     ticket_lifetime = 24h
     renew_lifetime = 7d
     forwardable = true
    
    [realms]
     VLAN172.FE.EXAMPLE.COM = {
      kdc = eng-dca-mdw.vlan172.fe.example.com
      admin_server = eng-dca-mdw.vlan172.fe.example.com
     }
    
    [domain_realm]
     .vlan172.fe.example.com = VLAN172.FE.EXAMPLE.COM
     vlan172.fe.example.com = VLAN172.FE.EXAMPLE.COM
    
  3. Set up /var/kerberos/krb5kdc/kdc.conf:

    • Isilon OneFS does not support using AES-256, so DO NOT uncomment the master_key_type line.
    • Remove AES-256 from the supported_enctypes line.
    • Add a key_stash_file entry: /var/kerberos/krb5kdc/.k5.REALM.
    • Set the maximum ticket lifetime and renew lifetime to your desired values (24 hours and 7 days are typical).
    • Add the kadmind_port entry: kadmind_port = 749.

    Important: The stash file lets the KDC start without requiring the master password to be entered manually. The resulting kdc.conf (NOT using AES-256) for the above REALM is:

    [kdcdefaults]
     kdc_ports = 88
     kdc_tcp_ports = 88
    
    [realms]
     VLAN172.FE.EXAMPLE.COM = {
      #master_key_type = des3-hmac-sha1
      acl_file = /var/kerberos/krb5kdc/kadm5.acl
      dict_file = /usr/share/dict/words
      admin_keytab = /var/kerberos/krb5kdc/kadm5.keytab
      key_stash_file = /var/kerberos/krb5kdc/.k5.VLAN172.FE.EXAMPLE.COM
      max_life = 24h 0m 0s
      max_renewable_life = 7d 0h 0m 0s
      kadmind_port = 749
      supported_enctypes = des3-hmac-sha1:normal arcfour-hmac:normal des-hmac-sha1:normal des-cbc-md5:normal des-cbc-crc:normal
     }
    
  4. Create the KDC master password by running:

    kdb5_util create -s
    

    Do NOT forget this password; it is the KDC master password. The command typically completes quickly, but it can take 5-10 minutes if the system has trouble gathering the random bytes (entropy) it needs.
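
    If kdb5_util blocks while waiting for entropy, one optional workaround (not part of the original procedure; it assumes the rng-tools package is available) is to run the rngd entropy daemon:

        # Feed the kernel entropy pool so kdb5_util does not block.
        # On hosts without a hardware RNG, you may also need to point
        # rngd at /dev/urandom via /etc/sysconfig/rngd.
        yum install -y rng-tools
        service rngd start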

  5. Add an administrator account as username/admin@REALM. Run the kadmin.local application from the command line:

    kadmin.local: addprinc root/admin@VLAN172.FE.EXAMPLE.COM
    

    Type quit to exit kadmin.local.

    Important: The KDC does not need to be running to add a principal.
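
    You can verify that the principal was added by listing all principals, either inside kadmin.local or with the -q option:

        kadmin.local -q "listprincs"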

  6. Start the KDC by running:

    /etc/init.d/krb5kdc start
    

    You should get an [OK] indication if it started without error. If there are errors, correct the condition after examining the log file: /var/log/krb5kdc.log.
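
    To confirm that the daemon is running, or to see the most recent errors, you can run:

        /etc/init.d/krb5kdc status
        tail -n 50 /var/log/krb5kdc.log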

  7. Edit /var/kerberos/krb5kdc/kadm5.acl and change the admin permissions username from * to your admin. You can add other admins with specific permissions if you want. This is a sample ACL file:

    root/admin@VLAN172.FE.EXAMPLE.COM     *
    
  8. Use kadmin.local on the KDC to enable remote access for the administrator(s):

    kadmin.local: ktadd -k /var/kerberos/krb5kdc/kadm5.keytab kadmin/admin kadmin/changepw
    

    Important: kadmin.local is a KDC host-only version of kadmin that can do things remote kadmin cannot (such as use the -norandkey option in ktadd).

  9. Start kadmind:

    /etc/init.d/kadmin start
    

    The KDC configuration is now done and the KDC is ready to use.
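
    To have the KDC and kadmind start automatically at boot (assuming the chkconfig-based init system that matches the /etc/init.d scripts used above):

        chkconfig krb5kdc on
        chkconfig kadmin on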

Configure Cluster Hosts for Standalone KDC

  1. Install krb5-libs, krb5-auth-dialog, and krb5-workstation on all cluster hosts, including any client/gateway hosts.

    yum install krb5-libs krb5-workstation krb5-auth-dialog
    
  2. Use scp to copy /etc/krb5.conf from the KDC host to all cluster hosts.

  3. Validate with a simple kinit test from any cluster host:

    kinit root/admin
    

    You should be able to log in as root/admin from any cluster host. If you receive errors, make sure that your KDC is running and reachable on the network from every host. Also examine the /var/log/krb5kdc.log file on the KDC host for any error conditions that are being reported.
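
    After a successful kinit, you can confirm that a ticket-granting ticket was issued:

        klist
        # Expect a krbtgt entry for your REALM, for example:
        #   krbtgt/VLAN172.FE.EXAMPLE.COM@VLAN172.FE.EXAMPLE.COM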

Configure Cluster Hosts for Active Directory

  1. Install krb5-libs, krb5-auth-dialog, and krb5-workstation on all cluster hosts, including any client/gateway hosts.

        yum install krb5-libs krb5-workstation krb5-auth-dialog
    
  2. Edit /etc/krb5.conf on any one host in the cluster as shown here:

        [logging]
        default = FILE:/var/log/krb5libs.log
        kdc = FILE:/var/log/krb5kdc.log
        admin_server = FILE:/var/log/kadmind.log
    
        [libdefaults]
        default_realm = VLAN172.FE.EXAMPLE.COM
        dns_lookup_realm = false
        dns_lookup_kdc = false
        ticket_lifetime = 24h
        renew_lifetime = 7d
        forwardable = true
    
        [realms]
        VLAN172.FE.EXAMPLE.COM = {
        kdc = eng-dca-ad-srvr.vlan172.fe.example.com
        #admin_server = eng-dca-ad-srvr.vlan172.fe.example.com
        default_domain = vlan172.fe.example.com
        kpasswd_server = eng-dca-ad-srvr.vlan172.fe.example.com
        }
    
        [domain_realm]
        .vlan172.fe.example.com = VLAN172.FE.EXAMPLE.COM
        vlan172.fe.example.com = VLAN172.FE.EXAMPLE.COM
    

    Note: Make sure to point the kdc and the kpasswd_server parameters to your AD domain controller. In the above example, eng-dca-ad-srvr.vlan172.fe.example.com is the AD Domain controller for the environment.

  3. Use scp to copy the edited /etc/krb5.conf to all remaining cluster hosts.

  4. Validate with a simple kinit test from any cluster host, after logging in as a user:

    su - hdfs
    kinit
    

    You should be able to log in as a user from any cluster host. If you see errors, make sure that your krb5.conf file is configured correctly.
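
    Because Isilon OneFS does not support AES-256, it can also be useful to confirm the encryption types of the tickets you receive; klist -e prints the enctype of each ticket:

        klist -e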

Prepare EMC Isilon for Standalone KDC

Follow these steps to configure EMC Isilon to use a stand-alone MIT Kerberos 5 KDC:

  1. From the OneFS Web UI, navigate to Access, Authentication Providers and click Get Started on the Kerberos Provider tab as shown below:

    Image23

  2. Fill out the form with all the information as it pertains to your environment as shown below.

    Note: Use all uppercase letters when you define a REALM, and use lowercase letters when you define domains either in the FQDN of KDC hosts or in domain mappings, as shown in the example below.

    Image24

  3. After successfully creating a Kerberos REALM and provider, the Kerberos Provider tab becomes available:

    Image25

  4. Click on the Kerberos Settings tab and enable the settings shown here:

    Image26

    Note: Remember to click Save Changes after you have modified the settings.

  5. Using the OneFS CLI on any one of the Isilon nodes, configure your HDFS zone to use Kerberos as the authentication mechanism. Replace <zone_name> with your Isilon zone name in this command:

    isi zone zones modify --hdfs-authentication=kerberos_only <zone_name>
    

    Note: If you do not recall your HDFS zone name, you can use the command isi zone zones list to list all zones, and use the <zone_name> that serves up the HDFS layer.

  6. In order for the Hive service user to impersonate other users correctly, you must create a proxy user called hive on the Isilon cluster:

    isi hdfs proxyusers create --proxyuser hive --add-group <group of users>  
    

    OR

    isi hdfs proxyusers create --proxyuser hive --add-user <single username>
    

    Note: Use a group such as Users or create a Hadoop-specific user group so that any users that you later add to the group on the Isilon side can use Hive successfully.
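
    To confirm the proxy user configuration, you can list the configured proxy users from the OneFS CLI; the exact subcommands and flags vary across OneFS releases, so treat this as a sketch:

        isi hdfs proxyusers list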

Prepare EMC Isilon for Active Directory

Follow these steps to configure EMC Isilon to use Active Directory:

  1. From the OneFS Web UI, navigate to Access, Authentication Providers and click Get Started on the Active Directory Provider tab as shown below:

    Image40

  2. Fill out the form with all of the information as it pertains to your environment:

    Note: Use all uppercase letters when defining a DOMAIN as shown here:

    Image41

  3. You do not need to configure a Kerberos Provider tab, but you do need to make sure that the Kerberos Settings are similar to the following:

    Image42

  4. Use this command on the Isilon CLI to set the authentication mechanism on your Access Zone to use Active Directory:

        isi zone zones modify <zone name> --add-auth-providers ads:<domain name>
    

    Note: In the above command, use System as the <zone name> if you have not created a separate Access Zone. You can use isi zone zones view <zone name> to make sure that your settings are correctly saved.

  5. Use this command on the Isilon CLI to list all of the SPNs on the EMC Isilon cluster:

        isi auth ads spn list --domain=<domain name>
    
  6. Verify that an SPN exists for hdfs/<SmartConnect FQDN>@DOMAIN. If it does not exist, use isi auth ads spn create hdfs/<SmartConnect FQDN>@DOMAIN to create it.

  7. Restart the HDFS service on the EMC Isilon cluster from the CLI:

        isi services isi_hdfs_d disable ; isi services isi_hdfs_d enable
    

Enable Security in Ambari

  1. In the Ambari UI, click Admin in the top right corner, and then click Security.
  2. Click Enable Security to start the wizard and follow the on-screen instructions to enable Kerberos security:

    1. Customize the General tab as shown below: Image27 Note: Make sure that the principal and the path to the keytab file are correct for your system.
    2. Customize the HDFS tab as shown below: Image28 Note: Because none of the compute nodes in the cluster run any NameNode or DataNode services when running with EMC Isilon as the HDFS layer, you do not need to change any settings for them in the security wizard. You will customize the HDFS configs after the security wizard completes.
    3. Customize the MapReduce2 tab as shown below: Image29 Note: Make sure that the principal and the path to the keytab file are correct for your system.
    4. Customize the YARN tab as shown below: Image30 Image31 Note: Make sure that the principal and the path to the keytab file are correct for your system.
    5. Customize the Hive tab as shown below: Image32 Note: Make sure that the principal and the path to the keytab file are correct for your system.
    6. Customize the HBase tab as shown below: Image33 Note: Make sure that the principal and the path to the keytab file are correct for your system.
    7. Customize the Oozie tab as shown below: Image34 Note: Make sure that the principal and the path to the keytab file are correct for your system.
    8. Customize the Zookeeper tab as shown below: Image35 Note: Make sure that the principal and the path to the keytab file are correct for your system.
    9. Click Next and follow the on-screen instructions to run the script to create service principals on your KDC:

      1. Run the script to create principals on the Ambari server.
      2. Run the script to copy the keytabs to each host in the cluster.
  3. If a service fails to start after the wizard completes, review the service log file on the corresponding host and correct the error condition, unless the error is for HAWQ or PXF; those errors are resolved in a later step. You can click Done and resolve any other service startup issues independently of the wizard.

  4. Modify mapreduce.application.classpath in order for MapReduce jobs to run correctly.

    In the Ambari UI, navigate to the service configs for MapReduce2 and search for classpath as shown here:

    Image36

    Copy and paste the following into the input box so that the value of mapreduce.application.classpath is similar to:

    `hadoop classpath`:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:$PWD/mr-framework/hadoop/share/hadoop/tools/lib/*:/usr/${stack.name}/${stack.version}/hadoop/lib/hadoop-lzo-0.6.0.${stack.version}.jar:/etc/hadoop/conf/secure
    
  5. Customize HDFS configuration so that HDFS can run correctly.

    In the Ambari UI, navigate to the service configs for HDFS and add the custom key/value pairs to Custom hdfs-site.xml as shown here:

    Image37

    Note: Make sure that the principal and the path to the keytab file are correct for your system.

    Note: If the dfs.namenode.kerberos.principal.pattern property does not exist, you need to add it by clicking Add Property.
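
    The exact values come from the screenshot above and will differ per system, but as a representative sketch, the pattern property is commonly set to a wildcard in Isilon deployments so that clients accept the Isilon NameNode principal:

        dfs.namenode.kerberos.principal.pattern=*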

  6. Also add the property to Custom core-site.xml as shown below:

    Image43

  7. Restart services as the Ambari UI indicates, and validate that the cluster is up and running with Security enabled.

Securing HAWQ and PXF with MIT Kerberos 5

This section describes how to configure Kerberos authentication for HAWQ and PXF when using EMC Isilon as your HDFS layer.

Note: Make sure that you have completed all of the preceding steps for securing HDP and EMC Isilon with Kerberos before performing any steps in this section.

Note: Pivotal recommends that you install HAWQ and PXF via Ambari before you start to enable any security configuration.

Securing HAWQ and PXF with stand-alone MIT Kerberos 5 KDC or Active Directory

Note:
Replace FQDN in the commands below with the fully-qualified hostname of the server that will run the PXF service.

Replace REALM in the commands below with the REALM name that you configured on your KDC. In the case of AD, use the domain name instead of the REALM.

Use all uppercase letters when you define a REALM, and use lowercase letters when you define domains in a FQDN.

  1. Log in to the KDC server as root.

  2. Use kadmin.local to create a new principal for the postgres user:

        kadmin.local -q "addprinc -randkey postgres@REALM"
    
  3. Create pxf and HTTP service principals for each host in the cluster that will run PXF services:

        kadmin.local -q "addprinc -randkey pxf/FQDN@REALM"
        kadmin.local -q "addprinc -randkey HTTP/FQDN@REALM"
    
  4. Create a keytab for the newly created principals for each corresponding host:

        kadmin.local -q "xst -k /etc/security/hdp/keytab/hawq.service.keytab postgres@REALM"
        kadmin.local -q "xst -k /etc/security/hdp/keytab/pxf-FQDN1.service.keytab pxf/FQDN1@REALM"
        kadmin.local -q "xst -k /etc/security/hdp/keytab/pxf-FQDN2.service.keytab pxf/FQDN2@REALM"
    

    Note: Repeat the xst command as necessary to generate a keytab for each PXF service principal that you created in the previous step.
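
    If you have many PXF hosts, a short shell loop can generate the per-host keytabs; the pxf_hosts.txt file name is illustrative:

        # pxf_hosts.txt contains one PXF host FQDN per line
        while read fqdn; do
          kadmin.local -q "xst -k /etc/security/hdp/keytab/pxf-${fqdn}.service.keytab pxf/${fqdn}@REALM"
        done < pxf_hosts.txt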

  5. The HAWQ master server also requires an hdfs.keytab file for the HDFS principal to be available under /etc/security/hdp/keytab. If this file does not already exist, generate it using these commands:

        kadmin.local -q "addprinc -randkey hdfs@REALM"
        kadmin.local -q "xst -k /etc/security/hdp/keytab/hdfs.keytab hdfs@REALM"
    
  6. Copy the HAWQ service keytab file (and the hdfs.keytab file if you created one) to the HAWQ master segment host:

        scp /etc/security/hdp/keytab/hawq.service.keytab hawq_master_fqdn:/etc/security/hdp/keytab/hawq.service.keytab
        scp /etc/security/hdp/keytab/hdfs.keytab hawq_master_fqdn:/etc/security/hdp/keytab/hdfs.keytab
    
  7. Change the ownership and permissions on hawq.service.keytab (and on hdfs.keytab if you copied it) as follows:

        ssh hawq_master_fqdn chown gpadmin:gpadmin /etc/security/hdp/keytab/hawq.service.keytab
        ssh hawq_master_fqdn chmod 400 /etc/security/hdp/keytab/hawq.service.keytab
        ssh hawq_master_fqdn chown hdfs:hdfs /etc/security/hdp/keytab/hdfs.keytab
        ssh hawq_master_fqdn chmod 440 /etc/security/hdp/keytab/hdfs.keytab
    
  8. Copy the keytab file for each PXF service principal to its respective PXF host:

        scp /etc/security/hdp/keytab/pxf-FQDN1.service.keytab FQDN1:/etc/security/hdp/keytab/pxf.service.keytab
        scp /etc/security/hdp/keytab/pxf-FQDN2.service.keytab FQDN2:/etc/security/hdp/keytab/pxf.service.keytab
    

    Note: Repeat this step for all of the keytab files that you created. Record the full path of the keytab file on each host machine. During installation, you will need to provide the path to finish configuring Kerberos for HAWQ.

  9. Change the ownership and permissions on the pxf.service.keytab files:

        ssh FQDN1 chown pxf:pxf /etc/security/hdp/keytab/pxf.service.keytab
        ssh FQDN1 chmod 400 /etc/security/hdp/keytab/pxf.service.keytab
        ssh FQDN2 chown pxf:pxf /etc/security/hdp/keytab/pxf.service.keytab
        ssh FQDN2 chmod 400 /etc/security/hdp/keytab/pxf.service.keytab
    

    Note: Repeat this step for all of the keytab files that you copied.

  10. Start HAWQ and PXF services via Ambari.

  11. Run the following commands on the hawq_master host to enable security in HAWQ and specify the Kerberos keytab file location:

        hawq config -c enable_secure_filesystem -v "on"
        hawq config -c krb_server_keyfile -v "'/etc/security/hdp/keytab/hawq.service.keytab'"
    
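    You can confirm that the settings took effect by printing each parameter back; this assumes the hawq config -s (show) option available in HAWQ 2.x:

        hawq config -s enable_secure_filesystem
        hawq config -s krb_server_keyfile
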
  12. Validate that the HAWQ Service configs in the Ambari UI look like the following:

    Image38

  13. Validate that the PXF Service configs in the Ambari UI look like the following:

    Image39

  14. Restart HAWQ and PXF from the Ambari UI and make sure that everything starts up correctly.