Installing PXF Plug-ins

This topic describes how to install the built-in PXF service plug-ins that are required to connect PXF to HDFS, Hive, and HBase.

Note: The PXF plug-ins require that you run Tomcat on the host machine. Tomcat reserves ports 8005, 8080, and 8009. If you have configured Oozie JMX reporting on a host that will run a PXF plug-in, make sure that the reporting service uses a port other than 8005. Doing so helps prevent port conflict errors when you start the PXF service.
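
One way to look for conflicts before starting the PXF service is to scan the host's listening sockets for the three Tomcat ports. The following is a minimal sketch, not part of PXF: the function name pxf_port_conflicts is hypothetical, and it assumes input in the column layout printed by `ss -ltn` (state in column 1, local address and port in column 4).

```shell
#!/bin/sh
# Sketch: print each Tomcat port (8005, 8080, 8009) that already has a listener.
# Reads `ss -ltn`-style output on stdin so it can be checked against canned text.
pxf_port_conflicts() {
    awk '
        BEGIN { want["8005"]; want["8080"]; want["8009"] }
        $1 == "LISTEN" {
            n = split($4, parts, ":")   # port is the field after the last colon
            port = parts[n]
            if ((port in want) && !(port in seen)) { seen[port]; print port }
        }
    '
}

# Typical use on a live host (assumes the iproute2 `ss` tool is installed):
#   ss -ltn | pxf_port_conflicts
```

Empty output means none of the three ports currently has a listener. Any port printed should be freed or reassigned before the PXF service starts.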

PXF Installation and Log File Directories

Regardless of the installation method, installing PXF plug-ins creates the following directories and log files on each node on which you install a plug-in:

Directory                    Description
/usr/lib/pxf                 PXF library location
/etc/pxf/conf                PXF configuration directory. This directory contains the pxf-public.classpath and pxf-private.classpath configuration files. See Setting up the Java Classpath.
/var/pxf/pxf-service         PXF service instance location
/var/log/pxf                 This directory includes pxf-service.log and all Tomcat-related logs including catalina.out. Logs are owned by user:group pxf:pxf. Other users have read access.
/var/run/pxf/catalina.pid    PXF Tomcat container PID location
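
After an installation, a quick check of these locations can confirm that a node was set up as expected. The sketch below is illustrative only: the function name pxf_missing_paths and its optional root-prefix argument are assumptions added for testing against a scratch directory. It leaves out /var/run/pxf/catalina.pid because a PID file is normally present only while the service is running.

```shell
#!/bin/sh
# Sketch: print every expected PXF directory that is missing.
# $1 is an optional root prefix (hypothetical, useful for dry runs in a temp dir).
pxf_missing_paths() {
    root="${1:-}"
    for p in /usr/lib/pxf /etc/pxf/conf /var/pxf/pxf-service /var/log/pxf; do
        [ -e "${root}${p}" ] || echo "$p"
    done
}

# Typical use on an installed node:
#   pxf_missing_paths
```

No output means every directory in the table was found.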

Installing PXF Plug-ins Using Ambari

If you are using Ambari to install and manage your HAWQ cluster, you do not need to follow the manual installation steps in this topic. The Ambari web interface installs all of the necessary PXF plug-in components.

Installing PXF Plug-ins from the Command Line

Each PXF service plug-in resides in its own RPM. You may have built these RPMs in the Apache HAWQ open source project repository (see PXF Build Instructions), or these RPMs may have been included in a commercial product download package.

RPMs for PXF plug-ins must be installed on each node in your cluster.
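
Because every node needs the same packages, it can help to generate the per-node commands from a host list and review them before running anything. The following sketch only prints commands. It assumes a hypothetical hosts file with one hostname per line, passwordless ssh and sudo, and an RPM already copied to the same path on every node; none of these are requirements stated by PXF.

```shell
#!/bin/sh
# Sketch: print (do not run) one ssh install command per cluster node.
# $1 = hosts file (hypothetical: one hostname per line), $2 = RPM path on the nodes.
pxf_print_install_cmds() {
    hosts_file="$1"
    rpm_path="$2"
    while IFS= read -r host; do
        case "$host" in
            ''|'#'*) continue ;;    # skip blank lines and comment lines
        esac
        printf 'ssh %s sudo rpm -i %s\n' "$host" "$rpm_path"
    done < "$hosts_file"
}

# Example (file and path names are placeholders):
#   pxf_print_install_cmds ./pxf_hosts /tmp/pxf-service-n.n.n-x.noarch.rpm
```

Printing instead of executing keeps the sketch safe to try. Once the output looks right, it can be piped to a shell or adapted to whatever cluster tooling is already in use.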

Installing Prerequisite Packages

All PXF plug-ins require that both the Tomcat and PXF service packages be installed on each node in your cluster.

Install Tomcat:

$ sudo rpm -i apache-tomcat-x.x.x.noarch.rpm

where x.x.x corresponds to the version of Tomcat required by PXF. The appropriate version of Tomcat is included in the PXF RPM bundle.

Install the PXF service:

$ sudo rpm -i pxf-service-n.n.n-x.noarch.rpm

where n.n.n-x corresponds to the PXF version and build number you wish to install.

Installing the PXF service package:

  • creates a /usr/lib/pxf-n.n.n directory, adding a softlink from /usr/lib/pxf to this directory
  • copies the PXF service JAR file pxf-service-n.n.n.jar to /usr/lib/pxf-n.n.n/
  • creates a softlink pxf-service.jar in /usr/lib/pxf-n.n.n/
  • sets up the PXF service configuration files
  • starts the PXF service
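
The link layout in the list above can be spot-checked after the package is installed. This is a minimal sketch under stated assumptions: the function name pxf_check_service_links and the optional root-prefix argument are hypothetical, and the check covers only the /usr/lib/pxf softlink and the pxf-service.jar softlink reached through it.

```shell
#!/bin/sh
# Sketch: confirm /usr/lib/pxf is a symlink and pxf-service.jar resolves through it.
# $1 is an optional root prefix (hypothetical, for testing in a scratch directory).
pxf_check_service_links() {
    root="${1:-}"
    lib="${root}/usr/lib/pxf"
    [ -L "$lib" ] || { echo "missing symlink: /usr/lib/pxf"; return 1; }
    [ -e "$lib/pxf-service.jar" ] || { echo "missing: /usr/lib/pxf/pxf-service.jar"; return 1; }
    echo "ok"
}

# Typical use on an installed node:
#   pxf_check_service_links
# To confirm that the package itself is registered with RPM:
#   rpm -q pxf-service
```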

Installing the PXF HDFS Plug-in

To install PXF support for HDFS, perform the following steps on each node in your cluster:

  1. Install the Tomcat and PXF service packages as described in the previous section.

  2. Install the PXF HDFS plug-in:

    $ sudo rpm -i pxf-hdfs-n.n.n-x.noarch.rpm
    

    The install copies the HDFS JAR file pxf-hdfs-n.n.n.jar to /usr/lib/pxf-n.n.n/ and creates a softlink pxf-hdfs.jar in that directory.

Installing the PXF Hive Plug-in

To install PXF support for Hive, perform the following steps on each node in your cluster:

  1. Install the Tomcat, PXF service, and PXF HDFS RPMs as previously described.

  2. Install the PXF Hive plug-in:

    $ sudo rpm -i pxf-hive-n.n.n-x.noarch.rpm
    

    The install copies the Hive JAR file pxf-hive-n.n.n.jar to /usr/lib/pxf-n.n.n/ and creates a softlink pxf-hive.jar in that directory.
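
The plug-ins in this topic share an installation order: the PXF service package first, then the HDFS plug-in, then the Hive or HBase plug-in if required. The sketch below prints the PXF RPM file names in that order for one plug-in. The function name pxf_rpm_order is hypothetical, and the version and build arguments are placeholders supplied by the caller. Tomcat is left out because its version number differs from the PXF version.

```shell
#!/bin/sh
# Sketch: print PXF RPM file names, in install order, for a given plug-in.
# $1 = plug-in (hdfs, hive, or hbase), $2 = PXF version n.n.n, $3 = build number x.
pxf_rpm_order() {
    plugin="$1"; version="$2"; build="$3"
    case "$plugin" in
        hdfs)       names="service hdfs" ;;
        hive|hbase) names="service hdfs $plugin" ;;
        *) echo "unknown plug-in: $plugin" >&2; return 1 ;;
    esac
    for n in $names; do
        echo "pxf-${n}-${version}-${build}.noarch.rpm"
    done
}

# Example: install everything the Hive plug-in needs, in order
# (assumes the RPM files are in the current directory and Tomcat is installed):
#   for f in $(pxf_rpm_order hive n.n.n x); do sudo rpm -i "$f"; done
```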

Installing the PXF HBase Plug-in

To install PXF support for HBase, perform the following steps on each node in your cluster:

  1. Install the Tomcat, PXF service, and PXF HDFS RPMs as previously described.

  2. Install the PXF HBase plug-in:

    $ sudo rpm -i pxf-hbase-n.n.n-x.noarch.rpm
    

    The install copies the HBase JAR file pxf-hbase-n.n.n.jar to /usr/lib/pxf-n.n.n/ and creates a softlink pxf-hbase.jar in that directory.

  3. Add the PXF HBase plug-in JAR file to the HBase CLASSPATH by updating the HBASE_CLASSPATH environment variable setting in the HBase environment file /etc/hbase/conf/hbase-env.sh:

    export HBASE_CLASSPATH=${HBASE_CLASSPATH}:/usr/lib/pxf/pxf-hbase.jar
    
  4. Restart the HBase service after making this update to the HBase configuration.

    If you are on the HBase Master node:

    $ su -l hbase -c "/usr/hdp/current/hbase-master/bin/hbase-daemon.sh restart master; sleep 25"
    

    If you are on an HBase Region Server node:

    $ su -l hbase -c "/usr/hdp/current/hbase-regionserver/bin/hbase-daemon.sh restart regionserver"
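
The HBASE_CLASSPATH update in step 3 is easy to apply twice by accident when a node is reconfigured. The following sketch appends the export line only if the file does not already contain it. The function name pxf_add_hbase_classpath is hypothetical, and the caller is assumed to have write access to the environment file.

```shell
#!/bin/sh
# Sketch: append the PXF HBase classpath export to an HBase env file exactly once.
# $1 = path to the environment file, for example /etc/hbase/conf/hbase-env.sh.
pxf_add_hbase_classpath() {
    env_file="$1"
    # Single quotes keep ${HBASE_CLASSPATH} literal so HBase expands it at startup.
    line='export HBASE_CLASSPATH=${HBASE_CLASSPATH}:/usr/lib/pxf/pxf-hbase.jar'
    # -x matches the whole line, -F treats the pattern as a fixed string.
    grep -qxF "$line" "$env_file" 2>/dev/null || printf '%s\n' "$line" >> "$env_file"
}

# Typical use (run as a user allowed to edit the file):
#   pxf_add_hbase_classpath /etc/hbase/conf/hbase-env.sh
```

Running the function a second time leaves the file unchanged. The HBase restart in step 4 is still required for the new setting to take effect.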