Using Profiles to Read and Write Data
PXF profiles are collections of common metadata attributes that can be used to simplify the reading and writing of data. You can use any of the built-in profiles that come with PXF or you can create your own.
For example, to write single-line records to text files on HDFS, you can use the built-in HdfsTextSimple profile. You specify the profile when you create the PXF external table that writes the data to HDFS.
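As an illustration of where the profile name goes, the sketch below assembles a writable external table statement whose LOCATION URI names the HdfsTextSimple profile. It assumes the usual `pxf://<host>:<port>/<path>?PROFILE=<name>` URI form. The host `namenode`, port `51200`, HDFS path, table name, and columns are hypothetical placeholders, not values taken from this document.

```python
def pxf_location(host, port, path, profile):
    """Build a pxf:// URI that selects a profile through its query string."""
    return f"pxf://{host}:{port}/{path.lstrip('/')}?PROFILE={profile}"


def writable_table_ddl(table, columns, location, delimiter=","):
    """Return DDL text for a writable external table that stores delimited text."""
    cols = ", ".join(f"{name} {sqltype}" for name, sqltype in columns)
    return (
        f"CREATE WRITABLE EXTERNAL TABLE {table} ({cols})\n"
        f"LOCATION ('{location}')\n"
        f"FORMAT 'TEXT' (DELIMITER '{delimiter}');"
    )


# Hypothetical host, port, path, table, and columns; substitute your own values.
location = pxf_location("namenode", 51200, "/data/sales/out", "HdfsTextSimple")
ddl = writable_table_ddl("sales_out", [("id", "int"), ("amount", "float8")], location)
print(ddl)
```

The profile name is the only part of the statement that selects how the data is handled, so changing profiles means changing just the `PROFILE` value in the URI.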
PXF comes with a number of built-in profiles, each of which groups together a collection of metadata attributes. The built-in profiles simplify access to the following types of data storage systems:
- HDFS File Data (Read + Write)
- Hive (Read only)
- HBase (Read only)
You can specify a built-in profile when you want to read data stored in HDFS files, Hive tables, or HBase tables, or when you want to write data to HDFS files.
|Profile|Description|
|---|---|
|HdfsTextSimple|Read or write delimited single-line records from or to plain text files on HDFS.|
|HdfsTextMulti|Read delimited single-line or multi-line records (with quoted linefeeds) from plain text files on HDFS. This profile is not splittable (no parallel reads), so reading is slower than with HdfsTextSimple.|
|Hive|Use this profile when connecting to Hive. The Hive table can use any of the available storage formats: text, RC, ORC, Sequence, or Parquet.|
|HiveRC|Use this profile when connecting to a Hive table where each partition is stored as an RCFile. The profile is optimized for this format.|
|HiveText|Use this profile when connecting to a Hive table where each partition is stored as a text file. The profile is optimized for this format.|
|HBase|Use this profile when connecting to an HBase data store.|
|Avro|Use this profile for reading Avro files (fileName.avro).|
Administrators can add new profiles or edit the built-in profiles in /etc/pxf/conf/pxf-profiles.xml. You can use any profile defined in this file.
Note: Add any JAR files that contain custom profile plug-ins to the /etc/pxf/conf/pxf-public.classpath configuration file.
Each profile has a mandatory unique name and an optional description.
In addition, each profile lists a set of plug-ins, which form an extensible set of metadata attributes.
After you make changes in pxf-profiles.xml (or any other PXF configuration file), propagate the changes to all nodes with PXF installed, and then restart the PXF service on all nodes.
<profile>
    <name>MyCustomProfile</name>
    <description>A Custom Profile Example</description>
    <plugins>
        <fragmenter>package.name.CustomProfileFragmenter</fragmenter>
        <accessor>package.name.CustomProfileAccessor</accessor>
        <customPlugin1>package.name.MyCustomPluginValue1</customPlugin1>
        <customPlugin2>package.name.MyCustomPluginValue2</customPlugin2>
    </plugins>
</profile>
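Because an edited file has to be propagated to every node, it can help to check profile definitions first. The following is a minimal sketch, not a PXF tool: it checks that each profile has a name, that names are unique (compared here as exact strings), and that at least one plug-in is listed. The `<profiles>` wrapper element in the sample and the "at least one plug-in" check are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Sample input; the <profiles> wrapper element is an assumption for this sketch.
SAMPLE = """
<profiles>
  <profile>
    <name>MyCustomProfile</name>
    <description>A Custom Profile Example</description>
    <plugins>
      <fragmenter>package.name.CustomProfileFragmenter</fragmenter>
      <accessor>package.name.CustomProfileAccessor</accessor>
    </plugins>
  </profile>
  <profile>
    <name>MyCustomProfile</name>
    <plugins/>
  </profile>
</profiles>
"""


def check_profiles(xml_text):
    """Return a list of problems found in the profile definitions."""
    root = ET.fromstring(xml_text.strip())
    problems = []
    names = []
    # iter() also matches the root itself, so a lone <profile> works too.
    for index, profile in enumerate(root.iter("profile"), start=1):
        name = (profile.findtext("name") or "").strip()
        if name:
            names.append(name)
        else:
            problems.append(f"profile #{index}: missing <name>")
        plugins = profile.find("plugins")
        if plugins is None or len(plugins) == 0:
            problems.append(f"profile #{index}: no plug-ins listed")
    # The description is optional, so it is deliberately not checked.
    for name, count in Counter(names).items():
        if count > 1:
            problems.append(f"duplicate profile name: {name} ({count} definitions)")
    return problems


problems = check_profiles(SAMPLE)
for problem in problems:
    print(problem)
```

The sample deliberately repeats a profile name and leaves one plug-in list empty so that both checks report something.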