HAWQ dynamically allocates resources to queries. Query performance depends on several factors such as data locality, number of virtual segments used for the query and general cluster health.
Dynamic Partition Elimination
In HAWQ, values that become available only at query run time are used to dynamically prune partitions, which improves query processing speed. Enable or disable dynamic partition elimination by setting the server configuration parameter gp_dynamic_partition_pruning to ON or OFF; it is ON by default.
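For example (the table and column names below are hypothetical), a join whose qualifying partition-key values are known only at run time can still benefit from dynamic partition elimination:

```sql
-- "sales" is assumed to be partitioned on date_key. The matching date_key
-- values come from the dimension table at run time, so HAWQ can prune the
-- non-matching sales partitions dynamically during execution rather than
-- scanning every partition.
SELECT s.*
FROM sales s
JOIN date_dim d ON s.date_key = d.date_key
WHERE d.year = 2016;
```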
HAWQ allocates memory optimally for different operators in a query and frees and re-allocates memory during the stages of processing a query.
Runaway Query Termination
HAWQ can automatically terminate the most memory-intensive queries based on a memory usage threshold. The threshold is set as a configurable percentage (runaway_detector_activation_percent) of the resource quota for the segment, which is calculated by HAWQ’s resource manager.
If the amount of virtual memory utilized by a physical segment exceeds the calculated threshold, then HAWQ begins terminating queries based on memory usage, starting with the query that is consuming the largest amount of memory. Queries are terminated until the percentage of utilized virtual memory is below the specified percentage.
To calculate the memory usage threshold for runaway queries, HAWQ uses the following formula:

vmem threshold = (virtual memory quota calculated by resource manager + hawq_re_memory_overcommit_max) * runaway_detector_activation_percent
For example, if the HAWQ resource manager calculates a virtual memory quota of 9 GB, hawq_re_memory_overcommit_max is set to 1 GB, and the value of runaway_detector_activation_percent is 95 (95%), then HAWQ starts terminating queries when the utilized virtual memory exceeds 9.5 GB.
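A quick arithmetic check of the example above, using the example's hypothetical values (not defaults):

```sql
-- threshold = (resource manager quota + hawq_re_memory_overcommit_max)
--             * runaway_detector_activation_percent / 100
-- With quota = 9 GB, overcommit max = 1 GB, percent = 95:
SELECT (9 + 1) * 95 / 100.0 AS vmem_threshold_gb;  -- 9.5
```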
To disable automatic query detection and termination, set the value of runaway_detector_activation_percent to 100.
If a query is not executing as quickly as you would expect, investigate the possible causes of the slowdown as follows:
Check the health of the cluster.
- Are any DataNodes, segments or nodes down?
- Are there many failed disks?
Check table statistics. Have the tables involved in the query been analyzed?
Check the plan of the query and run EXPLAIN ANALYZE to determine the bottleneck. Sometimes there is not enough memory for some operators, such as Hash Join, and spill files are used. If an operator cannot perform all of its work in the memory allocated to it, it caches data on disk in spill files. A query that uses spill files runs much more slowly than one that does not.
Check data locality statistics using EXPLAIN ANALYZE. Alternatively, you can check the HAWQ log, which records the data locality results for every query. See Data Locality Statistics for information on the statistics.
Check resource queue status. You can query the pg_resqueue_status view to check whether the target queue has already dispatched some resources to the queries, or whether the target queue is lacking resources. See Checking Existing Resource Queues.
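For example, from psql:

```sql
-- Inspect current resource queue status; look for queues with many
-- waiting queries or little remaining resource.
SELECT * FROM pg_resqueue_status;
```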
Analyze a dump of the resource manager’s status to see more resource queue status. See Analyzing Resource Manager Status.
For visibility into query performance, use EXPLAIN ANALYZE to obtain data locality statistics. For example:
```sql
postgres=# create table test (i int);
postgres=# insert into test values(2);
postgres=# explain analyze select * from test;
```

```
QUERY PLAN
.......
Data locality statistics:
  data locality ratio: 1.000; virtual segment number: 1; different host number: 1;
  virtual segment number per host(avg/min/max): (1/1/1);
  segment size(avg/min/max): (32.000 B/32 B/32 B);
  segment size with penalty(avg/min/max): (32.000 B/32 B/32 B);
  continuity(avg/min/max): (1.000/1.000/1.000);
  DFS metadatacache: 7.816 ms;
  resource allocation: 0.615 ms;
  datalocality calculation: 0.136 ms.
```
The following table describes the metrics related to data locality. Use these metrics to examine issues behind a query’s performance.
| Statistic | Description |
|-----------|-------------|
| data locality ratio | Indicates the total local-read ratio of a query. The lower the ratio, the more remote reads occur. Because remote reads on HDFS require network IO, the execution time of the query may increase. For hash-distributed tables, all the blocks of a file are processed by one segment, so if data on HDFS is redistributed, such as by the HDFS Balancer, the data locality ratio decreases. In this case, you can redistribute the hash-distributed table manually by using CREATE TABLE AS SELECT. |
| number of virtual segments | Typically, the more virtual segments are used, the faster the query executes. If the virtual segment number is too small, you can check whether the relevant server configuration parameters, such as hawq_rm_nvseg_perquery_perseg_limit, or the table's bucket number are set too low. |
| different host number | Indicates how many hosts are used to run this query. According to HAWQ's resource allocation strategy, all hosts should be used when the virtual segment number is larger than the total number of hosts. As a result, if this metric is smaller than the total number of hosts for a big query, it often indicates that some hosts are down. In this case, check the node states first using "SELECT * FROM gp_segment_configuration". |
| segment size and segment size with penalty | "segment size" indicates the (avg/min/max) data size processed by a virtual segment. "segment size with penalty" is the segment size when remote read is counted as "net_disk_ratio" * block size. A virtual segment that performs remote reads should process less data than a virtual segment that performs only local reads. "net_disk_ratio" can be tuned to reflect how much slower remote reads are than local reads in different network environments, while considering the workload balance between the nodes. The default value of "net_disk_ratio" is 1.01. |
| continuity | Reading an HDFS file discontinuously introduces additional seeks, which slow the table scan of a query. A low continuity value indicates that the blocks of a file are not continuously distributed on a DataNode. |
| DFS metadatacache | Indicates the metadata-cache time cost for a query. In HAWQ, HDFS block information is cached in a metadata cache process. If a cache miss happens, the metadata-cache time cost may increase. |
| resource allocation | Indicates the time cost of acquiring resources from the resource manager. |
| datalocality calculation | Indicates the time to run the algorithm that assigns HDFS blocks to virtual segments and calculates the data locality ratio. |
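The two remedies mentioned in the table can be sketched as follows (the table name "sales" and its distribution column "id" are hypothetical):

```sql
-- 1. When "different host number" is unexpectedly small for a big query,
--    check the node states first.
SELECT * FROM gp_segment_configuration;

-- 2. When a hash-distributed table's locality ratio has degraded (for
--    example, after the HDFS Balancer moved its blocks), rebuild it with
--    CREATE TABLE AS SELECT so the blocks are written locally again.
CREATE TABLE sales_redistributed AS SELECT * FROM sales DISTRIBUTED BY (id);
```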
The number of virtual segments used affects query performance. HAWQ decides the number of virtual segments for a query (its degree of parallelism) by using the following rules:
- Cost of the query. Small queries use fewer segments and larger queries use more segments. Note that there are some techniques you can use when defining resource queues to influence the number of virtual segments and general resources that are allocated to queries. See Best Practices for Using Resource Queues.
- Available resources. The resources available at query time. If more resources are available in the resource queue, those resources will be used.
- Hash table and bucket number. If the query involves only hash-distributed tables, and the bucket number (bucketnum) configured for all of the hash tables is the same, then the query's parallelism is fixed (equal to the hash table bucket number). Otherwise, the number of virtual segments depends on the query's cost, and queries on hash-distributed tables behave like queries on randomly distributed tables.
- Query type. For queries with some user-defined functions, or for external tables where calculating resource costs is difficult, the number of virtual segments is controlled by the hawq_rm_nvseg_perquery_limit and hawq_rm_nvseg_perquery_perseg_limit parameters, as well as by the ON clause and the location list of external tables. If the query has a hash result table (e.g. INSERT INTO hash_table), then the number of virtual segments must equal the bucket number of the resulting hash table. If the query is performed in utility mode, such as for ANALYZE operations, the virtual segment number is calculated by different policies, which are explained later in this section.
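The bucket-number rule can be sketched as follows (the table name, columns, and bucketnum value are hypothetical):

```sql
-- A hash-distributed table created with a fixed bucket number.
CREATE TABLE orders (id int, amount float)
    WITH (bucketnum = 8) DISTRIBUTED BY (id);

-- A query that touches only this hash table runs with a fixed
-- parallelism of 8 virtual segments (the table's bucketnum).
SELECT count(*) FROM orders;

-- An INSERT whose result table is hash-distributed must likewise use
-- exactly bucketnum virtual segments.
INSERT INTO orders SELECT i, i * 1.5 FROM generate_series(1, 1000) i;
```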
The following are guidelines for numbers of virtual segments to use, provided there are sufficient resources available.
- Random tables exist in the select list: #vseg (number of virtual segments) depends on the size of the table.
- Hash tables exist in the select list: #vseg depends on the bucket number of the table.
- Random and hash tables both exist in the select list: #vseg depends on the bucket number of the table, if the table size of random tables is no more than 1.5 times larger than the size of hash tables. Otherwise, #vseg depends on the size of the random table.
- User-defined functions exist: #vseg depends on the hawq_rm_nvseg_perquery_perseg_limit parameter.
- PXF external tables exist: #vseg depends on the default_hash_table_bucket_number parameter.
- gpfdist external tables exist: #vseg is at least the number of locations in the location list.
- CREATE EXTERNAL TABLE is used with an ON clause: #vseg must reflect the value specified in the ON clause of the command.
- Hash tables are copied to or from files: #vseg depends on the bucket number of the hash table.
- Random tables are copied to files: #vseg depends on the size of the random table.
- Random tables are copied from files: #vseg is a fixed value (6, when there are sufficient resources).
- ANALYZE table: Analyzing a nonpartitioned table will use more virtual segments than a partitioned table.
- Hash distribution results: #vseg must be the same as the bucket number of the resulting hash table.