Monitoring a HAWQ System

You can monitor a HAWQ system using a variety of tools included with the system or available as add-ons.

Observing the day-to-day performance of a HAWQ system helps administrators understand system behavior, plan workflows, and troubleshoot problems. This chapter discusses tools for monitoring database performance and activity.

Also, be sure to review Recommended Monitoring and Maintenance Tasks for monitoring activities you can script to quickly detect problems in the system.

Using hawq_toolkit

Use HAWQ’s administrative schema hawq_toolkit to query the system catalogs, log files, and operating environment for system status information. The hawq_toolkit schema contains several views that you can access using SQL commands. The schema is accessible to all database users, although some objects require superuser permissions. Use a command similar to the following to add the hawq_toolkit schema to your schema search path:

=> SET ROLE 'gpadmin' ;
=# SET search_path TO myschema, hawq_toolkit ;

Monitoring System State

As a HAWQ administrator, you must monitor the system for problem events such as a segment going down or running out of disk space on a segment host. The following topics describe how to monitor the health of a HAWQ system and examine certain state information for a HAWQ system.

Checking System State

A HAWQ system consists of multiple PostgreSQL instances (the master and segments) spanning multiple machines. To monitor a HAWQ system, you need information about the system as a whole as well as status information for the individual instances. The hawq state utility provides status information about a HAWQ system.

Viewing Master and Segment Status and Configuration

The default hawq state action is to check segment instances and show a brief status of the valid and failed segments. For example, to see a quick status of your HAWQ system:

$ hawq state -b

You can also display information about the HAWQ master data directory by invoking hawq state with the -d option:

$ hawq state -d <master_data_dir>
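
For scripted checks, the two invocations above can be wrapped in a small helper. The following is a minimal Python sketch, not part of HAWQ: the function names build_state_command and run_state are hypothetical, and the sketch assumes the hawq utility is on the PATH. It returns the exit status and output verbatim and does not try to interpret them, because the output format is not described here.

```python
import subprocess


def build_state_command(master_data_dir=None):
    """Build the argument list for one of the `hawq state` calls shown above.

    With no argument this is the brief status check (-b); with a directory
    it is the -d form. Only these two options, taken from the text, are used.
    """
    if master_data_dir is None:
        return ["hawq", "state", "-b"]
    return ["hawq", "state", "-d", str(master_data_dir)]


def run_state(master_data_dir=None, timeout=60):
    """Run `hawq state` and return (exit_status, combined_output).

    Assumption: `hawq` is on PATH. The output is returned as-is and is
    not parsed.
    """
    result = subprocess.run(
        build_state_command(master_data_dir),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
        timeout=timeout,
    )
    return result.returncode, result.stdout
```

A scheduled job could call run_state() and log the returned text. If the hawq utility is not installed, subprocess.run raises FileNotFoundError, which the caller should handle.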

Checking Disk Space Usage

Checking Sizing of Distributed Databases and Tables

The hawq_toolkit administrative schema contains several views that you can use to determine the disk space usage for a distributed HAWQ database, schema, table, or index.

Viewing Disk Space Usage for a Database

To see the total size of a database (in bytes), use the hawq_size_of_database view in the hawq_toolkit administrative schema. For example:

=> SELECT * FROM hawq_toolkit.hawq_size_of_database
     ORDER BY sodddatname;
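
Because the view reports sizes in bytes, a monitoring script may want to convert them to larger units for reports. A minimal Python sketch follows. The helper name format_bytes and the sample rows are hypothetical (they are not output from a real system), and the use of binary units (1 KiB = 1024 bytes) is an assumption.

```python
def format_bytes(num_bytes):
    """Render a byte count using binary units (1 KiB = 1024 bytes)."""
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB"]
    value = float(num_bytes)
    for unit in units:
        # Stop at the first unit in which the value is below 1024,
        # or at the largest unit available.
        if value < 1024 or unit == units[-1]:
            if unit == "B":
                return "{} B".format(int(value))
            return "{:.1f} {}".format(value, unit)
        value /= 1024


# Hypothetical (database name, size in bytes) pairs, for illustration only.
sample_rows = [("sales", 5368709120), ("staging", 73400320)]
report = [(name, format_bytes(size)) for name, size in sample_rows]
```
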

Viewing Disk Space Usage for a Table

The hawq_toolkit administrative schema contains several views for checking the size of a table. The table sizing views list the table by object ID (not by name). To check the size of a table by name, you must look up the relation name (relname) in the pg_class table. For example:

=> SELECT relname AS name, sotdsize AS size, sotdtoastsize
     AS toast, sotdadditionalsize AS other
     FROM hawq_toolkit.hawq_size_of_table_disk AS sotd, pg_class
   WHERE sotd.sotdoid=pg_class.oid ORDER BY relname;

Viewing Disk Space Usage for Indexes

The hawq_toolkit administrative schema contains a number of views for checking index sizes. To see the total size of all indexes on a table, use the hawq_size_of_all_table_indexes view. To see the size of a particular index, use the hawq_size_of_index view. The index sizing views list tables and indexes by object ID (not by name). To check the size of an index by name, you must look up the relation name (relname) in the pg_class table. For example:

=> SELECT soisize, relname AS indexname
     FROM pg_class, hawq_toolkit.hawq_size_of_index AS soi
   WHERE pg_class.oid=soi.soioid
     AND pg_class.relkind='i';

Viewing Metadata Information about Database Objects

HAWQ uses its system catalogs to track metadata about the objects stored in a database (tables, views, indexes, and so on), as well as global objects such as roles and tablespaces.

Viewing the Last Operation Performed

You can use the system views pg_stat_operations and pg_stat_partition_operations to look up actions performed on a database object. For example, to view when the cust table was created and when it was last analyzed:

=> SELECT schemaname AS schema, objname AS table,
     usename AS role, actionname AS action,
     subtype AS type, statime AS time
   FROM pg_stat_operations
   WHERE objname='cust';
 schema | table | role | action  | type  |             time
--------+-------+------+---------+-------+-------------------------------
 sales  | cust  | main | CREATE  | TABLE | 2010-02-09 18:10:07.867977-08
 sales  | cust  | main | VACUUM  |       | 2010-02-10 13:32:39.068219-08
 sales  | cust  | main | ANALYZE |       | 2010-02-25 16:07:01.157168-08
(3 rows)

Viewing the Definition of an Object

You can use the psql \d meta-command to display the definition of an object, such as a table or view. For example, to see the definition of a table named sales:

=> \d sales
Append-Only Table "public.sales"
 Column |  Type   | Modifiers
--------+---------+-----------
 id     | integer | 
 year   | integer | 
 qtr    | integer | 
 day    | integer | 
 region | text    | 
Compression Type: None
Compression Level: 0
Block Size: 32768
Checksum: f
Distributed by: (id)

Viewing Query Workfile Usage Information

The HAWQ administrative schema hawq_toolkit contains views that display information about HAWQ workfiles. HAWQ creates workfiles on disk when it does not have sufficient memory to execute a query in memory. You can use this information to troubleshoot and tune queries, and to choose values for the HAWQ configuration parameters hawq_workfile_limit_per_query and hawq_workfile_limit_per_segment.

Views in the hawq_toolkit schema include:

  • hawq_workfile_entries - one row for each operator currently using disk space for workfiles on a segment
  • hawq_workfile_usage_per_query - one row for each running query currently using disk space for workfiles on a segment
  • hawq_workfile_usage_per_segment - one row for each segment where each row displays the total amount of disk space currently in use for workfiles on the segment

HAWQ Error Codes

The following section describes SQL error codes for certain database events.

SQL Standard Error Codes

The following table lists all the defined error codes. Some are not used, but are defined by the SQL standard. The error classes are also shown. For each error class there is a standard error code having the last three characters 000. This code is used only for error conditions that fall within the class but do not have any more-specific code assigned.
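
The class structure described above can be checked mechanically: the first two characters of a five-character code identify its class, and the class-level code ends in 000. The following is a minimal Python sketch with hypothetical helper names; it is not part of HAWQ.

```python
def sqlstate_class(code):
    """Return the two-character class of a five-character error code."""
    if len(code) != 5:
        raise ValueError("error codes are five characters long: %r" % (code,))
    return code[:2].upper()


def class_level_code(code):
    """Return the standard code for the class: the class followed by 000."""
    return sqlstate_class(code) + "000"


def is_class_level_code(code):
    """True if the code is the generic one for its class."""
    return code.upper() == class_level_code(code)
```

For example, a client that does not handle 53100 (DISK FULL) specifically could fall back to handling class_level_code("53100"), which is 53000 (INSUFFICIENT RESOURCES).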

The PL/pgSQL condition name for each error code is the same as the phrase shown in the table, with underscores substituted for spaces. For example, code 22012, DIVISION BY ZERO, has condition name DIVISION_BY_ZERO. Condition names can be written in either upper or lower case.

Note: PL/pgSQL does not recognize warning, as opposed to error, condition names; those are classes 00, 01, and 02.
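
The naming rule can be expressed in a few lines. The Python sketch below is illustrative only, and its function names are hypothetical. It derives a condition name from a meaning phrase and flags codes in the classes that, per the note above, PL/pgSQL does not recognize as condition names.

```python
# Classes 00, 01, and 02 are not recognized as PL/pgSQL condition names.
UNRECOGNIZED_CLASSES = {"00", "01", "02"}


def condition_name(meaning):
    """Turn a meaning phrase such as 'DIVISION BY ZERO' into a condition name."""
    # Collapse runs of whitespace, then substitute underscores for spaces.
    return "_".join(meaning.split()).lower()


def has_plpgsql_condition_name(code):
    """False for codes in classes 00, 01, and 02."""
    return code[:2] not in UNRECOGNIZED_CLASSES
```

The result is lowercased to match the Constant column of the table; because condition names are case-insensitive, the uppercase form is equivalent.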

Error Code Meaning Constant
Class 00 — Successful Completion
00000 SUCCESSFUL COMPLETION successful_completion
Class 01 — Warning
01000 WARNING warning
0100C DYNAMIC RESULT SETS RETURNED dynamic_result_sets_returned
01008 IMPLICIT ZERO BIT PADDING implicit_zero_bit_padding
01003 NULL VALUE ELIMINATED IN SET FUNCTION null_value_eliminated_in_set_function
01007 PRIVILEGE NOT GRANTED privilege_not_granted
01006 PRIVILEGE NOT REVOKED privilege_not_revoked
01004 STRING DATA RIGHT TRUNCATION string_data_right_truncation
01P01 DEPRECATED FEATURE deprecated_feature
Class 02 — No Data (this is also a warning class per the SQL standard)
02000 NO DATA no_data
02001 NO ADDITIONAL DYNAMIC RESULT SETS RETURNED no_additional_dynamic_result_sets_returned
Class 03 — SQL Statement Not Yet Complete
03000 SQL STATEMENT NOT YET COMPLETE sql_statement_not_yet_complete
Class 08 — Connection Exception
08000 CONNECTION EXCEPTION connection_exception
08003 CONNECTION DOES NOT EXIST connection_does_not_exist
08006 CONNECTION FAILURE connection_failure
08001 SQLCLIENT UNABLE TO ESTABLISH SQLCONNECTION sqlclient_unable_to_establish_sqlconnection
08004 SQLSERVER REJECTED ESTABLISHMENT OF SQLCONNECTION sqlserver_rejected_establishment_of_sqlconnection
08007 TRANSACTION RESOLUTION UNKNOWN transaction_resolution_unknown
08P01 PROTOCOL VIOLATION protocol_violation
Class 09 — Triggered Action Exception
09000 TRIGGERED ACTION EXCEPTION triggered_action_exception
Class 0A — Feature Not Supported
0A000 FEATURE NOT SUPPORTED feature_not_supported
Class 0B — Invalid Transaction Initiation
0B000 INVALID TRANSACTION INITIATION invalid_transaction_initiation
Class 0F — Locator Exception
0F000 LOCATOR EXCEPTION locator_exception
0F001 INVALID LOCATOR SPECIFICATION invalid_locator_specification
Class 0L — Invalid Grantor
0L000 INVALID GRANTOR invalid_grantor
0LP01 INVALID GRANT OPERATION invalid_grant_operation
Class 0P — Invalid Role Specification
0P000 INVALID ROLE SPECIFICATION invalid_role_specification
Class 21 — Cardinality Violation
21000 CARDINALITY VIOLATION cardinality_violation
Class 22 — Data Exception
22000 DATA EXCEPTION data_exception
2202E ARRAY SUBSCRIPT ERROR array_subscript_error
22021 CHARACTER NOT IN REPERTOIRE character_not_in_repertoire
22008 DATETIME FIELD OVERFLOW datetime_field_overflow
22012 DIVISION BY ZERO division_by_zero
22005 ERROR IN ASSIGNMENT error_in_assignment
2200B ESCAPE CHARACTER CONFLICT escape_character_conflict
22022 INDICATOR OVERFLOW indicator_overflow
22015 INTERVAL FIELD OVERFLOW interval_field_overflow
2201E INVALID ARGUMENT FOR LOGARITHM invalid_argument_for_logarithm
2201F INVALID ARGUMENT FOR POWER FUNCTION invalid_argument_for_power_function
2201G INVALID ARGUMENT FOR WIDTH BUCKET FUNCTION invalid_argument_for_width_bucket_function
22018 INVALID CHARACTER VALUE FOR CAST invalid_character_value_for_cast
22007 INVALID DATETIME FORMAT invalid_datetime_format
22019 INVALID ESCAPE CHARACTER invalid_escape_character
2200D INVALID ESCAPE OCTET invalid_escape_octet
22025 INVALID ESCAPE SEQUENCE invalid_escape_sequence
22P06 NONSTANDARD USE OF ESCAPE CHARACTER nonstandard_use_of_escape_character
22010 INVALID INDICATOR PARAMETER VALUE invalid_indicator_parameter_value
22020 INVALID LIMIT VALUE invalid_limit_value
22023 INVALID PARAMETER VALUE invalid_parameter_value
2201B INVALID REGULAR EXPRESSION invalid_regular_expression
22009 INVALID TIME ZONE DISPLACEMENT VALUE invalid_time_zone_displacement_value
2200C INVALID USE OF ESCAPE CHARACTER invalid_use_of_escape_character
2200G MOST SPECIFIC TYPE MISMATCH most_specific_type_mismatch
22004 NULL VALUE NOT ALLOWED null_value_not_allowed
22002 NULL VALUE NO INDICATOR PARAMETER null_value_no_indicator_parameter
22003 NUMERIC VALUE OUT OF RANGE numeric_value_out_of_range
22026 STRING DATA LENGTH MISMATCH string_data_length_mismatch
22001 STRING DATA RIGHT TRUNCATION string_data_right_truncation
22011 SUBSTRING ERROR substring_error
22027 TRIM ERROR trim_error
22024 UNTERMINATED C STRING unterminated_c_string
2200F ZERO LENGTH CHARACTER STRING zero_length_character_string
22P01 FLOATING POINT EXCEPTION floating_point_exception
22P02 INVALID TEXT REPRESENTATION invalid_text_representation
22P03 INVALID BINARY REPRESENTATION invalid_binary_representation
22P04 BAD COPY FILE FORMAT bad_copy_file_format
22P05 UNTRANSLATABLE CHARACTER untranslatable_character
Class 23 — Integrity Constraint Violation
23000 INTEGRITY CONSTRAINT VIOLATION integrity_constraint_violation
23001 RESTRICT VIOLATION restrict_violation
23502 NOT NULL VIOLATION not_null_violation
23503 FOREIGN KEY VIOLATION foreign_key_violation
23505 UNIQUE VIOLATION unique_violation
23514 CHECK VIOLATION check_violation
Class 24 — Invalid Cursor State
24000 INVALID CURSOR STATE invalid_cursor_state
Class 25 — Invalid Transaction State
25000 INVALID TRANSACTION STATE invalid_transaction_state
25001 ACTIVE SQL TRANSACTION active_sql_transaction
25002 BRANCH TRANSACTION ALREADY ACTIVE branch_transaction_already_active
25008 HELD CURSOR REQUIRES SAME ISOLATION LEVEL held_cursor_requires_same_isolation_level
25003 INAPPROPRIATE ACCESS MODE FOR BRANCH TRANSACTION inappropriate_access_mode_for_branch_transaction
25004 INAPPROPRIATE ISOLATION LEVEL FOR BRANCH TRANSACTION inappropriate_isolation_level_for_branch_transaction
25005 NO ACTIVE SQL TRANSACTION FOR BRANCH TRANSACTION no_active_sql_transaction_for_branch_transaction
25006 READ ONLY SQL TRANSACTION read_only_sql_transaction
25007 SCHEMA AND DATA STATEMENT MIXING NOT SUPPORTED schema_and_data_statement_mixing_not_supported
25P01 NO ACTIVE SQL TRANSACTION no_active_sql_transaction
25P02 IN FAILED SQL TRANSACTION in_failed_sql_transaction
Class 26 — Invalid SQL Statement Name
26000 INVALID SQL STATEMENT NAME invalid_sql_statement_name
Class 27 — Triggered Data Change Violation
27000 TRIGGERED DATA CHANGE VIOLATION triggered_data_change_violation
Class 28 — Invalid Authorization Specification
28000 INVALID AUTHORIZATION SPECIFICATION invalid_authorization_specification
Class 2B — Dependent Privilege Descriptors Still Exist
2B000 DEPENDENT PRIVILEGE DESCRIPTORS STILL EXIST dependent_privilege_descriptors_still_exist
2BP01 DEPENDENT OBJECTS STILL EXIST dependent_objects_still_exist
Class 2D — Invalid Transaction Termination
2D000 INVALID TRANSACTION TERMINATION invalid_transaction_termination
Class 2F — SQL Routine Exception
2F000 SQL ROUTINE EXCEPTION sql_routine_exception
2F005 FUNCTION EXECUTED NO RETURN STATEMENT function_executed_no_return_statement
2F002 MODIFYING SQL DATA NOT PERMITTED modifying_sql_data_not_permitted
2F003 PROHIBITED SQL STATEMENT ATTEMPTED prohibited_sql_statement_attempted
2F004 READING SQL DATA NOT PERMITTED reading_sql_data_not_permitted
Class 34 — Invalid Cursor Name
34000 INVALID CURSOR NAME invalid_cursor_name
Class 38 — External Routine Exception
38000 EXTERNAL ROUTINE EXCEPTION external_routine_exception
38001 CONTAINING SQL NOT PERMITTED containing_sql_not_permitted
38002 MODIFYING SQL DATA NOT PERMITTED modifying_sql_data_not_permitted
38003 PROHIBITED SQL STATEMENT ATTEMPTED prohibited_sql_statement_attempted
38004 READING SQL DATA NOT PERMITTED reading_sql_data_not_permitted
Class 39 — External Routine Invocation Exception
39000 EXTERNAL ROUTINE INVOCATION EXCEPTION external_routine_invocation_exception
39001 INVALID SQLSTATE RETURNED invalid_sqlstate_returned
39004 NULL VALUE NOT ALLOWED null_value_not_allowed
39P01 TRIGGER PROTOCOL VIOLATED trigger_protocol_violated
39P02 SRF PROTOCOL VIOLATED srf_protocol_violated
Class 3B — Savepoint Exception
3B000 SAVEPOINT EXCEPTION savepoint_exception
3B001 INVALID SAVEPOINT SPECIFICATION invalid_savepoint_specification
Class 3D — Invalid Catalog Name
3D000 INVALID CATALOG NAME invalid_catalog_name
Class 3F — Invalid Schema Name
3F000 INVALID SCHEMA NAME invalid_schema_name
Class 40 — Transaction Rollback
40000 TRANSACTION ROLLBACK transaction_rollback
40002 TRANSACTION INTEGRITY CONSTRAINT VIOLATION transaction_integrity_constraint_violation
40001 SERIALIZATION FAILURE serialization_failure
40003 STATEMENT COMPLETION UNKNOWN statement_completion_unknown
40P01 DEADLOCK DETECTED deadlock_detected
Class 42 — Syntax Error or Access Rule Violation
42000 SYNTAX ERROR OR ACCESS RULE VIOLATION syntax_error_or_access_rule_violation
42601 SYNTAX ERROR syntax_error
42501 INSUFFICIENT PRIVILEGE insufficient_privilege
42846 CANNOT COERCE cannot_coerce
42803 GROUPING ERROR grouping_error
42830 INVALID FOREIGN KEY invalid_foreign_key
42602 INVALID NAME invalid_name
42622 NAME TOO LONG name_too_long
42939 RESERVED NAME reserved_name
42804 DATATYPE MISMATCH datatype_mismatch
42P18 INDETERMINATE DATATYPE indeterminate_datatype
42809 WRONG OBJECT TYPE wrong_object_type
42703 UNDEFINED COLUMN undefined_column
42883 UNDEFINED FUNCTION undefined_function
42P01 UNDEFINED TABLE undefined_table
42P02 UNDEFINED PARAMETER undefined_parameter
42704 UNDEFINED OBJECT undefined_object
42701 DUPLICATE COLUMN duplicate_column
42P03 DUPLICATE CURSOR duplicate_cursor
42P04 DUPLICATE DATABASE duplicate_database
42723 DUPLICATE FUNCTION duplicate_function
42P05 DUPLICATE PREPARED STATEMENT duplicate_prepared_statement
42P06 DUPLICATE SCHEMA duplicate_schema
42P07 DUPLICATE TABLE duplicate_table
42712 DUPLICATE ALIAS duplicate_alias
42710 DUPLICATE OBJECT duplicate_object
42702 AMBIGUOUS COLUMN ambiguous_column
42725 AMBIGUOUS FUNCTION ambiguous_function
42P08 AMBIGUOUS PARAMETER ambiguous_parameter
42P09 AMBIGUOUS ALIAS ambiguous_alias
42P10 INVALID COLUMN REFERENCE invalid_column_reference
42611 INVALID COLUMN DEFINITION invalid_column_definition
42P11 INVALID CURSOR DEFINITION invalid_cursor_definition
42P12 INVALID DATABASE DEFINITION invalid_database_definition
42P13 INVALID FUNCTION DEFINITION invalid_function_definition
42P14 INVALID PREPARED STATEMENT DEFINITION invalid_prepared_statement_definition
42P15 INVALID SCHEMA DEFINITION invalid_schema_definition
42P16 INVALID TABLE DEFINITION invalid_table_definition
42P17 INVALID OBJECT DEFINITION invalid_object_definition
Class 44 — WITH CHECK OPTION Violation
44000 WITH CHECK OPTION VIOLATION with_check_option_violation
Class 53 — Insufficient Resources
53000 INSUFFICIENT RESOURCES insufficient_resources
53100 DISK FULL disk_full
53200 OUT OF MEMORY out_of_memory
53300 TOO MANY CONNECTIONS too_many_connections
Class 54 — Program Limit Exceeded
54000 PROGRAM LIMIT EXCEEDED program_limit_exceeded
54001 STATEMENT TOO COMPLEX statement_too_complex
54011 TOO MANY COLUMNS too_many_columns
54023 TOO MANY ARGUMENTS too_many_arguments
Class 55 — Object Not In Prerequisite State
55000 OBJECT NOT IN PREREQUISITE STATE object_not_in_prerequisite_state
55006 OBJECT IN USE object_in_use
55P02 CANT CHANGE RUNTIME PARAM cant_change_runtime_param
55P03 LOCK NOT AVAILABLE lock_not_available
Class 57 — Operator Intervention
57000 OPERATOR INTERVENTION operator_intervention
57014 QUERY CANCELED query_canceled
57P01 ADMIN SHUTDOWN admin_shutdown
57P02 CRASH SHUTDOWN crash_shutdown
57P03 CANNOT CONNECT NOW cannot_connect_now
Class 58 — System Error (errors external to HAWQ)
58030 IO ERROR io_error
58P01 UNDEFINED FILE undefined_file
58P02 DUPLICATE FILE duplicate_file
Class F0 — Configuration File Error
F0000 CONFIG FILE ERROR config_file_error
F0001 LOCK FILE EXISTS lock_file_exists
Class P0 — PL/pgSQL Error
P0000 PLPGSQL ERROR plpgsql_error
P0001 RAISE EXCEPTION raise_exception
P0002 NO DATA FOUND no_data_found
P0003 TOO MANY ROWS too_many_rows
Class XX — Internal Error
XX000 INTERNAL ERROR internal_error
XX001 DATA CORRUPTED data_corrupted
XX002 INDEX CORRUPTED index_corrupted