COMMON REASON FOR NZ QUERY PERFORMANCE ISSUE

-          Not having correct distribution keys resulting in certain data slice storing more data resulting in data skew performance of a query depends on the performance of the slice storing most data for the tables involved in query.

-          Even if the data is distributed uniformly, not taking into account processing patterns in distribution results is process skew.  For e.g. Data is distributed by month but if the process looks for a month of data,  then the performance of the query will be degraded since the processing needs to be handled by a subset of SPUs and not all of the SPUs in the system.

-          Performance gets impacted if a large volume of data (fact table) gets re-distributed or broadcast during query execution.

-          Not having zone maps of columns since the data types are not the ones for which NZ can generate Zone Maps like numeric (x,y),char,varchar etc.

-          Not having the table data organized optimally for multi table joins as in the case of multi-dimensional joins performed in a data warehouse environment.

For job long running analysis:

 -          Identify if queries are being queued and the long running queries which is causing the queries to be queued through nzadmin tool/NPS or nzseesion command.

-          Long running queries can also be identified if query history is being gathered using appropriate history configuration (Tidal job history runtime).

-          For long running queries generate query plans using the “EXPLAIN” command.  Recent query execution plans are also stored in “.pln” files in the data. <ver> /plans directory under the /nz/data directory.

-          There might a chance like outdated for stats collection and grooming tables & create / change the distribution keys. 

-          Creation of materialized views and modifying or using organization keys.

-          If the analysis is performed in a non – development environment(QA),  it is key to make sure that the statistics reflect the values expected or in production.

Comments

Popular posts from this blog

Database Size : Calculation in Teradata

The difference between SET and MULTISET tables is?

Hadoop Commands