28 | 03 | 2024
Latest Articles
Popular Articles

SQL Tuning

An introduction to SQL Tuning : Expression Statistics

User Rating:  / 1
PoorBest 

An introduction to SQL Tuning : Expression Statistics


Question : Are Expression statistics really worth the trouble ?

Answer   : Yes they provide the optimizer with accurate information about selectivity / cardinality

When a function, an expression is applied to a column in the where clause, the optimizer doesn' t know the right selectivity / cardinality.
When we gather stats for the expression the optimizer knows the right selectivity / cardinality
Here is an example

 

STEP 1: Without Expression Statistics

 

SQL> set autotrace traceonly
SQL> alter session set events='10053 trace name context forever,level 1';

Session altered.

SQL>
SQL> select * from T_ORDER_OR where CR_FK = 'EUR' and extract(year from OR_ORDERDATE) = '2012';

250564 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value:

3547894390

--------------------------------------------------------------------------------
| Id |        Operation |       Name | Rows | Bytes | Cost (%CPU)|     Time |
--------------------------------------------------------------------------------
| 0  | SELECT STATEMENT |            | 2506 |   589K|    6914 (1)| 00:01:23 |
|* 1 | TABLE ACCESS FULL| T_ORDER_OR | 2506 |   589K|    6914 (1)| 00:01:23 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("CR_FK"='EUR' AND EXTRACT(YEAR FROM
           INTERNAL_FUNCTION("OR_ORDERDATE"))=2012)

Statistics
----------------------------------------------------------
       24 recursive calls
        5 db block gets
   269789 consistent gets
    25145 physical reads
     1636 redo size
166582238 bytes sent via SQL*Net to client
 71160699 bytes received via SQL*Net from client
   501130 SQL*Net roundtrips to/from client
        0 sorts (memory)
        0 sorts (disk)
   250564 rows processed

#################################################################################################

Access path analysis for T_ORDER_OR
***************************************
SINGLE TABLE ACCESS PATH
Single Table Cardinality Estimation for T_ORDER_OR[T_ORDER_OR]
Column (#9):
NewDensity:0.124779, OldDensity:0.000000 BktCnt:1000000, PopBktCnt:1000000, PopValCnt:4, NDV:4
Column (#9): CR_FK(
AvgLen: 4 NDV: 4 Nulls: 0 Density: 0.124779
Histogram: Freq #Bkts: 4 UncompBkts: 1000000 EndPtVals: 4
Table: T_ORDER_OR Alias: T_ORDER_OR
Card: Original: 1000000.000000 Rounded: 2506 Computed: 2505.64 Non Adjusted: 2505.64
Access Path: TableScan
Cost: 6913.94 Resp: 6913.94 Degree: 0
Cost_io: 6880.00 Cost_cpu: 678673412
Resp_io: 6880.00 Resp_cpu: 678673412

Access Path: index (RangeScan)
Index: OR_IDX4
resc_io: 251490.00 resc_cpu: 2056568786
ix_sel: 0.250564 ix_sel_with_filters: 0.250564
Cost: 251592.85 Resp: 251592.85 Degree: 1
Best:: AccessPath: TableScan
Cost: 6913.94 Degree: 1 Resp: 6913.94 Card: 2505.64 Bytes: 0

#################################################################################################


There are 2 filters

CR_FK = 'EUR' since there are only 4 currencies for which there are orders the NDV for conumn 9 is 4

extract ( year or_orderdate = '2012' ) : The optimizer does not have an idea about the right selectivity / cardinality, the optimizer assumes here a selectivity iof 0.01 ( 1% )

But execution lists that this assumed selectivity is totally wrong, since all the orders were ordered in 2012

 

STEP 2 : With Expression Statistics ( 11G ) 

 

SQL> select dbms_stats.create_extended_stats(null,'T_ORDER_OR','(extract(year from OR_ORDERDATE))') from dual;

DBMS_STATS.CREATE_EXTENDED_STATS(NULL,'T_ORDER_OR','(EXTRACT(YEARFROMOR_ORDERDATE))')
--------------------------------------------------------------------------------------------------------------
SYS_STUBQ5SF4KAJNMA00YIWL3XX9L

SQL> begin
   2 dbms_stats.gather_table_stats(user,'T_ORDER_OR',estimate_percent => 100, method_opt => 'FOR ALL COLUMNS SIZE AUTO',no_invalidate => false);
   3 end;
   4 /

PL/SQL procedure successfully completed.

SQL> set autotrace traceonly
SQL> alter session set events='10053 trace name context forever,level 1';

Session altered.

SQL>

SQL> select * from T_ORDER_OR where CR_FK = 'EUR' and extract(year from OR_ORDERDATE) = '2012';

250564 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value:

3547894390

--------------------------------------------------------------------------------
| Id |        Operation |       Name | Rows | Bytes | Cost (%CPU)|     Time |
--------------------------------------------------------------------------------
|  0 | SELECT STATEMENT |            |  250K|    58M|    6914 (1)| 00:01:23 |
|* 1 | TABLE ACCESS FULL| T_ORDER_OR |  250K|    58M|    6914 (1)| 00:01:23 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("CR_FK"='EUR' AND EXTRACT(YEAR FROM
           INTERNAL_FUNCTION("OR_ORDERDATE"))=2012)

Statistics
----------------------------------------------------------
       31 recursive calls
        7 db block gets
   269791 consistent gets
    25146 physical reads
     1588 redo size
166582238 bytes sent via SQL*Net to client
 71160699 bytes received via SQL*Net from client
   501130 SQL*Net roundtrips to/from client
        0 sorts (memory)
        0 sorts (disk)
   250564 rows processed

#################################################################################################

Access path analysis for T_ORDER_OR
***************************************
SINGLE TABLE ACCESS PATH
Single Table Cardinality Estimation for T_ORDER_OR[T_ORDER_OR]
Column (#9):
NewDensity:0.124779, OldDensity:0.000000 BktCnt:1000000, PopBktCnt:1000000, PopValCnt:4, NDV:4
Column (#9): CR_FK(
AvgLen: 4 NDV: 4 Nulls: 0 Density: 0.124779
Histogram: Freq #Bkts: 4 UncompBkts: 1000000 EndPtVals: 4
Column (#10): SYS_STUBQ5SF4KAJNMA00YIWL3XX9L(
AvgLen: 4 NDV: 1 Nulls: 0 Density: 1.000000 Min: 2012 Max: 2012
Column (#6): OR_TOTALAMOUNT(
AvgLen: 6 NDV: 952029 Nulls: 0 Density: 0.000001 Min: 50 Max: 100000
ColGroup (#1, Index) OR_IDX4
Col#: 6 9 CorStregth: 3.86
ColGroup Usage:: PredCnt: 2 Matches Full: Partial:
***** Virtual column Adjustment ******
Column name SYS_STUBQ5SF4KAJNMA00YIWL3XX9L
cost_cpu 550.00
cost_io 179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368.00
***** End virtual column Adjustment ******
Table: T_ORDER_OR Alias: T_ORDER_OR
Card: Original: 1000000.000000 Rounded: 250564 Computed: 250564.00 Non Adjusted: 250564.00
Access Path: TableScan
Cost: 6913.94 Resp: 6913.94 Degree: 0
Cost_io: 6880.00 Cost_cpu: 678673412
Resp_io: 6880.00 Resp_cpu: 678673412

Access Path: index (RangeScan)
Index: OR_IDX4
resc_io: 251490.00 resc_cpu: 2056568786
ix_sel: 0.250564 ix_sel_with_filters: 0.250564
Cost: 251592.85 Resp: 251592.85 Degree: 1
Best:: AccessPath: TableScan
Cost: 6913.94 Degree: 1 Resp: 6913.94 Card: 250564.00 Bytes: 0

***************************************


#################################################################################################

 

STEP 3 : WITH EXPRESSION STATISTICS : OTHER JOIN CARDINALITY, OTHER JOIN METHOD

Because the optimizer assumes a selective cardinality for the t_order_or row source, t_order_or , t_order_or is the driving table and joined using a nested loop join with t_delivery_dl using index DL_IDX1

 

SQL> begin
   2 dbms_stats.drop_extended_stats(null,'T_ORDER_OR','(extract(year from OR_ORDERDATE))');
   3 end;
   4 /

PL/SQL procedure successfully completed.

SQL> select * from T_ORDER_OR,
2                  T_DELIVERY_DL
3    where OR_FK = OR_PK
4    and CR_FK = 'EUR' and extract(year from OR_ORDERDATE) = '2012'
5    and su_fk = 1;

2635 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 319855565

----------------------------------------------------------------------------------------------
| Id |                  Operation |          Name | Rows | Bytes | Cost (%CPU)|     Time |
----------------------------------------------------------------------------------------------
|  0 |           SELECT STATEMENT |               |   26 | 12272 |    6985 (1)| 00:01:24 |
|  1 |               NESTED LOOPS |               |      |       |            |          |
|  2 |               NESTED LOOPS |               |   26 | 12272 |    6985 (1)| 00:01:24 |
|* 3 |          TABLE ACCESS FULL |    T_ORDER_OR |   26 |  6266 |    6907 (1)| 00:01:23 |
|* 4 |           INDEX RANGE SCAN |       DL_IDX1 |    1 |       |       2 (0)| 00:00:01 |
|  5 | TABLE ACCESS BY INDEX ROWID| T_DELIVERY_DL |    1 |   231 |       3 (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - filter("SU_FK"=1 AND "CR_FK"='EUR' AND EXTRACT(YEAR FROM
           INTERNAL_FUNCTION("OR_ORDERDATE"))=2012)
4 - access("OR_FK"="OR_PK")

Statistics
----------------------------------------------------------
     31 recursive calls
      7 db block gets
  33369 consistent gets
  29411 physical reads
   1732 redo size
3060326 bytes sent via SQL*Net to client
1468218 bytes received via SQL*Net from client
   7907 SQL*Net roundtrips to/from client
      0 sorts (memory)
      0 sorts (disk)
   2635 rows processed >> estimated cardinality of 26 rows is NOT close to the real number of rows returned

 

STEP 4 : WITH Expression StaTistics : other join cardinality, OTHER JOIN METHOD

 

Because the optimizer knows that access to the t_order_or row source is not selectif at all, t_order_or , t_order_or is the hash table table and joined using a hash join with t_delivery_dl 


SQL> select dbms_stats.create_extended_stats(null,'T_ORDER_OR','(extract(year from OR_ORDERDATE))') from dual;

SQL>

SQL> begin
  2 dbms_stats.gather_table_stats(user,'T_ORDER_OR',estimate_percent => 100, method_opt => 'FOR ALL COLUMNS SIZE AUTO',no_invalidate => false);
  3 end;
  4 /

PL/SQL procedure successfully completed.

SQL> select * from T_ORDER_OR,
2                  T_DELIVERY_DL
3    where OR_FK = OR_PK
4    and CR_FK = 'EUR' and extract(year from OR_ORDERDATE) = '2012'
5    and su_fk = 1;

2635 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 1384926585

------------------------------------------------------------------------------------
| Id |        Operation |          Name | Rows | Bytes | Cost (%CPU)|     Time |
------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT |               | 2584 |  1201K|   13538 (1)| 00:02:43 |
|* 1 |        HASH JOIN |               | 2584 |  1201K|   13538 (1)| 00:02:43 |
|* 2 | TABLE ACCESS FULL|    T_ORDER_OR | 2580 |   617K|    6907 (1)| 00:01:23 |
|  3 | TABLE ACCESS FULL| T_DELIVERY_DL | 1001K|   220M|    6625 (1)| 00:01:20 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - access("OR_FK"="OR_PK")
2 - filter("SU_FK"=1 AND "CR_FK"='EUR' AND EXTRACT(YEAR FROM
           INTERNAL_FUNCTION("OR_ORDERDATE"))=2012)

Statistics
----------------------------------------------------------
     38 recursive calls
      6 db block gets
  51991 consistent gets
  49010 physical reads
   1588 redo size
3060326 bytes sent via SQL*Net to client
1468218 bytes received via SQL*Net from client
   7907 SQL*Net roundtrips to/from client
      1 sorts (memory)
      0 sorts (disk)
   2635 rows processed >> estimated cardinality of 2584 is close to the real number of rows returned