What is the cost ?

Why do I want to talk about the cost, since it has already been explained in the famous book by Jonathan Lewis ?

Because I want to give here some simple exemples for the DBAs who did not read Jonathan’s book (altough they should do it )

Also, I will try to compare the what the optimizer thinks and what the runtime engine will do.

So , how the cost is calculated ?

From Cost based optimizer by Jonathan Lewis:

According to the CPU costing model:
Cost = (
#SRds * sreadtim +
#MRds * mreadtim +
#CPUCycles / cpuspeed
) / sreadtim

 

where

#SRDs – number of single block reads
#MRDs – number of multi block reads
#CPUCycles – number of CPU Cycles
sreadtim – single block read time
mreadtim – multi block read time
cpuspeed – CPU cycles per second

Translated, this says the following:
The cost is the time spent on single-block reads, plus the time spent on multiblock reads, plus the CPU time required, all divided by the time it takes to do a single-block read. Which means the cost is the total predicted execution time for the statement, expressed in units of the single-block read time.

More precisely, the formulas are the following  (from the Select statement graph):

cost of the tablescan =ceiling(nblks/mbrc) *(mreadtim/sreadtim)+1

cost of the index scan = blevel + ceiling(leaf_blocks * effective index selectivity) + ceiling(clustering_factor * effective table selectivity)

 If we are using the workload system statistics, then sreadtim, mreadtim and mbrc are calculated and stored inside sys.aux_stats$ table.

If not, they are calculated as follows

sreadtim=ioseektim+ db_block_size/iotfrspeed
mreadtim=ioseektim+db_file_multiblock_read_count *bd_block_size/iotftspeed

mbrc=db_file_multiblock_read_count

 

Now let’s look at some most typical exemples, an index unique scan, the range scan, than  the  full scan, and let’s try to undestand how the cost is calculated.

But before do that , I’ll do another test: a range scan that I know will not return any rows, since there are no employees with the empno number less

than 1. Let’s how CBO will consider it, and also what the runtime engine needs to do.

 

select * from emp where empno<1;

 

We will  first use the autotrace, because it will tell us both and the optimizer estimation and the runtime work , and it  says the following:

 

Execution plan
———————————————————-
Plan hash value: 679380930

——————————————————————————–

——–

| Id  | Operation                   | Name     | Rows  | Bytes | Cost (%CPU)| Ti

me     |

——————————————————————————–

——–

|   0 | SELECT STATEMENT            |          |     1 |    46 |     4   (0)| 00

:00:01 |

|   1 |  TABLE ACCESS BY INDEX ROWID| EMP      |     1 |    46 |     4   (0)| 00

:00:01 |

|*  2 |   INDEX RANGE SCAN          | IX_EMPNO |     1 |       |     3   (0)| 00

:00:01 |

——————————————————————————–

——–
Predicate Information (identified by operation id):
—————————————————

2 – access(“EMPNO”<0)
Statistiques
———————————————————-
1  recursive calls
0  db block gets
3  consistent gets
0  physical reads
0  redo size
1021  bytes sent via SQL*Net to client
509  bytes received via SQL*Net from client
1  SQL*Net roundtrips to/from client
0  sorts (memory)
0  sorts (disk)
0  rows processed

 

So the optimizer has estimated the cost of 4 and the runtime engine says that the query 3 buffer gets.

Let’s try to find the easier part, the 3 buffer gets of the runtime engine.  Why we still need 3 buffer gets,even no data is being returned ?

SQL> select leaf_blocks,blevel,index_name from user_indexes;

LEAF_BLOCKS     BLEVEL INDEX_NAME
——————–   ———- ——————————
1950                                   2         IX_DEPTNO
0                                         0         IX_COUNTRY
2226                                   2        IX_EMPNO
0                                          0        IX_LANGUAGENO

 

If level=0, there is only one block in the index, the root block.
if level=1 there are root and branch blocks
if level=2 there is a root block, one or more branch blocks and one or more leaf blocks.

So if there is 2 level of the index ix_deptno, it must start with the root, go to the branch and go to the leaf to understand
that there is no rows corresponding to the search criteria, that makes 3 buffer gets

Now, let’s look to the optimizer, and find out why 3 buffer gets give cost of 4.

The 10053 trace says the following:
***************************************
BASE STATISTICAL INFORMATION
***********************
Table Stats::
Table: EMP  Alias: EMP
#Rows: 1000001  #Blks:  10100  AvgRowLen:  46.00  ChainCnt:  0.00
Index Stats::
Index: IX_COUNTRY  Col#: 10
LVLS: 0  #LB: 0  #DK: 0  LB/K: 0.00  DB/K: 0.00  CLUF: 0.00
Index: IX_DEPTNO  Col#: 8
LVLS: 2  #LB: 1950  #DK: 3  LB/K: 650.00  DB/K: 5907.00  CLUF: 17721.00
Index: IX_EMPNO  Col#: 1
LVLS: 2  #LB: 2226  #DK: 1000001  LB/K: 1.00  DB/K: 1.00  CLUF: 5907.00
Index: IX_LANGUAGENO  Col#: 11
LVLS: 0  #LB: 0  #DK: 0  LB/K: 0.00  DB/K: 0.00  CLUF: 0.00
Access path analysis for EMP
***************************************
SINGLE TABLE ACCESS PATH
Single Table Cardinality Estimation for EMP[EMP]
Column (#1): EMPNO(
AvgLen: 5 NDV: 1000001 Nulls: 0 Density: 0.000001 Min: 1 Max: 1000001
Using prorated density: 0.000001 of col #1 as selectvity of out-of-range/non-existent value pred
Table: EMP  Alias: EMP
Card: Original: 1000001.000000  Rounded: 1  Computed: 1.00  Non Adjusted: 1.00
Access Path: TableScan
Cost:  2750.36  Resp: 2750.36  Degree: 0
Cost_io: 2737.00  Cost_cpu: 271926944
Resp_io: 2737.00  Resp_cpu: 271926944
Using prorated density: 0.000001 of col #1 as selectvity of out-of-range/non-existent value pred
Access Path: index (RangeScan)
Index: IX_EMPNO
resc_io: 4.00  resc_cpu: 29036
ix_sel: 0.000001  ix_sel_with_filters: 0.000001
Cost: 4.00  Resp: 4.00  Degree: 1
Best:: AccessPath: IndexRange
Index: IX_EMPNO
Cost: 4.00  Degree: 1  Resp: 4.00  Card: 1.00  Bytes: 0

***************************************

So the cost is estimated as 4 with the density of 0,0000001
Go to the cardinality matrix and see that the cost of the index is=>

cost =
blevel +
ceiling(leaf_blocks * effective index selectivity) +
ceiling(clustering_factor * effective table selectivity)
cost=2+
(0,000001*2226)
2+ 0,002226
(0,000001*5907)
+0,005907
2,1

So where the cost 4 is coming from ?
As we can see,the DBA should always have a rough idea about the number of buffer gets needed for a query, but usually it’s not so easy to completely understand Oracle cost, altough it is quite simple  for this trivial exemple.
So in everyday optimization, the DBA will seldom try to fully explain the cost.
Jonathan’s book is just an excellent guide to give you an idea about what the cost approximately is.

 

Now, let’s look at the full scan exemple, and what’s the cost behaviour with it.

SQL> select * from table(dbms_xplan.display());

PLAN_TABLE_OUTPUT
——————————————————————————-

Plan hash value: 2083865914

——————————————————————-
| Id  | Operation          | Name | Rows  | Cost (%CPU)| Time     |
——————————————————————-
|   0 | SELECT STATEMENT   |      |     1 |  2744   (1)| 00:00:33 |
|   1 |  SORT AGGREGATE    |      |     1 |            |          |
|   2 |   TABLE ACCESS FULL| EMP  |  1000K|  2744   (1)| 00:00:33 |
——————————————————————-

Perhaps, it is easier to understand if we query directly the plan table to see the stored figures:

SQL> select cpu_cost/(2818*10000),io_cost from plan_table;

cost=io_cost + cpu_cost/cpuspeed*sreadtim*1000

 

CPU_COST/(2818*10000)    IO_COST
——————— ———-
7,87532626       2737

2737+7=2744

 

Where the 2737 came from ?

It’s related to to “adjusted mbfrc”.

If you play with db_file_multiblock_read_count or with mbrc ,you will be able to draw a non-linear curve.

 

Where the 7,87 is coming from ?

 

This is the calculated CPU cost.

When we change the clause where by widening or narrowing the range ,we see that only the  CPU component is changing, not the IO component. Why? Because , the model assumption is that , in every case, the runtime  engine must read the all the block under the HWM , so the number of IOs remain unchanged. However, the CPU work will not be  the same, because the model logically suppose that a where clause that matches more lines will require more work in terms of CPU, to compare and return rows instead of just discarding them. (The same thing will happen if we query the first column, then the second, then the third , the small difference will be shown on the CPU component)

SQL> select * from table(dbms_xplan.display());

PLAN_TABLE_OUTPUT
——————————————————————————–

Plan hash value: 3956160932

————————————————————————–
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
————————————————————————–
| 0 | SELECT STATEMENT | | 952K| 41M| 2751 (1)| 00:00:34 |
|* 1 | TABLE ACCESS FULL| EMP | 952K| 41M| 2751 (1)| 00:00:34 |
————————————————————————–

Predicate Information (identified by operation id):
—————————————————

PLAN_TABLE_OUTPUT
——————————————————————————–
1 – filter(“SAL”>1000)
13 rows selected.
SQL> select * from table(dbms_xplan.display());

PLAN_TABLE_OUTPUT
——————————————————————————–

Plan hash value: 3956160932

————————————————————————–
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
————————————————————————–
| 0 | SELECT STATEMENT | | 714K| 31M| 2750 (1)| 00:00:34 |
|* 1 | TABLE ACCESS FULL| EMP | 714K| 31M| 2750 (1)| 00:00:34 |
————————————————————————–

Predicate Information (identified by operation id):
—————————————————

PLAN_TABLE_OUTPUT
——————————————————————————–
1 – filter(“SAL”>2000)

13 rows selected.

SQL> select * from table(dbms_xplan.display());

PLAN_TABLE_OUTPUT
——————————————————————————–

Plan hash value: 3956160932

————————————————————————–
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
————————————————————————–
| 0 | SELECT STATEMENT | | 476K| 20M| 2749 (1)| 00:00:33 |
|* 1 | TABLE ACCESS FULL| EMP | 476K| 20M| 2749 (1)| 00:00:33 |
————————————————————————–

Predicate Information (identified by operation id):
—————————————————

PLAN_TABLE_OUTPUT
——————————————————————————–
1 – filter(“SAL”>3000)

 

It’s more clear if we query directly the plan table:
SQL> select cpu_cost/(2818*10000),io_cost from plan_table;

CPU_COST/(2818*10000) IO_COST
——————— ———-
16.5778937 2737
15.7329824 2737
13.1982556 2737

 

 

 

 

Leave a comment