Channel: Iordan Iotzov's DBA blog

The Cost of Adaptive Execution Plans in Oracle (a Small Study)


Adaptive execution plans are a new Oracle 12c feature, and I consider them one of the most important ones.

With this feature, Oracle 12c picks the join method based on the actual number of rows retrieved during the first execution of the query.

There are a few great papers about this topic:
Here is Oracle’s white paper and here is Kerry Osborne’s Hotsos 2013 presentation. There is even a video.

Every time a new feature that fundamentally changes the way Oracle works is introduced, we should ask ourselves what that feature costs in terms of CPU and other resources.
For me, the cost is an issue only when the feature does not change the join type that was initially chosen.
If the feature does cause a change of join type, its benefits would be huge, so the cost is not worth worrying about.

I’ll measure the elapsed times (looking at internals is hard… :) ) of a query with and without adaptive execution plans and then run a paired t-test to see if there is a statistically significant difference. I’ll use OPTIMIZER_ADAPTIVE_REPORTING_ONLY to turn the feature on and off, and I’ll do my best to get rid of any bias in this test.
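The statistical machinery here is just a standard paired t-test. A minimal Python sketch of it (for illustration only: the actual measurements use Oracle’s built-in STATS_T_TEST_PAIRED, the sample values below are made up, and the normal approximation of the p-value is adequate only for large run counts like the 10K+ used in this study):

```python
import math
import statistics

def paired_t_test(r, a):
    """Two-sided paired t-test on equal-length samples r and a.

    Computes the t statistic from the per-pair differences. For large
    sample sizes, t is approximately standard normal, so the two-sided
    p-value is taken from the normal tail.
    """
    d = [x - y for x, y in zip(r, a)]      # paired differences
    n = len(d)
    mean_d = statistics.mean(d)
    sd_d = statistics.stdev(d)             # sample standard deviation
    t = mean_d / (sd_d / math.sqrt(n))     # paired t statistic
    p = math.erfc(abs(t) / math.sqrt(2))   # two-sided normal approximation
    return t, p

# Made-up elapsed times (ms): no real difference between the two runs,
# so the p-value should land far above the 0.05 significance cutoff.
r = [450.1, 451.3, 449.8, 450.9, 451.0, 450.4]
a = [450.3, 451.1, 450.0, 450.8, 451.2, 450.2]
t, p = paired_t_test(r, a)
```

A p-value below 0.05 would indicate a statistically significant difference between the two sets of timings; anything higher means the observed difference is indistinguishable from noise.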

First, let’s see what happens when we have a query that uses a Hash Join. The table/view setup scripts are here. The test procedure is here. I get the end result using this simple query:
select avg(r), avg(a), STATS_T_TEST_PAIRED (R,A) FROM track_v

For this case, we get avg(r) = 450.6991608, avg(a) = 451.5071414, STATS_T_TEST_PAIRED (R,A) /p/ = 0.48716074654373898

This means that for 10K runs, we are nowhere near statistical significance (p < 0.05). Things were no different for 100K runs:

avg(r) = 336.78178535 , avg(a) = 336.67725615 , STATS_T_TEST_PAIRED (R,A) /p/ = 0.82934281196842896

Therefore, the adaptive execution plan feature does not add any meaningful cost when Hash Joins are used.

Let’s see the behavior for a query that uses nested loops. The table/view setup scripts are here. The test procedure is here. I ran 100K iterations, and this is what I got:
select avg(r), avg(a) , STATS_T_TEST_PAIRED (R,A) FROM track_v

For this case, we get avg(r) = 15.31589686, avg(a) = 15.11440871, STATS_T_TEST_PAIRED (R,A) /p/ = 0.015795071536522352

Very interesting result - using adaptive execution plans is slightly faster even when no runtime change of join type is happening. The result seems to have statistical significance…

I then ran 500K iterations and got an even stronger result:
avg(r) = 20.4530, avg(a) = 19.982423, STATS_T_TEST_PAIRED (R,A) /p/ = 0.00000320884674226468

The averages went up, probably because there was some load on the box, but the statistical significance of the difference is beyond question.

So, it appears that the cost of adaptive execution plans is actually negative when it comes to nested loops (NL).

Overall, adaptive execution plans could bring huge benefits without any cost. I cannot think of a scenario where we should turn them off…



The Cost of Adaptive Execution Plans in Oracle (a Small Study) Part 2


In my previous post, I tried to find out the performance impact of adaptive execution plans (AEP). During the test, I used the OPTIMIZER_ADAPTIVE_REPORTING_ONLY parameter to switch AEP on and off. It should be noted that the parameter does not control the generation of AEP, but rather the execution of the “adaptive” part. I came to a rather surprising result: turning off adaptive execution plans actually made the query run slower. This behavior seems to be a side effect of frequent shared pool flushing. The original test, i.e. the one that showed turning on adaptive execution plans speeds up a query, is reproducible, however! I ran it at least three times and got similar results.

To confirm the peculiar results, I ran a new test that avoids SQL reuse by generating unique SQL statements. Here is the new table/view setup script. Here is the new test procedure. After running it, we can see the expected outcome – adaptive execution plans do cost something.
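The trick of defeating SQL reuse can be sketched as follows (a toy illustration, not the author’s actual test procedure, which is linked above; appending a tautological predicate with a fresh literal makes every statement text unique, so each run is hard-parsed):

```python
import itertools

_seq = itertools.count(1)

def unique_sql(base_sql):
    """Return base_sql with an always-true predicate whose literal
    changes on every call, so no two generated statements share text
    and none can reuse an existing cursor."""
    n = next(_seq)
    return f"{base_sql} and {n} = {n}"
```

Two consecutive calls on the same base statement yield different SQL texts, which is exactly what prevents cursor sharing.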

select avg(r), avg(a) , STATS_T_TEST_PAIRED (R,A) FROM track_v

avg(r) = 2.05312594, avg(a) = 2.07863544, STATS_T_TEST_PAIRED (R,A) /p/ = 0

The difference is quite small – approximately 2% of the overall time, or a fraction of a millisecond.
The following query allows us to drill down and see where the difference comes from:
select c.name , avg (a.val) a , avg (r.val) r , STATS_T_TEST_PAIRED (r.val,A.val)
from stats_a a , stats_r r , v$statname c
where a.id = r.id
and a.stat = r.stat
and a.stat = c.statistic#
group by c.name

The only statistics that are statistically different are:
CCursor + sql area evicted – the SQL with AEP on consumed less than the SQL with AEP off
CPU used by this session – the SQL with AEP on consumed marginally more than the SQL with AEP off
parse time elapsed – the SQL with AEP on consumed more than the SQL with AEP off
sql area evicted – the SQL with AEP on consumed less than the SQL with AEP off


The Cost of Adaptive Execution Plans in Oracle (a Small Study) Part 3


My previous two posts, The Cost of Adaptive Execution Plans in Oracle (a Small Study) Part 1 and The Cost of Adaptive Execution Plans in Oracle (a Small Study) Part 2, focused on estimating the impact of the OPTIMIZER_ADAPTIVE_REPORTING_ONLY parameter in Oracle 12c. The parameter controls whether the adaptive plan is acted upon; the adaptive plan itself is always generated.

This post will try to estimate the impact of adaptive execution plans while taking into account the resources needed to generate them. Since I was not able to find a way to suppress the generation of adaptive plans in Oracle 12c, I decided to simply use an older version of the optimizer, by utilizing the “optimizer_features_enable” parameter.

Here is the table/view setup script, and here is the test script.

We can see the difference in elapsed time using this query:

select avg(r), avg(a) , STATS_T_TEST_PAIRED (R,A) FROM track_v

avg(r) = 1.5880118, avg(a) = 2.5002121, STATS_T_TEST_PAIRED (R,A) /p/ = 0

The adaptive (12c) plan takes almost 1 millisecond more than the non-adaptive, “old-style” 11gR2 plan.

This query can show where the difference comes from:

select c.name , avg (a.val) a , avg (r.val) r , STATS_T_TEST_PAIRED (r.val,A.val)
from stats_a a , stats_r r , v$statname c
where a.id = r.id
and a.stat = r.stat
and a.stat = c.statistic#
group by c.name

The only statistics that differ are:
CCursor + sql area evicted – the SQL with AEP (12c) consumed more than the SQL run with 11gR2
CPU used by this session – the SQL with AEP (12c) consumed more than the SQL run with 11gR2
parse time cpu – the SQL with AEP (12c) consumed more than the SQL run with 11gR2
parse time elapsed – the SQL with AEP (12c) consumed more than the SQL run with 11gR2
recursive cpu usage – the SQL with AEP (12c) consumed more than the SQL run with 11gR2
sql area evicted – the SQL with AEP (12c) consumed more than the SQL run with 11gR2

In conclusion, “adaptive execution plans” is a wonderful feature. The resources it needs are typically negligible, except for very light (fast) OLTP-type queries.


Dynamic sampling in Oracle 12c – it goes up to 11. Really!


For the first time in many years, Oracle introduced a new option for dynamic sampling, and a major one at that – http://docs.oracle.com/cd/E11882_01/server.112/e16638/stats.htm#PFGRF95292 .

The new dynamic sampling level 11 is the all-encompassing “automatic” switch, which leaves all decisions to Oracle.
The big question now is when would the new auto dynamic sampling kick in? Maria Colgan (http://www.oracle.com/technetwork/database/bi-datawarehousing/twp-optimizer-with-oracledb-12c-1963236.pdf) indicates that the auto option would fire more frequently than the default.
To begin filling in the blanks, I started with a really simple test – retrieval by primary key.

Here are the create scripts:

create table tab3 as
with v1 as 
     (select rownum n from dual connect by level <= 10000)
select
     rownum id , 
     dbms_random.string('u',dbms_random.value*29 + 1) str
from
     v1, v1
where
     rownum < 1000000;

alter table tab3 add primary key (id)

execute dbms_stats.gather_table_stats('JORDAN','TAB3')

Now let’s see what happens when we set optimizer_dynamic_sampling = 11 and issue the search by primary key:

SQL> alter session set optimizer_dynamic_sampling =11 ;

Session altered.

SQL> set autotrace on
SQL> select * from tab3 where id = 123 ;

…output truncated…

Note
-----
- dynamic statistics used: dynamic sampling (level=AUTO)
Statistics
----------------------------------------------------------
15 recursive calls
0 db block gets
14 consistent gets
0 physical reads
0 redo size
468 bytes sent via SQL*Net to client
532 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
1 rows processed

We see that Oracle decided to do dynamic sampling, even though there is no guesswork in this type of query: the cardinality of the statement is one, so dynamic sampling was not necessary.
But that’s not the worst part. The statement required 14 consistent gets and 15 recursive calls for a simple primary-key lookup. Without dynamic sampling, the same statement requires 4 consistent gets and 1 recursive call.

SQL> select * from tab3 where id = 345 ;

…output truncated…

Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
4 consistent gets
0 physical reads
0 redo size
468 bytes sent via SQL*Net to client
532 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed

The idea behind automatic dynamic sampling is excellent. Triggering dynamic sampling only when the Oracle CBO cannot get reliable cardinalities would be a giant step towards query self-tuning.

The implementation of the feature in Oracle version 12.1.0.1, however, needs some more work. Hopefully, the logic will get more precise in newer versions…


What is the Deal with AVG_LEAF_BLOCKS_PER_KEY (DBA_INDEXES) !?!


The AVG_LEAF_BLOCKS_PER_KEY column in DBA_INDEXES should show the average number of leaf blocks in which each distinct value in the index appears. Common sense and Christian Antognini’s book Troubleshooting Oracle Performance (pg. 136) suggest that the value of AVG_LEAF_BLOCKS_PER_KEY should be derived from the values of LEAF_BLOCKS and DISTINCT_KEYS.

Well, that does not appear to be the case in Oracle 11g!

This query can help you find the indexes that do not conform to the rule:

select 
       * 
from 
       dba_indexes
where 
       DISTINCT_KEYS > 0
and 
       AVG_LEAF_BLOCKS_PER_KEY  > 3*LEAF_BLOCKS/DISTINCT_KEYS
and 
       LEAF_BLOCKS > DISTINCT_KEYS

The discrepancy is certainly not a rounding error. I have an index where AVG_LEAF_BLOCKS_PER_KEY is 150 times larger than LEAF_BLOCKS/DISTINCT_KEYS.
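The same conformance rule can be checked outside the database. A small sketch (the tuples are hypothetical, not real dictionary contents; the factor-of-3 safety margin mirrors the one in the query above):

```python
def nonconforming(index_rows, factor=3.0):
    """Flag rows where AVG_LEAF_BLOCKS_PER_KEY is far above the value
    implied by LEAF_BLOCKS / DISTINCT_KEYS.

    Each row is (index_name, leaf_blocks, distinct_keys,
    avg_leaf_blocks_per_key), as in DBA_INDEXES."""
    flagged = []
    for name, leaf_blocks, distinct_keys, avg_lb_per_key in index_rows:
        if (distinct_keys > 0
                and leaf_blocks > distinct_keys
                and avg_lb_per_key > factor * leaf_blocks / distinct_keys):
            flagged.append(name)
    return flagged
```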

The problem seems to happen only when stats are gathered (DBMS_STATS) with AUTO sampling. Computing stats (estimate_percent => NULL) resolves the issue.

Ironically, DBMS_STATS AUTO sampling delivers good estimates for LEAF_BLOCKS and DISTINCT_KEYS. If only Oracle computed AVG_LEAF_BLOCKS_PER_KEY by using LEAF_BLOCKS and DISTINCT_KEYS, everything would have been great…


More on Dynamic Sampling Level 11 (AUTO) in Oracle 12c


In a previous post I showed an example of how the new AUTO dynamic sampling level (11) consumed significant resources for a very simple SQL statement.

Here, I’ll try to find out why.
First surprise!
The 10053 trace does not capture information about dynamic sampling (DS) level 11, though it works quite fine for the other levels (0-10) of dynamic sampling.
Here is what the dynamic sampling section of a 10053 trace file looks like for levels 0 to 10:

————————————————————————————

SINGLE TABLE ACCESS PATH
Single Table Cardinality Estimation for TAB3[TAB3]
SPD: Return code in qosdDSDirSetup: NOCTX, estType = TABLE

*** 2014-01-29 11:41:02.678
** Performing dynamic sampling initial checks. **
** Dynamic sampling initial checks returning TRUE (level = 3).

*** 2014-01-29 11:41:02.678
** Generated dynamic sampling query:
query text :
SELECT /* OPT_DYN_SAMP */ /*+ ALL_ROWS IGNORE_WHERE_CLAUSE NO_PARALLEL(SAMPLESUB) opt_param('parallel_execution_enabled', 'false') NO_PARALLEL_INDEX(SAMPLESUB) NO_SQL_TUNE */ NVL(SUM(C1),0), NVL(SUM(C2),0), COUNT(DISTINCT C3) FROM (SELECT /*+ IGNORE_WHERE_CLAUSE NO_PARALLEL("TAB3") FULL("TAB3") NO_PARALLEL_INDEX("TAB3") */ 1 AS C1, CASE WHEN DECODE("TAB3"."ID",0,1)=567 THEN 1 ELSE 0 END AS C2, DECODE("TAB3"."ID",0,1) AS C3 FROM "JORDAN"."TAB3" SAMPLE BLOCK (0.829542 , 1) SEED (1) "TAB3") SAMPLESUB

*** 2014-01-29 11:41:03.597
** Executed dynamic sampling query:
level : 3
sample pct. : 0.829542
actual sample size : 9783
filtered sample card. : 0
orig. card. : 999999
block cnt. table stat. : 3737
block cnt. for sampling: 3737
max. sample block cnt. : 32
sample block cnt. : 31
unique cnt. C3 : 0
min. sel. est. : 0.01000000
** Using single table dynamic sel. est. : 0.00007085
Table: TAB3 Alias: TAB3
Card: Original: 999999.000000 Rounded: 71 
                             Computed: 70.85 Non Adjusted: 70.85

————————————————————————————

The section includes the query issued by dynamic sampling, along with lots of valuable information.

The same section for a query with dynamic sampling level 11 looks strikingly different:
————————————————————————————

SINGLE TABLE ACCESS PATH
Single Table Cardinality Estimation for TAB3[TAB3]
SPD: Return code in qosdDSDirSetup: NOCTX, estType = TABLE
Table: TAB3 Alias: TAB3
Card: Original: 999999.000000 >> Single Tab Card adjusted from:9999.990000 to:1.000000
Rounded: 1 Computed: 1.00 Non Adjusted: 9999.99

————————————————————————————
Even though we can see that the cardinality was adjusted from 9999.99 to 1 as a result of dynamic sampling, there are no details about the DS queries that did the actual sampling. I was not able to find DS summary information either.

Since 10053 trace file did not give me the information I needed, I decided to look elsewhere – in the V$ tables.
After I flushed the shared pool and ran the query from the previous post, I issued the following SQL to get the DS SQL related to the statement:

select * from v$sql where sql_text like '%TAB3%'

Second surprise! The query returned a few records related to DS.
——————————————————————–

SELECT /* DS_SVC */ /*+ cursor_sharing_exact dynamic_sampling(0) 
...
FROM "TAB3" SAMPLE BLOCK(21.4075, 8) SEED(1) "TAB3" WHERE ...


SELECT /* DS_SVC */ /*+ cursor_sharing_exact dynamic_sampling(0) 
...
FROM "TAB3" SAMPLE BLOCK(42.8151, 8) SEED(2) "TAB3" WHERE ...

SELECT /* DS_SVC */ /*+ cursor_sharing_exact dynamic_sampling(0) 
...
FROM "TAB3" SAMPLE BLOCK(85.6302, 8) SEED(3) "TAB3" WHERE ....

———————————————————————

They are similar, with the exception of the SAMPLE BLOCK and SEED arguments. It appears that the optimizer tried to get sampling data using a small sample block argument, but it failed. Then it tried again with a larger sample size, but failed again. Finally, on the third attempt, with 85% sampling, it succeeded.
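The escalation visible in the three DS_SVC queries (roughly 21.4% -> 42.8% -> 85.6%) looks like a geometric retry. A toy model of that pattern (the doubling factor is inferred from the trace above, not documented Oracle behavior):

```python
def sample_percents(start_pct, max_pct=100.0, factor=2.0):
    """Return the sequence of sample percentages a doubling retry
    policy would attempt, stopping once the next attempt would sample
    at least the whole table."""
    attempts = []
    pct = start_pct
    while pct < max_pct:
        attempts.append(round(pct, 4))
        pct *= factor
    return attempts
```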

Generally speaking, the new policy of retrying sampling until a usable result is achieved is good. The algorithm needs to be tweaked a bit, though, to account for situations like the one I described in a previous post.


RMOUG 2014


I was very excited to present at RMOUG 2014 – my first time at that conference.

Unfortunately, I got sick and I had to cancel.

The name of the presentation was:
Working with Confidence: How Sure Is the Oracle CBO About Its Cardinality Estimates, and Why Does It Matter?

Here are the Powerpoint and the White paper.


When Oracle would Choose an Adaptive Execution Plan – General Thoughts


Adaptive Execution Plans is one of the most exciting new features in Oracle 12c.
This post is not about how the feature works or its benefits, but rather about when Oracle would choose to use it.

In general, the Oracle CBO will use Adaptive Execution Plans when it is not sure which standard join (NL or HJ) is better:

  • If, at SQL parse time, the Oracle CBO estimates that one of the sets to join is “significantly” smaller than the other, where “significantly” is defined internally by the CBO, and there are appropriate indexes, then Oracle would opt for Nested Loops. The CBO probably figured out that the cost of NL is so much better than the cost of HJ that it is not worth the effort of using an adaptive execution plan.
  • If one of the sets is only “slightly” smaller than the other, where “slightly” is defined internally by the CBO, then the performance of the two standard join types would be similar, so Oracle would typically decide to go with an Adaptive Plan and postpone the decision until run time. The CBO probably saw that the cost of NL is “close” to the cost of HJ, so it is worth the effort of using an adaptive execution plan.
  • Finally, when the two sets have “similar” sizes, where “similar” is defined internally by the CBO, Oracle would go with a Hash Join. The CBO probably figured out that the cost of HJ is so much better than the cost of NL that it is not worth the effort of using an adaptive execution plan.
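The three cases can be condensed into a toy decision function (the close_factor threshold is an arbitrary stand-in for the CBO’s internal, undocumented cutoff, and real costing is far richer than two scalar costs):

```python
def join_strategy(nl_cost, hj_cost, close_factor=1.5):
    """Pick the cheaper join outright when the costs are far apart;
    fall back to an adaptive plan when they are within close_factor
    of each other, so run-time row counts can decide."""
    lo, hi = sorted((nl_cost, hj_cost))
    if hi <= lo * close_factor:
        return "ADAPTIVE"
    return "NESTED LOOPS" if nl_cost < hj_cost else "HASH JOIN"
```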

The figure below illustrates that behavior:

adaptive_exec_plans



An Oracle Distributed Query CBO Bug Finally Fixed (After 7 Years)


Optimizing distributed queries in Oracle is inherently more difficult. The CBO not only has to account for the additional resources associated with distributed processing, such as networking, but also has to get reliable table/column statistics for the remote objects.

It is well documented that Oracle has (or had) trouble passing information about histograms for distributed queries (http://jonathanlewis.wordpress.com/2013/08/19/distributed-queries-3/).

In addition, Oracle was not able to pass selectivity information for “IS NULL/NOT NULL” filters via a DB link, even though the number of records with NULL is already recorded in the NUM_NULLS column of DBA_TAB_COLUMNS.
As a result of this bug, every query with an IS NULL predicate against a remote table ended up with a cardinality of 1, even if there were many NULL records in the table.

PLAN_TABLE_OUTPUT
SQL_ID  djpaw3d54d5uq, child number 0
-------------------------------------
select 
       * 
from 
       tab1@db_link_loop a , dual  
where 
       a.num_nullable is null

Plan hash value: 3027949496

-------------------------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     | Inst   |IN-OUT|
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |     8 (100)|          |        |      |
|   1 |  NESTED LOOPS      |      |     1 |    11 |     8   (0)| 00:00:01 |        |      |
|   2 |   TABLE ACCESS FULL| DUAL |     1 |     2 |     2   (0)| 00:00:01 |        |      |
|   3 |   REMOTE           | TAB1 |     1 |     9 |     6   (0)| 00:00:01 | DB_LI~ | R->S |
-------------------------------------------------------------------------------------------

Remote SQL Information (identified by operation id):
----------------------------------------------------

   3 - SELECT ID,NUM,NUM_NULLABLE FROM TAB1 A 
WHERE NUM_NULLABLE IS  NULL (accessing 'DB_LINK_LOOP')

The behavior was due to MOS Bug 5702977 (Wrong cardinality estimation for “is NULL” predicate on a remote table).

Fortunately, the bug is fixed in 12c and 11.2.0.4. A patch is available for 11.2.0.3 on certain platforms.


When Oracle would Choose an Adaptive Execution Plan (part 1)


Understanding when a useful feature, such as Adaptive Execution Plans, would fire is of crucial importance for the stability of any DB system.

There are a few documents explaining how this feature works, including some that dig deep into the details:

http://kerryosborne.oracle-guy.com/2013/11/12c-adaptive-optimization-part-1/

http://www.dbi-services.com/index.php/blog/entry/oracle-12c-adaptive-plan-inflexion-point

http://scn.sap.com/community/oracle/blog/2013/09/24/oracle-db-optimizer-part-vii–looking-under-the-hood-of-adaptive-query-optimization-adaptive-plans-oracle-12c

However, I was not able to find a comprehensive technical document about when this feature fires.
My previous post included some general thoughts about the issue. The simple explanations there, while plausible in general, do not fully match the messy reality.

In this post I will try to identify when a SQL plan goes from non-Adaptive (NL/HJ) to Adaptive and back. Once I have the “switching” point, I’ll review the 10053 trace just before and just after the switch.
Tables T1 and T2 were created with this script. T2 has 1 million records and T1 has one.
In a loop, I insert a single record into T1 and run this query:

select    t2.id,
          t1.str,
          t2.other
from
          t1,
          t2
where
          t1.id = t2.id
and       t1.num = 5
and       <UNIQUE NUMBER> = <UNIQUE NUMBER>   (ensures that there is no plan reuse)

Initially the SQL used Nested Loops, but after inserting 5 or 6 records, it switched to an Adaptive Execution Plan. We have a “switch” point!!!

 

The 10053 trace for the Non-Adaptive (NL) plan looks like this:

—————————————————————————————

Searching for inflection point (join #1) between 1.00 and 139810.13

AP: Computing costs for inflection point at min value 1.00

..

DP: Costing Nested Loops Join for inflection point at card 1.00

 NL Join : Cost: 5.00  Resp: 5.00  Degree: 1

..

DP: Costing Hash Join for inflection point at card 1.00

….

Hash join: Resc: 135782.55  Resp: 135782.55  [multiMatchCost=0.00]

….
DP: Costing Nested Loops Join for inflection point at card 139810.13

….

 NL Join : Cost: 279679.55  Resp: 279679.55  Degree: 1

..

DP: Costing Hash Join for inflection point at card 139810.13

….
Hash join: Resc: 290527.15  Resp: 290527.15  [multiMatchCost=0.00]

DP: Found point of inflection for NLJ vs. HJ: card = -1.00
——————————————————————————————————–

 

 

The 10053 trace for the Adaptive plan looks like this:

——————————————————————————————————–

Searching for inflection point (join #1) between 1.00 and 155344.59 

+++++

DP: Costing Nested Loops Join for inflection point at card 1.00


NL Join : Cost: 5.00  Resp: 5.00  Degree: 1

….

DP: Costing Hash Join for inflection point at card 1.00

Hash join: Resc: 135782.55  Resp: 135782.55  [multiMatchCost=0.00]

+++++

DP: Costing Nested Loops Join for inflection point at card 155344.59

….

NL Join : Cost: 310755.84  Resp: 310755.84  Degree: 1

….

DP: Costing Hash Join for inflection point at card 155344.59
..

 Hash join: Resc: 290536.21  Resp: 290536.21  [multiMatchCost=0.00]

+++++

DP: Costing Nested Loops Join for inflection point at card 77672.80

NL Join : Cost: 155380.42  Resp: 155380.42  Degree: 1

DP: Costing Hash Join for inflection point at card 77672.80

Hash join: Resc: 290392.89  Resp: 290392.89  [multiMatchCost=0.00]
+++++

DP: Costing Nested Loops Join for inflection point at card 116508.69

NL Join : Cost: 233068.13  Resp: 233068.13  Degree: 1

DP: Costing Hash Join for inflection point at card 116508.69

Hash join: Resc: 290464.05  Resp: 290464.05  [multiMatchCost=0.00]
+++++

DP: Costing Nested Loops Join for inflection point at card 135926.64

NL Join : Cost: 271911.98  Resp: 271911.98  Degree: 1

DP: Costing Hash Join for inflection point at card 135926.64

 Hash join: Resc: 290500.13  Resp: 290500.13  [multiMatchCost=0.00]

+++++

(skipped iterations)

DP: Found point of inflection for NLJ vs. HJ: card = 145228.51

——————————————————————————————————–

The relationship between cardinality and cost for the non-adaptive plan (NL) is shown here:
NonAdaptivePlan

The respective graphic for adaptive plan is here:

AdaptivePlan

In this situation, Oracle went with an adaptive plan because it was able to find an inflection point.

One important factor that determines whether an inflection point is found is the range in which the inflection point is searched. That is, the main reason the CBO could not find an inflection point for the non-adaptive plan is that the range was from 1 to 139810. If the range had been wider, it would probably have found an inflection point.

That means that in some cases the decision to use adaptive plans depends on what cardinality range it would use when searching for the inflection point.
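The search in the trace behaves much like a bisection over the cardinality range, which is easy to sketch (the linear NL cost and near-flat HJ cost used in the usage example are made-up stand-ins for the CBO’s real cost functions):

```python
def find_inflection(nl_cost, hj_cost, lo, hi, tol=1.0, max_iter=60):
    """Bisect [lo, hi] for the cardinality where the NL and HJ cost
    curves cross. Returns None when both endpoints favor the same
    join, i.e. no crossing exists in the range (the trace's
    "card = -1.00" outcome)."""
    diff = lambda card: nl_cost(card) - hj_cost(card)
    if diff(lo) * diff(hi) > 0:        # same sign at both ends: no crossing
        return None
    for _ in range(max_iter):
        mid = (lo + hi) / 2.0
        if hi - lo < tol:
            return mid
        if diff(lo) * diff(mid) <= 0:  # crossing is in the lower half
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2.0
```

With NL cost growing linearly and HJ cost nearly constant, the crossing sits where the two lines meet; widening [lo, hi] can turn a “no inflection point” outcome into a found one.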

It should also be noted that there are situations where Oracle would decide not to use adaptive plans without going through the motions of looking for an inflection point.

All in all, lots of additional research is needed to answer those questions…


A Patch for JUST_STATS Package


An alert user recently notified me about a problem with the JUST_STATS package. It appears that it does not work properly with PARTITIONs. So, click here to download the first patch.
Please note that you are free to review and modify the code of the package.


“Orphaned” LOB Segments


Recently, after doing a reorganization in an Oracle 11.2.0.2.0 DB, I came upon a LOB segment that did not link to a LOB.
That is, there was an entry in USER_SEGMENTS and USER_OBJECTS, but there was no entry in USER_TABLES, USER_LOBS or USER_INDEXES.
It turned out that the segment was in the recycle bin (USER_RECYCLEBIN). It appears that the DROP command did not “remove” all references to the dropped object.
The moral of the story is that if something is missing, look for it in the trash (the recycle bin, that is). Some things never change…


What to Do about Tools Hoarding Parallel Slaves in Oracle


Most GUI tools use cursors with relatively small fetch size (around 50) to retrieve data from Oracle. They open the cursor, fetch some data, show it and then wait on user input. All resources related to the connection and the open session are held while the tool waits on the user. While those resources are usually trivial for serial SELECT statements, they can be significant for parallel SELECT statements.

Each parallel statement gets assigned a number of parallel slaves, and those slaves are not freed until their respective cursor is closed, regardless of the amount of work the slaves do. Since there is a limited number of parallel slaves available in an instance (the PARALLEL_MAX_SERVERS init.ora parameter), hoarding parallel slaves can prevent future statements from being executed in parallel, severely impacting the performance of those statements.
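The hoarding mechanics can be illustrated with a toy pool model (the class and the numbers are made up for illustration; real behavior also involves downgrades, statement queuing, and other parameters):

```python
class SlavePool:
    """Fixed pool of parallel slaves, as capped by PARALLEL_MAX_SERVERS.
    Slaves stay allocated to a cursor until that cursor is closed, so an
    idle open cursor can starve later statements of parallelism."""

    def __init__(self, max_servers):
        self.free = max_servers
        self.held = {}                        # cursor id -> slaves held

    def open_cursor(self, cursor_id, requested):
        granted = min(requested, self.free)   # downgraded when pool is short
        self.free -= granted
        self.held[cursor_id] = granted
        return granted

    def close_cursor(self, cursor_id):
        self.free += self.held.pop(cursor_id, 0)
```

A GUI session that fetches 50 rows and then sits on its open cursor keeps its grant; every later statement sees a drained pool until that cursor is closed.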

The following blog post describes the situation quite well:

http://structureddata.org/2012/02/29/pitfalls-of-using-parallel-execution-with-sql-developer/

Since this behavior is not a bug, i.e. there is never going to be a fix, we need to find a way to manage it.

One solution is to disable parallelism for all sessions coming from GUI tools that use cursors and could therefore cause this problem. This is a radical step, though, as it would deprive those users of the ability to run anything in parallel.

The second option is to educate the users about the problems associated with open cursors and ask them to close all cursors as soon as possible. This approach is ideal when executed diligently, but in reality not all users are compliant.

The approach I would like to propose is to allow parallelism for all, but monitor those who do not close their open cursors. Here is the query that I use to monitor:

select count(*) from
  (
      select *
      from gv$px_session px_QC
      where px_QC.qcinst_id IS NULL
      minus
      select * from gv$px_session px_QC
      where px_QC.qcinst_id IS NULL
      and  exists
            (select *
             from gv$px_session  px_Slaves , gv$session sess
             where px_QC.qcsid = px_Slaves.qcsid
             and px_Slaves.sid = sess.sid
             and (sess.wait_class <> 'Idle'   -- slave is actively working
                 or ( sess.seconds_in_wait < 600
                      and sess.wait_class = 'Idle'   -- or has been idle less than 600s
                     )
                 )
             )
   )

This query returns zero if the parallel slaves are actively used, and greater than zero if there is a set of parallel slaves that has been idle for 600 seconds. The query can be used to identify the offending session and “talk” with the end user. It could be integrated with OEM using Metric Extensions, or it could be part of a monitoring script that kills the offending sessions. The possibilities are endless.


RMOUG 2015


I am very excited to be selected to present at RMOUG 2015.

I got two presentations – “Managing Statistics of Volatile Tables in Oracle” on Wednesday and “Working with Confidence: How Sure Is the Oracle CBO about Its Cardinality Estimates, and Why Does It Matter?”  on Thursday. The presentations are updated with new 12c stuff, so they could be useful even if you have seen them before.

I’ll be there from Wednesday noon to Thursday afternoon. Hope to see you!


Confidence of Cardinality Estimates Optimization Techniques – When to Use?


Presenting at professional conferences frequently brings out important points that were not highlighted well enough.

RMOUG 2015 was no different.

After I presented my techniques for performance tuning by accounting for the confidence of cardinality estimates (slides 35-46 ), an attendee asked why my way of optimization was better than well-established optimization methods, such as tuning by cardinality feedback.

Well, that was a good question!

The short answer is that the methods I presented are better when Oracle gets low confidence cardinality estimates, i.e. it is forced to guess, because the selectivity varies greatly across executions and Oracle has no way to account for that.

Since the matter got a bit abstract, let’s go through an example:

Let’s have a column Name that contains the names of people. What are our options for dealing with a predicate such as

 Name like ‘%<Specific Name>%’ ?
This predicate can be very selective if the name is something like… IOTZOV. That is, the predicate

 name like ‘%IOTZOV%’

would return very few records.

The very same predicate can be far less selective if the name is something like SMITH. That is, the predicate

name like ‘%SMITH%’

could return many records.

If there is no way for Oracle to figure out that one name (IOTZOV) is much more selective than another (SMITH), then the techniques I proposed for accounting for the confidence of cardinality estimates are probably the best choice.

If there were a way for Oracle to figure out that one name (IOTZOV) is much more selective than another (SMITH), then Oracle would have probably gotten a good plan anyway.

If all names (IOTZOV, SMITH) have similar selectivity, but Oracle cannot figure it out for whatever reason, then we can use other techniques to feed Oracle the correct info. In this case, the other optimization techniques can lead to faster execution plans than the confidence of cardinality techniques I proposed.



Detecting Connection Imbalance in Oracle RAC


For Oracle RAC configurations that rely on spreading the load equally among instances, ensuring that inbound connections are balanced is quite important.

Since I was not able to find a suitable metric in the Oracle Enterprise Manager, I had to create one myself.

As I started thinking about it, some fundamental questions started popping up. How much difference in connection count should trigger an alert? Should I measure the percentage difference, the actual difference (number of sessions), or something else?

Resisting the temptation to start coding convoluted logic, I reviewed what applied statistics has to offer, and this is what I found – the Chi Square test!

Here is an example for homogeneity testing and here is one for independence testing.

The way I understand it, you can use that test to see if the connections are independently (uniformly) distributed across DB instances, or if some DB instances tend to get more or fewer connections than “expected”.

Another great thing about the chi-square test is that it is already implemented in the Oracle RDBMS.

The built-in Oracle function (STATS_CROSSTAB) can give us the value of chi-squared (CHISQ_OBS), the degrees of freedom (CHISQ_DF), and the statistical significance (CHISQ_SIG). What we are interested in is the statistical significance. A value less than 0.05 indicates that the data is likely not distributed uniformly.

Here is the query that can detect if a DB user prefers/avoids a DB instance:
—————————————————————————————–

SELECT COUNT(*) cnt
FROM
    (SELECT STATS_CROSSTAB(inst_id, username, 'CHISQ_SIG') p_value
    FROM gv$session
    )
WHERE p_value < 0.05

—————————————————————————————–

Detecting connection imbalance at the client machine level is a bit trickier because each instance receives a few connections from the server it runs on.
That can easily be accounted for by excluding the servers that run the DBs:
—————————————————————————————–

SELECT COUNT (*) cnt
FROM
    (SELECT STATS_CROSSTAB(inst_id, machine, 'CHISQ_SIG') p_value
     FROM gv$session
     WHERE machine NOT LIKE '%dbp%.newsamerica.com'
     )
WHERE p_value < 0.05

—————————————————————————————–

These monitoring queries work without modification for any size RAC cluster. Adding or removing nodes is handled without issues, apart from a temporary imbalance that may come with adding nodes.
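For readers who want to see what STATS_CROSSTAB computes under the covers, here is a minimal Python sketch of the Pearson chi-square test on a made-up connection-count table (two instances × two users). The counts are invented, and instead of deriving a full p-value the statistic is simply compared with 3.841, the well-known 0.05 critical value for one degree of freedom:

```python
def chi_square(table):
    """Pearson chi-square statistic for a 2-D contingency table
    (rows = DB instances, columns = DB users)."""
    row_totals = [sum(r) for r in table]
    col_totals = [sum(c) for c in zip(*table)]
    grand = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if connections were spread uniformly.
            expected = row_totals[i] * col_totals[j] / grand
            stat += (observed - expected) ** 2 / expected
    return stat

CRIT_005_DF1 = 3.841  # 0.05 critical value for df = (2-1)*(2-1) = 1

skewed   = [[50, 10], [10, 50]]  # user A piles onto instance 1
balanced = [[30, 30], [30, 30]]

print(chi_square(skewed)   > CRIT_005_DF1)  # → True  (imbalance detected)
print(chi_square(balanced) > CRIT_005_DF1)  # → False (no imbalance)
```

The STATS_CROSSTAB queries above do the same comparison, except Oracle hands back the significance (CHISQ_SIG) directly, so we compare against 0.05 instead of a critical value.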


Importance of Data Clustering when Deleting in Batches


The inspiration for this post comes from the OTN discussion “Bulk Deletes from a Large Table”, where I volunteered the idea that we can improve performance by taking into account the clustering of the data to be deleted. Since my statement was rather general, I decided to create this post to fill in all the details.

First, we start with the table definition:

CREATE TABLE test
(
  dt        DATE ,
  st        NUMBER ,
  other_num NUMBER ,
  other_str VARCHAR2(100)
);

The first column of the table is a DATE; the second one, ST, is a status – a number between 0 and 99.

We want to populate the table with test data that has specific clustering characteristics. We want to simulate the distribution (clustering) we would get if users inserted data in chronological order – i.e. the oldest data (the smallest DT value) is inserted first, then slightly newer data, and so forth.
The following code fragment will not only fill the table with data, but will also give us the desired clustering.

BEGIN
  FOR i IN 1..200
  LOOP
    INSERT INTO test
    WITH v1 AS
      (SELECT rownum n FROM dual CONNECT BY level <= 10000)
    SELECT trunc(sysdate - i) ,
           mod(rownum, 100) ,
           rownum ,
           'BLAHBLAH123'
    FROM v1, v1
    WHERE rownum <= 5000;
    COMMIT;
  END LOOP;
END;
/

The data in the TEST table is clustered by DT. Records with the same DT are likely to be in the same block. That cannot be said for records with the same ST. Those records are scattered all over the table.
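The effect of that clustering on I/O can be sketched outside the database. In the toy model below, the row counts, the number of days, and the assumption that each “block” holds 100 consecutively inserted rows are all invented for illustration; it counts how many distinct blocks a purge would have to visit when filtering by DT versus by ST:

```python
ROWS_PER_DAY = 500
DAYS = 20
ROWS_PER_BLOCK = 100  # pretend each block holds 100 consecutively inserted rows

# Rows inserted in chronological order, mimicking the PL/SQL loop:
# (dt, st) where st cycles 0..99 like mod(rownum, 100).
rows = [(day, n % 100) for day in range(DAYS) for n in range(ROWS_PER_DAY)]

def blocks_touched(predicate):
    """Distinct blocks holding at least one row matching the predicate."""
    return len({i // ROWS_PER_BLOCK for i, r in enumerate(rows) if predicate(r)})

by_dt = blocks_touched(lambda r: r[0] == 5)   # one day's worth of data
by_st = blocks_touched(lambda r: r[1] == 7)   # one status across all days

print(by_dt, by_st)  # → 5 100
```

Because rows with the same DT sit next to each other, the DT filter visits a handful of blocks, while the ST filter hits a row in nearly every block – the same 10x-plus gap the physical-read numbers below show.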

Now, let’s create indexes that would support equally well each of the purge methods and gather stats:

CREATE INDEX idx1 ON test (dt, st);

CREATE INDEX idx2 ON test (st, dt);

EXEC dbms_stats.gather_table_stats('??????', 'TEST');

To better simulate memory pressure, let's create a procedure that flushes the buffer cache.
As SYS:

CREATE PROCEDURE flush_bc
AS
BEGIN
  EXECUTE IMMEDIATE 'alter system flush buffer_cache';
END;
/
GRANT EXECUTE ON flush_bc TO ??????;

The first purge technique accesses the data via DT. Since the data is clustered on DT, this technique is expected to be faster.

BEGIN
  FOR i IN 100..200
  LOOP
    DELETE TEST WHERE DT BETWEEN TRUNC(SYSDATE - i ) AND TRUNC(SYSDATE - i +1 ) ;
    COMMIT;
    sys.flush_bc ;
  END LOOP;
END;
/

It took 20,263 physical reads.

The second technique (please rebuild the TEST table before retrying) accesses the data via ST. The data for a given ST is spread across many blocks, so this technique is expected to be slower.

BEGIN
  FOR i IN 0..101
  LOOP
    DELETE TEST WHERE ST = i AND DT < TRUNC(SYSDATE - 98 ) ;
    COMMIT;
    sys.flush_bc;
  END LOOP;
END;
/

It took 239,100 physical reads – more than 10 times as many as the first one.

This test clearly shows that the first technique is better than the second. The memory pressure in the test scenario is significant, though, so the difference between the techniques would likely not be as great in most real-world settings.


How to Find the Optimal Configuration for Your Virtualized Environment


Thanks to the New York Oracle User Group for allowing me to present and to all who attended for their valuable questions and comments.
The presentation showcases a framework for finding optimal server configurations that can be useful to anyone dealing with server consolidation. Those working with Oracle Enterprise Manager/Grid Control can pick up some practical tips as well.

The PDF version of the presentation is here.
The PowerPoint version of the presentation is here. Many PowerPoint slides have notes.

