Battle of The Nodes RAC Performance Myths
By Riyaj Shamsudeen
Who am I?
- 16 years using Oracle products; over 15 years as an Oracle DBA
- Certified DBA: versions 7.0, 7.3, 8, 8i & 9i
- Specializes in performance tuning, internals and E-Business Suite
- Principal at OraInternals, a performance consulting company: http://www.orainternals.com
- OakTable member
- Email: rshamsud@orainternals.com
- Blog: http://orainternals.wordpress.com
Disclaimer
These slides and materials represent the work and opinions of the author and do not constitute official positions of my current or past employers or any other organization. This material has been peer reviewed, but the author assumes no responsibility whatsoever for the test cases. If you corrupt your databases by running my scripts, you are solely responsible for that. This material should not be reproduced or used without the author's written permission.
Agenda - Myths
- High CPU usage on one node does not affect other nodes' performance.
- All global cache performance issues are due to interconnect performance.
- Inter-instance parallelism is excellent, since CPUs from all nodes can be used effectively.
- Set sequences to NOCACHE in RAC environments to avoid gaps in sequence values.
- Small tables should not be indexed in RAC.
- Bitmap index performance is worse than in a single instance.
[Diagram: Instance #1, Instance #2 and Instance #3 share one database; one node is designated as the reporting node.]
The idea here is to put online, money-paying users on all nodes and throw costly reports, ad hoc SQL and batch jobs into one node. Only a small part of the online users are on the batch node. High CPU usage in the batch node shouldn't cause any issues for online users, right? If a SQL statement is bad, don't worry about tuning it; let it run in the report node, it won't affect online users' performance much, right? If a batch process is costly, no need to tune it, just run it in the batch node. Not exactly!
OraInternals Riyaj Shamsudeen 6
GC waits
GC CR waits: the latency of 'gc cr grant 2-way' (10g) and 'global cache cr request' (9i) waits increases with global cache layer latencies.
                                  %Time  Total Wait   Avg wait   Waits
Event                    Waits    -outs    Time (s)       (ms)    /txn
--------------------- --------- ------- ----------- ---------- -------
gc cr grant 2-way        11,518     3.0          23          2    14.7
Much of these GC waits are blamed on the interconnect interface and hardware. In many cases, the interconnect is performing fine; it is the GCS server (LMS) processes that are introducing latency.
A typical response from DBAs trying to improve global cache performance is to increase the number of LMS processes by adjusting _lm_lms (9i) or gcs_server_processes (10g). This can have a detrimental effect on performance: more LMS processes increase latency due to TLB thrashing. From mpstat/trapstat outputs, an increased amount of xcalls/migrations/TLB misses is visible. A few busy LMS processes are better than many quasi-busy LMS processes.
In 9i, increasing the priority of LMS processes to RT helps (covered in more detail later). From Oracle release 10.2.0.3 onwards, LMS processes run at real-time priority, which alleviates much of the performance issues around LMS. Two parameters control this behaviour: _high_priority_processes, the high-priority process name mask with a default value of LMS*, and _os_sched_high_priority, the OS high-priority level with a default value of 1.
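These underscore parameters are not visible in v$parameter. A common sketch for inspecting them reads the x$ fixed tables directly (must be run as SYS; x$ table layouts are version dependent, column names here are from 10g):

```sql
-- Hedged sketch: show the current values of the two hidden parameters
-- that control LMS scheduling priority. Run as SYS.
SELECT n.ksppinm  AS parameter_name,
       v.ksppstvl AS current_value
FROM   x$ksppi  n,
       x$ksppcv v
WHERE  n.indx = v.indx
AND    n.ksppinm IN ('_high_priority_processes', '_os_sched_high_priority');
```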
Agenda - Myths
- High CPU usage on one node does not affect other nodes' performance.
- All global cache performance issues are due to interconnect performance.
- Inter-instance parallelism is excellent, since CPUs from all nodes can be used effectively.
- Set sequences to NOCACHE in RAC environments to avoid gaps in sequence values.
- Small tables should not be indexed in RAC.
- Bitmap index performance is worse than in a single instance.
Node1 GC workload

[AWR excerpt: "Global Cache and Enqueue Services - Workload Characteristics" for node 1. The value-to-metric pairing was garbled in extraction; the excerpt covers avg global enqueue get time (ms), avg global cache cr block build time (ms), global cache log flushes for cr and current blocks served %, and avg global cache current block flush time (ms), with values in the 0.1 to 63.3 range.]
Interconnect performance
Before LMS sends a block to a remote cache, it waits for a log flush to complete. Even CR block transfers suffer from this wait; current (CUR) blocks, of course, need the log flush completed. So:

Global cache latency ~= interconnect latency for messages to and from LMS + LMS processing latency + LGWR log flush latency
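This decomposition can be eyeballed from the wait interface. A hedged sketch comparing average global cache block transfer latency with log-write latency on each instance (standard gv$ views; event names as in 10g):

```sql
-- If 'log file sync' / 'log file parallel write' averages are high on a node,
-- expect gc current/cr block service times from that node to rise with them.
SELECT inst_id,
       event,
       ROUND(time_waited_micro / NULLIF(total_waits, 0) / 1000, 2) AS avg_ms
FROM   gv$system_event
WHERE  event IN ('gc cr block 2-way', 'gc current block 2-way',
                 'log file parallel write', 'log file sync')
ORDER  BY inst_id, event;
```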
Node2 GC workload
In this specific case, the log flush was very slow due to a hardware issue.
[AWR excerpt, "Global Cache and Enqueue Services - Workload Characteristics" for node 2: avg global enqueue get time 0.3 ms. The remaining value-to-metric pairing was garbled in extraction, but an avg global cache current block flush time on the order of 4,380 ms stands out, confirming the slow log flush.]
LGWR priority
LGWR processes should also run with higher priority, in addition to LMS processes.
Better write throughput on the redo log files is essential for overall RAC performance. Heavy interconnect block transfer inevitably results in a hyperactive LGWR. Increase the priority of LGWR and LMS (Solaris example):

priocntl -e -c <class> -m <userlimit> -p <priority> <pid>
priocntl -e -c RT -p 59 `pgrep -f ora_lgwr_${ORACLE_SID}`
priocntl -e -c FX -m 60 -p 60 `pgrep -f ora_lms[0-9]*_${ORACLE_SID}`
Binding..
Another option is to bind LGWR/LMS to specific processors or processor sets. Still, interrupts can preempt LMS and LGWR, so binding LMS to a processor set with interrupts disabled helps (see psradm in Solaris).

But, of course, processor binding is applicable only to servers with a high CPU count, such as E25K platforms.
Summary
In summary:
- Use an optimal number of LMS processes.
- Run LMS and LGWR processes at RT or FX high priority.
- Configure decent hardware for the online redo log files, and tune LGWR writes.
- Avoid double buffering and double copying by choosing optimal file systems.
- Of course, tune the SQL.
Agenda - Myths
- High CPU usage on one node does not affect other nodes' performance.
- All global cache performance issues are due to interconnect performance.
- Inter-instance parallelism is excellent, since CPUs from all nodes can be used effectively.
- Set sequences to NOCACHE in RAC environments to avoid gaps in sequence values.
- Small tables should not be indexed in RAC.
- Bitmap index performance is worse than in a single instance.
Parallelism
A few parameters control this behaviour: parallel_min_servers and parallel_max_servers. Two more parameters are RAC specific: instance_groups and parallel_instance_group. In a multi-instance RAC cluster, these let us confine parallelism to specific instances.
Parallelism
Let's say that there are three instances: inst1, inst2, inst3. To span slaves across all instances:

inst1.instance_groups='inst1','all'
inst2.instance_groups='inst2','all'
inst3.instance_groups='inst3','all'
inst1.parallel_instance_group='all'
[Diagram: the query coordinator (QC) runs on Inst 1; parallel slaves (P001, P002, ...) are spawned on all three instances.]
Parallelism
To span slaves across instances inst1 and inst2 alone, the parameters will be:

inst1.instance_groups='inst1','all','inst12'
inst2.instance_groups='inst2','all','inst12'
inst3.instance_groups='inst3','all'
inst1.parallel_instance_group='inst12'
[Diagram: slaves P001/P002 run only on Inst 1 and Inst 2; no slaves on Inst 3.]
Parallel Select
alter session set parallel_instance_group='inst12';

select /*+ full(tl) parallel (tl,4) */
       avg(n1), max(n1), avg(n2), max(n2), max(v1)
from   t_large tl;

[Query output (garbled): slave processes p000/p001 for user CBQT, with INST_ID values of 1 and 2 only; the slaves were confined to instances 1 and 2.]
Parallel Select
alter session set parallel_instance_group='ALL';

select /*+ full(tl) parallel (tl,4) */
       avg(n1), max(n1), avg(n2), max(n2), max(v1)
from   t_large tl;

[tkprof excerpt (summary rows garbled in extraction): CPU/elapsed figures of 69.90 and 189.92 seconds appear in the totals for this run with slaves spanning all instances.]
PQ-Summary
Inter-instance parallelism needs to be carefully considered and measured. For partition-based processing, performance may be better when the work for a set of partitions is contained within one node. Excessive inter-instance parallelism increases interconnect traffic, leading to performance issues.
Agenda - Myths
- High CPU usage on one node does not affect other nodes' performance.
- All global cache performance issues are due to interconnect performance.
- Inter-instance parallelism is excellent, since CPUs from all nodes can be used effectively.
- Set sequences to NOCACHE in RAC environments to avoid gaps in sequence values.
- Small tables should not be indexed in RAC.
- Bitmap index performance is worse than in a single instance.
[Diagram callouts, cached vs. nocache sequences across two instances:]

Cached sequence (first callout lost in extraction):
2. These changes might not result in physical reads/writes.
3. Gaps in sequence values.
4. Still, a log flush is needed for the cache transfer.

Nocache sequence:
1. Three accesses to the sequence result in three block changes.
2. No gaps in sequence values.
3. But SEQ$ table blocks are transferred back and forth.
4. SEQ$ is updated with last_value as 11.
Test case (run concurrently from Inst 1 and Inst 2):

begin
  for loop_cnt in 1 .. 10000 loop
    insert into t1
    select t1_seq.nextval, lpad(loop_cnt, 500, 'x') from dual;
    -- Also making undo blocks to be pinged..
    commit;
  end loop;
end;
/
INSERT INTO T1 SELECT T1_SEQ.NEXTVAL, LPAD(:B1, 500, 'x') FROM DUAL

With a nocache sequence, every NEXTVAL also drives a recursive update of the data dictionary:

update seq$ set increment$=:2, minvalue=:3, maxvalue=:4, cycle#=:5, order$=:6,
       cache=:7, highwater=:8, audit$=:9, flags=:10
where  obj#=:1

[tkprof excerpt (totals garbled in extraction): both the INSERT and the seq$ update executed 10,000 times each; figures of roughly 3 seconds appear among the seq$ update's CPU/elapsed totals.]
[tkprof excerpt, wait events for the nocache run:]

Event waited on                Times Waited
----------------------------   ------------
row cache lock                         5413
gc current block 2-way                   46
gc cr block 2-way                        63
[tkprof excerpt (garbled): a second run showing about 11.5 seconds of elapsed time, with waits on gc current block 2-way (5,166 times), gc current grant busy and log file switch completion.]
Sequence- summary
Nocache sequences increase 'row cache lock' waits, increase interconnect traffic and increase elapsed time. If gap-free values are required, control sequence access from just one node or use non-sequence-based techniques.
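Where gaps are acceptable, the conventional alternative is a generously cached NOORDER sequence. A minimal sketch (the sequence name and cache size are illustrative, not from the test case):

```sql
-- Each instance caches its own range of 1000 values: no SEQ$ ping-pong
-- per NEXTVAL, at the cost of gaps and non-monotonic ordering across nodes.
CREATE SEQUENCE order_seq CACHE 1000 NOORDER;
```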
Agenda - Myths
- High CPU usage on one node does not affect other nodes' performance.
- Inter-instance parallelism is excellent, since CPUs from all nodes can be used effectively.
- Set sequences to NOCACHE in RAC environments to avoid gaps in sequence values.
- Small tables should not be indexed in RAC.
- Bitmap index performance is worse than in a single instance.
- All global cache performance issues are due to interconnect performance.
Small tables
Even small tables must be indexed. Excessive full table scans of smaller tables will increase CPU usage, and this guideline applies to RAC environments too. I think this myth arises from a misunderstanding of the problem.
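For the test case that follows, the indexed run relies on a plain b-tree index on the lookup column. The name below matches the T_SMALL2_N1 index visible in the trace output; the DDL itself is inferred, not shown in the deck:

```sql
-- Ordinary b-tree index on the small table's lookup column.
CREATE INDEX t_small2_n1 ON t_small2 (n1);
```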
Small tables
set timing on drop table t_small2; ; create table t_small2 (n1 number, v1 varchar2(10) ) tablespace users insert into t_small2 select n1, lpad(n1,10,'x') from commit;
select segment_name, sum(bytes)/1024 from dba_segments where segment_name='T_SMALL2' and owner='CBQT' group by segment_name SQL> / SEGMENT_NAME T_SMALL2
---------------------------------------- --------------256
SUM(BYTES)/1024
42
Test case
alter session set events '10046 trace name context forever, level 8';
set serveroutput on size 100000

declare
  v_n1 number;
  b_n1 number;
begin
  ...  -- loop body lost in extraction
  dbms_output.put_line (b_n1);
end;
/
[tkprof excerpt, full-table-scan run (garbled): 100,000 executions; about 3,100,000 consistent gets; CPU/elapsed figures of 62.72 and 63.71 seconds appear in the totals.]
[tkprof excerpt, indexed run: roughly 3.4 seconds CPU and elapsed for 100,000 executions, 300,209 consistent gets, 23 disk reads.]

Rows     Row Source Operation
-------  ---------------------------------------------------
100000   TABLE ACCESS BY INDEX ROWID T_SMALL2 (cr=300209 pr=23 pw=0 time=1896719 us)
100000    INDEX RANGE SCAN T_SMALL2_N1 (cr=200209 pr=23 pw=0 time=1109464 us)(object id 53783)
Agenda - Myths
- High CPU usage on one node does not affect other nodes' performance.
- Inter-instance parallelism is excellent, since CPUs from all nodes can be used effectively.
- All global cache performance issues are due to interconnect performance.
- Small tables should not be indexed in RAC.
- Triggers perform worse in RAC compared to single instance.
- Bitmap index performance is worse than in a single instance.
Bitmap index
Bitmap indexes are optimal for low-cardinality columns, and they are not suitable for tables with massive DML change rates. For select queries, bitmap index performance does not worsen because of RAC. Of course, having bitmap indexes on columns with enormous DML change rates is not optimal even in single-instance databases.
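A minimal illustration of where a bitmap index is appropriate (the table and column names are hypothetical, not from the deck's test case):

```sql
-- Bitmap index on a low-cardinality status column of a mostly-read table.
CREATE BITMAP INDEX orders_status_bx ON orders (status);
```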
[Test case: the same PL/SQL loop as in the b-tree test, run against the bitmap-indexed table; the body was lost in extraction.]
[tkprof excerpt, bitmap-index run (garbled): 100,000 executions; figures of 200,746 consistent gets and 78 disk reads appear in the totals.]
[tkprof excerpt (garbled): about 4.7 seconds CPU and 4.9 seconds elapsed for 100,000 executions, with 200,753 consistent gets; on par with the b-tree run.]
References
- Oracle support site: metalink.oracle.com (various documents)
- Internals guru Steve Adams' website: www.ixora.com.au
- Jonathan Lewis' website: www.jlcomp.daemon.co.uk
- Julian Dyke's website: www.julian-dyke.com
- Oracle8i Internal Services for Waits, Latches, Locks, and Memory, by Steve Adams
- Tom Kyte's website: asktom.oracle.com