Nothing Special   »   [go: up one dir, main page]

Ogh20180613 Rob Lasonder

Download as pdf or txt
Download as pdf or txt
You are on page 1of 64

01/06/2018

Oracle Golden Gate


. . . from AIX to Exadata . . .

Even voorstellen ......

Rob Lasonder
In dienst bij Qualogy als Exadata Specialist

... Sinds 2010 bezig met Exadata (V2)


... Momenteel project X2-6 bij int. verzekeraar
... Laatste tijd vooral bezig met Golden Gate

OCP: Oracle 8, 8i, 9i, 10g, 11g, 12c


OCE: Exadata X3, X4, RAC11g, ZFS Storage Appliance, Big Data & GG 11g Impl. specialist

Email: rob.lasonder@qualogy.com
www.rob.lasonder.org

2 1-6-2018

1
01/06/2018

Agenda

• 1 Golden Gate Use Case


• 2 Golden Gate introduction
• 3 Golden Gate customer architecture
• 4 Golden Gate installation & configuration
• 5 Golden Gate findings & troubleshooting
• 6 Golden Gate Monitoring
• 7 Golden Gate High Availability
• 8 Lessons learned

3 Agenda 1-6-2018

1 Golden Gate Use Case

2
01/06/2018

1 Golden Gate Use Case

• The customer purchased an Exadata 1/8 Rack and already


migrated some of their databases to Exadata

• De core legacy database is too complex to migrate at this


stage, the migration is planned for 2019.

• In order to already use Exadata for this core legacy


database, the database will be replicated to Exadata.

• This Golden Gate Clone on Exadata will be used to build data


services, mainly for reporting services.

5 1-6-2018

2 Golden Gate introduction

3
01/06/2018

2 Golden Gate Introduction : What is Golden Gate?

Oracle GoldenGate provides low-


low-impact capture,
capture routing,
routing
transformation,
transformation and delivery of transactional data across
heterogeneous environments in real time

• Golden Gate replicates data or subsets of data, in real time

• Golden Gate can transform data during the replication

• Golden Gate supports heterogenous environments


• Different platforms (Linux, Windows, etc.)
• Different databases (Oracle, MySQL, SQLServer, etc)
• Different database versions

7 1-6-2018

2 GG Introduction: Golden Gate core functionality

1 Database transactions are captured in trail files

2 Trail files are transported to target

3 Transactions are applied in the target

Real time (asynchronous) processing, Decoupled Architecture

8 1-6-2018

4
01/06/2018

2 GG Introduction: available GG architectures

9 1-6-2018

2 GG Introduction: heterogenous support

10 1-6-2018

5
01/06/2018

2 GG Introductie: Golden Gate 12.2 New Features

• Self describing trail files: Metadata records used to interpret DML


records instead of SOURCEDEFS or ASSUMETARGETDEFS

• Built-in Automatic Heartbeat table (GGSCI> ADD HEARTBEATTABLE)

• Transparent integration with Oracle Clusterware & ASM

• Integrated Replicat (and Extract)

• Graphical real-time instance monitoring (OGG Performance Tool Kit)

• Oracle Data Pump Integration for Table Instantiation

• Support for Invisible Columns (Oracle Only)


• New parameter – MAPINVISIBLECOLUMNS (Integrated only)

11 1-6-2018

2 GG Introductie: Golden Gate 12.2 New Features

Integrated Capture (GG 11.2 NF) & Replicat (GG 12.2 NF):
Extract & Replicat processes interact with Oracle Log Mining Services

• Only applicable to Oracle Databases.


• STREAMS_POOL_SIZE 1250 MB per Extract/Replicat process !

12 Agenda 1-6-2018

6
01/06/2018

2 GG Introduction: Golden Gate 12.2 New Features

Benefits Integrated Capture (11.2 NF) & Replicat (12.2 NF)

Benefits Integrated Capture (Extract)


• More datatypes natively supported (LOB's, XML)
• Compression Supported
• Easier configuration
• Transparent support for ASM, RAC, RMAN, DataGuard and TDE

Benefits Advantages Integrated Replicat


• Automatic parallel processing
• Support for 12c Multitenant databases
• Performance
Very easy to configure:

GGSCI> DBLOGIN USERIDALIAS ALIAS_GGADMIN


GGSCI> REGISTER EXTRACT ext1 DATABASE
GGSCI> ADD EXTRACT ext1 INTEGRATED TRANLOG, BEGIN NOW

13 Agenda 1-6-2018

2 GG Introduction: Golden Gate 12.2 New Features

Reference: Oracle GoldenGate Performance Tuning for Oracle Database Integration CON7773
Doc in Handout

14 1-6-2018

7
01/06/2018

2 GG Introduction: Golden Gate 12.2 New Features

References integrated capture & replicat Docs in Handout


• Extracting Data in Oracle GoldenGate Integrated Capture Mode
• Oracle GoldenGate Performance Best Practices (May 2017)

Oracle by Example:

15 1-6-2018

2 GG Introduction: Golden Gate versus Dataguard

Oracle ships 2 seperate products for database replication:


Dataguard and Golden Gate

Oracle Dataguard:
• Optimized for disaster recovery and data protection.
protection
• (physical) standby DB is a physical block-level copy of the database
• Part of Oracle Maximum Availability Architecture (MAA)

Oracle Golden Gate:


• Optimized for database replication and datatransformation in
heterogenous environments.

16 1-6-2018

8
01/06/2018

2 GG Introduction: Golden Gate versus Dataguard

17 1-6-2018

2 GG Introduction: Golden Gate versus ODI

Oracle Data Integrator and Golden Gate complement each other

Oracle ODI: is based on a new technological concept called E –LT


(Extract – Load, Transform), as opposed to traditional ETL tools

Disadvantages ETL architecture: Advantages E-


E-LT architecture:
• CPU/RAM / IO overhead • Less CPU/RAM/IO overhead
• Data transform process at • Data transform processes at
application level DB level

18 1-6-2018

9
01/06/2018

2 GG Introduction: Golden Gate versus ODI

Load &
Transform

Extract

Best of both worlds:


• Oracle Golden Gate provides a cross-platform data replication and
changed data capture. (Extract)
• Oracle ODI is used for the transformation and loading

19 1-6-2018

3 Golden Gate customer architecture

10
01/06/2018

3 Golden Gate customer architecture: the “old” world

IBM Power8 series (up to 8 threads per Power8 CPU) & IBM PowerVM

Oracle 12cR1 RDBMS


Production Database : PROD

SGA : 38 GB
PGA : 21 GB
Avg Redo per day : 73,4 GB
Size PROD_SCHEMA : 2.111 GB

Typical characteristics:
• Maintenance silo’s: network, sysadmin, storage, DBA
• DBA has no control over (v)CPU & shared storage
• DBA lacks required privileges by default (root, and even oracle for production)

21 1-6-2018

3 Golden Gate customer architecute : the “new” world


Exadata X6-2 1/8 Rack compute nodes: 2 x 22-core Intel Xeon E5 v4 processors
DB(M)A in Full Control, Total Stack Maintenance
DB + Sysadamin + Storage + Network

22 1-6-2018

11
01/06/2018

3 Golden Gate customer architecture: new versus old

New versus old: real-


real-life comparison in customer landscape.........
• Sample of 10 report in old (AIX) and new (Exadata) DB
• All 10 reports were faster on Exadata (“out-of-the-box”)
• Performance gains between 1 and 181 (!) times faster on Exadata
(especially with "big volume data reports")

23 1-6-2018

3 Golden Gate customer architecture: GG Setup

old world (AIX)

new world (Exadata)

GoldenGate 12.2
• 1 technical test environment
• 1 functional test environment
• 1 production environment

24 1-6-2018

12
01/06/2018

4 Golden Gate installation & configuration

4 Golden Gate installation & configuration

Analyse & Prepare the Prepare the Instantiate


Install GG
Architecture GG target source GG

• Golden Gate installation & configuration is a multi step approach

• Plan time for every step & substep

• If possible, implement incrementally. No Big Bang approach !!!

26 1-6-2018

13
01/06/2018

4 Golden Gate installation & configuration

This presentation shares some insight on the GoldenGate implementation.


It is not a step by step recipe to follow... For that.....

27 1-6-2018

Analyse &
Architecture GG 4 Golden Gate installation & configuration

Analyse phase :
• Which data needs to be replicated (schemas, object types)
• What is the replication architecture: unilateral / bilateral, etc.
• Is any filtering / transformation required
• Type of data changes (DML only or DML and DDL)
• What is the allowed latency
• Golden Gate usage, DR & High Availability requirements
• Are any non supported mechanisms in place, like nologging
operations, transportable tablespaces (TTS), etc

Use Oracle CHECK scripts to investigate source data


• Oracle GoldenGate database Schema Profile check script for
Oracle DB (Doc ID 1296168.1).
• Oracle GoldenGate database Complete Database Profile check
script for Oracle DB (All Schemas) (Doc ID 1298562.1)

28 1-6-2018

14
01/06/2018

Analyse &
Architecture GG 4 Golden Gate installation & configuration

Results analysis I : many non supported objects, all AQ$ tables

Solution: after discussion with data architects => AQ$ tables


are not required in the Golden Gate clone.

Exclude the AQ$ tables from the extract processes:


TABLE HR.*;
TABLEEXCLUDE HR.*_QUEUE_*

hence the importance of Naming conventions ...

29 1-6-2018

Analyse &
Architecture GG 4 Golden Gate installation & configuration

Results analysis II : Many tables without PMK/UK: 650+ tables !!

Problem with tables without PMK/UK:


• On the source DB: Storage & I/O overhead PMK/UK required for
supplemental logging (alternatively all columns will be used )
• On the target: Performance overhead when processing updates &
deletes => Full tablescans

Oracle has a solution for this problem


How to Handle Tables Without Primary Keys or Unique Indexes With
Oracle GoldenGate (Doc ID 1271578.1)

30 1-6-2018

15
01/06/2018

Analyse &
Architecture GG 4 Golden Gate installation & configuration

Oracle Solution for tables without PMK/UK : How to Handle Tables Without
Primary Keys or Unique Indexes With Oracle GoldenGate (Doc ID
1271578.1)

on the source database: add a column & autofill/backfill with unique value
SQL> alter table table_1 add column OGG_KEY_ID raw(16);
SQL> alter table table_1 modify OGG_KEY_ID default sys_guid();

Configure this column at GG Level as the unique identifier:


GGSCI> add trandata <owner>.<table_name>, COLS (OGG_KEY_ID), nokey

on the target database: Create a UNIQUE index on this column


SQL> create unique index ix_table_1 on table_1 (OGG_KEY_ID);

31 1-6-2018

Analyse &
Architecture GG 4 Golden Gate installation & configuration

Problem with this solution:

Compilation in many stored procedures: ORA-00947: not enough values


We did not even try the Forms & other applications.

This error occurs when you do not specify all the columns in your coding:
Bad coding: INSERT INTO TABLE_1 VALUES ('A','B');
Good coding: INSERT INTO TABLE_1 (COL1, COL2) VALUES ('A','B');

SQL> alter table HR.TABLE_1 modify OGG_KEY_ID invisible


GGSCI> add trandata HR.TABLE_1, COLS (OGG_KEY_ID), nokey

ERROR: Column OGG_KEY_ID not found in table HR.TABLE_1

Oracle Support did not provide a solution. An enhancement request was rejected ....

32 1-6-2018

16
01/06/2018

Install GG 4 Golden Gate installation & configuration

Installation Golden Gate: as of version 12.2 via OUI,

33 1-6-2018

Install GG 4 Golden Gate installation & configuration

• Local software installation, recovery related files on shared


storage (HA). Installation binaries about 700 GB.

• After base (12.2) install: look for latest GG patches and apply
with Opatch utililty.

• Every GG installation has its own software installation. So on


consolidated platforms (Exadata): multiple gg software
installations.

• Housekeeping: backup job for software + rotate job for


ggserr.log (copy_truncate). GG report files get rotated via
parameter configuration. (REPORTROLLOVER)

34 1-6-2018

17
01/06/2018

Install GG 4 Golden Gate installation & configuration

Install Golden Gate: recovery related files on shared storage

• Recovery related files are softlinked to shared storage

• Report files & ggserr.log file can grow quite large and are therefore
also softlinked to shared storage (with much larger storage capacity)

35 1-6-2018

Install GG 4 Golden Gate installation & configuration

Install Golden Gate: multiple software installations on Exadata for GG


• GG_HOME 1: /u01/app/oracle/product/12.2/oggcore_1
• GG_HOME 2: /u01/app/oracle/product/12.2/oggcore_2

Create a script (gg) to set the GG environment

[oracle@exa01dbadm01 ~]$ gg 1
setting gg environment to
/u01/app/oracle/product/12.2/oggcore_1

[oracle@exa01dbadm01 ~]$ echo $GG_HOME


/u01/app/oracle/product/12.2/oggcore_1

36 1-6-2018

18
01/06/2018

Install GG 4 Golden Gate installation & configuration

37 Agenda 1-6-2018

Prepare the
target 4 Golden Gate installation & configuration

1 Create the target database

2 Prepare database settings for Golden Gate

3 Create GGADMIN user and related objects

4 Start GG Manager process

5 Configure & start the GG Manager process

38 1-6-2018

19
01/06/2018

Prepare the
source 4 Golden Gate installation & configuration

1 Size archive log destination (increased due to Supplemental Logging

2 Create local trail file (/oracle/trail) destination, ++ sufficient size


potential DB Impact
3 Prepare database settings for Golden Gate

4 Create GGADMIN user and related objects


potential DB Impact
5 Enable Supplemental Logging

6 Configure & Start GG Manager process

7 Configure & Start GG Extract process

8 Configure & Start GG Pump process

39 1-6-2018

Prepare the
source 4 Golden Gate installation & configuration

Some details step 3: Prepare settings for source database

SQL> alter database force logging;


SQL> alter database add supplemental log data;

SQL> alter system set undo_retention=86400 scope=both;


SQL> alter system set "_log_read_buffers"=64 scope=spfile;
SQL> alter system set "_log_read_buffer_size" = 128 scope=spfile;
SQL> alter system set enable_goldengate_replication = true;
SQL> alter system set shared_pool_size = <current_value + 1250M>;
SQL> alter system set sga_target = <current_value + 1250M>;

Database in force logging !! May have effects on batch operations.


Check for unrecoverable actions (v$datafile.unrecoverable_change#)

Several parameter changes. More memory required for integrated


extract (logminer) processes.

40 1-6-2018

20
01/06/2018

Prepare the
source 4 Golden Gate installation & configuration

Some details step 5: Supplemental logging


The data stored in the online redo logfiles contain the ROWID of the
row being changed and the new values for the column(s) that are
changed. The ROWID is used to identify the changed row.
The ROWID is a mapping to a physical block address and cannot be
used by Golden Gate to identify the changed row. Therefore, the PMK
or UK are added to the online redo logfile as supplemental logging.

Supplemental logging can take place at the table level or at the


schema level:

GGSCI> add schematrandata HR


GGSCI> add trandata HR.table_1

A DDL lock is required on the table when Supplemental Logging is added. In


busy systems: SQL> alter session set ddl_lock_timeout=7200;

41 Aenda 1-6-2018

Prepare the
source 4 Golden Gate installation & configuration

Some details step 5: Supplemental logging

Redo size increased by 60% due to Supplemental Logging. Reason is


the many tables without PMK / UK

Supplemental logging was (initially) implemented at the SCHEMA level

42 Agenda 1-6-2018

21
01/06/2018

Prepare the
source 4 Golden Gate installation & configuration

Advantages of enabling supplemental logging at the schema level

• Automatically enables supplemental logging for new tables created


with a CREATE TABLE DDL command.

• Updates supplemental logging for tables affected by an ALTER


TABLE DDL command that adds or drops columns

• Updates supplemental logging for tables affected by RENAME


TABLE command

• Updates supplemental logging for tables affected by adding or


dropping of unique or primary key constraints

43 1-6-2018

Prepare the
source 4 Golden Gate installation & configuration

Some details step 7: Configure the Extract process parameter file

-- configure integrated extract in extract parameter file


TRANLOGOPTIONS INTEGRATEDPARAMS (MAX_SGA_SIZE 1024)

-- include DDL
-- MAPPED means we only replicat DDL from the mapped objects
-- and we only want tables and indexes, no other object types
DDL &
INCLUDE MAPPED OBJTYPE TABLE &
INCLUDE MAPPED OBJTYPE INDEX

-- we want all the tables from user HR


table HR.*

-- etc. More specifications & options.

44 1-6-2018

22
01/06/2018

Prepare the
source 4 Golden Gate installation & configuration

Some details step 7: Configure & start the Extract Process

GGSCI> dblogin useridalias alias_ggadmin


Successfully logged into database.

GGSCI> register extract X1PROD database


2017-09-14 12:43:29 INFO OGG-02003 Extract X1PROD
successfully registered with database at SCN 6725773753243.

GGSCI> add extract X1PROD, integrated tranlog, begin now

GGSCI> add exttrail /oracle/trail/PROD/x1, extract X1PROD,


megabytes 200

GGSCI> start extract x1prod

45 1-6-2018

Prepare the
source 4 Golden Gate installation & configuration

Some details step 7: Local trail files after Extract is started....

46 1-6-2018

23
01/06/2018

Prepare the
source 4 Golden Gate installation & configuration

Some details Step 8, remote trail files after Pump is configured & started:

47 1-6-2018

Prepare the
source 4 Golden Gate installation & configuration

Status of the Golden Gate processes on the source:

GGSCI> info all


Program Status Group Lag at Chkpt Time Since Chkpt
MANAGER RUNNING
JAGENT STOPPED
EXTRACT RUNNING P1PROD 00:12:18 00:00:04
EXTRACT RUNNING X1PROD 00:00:04 00:00:15

48 1-6-2018

24
01/06/2018

Prepare the
source 4 Golden Gate installation & configuration

After the prepare source operation part of the


Golden Gate configuration is up and running:
• MGR, Extract & Pump processes running on source
• Local trail files are being written
• Local trail files are being transferred to remote trail

49 1-6-2018

Instantiate
GG 4 Golden Gate installation & configuration

Instantiation: inital copy of the source database + replication start

• Several methods available: GG, SQL*Loader, Oracle Datapump


• We use Oracle Datapump because of performance
• Instantiation needs to be a consistent, automated and well
performing operation:
• Functional test environment (source DB) is being cloned from
Production every 3 months, and GG needs to be re-instantiated.
• In case of very serious issues, we want to be able to re-
instantiate the Golden Gate Production target quickly

Objectives:
Objectives
• (re)instantiate the target database in maximum 8 hours.
(Source DB is 2.111 GB)
• Impact on Production should be as little as possible.
• (re)instantiation should run as an automated batch job

No DDL allowed during instantiation !! Prevent error below:


ORA-01466: Unable to read data -- Table definition has changed" (during the export)

50 1-6-2018

25
01/06/2018

Instantiate
GG 4 Golden Gate installation & configuration

Instantiation architecure: NO (!) impact on source database

EXPDP
flashback_scn=12345678

IMPDP

GGSCI> start replicat r2symp,


aftercsn 12345678

51 1-6-2018

Instantiate
GG 4 Golden Gate installation & configuration

Datapump Export Import


• Optimized for speed
• 2 seperate export jobs, parallel 24 (regular & large tables)
• Import jobs on 2 seperate Exadata compute nodes
• Import data only !
• Rebuild indexes & enable constraints (novalidate) afterwards
• During import operation: implement Advanced Compression.
(size source: 2.111 MB, size target 1.446 MB)

52 Agenda 1-6-2018

26
01/06/2018

Instantiate
GG 4 Golden Gate installation & configuration

Export & import parameter files

53 1-6-2018

Instantiate
GG 4 Golden Gate installation & configuration

Configure the Replicat parameter file after the IMPORT operation

54 1-6-2018

27
01/06/2018

Instantiate
GG 4 Golden Gate installation & configuration

Start the replicat after the IMPORT operation

GGSCI> register replicat rprod database


2017-08-23 13:08:44 INFO OGG-02528 REPLICAT RPROD
successfully registered with database as inbound server
OGG$RPROD.

GGSCI> add replicat rprod, integrated, exttrail


/zs01sn01/gg_prod_trail/prod/x1
REPLICAT (Integrated) added.

GGSCI> start replicat rprod, aftercsn 12345678


Sending START request to MANAGER ...
REPLICAT RPROD starting

55 1-6-2018

Instantiate
GG 4 Golden Gate installation & configuration

Everything is scripted and automated runs in Batch Scheduler

Batchjobs on source

56 1-6-2018

28
01/06/2018

Instantiate
GG 4 Golden Gate installation & configuration

Batchjobs on target

57 1-6-2018

5 Findings & Troubleshooting

29
01/06/2018

5 Findings & troubleshooting


. . . Mitigate the whole chain when issues occur

59 1-6-2018

5 Findings & troubleshooting


. . . Document findings & solutions !!!

60 Agenda 1-6-2018

30
01/06/2018

5 Findings & troubleshooting

Major issue 1: Issues with supplemental logging on the source

ORA-22328: object "HR"."GENERIC_OTC_QUEUE_TABLE" has errors.


ORA-62009: An attempt to encode invalid XML character 0x4 in
supplemental log has occurred.

Even though we excluded the Advanced Queuing tables from the


Extract process, this error still occurred, because supplemental
logging was enabled at the SCHEMA level, including these AQ tables
inside the HR schema.

There is NO solution for this problem. You cannot exclude objects


when you add supplemental logging at the SCHEMA Level.
⇒ Confirmed by Oracle (Service Request)
⇒ Enhancement request rejected. (works as designed)

61 1-6-2018

5 Findings & troubleshooting

Major issue 1: Issues with supplemental logging on the source

The best solution would be to move the AQ$ tables to another


schema. However, this requires code & application changes & testing.
Impact analysis planned for next year.

For now: move from SCHEMA level logging to TABLE level logging.
GGSCI> add trandata HR.table_1 (instead of add schematrandata HR)

In Extract parameter file: map every single table. In stead of HR.*;


TABLE HR.table_1;
TABLE HR.table_2;
.......

Disadvantate of this approach:


1. GG extract configuration much more complex. 2000+ tables listed.
2. New tables are not automatically added to the GG replication.
3. Certain DDL operations are not supported with table level logging:
RENAME table, changes to PMK.

62 1-6-2018

31
01/06/2018

5 Findings & troubleshooting

Major issue 2: Very bad performance due to tables without PMK/UK

• Tables without PMK/UK on source were initially ignored. (No


adequate solution with invisible OGG_KEY_ID column)

• However, they cause very bad performance on the target, during


updates and deletes: => full tablescans for every (grouped)
transaction.

63 1-6-2018

5 Findings & troubleshooting

Major issue 2: Very bad performance due to tables without PMK/UK

64 1-6-2018

32
01/06/2018

5 Findings & troubleshooting

Major issue 2: Very bad performance due to tables without PMK/UK

65 1-6-2018

5 Findings & troubleshooting

Major issue 2: Very bad performance due to tables without PMK/UK

Subsequent table fragmentation causes performance to degrade even worse !!!


66 1-6-2018

33
01/06/2018

5 Findings & troubleshooting

Major issue 2: Very bad performance due to tables without PMK/UK

Permanent solution (impact analysis planned for next years program)


⇒ create PMK/UK or implement visible OGG_KEY_ID
⇒ Requires code changes and extensive testing

Temporary workarounds 1: exclude tables from the GG replication


=> Discussion with data architects.

Temporary workarounds 2: define logical keys


=> Discussion with data architect to identify tables

Temporary workarounds 3: use IGNOREDELETES for “snapshot tables”


=> Discussion with data architect to identify tables

Temporary workarounds 4: create all column index on remaining tables


(on target)

67 1-6-2018

5 Findings & troubleshooting

Major issue 2: Very bad performance due to tables without PMK/UK

Table Analasis sheet, with colour coded “solutions”/workarounds


68 Agenda 1-6-2018

34
01/06/2018

5 Findings & troubleshooting

Major issue 2: Very bad performance due to tables without PMK/UK


..... Workaround with logical keys

On the source:
GGSCI> add trandata HR.TABLE_1 COLS (A,B) NOKEY

In extract parameter file: table HR.TABLE_1 (A,B);

On target:
SQL> Create unique index hr.ix_table_1 on hr.table_1 (A,B);

In replicat parameter file:


MAP HR.TABLE_1, target HR.TABLE_1, KEYCOLS (A,B);

69 1-6-2018

5 Findings & troubleshooting

Major issue 2: Very bad performance due to tables without PMK/UK


..... Workaround with ignoredeletes

• Only updates and deletes are affected negatively. Inserts are fine
• Customer has ‘snapshot’ tables which only contains inserts and
periodic deletes (overnight, cleaning up old entries)

Solution: ignore the deletes and perform them on the target outside
of the Golden Gate scope, in a nightly batch. Configure this in the
source extract parameter file:

IGNOREDELETES
TABLE HR.TABLE_2;
TABLE HR.TABLE_3;

70 1-6-2018

35
01/06/2018

5 Findings & troubleshooting


. . . Several minor issues, some examples. . .

Example 1: Performance optimization

Example 2: DDL Error, table with added column

WARNING OGG-01431 Aborted grouped transaction on


'HR.TBBU_REPROCESS_BUYER_STATUSES', Mapping error

Example 3: DDL Error, system named constraints

ERROR OGG-00519 Fatal error executing DDL replication: error


[Error code [2443], ORA-02443: Cannot drop constraint -
nonexistent constraint SQL alter table "

71 1-6-2018

5 Findings & troubleshooting


. . . Several minor issues: example 1 performance optimization

Replication lag 10+

72 1-6-2018

36
01/06/2018

5 Findings & troubleshooting


. . . Several minor issues: example 1 performance optimization

Changing parameters

EXTRACT
-- to minimize the latency during low volume changes
EOFDELAYCSECS 1
FLUSHCSECS 1

PUMP
-- to minimize the latency during low volume changes
EOFDELAYCSECS 1
FLUSHCSECS 1

REPLICAT
-- to minimize the latency during low volume changes
EOFDELAYCSECS 1

Reference: How can I optimize (reduce delay, lag ) throughput from NSK to a target
platform in a low volume situation. (Doc ID 1489540.1)
73 1-6-2018

5 Findings & troubleshooting


. . . Several minor issues: example 1 performance optimization

Replication lag 1+ seconds

74 1-6-2018

37
01/06/2018

5 Findings & troubleshooting


. . . Several minor issues: example 2 new column added

WARNING OGG-01431 Aborted grouped transaction on


'HR.TBBU_REPROCESS_BUYER_STATUSES', Mapping error

Normally, new columns can be added to a table, and this DDL is


captured and applied by Golden Gate.

However, when it concerns a table with no PMK or UK:


• as a consequence, all columns are required as supplemental logging.
• In those cases, the new column is not automatically added to the
supplemental log group.

Solution:
⇒ Temporarily exclude the table from the Replicat (tableexclude
HR.TABLE_1;) and restart the replicat.
⇒ Process all the “bad” trails with the missing column information.
⇒ Add the new column as Supplemental Logging on the source
⇒ Reinstatiate the new table, remove the tableexclude from the
Replicat and restart the Replicat

75 1-6-2018

5 Findings & troubleshooting


. . . Several minor issues: example 3 system generated names

ERROR OGG-00519 Fatal error executing DDL replication:


error [Error code [2443], ORA-02443: Cannot drop constraint
- nonexistent constraint SQL alter table "

Try to avoid these type of errors by naming your constraints !!

Use Replicat Error Handling to handle this type of errors:


-- DDL Errors
DDLERROR DEFAULT ABEND
DDLERROR 2443 IGNORE

Monitor your Replicat report files and discard files errors !!

Oracle claims in an Oracle note that you can treat DML & DDL errors with the
same error handling clause (REPERROR). I did not get it to work. In stead:
DDL Errors => DDLERROR clause
DML Errors => REPERROR clause
76 1-6-2018

38
01/06/2018

5 Findings & troubleshooting


. . . Several minor issues: example 3 system generated names

Sample section in Replicat parameter file to handle DML errors

77 1-6-2018

6 Golden Gate monitoring

39
01/06/2018

6 Golden Gate monitoring

Standard Oracle tools


• GGSCI
• Oracle AWR & SPADV reports
• Oracle Healthcheck script
• Oracle Enterprise Manager & GG Plug-in
• Oracle Performance Monitoring Toolkit

Customized tools & reports


• Custom process monitoring scripts
• Custom heartbeat tables
• Custom gg table stats report

79 1-6-2018

6 Golden Gate monitoring


. . . Standard Oracle tools: AWR report (local only!)

Seperate section in AWR for GoldenGate


80 1-6-2018

40
01/06/2018

6 Golden Gate monitoring


. . . Standard Oracle tools: AWR report (local only!)

81 1-6-2018

6 Golden Gate monitoring


. . . Standard Oracle tools: AWR report (local only!)

82 1-6-2018

41
01/06/2018

6 Golden Gate monitoring


. . . Standard Oracle tools: OGG Healthcheck script
• Mos Note: GoldenGate Integrated Capture and Integrated Replicat
Healthcheck Script (Doc ID 1448324.1)
• Script did not run at first. Script hang. Comment out alert log check:

Whenever you log an SR with Oracle they often ask you to run the
healthcheck script and upload the results (html file).

83 1-6-2018

6 Golden Gate monitoring


. . . Standard Oracle tools: OGG Healthcheck script
header information

database information

process state information

sys checks

configuration checks

performance checks & advice

rep errors, conflicts & collisions

table statistics

84 1-6-2018

42
01/06/2018

6 Golden Gate monitoring


. . . Standard Oracle tools: OGG Healthcheck script

Output above based on view gv$goldengate_table_stats

85 1-6-2018

6 Golden Gate monitoring


. . . Oracle Enterprise Manager and GoldenGate plug-
plug-in

Requires GG Management Pack license

Golden Gate has it’s own agent (JAGENT), which communicates with OEM Agent.
Every Golden Gate configuration has it’s own JAGENT installed.

Reference: https://docs.oracle.com/goldengate/em1311/gg-emplugin/EMGGP/toc.htm
86 1-6-2018

43
01/06/2018

6 Golden Gate monitoring


. . . Oracle Enterprise Manager and GoldenGate plug-
plug-in

Step 1: Download, Install & Deploy GG Plugin to OMS and GG targets

Step 2: Install & configure GG agents (JAGENT) on target nodes

Step 3: Discover GG plugins in OEM Cloud Control

Reference: https://docs.oracle.com/goldengate/em1311/gg-emplugin/EMGGP/toc.htm
87 1-6-2018

6 Golden Gate monitoring


. . . Oracle Enterprise Manager and GoldenGate plug-
plug-in

88 1-6-2018

44
01/06/2018

6 Golden Gate monitoring


. . . GoldenGate Monitoring Performance Toolkit

89 1-6-2018

6 Golden Gate monitoring


. . . GoldenGate Monitoring Performance Toolkit

90 1-6-2018

45
01/06/2018

6 Golden Gate monitoring


. . . GoldenGate Monitoring Performance Toolkit

91 1-6-2018

6 Golden Gate monitoring Detailed work instruction in Handout

. . . Custom tooling & reports

Challenge:
• Report on current Golden Gate status
• Alert if the replication lag increases or GG processes are down
• No access (as Oracle) to source database server. Only PUID access

Customization 1: making reports accessible for PUID


Customization 2: remote monitoring
Custumization 3: monitoring replication lag
Customization 4: reporting on Golden Gate Statistics

92 1-6-2018

46
01/06/2018

6 Golden Gate monitoring


. . . Customization 1: making reports available

-- parameter setting in ./GLOBALS on source


-- allows user read access to the report files
-- overriding the default (026)
OUTPUTFILEUMASK 022

93 1-6-2018

6 Golden Gate monitoring


. . . Customization 1: making reports available

94 1-6-2018

47
01/06/2018

6 Golden Gate monitoring Detailed work instruction in Handout

. . . Customization 2: remote monitoring

External tables call OS scripts:


OS scripts that run GGSCI commands preprocessor GG_DIR2:'gg_status_02.sh'

AIX --
/home/oracle/scripts/gg_status2.sh --
PROD

DB LINK GG_SYMP
User GGADMIN_READ with Read Acces to External Table

Exadata -- SQL> query


/home/oracle/scripts/gg_status2.sh --
prodgg

Advantage of this setup: we do not need access (as oracle) the source system to view the
current GG status

95 1-6-2018

6 Golden Gate monitoring


. . . Customization 2 : remote monitoring

96 1-6-2018

48
01/06/2018

6 Golden Gate monitoring


. . . Customization 2 : remote monitoring, part of report

97 1-6-2018

6 Golden Gate monitoring


. . . Customization 3 : heartbeat tables

98 1-6-2018

49
01/06/2018

6 Golden Gate monitoring


. . . Customization 3 : heartbeat tables

On the target:
• The INSERT_ON_TARGET Date/Time is recorded as part of the heartbeat
• This allows queries to see the Replication lag:

99 Agenda 1-6-2018

6 Golden Gate monitoring


. . . Customization 3 : heartbeat tables , part of report

100 1-6-2018

50
01/06/2018

6 Golden Gate monitoring


. . . Customization 3 : heartbeat tables implementation

ON THE SOURCE
• Create Heartbeat tables on source with an automatically updated
insert_on_source date/time column
• Create some dependant objects (sequences).
• Create DBMS_SCHEDULER_JOBS to insert records every minute

ON THE TARGET
• Create a trigger to update the insert_on_target date/time column
• Enable the trigger (triggers are by default disabled by GoldenGate)
• Create some reports based on the hearbeat tables

101 1-6-2018

6 Golden Gate monitoring


. . . Customization 4 : GoldenGate table statistics

• Good table update statistics are hard to find

• GGSCI interfrace quite primitive and insufficient

• Oracle ships the licensed Veridata product for data reporting


and comparison, but very expensive

• Currently I am working on a customized solution using view


GV$GOLDENGATE_TABLE_STATS. This view is also used by the
Oracle Healthcheck Script.

102 1-6-2018

51
01/06/2018

7 Golden Gate High Availability

7 Golden Gate High Availability


. . . Overview

HA Source System : with startup script


HA Remote Trail file System : clustered ZFS
HA Target System : Oracle Clusterware

104 1-6-2018

52
01/06/2018

6 Golden Gate High Availability


. . . Relevant Golden Gate processes for High Availability

• The only process we have to care about (when configured correctly) is


the Manager process

• The Extract, Pump and Replicat processes are managed by the


Manager process (via AUTOSTART & AUTORESTART parameter)

• But … the manager process should be managed by some external


mechanism
• On AIX: startup scripts
• On Exadata: by Oracle Clusterware

105 1-6-2018

6 Golden Gate High Availability


. . . Relevant Golden Gate processes for High Availability

• The only process we have to care about (when configured correctly) is


the Manager process

• The Extract, Pump and Replicat processes are managed by the


Manager process

• But … the manager process should be managed by some external


mechanism
• On AIX: startup scripts
• On Exadata: by Oracle Clusterware

106 1-6-2018

53
01/06/2018

6 Golden Gate High Availability


. . . Golden Gate extract and replicat processes

• The GG Manager (can) start / restart the Extract and Replicat


processes.

• This is configured with parameters in the MANAGER parameter file:

AUTOSTART ER *
-- Starts Extract and Replicat processes when Manager starts

AUTORESTART ER *, retries 12, WAITMINUTES 5, RESETMINUTES 60


-- restarts Extract and Replicat processes again after abnormal
termination.

HA events are recorded in the ggserr.log file and manager report file:

107 1-6-2018

6 Golden Gate High Availability


. . . Golden Gate High Availability on the target (Exadata)

Application resource ggate_prod


• hard dependency on Application VIP (pullup,
shutdown)
• action script to stop/start/check gg processes

Application VIP
• Separate IP Address on public network (10 Gb/s)
• Added to DNS as dm1db01-appvip

108 1-6-2018

54
01/06/2018

6 Golden Gate High Availability


. . . Clusterware configuration steps Detailed work instruction in Handout

Step 1: Make sure certain data (like parameter files, checkpoint files,
process files, etc) is on a shared location.

Step 2: Create & Start Application VIP (IP address & name must be
entered in DNS)

Step 3: Create & Test GG Action Script (to be used by GG Application


Resource

Step 4: Create & Start GG Application Resource

Step 5: Reconfigure GG Pump Parameter:

RMTHOST dm1db01-appvip, MGRPORT 7810, TCPBUFSIZE 10485760,


TCPFLUSHBYTES 10485760

109 1-6-2018

6 Golden Gate High Availability


. . . Golden Gate Failover steps

1 Check current status of cluster resources

2 Check current status of GG processes

110 1-6-2018

55
01/06/2018

6 Golden Gate High Availability


. . . Golden Gate Failover steps

3 Perform the failover

4 Check the GG processes

The Golden Gate PUMP process on the source will temporarily ABEND, but will be restarted automatically by the
Manager process on the source.

111 1-6-2018

6 Golden Gate High Availability


. . . Golden Gate Failover, confusing .....

MGR and Replicat processes appear to be running on both nodes simultaneously ?????

112 1-6-2018

56
01/06/2018

6 Golden Gate High Availability


. . . Golden Gate Failover, confusing .....

The GGSCI Interface uses the shared checkpoint and process locations on the ZFS-SA to
display this information. The ACTUAL GG processes are only running on only one node
Ps –ef|grep mgr on first node

Ps –ef|grep mgr on second node

113 1-6-2018

6 Golden Gate High Availability


. . . Golden Gate Failover steps

3 Perform the failover

4 Check the GG processes

The Golden Gate PUMP process on the source will temporarily ABEND, but will be
restarted automatically by the Manager process on the source.
114 1-6-2018

57
01/06/2018

8 Lessons Learned

Lessons Learned 1: DBMA not for every DBA

The Exadata Platform requires a scaling up and scaling out of the


traditional Oracle DBA activities:
• Scaling up: RAC, RMAN, +ASM, Dataguard, OEM++
• Scaling out:
□ Exadata (& optional ZFS-SA) Storage Management
□ Exadata network management (Private, Public, InfiniBand)
□ Exadata platform management (sysadmin skills)

Not every DBA is capable to adapt those changes.

116 1-6-2018

58
01/06/2018

Lessons Learned 2: Make use of existing structures

Make use of existing structures & procedures already in place at


the customer site.
• It will save you work
• It will change “opponents” into supporters

Examples applied in this project:


• Existing snapshot & cloning procedures were used for GG
database instantiation
• Batch scheduling team was asked to run & monitor GG batches

117 1-6-2018

Lessons Learned 3: Make use of Oracle Support !!

• When working on technical issues, make use of Oracle Support


to increase your project team.
• Once in place, it only takes about 20 minutes to create a
Service Request.
• Quality of Oracle Support analysists varies heavily.

(example: one SR (not being able to discover GG in OEM) took 2 months


with hardly any progress. Then support analyst went on holiday and
colleague took over. Within a few days the SR was resolved.

118 1-6-2018

59
01/06/2018

Lessons Learned 4: Document the solution !!

Preferably on a WIKI

119 1-6-2018

Lessons Learned 5: Data analysis of source DB !!

Perform the data analysis of the source DB first, with Oracle


provided scripts. Reports on Golden Gate challenges:
• Non supported objects (like AQ$ tables)
• Tables without PMK/UK
• Non supported datatypes (like IDENTITY columns)
• Etc.

Try and avoid making promises up front, do a POC

Contact the data administrators who know their data and


discuss the outcome and possible solutions with them.

References:
Oracle GoldenGate database Schema Profile check script for Oracle DB (Doc ID 1296168.1). Script
Oracle GoldenGate database Complete Database Profile check script for Oracle DB (All Schemas)
(Doc ID 1298562.1)
GoldenGate Integrated Capture and Integrated Replicat Healthcheck Script (Doc ID 1448324.1
120 1-6-2018

60
01/06/2018

Lessons Learned 6: Incremental approach

If possible, choose an incremental approach and evaluate


every step over time:
• Place source DB in archive log & force logging mode
• Add supplemental logging to the source DB
• Configure and start the extract process
• Configure and start the pump process
• Instantiate the target database
• Automate the instantiation
• Configure High Availability
• Configure monitoring & reporting

• Configure the TEST environment first, then PRODUCTION

121 1-6-2018

Lessons Learned 7: Take time to work out the solution

Take time to work out the solution.

As soon as you are outside the “Golden Gate sweetspot”, you are much
on your own, with little to no help from Oracle Support.

Documentation on internet/My Oracle Support hard to find. Available


documentation just touches the surface.

122 1-6-2018

61
01/06/2018

Lessons Learned 8: Supersize if possible

Try and supersize your Golden Gate trail file locations, both the local
trail and the remote trail.

This will allow you room to manage Golden Gate issues. While you fix
the issues, there is still sufficient storage for Golden Gate to keep on
shipping trail files.

In this Customer setup:


• GG local trail file location with 200 GB of storage. Enough for +10
days of data changes
• GG remote trail file location of “unlimited” storage capacity (hosted
on ZFS Storage Appliance).

123 1-6-2018

Lessons Learned 9: GoldenGate stability

GoldenGate Extract and Pump processes turned out to be VERY


stable.

GoldenGate Manager process monitors other processes VERY


succesfully. (For example AUTORESTART Pump after network issue)

Most problems occur with the GoldenGate Replicat processes.


⇒ Important to configure CDR (Conflict Detection & Resolution) in
Replicat parameter file
⇒ Important to have sufficient remote trail file storage. While error
is being analyzed and resolved, remote trail file processing stalls
and new trails keep coming !

124 1-6-2018

62
01/06/2018

Lessons Learned 10: Reporting is important

Customers like to know the status of their GG replication. Take time


to build some custom reports:
• Golden Gate availability
• Golden Gate lag (latency) => use the Heartbeat tables
• Golden Gate table stats => use GV$GOLDENGATE_TABLE_STATS

Do not promise the impossible! For real up to date reporting,


products like GoldenGate Veridata need to be purchased.

125 1-6-2018

Lessons Learned 11: develop GG standards

DBA and developers should be aware of the consequences of


GoldenGate and adopt their standards:
• No NOLOGGING operations allowed
• No Transportable Tablespace methods allowed
• Always create tables with PMK or Unique Key
• Make sure all new objects within the replicated schema(s) are
supported by GoldenGate
• Do not use objects that are not supported by GoldenGate, like AQ$
tables and IDENTITY columns
• Name your constraints and indexes, avoid system generated names.
• Etc.

126 1-6-2018

63
01/06/2018

End of Presentation

127 Agenda 1-6-2018

64

You might also like