DM 1205datastageopsdb1 PDF
DM 1205datastageopsdb1 PDF
DM 1205datastageopsdb1 PDF
17 May 2012
Introduction
The DataStage and QualityStage Operations Database (DSODB) is designed as a relational
schema whose tables contain information about job runs and system resources used on a system
Copyright IBM Corporation 2012
Get started with the IBM InfoSphere DataStage and QualityStage
Operations Console Database, Part 1: An introduction
Trademarks
Page 1 of 22
developerWorks
ibm.com/developerWorks/
that has a DataStage engine installed. This information is used to drive the Operations Console,
but can also be queried directly given some knowledge of the key columns of the tables and the
relationships among them.
This article describes the main tables and views of the DSODB, including all columns used
as primary or natural keys, and all foreign key relationships defined. It gives examples of SQL
SELECT statements, which can be used to answer specific questions based on the information
held by DSODB.
Online reference material for DSODB 8.7, including descriptions of all current tables and columns,
can be found in the schema document reference in the Resources section.
Table diagrams
Some pictures are included to illustrate the relationships between various tables. Each box
represents a table or view. Relevant column names are shown, with icons against each column to
show their use as follows (Figure 1).
Note that only the columns of each table that are mentioned in the text of this article are shown.
For a complete list of table columns, consult the schema document reference in the Resources
section.
The arrows between tables show the direction of relationships established by foreign keys and
SQL views, as shown next (Figure 2).
Page 2 of 22
ibm.com/developerWorks/
developerWorks
Each row in the HOST table represents a system on which all or part of an InfoSphere Information
Server engine is installed. The HOSTID column is a surrogate primary key; The HostName column
is the system name as derived from the hostname operating system command or the HA_ALIAS
environment variable, if that is set in the environment of the EngMonApp process when it starts.
For systems acting as a conductor node and on which an instance of the EngMonApp process
is running, the InstallationDir column is set to a path name. This is the home directory for the
Information Server engine installation, such as /opt/IBM/InformationServer/Server/DSEngine. The
key of the HOSTS table is the combination of HostName and InstallationDir, which allows for more
than one engine to be installed on the same host system.
The CreatedTimestamp column is the UTC time at which EngMonApp inserted the row. Each
time a ResMonApp instance starts, it looks to see if there is a row in the table with the values of
HostName and InstallationDir it requires, and if so, it updates the MonStartTimestamp column;
otherwise, it creates a new row. ResMonApp also creates new HOST entries, as required.
Remote nodes have InstallationDir set to a single hyphen character, '-'. These are systems whose
HostNames have been defined in the DSODBConfig.cfg file via the ResourceNode property. These
rows have a CreatedTimestamp, but no MonStartTimestamp entry.
Cascading delete constraints are set so that deleting a row in HOST removes all rows that point
back to it. Therefore, records of any jobs that ran on that system and all resource usage records
are deleted at the same time.
Page 3 of 22
developerWorks
ibm.com/developerWorks/
Page 4 of 22
ibm.com/developerWorks/
developerWorks
Cascading delete constraints are set so that deleting an entry in JOBEXEC removes all records to
do with any runs of that version of the job.
Page 5 of 22
developerWorks
ibm.com/developerWorks/
specific status for the run (see the schema document reference in the Resources section for all
possible values).
Cascading delete constraints are set so that deleting an entry in JOBRUN deletes all related
details of the run (parameters, logs, metrics, etc.). If the run started other runs (if it was a job
sequence, for example), all those runs are also deleted, along with their details.
Page 6 of 22
ibm.com/developerWorks/
developerWorks
Constraints are set so that deleting an entry in JOBRUN deletes any corresponding entry in
JOBRUNPARAMS.
Page 7 of 22
developerWorks
ibm.com/developerWorks/
Constraints are set so that deleting an entry in JOBRUN deletes all corresponding entries in
JOBRUNLOG.
Page 8 of 22
ibm.com/developerWorks/
developerWorks
Constraints are set so that deleting a row in HOST deletes all PARALLELCONFIG entries
generated by runs on that host (and all the runs are deleted at the same time). However, deleting
entries in JOBRUN does not delete entries in PARALLELCONFIG; they might be linked to by a
later run of another job on that host.
Page 9 of 22
developerWorks
ibm.com/developerWorks/
Page 10 of 22
ibm.com/developerWorks/
developerWorks
Page 11 of 22
developerWorks
ibm.com/developerWorks/
separated string of numbers rather than a single value. But it is easier to select the TotalCPU
column, which gives the overall value by summing entries in the list, if necessary.
Page 12 of 22
ibm.com/developerWorks/
developerWorks
Usage
ComputerName
SoftwareProductName
DataStoreSubClass
DataStoreName
DataSchemaSubClass
DataSchemaName
DataCollectionSubClass
DataCollectionName
Figure 7. Key columns and relationships between the HOST and HOSTDETAIL
tables
Page 13 of 22
developerWorks
ibm.com/developerWorks/
Page 14 of 22
ibm.com/developerWorks/
developerWorks
Page 15 of 22
developerWorks
ibm.com/developerWorks/
Page 16 of 22
ibm.com/developerWorks/
developerWorks
Table
Column
Page 17 of 22
developerWorks
ibm.com/developerWorks/
RUNMAJORSTATUSREF
JOBRUN
RunMajorStatus
RUNMINORSTATUSREF
JOBRUN
RunMinorStatus
RUNTYPEREF
JOBRUN
RunType
JOBTYPEREF
JOBEXEC
JobType
LOGTYPEREF
JOBRUNLOG
LogType
STAGESTATUSREF
JOBRUNSTAGE
StageStatus
LINKTYPEREF
JOBLINK
LinkType
For example, if a row of the JOBRUN table contains "FIN" in the RunMajorStatus column, this can
be looked up via the RUNMAJORSTATUSREF view as Name = "Finished" and Description = "Run
has finished". A RunMinorStatus of "FWF" can be looked up via RUNMINORSTATUSREF as Name
= "Finished aborted" and Description = "Run has finished and logged at least one fatal
message".
Listing 18. SQL Example: Querying the JOBRUN table using lookups against
reference views
-- List the job names , job types, run types, and minor status names
-of all runs on host H that have finished.
SELECT
X.ProjectName, X.JobName,
JT.JobTypeName, RT.RunTypeName,
R.RunStartTimestamp, R.RunEndTimestamp,
SR.MinorStatusName
FROM
DSODB.JOBRUN AS R
JOIN DSODB.JOBEXEC AS X
ON R.JOBID = X.JOBID
JOIN DSODB.HOST AS H
ON X.HOSTID = H.HOSTID
JOIN DSODB.JOBTYPEREF AS JT
ON X.JobType = JT.JobTypeCode
JOIN DSODB.RUNTYPEREF AS RT
ON R.RunType = RT.RunTypeCode
JOIN DSODB.RUNMINORSTATUSREF AS SR ON R.RunMinorStatus = SR.MinorStatusCode
WHERE
H.HostName = "H"
AND R.RunMajorStatus = "FIN"
ORDER BY R.RunStartTimestamp
Conclusion
We have examined how the tables that make up the Operations Database schema contain
rows that describe job runs, host systems, and system resource usage. We have described the
identifying key fields for the main tables, some of the other important fields, and how foreign
key fields are used to join other tables to them. We have given several examples of real-world
questions that can be answered by appropriate use of these fields.
Acknowledgements
Thanks go to the members of the DataStage Operations Console development team who ensured
that the schema got ported to the various flavors of database supported: DB2 LUW 9.5 and 9.7,
Oracle 10g and 11gR2, Microsoft SQL Server 2005 and 2008, on the various Windows and
UNIX platforms.
Get started with the IBM InfoSphere DataStage and QualityStage
Operations Console Database, Part 1: An introduction
Page 18 of 22
ibm.com/developerWorks/
developerWorks
Thanks also to the QA team who patiently checked that data could indeed be inserted and
retrieved correctly from all those variants.
Page 19 of 22
developerWorks
ibm.com/developerWorks/
Resources
Learn
The operations database schema is documented on the IBM InfoCenter.
IBM InfoSphere Information Server 8.7 What's New whitepaper, containing details of the
Operations Console.
IBM DB2 9.7 for Linux, UNIX, and Windows Information Center and SQL reference.
Learn more about Information Management at the developerWorks Information Management
zone. Find technical documentation, how-to articles, education, downloads, product
information, and more.
Stay current with developerWorks technical events and webcasts.
Follow developerWorks on Twitter.
Get products and technologies
Build your next development project with IBM trial software, available for download directly
from developerWorks.
Now you can use DB2 for free. Download DB2 Express-C, a no-charge version of DB2
Express Edition for the community that offers the same core data features as DB2 Express
Edition and provides a solid base to build and deploy applications.
Discuss
Check out the developerWorks blogs and get involved in the developerWorks community.
Page 20 of 22
ibm.com/developerWorks/
developerWorks
Arron Harden
Arron Harden is a senior software engineer for IBM InfoSphere DataStage and
QualityStage. Staying with the DataStage product after several mergers and
acquisitions, he has worked on DataStage for more than 12 years, joining IBM
through the acquisition of Ascential Software Inc in 2005. Having spent a year
working in Boston, he is currently based in the United Kingdom, working at the IBM
Milton Keynes office. In his most recent role, he was the lead developer for the web
application component of the DataStage and QualityStage Operations Console,
written using the Dojo toolkit.
Geoff McClean
Geoff McClean was on the original DataStage development team at its inception and
is currently a senior software developer for core components of the IBM InfoSphere
DataStage and QualityStage development and production tools, part of the IBM
InfoSphere Information Server suite. He oversaw the implementation of the database
management, event handling, and resource tracking services of the IBM InfoSphere
DataStage and QualityStage Operations Console.
Sumit Kumar
Sumit Kumar has 13 years of industry experience, including the financial, banking,
telecom, supply-chain management, insurance, and healthcare domains. He has
worked with IBM InfoSphere Information Server from early 2010, including the
Operations Console feature of Information Server, from the beginning of its design
Page 21 of 22
developerWorks
ibm.com/developerWorks/
Page 22 of 22