Nothing Special   »   [go: up one dir, main page]

Transformer Stage To Filter The Data

Download as docx, pdf, or txt
Download as docx, pdf, or txt
You are on page 1of 7

TRANSFORMER STAGE TO FILTER THE DATA

If our requirement is to filter the data department wise from the file below
samp_tabl
1,sam,clerck,10
2,tom,developer,20
3,jim,clerck,10
4,don,tester,30
5,zeera,developer,20
6,varun,clerck,10
7,luti,production,40
8,raja,priduction,40

And our requirement is to get the target data as below
In Target1 we need 10th & 40th dept employees.
In Target2 we need 30th dept employees.
In Target3 we need 20th & 40th dept employees.
Read and Load the data in Source file
In Transformer Stage just Drag and Drop the data to the target tables.
Write expression in constraints as below
dept_no=10 or dept_no= 40 for table 1
dept_no=30 for table 1
dept_no=20 or dept_no= 40 for table 1
Click ok

Give file name at the target file and
Compile and Run the Job to get the Output

TRANSFORMER STAGE USING STRIPWHITESPACES FUNCTION
Stripwhitespaces is the function used for the remove before,after and middle of the characters.
Sometimes we get the data as below
e_id,e_name
10,em y
20, j ul y
30,re v o l
40,w a go n
Take Job Design as
Se.File ------ Tx------D.s
Read and load the data in Sequential file stage
Go to Transformer stage
Here, we use stripwhitespaces function in the required column derivation.
You can write expression as below
Stripwhitespaces(e_name) for e_name
Click ok
Compile and Run the data
You will get the data after removal of all the spaces between the characters,
before and after spaces also.

TRANSFORMER STAGE USING PADSTRING FUNCTION
Padstring is a function used to padding the data after the string.
If we have a data as below
Table_1
e_id,e_name
10,emy
20,july
30,revol
40,wagon
(Remember to give gap between the words to understand the Padstring function)
Take Job Design as
Seq.File------------Tx--------------D.s
Read and load the data in sequential file.
Now Go to the Transformer stage, here in required column derivation write your
expression as below
padstring(e_name,'@',5) for e_name
Here '@' is called padding you want to get after the data
5 is the padlength Now click ok
Give file name at the target file
Compile and Run the Job

CONCATENATE DATA USING TRANSFORMER STAGE
If we have a Table as below
e_id,e_name,e_job,e_Sal
1,sam,clerck,2000
2,tim,salesman,2100
3,ram,clerck,1800
4,jam,salesman,2000
5,emy,clerck,2500

Read and Load the data in sequential file
In Transformer stage Create one column as Total_one
In derviation you can write expression as below
click ok
Give File name in the target file
Compile and Run the Job
That's it


FIELD FUNCTION IN TRANSFORMER STAGE
Sometimes we get all the columns in single column like below
xyztable
1,sam,clerck,2000
2,tim,salesman,2100
3,pom,clerck,1800
4,jam,pa,1900
5,emy,clerck,2500

Read and Load the data in sequential file stage.
In Transformer Stage Create Columns as
e_id, e_name,e_job,e_sal
and in all derivations write as below
Field(dslink3.xyztable,',',1) for e_id
Field(dslink3.xyztable,',',2) for e_name
Field(dslink3.xyztable,',',3) for e_job
Field(dslink3.xyztable,',',4) for e_Sal
Give File name at the target
Compile and Run the Job
That's it you will get the 4 columns with required data.

TRANSFORMER STAGE WITH SIMPLE EXAMPLE
If we have a data as below
x_Comp
e_id,e_name,s_1,s_2,s_3,s_4,s_5
100,kelvin,35,40,50,49,60
200,rudd,40,80,60,55,56
300,emy,65,50,35,45,60
400,lin,30,45,60,60,55
500,jim,34,40,60,70,55
We are going to find Total_Score and Percentage using Transformer Stage
Take Job Design as
Seq.File--------------Tx ----------------------D.s
Read and load the data in Seq.File
In Transformer Stage Drag and Drop all the columns
Create two columns as Total_Score and Percentage
In total_score derivation write expression as

s_1+s_2+s_3+S_4+s_5

In Percentage Derivation write expression as
(s_1+s_2+s_3+s_4+s_5)/500 *100 Click OK
Give File name to the target file
Compile and Run the Job
CONVERT ROWS INTO COLUMNS USING SORTING AND TRANSFORMER STAGE
If you have Some Data like below to convert rows into the columns
xyz_comp
e_id,e_name,e_add
100,jam,chicago
200,sam,newyork
300,tom,washington
400,jam,indiana
500,sam,sanfransico
600,jam,dellas
700,tom,dellas

Take Job Design as
Seq.File----Sort-----Tx-----R.d-----D.s
Tx- Transformer stage
R.D- Remove Duplicates Stage
Here we are taking remove duplicate stage, in order to remove duplicates after getting the output.
Read and Load the Data in Sequential file stage .
In Sort Stage Select Key column as e_name
and select key change column as True
In output Drag and Drop all the Columns
Go to Transformer stage Create two stage variables as Temp and Add
Map the key change to temp and
in add derivation write expression as
If temp=1 then e_add else add:',':e_add
Than create one column in output table as hist_add
Now Drag and Drop the Add(From Stage Varable ) to Hist_Add (Output Column )
That's it Click ok
In Remove Duplicate stage Select key column as e_add than
Select Duplicate to retain as last and click ok
Give File name to the target file
Compile and Run the Job.

TRANSFORMER STAGE FOR DEPARTMENT WISE DATA

In order to get the data according to department wise.
And if we have the data as below
a_comp ( Table name )
e_id,e_name,e_job,dept_no
100,rocky,clerck,10
200,jammy,sales,20
300,tom,clerck,10
400,larens,clerck,10
500,wagon,sales,20
600,lara,manager,30
700,emy,clerck,10
800,mary,sales,20
900,veer,manager,30

And have three targets. Our requirement is as below
In 1st target, we need a 10th and 20th department records
In 2nd Target, we need a 30th department records
In 3rd Target, We need a 10th and 30th department records
You can take Job design as below
Read and Load the data in Sequential File
Go to Transformer Stage,
Just Drag and Drop all the columns in to the three Targets.
In 1sT Constraint write expression as,
dept_no=10 or dept_no=20
In 2nd constraint write expression as,
dept_no=30
In 3rd Constraint write expression as,
dept_no=10 or dept_no=30
click ok
Give file names in all the targets.
Compile and run the jobs.

HOW TO CONVERT ROWS INTO THE COLUMNS IN DATASTAGE
If we have some customers information with different address as below.
mult_add
e_id,e_name,e_add
10,john,melbourne
20,smith,canberra
10,john,sydney
30,rockey,perth
10,john,perth
20,smith,towand
If w like to get all multiple addresses of the customer into one single row from multiple rows. .
We can perform this using Sort Stage, Transformer Stage and Remove Duplicate Stage
Take Job Design as below

SeqFile----Sort-----Tx----R.D----D.S
Read and load the data in Seq.File
In Sort Stage Select key column and select Key change = True to generate group id's
In Transformer stage Create one Stage variable and select name as temporary
and Write expression for that as
If keychange=1 then s_add else temporary:",": s_add And click ok
Go to Remove dupilcates and select last in properties and select key column
to remove dupilicates ( You can select address column here )
That's it compile and run the job. You will get the required output.

SORT STAGE AND TRANSFORMER STAGE WITH SAMPLE DATA EXAMPLE
If we have some customers information as below.
cust_info
c_id,c_name,c_plan
11,smith,25
22,james,30
33,kelvin,30
22,james,35
11,smith,30
44,wagon,30
55,ian,25
22,james,40
We can see the customers information and there mobile plans ( for example)
If we like to find lowest plan taken by all customers
Take Job Design as

Seq.File--------Sort------Tx-----------------D.s
Read and Load the data in Sequential file
In Sort Stage select Key Change =True to generate group id
In Transformer Stage write Key Change=1 in Constraint
Write File name for Target D.S File
Compile and Runt the Job
You Get the Output as required
Lowest plans of the customers.

FIELD FUNCTION IN TRANSFORMER STAGE WITH EXAMPLE
Sometimes we get the data as below
Customers
1,tommy,2000
2,sam,2300
3,margaret,2000
4,pinky,1900
5,sheela,2000
Take Job Design as

Seq.File ------- Tx ------ Ds
Read and load the data in Seq.file
Select first line is column name
And in Transformer stage Create three columns to get the data
You can take columns names as c_id,c_name,c_sal with respective data types.
Write the expression in Derivations to the columns as below
Field (dslink3.customers,',',1) for c_id
Field (dslink3.customers,',',2) for c_name
Field (dslink3.customers,',',3) for c_sal
That's it you will get the data in 3 different columns in the output as required.
After compile and Run the Job.





RIGHT AND LEFT FUNCTIONS IN TRANSFORMER STAGE WITH EXAMPLE
For example some times we get the data from warehouse as below
This is just a sample example data
Customers
1 vanitha 2000
2 ramesh 2300
3 naresh 2100
4 kiran 1900
5 sunitha 2000
They are exactly straight. They just have spaces in between the data.
Our requirement is to get the data into the three different columns from
single column.
Here The data is customers is the column name we are getting and we have only
single column.
Now Take Job Design as below

Seq.File------------Tx-------------Ds
Read the data in Seq.file
and dont forget to tick first line is column name.
In Transformer stage Create 3 columns and write the expressions in derivations.
Create Columns as c_id , c_name, c_sal
You can create the names as your wish.
Expressions for three columns are
left(dslink3.customers,1) for c_id
right(left(dslink3.customers,8),7) for c_name
right(dslink3.customers,4) for c_sal
That's it
Give name for the file in the Target.
Now Compile and Run the Job.
You will get the Output as required

You might also like