Showing posts with label TALEND. Show all posts

Tuesday, 5 June 2018

TAC : Scheduling Using CRON TRIGGER with EXAMPLE SCREENSHOTS


* Fire at 10:15am on EVERY DAY of every month during the years 2002, 2003, 2004 and 2005



* Fire at 12pm (noon) every day


* Fire at 11:55pm every Monday, Tuesday, Wednesday, Thursday and Friday


Wednesday, 25 April 2018

Handling NULL POINTER EXCEPTION

1ST WAY USAGE:
if( (Relational.ISNULL(test_string)) || ("null".equals(test_string)) || ("".equals(test_string)) )
    System.out.println("HAS GOT A NULL VALUE ");
else
    System.out.println("HAS GOT A VALID VALUE ");


2ND WAY USAGE:

( (Relational.ISNULL(test_string)) || ("null".equals(test_string)) || ("".equals(test_string))   )
? "HAVING NULL" :  "HAVING DATA"


=======================================================
Below is the Tested Talend code in DETAIL with various ways of NULL data:
=======================================================

String test_string="TALEND";

System.out.println("\ntest_string=\"TALEND\";");
if( (Relational.ISNULL(test_string)) || ("null".equals(test_string)) || ("".equals(test_string)) )
    System.out.println("HAS GOT A NULL VALUE ");
else
    System.out.println("HAS GOT A VALID VALUE ");

test_string="null";

System.out.println("\ntest_string=\"null\";");
if( (Relational.ISNULL(test_string)) || ("null".equals(test_string)) || ("".equals(test_string)) )
    System.out.println("HAS GOT A NULL VALUE ");
else
    System.out.println("HAS GOT A VALID VALUE ");

test_string="";

System.out.println("\ntest_string=\"\";");
if( (Relational.ISNULL(test_string)) || ("null".equals(test_string)) || ("".equals(test_string)) )
    System.out.println("HAS GOT A NULL VALUE ");
else
    System.out.println("HAS GOT A VALID VALUE ");

test_string=null;

System.out.println("\ntest_string=null;");
if( (Relational.ISNULL(test_string)) || ("null".equals(test_string)) || ("".equals(test_string)) )
    System.out.println("HAS GOT A NULL VALUE ");
else
    System.out.println("HAS GOT A VALID VALUE ");
     

Relational.ISNULL() USAGE WITH RESULTS

System.out.println("Relational.ISNULL(null)      : "+Relational.ISNULL(null));

System.out.println("Relational.ISNULL(\"null\")   :  "+Relational.ISNULL("null"));

System.out.println("Relational.ISNULL(\"\")        :"+Relational.ISNULL(""));

System.out.println("Relational.ISNULL(false)     :  "+Relational.ISNULL(false));

System.out.println("Relational.ISNULL(true)     :  "+Relational.ISNULL(true));

============
OUTPUT :
===========
Relational.ISNULL(null)      : true
Relational.ISNULL("null")   :  false
Relational.ISNULL("")        :false
Relational.ISNULL(false)     :  false
Relational.ISNULL(true)     :  false
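
The same three-part check can be tried outside Talend as well. Below is a minimal sketch in plain Java, assuming that (s == null) behaves like Relational.ISNULL(s) for a String, as the output above suggests. The names NullCheck and isMissing are hypothetical and are not part of any Talend routine.

```java
// Plain-Java stand-in for the Talend check above.
// NullCheck and isMissing are hypothetical names; (s == null) is assumed
// to be equivalent to Relational.ISNULL(s) for String values.
class NullCheck {
    // True for a real null, the text "null", or an empty string.
    static boolean isMissing(String s) {
        return s == null || "null".equals(s) || "".equals(s);
    }

    public static void main(String[] args) {
        String[] samples = { "TALEND", "null", "", null };
        for (String s : samples) {
            System.out.println(s + " -> " + (isMissing(s) ? "HAVING NULL" : "HAVING DATA"));
        }
    }
}
```

Keeping the check in one method avoids repeating the long condition in every tMap or tJavaRow expression.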

TALEND CURRENT DATE FORMAT

Various ways of getting Current Date :
===============================


TalendDate.getDate("yyyy-MM-dd HH:mm:ss")

new Date()

TalendDate.getDate("yyyy-MM-dd'T'HH:mm:ss")
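
To see what the two patterns above produce, here is a small sketch in plain Java. It uses java.text.SimpleDateFormat as a stand-in for TalendDate.getDate(), on the assumption that both read these pattern letters the same way. CurrentDateDemo and format are hypothetical names.

```java
import java.text.SimpleDateFormat;
import java.util.Date;

// Stand-in for TalendDate.getDate(pattern); CurrentDateDemo is a hypothetical name.
class CurrentDateDemo {
    // Formats the given instant with a SimpleDateFormat pattern.
    static String format(Date when, String pattern) {
        return new SimpleDateFormat(pattern).format(when);
    }

    public static void main(String[] args) {
        Date now = new Date();
        // Space between date and time
        System.out.println(format(now, "yyyy-MM-dd HH:mm:ss"));
        // Literal 'T' between date and time (quoted so it is not read as a pattern letter)
        System.out.println(format(now, "yyyy-MM-dd'T'HH:mm:ss"));
    }
}
```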

Exception in component tDecryptColumn_1

Exception in component tDecryptColumn_1
javax.crypto.IllegalBlockSizeException: Input length must be multiple of 16 when decrypting with padded cipher
at com.sun.crypto.provider.CipherCore.doFinal(CipherCore.java:922)
at com.sun.crypto.provider.CipherCore.doFinal(CipherCore.java:833)
at com.sun.crypto.provider.AESCipher.engineDoFinal(AESCipher.java:446)
at javax.crypto.Cipher.doFinal(Cipher.java:2165)
at unified_platform_git.test_ram_0_1.TEST_RAM.tFixedFlowInput_8Process(TEST_RAM.java:2937)
at unified_platform_git.test_ram_0_1.TEST_RAM.runJobInTOS(TEST_RAM.java:5782)
at unified_platform_git.test_ram_0_1.TEST_RAM.main(TEST_RAM.java:5201)


Sol : The column sequence (in Edit Schema) of the tDecryptColumn_1 component must be the same as the column sequence of the preceding component.

Eg : Say the SCHEMA of the tDecryptColumn_1 component has the columns in the sequence name,class,sharedsecret,password.


Say we have used an XMLMAP component just before tDecryptColumn_1 whose
column sequence (in Edit Schema) is name,class,password,sharedsecret.

Then we will get the above error.
To avoid the error, we just need to change the sequence of columns as below.

name,class,sharedsecret,password  in  EDIT SCHEMA of XMLMAP component.
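
One plausible reading of the error is that, with the columns out of order, a value that was never encrypted is handed to the cipher. The sketch below reproduces the same exception type in plain Java. It assumes AES with ECB mode and PKCS5 padding and uses a made-up 16-byte key; the real settings of tDecryptColumn are not known here. BlockSizeDemo is a hypothetical name.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.IllegalBlockSizeException;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical demo class; cipher mode, padding and key are assumptions.
class BlockSizeDemo {
    // Returns true when decrypting the input fails with IllegalBlockSizeException.
    static boolean failsWithBlockSizeError(byte[] input) {
        try {
            // Illustrative 16-byte key, not a real secret.
            SecretKeySpec key = new SecretKeySpec(
                    "0123456789abcdef".getBytes(StandardCharsets.US_ASCII), "AES");
            Cipher cipher = Cipher.getInstance("AES/ECB/PKCS5Padding");
            cipher.init(Cipher.DECRYPT_MODE, key);
            cipher.doFinal(input);
            return false;
        } catch (IllegalBlockSizeException e) {
            // Input length was not a multiple of the 16-byte AES block size.
            return true;
        } catch (GeneralSecurityException e) {
            // Any other crypto failure (for example bad padding) is a different problem.
            return false;
        }
    }

    public static void main(String[] args) {
        // A 10-byte plain value, such as a name column that was never encrypted.
        byte[] plainValue = "plain-text".getBytes(StandardCharsets.US_ASCII);
        System.out.println("Block size error: " + failsWithBlockSizeError(plainValue));
    }
}
```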

Saturday, 20 January 2018

TALEND ERROR : DEBUGGING/TROUBLESHOOTING in TALEND


Sometimes Talend (JAVA) seems to work weirdly (to us; actually it works correctly, by its nature).

It gives an ERROR specifying a COMPONENT name.
You go to that COMPONENT and check the code.
All of the code looks 100% correct.
Even so, it keeps giving the ERROR in that COMPONENT only.

In this case, how are we going to HANDLE it?


SOL:

For Example :

A Talend job consists of 3 flows.
Say each flow consists of 4 to 5 components.

In order to debug any kind of ERROR, just follow the BASIC rule of thumb below.
Almost every time, you will find EXACTLY where the job is failing.

You just have to DISABLE/DE-ACTIVATE the COMPONENTS one by one, from the STARTING COMPONENT up to the COMPONENT where it is giving the ERROR.

MOSTLY, the ERROR comes from the component just BEFORE the ERRORED component.



TALEND/TAC ERROR : CONNECTION TO THE SERVER FAILED in TAC (Talend Administration Center)

0) In the SERVERS section of the left-hand pane, check whether the corresponding JOB SERVER is in RUNNING state.
If the JOBSERVER is not RUNNING, then RESTART the VM in which TAC is installed.

1) The job you deployed in TAC might have been corrupted.
    Hence BUILD the Talend job again and re-deploy the job in TAC.

2) Check the HARD DISK capacity of the VM in which TAC is installed.
   If the free HARD DISK space is very LOW, delete some of the UNNECESSARY data and try to RE-DEPLOY the job.





Tuesday, 16 May 2017

ADDING CRON TRIGGER OR SCHEDULING JOB IN TALEND OR TAC

For those unfamiliar with "cron", this means being able to create a firing schedule such as: "At 8:00am every Monday through Friday" or "At 1:30am every last Friday of the month".

A "Cron-Expression" is a string comprised of 6 or 7 fields separated by white space. The 6 mandatory and 1 optional fields are as follows:

Field Name      | Allowed Values    | Allowed Special Characters
Seconds         | 0-59              | , - * /
Minutes         | 0-59              | , - * /
Hours           | 0-23              | , - * /
Day-of-month    | 1-31              | , - * ? / L C
Month           | 1-12 or JAN-DEC   | , - * /
Day-of-Week     | 1-7 or SUN-SAT    | , - * ? / L C #
Year (Optional) | empty, 1970-2099  | , - * /

The '*' character is used to specify all values. For example, "*" in the minute field means "every minute".
The '?' character is allowed for the day-of-month and day-of-week fields. It is used to specify 'no specific value'. This is useful when you need to specify something in one of the two fields, but not the other. See the examples below for clarification.
The '-' character is used to specify ranges. For example "10-12" in the hour field means "the hours 10, 11 and 12".
The ',' character is used to specify additional values. For example "MON,WED,FRI" in the day-of-week field means "the days Monday, Wednesday, and Friday".
The '/' character is used to specify increments. For example "0/15" in the seconds field means "the seconds 0, 15, 30, and 45". And "5/15" in the seconds field means "the seconds 5, 20, 35, and 50". You can also specify '/' after the '*' character - in this case '*' is equivalent to having '0' before the '/'.
The 'L' character is allowed for the day-of-month and day-of-week fields. This character is short-hand for "last", but it has different meaning in each of the two fields. For example, the value "L" in the day-of-month field means "the last day of the month" - day 31 for January, day 28 for February on non-leap years. If used in the day-of-week field by itself, it simply means "7" or "SAT". But if used in the day-of-week field after another value, it means "the last xxx day of the month" - for example "6L" means "the last friday of the month". When using the 'L' option, it is important not to specify lists, or ranges of values, as you'll get confusing results.
The '#' character is allowed for the day-of-week field. This character is used to specify "the nth" XXX day of the month. For example, the value of "6#3" in the day-of-week field means the third Friday of the month (day 6 = Friday and "#3" = the 3rd one in the month). Other examples: "2#1" = the first Monday of the month and "4#5" = the fifth Wednesday of the month. Note that if you specify "#5" and there is not 5 of the given day-of-week in the month, then no firing will occur that month.
The 'C' character is allowed for the day-of-month and day-of-week fields. This character is short-hand for "calendar". This means values are calculated against the associated calendar, if any. If no calendar is associated, then it is equivalent to having an all-inclusive calendar. A value of "5C" in the day-of-month field means "the first day included by the calendar on or after the 5th". A value of "1C" in the day-of-week field means "the first day included by the calendar on or after sunday".
The legal characters and the names of months and days of the week are not case sensitive.
Here are some full examples:

Expression Meaning
"0 0 12 * * ?" Fire at 12pm (noon) every day
"0 15 10 ? * *" Fire at 10:15am every day
"0 15 10 * * ?" Fire at 10:15am every day
"0 15 10 * * ? *" Fire at 10:15am every day
"0 15 10 * * ? 2005" Fire at 10:15am every day during the year 2005
"0 * 14 * * ?" Fire every minute starting at 2pm and ending at 2:59pm, every day
"0 0/5 14 * * ?" Fire every 5 minutes starting at 2pm and ending at 2:55pm, every day
"0 0/5 14,18 * * ?" Fire every 5 minutes starting at 2pm and ending at 2:55pm, AND fire every 5 minutes starting at 6pm and ending at 6:55pm, every day
"0 0-5 14 * * ?" Fire every minute starting at 2pm and ending at 2:05pm, every day
"0 10,44 14 ? 3 WED" Fire at 2:10pm and at 2:44pm every Wednesday in the month of March.
"0 15 10 ? * MON-FRI" Fire at 10:15am every Monday, Tuesday, Wednesday, Thursday and Friday
"0 15 10 15 * ?" Fire at 10:15am on the 15th day of every month
"0 15 10 L * ?" Fire at 10:15am on the last day of every month
"0 15 10 ? * 6L" Fire at 10:15am on the last Friday of every month
"0 15 10 * * ? 2002-2005" Fire at 10:15am on EVERY DAY of every month during the years 2002, 2003, 2004 and 2005
"0 15 10 ? * 6#3" Fire at 10:15am on the third Friday of every month
Pay attention to the effects of '?' and '*' in the day-of-week and day-of-month fields!
NOTES:
Support for the features described for the 'C' character is not complete.
Support for specifying both a day-of-week and a day-of-month value is not complete (you'll need to use the '?' character in one of these fields).
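
To make the field layout concrete, below is a minimal sketch in plain Java that splits an expression into the named fields from the table and checks the '?' rule from the notes. It is not the scheduler's own parser and it does not validate the values inside each field. CronFields, split and usesQuestionMarkCorrectly are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; it only labels fields, it does not validate their values.
class CronFields {
    static final String[] NAMES = {
        "Seconds", "Minutes", "Hours", "Day-of-month", "Month", "Day-of-Week", "Year"
    };

    // Splits on white space and labels each field; rejects anything but 6 or 7 fields.
    static Map<String, String> split(String expression) {
        String[] parts = expression.trim().split("\\s+");
        if (parts.length < 6 || parts.length > 7) {
            throw new IllegalArgumentException("Expected 6 or 7 fields, got " + parts.length);
        }
        Map<String, String> fields = new LinkedHashMap<>();
        for (int i = 0; i < parts.length; i++) {
            fields.put(NAMES[i], parts[i]);
        }
        return fields;
    }

    // True when exactly one of day-of-month / day-of-week is '?'.
    static boolean usesQuestionMarkCorrectly(String expression) {
        Map<String, String> f = split(expression);
        return "?".equals(f.get("Day-of-month")) ^ "?".equals(f.get("Day-of-Week"));
    }

    public static void main(String[] args) {
        System.out.println(split("0 15 10 ? * MON-FRI"));
        System.out.println(usesQuestionMarkCorrectly("0 15 10 ? * MON-FRI"));
    }
}
```

A check like this can catch a missing '?' before the expression is typed into the TAC trigger screen.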

Monday, 20 February 2017

TALEND COMPARE DATE FUNCTION EXAMPLES


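1) If the first date is earlier than the second one, it returns -1.
TalendDate.compareDate(TalendDate.parseDate("yyyy-MM-dd","2016-12-01"), TalendDate.parseDate("yyyy-MM-dd","2016-12-20"), "yyyy-MM-dd");

2) If both dates are equal, it returns 0.
TalendDate.compareDate(TalendDate.parseDate("yyyy-MM-dd","2016-12-01"), TalendDate.parseDate("yyyy-MM-dd","2016-12-01"), "yyyy-MM-dd");

3) If the first date is later than the second one, it returns 1. (It can compare partly: only the parts named in the pattern are compared.)
TalendDate.compareDate(TalendDate.parseDate("yyyy-MM-dd","2016-12-15"), TalendDate.parseDate("yyyy-MM-dd","2016-12-01"), "yyyy-MM-dd");

Working Code :
Var.start : TalendDate.parseDate("yyyy-MM-dd","2016-12-01")

Var.End : TalendDate.parseDate("yyyy-MM-dd","2016-12-01")

TalendDate.compareDate(Var.start,Var.End,"yyyy-MM-dd");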

COMPARE DATE () SUMMARY :

Date1 < Date2    :         Returns -1
Date1 = Date 2     :         Returns 0
Date1> Date 2    :         Returns 1
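
The "compare partly" behaviour can be sketched in plain Java. The code below is not Talend's implementation; it assumes the pattern argument works by cutting both dates down to the parts named in the pattern before comparing. CompareDateDemo, parse and compare are hypothetical names.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

// Hypothetical stand-in for TalendDate.parseDate / TalendDate.compareDate.
class CompareDateDemo {
    // Parses text with the given pattern, failing loudly on a mismatch.
    static Date parse(String pattern, String text) {
        try {
            return new SimpleDateFormat(pattern).parse(text);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Cannot parse '" + text + "' with " + pattern, e);
        }
    }

    // Truncates both dates to the parts named in the pattern, then returns -1, 0 or 1.
    static int compare(Date d1, Date d2, String pattern) {
        SimpleDateFormat sdf = new SimpleDateFormat(pattern);
        Date a = parse(pattern, sdf.format(d1));
        Date b = parse(pattern, sdf.format(d2));
        return Integer.signum(a.compareTo(b));
    }

    public static void main(String[] args) {
        Date morning = parse("yyyy-MM-dd HH:mm", "2016-12-01 08:00");
        Date evening = parse("yyyy-MM-dd HH:mm", "2016-12-01 17:30");
        // Same day, so the day-level comparison ignores the time part.
        System.out.println(compare(morning, evening, "yyyy-MM-dd"));
        // With the time included, the morning value is earlier.
        System.out.println(compare(morning, evening, "yyyy-MM-dd HH:mm"));
    }
}
```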


Thursday, 19 January 2017

tFileList Component Properties in TALEND


***tFileList_2_CURRENT_FILEDIRECTORY will give the value as below.
\\RemoteVM\workspace


***tFileList_2_CURRENT_FILEPATH will give the value as below.
\\RemoteVM\workspace\svc_txn_2017-01-1817-34-01.zip


***tFileList_2_CURRENT_FILE will give the value as below.
svc_txn_2017-01-1817-34-01.zip

SFTP not Working In TALEND?

When SFTP is not working, copy the content below into the PORT property of the FTP connection component (tFTPConnection_1 in this example).

Sol :

"context.SFTP_PORT); session_tFTPConnection_1.setConfig(\"PreferredAuthentications\",\"publickey,keyboard-interactive,password\"",

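context.SFTP_PORT contains the value of the PORT (22 for SFTP). The rest of the value closes the generated port argument and adds a setConfig call that sets the preferred authentication methods on the session.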

Thursday, 12 January 2017

Running Multiple TALEND Versions on PC

Scenario :  Want to Run Talend 5 and Talend 6 Versions on PC.

Sol :  Talend 5 uses JAVA 7
          Talend 6 uses JAVA 8.

1)Download JAVA 7 and JAVA 8 versions.

2)Install JAVA 7

3)Install JAVA 8

4)Copy the contents of C:\Program Files\Java\jre1.8.0_112 into some other folder say like C:\Talend\jre1.8.0_112

5)Copy the contents of C:\Program Files\Java\jre7 into some other folder say like C:\Talend\jre7\

6)Uninstall JAVA 7 from install/uninstall programs of Control Panel.

7) I have got TALEND 5 in the below folder.
     C:\TALEND5\
  TALEND 6 in the below folder
  C:\TALEND6\

8)
Changing the .ini file for TALEND 5

On a 64-bit Windows machine, the Talend Studio configuration settings file is as below.

C:\TALEND5\APM_Connect-Studio-win-x86_64.ini

Open it and modify as below.

-vm
C:\Talend\jre7\bin\javaw.exe
-vmargs
-Xms2048m
-Xmx5120m
-XX:MaxPermSize=512m
-Dfile.encoding=UTF-8


Changing the .ini file for TALEND 6

On a 64-bit Windows machine, the Talend Studio configuration settings file is as below.

C:\TALEND6\APM_Connect-Studio-win-x86_64.ini

Open it and modify as below.

-vm
C:\Talend\jre1.8.0_112\bin\javaw.exe
-vmargs
-Xms2048m
-Xmx5120m
-Dfile.encoding=UTF-8


Note : As JAVA 8 is still installed on the machine, other applications that use JAVA will continue to work.


Wednesday, 9 November 2016

Talend - Out of Memory Error and Java Heap Space Error

The Out of Memory Error and Java Heap Space Error are two of the usual errors which occur in the Talend jobs handling a large volume of data. These errors can be avoided to an extent by following some design guidelines.

(1) Keep in mind that tMap is a heavy component. Minimize its use in your jobs.

  • Avoid tMap if you need just simple transformations like trimming the string values, replacing null numbers by zeroes, etc. In its place you can use tJavaRow component.
  • If you want to get only a small set of columns from a huge collection avoid using a tMap. For that you can use a lighter component- tFilterColumns
  • Similarly, to filter rows you can use tFilterRow instead of a tMap
(2) Use store on disk option whenever necessary.
          This option is available in tMap, tUniqRow, tSortRow, etc.
  • tMap
While using store on disk option in tMap the directory to store temporary data will be created automatically. This data will not be deleted or replaced on subsequent run(s) of the job. So it is advised to delete the temporary directory created using tFileDelete component from within the job. You can give that in On Subjob Ok of tPostJob component.

  • tUniqRow
In the case of tUniqRow the temporary directory should be created manually before the job run/or can be handled within the job. If the temporary directory is not available, the component tUniqRow will give out FileNotFoundException!

  • tSortRow
In the case of tSortRow the temporary directory will be created automatically


(3) The JVM arguments can be modified as and when needed.
-Xms256M - initial memory size available to JVM is 256 MB
-Xmx1024M - maximum memory size available to JVM is 1024 MB
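
To check which limits the JVM actually picked up, a small sketch like the one below can be run with the same arguments. Runtime.maxMemory() reports roughly the -Xmx value, and it can be slightly lower depending on the JVM. HeapInfo and toMegabytes are hypothetical names.

```java
// Hypothetical helper that prints the heap limits of the running JVM.
class HeapInfo {
    // Converts a byte count to whole megabytes.
    static long toMegabytes(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // Upper limit the heap may grow to (driven by -Xmx)
        System.out.println("Max heap (MB): " + toMegabytes(rt.maxMemory()));
        // Heap currently reserved by the JVM (starts near -Xms)
        System.out.println("Current heap (MB): " + toMegabytes(rt.totalMemory()));
    }
}
```

The two println lines can also be pasted into a tJava component at the start of a job to confirm that the job received the arguments you set.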


TALEND Installation Instructions

A)Pre-requisites before Talend Installation :

1)      A Talend User needs to be created in order to access the PROJECTS in the REMOTE Repository.
For a Local Repository, you can ignore this.

2)       Need to have Java  JDK  V1.6 or higher

Oracle JDK Setup Details :

1)      Install Oracle JDK 1.6 or Higher. You will need the JDK, not the JRE to develop jobs using the Talend 6 studio. You can download and install this here:

Talend Installer automatically Checks for JDK version 1.6 or higher version. If no instance of JDK is found, the installer will shutdown.

2)Set up your JAVA_HOME environment Path Variable.

Define your JAVA_HOME environment variable so that it points to the JDK directory.
For example, if the JDK path is C:\Java\JDKx.x.x\bin, then you must set the JAVA_HOME environment variable to point to C:\Java\JDKx.x.x.

It is highly recommended that the full path to the server installation directory is as short as possible and does not contain any space character. If you already have a suitable JDK installed in a path with a space, you simply need to put quotes around the path when setting the values for the environment variable.

If you use Talend Installer, you also have to set the Path system variable.

3)Add the previously defined JAVA_HOME variable to the Path environment variable.

For Example : append ;%JAVA_HOME%\bin to the existing value of the Path variable.


Note : You can add the Path variable and  create the JAVA_HOME environment variable in System Variables Section.



Thursday, 8 September 2016

STRING RELATED FUNCTIONS IN TALEND

CONTAINS & EQUALS :

row1.contains("ram")
row2.equals("Name")

LASTINDEXOF :
((String)globalMap.get("tFileList_1_CURRENT_FILE")).lastIndexOf(".")

SUBSTRING:
Input  :  "ABCD".substring(0,2)
Result :   AB

GETTING CURRENT FILEPATH  IN tFILELIST :
((String)globalMap.get("tFileList_1_CURRENT_FILEPATH"))

GETTING CURRENT FILENAME FROM  tFILELIST :
((String)globalMap.get("tFileList_1_CURRENT_FILE")).substring(0,((String)globalMap.get
("tFileList_1_CURRENT_FILE")).lastIndexOf("."))
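
One thing to watch in the expression above: when the file name has no dot, lastIndexOf(".") returns -1 and substring(0,-1) throws a StringIndexOutOfBoundsException. Below is a minimal sketch in plain Java that guards against this. FileNames and baseName are hypothetical names; in a job, the argument would be the tFileList_1_CURRENT_FILE value read from globalMap.

```java
// Hypothetical helper for stripping the extension from a file name.
class FileNames {
    // Returns the name without its last extension; names without a dot come back unchanged.
    static String baseName(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot < 0 ? fileName : fileName.substring(0, dot);
    }

    public static void main(String[] args) {
        System.out.println(baseName("svc_txn_2017-01-1817-34-01.zip"));
        System.out.println(baseName("README"));
    }
}
```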

replace() Example :
-------------------------

we can use this for replacing special characters in a string
String input_row="abcd:[],kdkkxyzkll:[],";


 

String abc="{\"NOTIFICATION_RESPONSE\":{"+input_row.replace(":[],",":\"\"")+"}}";
System.out.println(abc);
Output :  
-----------
{"NOTIFICATION_RESPONSE":{abcd:""kdkkxyzkll:""}}
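Note :  Go for replace() if replaceAll() is not working.
replaceAll() treats its first argument as a regular expression, so special characters such as dollar ($), square brackets ([ ]) or dot (.) are not matched literally unless they are escaped. In such
cases you can go for replace(), which always replaces the literal text.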
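
The difference can be sketched in plain Java as below. The second method shows that replaceAll() can still be used when both arguments are quoted with Pattern.quote() and Matcher.quoteReplacement(). ReplaceDemo, literal and quotedRegex are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo of literal replacement versus regex replacement.
class ReplaceDemo {
    // replace(): the target is always taken as literal text.
    static String literal(String input, String target, String replacement) {
        return input.replace(target, replacement);
    }

    // replaceAll() with both arguments quoted, so regex metacharacters in the
    // target and '$' or '\' in the replacement are taken literally.
    static String quotedRegex(String input, String target, String replacement) {
        return input.replaceAll(Pattern.quote(target), Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        String inputRow = "abcd:[],kdkkxyzkll:[],";
        System.out.println(literal(inputRow, ":[],", ":\"\""));
        System.out.println(quotedRegex(inputRow, ":[],", ":\"\""));
        // Unquoted, the dot is a regex wildcard and matches every character.
        System.out.println("a.b".replaceAll(".", "-"));
    }
}
```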

Friday, 19 August 2016

TALEND/TAC ERROR : TAC, TALEND JOB CONDUCTOR shows the TALEND job in SENDING state when the job is executed...

When a talend job is deployed in TAC  it stores the job in multiple places as below

1)C:\TAC Installed Folder\repository
2)C:\TAC Installed Folder\generated_jobs\task_NUMBER
3)C:\TAC Installed Folder\TalendJobServersFiles\archiveJobs

We have to delete the task_NUMBER folder (NUMBER is generated by TAC once you deploy the job) in the path C:\TAC Installed Folder\generated_jobs\

Once you delete the task folder, the hanging SENDING or DEPLOYING state will be released automatically.


Wednesday, 27 July 2016

LOGIC OF==ALPHABETIC===ALPHANUMERIC===NUMERIC===NEGATIVENO===checking in TALEND

**Reject if the value is not purely numeric, or its length > 5, or it is 0; else Accept

b=myString.matches("[0-9]+") ? ((myString.length() > 5 || myString.matches("0")) ? "false" : "true") : "false";

**AlphaNumeric
myString.matches("[A-Za-z0-9]+")

**Numeric
myString.matches("[0-9]+")

**Negative
myString.matches("-[0-9]+")

**Alphabetic
myString.matches("[A-Za-z]+")


Eg:

/* package whatever; // don't place package name! */

import java.util.*;
import java.lang.*;
import java.io.*;

/* Name of the class has to be "Main" only if the class is public. */
class Ideone
{
    public static void main (String[] args) throws java.lang.Exception
    {
        String myString = "-123456";
        System.out.println(myString.matches("[A-Za-z0-9]+"));

        String a = (myString.matches("[A-Za-z]+")) ? "String1" :
                   (myString.matches("[0-9]+")) ? (myString.length() > 5 ? ">5" : "Good") :
                   (myString.matches("-[0-9]+")) ? "Negative" : "none";

        System.out.println("hi"+a);

        if(myString.matches("[A-Za-z]+"))
        {
            System.out.println("String");
        }
        else if(myString.matches("[0-9]+"))
        {
            System.out.println("Number");
        }

        System.out.println(myString.matches("[A-Za-z]+"));
    }
}

BUSINESS MODEL in TALEND

Business Model
Talend's Business Models allow data integration project stakeholders to graphically represent their needs regardless of the technical implementation requirements. Business Models help the IT operation staff understand these expressed needs and translate them into technical processes (Jobs).

With a Business Model you can:
•draw your business needs,
•create and assign numerous repository items to your model objects,
•define the business model properties of your model objects.


All objects are represented in the Palette as shapes, and can be included in the model.

Shapes

Select the shape corresponding to the relevant object you want to include in your Business Model. Double-click it or click the shape in the Palette and drop it in the modeling area.

Alternatively, for a quick access to the shape library, keep your cursor still on the modeling area for a couple of seconds to display the quick access toolbar:



Terminal--The rounded corner square can illustrate any type of output terminal.

List--forms a list with the extracted data. The list can be defined to hold a certain nature of data.

Input--Inserts an input object allowing the user to type in or manually provide data to be processed.

Gear--This gearing piece can be used to illustrate pieces of code programmed manually that should be replaced by a Talend Job for example.

Ellipse--Inserts an ellipse shape.

Document--Inserts a Document object which can be any type of document and can be used as input or output for the data processed.

Decision--The diamond shape generally represents an if condition in the model. It allows you to take context-sensitive actions.

Database--Inserts a database object which can hold the input or output data to be processed.

Data--A parallelogram shape symbolizes data of any type.

Actor--This schematic character symbolizes players in the decision-support as well as technical processes.

Action--The square shape can be used to symbolize actions of any nature, such as transformation, translation or formatting.

Assignment tab

The Assignment tab displays in a tabular form details of the Repository attributes you allocated to a shape or a connection.

To display any assignment information in the table, select a shape or a connection in the active model, then click the Assignment tab in the Business Model view.

You can also display the assignment list placing the mouse over the shape you assigned information to.


Assigning repository elements to a Business Model
The Assignment tab in the Business Models view lists the elements from the Repository tree view which have been assigned to a shape in the Business Model.
You can define or describe a particular object in your Business Model by simply associating it with various types of information, for example by adding metadata items.

You can set the nature of the metadata to be assigned or processed, thus facilitating the Job design phase.

To assign a metadata item, simply drop it from the Repository tree view to the relevant shape in the design workspace.

The Assignment table, located underneath the design workspace, gets automatically updated accordingly with the assigned information of the selected object.

The types of items that you can assign are:

Job designs--If any Job Designs developed for other projects in the same repository are available, you can reuse them as metadata in the active Business Model.
Metadata--You can assign any descriptive data stored in the repository to any of the objects used in the model. It can be connection information to a database for example.
Business Models--You can use in the active model all other Business Models stored in the repository of the same project.
Documentation--You can assign any type of documentation in any format. It can be a technical documentation, some guidelines in text format or a simple description of your databases.
Routines (Code)--If you have developed some routines in a previous project, to automate tasks for example, you can assign them to your Business Model. Routines are stored in the Code folder of the Repository tree view.



Talend Activity Monitoring Console (AMC )


Activity monitoring information can be stored in delimited files or database tables.

Before collecting and reusing the activity monitoring information of your Talend Jobs, you have to:

•Create files or database tables to be used as datasources for the activity monitoring information.

•Enable activity monitoring either by configuring the Stats & Logs settings at the project level or Job level or by adding the relevant components to your Jobs in order to catch and record the activity monitoring information and deliver it to the defined output (files or database tables).

•Configure the datasources to retrieve the activity monitoring information, which can be displayed on Talend Activity Monitoring Console either from the studio or from the Monitoring module of Talend Administration Center.


Different Views in AMC :

The Jobs view provides the list of Jobs mentioned in the execution log data collected.
The History view provides a summary of the Job main steps.
The Detailed History view splits up each Job into components and provides the execution details.
The Main Chart view displays a pie chart representing for each component, its respective share of the execution time.
The Meter Log view displays the detailed information of the various flows processed in the Job selected on the Jobs view.
The Logged Events view displays in full the messages generated through tWarn or tDie components as well as Java Exceptions.
The Error Report view provides an analysis of the proportion of errors that occurred over a number of Job executions.
The Job Volume view displays a line chart representing the volumetrics of the flow being processed. Depending on the tFlowMeter component settings on your Talend Job design, the scale and units will differ.
The Threshold chart view displays the proportion of the flow measured.

Tuesday, 26 July 2016

Talend Job Design - Performance Optimization Tips

1.Remove Unnecessary fields/columns ASAP using tFilterColumns component.


It is very important to remove the data from the Job flow which is not required as soon as possible. e.g. we have a huge lookup file having more than 20 fields but we only need two fields (Key, Value) while performing the lookup operation. Now if we do not filter the columns before join then the whole file will be read into memory for performing lookup hence occupying unnecessary space. However, if we filter fields and only keep two required columns then the memory occupied by lookup data is much less i.e. in this example 10 times less.



2. Remove Unnecessary data/records ASAP using tFilterRow component.

Similarly, it is necessary to remove from the job flow the data which is not required in the Job. Having less data in your job flow will always allow your Talend Job to perform better.

3. Use Select Query to retrieve data from database


4. Use Database Bulk components -

5. Store on Disk Option -

6. Allocating more memory to the Jobs-

7. Parallelism -
  • Using the tParallelize component of Talend. (only available in Talend Integration Suite)
  • Running SubJobs in Parallel by using the Multithreaded Executions. This option is also available in Talend Open Studio. However, this option is disabled by default. You can enable this option from Job view. Visit the article “Parallel Execution Sub Jobs in Talend Open Studio” for more details and demonstration of Parallel execution of Sub Jobs in Talend Open Studio.
8. Use Talend ELT Components when required-


9. Use SAX parser over Dom4J whenever required -
When parsing huge XML files, try using the SAX parser as the Generation mode in the Advanced Settings of the tFileInputXML component. However, the SAX parser comes with a few downsides, e.g. we can only use basic XPath expressions and cannot use expressions like last() or array selection of data [ ]. But if your requirement can be met with the SAX parser, prefer it over Dom4J.

10. Index Database Table columns -


11. Split Talend Job to smaller Subjobs- Whenever possible, one should split a complex Talend job into smaller Subjobs. Talend uses pipeline parallelism, i.e. after processing a few records a component passes them to the downstream components even if it has not finished processing all records. Hence, if we design a Job with a large number of complex operations in a single subjob, the performance of the job will drop. It is advisable to break the complex Talend job into smaller Subjobs and then control the flow of the Job using Triggers in Talend. Thanks Guys for reading this post. I am looking forward to your expert comments.

