Talend online training

→Introduction to Talend DI for Big Data:

  • About Talend and its journey
  • Products under the Talend platform
  • What is Talend?
  • Advantages of Talend over competing integration tools
  • Why is Talend gaining popularity?
  • Talend installation system requirements
  • Types of repository connections in Talend Studio
  • Use of workspaces and projects
  • What is Big Data? Software platforms that fall under Big Data
  • What is Hadoop and how is it different from traditional technologies?
  • Advantages of Hadoop from a cost and architectural feasibility perspective
  • High-level Hadoop cluster architecture and physical core components
  • Hadoop ecosystem components
  • Challenges in implementing a Big Data project with the conventional Hadoop framework
  • Pros and cons of Talend BD DI compared to conventional Hadoop ecosystem components
  • Talend architecture and its components
  • Demo of a sample Talend job design and execution

→Talend GUI and Internal Tools

  • Main window
  • Menu bar and tools
  • Repository tree view
  • Design Workspace
  • Palette
  • Configuration tabs
  • Outline and code summary panels
  • Window menu -- Show View, Preferences

→Brief explanation of:

  • Working with projects: create, open, import, delete, export
  • Jobs: create a job, add the desired components to it
  • Types of component connection links
  • Row connections: Main, Reject, Unique, Duplicate; Iterate connection
  • Trigger connections: On Subjob Ok, On Component Ok, On Subjob Error, On Component Error, Run if
  • How to change label format for components and component connections
  • Component connection indicators
  • How to determine a job's starting point

→Centralize Metadata and Schemas

  • Database connection
  • Flat file, Excel file, XML file
  • Hadoop cluster
  • FTP
  • Schema types and the differences between them

→Data Validation:

  • Role of Die on error
  • Enable & Disable reject flows
  • Capture rejected data prior to job failure
  • Input data validation against the schema object
  • Lab practical

→Pre-requisites to design and execute a Talend job

  • How to identify and fix Talend job errors using the Problems tab
  • Major and commonly used components
    • File
    • Database
    • Logs & Errors
    • Orchestration
    • System
  • Lab practical with a combination of the above components

→Essential processing components:

  • tConvertType
  • tFilterRow
  • tSortRow
  • tJoin
  • tMap
  • tAggregateRow
  • Comparison between tJoin and tMap components

→Data Mapping:

  • Basic mapping
  • Expressions in tMap
  • Conditional logic with ternary operator
  • Using variables and filters in tMap expressions
  • Splitting rows into multiple routes
  • Joins in tMap
  • Reload at each row lookup
  • Reject data handling in tMap
  • Testing expressions
  • Built in Functions

Lab practical with tMap
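
As a rough illustration of the "Conditional logic with ternary operator" topic: tMap expressions are Java expressions, so conditional mapping is usually written with `? :`. The sketch below is plain Java, not a Talend job; the `Row` class, its `name` and `age` columns, and the method names are hypothetical stand-ins for an input flow such as `row1`.

```java
// Sketch only: Row is a hypothetical stand-in for a tMap input row (row1).
class TMapTernarySketch {

    static final class Row {
        final String name;
        final Integer age;

        Row(String name, Integer age) {
            this.name = name;
            this.age = age;
        }
    }

    // Nested ternary: handle a missing age first, then classify.
    // In a tMap output column, only the expression after "return" would be typed.
    static String ageGroup(Row row1) {
        return row1.age == null ? "UNKNOWN" : (row1.age >= 18 ? "ADULT" : "MINOR");
    }

    // Null-safe clean-up of a string column.
    static String cleanName(Row row1) {
        return row1.name == null ? "" : row1.name.trim().toUpperCase();
    }

    public static void main(String[] args) {
        Row row1 = new Row("  alice ", 17);
        // prints "ALICE -> MINOR"
        System.out.println(cleanName(row1) + " -> " + ageGroup(row1));
    }
}
```

Checking for null before comparing avoids a NullPointerException when a nullable column is auto-unboxed.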

→More Practical on:

  • File -- multi-structure, regex
  • Orchestration -- tFlowToIterate, tLoop
  • XML readers/writers -- tXMLMap

→Context Variables:

  • What globalMap variables are and how to use them
  • Context group creation
  • Add a context group to job
  • Add contexts to context group
  • tContextLoad, implicit context load from a file, tContextDump
  • Context file location assignment with operating system environment variables
  • Talend Job debugging

→Custom Java in Talend

  • Conditional logic implementation with tJava & tJavaRow
  • Set context and globalMap variable values with tJava
  • Code routines
  • How to use external Java classes
  • Difference between tJavaRow and tJavaFlex
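
A minimal sketch of the "Set context and globalMap variable values with tJava" topic, assuming that `globalMap` behaves like a `java.util.Map<String, Object>` and that context variables are read and written as fields. The `Context` class, the `maxRejects` variable, and the `runLabel` key are hypothetical; the `tFileInputDelimited_1_NB_LINE` key follows the usual `<component>_NB_LINE` naming but is filled in by hand here.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java stand-in for code that might be typed into a tJava component.
class GlobalMapSketch {

    // Hypothetical context with one variable; a real job defines its own.
    static final class Context {
        int maxRejects = 10;
    }

    static String run(Map<String, Object> globalMap, Context context) {
        // Store a value for later components or subjobs.
        globalMap.put("runLabel", "nightly");

        // Override a context variable at run time.
        context.maxRejects = 25;

        // Values come back as Object, so a cast is needed on read.
        Integer nbLine = (Integer) globalMap.get("tFileInputDelimited_1_NB_LINE");
        String label = (String) globalMap.get("runLabel");

        // Guard against a key that was never set.
        int rows = nbLine == null ? 0 : nbLine;
        return label + ": " + rows + " rows, max rejects " + context.maxRejects;
    }

    public static void main(String[] args) {
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("tFileInputDelimited_1_NB_LINE", 42); // simulated row count
        // prints "nightly: 42 rows, max rejects 25"
        System.out.println(run(globalMap, new Context()));
    }
}
```

Because the map stores `Object` values, a wrong cast fails only at run time, so it helps to keep key names and value types consistent across a job.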

→Talend with Database readers and writers: (S3)

  • Read from database tables
  • How to use context and globalMap variables in a SQL override
  • Print the SQL override query in the output log
  • Write to database table
  • Database connection session management, Shared database connection
  • Column selection for Update and Insert operations
  • Rejects and error management - Bulk load

→Logging and Testing

  • Log console output to an operating system file
  • Custom job termination using System.exit(<custom return code>)
  • Code deployment & execution
  • Compiled executables - JAR files
  • Select desired context group from context group list
  • Command line context parameters
  • Job dependency management
  • Return codes from child job without Die
  • Parent & child job management

→Miscellaneous

  • Miscellaneous components -- tFixedFlowInput, tRowGenerator, tMemorizeRows, tBufferInput, tBufferOutput
  • CDC implementation in Talend
  • SCD2 implementation in Talend
  • Incremental Loading
  • Unit testing
  • Joblets
  • Difference between Talend Open Studio and Talend Enterprise edition
  • Jobs execution in parallel
  • tParallelize vs. multi-threaded execution

→Theory on Enterprise edition features

  • Remote Repository connections
  • Sandbox Project
  • Reference project
  • SVN branches
  • Lock types -- check-in, check-out
  • Talend Administration Center
  • Talend Activity Monitoring Console
  • Talend SDLC -- job deployment process
  • Job publishing into the artifact repository -- from Studio or the command line
  • Difference between JobServer and Runtime server
  • Talend products from a DI perspective:
    • Talend Open Studio -- Data Integration edition, Big Data edition
    • Talend subscription solutions -- DI, BD (Big Data), Big Data Platform, Real-Time Big Data Platform