Talend online training
→Introduction to Talend DI for Big Data:
- About Talend and its journey
- Products under the Talend platform
- What is Talend?
- Advantages of Talend over competing integration tools
- Why is Talend gaining popularity?
- Talend installation system requirements
- Types of repository connections for Talend Studio
- Use of workspaces and projects
- What is Big Data? Which software platforms fall under Big Data?
- What is Hadoop, and how does it differ from traditional technologies?
- Advantages of Hadoop from a cost and architectural feasibility perspective
- High-level Hadoop cluster architecture and physical core components
- Hadoop ecosystem components
- Challenges in implementing a Big Data project with the conventional Hadoop framework
- Pros and cons of Talend BD DI compared to conventional Hadoop ecosystem components
- Talend architecture and its components
- Demo on Talend sample job design and execution.
→Talend GUI and Internal Tools
- Main window
- Menu bar and tools
- Repository tree view
- Design Workspace
- Configuration tabs
- Outline and code summary panels
- Window -- Show View, Preferences
→Brief explanation of:
- Working with projects: create, open, import, delete, export
- Jobs: create a job, add the desired components to it
- Types of component connection links
- Row connections: Main, Reject, Uniques, Duplicates; Iterate connection
- Trigger connections: On Subjob Ok, On Component Ok, On Subjob Error, On Component Error, Run if
- How to change label format for components and component connections
- Component connection indicators
- How to determine a job's starting point
→Centralize Metadata and Schemas
- Database connection
- Flat file, Excel file, XML file
- Hadoop cluster
- Schema types and the differences between them
- Role of Die on error
- Enable & Disable reject flows
- Capture rejected data prior to job failure
- Input data validation against the schema object
- Lab practical
→Prerequisites for designing and executing a Talend job
- How to identify and fix Talend job errors using the Problems tab
Major and commonly used components
- Logs & Errors
Lab practical combining the above components
→Essential processing components:
- Comparison between tJoin and tMap components
→Data Mapping:
- Basic mapping
- Expressions in tMap
- Conditional logic with ternary operator
- Using variables and filters in tMap expressions
- Splitting rows into multiple output flows
- Joins in tMap
- Reload at each row lookup
- Reject data handling in tMap
- Testing expressions
- Built in Functions
Lab practical with tMap
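tMap expressions are written as Java expressions, so conditional logic in a mapping usually takes the form of a ternary. The sketch below shows that pattern in plain Java, outside Talend Studio. The column names (country, amount) and the class name are hypothetical and only serve the illustration.

```java
// Plain-Java sketch of the kind of expression typed into a tMap output column.
// The column names (country, amount) are hypothetical, not taken from the course.
public class TernaryMappingSketch {

    // Null-safe label. In tMap this would be written inline, for example:
    // row1.country == null ? "UNKNOWN" : row1.country.toUpperCase()
    static String countryLabel(String country) {
        return country == null ? "UNKNOWN" : country.toUpperCase();
    }

    // Nested ternary that sorts a numeric value into three bands.
    static String amountBand(double amount) {
        return amount >= 1000 ? "HIGH" : (amount >= 100 ? "MEDIUM" : "LOW");
    }

    public static void main(String[] args) {
        System.out.println(countryLabel(null));
        System.out.println(countryLabel("fr"));
        System.out.println(amountBand(250.0));
    }
}
```

Checking for null before calling a method on the column is the usual reason a ternary appears in a mapping: it avoids a NullPointerException on rows with missing values.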
→More Practical on:
- File - Multi structure, Regex
- Orchestration -- tFlowToIterate, tLoop
- XML readers/writers -- tXMLMap
- What globalMap variables are and how to use them
- Context group creation
- Add a context group to job
- Add contexts to context group
- tContextLoad, implicit context load from a file, tContextDump
- Context file location assignment with operating system environment variables
- Talend Job debugging
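As an illustration of the globalMap topic above: in a job's generated code, globalMap behaves like a java.util.Map keyed by String with Object values, so reading a value back requires a cast. The stand-alone sketch below uses a HashMap as a stand-in so it runs outside Talend. The key names and helper methods are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-alone sketch: inside a Talend job the globalMap object already exists;
// here a HashMap stands in for it so the snippet runs on its own.
public class GlobalMapSketch {

    // Stores a value under a key, as code in a tJava component might do.
    static void remember(Map<String, Object> globalMap, String key, Object value) {
        globalMap.put(key, value);
    }

    // Reads the value back. Values are stored as Object, so a cast is needed;
    // a fallback is returned when the key has never been set.
    static int readInt(Map<String, Object> globalMap, String key, int fallback) {
        Object v = globalMap.get(key);
        return v == null ? fallback : (Integer) v;
    }

    public static void main(String[] args) {
        Map<String, Object> globalMap = new HashMap<>();
        remember(globalMap, "rowCount", 42);   // "rowCount" is a hypothetical key
        System.out.println(readInt(globalMap, "rowCount", 0));
        System.out.println(readInt(globalMap, "missingKey", -1));
    }
}
```

Inside a job the same read is typically written inline as a cast around globalMap.get("someKey").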
→Custom Java in Talend
- Conditional logic implementation with tJava & tJavaRow
- Set context and globalMap variable values with tJava
- Code routines
- How to use external Java classes
- Difference between tJavaRow and tJavaFlex
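A code routine is, in essence, a Java class exposing public static methods that job expressions can call as ClassName.method(...). The sketch below shows what such a class might look like. The class and method names are hypothetical, and the package declaration that Talend Studio normally adds to routines is omitted so the file compiles on its own.

```java
import java.util.Locale;

// Sketch of a user routine: a class exposing public static helper methods.
// The class and method names are hypothetical.
public class StringCleanup {

    // Trims the input and maps null to an empty string.
    public static String normalize(String value) {
        if (value == null) {
            return "";
        }
        return value.trim();
    }

    // Pads a numeric code with leading zeros up to the given width.
    public static String leftPadZeros(int code, int width) {
        return String.format(Locale.ROOT, "%0" + width + "d", code);
    }

    public static void main(String[] args) {
        System.out.println(normalize("  abc "));
        System.out.println(leftPadZeros(7, 3));
    }
}
```

From a tMap or tJavaRow expression, a call would then look like StringCleanup.normalize(row1.name), where row1.name is again a hypothetical column.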
→Talend with Database readers and writers: (S3)
- Read from database tables
- How to use context and globalMap variables in a SQL override
- Print the SQL override query in the output log
- Write to database table
- Database connection session management, Shared database connection
- Column selection for Update, Insert operation
- Rejects and error management - Bulk load
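The query field of a database input component holds a Java string expression, so a SQL override that uses context or globalMap variables is built by string concatenation. The sketch below imitates that in plain Java and prints the resulting query, which is the same idea as printing the override to the output log. The schema, table, and key names are hypothetical, and a HashMap stands in for globalMap.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-alone sketch of a SQL override built from variable values.
public class SqlOverrideSketch {

    // Builds the query text the way a query field expression would:
    // string literals concatenated with variable values.
    static String buildQuery(String schema, String table, int lastLoadedId) {
        return "SELECT id, name FROM " + schema + "." + table
             + " WHERE id > " + lastLoadedId;
    }

    public static void main(String[] args) {
        // Stand-ins for a context variable and globalMap (hypothetical names).
        String contextSchema = "sales";
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("lastLoadedId", 100);

        String query = buildQuery(contextSchema, "orders",
                (Integer) globalMap.get("lastLoadedId"));
        // Printing the query makes the final SQL visible in the run log.
        System.out.println(query);
    }
}
```

Because the values are concatenated directly into the SQL text, this pattern is only appropriate for trusted values such as job parameters, not for free-form user input.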
→Logging and Testing
- Log console output to an operating system file
- Custom job termination using System.exit(<custom return code>)
- Code deployment & execution
- Compiled executables - JAR files
- Select desired context group from context group list
- Command line context parameters
- Job dependency management
- Return codes from child job without Die
- Parent & child job management
- Miscellaneous components -- tFixedFlowInput, tRowGenerator, tMemorizeRows, tBufferInput, tBufferOutput
- CDC implementation in Talend
- SCD2 implementation in Talend
- Incremental Loading
- Unit testing
Difference between Talend Open Studio and Talend Enterprise edition
- Job execution in parallel
- tParallelize vs. multi-threaded execution
→Theory on Enterprise edition features
- Remote Repository connections
- Sandbox Project
- Reference project
- SVN branches
- Lock types - Check-in, Check-out
- Talend Administration Center
- Talend Activity Monitoring Console
- Talend SDLC - Job deployment process
- Job publishing into Artifact repository -- From studio or command
- Difference between JobServer & Runtime server
- Talend products from a DI perspective:
- Talend Open Studio - Data Integration edition, Big Data edition
- Talend Subscription Solutions - DI, BD - Big Data, Big Data Platform, Real-Time Big Data Platform