- Avoid propagating unnecessary metadata between stages: use the MODIFY stage to drop it (the columns to drop must be specified explicitly with the DROP clause, and the columns to retain with the KEEP clause).
- Use MODIFY, FILTER, AGGREGATOR, COLUMN GENERATOR, etc. instead of the TRANSFORMER stage only if anticipated volumes are high and performance becomes a problem.
- Turn off RCP (Runtime Column Propagation) whenever it's not required.
- Estimate the volumes before deciding whether to use the JOIN, LOOKUP, or MERGE stage.
- Add reject files wherever you need to reprocess rejected records or where you think considerable data loss may happen. Try to keep a reject file at least at Sequential File stages and at stages that write to a database.
- Use an ORDER BY clause when a DB stage feeds a JOIN, so that the database does the sorting instead of DataStage.
- Use the SORT stage instead of the REMOVE DUPLICATES stage.
- Set APT_STRING_PADCHAR to 0x00 (the C/C++ end-of-string character) when converting strings of lower precision to higher precision.
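
For the tip about estimating volumes before choosing between JOIN, LOOKUP, and MERGE, here is a rough sketch in Python of how such an estimate might drive the decision. It is not DataStage code. The function names, the memory budget, and the rule itself (LOOKUP when the reference data fits in memory, MERGE when it does not and reject links are needed, JOIN otherwise) are assumptions for illustration only; the real thresholds depend on your environment.

```python
# Illustrative heuristic only. The names, the memory budget, and the
# decision rule are assumptions, not DataStage settings or APIs.

def estimate_bytes(row_count: int, avg_row_bytes: int) -> int:
    """Rough size of a data set: number of rows times average row width."""
    return row_count * avg_row_bytes


def suggest_stage(reference_rows: int, avg_row_bytes: int,
                  memory_budget_bytes: int,
                  need_reject_links: bool = False) -> str:
    """Suggest JOIN, LOOKUP, or MERGE from the estimated reference volume."""
    reference_bytes = estimate_bytes(reference_rows, avg_row_bytes)
    # Assumption: a reference set that fits in memory suits LOOKUP.
    if reference_bytes <= memory_budget_bytes:
        return "LOOKUP"
    # Assumption: for larger volumes, pick MERGE only when rejects are needed.
    if need_reject_links:
        return "MERGE"
    return "JOIN"


if __name__ == "__main__":
    budget = 512 * 1024 * 1024  # hypothetical 512 MB budget
    print(suggest_stage(50_000, 200, budget))
    print(suggest_stage(500_000_000, 200, budget))
```

Running the estimate against real row counts from the source tables before building the job gives a first guess that can then be confirmed by testing.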
This blog is dedicated to everyone who shared the information that has helped us so much. Most of the information here (:D) was collected from notes, documents, forums, and blogs that I cannot credit one by one, because the blog mainly serves as my personal notes on every document I found while learning this great stuff. BIG thanks for all the knowledge that has been shared, and success to all of you.
Saturday, July 23, 2011
DataStage Jobs Design Tips
Here are some design tips for building more effective jobs: