Saturday, July 23, 2011

DataStage Jobs Design Tips

Here are some design tips for building more efficient jobs:

  1. Avoid propagating unnecessary metadata between stages: use the Modify stage to drop it. The columns must be specified explicitly, using the DROP clause to remove them or the KEEP clause to retain them.
  2. Use the MODIFY, FILTER, AGGREGATOR, COLUMN GENERATOR, etc. stages instead of the TRANSFORMER stage only if anticipated volumes are high and performance becomes a problem.
  3. Turn off RCP (Runtime Column Propagation) whenever it's not required.
  4. Estimate the data volumes before deciding whether to use the JOIN, LOOKUP, or MERGE stage.
  5. Add reject files wherever you need to reprocess rejected records or where you think considerable data loss may happen. Try to keep a reject file at least at Sequential File stages and when writing to Database stages.
  6. Make use of an ORDER BY clause when a DB stage is being used in a join, so that the data arrives already sorted on the join keys.
  7. Use the SORT stage instead of the REMOVE DUPLICATES stage.
  8. Set APT_STRING_PADCHAR to 0x00 (the C/C++ end-of-string character) when converting strings of lower precision to higher precision.
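
To make tip 4 more concrete, here is a small Python sketch of the trade-off behind it. This is not DataStage code: it only imitates, under simplifying assumptions, the general idea that a lookup holds the whole reference set in memory while a join walks two key-sorted inputs in step. The function names and sample rows are hypothetical, and the merge version assumes the reference keys are unique.

```python
from typing import Iterable, Iterator, Tuple

Row = Tuple[int, str]           # (key, value)
Joined = Tuple[int, str, str]   # (key, stream value, reference value)


def hash_lookup(stream: Iterable[Row], reference: Iterable[Row]) -> Iterator[Joined]:
    """Lookup-style: load all reference rows into memory, then probe per row.

    Memory use grows with the size of the reference data, so this only
    suits reference sets that comfortably fit in memory.
    """
    table = dict(reference)
    for key, value in stream:
        if key in table:
            yield key, value, table[key]


def merge_join(left: Iterable[Row], right: Iterable[Row]) -> Iterator[Joined]:
    """Join-style: both inputs must already be sorted on the key.

    Only one reference row is held at a time, so memory use stays flat
    however large the inputs are. Assumes keys in `right` are unique.
    """
    right_iter = iter(right)
    current = next(right_iter, None)
    for key, value in left:
        # Advance the reference side until it catches up with the left key.
        while current is not None and current[0] < key:
            current = next(right_iter, None)
        if current is not None and current[0] == key:
            yield key, value, current[1]


if __name__ == "__main__":
    orders = [(1, "a"), (2, "b"), (2, "c"), (4, "d")]   # sorted by key
    customers = [(1, "X"), (2, "Y"), (3, "Z")]          # sorted, unique keys
    print(list(hash_lookup(orders, customers)))
    print(list(merge_join(orders, customers)))
```

Both functions return the same matches for the same data; what differs is the memory they need and the requirement for sorted input, which is why the estimated volumes should drive the choice.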
