Here is a good article with tips for debugging a DataStage job:
Steps
Enable the following environment variables in DataStage Administrator:
- APT_PM_PLAYER_TIMING – shows how much CPU time each stage uses
- APT_PM_SHOW_PIDS – shows the process ID of each stage
- APT_RECORD_COUNTS – shows record counts in the log
- APT_CONFIG_FILE – selects the configuration file to use (single node or multiple nodes)
- OSH_DUMP – shows the OSH code for your job, which reveals any unexpected settings made by the GUI
- APT_DUMP_SCORE – shows all processes and inserted operators in your job
- APT_DISABLE_COMBINATION – do not combine multiple stages into one process. Setting this variable keeps each stage in its own process, which makes it easier to see where your errors are occurring.
- Use a Copy stage to send data to intermediate Peek stages or sequential debug files. Copy stages get removed at compile time, so they do not increase overhead.
- Use a Row Generator stage to generate sample data.
- Look at the phantom files for additional error messages: c:\datastage\project_folder\&PH&
- To catch partitioning problems, run your job with a single-node configuration file and compare the output with your multi-node run. You can just compare the file sizes, or sort both outputs for a more detailed comparison (Unix sort + diff commands).
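
The single-node versus multi-node comparison from the last tip can also be scripted. What follows is only a minimal sketch in Python, assuming both runs wrote plain-text sequential files with one record per line. The function name compare_unordered and the file names in the usage note are hypothetical and not part of DataStage. It does roughly what sort + diff does: it ignores row order and reports the records that appear in only one of the two outputs.

```python
from collections import Counter
from pathlib import Path


def compare_unordered(single_node_path, multi_node_path):
    """Compare two text files as multisets of lines, ignoring row order.

    Returns (missing, extra):
      missing - records in the single-node output but not in the multi-node output
      extra   - records in the multi-node output but not in the single-node output
    Duplicate records are counted, so a row that appears twice in one file
    and once in the other is reported once.
    """
    single = Counter(Path(single_node_path).read_text().splitlines())
    multi = Counter(Path(multi_node_path).read_text().splitlines())

    # Counter subtraction keeps only the positive differences in counts.
    missing = sorted((single - multi).elements())
    extra = sorted((multi - single).elements())
    return missing, extra
```

For example, compare_unordered("out_1node.txt", "out_4node.txt") (hypothetical file names) returns two empty lists when both runs produced the same records. Anything else points to rows that were lost or duplicated, which is often a sign of a partitioning problem. The sketch reads both files fully into memory, so for very large outputs the Unix sort + diff approach is likely the better choice.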