DataStage Administrator and Director - Day 1
DataStage Administrator and Director
Basic
C3: Protected

About the Author
Created By: Mandhagini P.S (127057)
Credential Information: An expert in DataStage having 3 years of IT experience
Version and Date: DS/PPT/1106/1.0

©Copyright 2005, Cognizant Academy, All Rights Reserved

Icons Used
Questions, Hands-on Exercise, A Welcome Break, Test Your Understanding, Coding Standards, Reference, Demo, Key Contacts

DataStage Administrator and Director: Overview
Introduction:
DataStage is a widely used data warehousing (DW) tool for developing complex ETL jobs. It supports real-time integration and provides a user-friendly interface. DataStage also has many features that simplify back-end querying.
DataStage Administrator lets you set up DataStage projects and perform general administration of DataStage.
DataStage Director lets you run, schedule, and monitor jobs, and view the job log after a job has run.

DataStage Administrator and Director: Objectives
Objective:
After completing this chapter, you will be able to:
• Identify what the DataStage tool is
• Define DataStage Administrator
• Work with DataStage Administrator
• Explain DataStage Director
• Work with DataStage Director

DataStage Administrator: Logging In
• Logging in to a DataStage server using the Administrator requires the server's host name (fully qualified if necessary) or IP address, and an operating system username and password.
• On UNIX servers, users who log in as root, as a root-equivalent account, or as dsadm have full administrative rights.
• On Windows servers, users who are members of the Local Administrators group (standalone server) or the Domain Administrators group (domain controller or servers in an Active Directory forest) have full administrative rights.

DataStage Administrator: Logging In (Contd.)
The Administrator Login Dialog Box
• Enter the hostname or IP address of the server where DataStage is installed
• Enter your operating system username and password

Viewing the Project List
• The Projects page lists the DataStage projects and shows the pathname of the selected project in the Project pathname field. It has the following buttons:
– Add: Adds new DataStage projects. This button is enabled only if you have administrator status.
– Delete: Deletes projects. This button is enabled only if you have administrator status.
– Properties: Views or sets the properties of the selected project.
– NLS: Lets you change project maps and locales (if the NLS option was installed during the server installation).
– Command: Issues DataStage Engine commands directly from the selected project.

Adding Projects
• Provided that you have the proper permissions, you can add as many projects to the DataStage server as necessary.
• In normal projects, any DataStage developer can create, delete, or modify any object within the project once it has been created.
Tip: By default, projects are created under the root directory of the DataStage server installation. For example, if the server was installed to /appl/Ascential/DataStage, projects would be created in /appl/Ascential/DataStage/Projects/{project name}.

Deleting Projects
• Highlight the project to be deleted
• Make sure you have a current backup of your project, just in case!
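
The default project location described in the Adding Projects tip can be sketched in a few lines of Python. This is only an illustration of the path convention; the function name `default_project_path` and the project name `SalesDW` are hypothetical and are not part of DataStage.

```python
from pathlib import PurePosixPath


def default_project_path(install_root: str, project_name: str) -> PurePosixPath:
    """Build {install_root}/Projects/{project_name}, following the tip's convention.

    Hypothetical helper for illustration; DataStage itself creates this
    directory when a project is added through the Administrator client.
    """
    if not project_name or "/" in project_name:
        raise ValueError("project name must be a single, non-empty path component")
    return PurePosixPath(install_root) / "Projects" / project_name


# Install root taken from the tip's example; "SalesDW" is a made-up project name.
print(default_project_path("/appl/Ascential/DataStage", "SalesDW"))
# → /appl/Ascential/DataStage/Projects/SalesDW
```

Knowing this location is useful when taking the backup recommended before a project is deleted.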

General Project Options
• Enable job administration in Director: lets users run Cleanup Resources and Clear Status File from the Job menu of DataStage Director.
• Enable Runtime Column Propagation for Parallel Jobs: if you enable this feature, stages in parallel jobs can handle undefined columns that they encounter when the job is run, and propagate these columns through to the rest of the stages in the job.
• Auto-purge of job log: automatically purges job log entries according to the auto-purge action setting. For example, if you set auto-purge to "up to previous 3 job runs", the log entries for the 3 most recent job runs are kept as new job runs complete.

General Project Options (Contd.)
• Auto-purge settings for job logs (not a global or retroactive setting)
• Create environment variables

Setting Project-wide Environment Variables
• From this page you can set project-wide defaults for general environment variables or for ones specific to parallel jobs.
• You can also specify new variables. All of these are then available to be used in jobs.
• In every category except User Defined, only the default value can be modified. In the User Defined category, users can create new environment variables and assign default values.

Enable Server-Side Job Tracing
You can trace the activities on the server to help diagnose project problems.
• Enable or disable tracing in the project
• View or delete the currently highlighted file
• Trace files that have been created

Validating User Account for Job Scheduling
• This tab applies to Windows NT/2000 servers only.
• DataStage uses the Windows NT Schedule service to schedule jobs.
• Select a user account with proper access to the DataStage project
• Verify that the currently selected user account can schedule jobs

Performance Tuning Options
Some performance tuning options are:
• Row buffering
• Hashed file stage caching

Server Commands
• Select a project and click 'Command'
• Enter a valid DataStage command
• When you execute the command, a new window shows the response from the engine

Assigning Roles (Operator/Developer) to User Accounts
There are four roles for a DataStage user account:
• DataStage Developer: Has full access to all areas of a DataStage project.
• DataStage Production Manager: Has full access to all areas of a DataStage project, and can also create and manipulate protected projects.
• DataStage Operator: Has permission to run and manage DataStage jobs.
• <None>: Does not have permission to log on to DataStage.

Assigning Roles (Operator/Developer) to User Accounts (Contd.)
Select the user role to assign to particular user accounts.

Settings for Parallel Jobs
• Enable Runtime Column Propagation for Parallel Jobs: When this feature is enabled, stages in parallel jobs can handle undefined columns that they encounter when the job is run, and propagate these columns through to the rest of the job.
• Enable Remote Execution of Parallel Jobs: Select this to specify that parallel jobs in this project are to be deployed on a USS (UNIX System Services) machine. When this option is selected, the Remote tab is enabled and you can specify details about the jobs that are deployed.

Settings for Parallel Jobs (Contd.)
Enable these options.

DataStage Director: Logging In
• Logging in to a DataStage server using the Director requires the server's host name (fully qualified if necessary) or IP address, and an operating system username and password.

DataStage Director: Logging In (Contd.)
The Director Login Dialog Box
• Enter the hostname or IP address of the server where DataStage is installed
• Enter your operating system username and password
• Select the project to attach to

Viewing the Job Run Status
• The Job Status view shows the status of all the jobs in the currently selected job category or, if the job category pane is hidden, in the current project. The view has the following columns:
– Job name: The name of the job.
– Status: The status of the job.
– Started on date: The time and date a job was started. These fields are only filled in for a job with a status of Running.
– Last ran on date: The time and date the job was finished, stopped, or aborted. These columns are blank for jobs that have never been run.
– Description: A description of the job, if available.
• To view more details about a job's status, select the job and do one of the following:
– Choose View > Detail.
– Right-click to display the shortcut menu and choose Detail.
– Double-click the job.
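
The column rules of the Job Status view described above can be modeled with a small data structure. This is a minimal sketch, not Director's implementation: the class name `JobStatusRow`, its field names, the timestamp format, and the job names are all assumptions made for illustration. Only the blanking rules come from the slide.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class JobStatusRow:
    """One row of a Job Status view (hypothetical model, not a DataStage API)."""
    job_name: str
    status: str
    started_on: Optional[datetime] = None   # when the current run started
    last_ran_on: Optional[datetime] = None  # when the job last finished, stopped, or aborted
    description: str = ""

    def visible_columns(self) -> dict:
        """Return the cell text per column, applying the blanking rules."""
        def fmt(t: Optional[datetime]) -> str:
            return t.strftime("%Y-%m-%d %H:%M:%S") if t else ""

        return {
            "Job name": self.job_name,
            "Status": self.status,
            # Filled in only while the job has a status of Running.
            "Started on": fmt(self.started_on) if self.status == "Running" else "",
            # Blank for jobs that have never been run (no timestamp recorded).
            "Last ran on": fmt(self.last_ran_on),
            "Description": self.description,
        }


running = JobStatusRow("LoadCustomers", "Running", started_on=datetime(2005, 11, 6, 9, 30))
never_run = JobStatusRow("LoadOrders", "Compiled")  # status label assumed for the example
for row in (running, never_run):
    print(row.visible_columns())
```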

Viewing the Job Run Status (Contd.)
Detailed information about a job's status

Validating a Job
• You can check that a job or job invocation will run successfully by validating it.
• Jobs should be validated before running them for the first time, or after making any significant changes to job parameters.
When a server job is validated, the following checks are made without actually extracting, converting, or writing data:
• Connections are made to the data sources or data warehouse.
• SQL SELECT statements are prepared.
• Files are opened. Intermediate files in Hashed File, UniVerse, or ODBC stages that use the local data source are created, if they do not already exist.

Validating a Job (Contd.)
Click Validate when Job Run Options and parameters have been set

Running a Job
Click Run when Job Run Options, parameters, and tracing options have been set

Monitoring a Job
• Expand the tree to see all links attached to an active stage
• Optionally show CPU utilization for each active stage

Stopping a Job
Click the Stop button to stop a running job

Resetting a Job
• If a job has stopped or aborted, it can be difficult to determine whether all the required data was written to the target data tables. When a job has a status of Stopped or Aborted, you must reset it before running the job again. By resetting a job, you set it back to a runnable state and, optionally, return your target files to the state they were in before the job was run.
• To reset a job or job invocation:
1. Select the job or invocation you want to reset in the Job Status view.
2. Choose Job > Reset or click the Reset button on the toolbar. A message box appears.
3. Click Yes to reset the tables. All the files in the job are reinstated to the state they were in before the job was run. The job's status is updated to "Has been reset".

Resetting a Job (Contd.)
Click the Reset button to return a job to a runnable state

Interpreting the Job Execution Details in Log View
• Current run: black
• Previous run: blue
• (…): additional information is available for this entry

Log Event Detail Window
• Detail information can be copied to the system clipboard and pasted into a text editor, which is useful for sending errors to support!
• Additional lines of information regarding this particular event

Filtering Log Events
• Where to start showing log entries
• Where to stop showing log entries
• What type of log entries to show
• How many log entries to show

Clearing Log Entries
• Immediately delete log entries or automatically purge entries
• Which entries to remove immediately
• Which entries to remove automatically

Clearing Log Entries (Contd.)
Auto-purge options:
• Up to previous (job runs): Purges old log entries, leaving the specified number of recent job run entries in the file.
• Older than (days): Purges all log entries older than the specified number of days.
Specify the number of job run entries or days by clicking the arrow buttons or entering the value directly.
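
The two auto-purge options described above can be illustrated with a short Python sketch. It shows the retention logic only and makes several assumptions: the `LogEntry` record, its `run_number` field, and both function names are hypothetical, and nothing here reflects how DataStage actually stores or purges its job logs.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class LogEntry:
    """A simplified job log entry (hypothetical structure for illustration)."""
    run_number: int      # which job run produced the entry
    timestamp: datetime
    message: str


def purge_up_to_previous(entries: List[LogEntry], keep_runs: int) -> List[LogEntry]:
    """'Up to previous (job runs)': keep entries from the most recent N runs."""
    if keep_runs < 1:
        raise ValueError("keep_runs must be at least 1")
    recent_runs = sorted({e.run_number for e in entries}, reverse=True)[:keep_runs]
    return [e for e in entries if e.run_number in recent_runs]


def purge_older_than(entries: List[LogEntry], days: int, now: datetime) -> List[LogEntry]:
    """'Older than (days)': drop entries older than the given number of days."""
    cutoff = now - timedelta(days=days)
    return [e for e in entries if e.timestamp >= cutoff]


# Five runs on consecutive days, one log entry each (made-up data).
log = [LogEntry(run, datetime(2005, 11, run), f"run {run} finished") for run in range(1, 6)]

print([e.run_number for e in purge_up_to_previous(log, keep_runs=3)])
print([e.run_number for e in purge_older_than(log, days=2, now=datetime(2005, 11, 6))])
```

With this sample data, the first call keeps the entries of runs 3 to 5, and the second keeps only entries timestamped on or after 4 November 2005.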

Schedule View

Scheduling a Job Execution
You can schedule a job to run in a number of ways:
• Once today at a specified time
• Once tomorrow at a specified time
• On a specific day and at a particular time
• Daily at a particular time
• On the next occurrence of a particular date and time

Scheduling a Job Execution (Contd.)
Select a job and click the Schedule button

Rescheduling a Job Execution
Select a previously scheduled job and click the Reschedule button

Un-scheduling a Job Execution
Right-click a previously scheduled job and click Unschedule

Cleaning Up Resources
• If the Enable Job Administration in Director option has been set in the DataStage Administrator, certain functions are available to help you clean up the resources of a job that has hung or aborted, or to return a job to a state in which you can rerun it after the cause of the problem has been fixed.
• Use these functions with care, and only after you have tried to reset the job and you are sure it has hung or aborted.
• The Cleanup Resources command lets you:
– View and end job processes
– View and release the associated locks

Cleaning Up Resources (Contd.)
• Operating system's process ID number
• Logout (kill) selected O/S process
• Engine locks associated with processes

Clearing the Status File
Select a hung job and select Clear Status File from the Job menu

Clearing the Status File (Contd.)
Before you clear a status file you should:
• Try to reset the job.
• Ensure that all the job's processes have ended.

Questions
• Allow time for questions from participants

Test Your Understanding
• What is the use of having User Defined Environment Variables?
• Can a DataStage operator manipulate a protected project?
• What is the default cache size of a hashed file?
• When will "Clear Status File" be enabled in Director?
• What does (…) in the job log mean?
• Where do you see the CPU utilization of each stage in a job?

DataStage Administrator and Director: Summary
• DataStage is an ETL tool widely used in data warehousing. It has 4 components: Administrator, Director, Designer, and Manager.
• Administrator can be used to:
– Create or delete projects
– Assign roles to user accounts
– Set project-specific environment variables
– Enable tracing and set performance tuning options
• Director can be used to:
– View job statistics
– Validate, run, monitor, stop, reset, and schedule jobs
– View logs, filter log events, and clear log entries
– Clean up job resources

DataStage Administrator and Director: Source
• DataStage 7.5.1 manual
Disclaimer: Parts of the content of this course are based on the materials available from the Web sites and books listed above. The materials that can be accessed from linked sites are not maintained by Cognizant Academy and we are not responsible for the contents thereof. All trademarks, service marks, and trade names in this course are the marks of the respective owner(s).

You have successfully completed DataStage Administrator and Director.