DS Special Characters for Scribe

March 28, 2018 | Author: datastageresource | Category: Character Encoding, Databases, File Format, Oracle Database, Computer Data





Description

Handling Special Characters in DataStage

Revision History

Version | Date | Author | Reasons for Change | Section(s) Affected

Contents

HANDLING OF SPECIAL CHARACTERS
    OBJECTIVE
    INTRODUCTION
    DETAILS OF PROCEDURE TO FOLLOW
        DATASTAGE 7.5.1 – HANDLING OF SPECIAL CHARACTERS WITHOUT NLS SETTINGS
        DATASTAGE 8.1 – HANDLING OF SPECIAL CHARACTERS WITH NLS SETTINGS

HANDLING OF SPECIAL CHARACTERS

OBJECTIVE

The objective of this document is to provide steps to handle special characters (®, ©, a few characters from foreign languages, etc.) using DataStage when reading from or writing to any database.

INTRODUCTION

DataStage has built-in National Language Support (NLS). With NLS installed, DataStage can do the following:

1) Process data in a wide range of languages
2) Accept data in any character set into most DataStage fields
3) Use local formats for dates, times, and money
4) Sort data according to local rules
5) Convert data between different encodings of the same language

Using NLS, the DataStage server engine holds data in Unicode format. This is an international standard character set that contains nearly all the characters used in languages around the world. DataStage maps data to or from Unicode format as required.

DETAILS OF PROCEDURE TO FOLLOW

DataStage 7.5.1 – handling of special characters without NLS settings

NLS installation is not required for loading special characters such as ® and ©, but the NLS_CHARACTERSET value of the database and of DataStage must be the same. NLS_LANG is the environment variable that Oracle uses for character map recognition. The target database must know what character set DataStage is loading; if it differs from the database character set, the database will attempt to convert the data during the operation. This variable should be set to the database's NLS_CHARACTERSET value, either in the .dsenv file or in the Administrator client under the user-defined category of the particular DataStage project.

1.) Check whether the NLS_LANG environment variable is already set in the .dsenv file or in the Administrator client. If it is set, you can view the value through Director, in the environment settings of each job log.

2.) If it is already set, change it to NLS_LANG = $ENV in the Administrator client, which ensures that it stays at the currently set value. Then add the parameter to your DataStage job and override the default value.

3.) To override the variable or set a new value, find the NLS_CHARACTERSET value from the database using a query. For example, on an Oracle database:

    SELECT * FROM NLS_DATABASE_PARAMETERS;

whose NLS_CHARACTERSET row gives the value AL32UTF8, or

    SELECT USERENV('LANGUAGE') FROM DUAL;

which gives the value AMERICAN_AMERICA.AL32UTF8.

4.) Provide the value to DataStage by setting the NLS_LANG environment variable to the Oracle character set value:

    NLS_LANG=AMERICAN_AMERICA.AL32UTF8

NLS_LANG is a combination of values that must be given in the format Language_territory.characterset.

Note: SELECT USERENV('language') FROM DUAL; gives the session's <Language>_<territory> but the DATABASE character set, not the client's, so the value returned is not the client's complete NLS_LANG setting! All other NLS parameters can be retrieved with SELECT * FROM NLS_SESSION_PARAMETERS;

To see the NLS_LANG value known to SQL*Plus, run:

    SQL>@.[%NLS_LANG%].

If you get something like:

    Unable to open file ".[AMERICAN_AMERICA.WE8MSWIN1252]."

then the "file name" between the braces is the value of the registry parameter. If you get this as the result:

    Unable to open file ".[%NLS_LANG%]."

then the parameter NLS_LANG is also not set in the registry. Note that the @.[%NLS_LANG%]. technique reports the NLS_LANG known by the SQL*Plus executable; it does not read the registry itself. But if you run the HOST command first and NLS_LANG is not set in the environment, then you can be sure the variable is set in the registry if @.[%NLS_LANG%]. returns a valid value.

5.) Client character set: when NLS_LANG was set to AMERICAN_AMERICA.AL32UTF8, the issue faced was that other interfaces such as the UI and Documentum were unable to view the data. The DS variable is set to AMERICAN_AMERICA.WE8MSWIN1252.

DataStage 8.1 – handling of special characters with NLS settings

1. $NLS_LANG=AMERICAN_AMERICA.AL32UTF8 – used for Oracle loads.

2. NLS_MAP – used for reading the sources:

    NLS_MAP=windows-1252 for International
    NLS_MAP=UTF-8 for Italy and CEE
    NLS_MAP=ISO_8859-1:1987 for Austria

For parameterising NLS_MAP, GSOR handles this in sequences based on the region. Hence it is not necessary to override NLS_MAP in stages.

3. Our current understanding of how ETL processes data when NLS is turned on is given below, for a scenario in which we read a source file, transform it, and load the data into Oracle.

a. Find out the NLS map of the source file from the provider. If they cannot provide it, we need to figure it out ourselves (refer to step b).
b. Open the file in UltraEdit (Word might also help). Convert to hex values and compare them with the UTF-8 character table (http://www.utf8-chartable.de/). A hex value of c2 indicates a 2-byte representation.
c. Once we find the character set that the source provider is using, set NLS_MAP to that character set.
d. The assumption is that DS reads the data based on the NLS_MAP, converts it to Unicode, and processes it through the stages after the SEQUENTIAL stage.
e. NLS_LANG is set at PROJECT level and is used during the data load into Oracle.
f. NLS_LANG must be consistent with the database. The procedure for obtaining the value is described above.
g. Verify the data load by using

    select dump(<column names>,1016) from <table name>

and check the result against the UTF-8 character table (http://www.utf8-chartable.de/).

4. Issues Faced

1. Wrong NLS character set was supplied by the source system. One of the source providers confirmed that they were sending us data in Windows <windows-1252>, whereas it was actually UTF-8 <UTF-8>. This issue helped us to come up with the above steps. Refer to point 3.a.

Issue description: A source system in Austria provided data in a UTF-8 encoded file containing sales names to be inserted into an Oracle table. Records were being loaded from a Dataset (DataStage internal file format) into Oracle, and some of the sales_names were not loaded properly when using $NLS_LANG = AMERICAN_AMERICA.AL32UTF8 (Oracle environment variable). With the same settings and configuration, however, many names containing characters such as ö and ü loaded without issue. We verified (by extracting values to a file) that the Dataset contains the correct values, but the table does not.
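
The hex comparison and the mislabelled-encoding issue described above can be illustrated outside DataStage. The following is a minimal Python sketch, not part of the original procedure. It assumes that Python's cp1252 codec corresponds to the windows-1252 NLS map; the helper names hex_dump and looks_like_utf8 and the sample name are hypothetical and chosen only for illustration.

```python
def hex_dump(raw: bytes) -> str:
    """Return the bytes as space-separated lowercase hex pairs."""
    return " ".join(f"{b:02x}" for b in raw)


def looks_like_utf8(raw: bytes) -> bool:
    """True if raw is valid UTF-8 and contains at least one multi-byte character."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    # A multi-byte character makes the byte count exceed the character count.
    return len(text) < len(raw)


# The registered-trademark sign is two bytes in UTF-8 (lead byte c2)
# but a single byte in windows-1252.
print(hex_dump("®".encode("utf-8")))    # c2 ae
print(hex_dump("®".encode("cp1252")))   # ae

# Hypothetical sample name containing ö and ü.
name = "Jörg Müller"
as_utf8 = name.encode("utf-8")
as_cp1252 = name.encode("cp1252")

print(looks_like_utf8(as_utf8))     # True
print(looks_like_utf8(as_cp1252))   # False

# Reading UTF-8 bytes with the wrong map (windows-1252) garbles the name.
print(as_utf8.decode("cp1252"))     # JÃ¶rg MÃ¼ller
```

A check of this kind only suggests the encoding of a sample; the character set should still be confirmed with the source provider, as step 3.a recommends.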