Informatica Interview Questions

March 24, 2018 | Author: Sunil Reddy | Category: Database Index, Sql, Data Warehouse, Table (Database), Information Science


TCS

How do you pull the records into your ETL server on a daily basis?
Answer: By running incremental/delta/CDC loads. There are several ways to implement this, for example: 1) using a mapping-level variable, 2) using a control table, and so on.

How do you join tables if my source has 15 tables and the target is one?
Answer: If your sources are flat files with the same structure, you can use the indirect file type. If the sources are relational tables, you can use a Source Qualifier SQL override or join condition, or Joiner transformations (n-1 joiners for n sources).

A flat file has 1 lakh records. What happens if I convert it to an Excel file (an Excel sheet holds only 65,536 rows, but the flat file has 1 lakh rows)? How do I get one lakh rows into an Excel sheet?
Answer: If you want to move the flat file into Excel format, save it as a .CSV (comma-separated values) file; that format can hold on the order of 10 lakh records.

MINDTREE

I have a workflow that I want to run 3 times today, once every 3 hours. How can you schedule that?
In the Scheduler there is an option called Customized Repeat. Select the Day(s) option under "Repeat every"; the Daily Frequency options then appear. Choose "Run every" and enter the number of hours between runs of the workflow, then select "End after 3 runs" under End Options on the main Scheduler tab.

What is a Mapplet? What is logic?
We can create any number of mapplets for one mapping; there is no limit on mapplets. Every mapplet can contain one or more pieces of logic; there is no limit on the logic either.

KPIT

1) When we develop a project, what performance issues can arise? 2) If a table has an INDEX and a CONSTRAINT, why do they raise a performance issue, given that when we drop the index and disable the constraint the load performs better? 3) What Unix commands are frequently used with Informatica?
1) Performance issues in Informatica: in any project the final goal is to load the data into the target table efficiently, in as little time as possible. Tune the mapping, use fewer active transformations, choose the best loading option, partition the data, and load into an index-less target table.
2) Yes, dropping the index and disabling the constraint performs better, because the load then works only against the table itself; otherwise the database must also check parent-child (foreign key) relationships and constraint conditions for every row. After loading the data you can rebuild the indexes and re-enable the constraints on the table.
3) sed commands, awk commands, directory commands, file commands and copy commands.

By looking at the parameter file, how do you identify whether an entry is a workflow parameter or a mapping parameter?
A mapping parameter starts with $$ and a workflow parameter starts with $.

What is the query to find the nth highest salary? What is the use of cursors?
There are three ways to find the nth highest salary in a table (e.g. EMP):
1) select distinct sal from emp e1 where &n = (select count(distinct sal) from emp e2 where e1.sal <= e2.sal);
2) select empno, ename, sal, deptno from (select empno, ename, sal, deptno, rank() over (order by sal desc) as ra from emp) where ra = &n;
3) select empno, ename, sal, deptno from (select empno, ename, sal, deptno, dense_rank() over (order by sal desc) as ra from emp) where ra = &n;

What is a cursor?
When a query is executed in Oracle, a result set is produced and stored in memory. Oracle allows the programmer to access this result set in memory through cursors. Why use a cursor? Often, when a query returns more than one row, we want to go through each row and process its data in a different way; a cursor is handy there.
Types of cursors: Oracle PL/SQL declares a cursor implicitly for all queries and DML statements (including queries that return only one row), but in most cases we do not use these implicit cursors for queries that return only one row.
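A minimal PL/SQL sketch of an explicit cursor, assuming the standard EMP demo table; the table, column names and department filter are only illustrative:

DECLARE
  CURSOR c_emp IS
    SELECT empno, ename, sal FROM emp WHERE deptno = 10;
BEGIN
  FOR r IN c_emp LOOP
    -- process each fetched row individually
    DBMS_OUTPUT.PUT_LINE(r.empno || ' ' || r.ename || ' ' || r.sal);
  END LOOP;
END;
/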
What types of indexes do you generally use in Informatica?
We use B-tree and bitmap indexes. B-tree indexes suit high-cardinality columns and bitmap indexes suit low-cardinality columns; for typical warehouse columns the bitmap index gives the better performance.

Why do we select the table with the minimum number of records as the master table in a joiner?
The Integration Service reads the data from the master table and builds the data cache and data index. It then reads the detail table and performs the join. Because the master table has far fewer records than the detail table, the cache-building work is small, which saves time and increases performance.

What is the difference between a Joiner and a Lookup transformation?
For basic cases you can do similar things with both. However, a Joiner operates on sources, whereas a Lookup can operate on a source as well as a target. There are also limitations when working with heterogeneous sources in both cases (can a lookup work on an XML file?). A Joiner does not support non-equi joins, whereas a Lookup supports non-equi conditions; a Joiner transformation does not match null values, whereas a Lookup transformation does; a Joiner supports only the equality operator in the join condition, whereas a Lookup condition can use <=, >=, = and !=.

L & T

How can you display only hidden files in UNIX?
ls -a | grep "^\."

Tell me one complex query in Oracle.
Select users.user_id, users.email, max(classified_ads.posted) from users, classified_ads where users.user_id = classified_ads.user_id group by users.user_id, users.email order by upper(users.email);

Data is passed from one active transformation and one passive transformation into a passive transformation. Is the mapping valid or invalid?
The answer is: the mapping is invalid. We can never connect data from an active and a passive transformation coming from different pipelines to a passive transformation. (We should take the question this way: can you connect the output of a Joiner and an Expression to an Expression transformation, where the outputs of the Joiner and the Expression come from different pipelines?)

I have one source with 52 million records and I want only 7 records in the target. How will you do it, and what logic would you implement?
If you want to load 7 records from source to target, use a Sequence Generator transformation with Current Value = 0, End Value = 7 and Reset enabled, drag NEXTVAL into an Expression transformation, and then filter the rows with a condition such as NEXTVAL <= 7.

I have ten flat files with the same structure. I want to load them into a single target and the mapping should show only one source. What steps are needed?
Create a file list (a .txt file) containing the paths of all 10 flat files. Then, at session level, set Source filetype: Indirect, Source file directory: the path to the file list, and Source filename: the name of the file list (.txt).

When we use a lookup on 5 records and we do not need the 1st and 2nd records, what is the procedure using the lookup?
Use a lookup SQL override such as: select * from emp minus select * from emp where rownum <= 2.
Note: we cannot use > with ROWNUM.

Can we see the default group when we use a router? If yes, how?
Yes, we can see the default group. When we add a router there are no groups; as soon as we add any group, a default group is also added.
We can add conditions for the user-defined groups according to our requirement; any record that does not meet those conditions goes to the default group. Just connect the default group to another transformation and use it according to your requirement.

What is a left outer join?
For example: Select M.*, D.* from Master M LEFT OUTER JOIN Details D on M.ID = D.ID. This query returns all records from the Master table and only the matching records from the Details table. In simple terms, it takes all rows from the table on the left of the join and the matching rows from the table on the other side of the join.

When we use a dynamic lookup and the condition matches, what will be the output?
It updates the row in the cache or leaves it unchanged. This lookup property is recommended only for mappings where the source has duplicate records.

What is the difference between a view and a materialized view?
A view is just a stored query and has no physical storage. Once a view is instantiated, performance can be quite good, until it is aged out of the cache. A materialized view has a physical table associated with it, so it does not have to resolve the query each time it is queried. Depending on how large the result set is and how complex the query, a materialized view should perform better.

How do you import an Oracle sequence into Informatica?
With the help of a stored procedure, or via a SQL override through an unconnected lookup, we can bring an Oracle sequence into Informatica.

Why can't we put a Sequence Generator or Update Strategy transformation before a Joiner transformation?
A Joiner joins two different sources. If you use an Update Strategy transformation with the DD_DELETE or DD_REJECT option, some of the data gets deleted or rejected and you will not see it at the Joiner output.

What is metadata?
Metadata is data about data. The repository contains the metadata, i.e. all the information about the mappings, tasks and so on.

We have a table with columns c1 a, c2 b, c3 c, c4 x, c5 y, and I need "a b c x" in a single row and "a b c y" in a single row. How do you do this?
You can override the Source Qualifier with the following query:
Select c1, c2, c3, c4 from table1 union all select c1, c2, c3, c5 from table1

How do we use ORDER BY, WHERE and HAVING when implementing a SQL query?
In a query we can use all three, but using WHERE and HAVING for the same purpose does not make sense. If we use GROUP BY, the WHERE clause goes before the GROUP BY and the HAVING clause after it; the ORDER BY clause is generally written at the end of the query.

Where is the cache stored in Informatica?
The cache is stored in the cache directory. For the Aggregator, Joiner and Lookup transformations the cache values are stored in the cache directory; for the Sorter transformation the cache values are stored in the temp directory.

How can we add a header and footer to flat files?
Go to the session, Edit > Mapping tab, select the flat file target, and type the commands in the Header Command and Footer Command options.

What are data merging, data cleansing and sampling?
Data merging: multiple detail values are summarized into a single summarized value. Data cleansing: eliminating inconsistent data. Sampling: the process of arbitrarily reading data from a group of records.

I have a thousand records in my source (flat file) and want to load only 990 of them, skipping the first 5 and the last 5, at Informatica level. How?
Pass the records from the Source Qualifier to an Expression. Create a variable port that acts as a row counter, initialized to 0, and create two parameters holding the boundary values 5 and 995. In the Expression create an output port of number datatype and in the expression editor write: Sequence = SETCOUNTVARIABLE(the variable you created). Connect this to a Router, create a group with the condition Sequence > 5 AND Sequence <= 995, and connect that group to the target.
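For a relational source, a hedged SQL alternative to the expression/router approach above; the EMPLOYEE table and the EMPNO ordering column are assumptions made for illustration:

SELECT *
FROM  (SELECT e.*,
              ROW_NUMBER() OVER (ORDER BY empno) AS rn,   -- position of each row
              COUNT(*)     OVER ()               AS cnt   -- total number of rows
       FROM   employee e)
WHERE rn > 5
AND   rn <= cnt - 5;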
How is the data loaded when designing the schema? Which is loaded first (e.g. dimensions or facts)?
Dimension tables are loaded first and then the fact tables. Because the primary keys of the dimension tables are referenced by the foreign keys of the fact table, we need to load the dimension tables first so that the fact load can look them up properly.

What is a SQL override? What is its use?
It is the process of writing user-defined SQL queries to define SQL joins, source filters, sorted input and the elimination of duplicates.

How do you get the first row without using a Rank transformation?
Step 1: if the first row is not null, we can use the FIRST() function in an Aggregator transformation. Step 2: use a Sequence Generator and a Filter transformation with the condition NEXTVAL = 1. Or use a SQL override in the Source Qualifier, e.g. for table DEPT1:
SELECT DEPT1.DEPTNO, DEPT1.DNAME, DEPT1.LOC FROM DEPT1 WHERE ROWNUM = 1

I have a flat file in which sal holds 10,000 and I want to load the data in the same format, with sal as 10,000. How?
In the target properties we have thousand and hundred separators, or in SQL: select to_char(10000, '99,999.00') from dual;

Tell me some dimension table names in a banking-domain Informatica project (do not say "it depends on the project"; give the dimension and fact table names in your project).
Fact tables: 1) Bank 2) Advances. Dimension tables: 1) Customer 2) Accounts 3) Transaction (a dirty dimension) 4) Time.

Write a SQL query for the following source: a table SUB with columns subject and marks holding maths 30, science 20, social 80; the required output is one row with columns maths, science, social holding 30, 20, 80.
select (select marks from sub where subject = 'maths') maths,
       (select marks from sub where subject = 'science') science,
       (select marks from sub where subject = 'social') social
from dual;
or
select max(decode(subject, 'maths', marks)) maths,
       max(decode(subject, 'science', marks)) science,
       max(decode(subject, 'social', marks)) social
from sub;

Yesterday my session ran for ten minutes; today it runs for 30 minutes. What is the reason, and if there are issues how do we solve them?
Possible reasons for the delay: (1) the amount of source data may be huge; (2) the database connection may have slowed down, so data transfer is slow; (3) if you are using cache-based transformations, there may be a performance issue in the cache.

I want to load data into two targets, one a dimension table and the other a fact table. How can I load both at once?
Generally, in a data warehouse environment, we load data into the dimension table first and then into the fact table, because the fact table contains the primary keys of the dimension tables along with the measures. So first check whether the fact table you are going to load has a foreign key relationship with the dimension table. If yes, use a mapping with two pipelines: load the dimension data in the first pipeline, and in the second pipeline load the fact table by taking a Lookup transformation on the already-loaded dimension table, returning the key value from the lookup, calculating the measures with an Aggregator that groups by the dimension keys, and mapping them to the fact target ports as required. Most importantly, set the Target Load Plan so that the dimension target loads first and the fact target second.
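A hedged SQL sketch of the same idea: the fact insert has to look up the surrogate keys of dimensions that are already loaded. The table and column names (STG_SALES, DIM_CUSTOMER, DIM_DATE, SALES_FACT) are invented for illustration:

INSERT INTO sales_fact (customer_key, date_key, sales_amount)
SELECT c.customer_key,
       d.date_key,
       SUM(s.sale_amount)
FROM   stg_sales    s
JOIN   dim_customer c ON c.customer_id   = s.customer_id   -- dimension loaded first
JOIN   dim_date     d ON d.calendar_date = s.sale_date
GROUP BY c.customer_key, d.date_key;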
Source:
1 a
1 b
1 c
2 a
2 b
2 c
Target:
1 ABC
2 ABC
How do we achieve this at Oracle level and at Informatica level?
In Informatica, first sort on the first column (empid) with a Sorter transformation. Then, in an Expression transformation, keep a variable port holding the previous empid and a variable port v_ename := IIF(previous empid = current empid, v_ename || ' ' || ename, ename), with an output port o_ename = v_ename; finally use an Expression to convert the string from lower case to upper case. (For the Oracle side, see the LISTAGG sketch at the end of this block.)

Tell me how many tables were used in your project, and how many fact and dimension tables.
In my project we have more than 100 tables, but in my module we use only 10 to 15 tables: around 5 or 6 dimension tables and 1 or 2 fact tables.

What is the command to get the list of files in a directory in Unix?
ls

How can I explain my project architecture in an interview?
1. Source systems: e.g. mainframe, Oracle, PeopleSoft, DB2.
2. Landing tables: tables that act like the source; used for easy access, for backup, and as a reusable source for other mappings.
3. Staging tables: from the landing tables we extract the data into staging tables after all validations have been applied.
4. Dimensions/facts: the tables used for analysis and decision-making.
5. Aggregation tables: these hold summarized data, useful for managers who want to view month-wise sales, year-wise sales, etc.
6. Reporting layer: phases 4 and 5 are what reporting developers use to generate reports.
I hope this answer helps you.

What is the difference between session variables and workflow variables?
A workflow variable can be used across the sessions inside that workflow, whereas a session variable is exclusive to that particular session.

What is the full process from source to target, from development to production?
Initially data comes from a company's OLTP systems and gets loaded into a database or flat files with the help of legacy systems or other predefined methods. From there the data is transferred to the staging database, applying business logic with the help of Informatica or other ETL tools; at times stage-to-target is also loaded using Informatica mappings. These are then transferred to a QA (quality analysis) environment, for example as XML files, and from there deployment is done onto the production environment.

I have a source file containing 1|A, 1|B, 1|C, 1|D, 2|A, 2|B, 3|A, 3|B and in the target I should get 1|A+B+C+D, 2|A+B, 3|A+B. Which transformation should I use?
An Aggregator with a group by on the column holding the values 1, 2, 3.

My session has to run Monday to Saturday but not Sunday. How do I schedule this at Informatica level?
In the Scheduler use the Customized Repeat option. Select the week option under "Repeat every"; the day options appear, so select the particular days on which your workflow should run.

What is a dynamic cache?
The dynamic cache represents the data in the target. The Integration Service uses the data in the associated port to insert or update rows in the lookup cache.
1) It is used to insert the data into the cache and the target.
2) Informatica dynamically inserts the data into the target.
3) Data is inserted only when the lookup condition is false, i.e. when no matching data is available in the target and the cache.
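For the Oracle side of the first question in this block (collapsing the rows 1 a / 1 b / 1 c into one row 1 ABC), a hedged sketch using LISTAGG (available from Oracle 11g Release 2 onward); the table and column names are assumptions:

SELECT empid,
       UPPER(LISTAGG(ename, ' ') WITHIN GROUP (ORDER BY ename)) AS enames
FROM   src_table
GROUP  BY empid;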
Get me the result for this input: 1 "x,y,z"; 2 "a,b"; 3 "c". Output: 1 x, 1 y, 1 z, 2 a, 2 b, 3 c.
Use the following flow: Source ---> SQ ---> Expression ---> Normalizer ---> Filter ---> Target.
In the Expression use variable ports to form 3 columns depending on the values received in Column2; i.e. for the value "x,y,z" in Column2 create 3 ports, each holding one value (x, then y, then z). For this use the SUBSTR and INSTR functions: SUBSTR to get part of the string and INSTR to find the position of the delimiter.
VARIABLE_PORT1 ---> SUBSTR(column2, 1, 1)
VARIABLE_PORT2 ---> IIF(INSTR(column2, ',', 1, 1) != 0, SUBSTR(column2, INSTR(column2, ',', 1, 1) + 1, 1), NULL)
VARIABLE_PORT3 ---> IIF(INSTR(column2, ',', 1, 2) != 0, SUBSTR(column2, INSTR(column2, ',', 1, 2) + 1, 1), NULL)
Direct the variable ports to 3 output ports, which go to the Normalizer. In the Normalizer create 2 ports, Column1 and Column2, and set the number of occurrences for Column2 to 3. The Normalizer output (2 ports) is fed to a Filter; in the Filter drop the null values in Column2 if they exist (IIF(ISNULL(Column2), FALSE, TRUE)), and direct the Filter output to the target.

Hi, I am new to Informatica. Can anyone explain step by step how SCD works?
Select all rows. Cache the existing target as a lookup table. Compare the logical key columns in the source against the corresponding columns in the target lookup table. If the key columns match, compare the remaining source columns against the corresponding target columns. Flag new rows and changed rows. Create two data flows: one for new rows, one for changed rows. Generate a primary key for new rows. Insert new rows into the target; update changed rows in the target, overwriting the existing rows.

How do you list the top 10 salaries without using a Rank transformation?
Use a Sorter on salary, then a Sequence Generator, then a Filter transformation:
Sorter (salary in descending order) -----> Sequence Generator ---------> Filter (seq <= 10)

Can you use a flat file as a lookup table? Why?
Yes, we can definitely use a flat file as a lookup, but we cannot use XML files as a lookup. If you want to use an XML file, you first have to convert it to another database table or a flat file; then you can use it.

In my source table I want to delete the first and last records and load the in-between records into the target. How is that possible?
The flow is: source ---> SQ ---> Aggregator ----> Filter ---> target. Generate a sequence number with a Sequence Generator and connect it, together with the flow from the SQ, to the Aggregator; group by the sequence number and create two output ports: 1) MIN(seqnumber) 2) MAX(seqnumber). In the Filter write the condition seqnumber <> min AND seqnumber <> max, and connect the required ports to the target.

How are the facts loaded?
The most important thing about loading fact tables is that you first need to load the dimension tables and then, according to the specification, the fact tables. The fact table is often located in the center of a star schema, surrounded by dimension tables. It has two types of columns: those containing facts and those containing foreign keys to the dimension tables. A typical fact table consists of:
* Measurements: additive measures can be added across all dimensions, non-additive measures cannot be added, and semi-additive measures can be added across only some dimensions.
* Metrics.
* Facts: the fact table frequently has more than one measurement field, and each such field is called a fact. Usually one fact table has at least three dimension tables.
Note: this answer was found at http://www.etltools.org/loading/facts.html

How are parameters defined in Informatica?
Parameters are defined with the mapping parameters/variables wizard. We can pass values to a parameter from outside the mapping without disturbing the design of the mapping, but parameters stay constant until the user changes them.

How do you get sequence numbers from an Oracle sequence in Informatica without using a Sequence Generator transformation?
If you want an Oracle sequence, use a SQL transformation in query mode and write a query such as SELECT sequence_name.NEXTVAL FROM dual in the output port.

How do you run a workflow in Unix?
To run a workflow from Unix you can use the pmcmd command.
Syntax: pmcmd startworkflow -sv <integration service name> -d <domain name> -u <user name> -p <password> -f <folder name> <workflow name>

How will I stop my workflow after 10 errors?
In the session properties we have an option for this (the "Stop on errors" setting).

I have a source value like 1:2;3 and I want to load it into the target as 123.
S.D ---> S.Q ---> Expression T/R ---> TGT. In the Expression transformation create one output port and strip the ':' and ';' characters with the replace functions (REPLACECHR/REPLACESTR), or use a SQL query:
select replace(replace('1:2;3', ':'), ';') from dual;  -- returns 123

What is a workflow variable?
A workflow variable is similar to a mapping variable, except that it carries workflow statistics; for example, if you want to configure multiple runs of a workflow depending on some value, you can do that with a workflow variable.

Which gives better performance, fixed-width or delimited files, and why?
Fixed-width files surely give the best performance, because the reader does not have to check every time where the delimiter occurs.

Two tables from two different databases have the same structure but different data. How do you compare these two tables?
If you want to compare the data present in the tables, go for a join and comparison. If you want to compare the metadata (properties) of the tables, use "Compare Objects" in the Source Analyzer.

A table contains some null values. How do I get "not applicable (NA)" in place of those nulls in the target?
Use an Expression with a new output column: IIF(ISNULL(column_name), 'NA', column_name).

In SCD type 1, what is the alternative to the Lookup transformation?
Use "Update else Insert" in the properties of the session.

One flat file is comma delimited. How do you change that comma delimiter to something else at run time?
We can change it in the session properties, on the Mapping tab: select the flat file and, under Set File Properties, change the delimiter.

Three date formats are coming in. How do you convert them into one format without using an Expression transformation?
Use a SQL override and apply the TO_DATE function against the date columns with the appropriate format mask.

What are the reusable tasks in Informatica?
Reusable tasks are the tasks created in the Task Developer (Session, Command, Email). A task created in the Workflow Designer is a non-reusable task.

What are events in the Workflow Manager?
Events are waits that we place on tasks in a workflow until a specified requirement is fulfilled. They are of two types: 1) predefined (also called a file-watcher event) and 2) user defined. With a predefined event we can wait for a file to be present in a specified path before proceeding with the workflow. With a user-defined event we can make a task wait until another specified task completes; the Event Wait and Event Raise tasks are used in combination for this.

I want to skip the first 5 rows when loading into the target. What will be the logic at session level?
One way to skip records for relational sources is to add a SQL query in the session properties:
Select * from employee minus select * from employee where rownum <= 5
This query skips the first 5 records.
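A hedged alternative for the same requirement that avoids MINUS, assuming an EMPLOYEE table and that "first 5" simply means the first five rows returned:

SELECT *
FROM  (SELECT e.*, ROWNUM AS rn FROM employee e)  -- number the rows first
WHERE rn > 5;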
Why is a flat file load faster compared with a table load?
1) A flat file doesn't contain any indexes or keys, so the data is loaded into it directly, whereas a table load first has to check indexes and keys, which makes it slower. 2) Another reason is that when we load data into a table, the Integration Service also verifies data types and parses the data if needed; for a flat file there is no such parsing or data type checking. 3) While loading data into a table the database writes the data into its logs before loading it into the target, and this is not done when loading a flat file. 4) In general, flat files are kept on the server where Informatica is installed.

If I have an index defined on the target table and I set the session to bulk load, will it work?
Bulk load does not support indexes, so if there is an index on the target and you bulk load, the session will fail. Before using bulk load you have to drop the index on the target and recreate it using the target post-SQL.

Replace function: use this function for the above question:
replacestr(0, col, '$', '!', '@', '#', '%', '^', '&', '*', NULL)
or
replacechr(0, col, '$!@#%^&*', NULL)

I have an Oracle table A and a target B. I don't know how many records there are. I want the last record in table A to become the first record in target table B. Write a SQL query.
Create table b as select * from a order by rownum desc;

I have two tables: table1 has 2 columns and 3 rows, table2 has 3 columns and 2 rows. What is the output of a left outer join, a full outer join and a right outer join?
The data:
table1 (left): c1, c2 = (1, 2), (4, 5), (7, 8)
table2 (right): c3, c4, c5 = (1, 1, 10), (6, 5, 12)
Matching columns: c1 and c3.
Left outer join (all rows from table1): (1, 2, 1, 1, 10), (4, 5, -, -, -), (7, 8, -, -, -)
Right outer join (all rows from table2): (1, 2, 1, 1, 10), (-, -, 6, 5, 12)
Full outer join: (1, 2, 1, 1, 10), (4, 5, -, -, -), (7, 8, -, -, -), (-, -, 6, 5, 12)
Here "-" indicates a null. (The corresponding queries are sketched at the end of this block.)

Which transformation should we use to get the 5th-ranked member from a table in Informatica?
For this we use two transformations: first a Rank transformation, then a Filter with the condition rank = 5, connected to the target. The flow is: src ---> SQ ---> Rank ---> Filter ---> target. We can also do this in SQL with the following query:
select * from (select * from emp order by sal desc) where rownum <= 5
MINUS
select * from (select * from emp order by sal desc) where rownum <= 4

How do you avoid duplicate records without using Source Qualifier, Expression, Aggregator, Sorter or Lookup transformations?
You can use a Unix command in the pre-session command, e.g. sort -u file1 > newfile.

In a mapping, when we use an Aggregator transformation we use a group-by port. If group-by is not selected, by default it returns only the last row. Why?
The Aggregator transformation performs calculations from the first record to the last record. If no group-by port is selected, there is no group, so the Integration Service returns the last record from all the input rows.

What is the use of a data mart?
For overwriting data. For example, we load flat file data from a DSO to a cube; loading data from an InfoProvider used as a data target into another data target is the data mart concept.

How will you get the 1st, 3rd and 5th records in a table? What is the query in Oracle?
Display odd records:
Select * from EMP where (rowid, 1) in (select rowid, mod(rownum, 2) from EMP)
Display even records:
Select * from EMP where (rowid, 0) in (select rowid, mod(rownum, 2) from EMP)

Have you developed documents in your project? What documents do we develop in real time?
We have to create low-level design (LLD) documentation in real time. In it we specify naming conventions, source types, target types, business requirements, logic, etc.
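The join queries behind the result sets shown above, written as a sketch against the same table1/table2 columns:

SELECT t1.c1, t1.c2, t2.c3, t2.c4, t2.c5
FROM   table1 t1
LEFT OUTER JOIN table2 t2 ON t1.c1 = t2.c3;

SELECT t1.c1, t1.c2, t2.c3, t2.c4, t2.c5
FROM   table1 t1
RIGHT OUTER JOIN table2 t2 ON t1.c1 = t2.c3;

SELECT t1.c1, t1.c2, t2.c3, t2.c4, t2.c5
FROM   table1 t1
FULL OUTER JOIN table2 t2 ON t1.c1 = t2.c3;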
My source contains data like this: cno, cname, sal, where cname holds an email address (e.g. for customers 100 and 200), and I want to load the target with only the name part before the @ (e.g. 100 rama 1000, 200 karuna 2000).
In the expression editor, keep only the part of cname before the '@' with an expression along the lines of SUBSTR(cname, 1, INSTR(cname, '@') - 1), and pass this port to the output port.

My source is a comma-delimited flat file with eno, ename, sal given as 111,sri,ram,kumar,1000, and my target should be eno = 111, ename = sri ram kumar, sal = 1000; i.e. we need to eliminate the commas inside the data of a comma-delimited file.
When we load it into the Source Analyzer as comma separated it shows us 5 columns. Use source ---> SQ ---> Expression ---> target, and in the Expression add an additional output port for ename with the expression column2 || ' ' || column3 || ' ' || column4.

Hi, in a mapping I have 3 targets and one fixed-width file as source, 193 records in total. I connected one port from an Aggregator to all 3 targets. The same values are loaded into the 3 targets, but in a different order. Why? Shouldn't the order of insertion be the same for all 3 targets?
Informatica does not guarantee the sequence of records at the time of insertion into targets. If you need a fixed order, use a Sequence Generator transformation, or a Sorter, to impose it.

Hi, in the source I have records like: no, name, address = 10 Manoj Mum; 10 Manoj Delhi; 20 Kumar USA; 20 Kumar Tokyo. I want records in the target like: no, name, addr1, addr2 = 10 Manoj Mum Delhi; 20 Kumar USA Tokyo. If it were the reverse we could do it with a Normalizer transformation by setting the occurrence to 2, but is there a "denormalization" technique like that?
Use a dynamic lookup to check whether the record already exists. If it does not, insert that record with no, name and address1. If it does, use that record to always update the address2 field; this covers the case where the client wants to keep the first address and the current address in the address2 field.

I have a flat file as target in a mapping. When I try to load data a second time, the records already in the flat file get overwritten, and I don't want to lose the existing records. (We could handle this with CDC/incremental logic if the target were relational, but this is a flat file, so the same technique would still overwrite.) Is there any option at session level for a flat file target?
It's very simple: there is an option at session level. Double-click the session ---> Mapping tab ---> target properties ---> check "Append if Exists".

What is version control in Informatica?
Version control is an option chosen while installing the Informatica software; enabling or disabling it determines how instances of objects are kept. For example, with version control enabled, if you create a mapping an instance is created for that mapping, and if you update the mapping another instance is created for the updated mapping. With version control disabled, if you create a mapping an instance is created for it, but when you update the mapping a new instance is created for the updated mapping and the first (initial) instance is deleted.
How do you connect two or more tables to a single Source Qualifier?
1. Drag all the tables into the Mapping Designer.
2. Delete all the Source Qualifiers associated with the tables.
3. Create one SQ transformation.
4. Drag all the columns from all the tables into that SQ transformation.

What is the procedure to use a mapping variable in a Source Qualifier transformation?
Go to the Source Qualifier transformation (with the source and target in place), open its properties and select the user-defined join; the SQL editor opens, and on the left-hand side there is a Variables option where the mapping variables are available.

How do you find out whether a column is numeric, a combination of characters and numbers, or contains characters, numbers and special characters?
In an Expression create a flag output port: IIF(IS_NUMBER(id), 1, 0); then in a Filter the condition flag = 1 returns the numeric values and flag = 0 returns the non-numeric ones. A second way is to use the ASCII() function: ASCII(id) > 64 indicates a letter and ASCII(id) < 64 indicates a digit.

Input flatfile1 has 5 columns and flatfile2 has 3 columns (no common column). The output should contain the merged data (8 columns). How do you achieve this?
Assuming the two files have the same number of records: take the 5 columns of the first file into an Expression transformation and add an output column, say A; create a mapping variable, say countA, of integer datatype, and in port A put the expression SETCOUNTVARIABLE(countA). Take the 3 columns of the second file into another Expression transformation, add an output column B, create a mapping variable countB, and in port B put SETCOUNTVARIABLE(countB). This creates a common field with common data in the two pipelines. Now join the two pipelines in a Joiner transformation on the condition A = B, and connect the required 8 ports to the target.

Input is 1 1 1 2 2 3 and the output should be 1 2 3. How can you achieve this (e.g. with a Rank transformation)?
src -> SQ -> Aggregator -> target. In the Aggregator group by empid; the Aggregator then returns one row per distinct value, i.e. 1, 2, 3.

What is the significance of the NewLookupRow port in a dynamic lookup?
When we configure a Lookup transformation to use a dynamic cache, the Designer automatically adds a port called NewLookupRow to the Lookup. This port indicates with a numeric value whether the Integration Service inserts the row into the cache, updates the row in the cache, or makes no change to the cache. To keep the lookup cache and the target table synchronized, we pass rows to the target when the NewLookupRow value is 1 or 2.

I have a table with columns ID and NAME holding (1, x), (1, y), (1, z), (2, a), (2, b), (3, c), and the requirement is to get each ID with the number of times it repeats (1 -> 3, 2 -> 2), excluding IDs that appear only once (so ID 3 is not returned). Write a SQL query.
Select id, count(*) from <table name> group by id having count(*) > 1;

Every DWH must have a time dimension, so what is the use of the time dimension? How do we calculate sales for one month, half-yearly and yearly using the time dimension?
Take the time dimension into an Expression transformation, create new output ports, and write conditions in the new ports using the GET_DATE_PART(date) function.

There is a table EMP with a salary column. Write a query to get the rows whose salary is greater than the average salary of their particular department.
Select * from EMP e where sal > (select avg(sal) from EMP m where e.deptno = m.deptno) order by deptno;
This gives you the employees whose salary is greater than the average salary of their department.

How can we get unique records into one target table and duplicate records into another target table?
The data flow is: SQ --> Aggregator --> Router --> targets. In the Aggregator create two output ports with conditions like unique_port ----> count(*) = 1 and duplicate_port ---> count(*) > 1. Connect this to a Router, create two groups with the conditions uniquegroup = unique_port and duplicategroup = duplicate_port, and connect the groups to the two targets.

Can we load data into a table without a primary key? What is a target load plan?
We can load data without a primary key, but if you ever need to update that table from Informatica you have to use a target update override; only then can you update the table.

1) What is the alternative to the Update Strategy transformation? 2) Out of 1000 records, the session failed after loading 200 records. How do you load the rest of the records? 3) What is the use of a lookup override?
1) You can update the target table by using a target update override; that is the alternative.
2) Consider session (performance) recovery.
3) A lookup override is nothing but overriding the default SQL generated by the lookup at run time. The default SQL contains SELECT and ORDER BY clauses, and the ORDER BY orders the data based on the ports in the lookup condition. If you want to change the order, we add two hyphens at the end of the override SQL, which comments out the ORDER BY generated by the Informatica server.

How do you merge multiple flat files, for example 100 flat files, without using a Union transformation?
First create a new file (a file list) and paste the locations of all the flat files into it. Import the source definition using "Import from file" and define the mapping. Then set the session properties to the indirect file type of loading and give the location of the newly created file list.

If we set DD_INSERT in the mapping and Delete in the session properties, what will happen?
It will perform a delete, because the session properties override the mapping properties.

What is the difference between grep and find?
grep is used for finding a string inside files.
Syntax: grep <string> <filename>. Example: grep compu details.txt displays every line in which the string "compu" is found.
find is used to find files or directories in a given path.
Syntax: find <filename>. Example: find compu* displays all file names starting with "compu".

How can we perform incremental aggregation? Explain with an example.
You enable Incremental Aggregation in the session properties only. For example, your target table was loaded (with aggregate calculations) with 100 records yesterday; now you have 50 new records (25 updates, 25 inserts). To do the aggregate calculation only for the updated and inserted records you perform incremental aggregation. This is a simple way to increase performance and reduce run time; if you do not perform incremental aggregation you can achieve the same thing another way, but it is lengthier.

What is a time dimension? Give an example.
Time dimension: generally, we use a date dimension to generate dates as per the requirement. If you load data into the fact table on the basis of time/date, then you use the values of the date dimension to populate the fact. We take the last date on which the fact was populated, then check for the existence of dates for the data to be populated; if they are not there, we generate them through a stored procedure or as the requirement dictates. Examples: daily, weekly, financial year, calendar year, business year, etc.

What is the function of F10 in Informatica?
F10 and F5 are used in the debugging process. Pressing F10 moves the process to the next transformation from the current transformation, and the current data can be seen in the bottom panel of the window. F5 processes the full data in one stretch; with F5 you can see the data in the targets at the end of the process but cannot see the intermediate transformation values.

What is data quality? How can a data quality solution be implemented in my Informatica transformations, even internationally?
Data quality is when you verify your data and identify whether the data present in the warehouse is efficient and error free. The data present in each column should carry meaningful information: it should not contain nulls or garbage data, the complete and correct data must be transformed to the target, the data types should be correct as per the requirement, and so on. Asking about Informatica transformations here means you want to know how the source-to-target transformations should be implemented in Informatica.

What is the architecture of a data warehousing project?
Step 01 ------> source to staging
Step 02 ------> staging to dimensions
Step 03 ------> dimensions to facts
Project planning --- requirements gathering --- product selection and installation --- dimensional modeling --- physical modeling --- deployment --- maintenance.
In easy terms, for dimensional modeling: 1. select the business process, 2. identify the grain, 3. design the dimension tables, 4. design the fact table. Once these 4 steps are over, it moves to physical modeling, where you apply the ETL process and performance techniques.

What is a causal dimension?
One of the most interesting and valuable dimensions in a data warehouse is one that explains why a fact table record exists. In most data warehouses, you build a fact table record when something happens. For example:
When the cash register rings in a retail store, a fact table record is created for each line item on the sales ticket. The obvious dimensions of this fact table record are product, store, customer, sales ticket, and time.
At a bank ATM, a fact table record is created for every customer transaction. The dimensions of this fact table record are financial service, ATM location, customer, transaction type, and time.
When the telephone rings, the phone company creates a fact table record for each "hook event." A complete call-tracking data warehouse in a telephone company records each completed call, busy signal, wrong number, and partially dialed call.
In all three of these cases, a physical event takes place, and the data warehouse responds by storing a fact table record. However, the physical events and the corresponding fact table records are more interesting than simply storing a small piece of revenue. Each event represents a conscious decision by the customer to use the product or the service. A good marketing person is fascinated by these events.
Why did the customer choose to buy the product or use the service at that exact moment? If we only had a dimension called "Why Did the Customer Buy My Product Just Now?", our data warehouses could answer almost any marketing question. We call a dimension like this a "causal" dimension, because it explains what caused the event.

How many repositories can we create in Informatica?
In Informatica PowerMart we can create any number of repositories, but we cannot share metadata across the repositories. In Informatica PowerCenter we can also create any number of repositories, and we can designate one repository as a global repository, which can access or share metadata from all the other repositories.

How can we run a workflow with pmcmd?
Connect to pmcmd, then connect to the Integration Service:
pmcmd> connect -sv service_name -d domain_name -u user_name -p password
Start the workflow:
pmcmd> startworkflow -f folder_name workflow_name

What is the exact difference between IN and EXISTS in Oracle?
Consider this query:
Select ename from EMP e where mgr in (select empno from EMP where ename = 'KING');
Here is the EXPLAIN PLAN for it:
SELECT STATEMENT
  NESTED LOOPS
    TABLE ACCESS (FULL) EMP
    TABLE ACCESS (BY INDEX ROWID) EMP
      INDEX (UNIQUE SCAN) PK_EMP
This query is virtually equivalent to:
Select e1.ename from EMP e1, (select empno from EMP where ename = 'KING') e2 where e1.mgr = e2.empno;
You can write the same query using EXISTS by moving the outer query column into a subquery condition, like this:
Select ename from EMP e where exists (select 0 from EMP where e.mgr = empno and ename = 'KING');
When you write EXISTS in a WHERE clause, you are telling the optimizer that you want the outer query to be run first, using each value to fetch a value from the inner query (think: EXISTS = outside to inside).

In what type of scenario do we use bulk loading versus normal loading?
We use bulk loading in scenarios where a bulk amount of data is to be loaded into the target, i.e. when we want to load a large amount of data fast. Use it when you do not need session recovery and your target does not contain any primary keys.

How do you join two flat files if they have different structures? How do you join one relational source and one flat file?
For two flat files, prepare two source instances with the structures you have and load them through simple pass-through mappings into relational staging tables; you then have two relational source tables, which you join using a Joiner. For one relational source and one flat file it is the same idea: convert the flat file to a relational table through a simple pass-through mapping, then join the two relational tables using a Joiner in the same mapping. Alternatively, if the flat files have different structures and no common field, take the two pipelines, add an Expression transformation to each with a dummy port (a = 1 in one and b = 1 in the other), and join them on that condition with a Joiner (a Joiner accepts different sources), then connect to the target.

How do you join two flat files in Informatica?
If the structure of the two flat files is the same, we can use one Source Qualifier with the indirect file type. If there is no common field in the two flat files, create dummy columns in an Expression transformation for each file and join them in a Joiner transformation with the condition dummy = dummy1. The flow is:
src1 ---> SQ ---> EXP --->
                          ---> Joiner ---> target
src2 ---> SQ ---> EXP --->
How do you identify or filter out 0-byte files in a folder using UNIX commands?
Most files in the output of the following commands will be lock files and placeholders created by other applications.
# find ~ -empty
lists all the empty files in your home directory.
# find . -maxdepth 1 -empty
lists all the empty files in the current directory.
# find . -maxdepth 1 -empty -not -name ".*"
lists only the non-hidden empty files in the current directory.

Can we use an unconnected lookup as a dynamic lookup?
No. An unconnected lookup returns only one port, whereas a dynamic lookup can return more than one port and updates or inserts into the cache and target while the session runs.

How can you avoid duplicate rows from a flat file?
Use a Sorter, an Aggregator, or a dynamic lookup.

Why is the Normalizer transformation not allowed in a mapplet?
A mapplet is a reusable piece of logic that you can use across different mappings. The Normalizer is a dynamic transformation that converts rows to columns or vice versa, so its behaviour depends on the input; it is not a fixed piece of logic that can be reused safely in other mappings.

I have 2 mappings, and for these 2 mappings I want to use only one Lookup transformation. How?
We can reuse a Lookup in different mappings. Step 1: create the Lookup in the Transformation Developer; a transformation created in the Transformation Developer is reusable. Step 2: alternatively, select an existing transformation, click the Transformation tab and change it to reusable.

How can one eliminate duplicate data without using the DISTINCT option?
Using the GROUP BY clause removes the duplicate records.

A source table has 10 records. How can I load 20 records into the target (I am not bothered about duplicates)?
SRC ---> TGT instance 1
     \-> TGT instance 2
Have two instances of the target and connect the source to both target instances.

In a Lookup transformation, a SQL override should be done and the cache disabled. How do you do this?
If you disable the cache, you cannot override the default SQL query.

What is the meaning of upgrading the repository?
Upgrading the repository means converting a lower version into a higher version. You can do this in the Repository Manager: right-click, choose the upgrade option, and then add the license and product code.

I have a flat file source and want to load the maximum salary of each deptno into the target. What is the mapping flow?
Use an Aggregator, group by deptno, and create a new port MAX(salary); load deptno and that port, and you get each distinct deptno with its maximum salary.

How do you run a batch using the pmcmd command?
Using a Command task in the workflow.

What is a test load?
The PowerCenter reads and transforms data without writing into the targets. It generates all the session files and runs the pre- and post-SQL functions, as if running a full session. For relational targets the PowerCenter writes the data but rolls it back when the session completes.

How are DTM buffer size and buffer block size related?
The number of buffer blocks in a session = DTM Buffer Size / Buffer Block Size. The default settings create enough buffer blocks for 83 sources and targets; if the session contains more than 83, you may need to increase the DTM Buffer Size or decrease the Default Buffer Block Size.
Required blocks: (total number of sources + total number of targets) * 2 = session buffer blocks
Available blocks: session buffer blocks = 0.9 * (DTM Buffer Size / Default Buffer Block Size) * number of partitions
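A worked example of the formula above, with purely illustrative numbers (a 12,000,000-byte DTM buffer, a 64,000-byte default block size, one partition):
available blocks = 0.9 * (12,000,000 / 64,000) * 1 ≈ 168
required blocks for 2 sources and 1 target = (2 + 1) * 2 = 6, which fits easily; above roughly 84 sources and targets you would exceed 168 blocks and would have to raise the DTM buffer size or lower the block size.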
What transformations and objects cannot be used in a mapplet?
1. Normalizer transformations 2. COBOL sources 3. XML Source Qualifier transformations 4. XML sources 5. Target definitions 6. Other mapplets 7. Pre- and post-session stored procedures.

Define the Informatica repository.
The Informatica repository is a central metadata storage place, which contains all the information necessary to build a data warehouse or a data mart: metadata such as source definitions, target definitions, business rules, sessions, mappings, workflows, mapplets, worklets, database connections, user information, shortcuts, etc.

How much memory (size) is occupied by a session at runtime?
A session contains a mapping with its sources, transformations and targets. The size of a session depends on the caches used by the different transformations in the mapping and on the size of the data that passes through those transformations.

What are the different options used to configure sequential batches?
Two options: run the session only if the previous session completes successfully, or always run the session.

From where do you extract the data, and how do you bring it into Informatica?
If the source is relational tables, the source data is relational (e.g. Oracle 9i/10g). If the source is a flat file, it may sit on a UNIX server (the client gives the path details); in Informatica we create the structure of the table and give that path in the session properties tab.

What are your sources in the project and how do you import them into Informatica? How can I explain this?
Sources in Informatica differ from client to client and project to project, but mostly the client sends sample data through flat files, and the metadata of the sample data is imported in the Source Analyzer by clicking the "Import from file" option.

What is data modeling? What are the types of modeling, and in which situation is each used?
Data modeling is the process of designing a data mart or data warehouse. There are three phases:
1) Conceptual design: the database architects and managers understand the client requirements and then identify the attributes and entities (columns and tables).
2) Logical design: the dimension tables and fact tables are identified, along with the relationships between them; the schema now looks like either a star or a snowflake schema. Logical design is done with data modeling tools such as ER/Studio or ERwin.
3) Physical design: once the logical design is approved by the client, it is converted into physical existence in the database.

When do we use an unconnected versus a connected lookup? How does it affect mapping performance?
If you want to perform a lookup on fewer values, go for a connected lookup; if you want to perform a lookup from more than one table, go for an unconnected lookup; and if your source has many date columns, you should go for an unconnected lookup.

What is the difference between a warehouse key and a surrogate key?
Surrogate key concept: surrogate keys are generated by the system and identify a unique entity (yes, an entity and not a record), while a primary key is used for finding a unique record.
A simple classical example of a surrogate key: on the 1st of January 2002, employee E1 belongs to business unit BU1 (that is what would be in your employee dimension). This employee has turnover allocated to him on business unit BU1, but on the 2nd of June the employee E1 is moved from business unit BU1 to business unit BU2.
All the new turnover has to belong to the new business unit BU2, but the old turnover should still belong to business unit BU1. If you used the natural business key E1 for your employee within your data warehouse, everything would be allocated to business unit BU2, even what actually belongs to BU1. If you use surrogate keys, you can create on the 2nd of June a new record for employee E1 in your employee dimension with a new surrogate key. This way, in your fact table, your old data (before the 2nd of June) carries the SID of the employee for E1 in BU1, and all new data (after the 2nd of June) takes the SID of the employee for E1 in BU2.
You could consider a slowly changing dimension as an enlargement of your natural key: the natural key of the employee was the employee code E1, but it effectively becomes employee code plus business unit (E1 BU1 or E1 BU2). The difference with a natural-key enlargement process is that you might not have all parts of your new key within your fact table, so you might not be able to join on the new enlarged key; that is why you need another id.

After we make a folder shared, can it be reversed? Why?
Folders cannot be unshared, because it has to be assumed that users have created shortcuts to objects in those folders. Un-sharing them would render these shortcuts useless and could have disastrous consequences.

What is the filename which you need to configure in UNIX while installing Informatica?
pmserver.cfg

How do you know when to use a static cache and a dynamic cache in a Lookup transformation?
A dynamic cache is generally used when you apply the lookup on a target table and, in one flow, the same data comes twice for insertion, or once for insertion and once for update. Performance: a dynamic cache decreases performance compared with a static cache, since it first checks the whole cache to see whether the data is already present and inserts only if it is not, which takes more time; a static cache does no such checks and just looks up the data as it comes.

Does the Sequence Generator transformation use a cache? If so, what type of cache is it?
The Sequence Generator uses a cache when it is reusable; this option exists to facilitate multiple sessions using the same reusable Sequence Generator. The number of values cached can be set in the properties of the Sequence Generator. The type of cache is not specified here.

Explain a grouped cross tab.
A grouped cross tab is the same as a cross-tab report, only grouped. For example, with the EMP and DEPT tables, take EMPNO as the row, ENAME as the column, DEPTNO as the group item and SAL as the cell; the report then shows, for each DEPTNO group (10, 20, ...), a grid of employees against their salaries.

Explain HLD and LLD.
HLD (high-level design): it refers to the functionality to be achieved to meet the client requirement. Precisely speaking, it is a diagrammatic representation of the client's operational systems, staging areas, data warehouse and data marts, and of how and at what frequency the data is extracted and loaded into the target database.
LLD (low-level design): it is prepared for every mapping along with a unit test plan. It contains the names of the source definitions, target definitions, transformations used, column names, data types, the business logic written, the source-to-target field matrix, the session name and the mapping name.
A Transformer is a __________ stage. Options: 1) Passive 2) Active 3) Dynamic 4) Static.
Dynamic, more than an active stage, because it does not take space in your database: it is initiated at run time with the session, caches data, performs the transformations and ends with the session.

How can you connect a client to your Informatica server if the server is located at a different place (not local to the client)?
You need to connect remotely to the server and access the repository. You will be given a repository user name and password; add this repository and connect to it with your credentials.