CP7019-Managing Big Data-Anna University -Question Paper

Reg No.

Question Paper Code : 13277

M.E. DEGREE EXAMINATION, NOVEMBER / DECEMBER 2014
Elective
Computer Science and Engineering
CP7019 – MANAGING BIG DATA
(Common to M.E. Computer Science and Engineering (with specialization in Networks) and M.E. Biometrics and Cyber Security)
(Regulations 2013)

Time : Three hours
Maximum : 100 Marks

Answer ALL questions

PART A – (10 x 2 = 20 Marks)

1. What is Big Data? Why do we need to analyze Big Data?
2. Write down any four industry examples for Big Data.
3. Compare and contrast NoSQL vs. Relational Databases.
4. Write down the disadvantages of Aggregate Oriented Databases. How can they be overcome?
5. Define “Data Locality Optimization”.
6. State the purpose of Hadoop Pipes.
7. What do you mean by Apache Oozie? What are its contents?
8. List the entities of YARN.
9. How is Cassandra integrated with Hadoop?
10. List the tools related to Hadoop.

PART B – (5 x 16 = 80 Marks)

11. (a) (i) Discuss the three dimensions of Big Data. (10)
        (ii) Why is Crowd Sourcing Analytics needed? Explain. (6)
    Or
    (b) (i) Why is Hadoop called a Big Data technology? Explain how it supports Big Data. (10)
        (ii) Illustrate how Cloud and Big Data are related to each other. (6)

12. (a) (i) With the help of a Data Model, explain aggregations and relations. (8)
        (ii) Give an example for Map Reduce Calculations. (8)
    Or
    (b) (i) Explain how to combine Sharding and Replication. Give an example code. (10)
        (ii) Describe Graph Databases and Schemaless Databases. (6)

13. (a) With an example code, explain how Hadoop analyzes data. (16)
    Or
    (b) (i) Discuss the steps involved in designing HDFS. (10)
        (ii) Show how a client reads and writes data in HDFS. (6)

14. (a) With necessary diagram, explain the Anatomy of a MapReduce Job run. (16)
    Or
    (b) Discuss the different types and formats of MapReduce with an example for each one. (16)

15. (a) Give an overview on:
        (i) HBase Data Model (8)
        (ii) Pig Data Model (8)
    Or
    (b) (i) What are the different ways to insert data into a table using Hive? Give a sample query for each kind. (8)
        (ii) Write down the queries involved in Hive Data Definition. (8)

B. BHUVANESWARAN / AP (SS) / CSE / REC

Reg No.

Question Paper Code : 63326

M.E. DEGREE EXAMINATION, NOVEMBER / DECEMBER 2015
Elective
Computer Science and Engineering
CP7019 – MANAGING BIG DATA
(Common to M.E. Computer Science and Engineering (with specialization in Networks) and M.E. Biometrics and Cyber Security)
(Regulations 2013)

Time : Three hours
Maximum : 100 Marks

Answer ALL questions

PART A – (10 x 2 = 20 Marks)

1. What are the three dimensions used in Big Data?
2. Define Crowd Sourcing Analytics.
3. What is a schemaless database?
4. What are aggregates?
5. What is MapReduce data flow with multiple related tasks?
6. What is failover and fencing?
7. Define TaskTracker failure.
8. What is shuffle and sort?
9. What are the different ways of executing a Pig program?
10. Contrast TINYINT and SMALLINT in Hive data types.

PART B – (5 x 16 = 80 Marks)

11. (a) What is the role of Big Data Analytics in industries? Illustrate with three domain examples. (16)
    Or
    (b) (i) Explain the components of the Hadoop system. (8)
        (ii) Brief about Inter-firewall and Trans-firewall analytics. (8)

12. (a) (i) Explain the process of partitioning and combining in a MapReduce system. (8)
        (ii) Explain peer-to-peer replication in distribution models. (8)
    Or
    (b) Describe Materialized Views and Sharding.

13. (a) Describe the following in terms of HDFS: (16)
        (i) Name Node and Data Node (4)
        (ii) Basic Filesystem operations in Hadoop (4)
        (iii) Query in Filesystem (4)
        (iv) Coherency Model in Hadoop Filesystem (4)
    Or
    (b) (i) Explain the concept of serialization in Hadoop. (8)
        (ii) Write short notes on sequence file and map file in file based data structures. (8)

14. (a) (i) Write down and explain the MapReduce workflow for the following system: Find the mean maximum recorded temperature for every day of the year and every weather station. (10)
        (ii) Explain packaging, deployment and running for the above workflow job. (6)
    Or
    (b) (i) Explain YARN MapReduce in the anatomy of MapReduce job runs. (8)
        (ii) Explain any two multi-user schedulers in MapReduce. (8)

15. (a) (i) Explain in detail about HBase implementation. (8)
        (ii) Discuss common issues when running an HBase cluster under load. (8)
    Or
    (b) (i) Explain the four types of functions in Pig. (8)
        (ii) Briefly explain the join operations using Hive. (8)

