PART III SOLUTIONS TO REVIEW QUESTIONS AND EXERCISES

Database Systems: Instructor's Guide - Part III Solutions to Review Questions and Exercises

Part One Background
Chapter 1 Introduction to Databases
Chapter 2 Database Environment
Chapter 3 The Relational Model
Chapter 4 Database Planning, Design, and Administration

Part Two Methodology
Chapter 5 Entity-Relationship Modeling
Chapter 6 Normalization
Chapter 7 Methodology - Conceptual Database Design
Chapter 8 Methodology - Logical Database Design for Relational Model
Chapter 9 Methodology - Physical Database Design for Relational DBMSs
Chapter 10 Conceptual Database Design Methodology - Worked Example
Chapter 11 Logical Database Design Methodology - Worked Example
Chapter 12 Physical Database Design Methodology - Worked Example

Part Three Database Languages
Chapter 13 SQL
Chapter 14 Advanced SQL
Chapter 15 QBE

Part Four Selected Database Issues
Chapter 16 Security
Chapter 17 Transaction Management
Chapter 18 Query Processing

Part Five Current Trends
Chapter 19 Distributed DBMSs - Concepts and Design
Chapter 20 Distributed DBMSs - Advanced Concepts
Chapter 21 Introduction to Object DBMSs
Chapter 22 Object-Oriented DBMSs
Chapter 23 Object-Relational DBMSs

Part Six Future Trends
Chapter 24 Web Technology and DBMSs
Chapter 25 Data Warehousing
Chapter 26 OLAP and Data Mining

Part One Background

Chapter 1 Introduction to Databases

Review Questions

1.1 List four examples of database systems other than those listed in Section 1.1.

Some examples could be:
• A system that maintains component part details for a car manufacturer;
• An advertising company keeping details of all clients and adverts placed with them;
• A training company keeping course information and participants' details;
• An organization maintaining all sales order information.

1.2 Discuss each of the following terms:

Data
For end users, this constitutes all the different values connected with the various objects/entities that are of concern to them. (See also Section 1.3.3)

Database
(See Section 1.3.1)

Database Management System
(See Section 1.3.2)

Data Independence
This is essentially the separation of underlying file structures from the programs that operate on them, also called program-data independence.
(See also Sections 1.2.2 and 1.3.1)

Security
The protection of the database from unauthorized users, which may involve passwords and access restrictions. (See also Section 1.6)

Integrity
The maintenance of the validity and consistency of the database by use of particular constraints that are applied to the data. (See also Section 1.6)

Views
These present only a subset of the database that is of particular interest to a user. Views can be customized (for example, field names may change), and they also provide a level of security by preventing users from seeing certain data. (See also Section 1.3.2)

1.3 Describe the approach taken to the handling of data in the early file-based systems. Discuss the disadvantages of this approach.

The focus was on applications for which programs would be written, and all the data required would be stored in a file or files owned by the programs. (See also Section 1.2). Each program was responsible only for its own data, which could be repeated in other programs' data files. Different programs could be written in different languages, and would not be able to access another program's files. This would be true even for programs written in the same language, because a program needs to know a file's structure before it can access it. (See also Section 1.2.2).

1.4 Describe the main characteristics of the database approach and contrast it with the file-based approach.

The focus is now on the data first, and then the applications. The structure of the data is kept separate from the programs that operate on the data; it is held in the system catalog or data dictionary. Programs can now share data, which is no longer fragmented. There is also a reduction in redundancy, and program-data independence is achieved. (See also Section 1.3)

1.5 Describe the five components of the DBMS environment and discuss how they relate to each other.

See Section 1.3.3.
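
The contrast drawn in 1.3 and 1.4 between a file structure embedded in a program and a structure held in a catalog can be illustrated with a minimal Python sketch. The fixed-width renter layout, the field names, and both function names are hypothetical and chosen only for illustration; a real DBMS catalog is far richer than the dictionary used here.

```python
# File-based style: the record layout is hard-coded inside the program.
def parse_fixed(line):
    # Assumed layout: characters 0-4 hold the renter number, 5-24 the name.
    return {"renter_no": line[0:5].strip(), "name": line[5:25].strip()}


# Database style: the layout is held in a catalog (data dictionary)
# that is separate from the program, so the program reads the layout
# instead of embedding it.
catalog = {"renter": [("renter_no", 0, 5), ("name", 5, 25)]}


def parse_with_catalog(relation, line, catalog):
    return {
        field: line[start:end].strip()
        for field, start, end in catalog[relation]
    }


record = "R0001" + "John Smith".ljust(20)

# Both approaches agree while the layout is unchanged.
same_result = parse_fixed(record) == parse_with_catalog("renter", record, catalog)

# The renter number is later widened to six characters. Only the
# catalog entry changes; parse_with_catalog itself is untouched.
new_catalog = {"renter": [("renter_no", 0, 6), ("name", 6, 26)]}
new_record = "R00001" + "John Smith".ljust(20)

catalog_result = parse_with_catalog("renter", new_record, new_catalog)
stale_result = parse_fixed(new_record)  # misreads the new layout
```

With the hard-coded layout, every program that reads the file must be edited when the structure changes; with the catalog, only the catalog entry is updated.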
1.6 Discuss the roles of the following personnel in the database environment:

Data Administrator            See Section 1.4.1
Database Administrator        See Section 1.4.1
Logical Database Designer     See Section 1.4.2
Physical Database Designer    See Section 1.4.2
Application Programmer        See Section 1.4.3
End Users                     See Section 1.4.4

1.7 Discuss the advantages and disadvantages of database processing.

See Section 1.6

Exercises

1.8 Interview some users of database systems. Which DBMS facilities do they find most useful and why? Which DBMS facilities do they find least useful and why? What do these users perceive to be the advantages and disadvantages of the DBMS?

Select a variety of users for a particular DBMS. If the users are using different DBMSs, group the answers by system, which will give an overall picture of each specific system.

1.9 Write a small program that allows entry and display of renter details including a renter number, name, address, telephone number, preferred number of rooms and maximum rent. The details should be stored in a file. Enter a few records and display the details. Now repeat this process but rather than writing a special program, use any DBMS that you have access to. What can you conclude from these two approaches?

The program can be written in any appropriate programming language, such as Pascal, FORTRAN, or C. It should adhere to basic software engineering principles, including being well-structured, modular, and suitably commented. It is important to appreciate the process involved even in developing a small program such as this. The DBMS facilities to structure, store, and retrieve data are used to the same effect. The differences between the approaches, such as the effort involved, the potential for extension, and the ability to share the data, should be noted.

1.10 Study the DreamHome case study presented in Section 1.7. In what ways would a DBMS help this organization?
What data can you identify that needs to be represented in the database? What relationships exist between the data? What queries do you think are required?

It may be useful to review the file-based approach and the database approach here before tackling the first part of the exercise. Careful reading and thinking about how people might use the applications should help in carrying out the rest of the exercise.

1.11 Study the Wellmeadows Hospital case study presented in Appendix A. In what ways would a DBMS help this organization? What data can you identify that needs to be represented in the database? What relationships exist between the data?

The approach used for Exercise 1.10 should be used for this exercise also.

Chapter 2 Database Environment

Review Questions

2.1 Discuss the concept of data independence and explain its importance in a database environment.

See Section 2.1.5

2.2 To address the issue of data independence, the ANSI-SPARC 3-level architecture was proposed. Compare and contrast the 3 levels of this model.

See Section 2.1

2.3 What is a data model? Discuss the main types of data models.

A data model is an integrated collection of concepts for describing data, relationships between data, and constraints on the data in an organization. (See also Section 2.3). Types include:
• Object-based data models such as the entity-relationship model (see Section 2.3.1).
• Record-based data models such as the relational data model, network data model, and hierarchical data model (see Section 2.3.2).

2.4 Discuss the function and importance of conceptual modeling.

See Section 2.3.4.

2.5 Describe the types of facilities you would expect to be provided in a multi-user Database Management System.
Data Storage, Retrieval and Update
A User-Accessible Catalog
Transaction Support
Concurrency Control Services
Recovery Services
Authorization Services
Support for Data Communication
Integrity Services
Services to Promote Data Independence
Utility Services

See also Section 2.4

2.6 Of the facilities in 2.4, which ones do you think would not be needed in a standalone PC Database Management System? Provide justification for your answer.

Concurrency Control Services - only single user.
Authorization Services - only single user, but may be needed if different individuals are to use the DBMS at different times.
Utility Services - limited in scope.
Support for Data Communication - only standalone system.

2.7 Describe the main components in a DBMS and suggest which components are responsible for each facility identified in question 2.5.

Components:
Query Processor, DML Preprocessor, Query Optimizer, Data Manager
Dictionary Manager
Data Manager
Scheduler
Utilities
Authorization Control
Utilities
Integrity Checker
Database Manager, DDL Compiler, File Manager

Facilities:
Data Storage, Retrieval and Update
A User-Accessible Catalog
Transaction Support
Concurrency Control Services
Recovery Services
Authorization Services
Support for Data Communication
Integrity Services
Services to Promote Data Independence
Utility Services

See also Sections 2.5 and 2.4.

2.8 What is meant by the term "client-server architecture" and what are the advantages of this approach? Compare the client-server architecture with two other architectures.

The client is a process that requires some resource, and the server provides the resource. The two need not reside on the same machine. Advantages include:
• Better performance
• Likely reduction in hardware costs
• Reduction in communication costs
• Better consistency

See also Section 2.6.

2.9 Discuss the function and importance of the data dictionary.

See Section 2.7

Exercises

2.10 Analyze the DBMSs that you are currently using.
Determine each system's compliance with the functions that we would expect to be provided by a DBMS. What types of languages does each system provide? What type of architecture does each DBMS use? Check the accessibility and extensibility of the data dictionary. Is it possible to export the data dictionary to another system?

To do this you will need to obtain appropriate information about each system. There should be manuals available or possibly someone in charge of each system who could supply information.

2.11 Write a program that stores names and telephone numbers in a database. Write another program that stores names and addresses in a database. Modify the programs to use external, conceptual, and internal schemas. What are the advantages and disadvantages of this modification?

The programs can be written in any suitable language and should be well structured and appropriately commented. Two distinct files result. The structures can be combined into one containing name, address, and tel_no, which can be the representation of both the internal and conceptual schemas. The conceptual schema should be created separately with a routine to map the conceptual to the internal schema. The two external schemas also must be created separately with routines to map the data between the external and the conceptual schema. The two programs should then use the appropriate external schema and routines.

2.12 Write a program that stores names and dates of birth in a database. Extend the program so that it stores the format of the data in the database; in other words, create a data dictionary. Provide an interface that makes this data dictionary accessible to external users.

Again, the program can be written in any suitable language. It should then be modified to add the data format to the original file. This should not be difficult, if the original program is well structured. The interface for other users operates on the data dictionary and is separate from the original program.
A menu-based interface is adequate.

2.13 How would you modify this program to conform to a client-server architecture? What would be the advantages and disadvantages of this modification?

The server should hold the data dictionary and the programs that operate on it. The user interface should be separate, on the client, and call the data dictionary programs.

Chapter 3 The Relational Model

Review Questions

3.1 Discuss each of the following concepts in the context of the relational data model:
(a) Relation
(b) Attribute
(c) Tuple
(d) Intension and Extension
(e) Degree and Cardinality.

Each term is defined in Section 3.2.1.

3.2 Discuss the differences between the candidate keys and the primary key of a relation. Explain what is meant by a foreign key. How do foreign keys of relations relate to candidate keys?

The primary key is the candidate key that is selected to identify tuples uniquely within a relation. A foreign key is an attribute or set of attributes within one relation that matches the candidate key of some (possibly the same) relation.

3.3 Define the two principal integrity rules for the relational model. Discuss why it is desirable to enforce these rules.

The two rules are Entity Integrity (Section 3.3.2) and Referential Integrity (Section 3.3.3).

3.4 Define the five basic relational algebra operations. Define the remaining three relational algebra operations in terms of the five basic operations.

The five basic operations are:
• Selection and Projection (Unary)
• Cartesian Product, Union and Set Difference (Binary).

There are also the Join, Intersection and Division operations:
• The θ-Join can be rewritten in terms of the basic Selection and Cartesian Product operations:
  R ⋈F S = σF(R × S)
• The Intersection operation can be expressed in terms of the Set Difference operation:
  R ∩ S = R - (R - S)
• The Division operation R ÷ S can be expressed in terms of the basic operations, where C is the set of attributes of R that are not attributes of S:
  T1 = ΠC(R)
  T2 = ΠC((S × T1) - R)
  T = T1 - T2

3.5 What is a view?
Discuss the difference between a view and a base relation. Explain what happens when a user accesses a database through a view.

A view is the dynamic result of one or more relational operations operating on the base relations to produce another relation. A base relation exists as a set of data in the database. A view does not contain any data; rather, a view is defined as a query on one or more base relations, and a query on the view is translated into a query on the associated base relations.

Exercises

The following tables form part of a database held in a relational DBMS:

Hotel    (Hotel_No, Name, Address)
Room     (Room_No, Hotel_No, Type, Price)
Booking  (Hotel_No, Guest_No, Date_From, Date_To, Room_No)
Guest    (Guest_No, Name, Address)

where
Hotel contains hotel details and Hotel_No is the primary key;
Room contains room details for each hotel and (Hotel_No, Room_No) forms the primary key;
Booking contains details of the bookings and the primary key comprises (Hotel_No, Guest_No, and Date_From);
and Guest contains guest details and Guest_No is the primary key.

3.6 Generate the relational algebra for the following queries:

(a) List all hotels.

HOTEL

(b) List all single rooms with a price below £20 per night.

σtype='S' AND price < 20(ROOM)

(c) List the names and addresses of all guests.

Πname, address(GUEST)

(d) List the price and type of all rooms at the Grosvenor Hotel.

Πprice, type(ROOM ⋈hotel_no (σname='Grosvenor Hotel'(HOTEL)))

(e) List all guests currently staying at the Grosvenor Hotel.

GUEST ⋈guest_no (σdate_from <= '01-01-99' AND date_to >= '01-01-99' (BOOKING ⋈hotel_no (σname='Grosvenor Hotel'(HOTEL))))

(substitute today's date for '01-01-99').

(f) List the details of all rooms at the Grosvenor Hotel, including the name of the guest staying in the room, if the room is occupied.
(ROOM ⋈hotel_no (σname='Grosvenor Hotel'(HOTEL)))
⟕ // Outer Join
Πguest.name, hotel.hotel_no, room.room_no(GUEST ⋈guest_no (σdate_from <= '01-01-99' AND date_to >= '01-01-99' (BOOKING ⋈hotel_no (σname='Grosvenor Hotel'(HOTEL)))))

(substitute today's date for '01-01-99').

(g) List the guest details (Guest_No, Name and Address) of all guests staying at the Grosvenor Hotel.

Πguest_no, name, address(GUEST ⋈guest_no (σdate_from <= '01-01-99' AND date_to >= '01-01-99' (BOOKING ⋈hotel_no (σname='Grosvenor Hotel'(HOTEL)))))

(substitute today's date for '01-01-99').

3.7 Using relational algebra, create a view of all rooms in the Grosvenor Hotel, excluding price details. What would be the advantages of this view?

Πroom_no, hotel_no, type(ROOM ⋈hotel_no (σname='Grosvenor Hotel'(HOTEL)))

Security - hides the price details from people who should not see them.
Reduced complexity - a query against this view is simpler than a query against the two underlying base relations.

3.8 Produce the equivalent tuple and domain relational calculus statements for the above queries.

Tuple Relational Calculus

(a) RANGE OF H IS HOTEL
{H}

(b) RANGE OF R IS ROOM
{R | R.Type = 'S' AND R.Price < 20}

(c) RANGE OF G IS GUEST
{G.Name, G.Address}

(d) RANGE OF H IS HOTEL
RANGE OF R IS ROOM
{R.Price, R.Type | ∃H (R.Hotel_No = H.Hotel_No AND H.Name = 'Grosvenor Hotel')}

(e) RANGE OF H IS HOTEL
RANGE OF G IS GUEST
RANGE OF B IS BOOKING
{G | ∃B ((B.Date_From <= '01-01-99' AND B.Date_To >= '01-01-99') AND (B.Guest_No = G.Guest_No) AND ∃H (B.Hotel_No = H.Hotel_No AND H.Name = 'Grosvenor Hotel'))}

(f) Need to use Union of a relation containing all Rooms that are occupied with a relation extended by a NULL name for all unoccupied rooms.
(g) RANGE OF H IS HOTEL
RANGE OF G IS GUEST
RANGE OF B IS BOOKING
{G.Guest_No, G.Name, G.Address | ∃B ((B.Date_From <= '01-01-99' AND B.Date_To >= '01-01-99') AND (B.Guest_No = G.Guest_No) AND ∃H (B.Hotel_No = H.Hotel_No AND H.Name = 'Grosvenor Hotel'))}

Domain Relational Calculus

(a) {Hotel_No, Name, Address | ∃Hotel_No, Name, Address (HOTEL(Hotel_No, Name, Address))}

(b) {Room_No, Hotel_No, Type, Price | ∃Room_No, Hotel_No, Type, Price (ROOM(Room_No, Hotel_No, Type, Price) AND Type = 'S' AND Price < 20)}

(c) {Name, Address | ∃Name, Address (GUEST(Name, Address))}

(d) {Price, Type | ∃Room_No, Type, Price, Hotel_No (ROOM(Hotel_No, Type, Price) AND ∃Name (HOTEL(Hotel_No, Name) AND Name = 'Grosvenor Hotel'))}

(e) {Guest_No, Guest.Name, Guest.Address | ∃Guest_No, Name, Address (GUEST(Guest_No, Name, Address) AND ∃Date_From, Date_To, Hotel_No (BOOKING(Hotel_No, Guest_No, Date_From, Date_To) AND Date_From <= '01-01-99' AND Date_To >= '01-01-99' AND ∃Name (HOTEL(Hotel_No, Name) AND Name = 'Grosvenor Hotel')))}

(f) Rest similar to above.

3.9 Explain how the entity and referential integrity rules apply to these relations.

For each relation, the primary key must not contain any nulls.

Room is related to Hotel through the attribute Hotel_No. Therefore, the Hotel_No in Room should either be null or contain the number of an existing hotel in the Hotel relation. In this case study, it would probably be unacceptable to have a Hotel_No in Room with a null value.

Booking is related to Hotel through the attribute Hotel_No. Therefore, the Hotel_No in Booking should either be null or contain the number of an existing hotel in the Hotel relation. However, because Hotel_No is also part of the primary key, a null value for this attribute would be unacceptable. Similarly for Guest_No. Booking is also related to Room through the attribute Room_No.

3.10 Analyze the RDBMSs that you are currently using.
Determine the support the system provides for primary keys, alternate keys, foreign keys, relational integrity and views. What types of relational languages does the system provide? For each of the languages provided, what are the equivalent operations for the eight relational algebra operations?

This is a small student project, the result of which depends on the system analyzed.

Chapter 4 Database Planning, Design, and Administration

Review Questions

4.1 Discuss the relationship between the information system lifecycle and the database application lifecycle.

See Sections 4.1 and 4.2.

4.2 Describe the purpose of each stage of the database application lifecycle.

See Section 4.2.

4.3 Identify some of the techniques available to help document the users' requirements specification.

See Section 4.2.3.

4.4 Describe the main aims of the conceptual and logical database design phases.

The aims of conceptual database design are described in Section 4.3.2 and the aims of logical database design are described in Section 4.3.3.

4.5 Explain why it is necessary to select the target database management system before commencing with the physical database design phase. Describe the main aims of physical database design.

Physical database design is tailored to a specific DBMS; it is therefore essential that the DBMS is determined before the physical design phase can begin. The aims of physical database design are described in Section 4.3.4

4.6 Describe the prototype approach and identify the potential advantages of using this approach.

See Section 4.2.7.

4.7 Outline the procedure for selecting a DBMS.

See Section 4.6.

4.8 Define the purpose and tasks associated with data administration and database administration.

See Section 4.7.

Exercises

4.9 Produce a corporate data model for the Wellmeadows Hospital case study described in Appendix A.
[Figure: corporate data model for the Wellmeadows Hospital case study, drawn as an ER diagram covering three areas: Hospital Supplies, Patient Management, and Ward Management. Entities shown include Doctor, Clinic, Patient, NOK, OPClinic, Appointmt, InPatient, Medication, Drug, Item, Supply, Supplier, Requisition, Staff, Ward, and Bed.]

4.10 Assume that you are responsible for selecting a new DBMS product for a group of users in your organization. To undertake this exercise, you must first establish a set of requirements for the group and then identify a set of features that a DBMS product must provide to fulfill the requirements. Describe the process of evaluating and selecting the best DBMS product.

The student should follow the approach to DBMS selection described in Section 4.6 and produce a report that identifies a suitable DBMS product that meets the requirements of the organization. The selection should be fully justified.

4.11 Assume that you are responsible for selecting a DBMS product for the Wellmeadows Hospital case study. Describe the process of evaluating and selecting the best DBMS product.

The student should follow the approach to DBMS selection described in Section 4.6 and produce a report that identifies a suitable DBMS product that meets the requirements of the Wellmeadows Hospital case study. The selection should be fully justified and any assumptions made about the case study should be highlighted.

4.12 Investigate whether data administration and database administration exist as distinct functional areas within your organization. If identified, describe the organization, responsibilities, and tasks associated with each functional area.

The student should investigate the organization and identify whether data administration and database administration exist as distinct functional areas.
However, the student should be careful to note that these functions may be named differently, merged as a single function or included as part of a larger IT/IS function. If identified, the student should compile a report documenting the organization, responsibilities, and tasks.

Part Two Methodology

Chapter 5 Entity-Relationship Modeling

Review Questions

5.1 Describe the purpose of high-level data models in database design.

The main purpose for developing a high-level data model is to support a user's perception of data and to conceal the more technical aspects associated with database design. Furthermore, a conceptual data model is independent of the particular DBMS and hardware platform that is used to implement the database.

5.2 Describe the basic concepts of the Entity-Relationship (ER) model. Present the diagrammatic representation of these concepts.

The basic concepts of the Entity-Relationship model include entity types, relationship types, and attributes. See Section 5.1

5.3 Describe the constraints that may be placed on participating entities in a relationship.

There are two main types of restrictions on relationships, called cardinality and participation constraints. See Section 5.2

5.4 Describe the problems that may occur when creating an ER model.

Problems may occur due to the misinterpretation of the meaning of certain relationships; these problems are referred to as connection traps. There are two main types of connection traps, called fan traps and chasm traps. See Section 5.3

5.5 Describe the main concepts associated with the Enhanced Entity-Relationship model. Present the diagrammatic representation of these concepts.

The main concepts of the EER model are specialization/generalization, aggregation and categorization.
However, in the accompanying textbook, we focus only on specialization/generalization and categorization, which are associated with the related concepts of entity types described as superclasses or subclasses and the process of attribute inheritance. See Section 5.4

Exercises

The University Accommodation Office Case Study

The Director of the University Accommodation Office requires you to design a database to assist with the administration of the office. The requirements collection and analysis phase of the database design process based on the Director's view has provided the following requirements specification for the Accommodation Office database.

(1) The data stored on each full-time student includes the matriculation number, name (first and last name), home address (street, city/town, postcode), date of birth, sex, category of student (for example, first year undergraduate (1UG), postgraduate (PG)), nationality, smoker (yes or no), special needs, any additional comments, current status (placed/waiting), and what course the student is studying on. The student information stored relates to those currently renting a room and those on the waiting list. Students may rent a room in a university owned hall of residence or student flat.

(2) When a student joins the University he or she is assigned to a member of staff who acts as his or her Advisor of Studies. The Advisor of Studies is responsible for monitoring the student's welfare and academic progress. The data held on a student's Advisor includes their full name, position, name of department, internal telephone number, and room number.

(3) Each hall of residence has a name, address, telephone number, and a hall manager who supervises the operation of the hall. The halls provide only single rooms, which have a room number, place number, and monthly rent rate.
The place number uniquely identifies each room in all the halls controlled by the Accommodation Office and is used when renting a room to a student.

(4) The Accommodation Office also offers student flats. These flats are fully furnished and provide single room accommodation for groups of 3, 4, or 5 students. The information held on student flats includes a flat number, address, and the number of single bedrooms available in each flat. The flat number uniquely identifies each flat. Each bedroom in a flat has a monthly rent rate, a room number, and a place number. The place number uniquely identifies each room available in all student flats and is used when renting a room to a student.

(5) A student may rent a room in a hall or student flat for various periods of time. New lease agreements are negotiated at the start of each academic year with a minimum rental period of one semester (15 weeks) and a maximum rental period of one year, which includes Semesters 1, 2, and the Summer Semester. Each individual lease agreement between a student and the Accommodation Office is uniquely identified using a lease number. The data stored on each lease includes the lease number, duration of the lease (given as semesters), name, and matriculation number of the student, place number, room number, address details of the hall or student flat, the date the student wishes to enter the room, and the date the student wishes to leave the room (if known).

(6) Student flats are inspected by staff on a regular basis to ensure that the accommodation is well maintained. The information recorded for each inspection is the name of the member of staff who carried out the inspection, the date of inspection, an indication of whether the property was found to be in a satisfactory condition (yes or no), and any additional comments.
Some information is also held on members of staff on the Accommodation Office and includes the staff number, name (first and last name), home address (street, city/town, postcode), date of birth, sex, position (for example, Hall Manager, Administrative Assistant, Cleaner), and location (for example, Accommodation Office or Hall). The Accommodation Office also stores a limited amount of information on the courses run by the University including the course number, course title (including year), course leader’s name, internal telephone number, and room number, and department name. Each student is associated with a single course. Whenever possible, information on a student’s next-of-kin is stored which includes the name, relationship, address (street, city/town, postcode), and contact telephone number. 5.6 Create an Enhanced Entity–Relationship (EER) model to represent the data requirements of the University Accommodation Office case study. Develop the model using the following the steps: (a) Identify entity types. (b) Identify relationship types and determine the cardinality and participation constraints of the relationships. (c) Identify attributes and associate attributes with entity or relationship types. (d) Determine candidate and primary key attributes. (e) Specialize / generalize entity types (where appropriate). (f) Categorize entity types (where appropriate). (g) Draw the EER diagram. State any assumptions you made when creating the EER model. An example of a possible ER model for the University Accommodation Office case study is shown below. 
[Figure: ER diagram for the University Accommodation Office case study (not reproduced).]

The student should identify any ambiguities in the specification for the University Accommodation Office and state clearly his or her assumptions. For example, we may assume that we only hold information on a single next-of-kin per student and that some students do not provide this information.

Chapter 6 Normalization

Review Questions

6.1 Describe the purpose of normalizing data.

When we design a database for a relational system, the main objective in developing a logical data model is to create an accurate representation of the data, its relationships and constraints. To achieve this objective, we must identify a suitable set of relations. A technique that we can use to help identify such relations is called normalization. Normalization is a technique for producing a set of relations with desirable properties, given the data requirements of an enterprise. Normalization supports database designers by presenting a series of tests, which can be applied to individual relations so that a relational schema can be normalized to a specific form to prevent the possible occurrence of update anomalies. See also Sections 6.1 and 6.4.

6.2 Describe the problems that are associated with redundant data.

A major aim of relational database design is to group attributes into relations so as to minimize information redundancy and thereby reduce the file storage space required by the base relations. Another serious difficulty using relations that have redundant information is the problem of update anomalies. These can be classified as insertion, deletion, or modification anomalies.
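The anomalies named in 6.2 can be made concrete with a small sketch. It assumes a hypothetical relation (not taken from the textbook) in which the branch address is repeated in every staff row; the data values and helper names are invented for illustration only.

```python
# Hypothetical relation mixing staff and branch facts. The branch address is
# repeated in every row for staff at that branch, so it is redundant data.
staff_branch = [
    {"staff_no": "S1", "name": "Ann", "branch_no": "B1", "branch_addr": "1 High St"},
    {"staff_no": "S2", "name": "Bob", "branch_no": "B1", "branch_addr": "1 High St"},
    {"staff_no": "S3", "name": "Cy", "branch_no": "B2", "branch_addr": "9 Low Rd"},
]


def branch_addresses(rows):
    """Map each branch number to the set of addresses recorded for it."""
    seen = {}
    for row in rows:
        seen.setdefault(row["branch_no"], set()).add(row["branch_addr"])
    return seen


def inconsistent_branches(rows):
    """Return the branches that are recorded with more than one address."""
    return sorted(b for b, addrs in branch_addresses(rows).items() if len(addrs) > 1)


# Modification anomaly: only one of the two copies of B1's address is updated.
staff_branch[0]["branch_addr"] = "5 New Ave"
after_update = inconsistent_branches(staff_branch)  # ['B1']

# Deletion anomaly: deleting the last member of staff at B2 also loses B2's address.
remaining = [r for r in staff_branch if r["staff_no"] != "S3"]
b2_still_known = "B2" in branch_addresses(remaining)  # False
```

An insertion anomaly would arise in the same hypothetical relation when trying to record a new branch that has no staff yet, because there is no staff number to store with it.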
See Section 6.2.

6.3 Describe the concept of functional dependency.

Functional dependency describes the relationship between attributes in a relation. For example, if A and B are attributes of relation R, B is functionally dependent on A (denoted A → B), if each value of A in R is associated with exactly one value of B in R. Functional dependency is a property of the meaning or semantics of the attributes in a relation. The semantics indicate how the attributes relate to one another and specify the functional dependencies between attributes. When a functional dependency is present, the dependency is specified as a constraint between the attributes. See also Section 6.3.

6.4 How is the concept of functional dependency associated with the process of normalization?

Normalization is a formal technique for analyzing relations based on their primary key (or candidate keys in the case of BCNF) and functional dependencies. Normalization is often performed as a series of tests on a relation to determine whether it satisfies or violates the requirements of a given normal form. Three normal forms were initially proposed, which are called first (1NF), second (2NF) and third (3NF) normal form. Subsequently, a stronger definition of third normal form was introduced and is referred to as Boyce-Codd normal form (BCNF). All of these normal forms are based on the functional dependencies among the attributes of a relation.

6.5 Provide a definition for first, second, third and Boyce-Codd normal forms.

First Normal Form (1NF) is a relation in which the intersection of each row and column contains one and only one value.

Second Normal Form (2NF) is a relation that is in first normal form and every non-primary-key attribute is fully functionally dependent on the primary key.

Third Normal Form (3NF) is a relation that is in first and second normal form in which no non-primary-key attribute is transitively dependent on the primary key.
Boyce-Codd Normal Form (BCNF) is a relation in which every determinant is a candidate key.

See also Sections 6.5 to 6.8.

6.6 Describe the purpose of fourth (4NF) and fifth (5NF) normal forms.

For 4NF see Section 6.11 and for 5NF see Section 6.12.

Exercises

The table shown in Figure 6.25 lists dentist/patient appointment data. A patient is given an appointment at a specific time and date with a dentist located at a particular surgery. On each day of patient appointments, a dentist is allocated to a specific surgery for that day.

StaffNo  DentistName    PatNo  PatName        Appointment Date  Time   SurgeryNo
S1011    Tony Smith     P100   Gillian White  12/9/95           10.00  S15
S1011    Tony Smith     P105   Jill Bell      12/9/95           12.00  S15
S1024    Helen Pearson  P108   Ian MacKay     12/9/95           10.00  S10
S1024    Helen Pearson  P108   Ian MacKay     14/9/95           14.00  S10
S1032    Robin Plevin   P105   Jill Bell      14/9/95           16.30  S15
S1032    Robin Plevin   P110   John Walker    15/9/95           18.00  S13

Figure 6.25: Dentist/patient appointment data.

6.7 The table shown in Figure 6.25 is susceptible to update anomalies. Provide examples of insertion, deletion, and update anomalies.

The student should provide examples of insertion, deletion and update anomalies using the data shown in the table. An example of a deletion anomaly is that if we delete the details of the dentist called 'Helen Pearson', we also lose the appointment details of the patient called 'Ian MacKay'.

6.7 Describe and illustrate the process of normalizing the table shown in Figure 6.25 to Boyce-Codd normal form. State any assumptions you make about the data shown in this table.

The student should state any assumptions made about the data shown in the table. For example, we may assume that a patient is registered at only one surgery. Also, a patient may have more than one appointment on a given day.
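Candidate functional dependencies for this exercise can be tested against the sample rows of Figure 6.25. The following is a minimal sketch: the function name `holds` and the short column names are assumptions made for illustration, and the dependencies tried are candidates suggested by the data, not a list given in the source. Sample data can only refute a dependency; whether one really holds is a matter of the semantics of the attributes.

```python
# Rows of Figure 6.25, one tuple per appointment.
COLUMNS = ("StaffNo", "DentistName", "PatNo", "PatName", "ADate", "ATime", "SurgeryNo")
ROWS = [
    ("S1011", "Tony Smith", "P100", "Gillian White", "12/9/95", "10.00", "S15"),
    ("S1011", "Tony Smith", "P105", "Jill Bell", "12/9/95", "12.00", "S15"),
    ("S1024", "Helen Pearson", "P108", "Ian MacKay", "12/9/95", "10.00", "S10"),
    ("S1024", "Helen Pearson", "P108", "Ian MacKay", "14/9/95", "14.00", "S10"),
    ("S1032", "Robin Plevin", "P105", "Jill Bell", "14/9/95", "16.30", "S15"),
    ("S1032", "Robin Plevin", "P110", "John Walker", "15/9/95", "18.00", "S13"),
]


def holds(lhs, rhs, rows=ROWS, columns=COLUMNS):
    """Return True if no two rows agree on lhs but differ on rhs."""
    lhs_idx = [columns.index(c) for c in lhs]
    rhs_idx = [columns.index(c) for c in rhs]
    seen = {}
    for row in rows:
        key = tuple(row[i] for i in lhs_idx)
        value = tuple(row[i] for i in rhs_idx)
        # Remember the first rhs value seen for this lhs value; any later
        # row with the same lhs but a different rhs refutes the dependency.
        if seen.setdefault(key, value) != value:
            return False
    return True


dentist_name = holds(("StaffNo",), ("DentistName",))     # True
patient_name = holds(("PatNo",), ("PatName",))           # True
surgery_by_day = holds(("StaffNo", "ADate"), ("SurgeryNo",))  # True
surgery_by_staff = holds(("StaffNo",), ("SurgeryNo",))   # False: S1032 uses S15 and S13
```

The last check fails because dentist S1032 works in two different surgeries on different days, which matches the statement that a dentist is allocated to a surgery for a given day.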
[Figure: normalization of the table in Figure 6.25. The dependency arrows labeled fd1 to fd5 are not reproduced; the relations at each stage are listed below.]

1NF: (StaffNo, ADate, ATime, DName, PatNo, PName, SurgeryNo), with primary key (StaffNo, ADate, ATime). fd2 and fd4 (partial dependencies on the primary key) violate 2NF.

2NF: (StaffNo, ADate, ATime, PatNo, PName); (StaffNo, ADate, SurgeryNo); (StaffNo, DName). fd3' (a transitive dependency) violates 3NF.

3NF / BCNF:
(StaffNo, ADate, ATime, PatNo)  primary key (StaffNo, ADate, ATime); foreign keys StaffNo, ADate and PatNo
(StaffNo, DName)  primary key StaffNo
(StaffNo, ADate, SurgeryNo)  primary key (StaffNo, ADate); foreign key StaffNo
(PatNo, PName)  primary key PatNo

An agency called Instant Cover supplies part-time/temporary staff to hotels within Strathclyde Region. The table shown in Figure 6.26 lists the time spent by agency staff working at various hotels. The National Insurance Number (NIN) is unique for every member of staff.

NIN   ContractNo  Hours  EName     H_No  H_Loc
1135  C1024       16     Smith J   H25   East Kilbride
1057  C1024       24     Hocine D  H25   East Kilbride
1068  C1025       28     White T   H4    Glasgow
1135  C1025       15     Smith J   H4    Glasgow

Figure 6.26: Instant Cover’s contracts.

6.8 The table shown in Figure 6.26 is susceptible to update anomalies. Provide examples of insertion, deletion, and update anomalies.

The student should provide examples of insertion, deletion and update anomalies using the data shown in the table. An example of an update anomaly is that if we wish to change the name of the employee called 'Smith J', we may change only the entry in the first row and not the last, with the result that the database becomes inconsistent.

6.9 Describe and illustrate the process of normalizing the table shown in Figure 6.26 to Boyce-Codd Normal Form. State any assumptions you make about the data shown in this table.

The student should state any assumptions made about the data shown in the table. For example, we may assume that a hotel may be associated with one or more contracts.
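One way to sanity-check a decomposition of the Figure 6.26 table is to project the sample rows onto the smaller relations and confirm that a natural join reconstructs the original rows. The sketch below assumes the four-relation decomposition (NIN, ContractNo, Hours), (ContractNo, HNo), (NIN, EName), (HNo, HLoc); the helper names and the column names without underscores are illustrative choices, not part of the source.

```python
COLUMNS = ("NIN", "ContractNo", "Hours", "EName", "HNo", "HLoc")
ROWS = [
    ("1135", "C1024", 16, "Smith J", "H25", "East Kilbride"),
    ("1057", "C1024", 24, "Hocine D", "H25", "East Kilbride"),
    ("1068", "C1025", 28, "White T", "H4", "Glasgow"),
    ("1135", "C1025", 15, "Smith J", "H4", "Glasgow"),
]
table = [dict(zip(COLUMNS, row)) for row in ROWS]


def project(rows, attrs):
    """Project dict rows onto attrs, removing duplicate tuples."""
    distinct = {tuple((a, r[a]) for a in attrs) for r in rows}
    return [dict(pairs) for pairs in distinct]


def natural_join(left, right):
    """Join two lists of dict rows on the attribute names they share."""
    joined = []
    for left_row in left:
        for right_row in right:
            shared = left_row.keys() & right_row.keys()
            if all(left_row[k] == right_row[k] for k in shared):
                joined.append({**left_row, **right_row})
    return joined


def as_set(rows):
    """Represent dict rows as a set of tuples in COLUMNS order."""
    return {tuple(r[c] for c in COLUMNS) for r in rows}


# Assumed decomposition into four relations.
works_on = project(table, ("NIN", "ContractNo", "Hours"))
contract = project(table, ("ContractNo", "HNo"))
staff = project(table, ("NIN", "EName"))
hotel = project(table, ("HNo", "HLoc"))

rejoined = natural_join(natural_join(natural_join(works_on, contract), staff), hotel)
lossless_on_sample = as_set(rejoined) == as_set(table)  # True
```

In the projected relations each employee name and each hotel location is stored once (three staff rows, two hotel rows), so the update anomaly described for 'Smith J' can no longer occur. A successful rejoin on sample data is only a check, not a proof that the decomposition is lossless in general.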
[Figure: normalization of the table in Figure 6.26. The dependency arrows labeled fd1 to fd4 are not reproduced; the relations at each stage are listed below.]

1NF: (NIN, Contract_No, Hours, EName, H_No, H_Loc), with primary key (NIN, Contract_No). fd2 and fd3 (partial dependencies on the primary key) violate 2NF.

2NF: (NIN, Contract_No, Hours); (Contract_No, H_No, H_Loc); (NIN, EName). fd4 (a transitive dependency) violates 3NF.

3NF / BCNF:
(NIN, Contract_No, Hours)  primary key (NIN, Contract_No); foreign keys NIN and Contract_No
(Contract_No, H_No)  primary key Contract_No; foreign key H_No
(NIN, EName)  primary key NIN
(H_No, H_Loc)  primary key H_No

Chapter 7 Methodology - Conceptual Database Design

Review Questions

7.1 Describe the purpose of a design methodology.

A design methodology is a structured approach that uses procedures, techniques, tools, and documentation aids to support and facilitate the process of design. A design methodology consists of phases that contain steps, which guide the designer in the choice of techniques that are appropriate at each stage of the project and also help to plan, manage, control and evaluate database development projects. Furthermore, it is a structured approach for analyzing and modeling a set of requirements for a database in a standardized and organized manner. See Section 7.1.

7.2 Describe the main phases involved in database design.

See Section 7.1.2.

7.3 Identify important factors in the success of logical database design.

See Section 7.1.3.

7.4 Discuss the important role played by users in the process of database design.

The users' involvement throughout the database design phase is critical to providing the 'correct' system. In particular, the user should clarify any ambiguities in the specification that describes the required system and also continually review the development of the database design. The process of developing the database design is repeated until the user is prepared to 'sign off' the design as being a 'true' representation of the part of the enterprise that is being modeled. See Sections 7.1.3 and 7.2.

7.5 Describe the main objective of conceptual database design.

See Section 7.3 Step 1.

7.6 Describe what a user view represents and the approaches that may be used to identify user views.
User views can be identified using various methods. One is to examine the data flow diagrams that should have been produced previously, to identify functional areas and possibly individual functions. Alternatively, user views can be identified by interviewing users, examining procedures, reports, and forms, and/or observing the enterprise in operation. See also Section 7.3.

7.7 Identify the main tasks associated with conceptual database design.

See Section 7.3.

7.8 Discuss the purpose of specialization/generalization of entity types, and discuss why this is an optional step in conceptual database design.

See Section 7.3 Step 1.6.

7.9 Identify and describe the purpose of the documentation generated during conceptual database design.

See Section 7.3 and in particular note the documentation generated at the end of each step.

Chapter 8 Methodology - Logical Database Design for Relational Model

Review Questions

8.1 Identify the three main phases of database design and discuss the purpose of logical database design.

The three main phases of database design are conceptual, logical, and physical. For the purpose of logical database design see Section 8.1.

8.2 Describe the steps involved in refining a conceptual data model into a logical data model.

See Section 8.1 Step 2.

8.3 Describe the rules for deriving relations that represent strong entity types, weak entity types, one-to-one binary relationship types, one-to-many relationship types, multi-valued attributes, and superclass/subclass relationships.

See Section 8.1 Step 2.2.

8.4 Discuss how the technique of normalization can be used to validate the logical data model and the relations derived from the model.

The logical data model can be validated using the technique of normalization and against the transactions that the model is required to support. Normalization is used to improve the model so that it satisfies various constraints that avoid unnecessary duplication of data.
Normalization ensures that the resultant model is a closer model of the enterprise that it serves, is consistent, and has minimal redundancy and maximum stability. See also Section 8.1 Step 2.3.

8.5 Discuss two approaches that can be used to validate that the logical data model is capable of supporting the transactions required by the user’s view.

Two possible approaches to ensure that the logical data model supports the required transactions are: checking that all the information (entities, relationships and their attributes) required by each transaction is provided by the model, by documenting a description of each transaction’s requirements; and diagrammatically representing the pathway taken by each transaction directly on the ER diagram. See also Section 8.1 Step 2.4.

8.6 Describe the purpose of integrity constraints and identify the five main types of constraints.

There are five main types of integrity constraints: required data, attribute domain constraints, entity integrity, referential integrity, and enterprise constraints. See also Section 8.1 Step 2.6.

8.7 Describe the alternative strategies that can be applied if there exists a child occurrence referencing a parent occurrence that we wish to delete.

There are several strategies to consider when there exists a child occurrence referencing the parent occurrence that we are attempting to delete: NO ACTION, CASCADE, SET NULL, SET DEFAULT, and NO CHECK. See also Section 8.1 Step 2.6.

8.8 Identify the tasks typically associated with merging local logical data models into a global logical model.

See Section 8.1 Step 3 and in particular Step 3.1.

Chapter 9 Methodology - Physical Database Design for Relational DBMSs

Review Questions

9.1 Explain the difference between logical and physical database design. Why might these tasks be carried out by different people?
Logical database design is concerned with building the data model for the organization that is completely independent of any DBMS. Physical database design, however, is concerned with actually defining the data model using the DDL of a particular DBMS. Consequently, logical database design focuses on what the data model represents, whereas physical database design focuses on how the data model is to be implemented. Different skills are required to undertake these design phases, which are often found in different people. See Section 9.1.

9.2 Describe the inputs and outputs of physical database design.

The inputs are the global logical data model and the data dictionary. The outputs are the base relations, integrity rules, specified file organizations, chosen secondary indexes, user views and access rules. See Section 9.3.

9.3 Describe the purpose of the main steps in the physical design methodology presented in this chapter.

Step 4  Produces a relational database schema from the global logical data model. This includes integrity rules.
Step 5  Determines the file organizations for the base relations. This takes account of the nature of the transactions to be carried out, which also determine where secondary indexes will be of use. As a result of analyzing the transactions, the design may be altered by incorporating controlled redundancy into it.
Step 6  Designs the security measures for the database implementation. This includes designing the user views and the access rules on the relations and views.
Step 7  Monitors the database application systems and improves performance by making amendments to the design as appropriate.

See also Section 9.3.

9.4 "One of the main objectives of physical database design is to store data in an efficient way." How might we measure efficiency in this context?
This can be measured by the number of transactions that can be processed by the system in a given time frame, or by the length of time it takes to complete one transaction, or by the amount of disk storage taken up by the database files. See also Section 9.3 Step 5.

9.5 Under what circumstances would we want to denormalize a logical data model? Use examples to illustrate your answer.

Generally, if overall performance needs to be improved, controlled redundancy can be introduced. See also Section 9.3 Step 5.4.

Examples:

If queries on staff always required the branch address, this attribute could be posted into staff. The effect on updating would be minimal if branch data was relatively static, and it removes a join from the query.

If the number of staff for each branch was often required with branch details, a derived attribute could be placed in branch. This would remove the need to access and repeatedly count the relevant records in staff. When staff joined or left a branch, this attribute would require updating.

Chapter 10 Conceptual Database Design Methodology - Worked Example

Exercises

The Wellmeadows Hospital case study

10.1 Identify user views for the Medical Director and Charge Nurse in the Wellmeadows Hospital case study, described in Appendix A.

See Appendix A.

10.2 List the users' requirements specification for each of these views.

The student should examine the case study in detail and identify and compile a user specification for the Medical Director’s and the Charge Nurse’s views. As an additional task the student may also compile a specification for the Personnel Officer’s view. Completing this exercise will require the student to make some assumptions about the precise requirements of each user view. Any assumption should be documented along with each view.
Examples of specifications for the Medical Director’s and Charge Nurse’s views are shown below.

Medical Director

The Director is responsible for the overall management of the hospital and must maintain control over the use of resources (including staff, beds, and supplies) in the provision of cost-effective treatment for all patients.

1. The hospital is composed of many wards. Each ward is managed by a Charge Nurse. The information to be held on each ward includes the ward name, number (e.g. W1), phone number, location (e.g. Block E), number of beds and the name of the Charge Nurse. Each ward is allocated staff (e.g. Charge Nurse, senior and junior nurses, doctors, consultants, auxiliaries).

2. The hospital maintains a central stock of surgical (e.g. syringes, bandages) and non-surgical (e.g. plastic bags, aprons) supplies. The details of surgical and non-surgical supplies include item number and name, item description, quantity in stock, re-order level, and cost per unit. The supplies used by each ward are monitored.

3. The hospital also maintains a stock of pharmaceutical supplies (e.g. antibiotics, pain killers). The details of pharmaceutical supplies include drug number and name, description, dosage, quantity in stock, re-order level, and cost per unit. The pharmaceutical supplies used by each ward are monitored.

4. The details of the suppliers of the surgical, non-surgical and pharmaceutical items are stored. The information stored includes the supplier name and number, address, phone and fax number.

5. Patients are normally referred to the hospital for treatment by their local doctor. The details of local doctors are stored, including their name, clinic number, address and phone number.

6. The details of patients referred to the hospital include the patient number, name (first and last name), address, phone number, date of birth, marital status, and next-of-kin details (name, relationship, address and phone number).

7. When a patient is referred by their doctor to attend the hospital, the patient is given an appointment and is examined by a consultant. The details of the appointment are stored, including the consultant’s name and number, appointment number, date, time and examination room (e.g. Room E112). As a result of the examination, the patient is either recommended to attend the outpatient clinic or placed on a waiting list until a bed can be found in a particular ward.

8. The details of outpatients are stored. The information stored includes the patient details as stated earlier (see 6) and the date and time of the appointment at the outpatient clinic.

9. The details of patients currently placed in a ward and those on the waiting list for a place on a ward are stored. The information stored includes the patient details as stated earlier (see 6) and the date placed on waiting list, ward required, expected duration of stay, date placed in the ward and date left the ward.

Charge Nurse

The Charge Nurse has overall responsibility for the management of a single ward. The Charge Nurse is allocated a budget to run the ward and must ensure that all resources (staff, beds and supplies) are used effectively in the care of patients. The Charge Nurse and other senior medical staff are responsible for the allocation of beds to patients on the waiting list.

1. The information to be held on each ward includes the details of staff allocated to each ward, including the staff number, name, address, phone number, position, number of hours worked per week and shift (e.g. early, late).

2. The information stored on each patient on the waiting list includes the patient number, name (first and last name), address, phone number, date of birth, marital status, next-of-kin details, date placed on waiting list, required ward, date placed in ward, expected duration of stay, and date left ward. When a patient enters the ward they are allocated a bed with a unique bed number.

3. Each patient is prescribed medication and the details of this medication include the patient number, drug number and name, units per day, start and finish date. The medication (pharmaceutical supplies) given to each patient is monitored.

4. Staff are allocated to work in wards, as required. The Charge Nurse of each ward is responsible for creating a staff rota which ensures that the correct complement of staff is on duty for each shift (early, late, night). Nursing staff may be specifically allocated to patients who require specialist care.

5. When required the Charge Nurse may obtain surgical, non-surgical and pharmaceutical supplies from the central stock of supplies held by the hospital. The information to be stored includes the requisition number, staff name and number, ward number, item number (or drug number), quantity required, date ordered and date received.

10.3 Create local conceptual data models for each of the user views. State any assumptions necessary to support your design.

The student should create local conceptual data models using the requirements specification for the Medical Director and the Charge Nurse. This should include an ER model representing each user view and the supporting documentation that describes the models. Throughout the process of design, the student should clearly state any assumptions necessary to support his or her design.

Chapter 11 Logical Database Design Methodology - Worked Example

Exercises

The Wellmeadows Hospital case study

11.1 Create and validate the local logical data models for each of the user views of the Wellmeadows Hospital case study identified in Exercise 10.1.

The student should refine the local conceptual models to create local logical data models based on the Medical Director and Charge Nurse views. The logical models representing each user view should also be validated.
The student should produce an ER model for each user view and the supporting documentation that describes each model. Throughout the process of design, the student should clearly state any assumptions necessary to support his or her design.

11.2 Merge the local data models to create a global logical data model of the Wellmeadows Hospital case study. State any assumptions necessary to support your design.

Once the local data models have been validated, the student should demonstrate the view integration approach to create a global logical data model. The student should produce an ER model of the global data model, representing both user views, and the supporting documentation that describes the model. Throughout the process of design, the student should clearly state any assumptions necessary to support his or her design.

11.3 Create or update the supporting documentation for the global logical data model of the Wellmeadows Hospital case study.

An example of an ER model and the relational schema of the global data model of the Wellmeadows Hospital case study is shown below. Note that this answer also includes the Personnel Officer’s view.
[Figure: ER diagram of the global logical data model for the Wellmeadows Hospital case study (not reproduced).]

Ward (WardNo, WName, Location, TotalBeds, TelExtn, CNStaffNo)
    Primary Key WardNo
    Alternate Key TelExtn
    Foreign Key CNStaffNo is NOT NULL references Staff(StaffNo) on delete SET DEFAULT on update CASCADE

Staff (StaffNo, FName, LName, Address, TelNo, DOB (Date_of_Birth), Sex, NIN (National Insurance Number), Position, Salary, SScale, WeekHrs, ContType, TypePay)
    Primary Key StaffNo
    Alternate Key NIN

Qualification (QDate, QType, Institution, StaffNo)
    Primary Key QType, StaffNo
    Foreign Key StaffNo is NOT NULL references Staff(StaffNo) on delete CASCADE on update CASCADE

Work_Experience (SDate, FDate, Position, OrgName, StaffNo)
    Primary Key OrgName, StaffNo
    Foreign Key StaffNo is NOT NULL references Staff(StaffNo) on delete CASCADE on update CASCADE

Staff_Rota (Shift, WeekNo, StaffNo, WardNo)
    Primary Key StaffNo, WeekNo
    Foreign Key StaffNo is NOT NULL references Staff(StaffNo) on delete CASCADE on update CASCADE
    Foreign Key WardNo is NOT NULL references Ward(WardNo) on delete CASCADE on update CASCADE

Patient (PatNo, FName, LName, Address, TelNo, DOB, Sex, MStatus, DateReg, DocName, ClinicNo, NName, NRelationship, NAddress, NTelNo)
    Primary Key PatNo
    Foreign Key DocName, ClinicNo is NOT NULL references Doctor(DocName, ClinicNo) on delete NO ACTION on update CASCADE

Doctor (DocName, ClinicNo, Address, TelNo)
    Primary Key DocName, ClinicNo

Appointment (AppNo, PatNo, ConsStaffNo, ADate, ATime, RoomNo)
    Primary Key AppNo
    Foreign Key PatNo is NOT NULL references Patient(PatNo) on delete NO ACTION on update CASCADE
    Foreign Key ConsStaffNo is NOT NULL references Staff(StaffNo) on delete NO ACTION on update CASCADE

OutPatient_Appointment (OutPatDate, OutPatTime, PatNo)
    Primary Key OutPatDate, PatNo
    Foreign Key PatNo is NOT NULL references Patient(PatNo) on delete CASCADE on update CASCADE

InPatient_Allocation (ListDate, WardReq, Duration, PlacedDate, ExLeaveDate, ActLeaveDate, PatNo, BedNo)
    Primary Key PatNo, ListDate
    Foreign Key PatNo is NOT NULL references Patient(PatNo) on delete CASCADE on update CASCADE
    Foreign Key WardReq is NOT NULL references Ward(WardNo) on delete NO ACTION on update CASCADE

Medication (PatNo, DrugNo, UnitsDay, AMethod, SDate, FDate)
    Primary Key PatNo, DrugNo, SDate
    Foreign Key PatNo is NOT NULL references Patient(PatNo) on delete CASCADE on update CASCADE
    Foreign Key DrugNo is NOT NULL references Pharmaceutical(DrugNo) on delete NO ACTION on update CASCADE

Pharmaceutical (DrugNo, DName, Description, Dosage, MAdmin, QStock, RLevel, UnitCost, SupplierNo)
    Primary Key DrugNo
    Foreign Key SupplierNo is NOT NULL references Supplier(SupplierNo) on delete NO ACTION on update CASCADE

Non-Surgical/Surgical (ItemNo, IName, IDescription, QStock, RLevel, UnitCost, SupplierNo)
    Primary Key ItemNo
    Foreign Key SupplierNo is NOT NULL references Supplier(SupplierNo) on delete NO ACTION on update CASCADE

Requisition (ReqNo, CNStaffNo, WardNo, ItemDrugNo, QuantReq, DateOrder, DateReceive)
    Primary Key ReqNo
    Foreign Key CNStaffNo is NOT NULL references Staff(StaffNo) on delete NO ACTION on update CASCADE
    Foreign Key WardNo is NOT NULL references Ward(WardNo) on delete NO ACTION on update CASCADE
    Foreign Key ItemDrugNo is NOT NULL references Non-Surgical/Surgical(ItemNo) and Pharmaceutical(DrugNo) on delete NO ACTION on update CASCADE

Supplier (SupplierNo, SName, SAddress, TelNo, FaxNo)
    Primary Key SupplierNo
    Alternate Key TelNo
    Alternate Key FaxNo

Chapter 12 Physical Database Design Methodology – Worked Example
Exercises

12.1 Create a physical database design for the logical design of the DreamHome case study (described in Chapter 11) based on the DBMS that you have access to.

The assumption made here is that any DBMS that is being used is a relational DBMS. It is important to know the facilities that are provided by the DBMS, and understand how to make use of them for physical database design. Assumptions may be made, for example, on the performance of transactions. The student should produce the required documentation for the target DBMS.

12.2 Implement the DreamHome database using the physical design created in Section 12.1.

The student should implement the physical database design created in Exercise 12.1 using the target DBMS.

12.3 Investigate whether your DBMS can accommodate the two new requirements for the DreamHome case study given in Step 7 of this chapter.

Again the student will have to investigate the functionality of the target DBMS to assess whether the new requirements can be met. If so, the student should implement the new requirements and also suggest further enhancements.

12.4 Create a physical database design for the Wellmeadows Hospital case study (described in Appendix A) based on the DBMS that you have access to.

The student should create a physical database design for the Wellmeadows Hospital case study based on the logical design created in Exercise 11.3.

12.5 Implement the Wellmeadows Hospital database using the physical design created in 12.4.

The student should implement the physical database design created in Exercise 12.4 using the target DBMS.

Part Three Database Languages

Chapter 13 SQL

Review Questions

13.1 What are the two major components of SQL and what function do they serve?

A data definition language (DDL) for defining the database structure.
A data manipulation language (DML) for retrieving and updating data.

13.2 What are the advantages and disadvantages of SQL?
Advantages
• Satisfies ideals for database language
• (Relatively) Easy to learn
• Portability
• SQL standard exists
• Both interactive and embedded access
• Can be used by specialist and non-specialist.

Disadvantages
• Impedance mismatch - mixing programming paradigms with embedded access
• Lack of orthogonality - many different ways to express some queries
• Language is becoming enormous (SQL-92 is 6 times larger than predecessor)
• Handling of nulls in aggregate functions
• Result tables are not strictly relational - can contain duplicate tuples, imposes an ordering on both columns and rows.

13.3 Explain the function of each of the clauses in the SELECT statement. What restrictions are imposed on these clauses?

FROM      Specifies the table or tables to be used.
WHERE     Filters the rows subject to some condition.
GROUP BY  Forms groups of rows with the same column value.
HAVING    Filters the groups subject to some condition.
SELECT    Specifies which columns are to appear in the output.
ORDER BY  Specifies the order of the output.

If the SELECT list includes an aggregate function and no GROUP BY clause is being used to group data together, then no item in the SELECT list can include any reference to a column unless that column is the argument to an aggregate function.

When GROUP BY is used, each item in the SELECT list must be single-valued per group. Further, the SELECT clause may only contain:
• Column names.
• Aggregate functions.
• Constants.
• An expression involving combinations of the above.

All column names in the SELECT list must appear in the GROUP BY clause unless the name is used only in an aggregate function.

13.4 What restrictions apply to the use of the aggregate functions within the SELECT statement? How do nulls affect the aggregate functions?

An aggregate function can be used only in the SELECT list and in the HAVING clause.
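The roles of the clauses, and the rule that aggregates belong in the SELECT list and HAVING clause, can be tried out with a short sketch. It uses Python's built-in `sqlite3` module with an in-memory database; the `room` table is modeled on the Room relation used in this chapter's exercises, but the rows are invented for illustration and the behavior shown is that of SQLite, which may differ in detail from other DBMSs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE room (room_no INTEGER, hotel_no TEXT, type TEXT, price REAL)")
conn.executemany(
    "INSERT INTO room VALUES (?, ?, ?, ?)",
    [
        (1, "H1", "S", 30.0),
        (2, "H1", "D", 50.0),
        (3, "H1", "D", 70.0),
        (4, "H2", "D", 40.0),
        (5, "H2", "F", 90.0),
        (6, "H3", "S", 25.0),
        (7, "H3", "D", 45.0),
    ],
)

# WHERE filters rows, GROUP BY forms groups, HAVING filters groups,
# SELECT picks the output columns, ORDER BY sorts the result.
query = """
    SELECT hotel_no, COUNT(*) AS rooms, AVG(price) AS avg_price
    FROM room
    WHERE type <> 'S'
    GROUP BY hotel_no
    HAVING COUNT(*) > 1
    ORDER BY hotel_no
"""
result = conn.execute(query).fetchall()
# result == [('H1', 2, 60.0), ('H2', 2, 65.0)]; H3 has one non-single room and is dropped

# An aggregate function in the WHERE clause is rejected.
try:
    conn.execute("SELECT hotel_no FROM room WHERE COUNT(*) > 1").fetchall()
    rejected = False
except sqlite3.Error:
    rejected = True
```

Every non-aggregate column in the SELECT list (`hotel_no`) also appears in the GROUP BY clause, in line with the restriction stated in 13.3.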
Apart from COUNT(*), each function eliminates nulls first and operates only on the remaining non-null values. COUNT(*) counts all the rows of a table, regardless of whether nulls or duplicate values occur.

13.5 Explain how the GROUP BY clause works. What is the difference between the WHERE and HAVING clauses?

SQL first applies the WHERE clause. Then it conceptually arranges the table based on the grouping column(s). Next, it applies the HAVING clause and finally orders the result according to the ORDER BY clause.

WHERE filters rows subject to some condition; HAVING filters groups subject to some condition.

13.6 What is the difference between a subquery and a join? Under what circumstances would you not be able to use a subquery?

With a subquery, the columns specified in the SELECT list are restricted to one table. Thus, a subquery cannot be used if the SELECT list contains columns from more than one table.

Exercises

The following tables form part of a database held in a relational DBMS:

Hotel     (Hotel_No, Name, Address)
Room      (Room_No, Hotel_No, Type, Price)
Booking   (Hotel_No, Guest_No, Date_From, Date_To, Room_No)
Guest     (Guest_No, Name, Address)

where
Hotel contains hotel details and Hotel_No is the primary key;
Room contains room details for each hotel and (Hotel_No, Room_No) forms the primary key;
Booking contains details of the bookings and the primary key comprises (Hotel_No, Guest_No, Date_From);
Guest contains guest details and Guest_No is the primary key.

Simple Queries

13.7 List full details of all hotels.

SELECT * FROM hotel;

13.8 List full details of all hotels in London.

SELECT * FROM hotel WHERE address LIKE '%London%';

Strictly speaking, this would also find rows with an address like: '10 London Avenue, New York'.

13.9 List the names and addresses of all guests in London, alphabetically ordered by name.
SELECT name, address FROM guest
WHERE address LIKE '%London%'
ORDER BY name;

13.10 List all double or family rooms with a price below £40.00 per night, in ascending order of price.

SELECT * FROM room
WHERE price < 40 AND type IN ('D', 'F')
ORDER BY price;

(Note, ASC is the default setting).

13.11 List the bookings for which no date_to has been specified.

SELECT * FROM booking WHERE date_to IS NULL;

Aggregate Functions

13.12 How many hotels are there?

SELECT COUNT(*) FROM hotel;

13.13 What is the average price of a room?

SELECT AVG(price) FROM room;

13.14 What is the total revenue per night from all double rooms?

SELECT SUM(price) FROM room WHERE type = 'D';

13.15 How many different guests have made bookings for August?

SELECT COUNT(DISTINCT guest_no) FROM booking
WHERE (date_from <= DATE'1999-08-01' AND date_to >= DATE'1999-08-01') OR
      (date_from >= DATE'1999-08-01' AND date_from <= DATE'1999-08-31');

Subqueries and Joins

13.16 List the price and type of all rooms at the Grosvenor Hotel.

SELECT price, type FROM room
WHERE hotel_no =
      (SELECT hotel_no FROM hotel WHERE name = 'Grosvenor Hotel');

13.17 List all guests currently staying at the Grosvenor Hotel.

SELECT * FROM guest
WHERE guest_no IN
      (SELECT guest_no FROM booking
       WHERE date_from <= CURRENT_DATE AND date_to >= CURRENT_DATE AND
             hotel_no =
             (SELECT hotel_no FROM hotel WHERE name = 'Grosvenor Hotel'));

IN is used rather than = because more than one guest may be staying at the hotel, so the subquery can return several rows.

13.18 List the details of all rooms at the Grosvenor Hotel, including the name of the guest staying in the room, if the room is occupied.

SELECT r.*, XXX.name
FROM room r LEFT JOIN
     (SELECT g.name, h.hotel_no, b.room_no
      FROM Guest g, Booking b, Hotel h
      WHERE g.guest_no = b.guest_no AND b.hotel_no = h.hotel_no AND
            h.name = 'Grosvenor Hotel' AND
            b.date_from <= CURRENT_DATE AND b.date_to >= CURRENT_DATE) AS XXX
     ON r.hotel_no = XXX.hotel_no AND r.room_no = XXX.room_no
WHERE r.hotel_no =
      (SELECT hotel_no FROM hotel WHERE name = 'Grosvenor Hotel');

The outer WHERE clause restricts the result to rooms at the Grosvenor Hotel; without it the left join would return the rooms of every hotel.

13.19 What is the total income from bookings for the Grosvenor Hotel today?
SELECT SUM(price) FROM booking b, room r, hotel h
WHERE (b.date_from <= CURRENT_DATE AND b.date_to >= CURRENT_DATE) AND
      r.hotel_no = h.hotel_no AND r.hotel_no = b.hotel_no AND
      r.room_no = b.room_no AND h.name = 'Grosvenor Hotel';

13.20 List the rooms that are currently unoccupied at the Grosvenor Hotel.

SELECT * FROM room r
WHERE hotel_no =
      (SELECT hotel_no FROM hotel WHERE name = 'Grosvenor Hotel') AND
      room_no NOT IN
      (SELECT room_no FROM booking b, hotel h
       WHERE (date_from <= CURRENT_DATE AND date_to >= CURRENT_DATE) AND
             b.hotel_no = h.hotel_no AND name = 'Grosvenor Hotel');

13.21 What is the lost income from unoccupied rooms at the Grosvenor Hotel?

SELECT SUM(price) FROM room r
WHERE hotel_no =
      (SELECT hotel_no FROM hotel WHERE name = 'Grosvenor Hotel') AND
      room_no NOT IN
      (SELECT room_no FROM booking b, hotel h
       WHERE (date_from <= CURRENT_DATE AND date_to >= CURRENT_DATE) AND
             b.hotel_no = h.hotel_no AND name = 'Grosvenor Hotel');

Grouping

13.22 List the number of rooms in each hotel.

SELECT hotel_no, COUNT(room_no) AS count FROM room
GROUP BY hotel_no;

13.23 List the number of rooms in each hotel in London.

SELECT r.hotel_no, COUNT(room_no) AS count FROM room r, hotel h
WHERE r.hotel_no = h.hotel_no AND address LIKE '%London%'
GROUP BY r.hotel_no;

13.24 What is the average number of bookings for each hotel in August?

SELECT AVG(X)
FROM (SELECT hotel_no, COUNT(hotel_no) AS X
      FROM booking b
      WHERE (b.date_from <= DATE'1999-08-01' AND b.date_to >= DATE'1999-08-01') OR
            (b.date_from >= DATE'1999-08-01' AND b.date_from <= DATE'1999-08-31')
      GROUP BY hotel_no) AS t;

Yes - this is legal in SQL-92! (The derived table must be given a correlation name, here t.)

13.25 What is the most commonly booked room type for each hotel in London?

SELECT hotel_no, MAX(X)
FROM (SELECT h.hotel_no, r.type, COUNT(r.type) AS X
      FROM booking b, hotel h, room r
      WHERE r.room_no = b.room_no AND r.hotel_no = b.hotel_no AND
            b.hotel_no = h.hotel_no AND h.address LIKE '%London%'
      GROUP BY h.hotel_no, r.type) AS t
GROUP BY hotel_no;

This returns, for each London hotel, the highest number of bookings recorded for any one room type. To display the room type itself, select the rows of the derived table whose count equals this maximum for the hotel.

13.26 What is the lost income from unoccupied rooms at each hotel today?
SELECT hotel_no, SUM(price) FROM room r
WHERE room_no NOT IN
      (SELECT room_no FROM booking b
       WHERE (date_from <= CURRENT_DATE AND date_to >= CURRENT_DATE) AND
             b.hotel_no = r.hotel_no)
GROUP BY hotel_no;

The subquery is correlated on hotel_no so that a room counts as occupied only if it is booked in its own hotel.

Creating and Populating Tables

13.27 Using the CREATE TABLE statement, create the Hotel, Room, Booking and Guest tables.

CREATE TABLE hotel(
    hotel_no    CHAR(4)        NOT NULL,
    name        VARCHAR(20)    NOT NULL,
    address     VARCHAR(50)    NOT NULL);

CREATE TABLE room(
    room_no     VARCHAR(4)     NOT NULL,
    hotel_no    CHAR(4)        NOT NULL,
    type        CHAR(1)        NOT NULL,
    price       DECIMAL(5,2)   NOT NULL);

CREATE TABLE booking(
    hotel_no    CHAR(4)        NOT NULL,
    guest_no    CHAR(4)        NOT NULL,
    date_from   DATETIME       NOT NULL,
    date_to     DATETIME       NULL,
    room_no     VARCHAR(4)     NOT NULL);

CREATE TABLE guest(
    guest_no    CHAR(4)        NOT NULL,
    name        VARCHAR(20)    NOT NULL,
    address     VARCHAR(50)    NOT NULL);

13.28 Insert records into each of these tables.

INSERT INTO hotel VALUES ('H111', 'Grosvenor Hotel', 'London');
INSERT INTO room VALUES ('1', 'H111', 'S', 72.00);
INSERT INTO guest VALUES ('G111', 'John Smith', 'London');
INSERT INTO booking VALUES ('H111', 'G111', DATE'1999-01-01', DATE'1999-01-02', '1');

13.29 Update the price of all rooms by 5%.

UPDATE room SET price = price*1.05;

13.30 Create a separate table with the same structure as the Booking table to hold archive records. Using the INSERT statement, copy the records from the Booking table to the archive table relating to bookings before 1st January 1990. Delete all bookings before 1st January 1990 from the Booking table.

CREATE TABLE booking_old(
    hotel_no    CHAR(4)        NOT NULL,
    guest_no    CHAR(4)        NOT NULL,
    date_from   DATETIME       NOT NULL,
    date_to     DATETIME       NULL,
    room_no     VARCHAR(4)     NOT NULL);

INSERT INTO booking_old
    (SELECT * FROM booking WHERE date_to < DATE'1990-01-01');

DELETE FROM booking WHERE date_to < DATE'1990-01-01';

General

13.31 Investigate the SQL dialect on any DBMS that you are currently using.
Determine the compliance of the DBMS with the ISO standard. Investigate the functionality of any extensions the DBMS supports. Are there any functions not supported?

This is a small student project, the result of which is dependent on the dialect of SQL being used.

13.32 Show that a query using the HAVING clause has an equivalent formulation without a HAVING clause.

Hint: Allow the students to show that the restricted groups could have been restricted earlier with a WHERE clause.

13.33 Show that SQL is relationally complete.

Hint: Allow the students to show that each of the relational algebra operations can be expressed in SQL.

Chapter 14 Advanced SQL

Review Questions

14.1 Discuss the advantages and disadvantages of views.

See Section 14.1.7.

14.2 Describe how the process of view resolution works.

Described in Section 14.1.3.

14.3 What restrictions are necessary to ensure that a view is updatable?

The ISO standard specifies the views that must be updatable in a system that conforms to the standard. The definition given in SQL-92 is that a view is updatable if and only if:
• DISTINCT is not specified; that is, duplicate rows must not be eliminated from the query results.
• Every element in the SELECT list of the defining query is a column name (rather than a constant, expression, or aggregate function) and no column appears more than once.
• The FROM clause specifies only one table; that is, the view must have a single source table for which the user has the required privileges. If the source table is itself a view, then that view must satisfy these conditions. This, therefore, excludes any views based on a join, union (UNION), intersection (INTERSECT), or difference (EXCEPT).
• The WHERE clause does not include any nested SELECTs that reference the table in the FROM clause.
• There is no GROUP BY or HAVING clause in the defining query.
In addition, every row that is added through the view must not violate the integrity constraints of the base table (Section 14.1.5).

14.4 Discuss the functionality and importance of the Integrity Enhancement Feature (IEF).

Required data:            NOT NULL of CREATE/ALTER TABLE.
Domain constraint:        CHECK clause of CREATE/ALTER TABLE and CREATE DOMAIN.
Entity integrity:         PRIMARY KEY (and UNIQUE) clause of CREATE/ALTER TABLE.
Referential integrity:    FOREIGN KEY clause of CREATE/ALTER TABLE.
Enterprise constraints:   CHECK and UNIQUE clauses of CREATE/ALTER TABLE and (CREATE) ASSERTION.

14.5 Discuss how the Access Control mechanism of SQL works.

Each user has an authorization identifier (allocated by the DBA). Each object has an owner. Initially, only the owner has access to an object, but the owner can pass privileges to carry out certain actions on to other users via the GRANT statement and take away given privileges using REVOKE.

14.6 Discuss the difference between interactive SQL, static embedded SQL, and dynamic embedded SQL.

Interactive SQL: SQL statements are usually input interactively from a terminal.

Static and dynamic embedded SQL refer to the embedding of SQL statements in a high-level programming language such as C. Embedded SQL statements are converted into function calls by a preprocessor supplied by the DBMS vendor. The basic difference between static and dynamic embedded SQL is that static SQL does not allow host variables to be used in place of table names or column names.

Exercises

Answer the following questions using the relational schema from the Exercises of Chapter 13.

14.7 Create a view containing the hotel name and the names of the guests staying at the hotel.
CREATE VIEW hotel_data(hotel_name, guest_name) AS
    SELECT h.name, g.name
    FROM hotel h, guest g, booking b
    WHERE h.hotel_no = b.hotel_no AND g.guest_no = b.guest_no AND
          b.date_from <= CURRENT_DATE AND b.date_to >= CURRENT_DATE;

14.8 Create a view containing the account for each guest at the Grosvenor Hotel.

CREATE VIEW booking_out_today(guest_no, name, address, amount) AS
    SELECT g.guest_no, g.name, g.address, r.price*(b.date_to - b.date_from)
    FROM guest g, booking b, hotel h, room r
    WHERE g.guest_no = b.guest_no AND r.room_no = b.room_no AND
          r.hotel_no = b.hotel_no AND b.hotel_no = h.hotel_no AND
          h.name = 'Grosvenor Hotel' AND b.date_to = CURRENT_DATE;

The view column list is needed because the computed column has no name of its own; the name amount is an arbitrary choice. The join on r.hotel_no ensures that the price is taken from the room in the hotel that was actually booked.

14.9 Give the users Manager and Deputy full access to these views, with the privilege to pass the access on to other users.

GRANT ALL PRIVILEGES ON hotel_data TO manager, deputy WITH GRANT OPTION;
GRANT ALL PRIVILEGES ON booking_out_today TO manager, deputy WITH GRANT OPTION;

14.10 Give the user Accounts SELECT access to these views. Now revoke the access from this user.

GRANT SELECT ON hotel_data TO accounts;
GRANT SELECT ON booking_out_today TO accounts;

REVOKE SELECT ON hotel_data FROM accounts;
REVOKE SELECT ON booking_out_today FROM accounts;

14.11 Create the Hotel table using the Integrity Enhancement Features of SQL.

CREATE DOMAIN HOTEL_NUMBER AS CHAR(4);

CREATE TABLE hotel(
    hotel_no    HOTEL_NUMBER   NOT NULL,
    name        VARCHAR(20)    NOT NULL,
    address     VARCHAR(50)    NOT NULL,
    PRIMARY KEY (hotel_no));

14.12 Now create the Room, Booking, and Guest tables using the Integrity Enhancement Features of SQL with the following constraints:

(a) Type must be one of Single, Double, or Family.
(b) Price must be between £10 and £100.
(c) Room_No must be between 1 and 100.
(d) Date_From and Date_To must be greater than today's date.
(e) The same room cannot be double booked.
(f) The same guest cannot have overlapping bookings.
CREATE DOMAIN ROOM_TYPE AS CHAR(1)
    CHECK(VALUE IN ('S', 'F', 'D'));
CREATE DOMAIN HOTEL_NUMBERS AS HOTEL_NUMBER
    CHECK(VALUE IN (SELECT hotel_no FROM hotel));
CREATE DOMAIN ROOM_PRICE AS DECIMAL(5,2)
    CHECK(VALUE BETWEEN 10 AND 100);
-- Compare numerically: as character strings, '2' does not lie between '1' and '100'.
CREATE DOMAIN ROOM_NUMBER AS VARCHAR(4)
    CHECK(CAST(VALUE AS INTEGER) BETWEEN 1 AND 100);

CREATE TABLE room(
    room_no     ROOM_NUMBER    NOT NULL,
    hotel_no    HOTEL_NUMBERS  NOT NULL,
    type        ROOM_TYPE      NOT NULL DEFAULT 'S',
    price       ROOM_PRICE     NOT NULL,
    PRIMARY KEY (room_no, hotel_no),
    FOREIGN KEY (hotel_no) REFERENCES hotel
        ON DELETE CASCADE ON UPDATE CASCADE);

CREATE DOMAIN GUEST_NUMBER AS CHAR(4);

CREATE TABLE guest(
    guest_no    GUEST_NUMBER   NOT NULL,
    name        VARCHAR(20)    NOT NULL,
    address     VARCHAR(50)    NOT NULL,
    PRIMARY KEY (guest_no));

CREATE DOMAIN GUEST_NUMBERS AS GUEST_NUMBER
    CHECK(VALUE IN (SELECT guest_no FROM guest));
CREATE DOMAIN BOOKING_DATE AS DATETIME
    CHECK(VALUE > CURRENT_DATE);

CREATE TABLE booking(
    hotel_no    HOTEL_NUMBERS  NOT NULL,
    guest_no    GUEST_NUMBERS  NOT NULL,
    date_from   BOOKING_DATE   NOT NULL,
    date_to     BOOKING_DATE   NULL,
    room_no     ROOM_NUMBER    NOT NULL,
    PRIMARY KEY (hotel_no, guest_no, date_from),
    FOREIGN KEY (hotel_no) REFERENCES hotel
        ON DELETE CASCADE ON UPDATE CASCADE,
    FOREIGN KEY (guest_no) REFERENCES guest
        ON DELETE NO ACTION ON UPDATE CASCADE,
    -- Column order matches the primary key of room: (room_no, hotel_no).
    FOREIGN KEY (room_no, hotel_no) REFERENCES room
        ON DELETE NO ACTION ON UPDATE CASCADE,
    -- The last condition in each constraint stops a booking from clashing with itself.
    CONSTRAINT room_booked
        CHECK (NOT EXISTS (SELECT *
                           FROM booking b
                           WHERE b.date_to > booking.date_from AND
                                 b.date_from < booking.date_to AND
                                 b.room_no = booking.room_no AND
                                 b.hotel_no = booking.hotel_no AND
                                 (b.guest_no <> booking.guest_no OR
                                  b.date_from <> booking.date_from))),
    CONSTRAINT guest_booked
        CHECK (NOT EXISTS (SELECT *
                           FROM booking b
                           WHERE b.date_to > booking.date_from AND
                                 b.date_from < booking.date_to AND
                                 b.guest_no = booking.guest_no AND
                                 (b.hotel_no <> booking.hotel_no OR
                                  b.date_from <> booking.date_from))));

14.13 Investigate the embedded SQL functionality of any DBMS that you use. Determine the compliance of the DBMS with the ISO standard.
Investigate the functionality of any extensions the DBMS supports. Are there any functions not supported?

This is a small student project, the result of which is dependent on the system analyzed.

14.14 Write a small program that prompts the user for guest details and inserts the record into the guest table.

#include <stdio.h>
#include <stdlib.h>
EXEC SQL INCLUDE sqlca;

int main(void)
{
    EXEC SQL BEGIN DECLARE SECTION;
        char guest_no[5];       /* input guest number */
        char name[21];          /* input guest name */
        char address[51];       /* input address */
    EXEC SQL END DECLARE SECTION;

    /* Connect to database */
    EXEC SQL CONNECT 'hoteldb';
    if (sqlca.sqlcode < 0) exit(-1);

    /* Prompt for data; the field widths keep scanf within each buffer,
       and %[^\n] allows spaces in the name and address */
    printf("Enter guest number: ");
    scanf("%4s", guest_no);
    printf("Enter guest name: ");
    scanf(" %20[^\n]", name);
    printf("Enter address: ");
    scanf(" %50[^\n]", address);

    EXEC SQL INSERT INTO guest VALUES (:guest_no, :name, :address);

    /* Check success */
    if (sqlca.sqlcode >= 0)
        printf("Insert successful\n");
    else
        printf("Insert unsuccessful\n");

    /* Finally, disconnect from the database */
    EXEC SQL DISCONNECT;
    return 0;
}

14.15 Write a small program that prompts the user for booking details, checks that the specified hotel, guest and room exist, and inserts the record into the booking table.

#include <stdio.h>
#include <stdlib.h>
EXEC SQL INCLUDE sqlca;

int main(void)
{
    EXEC SQL BEGIN DECLARE SECTION;
        char hotel_no[5];       /* input hotel number */
        char guest_no[5];       /* input guest number */
        char date_from[26];     /* input from date */
        char date_to[26];       /* input to date */
        char room_no[5];        /* input room number */
        char hname[21];         /* return hotel name */
        char gname[21];         /* return guest name */
        char type[2];           /* return room type */
    EXEC SQL END DECLARE SECTION;

    /* Connect to database */
    EXEC SQL CONNECT 'hoteldb';
    if (sqlca.sqlcode < 0) exit(-1);

    /* Prompt for data. A SELECT that finds no row sets a positive sqlcode
       (not found), so any non-zero value is treated as "does not exist". */
    printf("Enter hotel number: ");
    scanf("%4s", hotel_no);
    EXEC SQL SELECT name INTO :hname FROM hotel WHERE hotel_no = :hotel_no;
    if (sqlca.sqlcode != 0) {
        printf("Hotel %s does not exist ... exiting\n", hotel_no);
        exit(-1);
    }

    printf("Enter guest number: ");
    scanf("%4s", guest_no);
    EXEC SQL SELECT name INTO :gname FROM guest WHERE guest_no = :guest_no;
    if (sqlca.sqlcode != 0) {
        printf("Guest %s does not exist ... exiting\n", guest_no);
        exit(-1);
    }

    /* The room must exist in the hotel entered above */
    printf("Enter room number: ");
    scanf("%4s", room_no);
    EXEC SQL SELECT type INTO :type FROM room
             WHERE room_no = :room_no AND hotel_no = :hotel_no;
    if (sqlca.sqlcode != 0) {
        printf("Room %s does not exist ... exiting\n", room_no);
        exit(-1);
    }

    printf("Enter from date: ");
    scanf("%25s", date_from);
    printf("Enter to date: ");
    scanf("%25s", date_to);

    EXEC SQL INSERT INTO booking
             VALUES (:hotel_no, :guest_no, :date_from, :date_to, :room_no);

    /* Check success */
    if (sqlca.sqlcode >= 0)
        printf("Insert successful\n");
    else
        printf("Insert unsuccessful\n");

    /* Finally, disconnect from the database */
    EXEC SQL DISCONNECT;
    return 0;
}

14.16 Write a program that increases the price of every room by 5%.

#include <stdio.h>
#include <stdlib.h>
EXEC SQL INCLUDE sqlca;

int main(void)
{
    /* Connect to database */
    EXEC SQL CONNECT 'hoteldb';
    if (sqlca.sqlcode < 0) exit(-1);

    /* Display message for user and update the table */
    printf("Updating ROOM table\n");
    EXEC SQL UPDATE room SET price = price*1.05;

    /* Check success */
    if (sqlca.sqlcode >= 0)
        printf("Update successful\n");
    else
        printf("Update unsuccessful\n");

    /* Commit the transaction */
    EXEC SQL COMMIT;

    /* Finally, disconnect from the database */
    EXEC SQL DISCONNECT;
    return 0;
}

14.17 See the solution following 14.18.

14.18 Write a program that allows the user to insert data into any user-specified table.

There are several ways to write this program.
For example:

a) Use static SQL: after the user enters the table name, a switch branches to the appropriate piece of code, which prompts for the relevant data and performs an INSERT statement. This solution is not very elegant.

b) Use the program given in Example 14.17 of the book and type in the relevant INSERT statement.

14.17 Write a program that calculates the account for every guest checking out of the Grosvenor Hotel today.

#include <stdio.h>
#include <stdlib.h>
EXEC SQL INCLUDE sqlca;

int main(void)
{
    EXEC SQL BEGIN DECLARE SECTION;
        char guest_no[5];       /* guest number */
        char gname[21];         /* guest name */
        char address[51];       /* guest address */
        double balance;         /* amount due */
    EXEC SQL END DECLARE SECTION;

    /* Connect to database */
    EXEC SQL CONNECT 'hoteldb';
    if (sqlca.sqlcode < 0) exit(-1);

    /* Establish SQL error handling */
    EXEC SQL WHENEVER SQLERROR GOTO error;
    EXEC SQL WHENEVER NOT FOUND GOTO done;

    /* Declare cursor for selection */
    EXEC SQL DECLARE booking_out_cursor CURSOR FOR
        SELECT g.guest_no, g.name, g.address, r.price*(b.date_to - b.date_from)
        FROM guest g, booking b, hotel h, room r
        WHERE g.guest_no = b.guest_no AND r.room_no = b.room_no AND
              r.hotel_no = b.hotel_no AND b.hotel_no = h.hotel_no AND
              h.name = 'Grosvenor Hotel' AND b.date_to = CURRENT_DATE
        ORDER BY g.name;

    /* Open the cursor to start of selection */
    EXEC SQL OPEN booking_out_cursor;

    printf("Guest Number\t Guest Name\t Guest Address\t Balance\n\n");

    /* Loop to fetch each row of the result table;
       NOT FOUND transfers control to done when the rows are exhausted */
    for ( ; ; ) {
        /* Fetch next row of the result table */
        EXEC SQL FETCH booking_out_cursor
                 INTO :guest_no, :gname, :address, :balance;
        /* Display data */
        printf("%s\t%s\t%s\t%f\n", guest_no, gname, address, balance);
    }

    /* Error condition - print out error */
error:
    printf("SQL error %ld\n", (long) sqlca.sqlcode);

done:
    /* Close the cursor before completing */
    EXEC SQL WHENEVER SQLERROR continue;
    EXEC SQL CLOSE booking_out_cursor;
    EXEC SQL DISCONNECT;
    return 0;
}
Chapter 15 QBE

Exercises

The student should attempt the following exercises using the QBE facility (or similar) of the available DBMS.

15.1 Create the sample tables of the DreamHome case study, shown in Figure 3.3, and carry out the exercises demonstrated in this chapter, using (where possible) the QBE facility of your DBMS.

15.2 Create the following additional select QBE queries for the sample tables of the DreamHome case study, using (where possible) the QBE facility of your DBMS.

(a) Retrieve the branch number, address, and telephone number for all branch offices.
(b) Retrieve the staff number, position, and salary for all members of staff working at Branch Office B3.
(c) Retrieve the details of all flats in Glasgow.
(d) Retrieve the details of all female members of staff who are older than 25 years.
(e) Retrieve the full name and telephone number of all customers who have viewed flats in Glasgow.
(f) Retrieve the total number of properties, according to property type.
(g) Retrieve the total number of staff working at each branch office, ordered by branch number.

15.3 Create the following additional advanced QBE queries for the sample tables of the DreamHome case study, using (where possible) the QBE facility of your DBMS.

(a) Create a parameter query that prompts for a property number and then displays the details of that property.
(b) Create a parameter query that prompts for the first and last names of a member of staff and then displays the details of the property that the member of staff is responsible for.
(c) Add several more records into the Property_for_Rent table to reflect the fact that property owners 'Carol Farrel' and 'Tony Shaw' now own many properties in several cities. Create a select query to display, for each owner, the number of properties he or she owns in each city. Now, convert the select query into a crosstab query and assess whether the display is more or less useful when comparing the number of properties owned by each owner in each city.
(d) Introduce an error into your Staff table by entering an additional record for the member of staff called 'David Ford', assigned a new staff number. Use the Find Duplicates query to identify this error.
(e) Use the Find Unmatched query to identify those members of staff who are not assigned to manage property.
(f) Create an autolookup query that fills in the details of an owner when a new property record is entered into the Property_for_Rent table and the owner of the property already exists in the database.

15.4 Use action queries to carry out the following tasks on the sample tables of the DreamHome case study, using (where possible) the QBE facility of your DBMS.

(a) Create a cut-down version of the Property_for_Rent table called PROPERTY_GLASGOW, which has the Pno, Street, Pcode and Type fields of the original table and contains only the details of properties in Glasgow.
(b) Remove all records of property viewings that do not have an entry in the Comment field.
(c) Update the salary of all members of staff, except Managers, by 12.5%.
(d) Create a table called NewRenter, which contains the details of potential renters of property. Append this information into the original Renter table.

15.5 Using the sample tables of the DreamHome case study, create equivalent QBE queries for the SQL examples given in Chapter 15.

Part Four Selected Database Issues

Chapter 16 Security

Review Questions

16.1 Explain the purpose and scope of database security.

The purpose is the protection of the data. However, the scope is wider than that of the DBMS alone, and takes into account the database environment. Consequently, it also considers the hardware, software, and users. See also Section 16.1.

16.2 List six different types of threat that could affect a database system, and for each, describe the controls that you would use to counteract each of them.
For threats see Section 16.1.1, for computer-based countermeasures see Section 16.2, and for non-computer-based controls see Section 16.3.

16.3 Explain the following:

(a) authorization         See Section 16.2.1
(b) backup                See Section 16.2.3 and 16.2.6
(c) encryption            See Section 16.2.5
(d) contingency plan      See Section 16.3.1
(e) personnel controls    See Section 16.3.2
(f) Escrow agreement      See Section 16.3.4
(g) privacy               See Section 16.8
(h) data protection       See Section 16.8

16.4 Outline the stages of risk analysis and provide a brief explanation of each stage.

See Section 16.7.

Exercises

16.5 Identify an important computer-based application in your computing environment and determine:

(a) the type of data the application uses and produces.
(b) the integrity checks that are required.
(c) how the system performs the integrity checks.

The selected application may be used by a single user, group of users, department, or the entire organization. Investigating the application may require any or all of the following to be undertaken:
• users to be interviewed
• forms to be obtained or perused
• analysts, designers and developers to be interviewed
• use of the system.

16.6 Select a DBMS used within your computing environment and evaluate how well it supports all the integrity controls described in the chapter.

The DBMS may be either mainframe or PC-based. A good method of evaluating a system is to investigate how the integrity constraints for an application are implemented. You may find that basic constraints can be easily handled, but more complex ones, such as conditional business rules, require much more effort.

16.7 Carry out an investigation of your computing environment and:

(a) List all potential threats and breaches of security you can identify.
(b) List all examples you find of countermeasures against potential threats.
(c) Determine whether or not any risk assessment is undertaken and if not, how security breaches are dealt with.

It is better to focus on a particular area of concern here, rather than attempt to cover an entire department's or organization's operation. The student will need to know how procedures are carried out and how the application functions. If all exercises are being undertaken, it may be useful to build on the work carried out for 14.6.

16.8 Determine the nature and extent of the personal data used by your computing environment and:

(a) Identify who has responsibility for setting guidelines for its collection and usage.
(b) Determine which staff handle personal data and whether they receive any special training regarding the handling of this type of data. Is the training regularly updated?
(c) Give details of the arrangements for access to personal data.

This is an investigation into the policies, procedures and practices carried out within the student's environment. The student should not require access to any data, only knowledge of the type of data collected and stored. Again, it may be easier to focus on a particular application area, such as a department.

16.9 Discover what legislation exists that applies to data of a personal nature. Does it apply to all data, or just that held on computer?

A useful starting point is to determine whether there is someone who is responsible for managing the legislation, such as a Data Protection Registrar, and obtain information from him/her. The focus here could simply be national or federal legislation, although the student may wish to check out state or provincial legislation where it exists.

16.10 Consider the DreamHome case study introduced in Section 1.7. List the potential threats that should be considered in this environment and propose countermeasures to overcome them.

This should be tackled in a similar manner to Exercise 16.7 in determining the potential threats and any countermeasures.
16.11 Consider the Wellmeadows Hospital case study introduced in Appendix A. List the potential threats that should be considered in this environment and propose countermeasures to overcome them.

This should be tackled in a similar manner to Exercise 16.7 in determining the potential threats and any countermeasures.

Chapter 17 Transaction Management

Review Questions

17.1 Explain what is meant by a transaction. Why are transactions important units of operation in a DBMS?

Transaction: An action or series of actions, carried out by a single user or application program, which accesses or changes the contents of the database (Section 17.1). A transaction is a logical unit of work that transforms the database from one consistent state to another. It is also the unit of concurrency and recovery control.

17.2 The consistency and reliability aspects of transactions are due to the "ACIDity" properties of transactions. Discuss each of these properties and how they relate to the concurrency control and recovery mechanisms. Give examples to illustrate your answer.

The ACID properties are discussed in Section 17.1.1.

17.3 Discuss, with examples, the types of problems that can occur in a multiuser environment when concurrent access to the database is allowed.

The lost update problem, the uncommitted dependency problem, and the inconsistent analysis problem (Examples 17.1 - 17.3).

17.4 Give full details of a mechanism for concurrency control that can be used to ensure that the types of problems discussed in 17.3 cannot occur. Show how the mechanism prevents the problems illustrated from occurring. Discuss how the concurrency control mechanism interacts with the transaction mechanism.

The answer should discuss 2PL, timestamping, or an optimistic technique. Solutions to the above problems for 2PL are given in Examples 17.6 - 17.8. The transaction is the unit of concurrency control.
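To make the lost update problem of 17.3 and its prevention by locking (17.4) concrete, the following is a minimal Python sketch, not the book's examples. The function and class names (lost_update, LockedItem, run_with_locking) and the amounts are hypothetical and chosen only for illustration; a threading.Lock stands in for an exclusive lock on a single data item. Because each transaction here takes only one lock, acquiring it before the read and releasing it after the write trivially satisfies the two-phase rule.

```python
import threading


def lost_update(balance):
    """Replay an unsafe interleaving of two transactions on one data item."""
    t1_copy = balance          # T1: read(balance)
    t2_copy = balance          # T2: read(balance), sees the same old value
    balance = t2_copy + 100    # T2: write(balance + 100)
    balance = t1_copy - 10     # T1: write(balance - 10), overwriting T2's update
    return balance


class LockedItem:
    """A data item guarded by a single exclusive lock (hypothetical helper)."""

    def __init__(self, value):
        self.value = value
        self._lock = threading.Lock()

    def update(self, delta):
        # Acquire the lock before reading and hold it until after the write,
        # so no other transaction can read the stale value in between.
        with self._lock:
            old = self.value
            self.value = old + delta


def run_with_locking(balance):
    """Run a withdrawal of 10 and a deposit of 100 as concurrent threads."""
    item = LockedItem(balance)
    workers = [threading.Thread(target=item.update, args=(d,)) for d in (-10, 100)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return item.value
```

Starting from a balance of 100, the unsafe interleaving ends at 90 because the deposit is overwritten, whereas the locked version ends at 190 whichever transaction obtains the lock first.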
17.5 Discuss the types of problems that can occur with locking-based mechanisms for concurrency control and the actions that can be taken by a DBMS to prevent them.

Deadlock/livelock (see end of Section 17.2.3 and Section 17.2.4).

17.6 Explain the concepts of serial, non-serial, and serializable schedules. State the rules for equivalence of schedules.

See Section 17.2.2.

17.7 Discuss the difference between conflict serializability and view serializability.

See Section 17.2.2.

17.8 Discuss the types of failure that may occur in a database environment. Explain why it would be unreasonable for a multiuser DBMS not to provide a recovery mechanism.

Failures are given in Section 17.3.1. The answer to the second part is based on the amount of work that potentially could be lost following a failure.

17.9 Discuss how the log file (or journal) is a fundamental feature in any recovery mechanism. Explain what is meant by forward and backward recovery and describe how the log file is used in forward and backward recovery. What is the significance of the write-ahead log protocol? How do checkpoints affect the recovery protocol?

The log file contains before- and after-images of updates to the database. Before-images can be used to undo changes to the database; after-images can be used to redo changes. The log file also contains checkpoint records, which can reduce the time needed for recovery following a failure.

17.10 Discuss the following advanced transaction models: (a) nested transactions, (b) sagas, (c) multi-level transactions, (d) dynamically restructuring transactions.

See Section 17.4.

Exercises

17.11 Analyze the DBMSs that you are currently using. What concurrency protocol does the DBMS use? What type of recovery mechanism is used? What support is provided for the advanced transaction models discussed in Section 17.4?

This is a small student project, the result of which is dependent on the system analyzed.
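As a rough illustration of how the before- and after-images described in 17.9 support backward and forward recovery, here is a small Python sketch. The log record layout, the recover function, and the sample data are assumptions made for this example only; checkpoints and the write-ahead rule are not modelled.

```python
def recover(db, log):
    """Bring db to a consistent state after a crash, using the log.

    Assumed record layout (hypothetical):
      ("start", tid)
      ("write", tid, item, before_image, after_image)
      ("commit", tid)
    """
    committed = {rec[1] for rec in log if rec[0] == "commit"}

    # Backward recovery: scan the log in reverse and restore the before-image
    # of every update made by a transaction that never committed.
    for rec in reversed(log):
        if rec[0] == "write" and rec[1] not in committed:
            _, _, item, before, _ = rec
            db[item] = before

    # Forward recovery: scan the log in order and reapply the after-image
    # of every update made by a committed transaction.
    for rec in log:
        if rec[0] == "write" and rec[1] in committed:
            _, _, item, _, after = rec
            db[item] = after

    return db


# State on disk at the crash: T1's committed update to x had not yet reached
# the disk, while T2's uncommitted update to y had.
disk = {"x": 1, "y": 20}
log = [
    ("start", "T1"),
    ("write", "T1", "x", 1, 10),
    ("commit", "T1"),
    ("start", "T2"),
    ("write", "T2", "y", 2, 20),
]
recovered = recover(dict(disk), log)
```

In this example T1 is redone (x becomes 10) and T2 is undone (y returns to 2). A checkpoint record would let the scans start later in the log rather than at its beginning.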
17.12 (a) Explain what is meant by the constrained write rule, and explain how to test whether a schedule is serializable under the constrained write rule. Using the above method, determine whether the following schedule is serializable:

S = [R1(Z), R2(Y), W2(Y), R3(Y), R1(X), W1(X), W1(Z), W3(Y), R2(X), R1(Y), W1(Y), W2(X), R3(W), W3(W)]

where Ri(Z)/Wi(Z) indicates a read/write by transaction i on data item Z.

Constrained write rule: a transaction updates a data item based on its old value, which is first read by the transaction. A precedence graph can be produced to test for serializability.

[Figure: precedence graph with nodes T1, T2, T3. The conflicts on Y give the edges T2 → T3, T2 → T1 and T3 → T1; the conflicts on X give the edge T1 → T2.]

There is a cycle in the precedence graph (for example T1 → T2 → T1), which implies that the schedule is not serializable.

(b) Would it be sensible to produce a concurrency control algorithm based on serializability? Give justification for your answer. How is serializability used in standard concurrency control algorithms?

No - interleaving of operations from concurrent transactions is typically determined by the operating system scheduler. Hence, it is practically impossible to determine beforehand how the operations will be interleaved to ensure serializability. If transactions are executed and then you test for serializability, you would have to cancel the effect of a schedule if it turns out not to be serializable. This would be impractical!

17.13 Produce a wait-for-graph for the following transaction scenario and determine whether deadlock exists.

Transaction   Data items locked by transaction   Data items transaction is waiting for
T1            X2                                 X1, X3
T2            X3, X10                            X7, X8
T3            X8                                 X4, X5
T4            X7                                 X1
T5            X1, X5                             X3
T6            X4, X9                             X6
T7            X6                                 X5

[Figure: wait-for-graph with nodes T1 to T7 and edges T1 → T2, T1 → T5, T2 → T3, T2 → T4, T3 → T5, T3 → T6, T4 → T5, T5 → T2, T6 → T7, T7 → T5.]

The cycles in the graph (for example T2 → T4 → T5 → T2) imply that deadlock exists.

17.14 Write an algorithm for shared and exclusive locking. How does granularity affect this algorithm?
read_lock(X):
B:  if LOCK(X) = "unlocked" then
    begin
        LOCK(X) = "read-locked";
        no_of_reads(X) = 1
    end
    else if LOCK(X) = "read-locked" then
        no_of_reads(X) = no_of_reads(X) + 1
    else
    begin
        wait (until LOCK(X) = "unlocked" and the lock manager wakes up the transaction);
        goto B
    end

write_lock(X):
B:  if LOCK(X) = "unlocked" then
        LOCK(X) = "write-locked"
    else
    begin
        wait (until LOCK(X) = "unlocked" and the lock manager wakes up the transaction);
        goto B
    end;

unlock_item(X):
    if LOCK(X) = "write-locked" then
    begin
        LOCK(X) = "unlocked";
        wakeup one of the waiting transactions, if any
    end
    else if LOCK(X) = "read-locked" then
    begin
        no_of_reads(X) = no_of_reads(X) - 1;
        if no_of_reads(X) = 0 then
        begin
            LOCK(X) = "unlocked";
            wakeup one of the waiting transactions, if any
        end;
    end;

Algorithm 1 Locking and unlocking operations for two-mode (read-write or shared-exclusive) locks

17.15 Write an algorithm that checks whether the concurrently executing transactions are in deadlock.

Boolean function deadlock_detection
Input: A table called Wait_for_Table containing Transaction_id; Data_Item_Locked; Data_Item_Waiting_For
Output: Boolean flag indicating whether system is deadlocked.

begin
    Deadlock = FALSE;
    Transaction_stack = NULL;
    for next transaction in Wait_for_Table while not Deadlock
    begin
        push next transaction_id into Transaction_stack;
        for next Data_Item_Waiting_For of transaction on top of stack and not Deadlock and not Transaction_stack = NULL
        begin
            D_next = next Data_Item_Waiting_For;
            find Tran_id of transaction which has locked D_next;
            if Tran_id is in stack then
                Deadlock = TRUE;
            else
                push Tran_id to Transaction_stack;
        end
        pop Transaction_stack;
    end
    return Deadlock;
end

17.16 Explain why stable storage cannot really be implemented. How would you simulate stable storage?

Information residing on stable storage is never lost. Theoretically, this cannot be guaranteed.
To implement an approximation of stable storage, we need to replicate information in several nonvolatile storage media with independent failure modes and update the information in a controlled manner. Although a large number of copies reduces the probability of a failure, it is usually reasonable to simulate stable storage with only two copies. Block transfer can result in: successful completion, partial failure (destination block has incorrect information) and total failure (destination block not written to).

If a data transfer failure occurs, the system must detect it and recover the block to a consistent state. To do so, the system maintains two physical blocks for each logical database block, written as follows:

1. Write information to first physical block.
2. When first is successfully complete, write same information to second physical block.
3. Output is complete only after the second write successfully completes.

17.17 Would it be realistic for a DBMS to dynamically maintain a wait-for-graph rather than create it each time the deadlock detection algorithm runs? Explain your answer.

Yes, could do this by maintaining the WFG in memory and only updating the directed edges that change.

Chapter 18 Query Processing

Review Questions

18.1 What are the objectives of query processing?

The aims of query processing are to transform a query written in a high-level language, typically SQL, into a correct and efficient execution strategy expressed in a low-level language (implementing relational algebra), and to execute the strategy to retrieve the required data (see Section 18.1).

18.2 How does query processing in relational systems differ from the processing of low-level query languages for network and hierarchical systems?
In first generation network and hierarchical database systems, the low-level procedural query language is generally embedded in a high-level programming language such as COBOL, and it is the programmer’s responsibility to select the most appropriate execution strategy. In contrast, with declarative languages such as SQL, the user specifies what data is required rather than how it is to be retrieved. This relieves the user of the responsibility of determining, or even knowing, what constitutes a good execution strategy and makes the language more universally usable (see start of chapter).

18.4 What are the typical stages of query decomposition?

The typical stages of query decomposition are analysis, normalization, semantic analysis, simplification, and query restructuring (see Section 18.2).

18.5 What is the difference between conjunctive and disjunctive normal form?

Conjunctive normal form: A sequence of conjuncts that are connected with the ∧ (AND) operator. Each conjunct contains one or more terms connected by the ∨ (OR) operator. A conjunctive selection contains only those tuples that satisfy all conjuncts.

Disjunctive normal form: A sequence of disjuncts that are connected with the ∨ (OR) operator. Each disjunct contains one or more terms connected by the ∧ (AND) operator. A disjunctive selection contains those tuples formed by the union of all tuples that satisfy the disjuncts.

See Section 18.2 under normalization.

18.6 How would you check the semantic correctness of a query?

See Section 18.2 under semantic analysis.

18.7 State the transformation rules that apply to:

(a) Selection operations. See Section 18.3.1: Rules 1, 2, 4, 6, 9.
(b) Projection operations. See Section 18.3.1: Rules 3, 4, 7, 10.
(c) Theta-join operations. See Section 18.3.1: Rules 5, 6, 7, 11.

18.8 State the heuristics that we should apply to improve the processing of a query.

See Section 18.3.2.
18.9 What type of statistics should a DBMS hold to be able to derive estimates of relational algebra operations?

See Section 18.4.1.

18.10 Under what circumstances would the system have to resort to a linear search when implementing a selection operation?

(1) Unordered file, with no indexes.
(2) Composite predicate where one of the terms contains an OR condition and the term requires a linear search.
(3) Composite predicate where no attribute can be used for efficient retrieval.

18.11 What are the main strategies for implementing the join operation?

Main strategies for implementing join are: block nested loop join, indexed nested loop join, sort-merge join, hash join. See Section 18.4.3.

18.12 What is the difference between materialization and pipelining?

Materialization - output of one operation is stored in a temporary relation for processing by next operation.

Pipelining - pipeline results of one operation to another operation without creating a temporary relation to hold the intermediate result. See Section 18.5.

18.13 Discuss the difference between linear and non-linear relational algebra trees. Give examples to illustrate your answer.

Linear tree - relation on one side of each operator is always a base relation. See Section 18.5 under Linear Trees.

18.14 What are the advantages and disadvantages of left-deep trees?

The inner relation of a join needs to be examined for each tuple of the outer relation, so inner relations must always be materialized. With a left-deep tree, the inner relation is always a base relation, and so already materialized.

Exercises

18.15 Calculate the cost of the three strategies cited in Example 18.1 if the Staff relation has 10,000 tuples, Branch has 500 tuples, there are 500 Managers (one for each Branch), and there are 10 London branches.

Costs:
(1) 10,010,500
(2) 30,500
(3) 11,520
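The three figures above can be reproduced with a short calculation. The cost formulas below are an assumption about how Example 18.1 counts tuple accesses (every intermediate result is written once and read back once, and each Staff tuple joins with exactly one Branch tuple); they are not quoted from the guide, but they do yield the stated answers. The function name `strategy_costs` is hypothetical.

```python
def strategy_costs(n_staff, n_branch, n_managers, n_london):
    """Tuple-access counts for three strategies, under the stated assumptions."""
    product = n_staff * n_branch

    # (1) Cartesian product first, then select: read both relations,
    #     write the product, read it back for the selection.
    cost1 = (n_staff + n_branch) + 2 * product

    # (2) Join first, then select: read both relations, write the join
    #     result (one tuple per staff member), read it back.
    cost2 = 2 * n_staff + (n_staff + n_branch)

    # (3) Select first, then join: read Staff and write/read the managers,
    #     read Branch and write/read the London branches.
    cost3 = n_staff + 2 * n_managers + n_branch + 2 * n_london

    return cost1, cost2, cost3


print(strategy_costs(10_000, 500, 500, 10))
```

The ordering of the three results shows the heuristic at work: applying the selections before the join avoids materializing a large intermediate relation.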
18.16 Using the Hotel schema given in the Exercises of Chapter 13, determine whether the following queries are semantically correct:

(a) SELECT r.type, r.price
FROM room r, hotel h
WHERE r.hotel_number = h.hotel_number AND h.hotel_name = ‘Grosvenor Hotel’ AND r.type > 100;

Not semantically correct: hotel_number and hotel_name not in schema; type is character string and so cannot be compared with an integer value (100).

(b) SELECT g.guest_no, g.name
FROM hotel h, booking b, guest g
WHERE h.hotel_no = b.hotel_no AND h.hotel_name = ‘Grosvenor Hotel’;

Not semantically correct: hotel_name not in schema; Guest table not connected to remainder of query.

(c) SELECT r.room_no, h.hotel_no
FROM hotel h, booking b, room r
WHERE h.hotel_no = b.hotel_no AND h.hotel_no = ‘H21’ AND b.room_no = r.room_no AND type = ‘S’ AND b.hotel_no = ‘H22’;

Not semantically correct: hotel_no cannot be both H21 in Hotel and H22 in Booking.

18.17 Again, using the Hotel schema given in the Exercises of Chapter 13, draw a relational algebra tree for each of the following queries and use the heuristic rules given in Section 18.3.2 to transform the queries into a more efficient form:

(a) SELECT r.rno, r.type, r.price
FROM room r, booking b, hotel h
WHERE r.room_no = b.room_no AND b.hotel_no = h.hotel_no AND h.hotel_name = ‘Grosvenor Hotel’ AND r.price > 100;

(b) SELECT g.guest_no, g.name
FROM room r, hotel h, booking b, guest g
WHERE h.hotel_no = b.hotel_no AND g.guest_no = b.guest_no AND h.hotel_no = r.hotel_no AND h.hotel_name = ‘Grosvenor Hotel’ AND date_from >= ‘1-Jan-98’ AND date_to <= ‘31-Dec-98’;

Discuss each step and state any transformation rules used in the process.

See Figures 18.1 and 18.2 overleaf.

18.18 Using the Hotel schema, assume the following:

There is a hash index with no overflow on the primary key attributes, Room_No/Hotel_No in Room.
There is a clustering index on the foreign key attribute Hotel_No in Room.
There is a B+-tree index on the Price attribute in Room.
There is a secondary index on the attribute Type in Room.

ntuples(Room) = 10000               bfactor(Room) = 200
ntuples(Hotel) = 50                 bfactor(Hotel) = 40
ntuples(Booking) = 100000           bfactor(Booking) = 60
ndistinct_hotel_no(Room) = 50
ndistinct_type(Room) = 10
ndistinct_price(Room) = 500
minprice(Room) = 200                maxprice(Room) = 50
nlevels_hotel_no(I) = 2
nlevels_type(I) = 2
nlevels_price(I) = 2                nlfblocks_price(I) = 50

(a) Calculate the cardinality and minimum cost for each of the following selection operations:

S1: σroom_no=1 ∧ hotel_no=1(Room)
Use hash index: cost = 1.
Linear search cost is: [10000/200]/2 = 25

S2: σtype='D'(Room)
Use equality condition on clustering index: SC_type(Room) = [10000/10] = 1000
Cost: 2 + [1000/200] = 5
Linear search cost is: 50

S3: σhotel_no=2(Room)
Use equality on clustering index: SC_hotel_no(Room) = [10000/50] = 200
Cost: 2 + [200/200] = 3
Linear search cost is: 50

S4: σprice>100(Room)
Use inequality on a secondary B+-tree:
Cost: 2 + [50/2 + 10000/2] = 5027
Linear search cost is: 50

S5: σtype='S' ∧ hotel_no=3(Room)
If use secondary index searches on each of the two components, costs are 5 and 3, respectively (from above). Optimizer would then choose search via clustering index on hotel_no, and check the remaining predicate in memory.
Linear search cost is: 50

S6: σtype='S' ∨ price<100(Room)
Use the secondary indexes on Type and Price, then take the union of the two sets.
Linear search cost is: 50

(b) Calculate the cardinality and minimum cost for each of the following join operations. Assume nbuffer = 100.

J1: Hotel ⋈hotel_no Room
Block Nested Loop: 102, buffer has only 1 block for Hotel and Room; 52, all of Hotel fits into buffer
Indexed Nested Loop: 152, using primary key index
Sort-Merge: 302 unsorted; 52 sorted
Hash: 156 if hash index fits in memory

J2: Hotel ⋈hotel_no Booking
Block Nested Loop: 33336, buffer has only 1 block for Hotel and Booking; 16669, all of Hotel fits into buffer
Indexed Nested Loop: 152, using primary key index
Sort-Merge: 250007 unsorted; 16669 sorted
Hash: 50007 if hash index fits in memory

J3: Room ⋈room_no Booking
Block Nested Loop: 833400, buffer has only 1 block for Room and Booking; 16717, all of Room fits into buffer
Indexed Nested Loop: 30050, using clustering index
Sort-Merge: 250305 unsorted; 16717 sorted
Hash: 50151 if hash index fits in memory

J4: Room ⋈hotel_no Hotel
Block Nested Loop: 150, buffer has only 1 block for Room and Hotel; 52, all of Room fits into buffer
Indexed Nested Loop: 30050, using clustering index
Sort-Merge: 302 unsorted; 52 sorted
Hash: 156 if hash index fits in memory

J5: Booking ⋈hotel_no Hotel
Block Nested Loop: 50001, buffer has only 1 block for Hotel and Booking; 17008, if (nbuffer-2) blocks for Booking; 16669, if all of Booking fits into buffer
Indexed Nested Loop: 1666716667, using linear search
Sort-Merge: 250007 unsorted; 16669 sorted
Hash: 50007 if hash index fits in memory

J6: Booking ⋈room_no Room
Block Nested Loop: 850017, buffer has only 1 block for Booking and Room; 25171, if (nbuffer-2) blocks for Booking; 16717, all of Booking fits into buffer
Indexed Nested Loop: 1666716667, using clustering index
Sort-Merge: 250305 unsorted; 16717 sorted
Hash: 50151 if hash index fits in memory

(c) Calculate the cardinality and minimum cost for each of the following projection operations:

P1: Πhotel_no(Hotel)
Duplicate elimination using sorting: 4
Duplicate elimination using hashing: 4
(cardinality of result stays same because Hotel_No is key attribute)

P2: Πhotel_no(Room)
Duplicate elimination using sorting: 350
Duplicate elimination using hashing: 51
(cardinality of result estimated as SC_hotel_no(Room) = 200, occupying 1 block)

P3: Πprice(Room)
Duplicate elimination using sorting: 350
Duplicate elimination using hashing: 51
(cardinality of result estimated as SC_price(Room) = 20, occupying 1 block)

P4: Πtype(Room)
Duplicate elimination using sorting: 350
Duplicate elimination using hashing: 55
(cardinality of result estimated as SC_type(Room) = 1000, occupying 5 blocks)

P5: Πhotel_no, price(Room)
Duplicate elimination using sorting: 350
Duplicate elimination using hashing: 51
(cannot be any more than cost of P2, P3)

18.19 Modify the block nested loop join and the indexed nested loop join algorithms presented in Section 18.4.3 to read (nbuffer - 2) blocks of the outer relation R at a time, rather than one block at a time.

Hint: Add an extra outer loop to each algorithm to read (nbuffer - 2) blocks of R at a time rather than process each block of R individually.
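The hint for 18.19 can be sketched for the block nested loop case as follows. This is a minimal in-memory illustration, not the algorithm of Section 18.4.3 itself: relations are modelled as lists of blocks (each block a list of tuples), the function name and the `match` parameter are hypothetical, and the returned cost simply counts block reads on the assumption that one buffer block is kept for the inner relation S and one for output.

```python
def block_nested_loop_join(r_blocks, s_blocks, nbuffer, match):
    """Join R and S, loading (nbuffer - 2) blocks of the outer relation R at a time."""
    chunk_size = nbuffer - 2  # assumption: 1 block reserved for S, 1 for output
    if chunk_size < 1:
        raise ValueError("nbuffer must be at least 3")

    result = []
    s_block_reads = 0
    # Extra outer loop: step through R one chunk of blocks at a time.
    for start in range(0, len(r_blocks), chunk_size):
        r_chunk = r_blocks[start:start + chunk_size]
        # S is scanned once per chunk rather than once per block of R.
        for s_block in s_blocks:
            s_block_reads += 1
            for r_block in r_chunk:
                for r in r_block:
                    for s in s_block:
                        if match(r, s):
                            result.append((r, s))

    block_reads = len(r_blocks) + s_block_reads
    return result, block_reads


r = [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]]  # 5 blocks of R
s = [[2, 4], [6, 7], [11, 12]]                 # 3 blocks of S
pairs, cost = block_nested_loop_join(r, s, nbuffer=4, match=lambda a, b: a == b)
print(sorted(pairs), cost)
```

With this change S is read ceil(nblocks(R) / (nbuffer - 2)) times instead of nblocks(R) times, which is where the saving in the "(nbuffer-2) blocks" figures comes from.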
[Figure 18.1: relational algebra trees for Exercise 18.17(a), from the initial tree (projection over a single selection on the Cartesian product of Room, Booking and Hotel) through the transformed trees in which the selections are pushed down, the Cartesian products become joins on r.room_no = b.room_no and b.hotel_no = h.hotel_no, and projections are pushed down.]

[Figure 18.2: relational algebra trees for Exercise 18.17(b), from the initial tree over Guest, Booking, Room and Hotel through the transformed trees with selections on hotel_name and on date_from/date_to pushed down, joins on g.guest_no = b.guest_no, h.hotel_no = b.hotel_no and h.hotel_no = r.hotel_no, and projections pushed down.]

Part Five Current Trends

Chapter 19 Distributed DBMSs - Concepts and Design

Review Questions

19.1 Explain what is meant by a DDBMS and discuss the motivation in providing such a system.

See Section 19.1.1; motivation given at start of Section 19.1.

19.2 Compare and contrast a DDBMS with distributed processing. Under what circumstances would you choose a DDBMS over distributed processing?

Distributed processing defined at end of Section 19.1.1.
Would choose a DDBMS, for example, if each site needed control over its own data, sites had their own existing DBMSs, communication costs would be significantly reduced, and so on. 19.3 Compare and contrast a DDBMS with a parallel DBMS. Under what circumstances would you choose a DDBMS over a parallel DBMS? Parallel DBMS defined at end of Section 19.1.1. 19.4 Discuss the advantages and disadvantages of a DDBMS. See Section 19.1.2. 19.5 One problem area with DDBMSs is that of distributed database design. Discuss the issues that have to be addressed with distributed database design. Discuss how these issues apply to the global system catalog. The question that is being addressed is how the database and the applications that run against it should be placed across the sites. Two basic alternatives: partitioned or replicated. In partitioned scheme database is divided into a number of disjoint partitions each of which is placed at a different site. Replicated designs can be fully or partially replicated. Two fundamental design issues are fragmentation and distribution. Mostly involves mathematical programming to minimize combined cost of storing the database, processing transactions against it, and communication. Problem is NP-hard; therefore proposed solutions are based on heuristics. The global system catalog (GSC) is only relevant if we talk about a distributed DBMS or multiDBMS that uses a global conceptual schema. Problems are similar to above. Briefly, a GSC may be either global to entire database or local; it may be maintained centrally at one site, or in a distributed fashion over a number of sites; finally, replication - there may be a single copy of the directory or multiple copies. These three dimensions are orthogonal to one another. 19.6 What are the strategic objectives for the definition and allocation of fragments? See start of Section 19.4. 19.7 Define and contrast alternative schemes for fragmenting a global relation. 
State how you would check for correctness to ensure that the database does not undergo semantic change during fragmentation.

Alternative schemes are: primary horizontal, vertical, mixed, and derived horizontal fragmentation (see Section 19.4). Correctness rules are: completeness, reconstruction, and disjointness (see Section 19.4 again).

19.8 What layers of transparency should be provided with a DDBMS? Give justification for your answer.

See Section 19.5.

19.9 A DDBMS must ensure that no two sites create a database object with the same name. One solution to this problem is to create a central name server. What are the disadvantages with this approach? Propose an alternative approach that overcomes these disadvantages.

See Section 19.5.1 - Naming Transparency. Problems with the central name server, which has the responsibility for ensuring uniqueness of all names in the system, are:

• loss of some local autonomy
• performance problems, if the central site becomes a bottleneck
• low availability: if the central site fails, the remaining sites cannot create any new database objects.

An alternative solution is to prefix an object with the identifier of the site that created it. For example, a relation BRANCH created at site S1 might be named S1.BRANCH. Similarly, we would need to be able to identify each fragment and each of its copies. Thus, copy 2 of fragment 3 of the branch relation created at site S1 might be referred to as S1.BRANCH.F3.C2. However, this results in loss of distribution transparency.

An approach which resolves the problems with both these solutions uses aliases for each database object. Thus, S1.BRANCH.F3.C2 might be known as local_branch by the user at site S1. The DDBMS has the task of mapping aliases to the appropriate database object.

Exercises

A multinational engineering company has decided to distribute its project management information at the regional level in mainland Britain.
The current centralised relational schema is as follows:

Employee    (NIN, First_Name, Last_Name, Address, Birth_Date, Sex, Salary, Tax_Code, Dept_No)
Department  (Dept_No, Dept_Name, Manager_NIN, Business_Area_No, Region_No)
Projects    (Proj_No, Proj_Name, Contract_Price, Project_Manager_NIN, Dept_No)
Works_On    (NIN, Proj_No, Hours_Worked)
Business    (Business_Area_No, Business_Area_Name)
Region      (Region_No, Region_Name)

where

Employee    contains employee details and the national insurance number NIN is the key.
Department  contains department details and Dept_No is the key. Manager_NIN identifies the employee who is the manager of the department. There is only one manager for each department.
Projects    contains details of the projects in the company and the key is Proj_No. The project manager is identified by the Project_Manager_NIN, and the department responsible for the project by Dept_No.
Works_On    contains details of the hours worked by employees on each project and (NIN, Proj_No) forms the key.
Business    contains names of the business areas and the key is Business_Area_No.
Region      contains names of the regions and the key is Region_No.

and Departments are grouped regionally as follows:

Region 1: Scotland; Region 2: Wales; Region 3: England

Information is required by business area which covers: Software Engineering, Mechanical Engineering and Electrical Engineering. There is no Software Engineering in Wales and all Electrical Engineering departments are in England. Projects are staffed by local department offices.

As well as distributing the data regionally, there is an additional requirement to access the employee data either by personal information (by Personnel) or by work related information (by Payroll).

19.10 Draw an Entity-Relationship (ER) diagram to represent this system.
For simplicity use crow’s foot notation:

[Figure: ER diagram with entities Employee, Works_On, Region, Department, Business and Project.]

19.11 Produce a distributed database design for the above system and include:

(a) a suitable fragmentation schema for the system;
(b) in the case of primary horizontal fragmentation, a minimal set of predicates;
(c) the reconstruction of global relations from fragments.

State any assumptions necessary to support your design.

Possible solution as follows:

Don't fragment Business/Region - replicate relations at all sites - only contain a small number of records.

Department

Use primary horizontal fragmentation for Department with minterm predicates:

D1: Region = 'Scotland' and Business_area = 'SE'
D2: Region = 'Scotland' and Business_area = 'ME'
D3: Region = 'Wales' and Business_area = 'ME'
D4: Region = 'England' and Business_area = 'SE'
D5: Region = 'England' and Business_area = 'ME'
D6: Region = 'England' and Business_area = 'EL'

Reconstruction: D1 ∪ D2 ∪ D3 ∪ D4 ∪ D5 ∪ D6

Employee

Use vertical fragmentation for Employee:

E1: Πnin,first_name,last_name,address,birth_date,sex,dept_no(Employee)
E2: Πnin,salary,tax_code(Employee)

Then use derived fragmentation on fragment E1 (⋉ denotes semijoin):

E1i: E1 ⋉dept_no Di    1 ≤ i ≤ 6

Reconstruction: (E11 ∪ E12 ∪ E13 ∪ E14 ∪ E15 ∪ E16) ⋈nin E2

Projects

Use derived fragmentation for Projects:

Pi: Projects ⋉dept_no Di    1 ≤ i ≤ 6

Reconstruction: (P1 ∪ P2 ∪ P3 ∪ P4 ∪ P5 ∪ P6)

Works_on

Use derived fragmentation for Works_on:

Wi: Works_on ⋉nin E1i    1 ≤ i ≤ 6

19.12 Repeat Exercise 19.11 for the DreamHome case study presented in Section 1.7.

Possible solution as follows:

Don't fragment Branch - replicate relations at all sites - only contain a small number of records.
Property_for_Rent

Use primary horizontal fragmentation for Property_for_Rent with minterm predicates (for example):

P1j: asking_price ≤ 39999 AND bno = j
P2j: 40000 ≤ asking_price ≤ 69999 AND bno = j
P3j: asking_price ≥ 70000 AND bno = j

1 ≤ j ≤ maximum number of branches

Reconstruction: ∪ (P1i ∪ P2i ∪ P3i), taken over all branches i ≥ 1

Staff

Assume salaries paid by head office (branch 1 say), so use vertical fragmentation first:

S1: Πsno, fname, lname, address, tel_no, bno(Staff)
S2: Πsno, position, sex, dob, salary, nin(Staff)

Then use horizontal fragmentation on fragment S1:

S1i: σbno=i(S1)    1 ≤ i ≤ j

Reconstruction: (S11 ∪ S12 ∪ S13 … ∪ S1j) ⋈sno S2

Renter

Use horizontal fragmentation:

Ri: σbno=i(Renter)    1 ≤ i ≤ j

Reconstruction: (R1 ∪ R2 ∪ R3 … ∪ Rj)

Viewing, Owner

Use derived fragmentation for Viewing and Owner (⋉ denotes semijoin):

Vik: Viewing ⋉ Pik    1 ≤ i ≤ 3, 1 ≤ k ≤ j
Oik: Owner ⋉ Pik    1 ≤ i ≤ 3, 1 ≤ k ≤ j

Reconstruction: as for Property_for_Rent

19.13 In Section 19.5.1 when discussing naming transparency, we proposed the use of aliases to uniquely identify each replica of each fragment. Provide an outline design for the implementation of this approach to naming transparency.

FUNCTION map(name)
{
    IF name appears in the replica table
    THEN result = name of replica of name;
    IF name appears in the fragment table
    THEN {
        result = expression to construct fragment;
        FOR each iname IN result {
            replace iname in result with map(iname);
        }
    }
    RETURN result;
}

IF name appears in the alias table
THEN expression = map(name);
ELSE expression = name;

Chapter 20 Distributed DBMSs - Advanced Concepts

Review Questions

20.1 In a distributed environment, locking-based algorithms can be classified as centralized, primary copy, or distributed. Compare and contrast these algorithms.

See Section 20.2.3.

20.2 One of the most well-known methods for distributed deadlock detection was developed by Obermarck.
Explain how Obermarck’s method works and how deadlock is detected and resolved. See Section 20.3 under Distributed Deadlock Detection. 20.3 Outline two alternative two-phase commit topologies to the centralized topology. Alternative topologies: linear and distributed 2PC. See end of Section 20.4.3. 20.4 Explain the term non-blocking protocol and explain why two-phase commit protocol is not a nonblocking protocol. A non-blocking protocol should cater for both site and communication failures to ensure that the failure of one site will not affect processing at another site. In other words, operational sites should not be left blocked. In the event that a participant has voted COMMIT but has not received global decision and is unable to communicate with any other site that knows the decision, that site is blocked. Although 2PC has a cooperative termination protocol that reduces the likelihood of blocking, blocking is still possible and the blocked process will just have to keep on trying to unblock as failures are repaired. 20.5 Discuss how the three-phase commit protocol is a non-blocking protocol in the absence of complete site failure. See Section 20.4.4. The basic idea of 3PC is to remove the uncertainty period for participants who have voted commit and are waiting for the global abort or global commit from the coordinator. 3PC introduces a third phase, called pre-commit, between voting and global decision. 20.6 Compare and contrast the different ownership models for replication. Give examples to illustrate your answer. The main types of ownership are master/slave, workflow, and update-anywhere, sometimes referred to as peer-to-peer or symmetric replication. See Section 20.6.1. 20.7 Compare and contrast the database mechanisms for replication. Discuss table snapshots versus database triggers - see Section 20.6.1 under both these headings. Exercises 20.8 Give full details of the centralized two-phase commit protocol in a distributed environment. 
Outline the algorithms for both coordinator and participants. 2PC coordinator algorithm STEP C1 VOTE INSTRUCTION write 'begin global commit' message to log send 'vote' message to all participants do until votes received from all participants Algorithm (a) (a) begin 67 Database Systems: Instructor's Guide - Part III wait on timeout go to STEP C2b end-do STEP C2a GLOBAL COMMIT if all votes are 'commit' then begin write 'global commit' record to log send 'global commit' to all participants end STEP C2b GLOBAL ABORT at least one participant has voted abort or coordinator has timed out else begin write 'global abort' record to log send 'global abort' to all participants end end-if STEP C3 TERMINATION do until acknowledgement received from all participants wait end-do write 'end global transaction record' to log finish end Algorithm (b) (b) begin STEP P0 WAIT FOR VOTE INSTRUCTION do until 'vote' instruction received from coordinator wait end-do 2PC participants algorithm STEP P1 VOTE if vote = 'commit' then send 'commit' to coordinator else send 'abort' and go to STEP P2b do until global vote received from coordinator wait end-do STEP P2a COMMIT if global vote = 'commit' then perform local commit processing STEP P2b ABORT at least one participant has voted abort else perform local abort processing end-if STEP P3 TERMINATION send acknowledgement to coordinator finish end 68 Database Systems: Instructor's Guide - Part III Algorithm begin Cooperative termination protocol for 2PC do while P0 is blocked STEP 1 HELP REQUESTED FROM Pi P0 sends a message to Pi asking for help to un-block if Pi knows the decision (Pi received global commit/abort or Pi unilaterally aborted) then begin Pi conveys decision to P0 P0 unblocks and finishes end end-if STEP 2 HAS Pi VOTED? 
            if Pi has not voted then
            begin
                Pi unilaterally aborts
                P0 told to abort
                P0 unblocks and finishes
            end
            end-if
        STEP 3 Pi CANNOT HELP; TRY Pi+1
            next Pi
    end-do
end

Algorithm: 2PC participant restart following failure

begin
    do while Pr is blocked
        STEP 1 ASCERTAIN STATUS OF Pr IMMEDIATELY PRIOR TO FAILURE
            if Pr voted 'commit' then go to STEP 2
            else
            begin
                (Pr voted 'abort' prior to failure or had not voted)
                Pr aborts unilaterally
                Pr recovers independently and finishes
            end
            end-if
        STEP 2 IS GLOBAL DECISION KNOWN?
            if Pr knows global decision then
            begin
                Pr takes action in accordance with global decision
                Pr recovers independently and finishes
            end
            end-if
        STEP 3 Pr CANNOT RECOVER INDEPENDENTLY AND ASKS FOR HELP
            Pr asks for help from participant Pr+1 using the cooperative termination protocol
    end-do
end

20.9 Give full details of the three-phase commit protocol in a distributed environment. Outline the algorithms for both coordinator and participants.

Algorithm (a): 3PC coordinator algorithm

begin
    STEP C1 VOTE INSTRUCTION
        write 'begin global commit' message to log
        send 'vote' message to all participants
        do until votes received from all participants
            wait
            on timeout go to STEP C2b
        end-do
    STEP C2a PRE-COMMIT
        if all votes are 'commit' then
        begin
            write 'pre-commit' message to log
            send 'pre-commit' message to all participants
        end
    STEP C2b GLOBAL ABORT
        (at least one participant has voted abort or coordinator has timed out)
        else
        begin
            write 'global abort' record to log
            send 'global abort' to all participants
            go to STEP C4
        end
        end-if
    STEP C3 GLOBAL COMMIT
        do until all (pre-commit) acknowledgements received
            wait
        end-do
        write 'global commit' record to log
        send 'global commit' to all participants
    STEP C4 TERMINATION
        do until acknowledgement received from all participants
            wait
        end-do
        write 'end global transaction record' to log
        finish
end

Algorithm (b): 3PC participants algorithm

begin
    STEP P0 WAIT FOR VOTE INSTRUCTION
        do until 'vote' instruction received from coordinator
            wait
        end-do
    STEP P1 VOTE
        if participant is prepared to commit then send 'commit' message to coordinator
        else send 'abort' message to coordinator and go to STEP P2b
        do until global vote received from coordinator
            wait
        end-do
    STEP P2a PRE-COMMIT
        if global instruction = 'pre-commit' then
            go to STEP P3 (and wait for global commit)
        end-if
    STEP P2b ABORT
        (at least one participant has voted abort)
        perform local abort processing
        go to STEP P4
    STEP P3 COMMIT
        do until 'global commit' received from coordinator
            wait
        end-do
        perform local commit processing
    STEP P4 TERMINATION
        send acknowledgement to coordinator
        finish
end

20.10 Analyze the DBMSs that you are currently using and determine the support each provides for the X/Open DTP model and for data replication.

This is a small student project, the result of which is dependent on the system analyzed.

20.11 You have been asked by the Managing Director of DreamHome to investigate the data distribution requirements of the organization and to prepare a report on the potential use of a distributed DBMS. The report should compare the technology of the centralized DBMS with that of the distributed DBMS, and should address the advantages and disadvantages of implementing a DDBMS within the organization, and any perceived problem areas. The report should also address the possibility of using a replication server to address the distribution requirements. Finally, the report should contain a fully justified set of recommendations proposing an appropriate solution.

A well-presented report is expected. Justification must be given for any recommendations made.
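
As a supplement to the commit-protocol answers above, the coordinator's decision rule in two-phase commit can be sketched in Python. This is only an illustrative sketch, not the textbook's implementation: the function name coordinator_decision, the vote strings, and the use of None to stand for a vote lost to a timeout are assumptions made for the example, and the acknowledgement phase is assumed to succeed.

```python
from typing import Dict, List, Optional, Tuple


def coordinator_decision(votes: Dict[str, Optional[str]]) -> Tuple[str, List[str]]:
    """Return the global decision and the coordinator's log records.

    votes maps a (hypothetical) participant name to 'commit', 'abort',
    or None, where None is this sketch's stand-in for a timed-out vote.
    """
    log = ["begin global commit"]                     # STEP C1: vote instruction
    if votes and all(v == "commit" for v in votes.values()):
        decision = "global commit"                    # STEP C2a: unanimous commit
    else:
        decision = "global abort"                     # STEP C2b: abort vote or timeout
    log.append(decision)
    # STEP C3: this sketch assumes every participant acknowledges.
    log.append("end global transaction")
    return decision, log


decision_ok, log_ok = coordinator_decision({"P1": "commit", "P2": "commit"})
decision_abort, _ = coordinator_decision({"P1": "commit", "P2": "abort"})
decision_timeout, _ = coordinator_decision({"P1": "commit", "P2": None})
```

A single abort vote or a single missing vote is enough to force a global abort, which is why the rule is written as "all votes are commit" rather than as a majority test.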
20.12 Consider five transactions T1, T2, T3, T4, and T5 with:
    T1 initiated at site S1 and spawning an agent at site S2,
    T2 initiated at site S3 and spawning an agent at site S1,
    T3 initiated at site S1 and spawning an agent at site S3,
    T4 initiated at site S2 and spawning an agent at site S3,
    T5 initiated at site S3.
The locking information for these transactions is shown in Table 1.

Table 1
    Transaction:                            T1 T1 T2 T2 T3 T3 T4 T4 T5
    Data items locked by transaction:       x1 x6 x4 x5 x2 x7 x8 x3
    Data items transaction is waiting for:  x8 x2 x1 x7 x3 x5 x7
    Site involved in operations:            S1 S2 S1 S3 S1 S3 S2 S3 S3

(a) Produce the local wait-for-graphs (WFGs) for each of the sites. What can you conclude from the local WFGs?

[Figure: local WFGs for Site 1, Site 2, and Site 3]

Conclusion: There is no local deadlock at any site.

(b) Using the above transactions, demonstrate how Obermarck's method for distributed deadlock detection works. What can you conclude from the global WFG?

[Figure: WFGs for Site 1, Site 2, and Site 3, each including the external node Text]

Cycle at Site 1, so move the WFG from Site 1 to Site 3. The resulting WFG shows a cycle:

[Figure: combined WFG for Sites 1 and 3 with nodes Text, T1, T2, T3, T4, and T5]

which implies that the system is in global deadlock and one of the transactions must be selected to be aborted and restarted.

Chapter 21 Introduction to Object DBMSs

Review Questions

21.1 Discuss the general characteristics of advanced database applications.

See Section 21.1.

21.2 Discuss why the weaknesses of relational DBMSs may make them unsuitable for advanced database applications.

See Section 21.2.

21.3 Discuss each of the following concepts in the context of an object data model:
    (a) abstraction, encapsulation, and information hiding;    See Section 21.3.1
    (b) objects and attributes;                                 See Section 21.3.2
    (c) object identity;                                        See Section 21.3.3
    (d) methods and messages;                                   See Section 21.3.4
    (e) classes, subclasses, superclasses, and inheritance;     See Section 21.3.5/6
    (f) overloading;                                            See Section 21.3.7
    (g) polymorphism and dynamic binding.                       See Section 21.3.8
Give examples using the DreamHome sample data shown in Figure 3.3.

Expect examples similar to the ones in the sections referenced above.

Exercises

21.4 Investigate one of the advanced database applications discussed in Section 21.1, or a similar one that handles complex, interrelated data. In particular, examine its functionality, and the data types and operations it uses. Map the data types and operations to the object-oriented concepts discussed in Section 21.3.

This is a small student project, the result of which is dependent on the application investigated. However, expect the student to cover not just standard concepts such as objects, attributes, classes, and superclasses, but also concepts such as overloading, complex objects, and dynamic binding.

21.5 Analyze the relational DBMSs that you are currently using. Discuss the object-oriented features provided by the system. What additional functionality do these features provide?

This is a small student project, the result of which is dependent on the system analyzed.

21.6 For the DreamHome case study introduced in Section 1.7, suggest attributes and methods that would be appropriate for Branch, Staff, and Property_for_Rent classes.

The attributes should be similar to those documented in the textbook; see, for example, the appendices at the end of Chapter 8. The student would be expected to come up with standard get/set methods, such as Get_Staff_Salary and Put_Staff_Salary, and then methods for registering new properties for rent, new owners, new renters, appointments for viewings, and so on.

Chapter 22 Object-Oriented DBMSs

Review Questions

22.1 Compare and contrast the different definitions of object-oriented data models.

See Section 22.1.

22.2 What is a persistent programming language and how does it differ from an OODBMS?
A persistent programming language (PPL) is a language that provides its users with the ability to (transparently) preserve data across successive executions of a program, and even allows such data to be used by different programs (see Section 22.1.1). The main difference between a PPL and an OODBMS is that the OODBMS tends to provide more DBMS-related services.

22.3 Discuss the difference between the two-level storage model used by conventional DBMSs and the single-level storage model used by OODBMSs.

See Section 22.2.

22.4 How does this single-level storage model affect data access?

See Section 22.2.1.

22.5 Discuss the main strategies that can be used to create persistent objects.

Checkpointing, serialization, and explicit paging. Explicit paging includes reachability-based and allocation-based persistence (see Section 22.3.1).

22.6 What is pointer swizzling? Discuss the different approaches to pointer swizzling.

Pointer swizzling is the action of converting OIDs to main memory pointers (see Section 22.3.3).

22.7 Discuss the types of transaction protocols that can be useful in design applications.

See Sections 22.4.1 and 19.4.

22.8 Discuss why version management may be a useful facility for some applications.

There are many applications that need access to the previous state of an object. For example, the development of a particular design is often an experimental and incremental process, the scope of which changes with time. It is therefore necessary in databases that store designs to keep track of the evolution of design objects and the changes made to a design by various transactions.

22.9 Discuss why schema control may be a useful facility for some applications.

Engineering design is an incremental process and evolves with time. To support this process, applications require considerable flexibility in dynamically defining and modifying the database schema. For example, it should be possible to modify class definitions, the inheritance structure, and specifications of attributes and methods without requiring system shutdown.
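
To make the answer to 22.6 more concrete, the conversion of an OID into a main memory pointer can be sketched in Python. This is a minimal sketch under stated assumptions: the class PersistentRef, its attributes, and the dictionary named store (standing in for secondary storage) are all invented for the illustration, and only one approach is shown, namely swizzling lazily the first time a reference is followed.

```python
class PersistentRef:
    """A reference that starts life as an OID and is swizzled on first use."""

    def __init__(self, oid, store):
        self.oid = oid          # unswizzled form: the object identifier
        self._store = store     # hypothetical stand-in for secondary storage
        self._target = None     # swizzled form: direct in-memory reference
        self.lookups = 0        # number of OID-to-object translations performed

    def deref(self):
        if self._target is None:                    # reference not yet swizzled
            self._target = self._store[self.oid]    # translate OID to object
            self.lookups += 1
        return self._target                         # later calls skip the lookup


store = {"oid:42": {"name": "Grosvenor Hotel"}}
ref = PersistentRef("oid:42", store)
first = ref.deref()
second = ref.deref()
```

After the first dereference the OID lookup is no longer needed, which is the saving that swizzling is meant to provide; an eager scheme would instead translate every reference when the object is loaded.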
22.10 Compare and contrast the different architectures for an OODBMS.

See Section 22.4.4.

Exercises

22.11 For the relational schema in the exercises at the end of Chapter 3, suggest a number of methods that would be applicable to the system. Produce an object-oriented schema for the system.

Some methods might be:
    create_new_hotel, create_new_room, change_room_type,
    create_new_guest, change_guest_address, create_new_booking, change_dates,
    destroy_hotel, destroy_room, destroy_guest, destroy_booking,
    change_hotel_name, change_room_price, change_guest_name, change_room_number

interface Hotel {
    (extent hotels key Hotel_No)
    attribute string Hotel_No;
    attribute string Name;
    attribute string Address;
    attribute Set<struct<string room_no, char type, float price>> rooms;
    relationship List<Booking> has_booking inverse Booking::booking_for;
    create_new_hotel();
    destroy_hotel() raises(no_such_hotel);
    create_new_room();
    destroy_room(in room_no:string) raises(no_such_room);
    change_room_price(in room_no:string, in float) raises(no_such_room);
    change_room_type(in room_no:string, in char) raises(no_such_room);
}

interface Guest {
    (extent guests key Guest_No)
    attribute string Guest_No;
    attribute string Name;
    attribute string Address;
    relationship List<Booking> booked_for inverse Booking::booked_by;
    create_new_guest();
    destroy_guest() raises(no_such_guest);
}

interface Booking {
    (extent bookings key (Hotel_No, Guest_No, Date_From))
    attribute string Hotel_No;
    attribute string Guest_No;
    attribute date Date_From;
    attribute date Date_To;
    attribute string Room_No;
    relationship Hotel booking_for inverse Hotel::has_booking;
    relationship Guest booked_by inverse Guest::booked_for;
    create_new_booking(in Hotel, in Guest) raises(no_such_hotel, no_such_guest, hotel_full);
    destroy_booking() raises(no_such_booking);
    change_room_number();
    change_dates();
}

22.12 Using the schema produced above, show how the
following queries would be written in OQL:

(a) List all hotels.

    hotels

or, to sort them:

    sort h IN hotels by h.name

(b) List all single rooms with a price below £20.00 per night.

    SELECT r
    FROM h IN hotels, r IN h.rooms
    WHERE r.type = 'S' AND r.price < 20

(c) List the names and addresses of all guests.

    SELECT STRUCT(name:g.name, address:g.address)
    FROM g IN guests

(d) List the price and type of all rooms at the Grosvenor Hotel.

    type prices {attribute price : float; type: char;}

    prices (SELECT STRUCT(price:r.price, type:r.type)
            FROM h IN hotels, r IN h.rooms
            WHERE h.name = 'Grosvenor Hotel')

(e) List all guests currently staying at the Grosvenor Hotel.

    SELECT g
    FROM g IN guests, b IN g.booked_for, h IN b.booking_for
    WHERE b.date_from <= '01-01-99' AND b.date_to >= '01-01-99' AND
          h.name = 'Grosvenor Hotel'

(substitute today's date for '01-01-99').

(f) List the details of all rooms at the Grosvenor Hotel, including the name of the guest staying in the room, if the room is occupied.

    define occupied as
        SELECT STRUCT(h.rooms,
            SELECT g.name
            FROM g IN guests, b IN g.booked_for, h IN b.booking_for
            WHERE b.date_from <= '01-01-99' AND b.date_to >= '01-01-99' AND
                  h.name = 'Grosvenor Hotel')
        FROM h IN hotels
    UNION
        SELECT STRUCT(h.rooms, nil)
        FROM h IN hotels, y IN occupied
        WHERE h.rooms.room_no != y.room_no

(g) List the guest details (guest_no, name and address) of all guests staying at the Grosvenor Hotel.

    SELECT STRUCT(guest_no:g.guest_no, name:g.name, address:g.address)
    FROM g IN guests, b IN g.booked_for, h IN b.booking_for
    WHERE b.date_from <= '01-01-99' AND b.date_to >= '01-01-99' AND
          h.name = 'Grosvenor Hotel'

(substitute today's date for '01-01-99').

22.13 Produce an object-oriented database design for the DreamHome case study presented in Section 1.7. State any assumptions necessary to support your design.
A partial solution is as follows:

interface Branch {
    (extent branches key Branch_No)
    attribute string Branch_No;
    attribute <struct<string Street, string Area, string City, string Postcode>> Address;
    attribute string Tel_No;
    attribute string Fax_No;
    attribute date Manager_Start_Date;
    attribute float Bonus_Payment;
    attribute string Car_Allowance;
    relationship Set<Renter> registers inverse Renter::is_registered_with;
    relationship Staff has inverse Staff::works_in;
    relationship Manager is_managed_by inverse Manager::manages;
    relationship Set<Property_for_Rent> offers inverse Property_for_Rent::is_offered_by;
    destroy_branch(in Branch) raises(no_such_branch);
}

interface Staff {
    (extent staff key Staff_No, NIN)
    attribute string Staff_No;
    attribute <struct<string FName, string LName>> Name;
    attribute string Address;
    attribute string Tel_No;
    attribute char Sex;
    attribute date DOB;
    attribute string Position;
    attribute float Salary;
    attribute date Date_Joined;
    attribute string NIN;
    attribute <struct<string NName, string Relationship, string Address, string Tel_No>> Next_of_Kin;
    relationship Branch works_in inverse Branch::has;
    relationship Set<Property_for_Rent> oversees inverse Property_for_Rent::is_overseen_by;
    relationship Set<Inspection> carryout inverse Inspection::is_carried_out_by;
    destroy_staff(in Staff) raises(no_such_person);
    increase_salary(in Staff, in float) raises(no_such_person);
    oversee_property(in Staff, in Property_for_Rent) raises(no_such_person, no_such_property);
    move_branch(in Staff, in from::Branch, in to::Branch) raises(no_such_person, no_such_branch);
}

interface Manager::Staff {
    (extent managers)
    relationship Branch manages inverse Branch::is_managed_by;
}

interface Secretary::Staff {
    (extent secretaries)
    attribute integer Typing_Speed;
    relationship Set<Staff> supports inverse Staff::is_supported_by;
}

interface Supervisor::Staff {
    (extent supervisors)
    relationship Set<Staff> supervises inverse Staff::is_supervised_by;
}

interface Property_for_Rent {
    (extent rentals key Property_No)
    attribute string Property_No;
    attribute <struct<string Street, string Area, string City, string Postcode>> Address;
    attribute string Type;
    attribute integer Rooms;
    attribute float Rent;
    relationship Staff is_overseen_by inverse Staff::oversees;
    relationship Branch is_offered_by inverse Branch::offers;
    relationship Set<Inspection> undergoes inverse Inspection::inspection_for;
    relationship Owner is_owned_by inverse Owner::owns;
    relationship Set<Viewing> takes inverse Viewing::is_taken_for;
    relationship Set<Lease_Agreement> associated_with inverse Lease_Agreement::is_associated_with;
    destroy_pfp(in Property_for_Rent) raises(no_such_property);
    rent_out(in Property_for_Rent) raises(no_such_property);
}

interface Renter {
    (extent renters key Renter_No)
    attribute string Renter_No;
    attribute <struct<string FName, string LName>> Name;
    attribute string Address;
    attribute string Tel_No;
    attribute string Pref_Type;
    attribute float Max_Rent;
    relationship Branch is_registered_with inverse Branch::registers;
    relationship Set<Viewing> attends inverse Viewing::is_attended_by;
    relationship Set<Lease_Agreement> holds inverse Lease_Agreement::is_held_by;
    rent_property(in Renter, in Property_for_Rent) raises(no_such_renter, no_such_property);
    register(in Renter, in Branch) raises(no_such_renter, no_such_branch);
    arrange_viewing(in Renter, in Property_for_Rent) raises(no_such_renter, no_such_property);
    destroy_renter(in Renter) raises(no_such_renter);
}

22.14 Produce an object-oriented database design for the Wellmeadows Hospital student project presented in Appendix A. State any assumptions necessary to support your design.
The following is a sample schema; not all methods are shown:

interface Ward {
    (extent wards key WardNo)
    attribute string WardNo;
    attribute string WName;
    attribute string Location;
    attribute integer TotalBeds;
    attribute string TelExtn;
    relationship List<Waiting_List> has_waiting_list inverse Waiting_List::is_waiting_for
        {order_by Waiting_List::ListDate, WardReq};
    relationship List<Staff_Rota> has_staff_rota inverse Staff_Rota::on_shift_for
        {order_by Staff_Rota::WardNo, WeekNo};
    relationship Set<Requisition> needs_req inverse Requisition::req_for;
    relationship Nurse is_managed_by inverse Nurse::manages;
    create_new_ward();
    destroy_ward(in Ward) raises(no_such_ward);
    assign_charge_nurse(in Nurse) raises(no_such_nurse);
}

interface Staff {
    (extent staff key StaffNo, NIN)
    attribute string StaffNo;
    attribute <struct<string FName, string LName>> Name;
    attribute string Address;
    attribute string TelNo;
    attribute date DOB;
    attribute char Sex;
    attribute string NIN;
    attribute string Position;
    attribute float Salary;
    attribute integer SScale;
    attribute integer WeekHrs;
    attribute char ContType;
    attribute char TypePay;
    attribute Set<struct<date QDate, string QType, string Institution>> Qualification;
    attribute Set<struct<date SDate, date FDate, string Position, string OrgName>> Work_Experience;
    relationship List<Staff_Rota> has_rota inverse Staff_Rota::shift_for
        {order_by Staff_Rota::StaffNo, WeekNo};
    destroy_staff(in Staff) raises(no_such_person);
    increase_salary(in Staff, in float);
}

interface Nurse::Staff {
    (extent nurses)
    relationship Ward manages inverse Ward::is_managed_by;
    relationship Set<Requisition> makes_req inverse Requisition::made_by;
    create_new_nurse();
    make_requisition(in Nurse, in Pharmaceutical) raises(no_such_drug);
}

interface Consultant::Staff {
    (extent consultants)
    relationship List<Appointment> sees inverse Appointment::is_seen_by
        {order_by Appointment::ConsStaffNo, Adate, ATime};
    create_new_consultant();
    cancel_appointment(in Consultant, in Appointment);
}

interface Staff_Rota {
    (extent rotas key (StaffNo, WeekNo))
    attribute integer Shift;
    attribute integer WeekNo;
    attribute string StaffNo;
    attribute string WardNo;
    relationship Ward on_shift_for inverse Ward::has_staff_rota;
    relationship Staff shift_for inverse Staff::has_rota;
    create_new_rota();
}

interface Patient {
    (extent patients key PatNo)
    attribute string PatNo;
    attribute <struct<string FName, string LName>> Name;
    attribute string Address;
    attribute string TelNo;
    attribute date DOB;
    attribute char Sex;
    attribute char MStatus;
    attribute date DateReg;
    attribute <struct<string NName, string NRelationship, string NAddress, string NTelNo>> Next_Of_Kin;
    attribute Set<struct<date OutPatDate, time OutPatTime>> outpatient_appointment;
    relationship Doctor has_gp inverse Doctor::gp_for;
    relationship List<Appointment> has_appt inverse Appointment::is_appt_for;
    relationship List<Medication> takes inverse Medication::is_taken_by;
    relationship List<Waiting_List> waits_for inverse Waiting_List::is_on;
    create_new_patient();
    change_doctor(in from:Doctor, in to:Doctor) raises(unknown_doctor);
    cancel_appointment(in Appointment) raises(no_appointment);
}

interface Doctor {
    (extent gps key (DocName, ClinicNo))
    attribute string DocName;
    attribute string ClinicNo;
    attribute string Address;
    attribute string TelNo;
    relationship List<Patient> gp_for inverse Patient::has_gp;
    create_new_doctor();
}

interface Appointment {
    (extent appointments key AppNo)
    attribute string AppNo;
    attribute date ADate;
    attribute time ATime;
    attribute string RoomNo;
    relationship Patient is_appt_for inverse Patient::has_appt;
    relationship Consultant is_seen_by inverse Consultant::sees;
}

interface Waiting_List {
    (extent waiting key (PatNo, ListDate))
    attribute date ListDate;
    attribute interval Duration;
    attribute date PlacedDate;
    attribute date ExLeaveDate;
    attribute date ActLeaveDate;
    attribute string BedNo;
    relationship Ward is_waiting_for inverse Ward::has_waiting_list;
    relationship Patient is_on inverse Patient::waits_for;
    add_to_list(in Patient, in Ward) raises(no_such_patient, no_such_ward);
    remove_from_list(in Patient, in Ward) raises(no_such_patient, no_such_ward);
}

interface Medication {
    (extent medications key (PatNo, DrugNo, SDate))
    attribute string UnitsDay;
    attribute string AMethod;
    attribute date SDate;
    attribute date FDate;
    relationship Patient is_taken_by inverse Patient::takes;
    relationship Pharmaceutical contains inverse Pharmaceutical::contained_in;
    add_medication(in Patient, in Pharmaceutical) raises(no_such_patient, no_such_drug);
    change_medication(in Patient, in Pharmaceutical) raises(no_such_patient, no_such_drug);
}

interface Pharmaceutical {
    (extent drugs key DrugNo)
    attribute string DrugNo;
    attribute string DName;
    attribute string Description;
    attribute string Dosage;
    attribute string MAdmin;
    attribute integer QStock;
    relationship List<Medication> contained_in inverse Medication::contains;
    relationship List<Requisition> req_in inverse Requisition::reqs_for;
    relationship Supplier is_supplied_by inverse Supplier::supplies;
    create_drug();
    remove_drug(in Pharmaceutical) raises(no_such_drug);
}

interface Non-Surgical/Surgical {
    (extent supplies key ItemNo)
    attribute string ItemNo;
    attribute string IName;
    attribute string IDescription;
    attribute integer QStock;
    attribute string RLevel;
    attribute float UnitCost;
    relationship List<Requisition> req_in inverse Requisition::reqs_for;
    create_item();
    remove_Item(in Non-Surgical/Surgical) raises(no_such_item);
}

interface Requisition {
    (extent requisitions key ReqNo)
    attribute string ReqNo;
    attribute integer QuantReq;
    attribute date DateOrder;
    attribute date DateReceive;
    relationship Ward req_for inverse Ward::needs_req;
    relationship Nurse made_by inverse Nurse::makes_req;
    relationship Pharmaceutical reqs_for inverse Pharmaceutical::req_in;
    relationship Non-Surgical/Surgical reqs_for inverse Non-Surgical/Surgical::req_in;
    deliver_req(in Requisition) raises(no_such_req);
}

interface Supplier {
    (extent suppliers key SupplierNo)
    attribute string SupplierNo;
    attribute string SName;
    attribute string SAddress;
    attribute string TelNo;
    attribute string FaxNo;
    relationship Set<Pharmaceutical> supplies inverse Pharmaceutical::is_supplied_by;
}

22.15 You have been asked by the Managing Director of DreamHome to investigate and prepare a report on the applicability of an Object-Oriented DBMS for the organization. The report should compare the technology of the relational DBMS with that of the Object-Oriented DBMS, and should address the advantages and disadvantages of implementing an OODBMS within the organization, and any perceived problem areas. Finally, the report should contain a fully justified set of conclusions on the applicability of the OODBMS for DreamHome.

A well-presented report is expected. Justification must be given for any recommendations made.

22.16 Using the rules for schema consistency given in Section 22.4.3, consider each of the following modifications and state what the effect of the change should be to the schema:
    (a) Adding an attribute to a class
    (b) Deleting an attribute from a class
    (c) Making a class S a superclass of a class C
    (d) Removing a class S from the list of superclasses of a class C.

See Section 22.4.3.

Chapter 23 Object-Relational DBMSs

Review Questions

23.1 What typical functionality would be provided by an ORDBMS?

Many different answers are possible here; see, for example, Section 23.2.1. Expect standard DBMS functionality, plus object management capabilities (types, inheritance, etc.), plus the ability to extend the query optimizer and define new index types (see Section 23.5).

23.2 What are the advantages and disadvantages of extending the relational data model?
See Section 23.1 under Advantages and Disadvantages.

23.3 What are the main features of the forthcoming SQL standard?

See start of Section 23.4.

23.4 Discuss the extensions required to query processing and query optimization to fully support the ORDBMS.

See Section 23.5.

23.5 What are the security problems associated with the introduction of user-defined methods? Suggest some solutions to these problems.

See paragraph before start of Section 23.5.1.

Exercises

23.6 Analyze the relational DBMSs that you are currently using. Discuss the object-oriented facilities provided by the system. What additional functionality do these facilities provide?

This is a small student project, the result of which is dependent on the system analyzed.

23.7 Consider the relational schema for the Hotel case study given in the Exercises of Chapter 11. Redesign this schema to take advantage of the new features of SQL3/SQL4. Add user-defined functions that you consider appropriate.

One possible solution is as follows:

CREATE DOMAIN HOTEL_NUMBER AS CHAR(4);
CREATE DOMAIN ROOM_TYPE AS CHAR(1) CHECK(VALUE IN ('S', 'F', 'D'));
CREATE DOMAIN ROOM_PRICE AS DECIMAL(5,2) CHECK(VALUE BETWEEN 10 AND 100);
CREATE DOMAIN ROOM_NUMBER AS VARCHAR(4) CHECK(VALUE BETWEEN '1' AND '100');
CREATE DOMAIN GUEST_NUMBER AS CHAR(4);
CREATE DOMAIN BOOKING_DATE AS DATETIME CHECK(VALUE > CURRENT_DATE);

CREATE TYPE hotel_type(
    hotel_no    HOTEL_NUMBER    NOT NULL,
    name        VARCHAR(20)     NOT NULL,
    address     VARCHAR(50)     NOT NULL);

CREATE TABLE hotel OF hotel_type(
    oid REF(hotel_type) VALUES ARE SYSTEM GENERATED,
    PRIMARY KEY (hotel_no));

CREATE TYPE rooms_type(
    room_no     ROOM_NUMBER         NOT NULL,
    hotel_no    REF(hotel_type)     NOT NULL,
    type        ROOM_TYPE           NOT NULL DEFAULT 'S',
    price       ROOM_PRICE          NOT NULL);

CREATE TABLE room OF rooms_type(
    PRIMARY KEY (room_no, hotel_no),
    FOREIGN KEY (hotel_no) REFERENCES hotel ON DELETE CASCADE ON UPDATE CASCADE);

CREATE TYPE guest_type(
    guest_no    GUEST_NUMBER    NOT NULL,
    name        VARCHAR(20)     NOT NULL,
    address     VARCHAR(50)     NOT NULL);

CREATE TABLE guest OF guest_type(
    oid REF(guest_type) VALUES ARE SYSTEM GENERATED,
    PRIMARY KEY (guest_no));

CREATE TYPE booking_type(
    hotel_no    REF(hotel_type)     NOT NULL,
    guest_no    REF(guest_type)     NOT NULL,
    date_from   BOOKING_DATE        NOT NULL,
    date_to     BOOKING_DATE        NULL,
    room_no     ROOM_NUMBER         NOT NULL);

CREATE TABLE booking OF booking_type(
    PRIMARY KEY (hotel_no, guest_no, date_from),
    FOREIGN KEY (hotel_no) REFERENCES hotel ON DELETE CASCADE ON UPDATE CASCADE,
    FOREIGN KEY (guest_no) REFERENCES guest ON DELETE NO ACTION ON UPDATE CASCADE,
    FOREIGN KEY (hotel_no, room_no) REFERENCES room ON DELETE NO ACTION ON UPDATE CASCADE,
    CONSTRAINT room_booked
        CHECK (NOT EXISTS (SELECT *
                           FROM booking b
                           WHERE b.date_to > booking.date_from AND
                                 b.date_from < booking.date_to AND
                                 b.room_no = booking.room_no AND
                                 b.hotel_no = booking.hotel_no)),
    CONSTRAINT guest_booked
        CHECK (NOT EXISTS (SELECT *
                           FROM booking b
                           WHERE b.date_to > booking.date_from AND
                                 b.date_from < booking.date_to AND
                                 b.guest_no = booking.guest_no)));

23.8 Create SQL3/SQL4 statements for the queries given in Exercises 13.7 to 13.26.

Depends on the solution to the previous question. For example, using the above schema, the solution to 13.16 would be:

SELECT price, type
FROM room r
WHERE r.hotel_no->name = 'Grosvenor Hotel';

23.9 Create an insert trigger that sets up a mailshot table recording the names and addresses of all guests who have stayed at the hotel during the days before and after New Year for the past two years.

The aim of this question is to show the difficulty of creating a trigger that does not necessarily need to be tied to a modification to a table.
CREATE TRIGGER insert_mailshot_table
AFTER INSERT ON booking
BEGIN
    INSERT INTO mailshot
        (SELECT g.guest_no, g.name, g.address
         FROM guest g, booking b1, booking b2
         WHERE g.guest_no = b1.guest_no AND b1.guest_no = b2.guest_no AND
               ((b1.date_from <= DATE'1998-12-31' AND b1.date_to >= DATE'1998-12-31') OR
                (b1.date_from >= DATE'1998-12-31' AND b1.date_from <= DATE'1999-01-02')) AND
               ((b2.date_from <= DATE'1997-12-31' AND b2.date_to >= DATE'1997-12-31') OR
                (b2.date_from >= DATE'1997-12-31' AND b2.date_from <= DATE'1998-01-02')) AND
               NOT EXISTS (SELECT *
                           FROM mailshot m
                           WHERE m.guest_no = g.guest_no));
END;

23.10 Repeat Exercise 23.7 for the multinational engineering case study in the Exercises of Chapter 19.

Look for a solution that uses the features of SQL3/4 such as types with inheritance, SETs, and object references.

23.11 Create an object-relational schema for the DreamHome case study presented in Section 1.7. Add user-defined functions that you consider appropriate.

Look for a solution that uses the features of SQL3/4 such as types with inheritance, SETs, and object references.

23.12 Create an object-relational schema for the Wellmeadows case study presented in Appendix A. Add user-defined functions that you consider appropriate.

Partial solution (see Exercise for 11.3):

CREATE TYPE ward_type(
    WardNo      VARCHAR(4)  NOT NULL,
    WName       VARCHAR(20) NOT NULL,
    Location    VARCHAR(20) NOT NULL,
    TotalBeds   INTEGER,
    TelExtn     VARCHAR(4)  NOT NULL,
    Consultant  REF(consultant_type),
    FUNCTION assign_charge_nurse(W ward_type RESULT, N nurse_type) RETURNS ward_type
        ...
        RETURN;
    END
);

CREATE TYPE qualification_type(
    QType       VARCHAR(5)  NOT NULL,
    QDate       DATE,
    Institution VARCHAR(30)
);

CREATE TYPE work_experience_type(
    OrgName     VARCHAR     NOT NULL,
    SDate       DATE,
    FDate       DATE,
    Position    VARCHAR(10)
);

CREATE TYPE name_type(
    FName       VARCHAR(15) NOT NULL,
    LName       VARCHAR(15) NOT NULL
);

CREATE TYPE staff_type(
    PRIVATE
        date_of_birth DATE CHECK(date_of_birth < CURRENT_DATE),
    PUBLIC
        StaffNo     VARCHAR(5)      NOT NULL,
        Name        name_type,
        Address     VARCHAR(50)     NOT NULL,
        TelNo       VARCHAR(13)     NOT NULL,
        Sex         CHAR            NOT NULL,
        NIN         VARCHAR(9)      NOT NULL,
        Position    VARCHAR(10)     NOT NULL,
        Salary      DECIMAL(6,2)    NOT NULL,
        SScale      INTEGER         NOT NULL,
        WeekHrs     INTEGER         NOT NULL,
        ContType    CHAR            NOT NULL,
        TypePay     CHAR            NOT NULL,
        qualification SET(qualification_type),
        work_exp    SET(work_experience_type),
    FUNCTION get_age (P staff_type) RETURNS INTEGER
        RETURN /* code to calculate age from date_of_birth */
    END,
    FUNCTION set_age (P staff_type RESULT, DOB DATE) RETURNS staff_type
        RETURN /* set date_of_birth */
    END
) NOT FINAL;

CREATE TYPE nurse_type UNDER staff_type (
    FUNCTION make_requisition(...);
    BEGIN ... END
);

CREATE TYPE consultant_type UNDER staff_type (
    FUNCTION cancel_appointment(...);
    BEGIN ... END
);

CREATE TABLE ward OF ward_type(
    oid REF(ward_type) VALUES ARE SYSTEM GENERATED,
    PRIMARY KEY(wardno));

CREATE TABLE nurse OF nurse_type(
    oid REF(nurse_type) VALUES ARE SYSTEM GENERATED,
    PRIMARY KEY(staffno));

CREATE TABLE consultant OF consultant_type(
    oid REF(consultant_type) VALUES ARE SYSTEM GENERATED,
    PRIMARY KEY(staffno));

23.13 You have been asked by the Managing Director of DreamHome to investigate and prepare a report on the applicability of an Object-Relational DBMS for the organization.
The report should compare the technology of the relational DBMS with that of the Object-Relational DBMS, and should address the advantages and disadvantages of implementing an ORDBMS within the organization, and any perceived problem areas. The report should also consider the applicability of an Object-Oriented DBMS, and a comparison of the two types of systems for DreamHome should be included. Finally, the report should contain a fully justified set of conclusions on the applicability of the ORDBMS for DreamHome.

A well-presented report is expected. Justification must be given for any recommendations made.

Part Six Future Trends

Chapter 24 Web Technology and DBMSs

Review Questions

24.1 Discuss each of the following terms:
    (a) Internet, intranet, and extranet.           See Section 24.1
    (b) World-Wide Web.                             See Section 24.1.1
    (c) HyperText Transfer Protocol (HTTP).         See Section 24.1.2
    (d) HyperText Markup Language (HTML).           See Section 24.1.3
    (e) Uniform Resource Locators (URLs).           See Section 24.1.4

24.2 Compare and contrast the two-tier client-server architecture for traditional DBMSs with the three-tier client-server architecture. Why is the latter architecture more appropriate for the Web?

See Section 24.2.2 (additional information on client-server architecture can be found in Section 2.6.3).

24.3 Discuss the advantages and disadvantages of the Web as a database platform.

See Section 24.2.3.

24.4 Compare and contrast the Common Gateway Interface and server extensions, as approaches for integrating databases onto the Web.

See Section 24.6.1.

24.5 Discuss, with examples, the security problems that can arise in a Web environment. What mechanisms are available to prevent these problems from occurring?

See Section 24.11.

Exercises

24.6 Examine the Web functionality provided by any DBMS that you currently use. Compare the functionality of your system with the approaches discussed in Section 24.3 to 24.10.
This is a small student project, the result of which is dependent on the system analyzed.

24.7 Examine the security features provided by the Web interface to your DBMS. Compare these features with the features discussed in Section 24.11.

This is a small student project, the result of which is dependent on the system analyzed.

24.8 Using an approach to Web-DBMS integration, create a series of forms that display the base tables of the DreamHome case study.

This depends on the type of system and layout preferred. Students should not use any automatic facilities, such as those provided by Access, but should design the layout themselves.

24.9 Extend the implementation of Exercise 24.8 to allow the base tables to be updated from the Web browser.

Again, this depends on the technology used.

24.10 Repeat Exercises 24.8 and 24.9 for the Wellmeadows case study.

As above.

24.11 Using any Web browser, look at some of the following Web sites and discover the wealth of information held there:
(a) W3C: http://www.w3c.org
(b) Microsoft: http://www.microsoft.com
(c) Oracle: http://www.oracle.com
(d) Informix: http://www.informix.com
(e) IBM: http://www.ibm.com
(f) Sybase: http://www.sybase.com
(g) Javasoft: http://www.javasoft.com
(h) Gemstone: http://www.gemstone.com
(i) Objectivity: http://www.objectivity.com
(j) ObjectStore: http://www.idc.com
(k) O2 Technology: http://www.o2tech.com
(l) Poet: http://www.poet.com

24.12 You have been asked by the Managing Director of DreamHome to investigate and prepare a report on the feasibility of making the DreamHome database accessible from the Internet. The report should examine the technical issues and the technical solutions, and should address the advantages and disadvantages of this proposal and any perceived problem areas. The report should contain a fully justified set of conclusions on the feasibility of this proposal for DreamHome.

A well-presented report is expected. Justification must be given for any recommendations made.

Chapter 25 Data Warehousing

Review Questions

25.1 Describe what is meant by the following terms, when describing the characteristics of the data in a data warehouse:
(a) Subject-oriented
(b) Integrated
(c) Time-variant
(d) Non-volatile

See Section 25.1.2.

25.2 Discuss how Online Transaction Processing (OLTP) systems differ from data warehousing systems.

See Section 25.1.4.

25.3 Discuss the main benefits and problems associated with data warehousing.

For the main benefits of data warehousing see Section 25.1.3, and for the main problems associated with data warehousing see Section 25.1.5.

25.4 Present a diagrammatic representation of the typical architecture and main components of a data warehouse.

For a diagram of the typical architecture of a data warehouse see Figure 25.1.

25.5 Describe the characteristics and main functions of the following components of a data warehouse:
(a) load manager. See Section 25.2.2.
(b) warehouse manager. See Section 25.2.3.
(c) query manager. See Section 25.2.4.
(d) metadata. See Section 25.2.8.
(e) end-user access tools. See Section 25.2.9.

25.6 Discuss the activities associated with each of the five primary information flows or processes within the data warehouse:
(a) Inflow. See Section 25.3.1.
(b) Upflow. See Section 25.3.2.
(c) Downflow. See Section 25.3.3.
(d) Outflow. See Section 25.3.4.
(e) Metaflow. See Section 25.3.5.

25.7 What are the three main approaches taken by vendors to provide data extraction, cleansing, and transformation tools?

The three approaches are code generators, database data replication tools, and dynamic transformation engines. See Section 25.4.1 for a description of each approach.

25.8 Describe the specialized requirements of a relational database management system (RDBMS) suitable for use in a data warehouse environment.
See Section 25.4.2.

25.9 Discuss how parallel technologies can support the requirements of the data warehouse.

See the last topic discussed in Section 25.4.2, under the heading Parallel database technologies.

25.10 Discuss the importance of managing metadata and how this relates to the integration of the data warehouse.

See Section 25.4.3.

25.11 Discuss the main tasks associated with the administration and management of a data warehouse.

See Section 25.4.4.

25.12 Discuss how data marts differ from data warehouses and discuss the main reasons for implementing a data mart.

For a discussion of how data marts differ from data warehouses see the introductory paragraphs of Section 25.5, and for the reasons for implementing a data mart see Section 25.5.1.

25.13 Identify the main issues associated with the development and management of data marts.

See Section 25.5.2.

25.14 Describe an approach for the development of the database component of a data warehouse that is capable of supporting decision-making using star, snowflake, and starflake schemas.

See Section 25.6.

Exercises

25.15 Design a database suitable for decision-support for the DreamHome case study (see Section 1.7). The design should be based on the query requirements for the Director of the organization.

The student should follow the approach described in Section 25.6. The student must first identify and document the types of decision-support queries that the Director of DreamHome may wish to ask. The student should then produce a design to meet these requirements.

25.16 Design a database suitable for decision-support for the Wellmeadows case study (see Appendix A). The design should be based on the query requirements for the Director of the hospital.

The student should follow the approach described in Section 25.6. The student must first identify and document the types of decision-support queries that the Director of the Wellmeadows Hospital may wish to ask.
The student should then produce a design to meet these requirements.

25.17 Design a database suitable for decision-support for your organization. The design should be based on the query requirements of the staff of your organization.

The student should follow the approach described in Section 25.6. The student must first identify and document the types of decision-support queries that the senior staff of your organization may wish to ask. The student should then produce a design to meet these requirements.

25.18 You are asked by the Managing Director of DreamHome to investigate and report on the applicability of data warehousing for the organization. The report should compare data warehouse technology with relational DBMSs and should identify the advantages and disadvantages, and any problem areas associated with implementing a data warehouse. The report should reach a fully justified set of conclusions on the applicability of a data warehouse for DreamHome.

A well-presented report is expected. Justification must be given for any recommendations made.

Chapter 26 OLAP and Data Mining

Review Questions

26.1 Discuss the major characteristics of Multi-dimensional OLAP (MOLAP) and how a datacube supports multi-dimensional queries.

See Section 26.1.

26.2 Describe the architecture, characteristics, and issues associated with each of the following categories of OLAP tools: (a) MOLAP, (b) ROLAP, (c) MQE.

See Section 26.1.3.

26.3 Discuss some of the possible extensions to SQL that support data analysis and decision-support applications.

See Section 26.1.4.

26.4 Discuss how data mining can realize the value of a data warehouse.

See the introductory paragraphs of Section 26.2 and Section 26.2.1. See also Section 26.2.8.
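As a possible starting point for the designs asked for in Exercises 25.15 to 25.17, and for the multi-dimensional queries of Question 26.1, the following is a minimal sketch of a star schema built with Python's standard sqlite3 module. The table names (dim_branch, dim_time, fact_rental), their columns, and the sample rows are all hypothetical and are not taken from the DreamHome case study; a student's design should be driven by the decision-support queries they identify.

```python
import sqlite3

# Hypothetical star schema: one fact table referencing two dimension tables.
# All names and sample data are illustrative assumptions only.
SCHEMA = """
CREATE TABLE dim_branch (
    branch_key INTEGER PRIMARY KEY,
    city       TEXT NOT NULL
);
CREATE TABLE dim_time (
    time_key INTEGER PRIMARY KEY,
    year     INTEGER NOT NULL,
    quarter  INTEGER NOT NULL
);
CREATE TABLE fact_rental (
    branch_key INTEGER NOT NULL REFERENCES dim_branch(branch_key),
    time_key   INTEGER NOT NULL REFERENCES dim_time(time_key),
    rent       REAL NOT NULL
);
"""


def build_demo():
    """Create an in-memory database holding the schema and a few sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO dim_branch VALUES (?, ?)",
                     [(1, "Glasgow"), (2, "London")])
    conn.executemany("INSERT INTO dim_time VALUES (?, ?, ?)",
                     [(1, 2001, 1), (2, 2001, 2)])
    conn.executemany("INSERT INTO fact_rental VALUES (?, ?, ?)",
                     [(1, 1, 350.0), (1, 2, 400.0),
                      (2, 1, 600.0), (2, 2, 650.0), (2, 2, 500.0)])
    conn.commit()
    return conn


def rent_by_city_and_quarter(conn):
    """Total rent for each (city, year, quarter) cell of the cube."""
    return conn.execute("""
        SELECT b.city, t.year, t.quarter, SUM(f.rent)
        FROM fact_rental f
        JOIN dim_branch b ON b.branch_key = f.branch_key
        JOIN dim_time t ON t.time_key = f.time_key
        GROUP BY b.city, t.year, t.quarter
        ORDER BY b.city, t.year, t.quarter
    """).fetchall()


def rent_by_city(conn):
    """Roll up over the time dimension, leaving total rent per city."""
    return conn.execute("""
        SELECT b.city, SUM(f.rent)
        FROM fact_rental f
        JOIN dim_branch b ON b.branch_key = f.branch_key
        GROUP BY b.city
        ORDER BY b.city
    """).fetchall()


if __name__ == "__main__":
    demo = build_demo()
    for row in rent_by_city_and_quarter(demo):
        print(row)
    for row in rent_by_city(demo):
        print(row)
```

The first query corresponds to viewing the data at its finest grain across both dimensions; the second drops the time dimension, which is the roll-up operation a datacube supports directly.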
26.5 Describe the characteristics, associated techniques, and typical applications for each of the following main data mining operations:
(a) Predictive modeling. See Section 26.2.3.
(b) Database segmentation. See Section 26.2.4.
(c) Link analysis. See Section 26.2.5.
(d) Deviation detection. See Section 26.2.6.

See Section 26.2.2 for an introduction to the techniques.

Exercises

26.6 You are asked by the Managing Director of DreamHome to investigate and report on the applicability of data mining for the organization. The report should describe the technology and provide a comparison with traditional querying and reporting tools of relational DBMSs. The report should also identify the advantages and disadvantages, and any problem areas associated with implementing data mining. The report should reach a fully justified set of conclusions on the applicability of data mining for DreamHome.

A well-presented report is expected. Justification must be given for any recommendations made.
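To make the deviation detection operation of Question 26.5(d) concrete for students, the following is a minimal sketch that flags values lying far from the mean of a sample. It uses a simple z-score test, which is only one of several possible statistical techniques and is chosen here for brevity; the function name flag_deviations, the threshold of 2.0, and the sample rent figures are all assumptions made for illustration.

```python
from statistics import mean, stdev


def flag_deviations(values, threshold=2.0):
    """Return (index, value) pairs whose z-score magnitude exceeds threshold.

    The z-score is computed with the sample standard deviation. The threshold
    is an illustrative assumption and should be tuned for real data.
    """
    if len(values) < 2:
        return []  # a standard deviation needs at least two values
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        return []  # all values are identical, so nothing deviates
    return [(i, v) for i, v in enumerate(values)
            if abs(v - mu) / sigma > threshold]


if __name__ == "__main__":
    # Hypothetical monthly rents; the last value is deliberately unusual.
    rents = [400, 420, 410, 390, 405, 415, 395, 2000]
    print(flag_deviations(rents))
```

In a small sample a single extreme value inflates the standard deviation, so a z-score test can miss outliers that a more robust measure (for example, one based on the median) would catch; this is a reasonable point for students to discuss in the report.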