AIT 524: Database Management Essentials Test 1 Study Guide Test 1 consists of three parts that are equally weighted: • Part 1 will assess how well students can apply concepts learned in class to solve real-world problems and design a database. • Part 2 includes 50 multiple-choice/true-false questions and covers the topics listed below. Questions are based on the material covered in the textbook, presented in the Power-Point presentations and open-source materials before Test 1. • Part 3 will assess students' knowledge of relational algebra. Test 1 is closed book, closed notes. Students will be taking all parts on Blackboard. The best way to prepare for Test 1 is to review practice problems and HW assignments, and complete practice quizzes. In addition, it is recommended to go over examples and exercises available in the textbooks required for this course. Test 1 topics include but are not limited to: Database Systems: Ø ad hoc query—A “spur-of-the-moment” question. Ø centralized database—A database located at a single site. Ø data—Raw facts, that is, facts that have not yet been processed to reveal their meaning to the end user. Ø data anomaly—A data abnormality that exists when inconsistent changes to a database have been made. For example, an employee moves, but the address change is corrected in only one file and not across all files in the database. Ø data dependence—A data condition in which the data representation and manipulation are dependent on the physical data storage characteristics. Ø data dictionary—A DBMS component that stores metadata—data about data. Thus, the data dictionary contains the data definition as well as its characteristics and relationships. A data dictionary may also include data that are external to the DBMS. Also known as an information resource dictionary. See also active data dictionary, metadata, and passive data dictionary. Ø data inconsistency—A condition in which different versions of the same data yield different (inconsistent) results. Ø data independence—A condition that exists when data access is unaffected by changes in the physical data storage characteristics. Ø data integrity—In a relational database, refers to a condition in which the data in the database is in compliance with all entity and referential integrity constraints. Ø data management—A process that focuses on data collection, storage, and retrieval. Common data management functions include addition, deletion, modification, and listing. Ø data redundancy—A condition that exists when a data environment contains redundant (unnecessarily duplicated) data. Ø data warehouse—Bill Inmon, the acknowledged “father of the data warehouse,” defines the term as “an integrated, subject-oriented, time-variant, nonvolatile collection of data that provides support for decision making.” Ø database—A shared, integrated computer structure that houses a collection of related data. A database contains two types of data: end-user data (raw facts) and metadata. The metadata consist of data about data, that is, the data characteristics and relationships. Ø database design—The process that yields the description of the database structure. The database design process determines the database components. Database design is the second phase of the database life cycle. Ø database management system (DBMS) —Refers to the collection of programs that manages the database structure and controls access to the data stored in the database. Ø database system—An organization of components that defines and regulates the collection, storage, management, and use of data in a database environment. Ø desktop database—A single-user database that runs on a personal computer. Ø distributed database—A logically related database that is stored over two or more physically independent sites. Ø enterprise database—The overall company data representation, which provides support for present and expected future needs. Ø field—A character or group of characters (alphabetic or numeric) that defines a characteristic of a person, place, or thing. For example, a person’s Social Security number, address, phone number, and bank balance all constitute fields. Ø file—A named collection of related records. Ø information—The result of processing raw data to reveal its meaning. Information consists of transformed data and facilitates decision making. Ø islands of information—A term used in the old-style file system environment to refer to independent, often duplicated, and inconsistent data pools created and managed by different organizational departments. Ø knowledge—The body of information and facts about a specific subject. Knowledge implies familiarity, awareness, and understanding of information as it applies to an environment. A key characteristic of knowledge is that “new” knowledge can be derived from “old” knowledge. Ø logical data format—The way in which a human being views data. Ø metadata—Data about data, that is, data concerning data characteristics and relationships. See also data dictionary. Ø multiuser database—A database that supports multiple concurrent users. Ø operational database—A database that is designed primarily to support a company’s day-to-day operations. Also known as a transactional database or production database. Ø performance tuning—Activities that make a database perform more efficiently in terms of storage and access speed. Ø physical data format—The way in which a computer “sees” (stores) data. Ø production database—The main database designed to keep track of the day-to- day operations of a company. See also transactional database. Ø query—A question or task asked by an end user of a database in the form of SQL code. A specific request for data manipulation issued by the end user or the application to the DBMS. Ø query language—A nonprocedural language that is used by a DBMS to manipulate its data. An example of a query language is SQL. Ø query result set—The collection of data rows that are returned by a query. Ø record—A collection of related (logically connected) fields. Ø single-user database—A database that supports only one user at a time. Ø structural dependence—A data characteristic that exists when a change in the database schema affects data access, thus requiring changes in all access programs. Ø structural independence—A data characteristic that exists when changes in the database schema do not affect data access. Ø Structured Query Language—A powerful and flexible relational database language composed of commands that enable users to create database and table structures, perform various types of data manipulation and data administration, and query the database to extract useful information. Ø transactional database—A database designed to keep track of the day-to-day transactions of an organization. See also production database. Ø workgroup database—A multiuser database that supports a relatively small number of users (usually fewer than 50) or that is used for a specific department in an organization. Data Models Ø attribute—A characteristic of an entity or object. An attribute has a name and a data type. Ø business rule—Narrative descriptions of a policy, procedure, or principle within an organization. Examples: A pilot cannot be on duty for more than 10 hours during a 24-hour period. A professor may teach up to four classes during any one semester. Ø class—A collection of like objects with shared structure (attributes) and behavior (methods). A class encapsulates an object’s data representation and a method’s implementation. Classes are organized in a class hierarchy. Ø class hierarchy—The organization of classes in a hierarchical tree where each “parent” class is a superclass and each “child” class is a subclass. See also inheritance. Ø conceptual model—The output of the conceptual design process. The conceptual model provides a global view of an entire database. Describes the main data objects, avoiding details. Ø conceptual schema—A representation of the conceptual model, usually expressed graphically. See also conceptual model. Ø connectivity—Describes the classification of the relationship between entities. Classifications include 1:1, 1:M, and M:N. Ø constraint—A restriction placed on data. Constraints are normally expressed in the form of rules. Example: “A student’s GPA must be between 0.00 and 4.00.” Constraints are important because they help to ensure data integrity. Ø Crow’s Foot notation—A representation of the entity relationship diagram using a three-pronged symbol to represent the “many” sides of the relationship. Ø data definition language (DDL)—The language that allows a database administrator to define the database structure, schema, and subschema. Ø data management language (DML)—The language (set of commands) that allows an end user to manipulate the data in the database (SELECT, INSERT, UPDATE, DELETE, COMMIT, and ROLLBACK). Ø data model—A representation, usually graphic, of a complex “real-world” data structure. Data models are used in the database design phase of the database life cycle. Ø entity—Something about which someone wants to store data; typically a person, a place, a thing, a concept, or an event. See also attribute. Ø entity instance—A term used in ER modeling to refer to a specific table row. Also known as an entity occurrence. Ø entity occurrence—See entity instance. Ø entity relationship diagram (ERD)—A diagram that depicts an entity relationship model’s entities, attributes, and relations. Ø entity relationship (ER) model (ERM)—A data model developed by P. Chen in 1975. It describes relationships (1:1, 1:M, and M:N) among entities at the conceptual level with the help of ER diagrams. Ø entity set—In a relational model, refers to a grouping of related entities. Ø extended relational data model (ERDM)—Sometimes referred to as the enhanced entity relationship model; the result of adding more semantic constructs (entity supertypes, entity subtypes, and entity clustering) to the original entity relationship (ER) model. Ø external model—The application programmer’s view of the data environment. Given its business-unit focus, an external model works with a data subset of the global database schema. Ø external schema—The specific representation of an external view, that is, the end user’s view of the data environment. Ø hardware independence—Means that a model does not depend on the hardware used in the implementation of the model. Therefore, changes in the hardware will have no effect on the database design at the conceptual level. Ø hierarchical model—No longer a major player in the current database market; important to know, however, because the basic concepts and characteristics form the basis for subsequent database development. This model is based on an “upside-down” tree structure in which each record is called a segment. The top record is the root segment. Each segment has a 1:M relationship to the segment directly below it. Ø inheritance—In the object-oriented data model, the ability of an object to inherit the data structure and methods of the classes above it in the class hierarchy. See also class hierarchy. Ø internal model—In database modeling, refers to a level of data abstraction that adapts the conceptual model to a specific DBMS model for implementation. Ø internal schema—Depicts a specific representation of an internal model, using the database constructs supported by the chosen database. (The internal model is the representation of a database as “seen” by the DBMS. In other words, the internal model requires a designer to match the conceptual model’s characteristics and constraints to those of the selected implementation model.) Ø logical design—A stage in the design phase that matches the conceptual design to the requirements of the selected DBMS and is, therefore, software-dependent. It is used to translate the conceptual design into the internal model for a selected database management system, such as DB2, SQL Server, Oracle, IMS, Informix, Access, and Ingress. Ø logical independence—A condition that exists when the internal model can be changed without affecting the conceptual model. (The internal model is hardware independent because it is unaffected by the choice of computer on which the software is installed. Therefore, a change in storage devices or even a change in operating systems will not affect the internal model.) Ø many-to-many (M:N or *..*) relationship—One of three types of relationships (associations among two or more entities) in which one occurrence of an entity is associated with many occurrences of a related entity and one occurrence of the related entity is associated with many occurrences of the first entity. Ø method—In the object-oriented data model, a named set of instructions to perform an action. Methods represent real-world actions. Methods are invoked through messages. Ø network model—A data model standard created by the CODASYL Data Base Task Group in the late 1960s. It represented data as a collection of record types and relationships as predefined sets with an owner record type and a member record type in a 1:M relationship. Ø object—An abstract representation of a real-world entity that has a unique identity, embedded properties, and the ability to interact with other objects and with itself. Ø object-oriented data model (OODM)—A data model whose basic modeling structure is an object. Ø object-oriented database management system (OODBMS)—Data management software used to manage data found within an object-oriented database model. Ø object/relational database management system (O/RDBMS)—A DBMS based on the extended relational model (ERDM). The ERDM, championed by many relational database researchers, constitutes the relational model’s response to the OODM. This model includes many of the object-oriented model’s best features within an inherently simpler relational database structural environment. Ø one-to-many (1:M or 1..*) relationship—One of three types of relationships (associations among two or more entities) that are used by data models. In a 1:M relationship, one entity instance is associated with many instances of the related entity. Ø one-to-one (1:1 or 1..1) relationship—One of three types of relationships (associations among two or more entities) that are used by data models. In a 1:1 relationship, one entity instance is associated with only one instance of the related entity. Ø physical independence—A condition that exists when the physical model can be changed without affecting the internal model. Ø physical model—A model in which the physical characteristics (location, path, and format) are described for the data. Both hardware- and software-dependent. See also physical design. Ø relation—In a relational database model, an entity set. Relations are implemented as tables. Relations (tables) are related to each other through the sharing of a common entity characteristic (value in a column). Ø relational database management system (RDBMS)—A collection of programs that manages a relational database. The RDBMS software translates a user’s logical requests (queries) into commands that physically locate and retrieve the requested data. A good RDBMS also creates and maintains a data dictionary (system catalog) to help provide data security, data integrity, concurrent access, easy access, and system administration to the data in the database through a query language (SQL) and application programs. Ø relational diagram—A graphical representation of a relational database’s entities, the attributes within those entities, and the relationships among those entities. Ø relational model—Developed by E. F. Codd (of IBM) in 1970, it represents a major breakthrough for users and designers because of its conceptual simplicity. The relational model, based on mathematical set theory, represents data as independent relations. Each relation (table) is conceptually represented as a matrix of intersecting rows and columns. The relations are related to each other through the sharing of common entity characteristics (values in columns). Ø relationship—An association between entities. Ø schema— A logical grouping of database objects (tables, indexes, views, queries, etc.) that are related to each other. Usually, a schema belongs to a single user or application. Ø semantic data model—The first of a series of data models that more closely represented the real world, modeling both data and their relationships in a single structure known as an object. The SDM, published in 1981, was developed by M. Hammer and D. McLeod. Ø software independence—A property of any model or application that does not depend on the software used to implement it. Ø subschema—In the network model, the portion of the database “seen” by the application programs that produce the desired information from the data contained within the database. Ø table—A (conceptual) matrix composed of intersecting rows (entities) and columns (attributes) that represents an entity set in the relational model. Also called a relation. The Relational Database Model Ø associative entity—See composite entity. Ø attribute domain—See domain. Ø bridge entity—See composite entity. Ø candidate key—See key. Ø composite entity—An entity designed to transform an M:N relationship into two 1:M relationships. The composite entity’s primary key comprises at least the primary keys of the entities that it connects. Also known as a bridge entity. See also linking table. Ø composite key—A multiple-attribute key. Ø data dictionary—A DBMS component that stores metadata—data about data. Thus, the data dictionary contains the data definition as well as its characteristics and relationships. A data dictionary may also include data that are external to the DBMS. Also known as an information resource dictionary. See also active data dictionary, metadata, and passive data dictionary. Ø determination—The role of a key. In the context of a database table, the statement “A determines B” indicates that knowing the value of attribute A means that the value of attribute B can be looked up (determined). Ø domain—In data modeling, refers to the construct used to organize and describe an attribute’s set of possible values. Ø entity integrity—The property of a relational table that guarantees that each entity has a unique value in a primary key and that there are no null values in the primary key. Ø flags—Special codes implemented by designers to trigger a required response, to alert end users to specified conditions, or to encode values. Flags may be used to prevent nulls by bringing attention to the absence of a value in a table. Ø foreign key (FK)—See key. Ø full functional dependence—A condition in which an attribute is functionally dependent on a composite key but not on any subset of that composite key. Ø functional dependence—Within a relation R, an attribute B is functionally dependent on an attribute A if and only if a given value of the attribute A determines exactly one value of the attribute B. The relationship “B is dependent on A” is equivalent to “A determines B” and is written as A B. Ø homonyms—Indicates the use of the same name to label different attributes; generally should be avoided. Some relational software automatically checks for homonyms and either alerts the user to their existence or automatically makes the appropriate adjustments. See also synonym. Ø index—An ordered array composed of index key values and row ID values (pointers). Indexes are generally used to speed up and facilitate data retrieval. Also known as an index key. Ø index key—See index. Ø key—An entity identifier based on the concept of functional dependence; may be classified as follows: Superkey: An attribute (or combination of attributes) that uniquely identifies each entity in a table. Candidate key: A minimal attribute that is itself a superkey. Primary key (PK): A candidate key selected as a unique entity identifier. Secondary key: A key that is used strictly for data retrieval purposes. For example, a customer is not likely to know his or her customer number (primary key), but the combination of last name, first name, middle initial, and telephone number is likely to make a match to the appropriate table row. Foreign key: An attribute (or combination of attributes) in one table whose values must match the primary key in another table or whose values must be null. Ø key attribute—The attribute(s) that form(s) a primary key. See also prime attribute. Ø null—In SQL, refers to the absence of an attribute value. Note: A null is not a blank. Ø primary key (PK)—In the relational model, an identifier composed of one or more attributes that uniquely identifies a row. See also key. Ø referential integrity—A condition by which a dependent table’s foreign key must have either a null entry or a matching entry in the related table. Even though an attribute may not have a corresponding attribute, it is impossible to have an invalid entry. Ø relational schema—The description of the organization of a relational database as seen by the database administrator. Ø secondary key—A key that is used strictly for data retrieval purposes. For example, a customer is not likely to know his or her customer number (primary key), but the combination of last name, first name, middle initial, and telephone number is likely to make a match to the appropriate table row. See also key. Ø superkey—See key. Ø synonym—The use of different names to identify the same object, such as an entity, an attribute, or a relationship; should generally be avoided. See also homonym. Ø system catalog—A detailed system data dictionary that describes all objects in a database. Ø unique index—An index in which the index key can have only one pointer value (row) associated with it. Entity Relationship (ER) Modeling Ø binary relationship—An ER term used to describe an association (relationship) between two entities. Example: PROFESSOR teaches COURSE. Ø cardinality—Assigns a specific value to connectivity. Expresses the range (minimum to maximum) of allowed entity occurrences associated with a single occurrence of the related entity. Ø composite attribute—An attribute that can be further subdivided to yield additional attributes. For example, a phone number (615-898-2368) may be divided into an area code (615), an exchange number (898), and a four-digit code (2368). Compare to simple attribute. Ø composite identifier—In ER modeling, a key composed of more than one attribute. Ø connectivity—Describes the classification of the relationship between entities. Classifications include 1:1, 1:M, and M:N. Ø derived attribute—An attribute that does not physically exist within the entity and is derived via an algorithm. Example: Age = current date – birth date. Ø existence-dependent—A property of an entity whose existence depends on one or more other entities. In an existence-dependent environment, the existence- independent table must be created and loaded first because the existence- dependent key cannot reference a table that does not yet exist. Ø existence-independent—An entity that can exist apart from one or more related entities. It must be created first when referencing an existence-dependent table to it. Ø identifiers—The ERM uses identifiers to uniquely identify each entity instance. In the relational model, such identifiers are mapped to primary keys in tables. Ø identifying relationship—A relationship that exists when the related entities are existence-dependent. Also called a strong relationship or strong identifying relationship because the dependent entity’s primary key contains the primary key of the parent entity. Ø iterative process—A process based on repetition of steps and procedures. Ø mandatory participation—A term used to describe a relationship in which one entity occurrence must have a corresponding occurrence in another entity. Example: EMPLOYEE works in DIVISION. (A person cannot be an employee if he or she is not assigned to a company’s division.) Ø multivalued attribute—An attribute that can have many values for a single entity occurrence. For example, an EMP_DEGREE attribute might store the string “BBA, MBA, PHD” to indicate three different degrees held. Ø non-identifying relationship—A relationship that occurs when the primary key of the dependent (many side) entity does not contain the primary key of the related parent entity. Also known as a weak relationship. Ø optional attribute—In ER modeling, refers to an attribute that does not require a value, therefore it can be left empty. Ø optional participation—In ER modeling, refers to a condition where one entity occurrence does not require a corresponding entity occurrence in a particular relationship. Ø participants—An ER term used to label the entities that participate in a relationship. Example: PROFESSOR teaches CLASS. (The teaches relationship is based on the participants PROFESSOR and CLASS.) Ø recursive relationship—A relationship that is found within a single entity type. For example, an EMPLOYEE is married to an EMPLOYEE or a PART is a component of another PART. Ø relationship degree—Indicates the number of entities or participants associated with a relationship. A relationship degree can be unary, binary, ternary, or higher level. Ø required attribute—In ER modeling, refers to an attribute that must have a value. In other words, it cannot be left empty. Ø simple attribute—An attribute that cannot be subdivided into meaningful components. Compare to composite attribute. Ø single-valued attribute—An attribute that can have only one value. Ø strong relationship—When two entities are existence-dependent; from a database design perspective, this exists whenever the primary key of the related entity contains the primary key of the parent entity. Ø ternary relationship—An ER term used to describe an association (relationship) between three entities. Example: A CONTRIBUTOR contributes money to a FUND from which a RECIPIENT receives money. Ø unary relationship—An ER term used to describe an association within an entity. Example: A COURSE is a prerequisite to another COURSE. Ø weak entity—An entity that displays existence dependence and inherits the primary key of its parent entity. Example: A DEPENDENT requires the existence of an EMPLOYEE. Ø weak relationship—A relationship that exists when the PK of the related entity does not contain a PK component of the parent entity. Also known as a non- identifying relationship. Normalization of Database Tables Ø atomic attribute—An attribute that cannot be further subdivided to produce meaningful components. For example, a person’s last name attribute cannot be meaningfully subdivided into other name components; therefore, the last name attribute is atomic. Ø atomicity—A property of transactions that states that all parts of a transaction must be treated as a single logical unit of work in which all operations must be completed (committed) to produce a consistent database. Ø Boyce-Codd normal form (BCNF)— A special form of third normal form (3NF) in which every determinant is a candidate key. A table that is in BCNF must be in 3NF. See also determinant. Ø denormalization—A process by which a table is changed from a higher level normal form to a lower level normal form. Usually done to increase processing speed. Potentially yields data anomalies. Ø dependency diagram—A representation of all data dependencies (primary key, partial, or transitive) within a table. Ø determinant—Any attribute in a specific row whose value directly determines other values in that row. See also Boyce-Codd normal form (BCNF). Ø first normal form (1NF)—The first stage in the normalization process. It describes a relation depicted in tabular format, with no repeating groups and with a primary key identified. All nonkey attributes in the relation are dependent on the primary key. Ø granularity—Refers to the level of detail represented by the values stored in a table’s row. Data stored at their lowest level of granularity are said to be atomic data. Ø key attribute—The attribute(s) that form(s) a primary key. See also prime attribute. Ø nonkey attribute—See nonprime attribute. Ø nonprime attribute—An attribute that is not part of a key. Ø normalization—A process that assigns attributes to entities in such a way that data redundancies are reduced or eliminated. Ø partial dependency—In normalization, a condition in which an attribute is dependent on only a portion (subset) of the primary key. Ø prime attribute—A key attribute, that is, an attribute that is part of a key or is the whole key. See also key attribute. Ø repeating group—In a relation, a characteristic describing a group of multiple entries of the same type that exist for a single key attribute occurrence. For example, a car can have multiple colors (top, interior, bottom, trim, and so on). Ø second normal form (2NF)—The second stage in the normalization process in which a relation is in 1NF and there are no partial dependencies (dependencies in only part of the primary key). Ø surrogate key—A system-assigned primary key, generally numeric and auto- incremented. Ø third normal form (3NF)—A table is in 3NF when it is in 2NF and no nonkey attribute is functionally dependent on another nonkey attribute; that is, it cannot include transitive dependencies. Ø transitive dependency—A condition in which an attribute is dependent on another attribute that is not part of the primary key.
欢迎咨询51作业君