Database Management
Foundations of Information Systems
© The McGraw-Hill Companies, Inc.., 1998
Chapter Objectives
• Specify the elements of the data hierarchy.
• Understand what can be accomplished with files and
what the limitations of a file-based environment are.
• Understand the advantages of a database
environment and the role of a database management
system (DBMS).
• Specify the three levels at which data are defined in
databases.
• Compare the three data models.
• Understand the operation of relational databases and
theirquery language SQL.
Chapter Objectives, cont.
• Understand the role of the data dictionary.
• Explain the components of information
resource management.
• Describe the developmental trends in
database management, including distributed
databases, data warehouses, and objectoriented and other rich database.
Hierarchy of Data
• Data are the Principal Resources of an
Organization
• Data Stored in Computer Systems form
a Hierarchy
– Extending from a Single Bit to a Database
– The Major Record Keeping entity of a
firm
– Each higher rung of this hierarchy is
organized from the components below it
Hierarchy of Data
•
•
•
•
•
•
Data are logically organized into:
1. Bits (Characters)
2. Fields
3. Records
4. Files
5. Databases
Hierarchy of Data
• Bit (Character)
– The smallest unit of data representation
– Value of a Bit may be 0 or 1
– 8 Bits make a Byte which can represent a
character or a special symbol in a
character code
Hierarchy of Data
• Field
– A grouping of characters
– A data field represents an attribute,
characteristic, or quality of some entity,
object, person, place, or event.
Hierarchy of Data
• Record
– A collection of attributes that describe a
real world entity
– A Record consists of fields with each field
describing an attribute of the entity
Hierarchy of Data
• File
– A group of related records
– Classified by the application for which
they are used
• Example: Employee File
– Primary Key in a File is the Field (or
Fields) whose value identifies a records
among others in a data file
Hierarchy of Data
• Database
– An integrated collection of logically
related records or files
– Consolidates records previously stored in
separate files into a common pool of data
records that provides data for may
applications
– The Data is managed by systems software
called Database Management Systems
(DBMS)
Hierarchy of Data Organization in
Computer Storage
Example
Component of Data
Organization
SUPPLIERS
Database
Logical
Components
File
PARTS
SHIPMENTS
SUPPLIERS
NO. NAME
STREET
ADDRESS
13
50.Oak
Gasket Co.
CITY ST ZIP
Tifflin
OH 44883
.
.
3251 Reliable Supp. 11 Cedar Teaneck NJ 07666
Record
Field
(attribute)
Physical
(Storage)
Components
Byte
Bit
13
Gasket Co.
50 Oak
Tifflin
Reliable Suppliers
01000001 (represents “A” in the
ASCII-8 character code)
0
OH 44883
File Environment and Its
Limitations
• File Organization
– Data files are organized so as to facilitate
access to records and to ensure efficient
storage
• A Tradeoff between these two requirements
generally exists
– If rapid access is required, then more storage is
required.
File Environment and Its
Limitations
• Access to a record for reading it is the
essential operation on data
• 1. Sequential Access
– Records are accessed in the order they are
stored
– The main access mode only in Batch
Systems where files are used and updated
at regular intervals
• 2. Direct Access
File Environment and Its
Limitations
•
•
•
•
File Organization Methods
1. Sequential Organization
2. Indexed Sequential Organization
3. Direct Organization
File Environment and Its
Limitations
• Sequential Organization
– Records are physically stored in a
specified order according to a key field in
each record
– Advantages
• Fast
• Efficient for dealing with large volumes data
that need to be processed periodically (batch
system)
– Disadvantages
File Environment and Its
Limitations
• Indexed Sequential Organization
– Records are physically stored in sequential
order on a magnetic disk or other direct
access storage device based on the key
field of each record
– Each file contains an index that references
one or more key field of each data record
to its storage location address
File Environment and Its
Limitations
• Direct Organization
– Provides the fastest direct access to
records
– Records do not have to be arranged in any
particular sequence on storage media
– Computers must keep track of the storage
location of each record using a variety of
direct organization methods to retrieve
data
– New transactions’ data does not have to be
File Environment and Its
Limitations
• Limitations of a File Oriented
Environment
– Fair amount of data redundancy
– Example of Three Systems
•
•
•
•
Supplier
Shipment
Inventory
Note: These Databases may have
inconsistencies
Database Environment
• A Database is an organized collection
of interrelated data that serves a
number of applications in an enterprise.
– Stores both the values of various entities
and their relationships
– Managed by a Database Management
System DBMS
• A system software that provides assistance in
managing databases shared by multiple users
Database Environment
• DBMS
• Helps organize data for effective access
by a variety of users with different
access needs and for efficient storage
• Makes it possible to create, access,
maintain, and control databases
• Data can be integrated and presented
on demand
Database Environment
• Advantages of a Database Management
Approach
• Avoiding uncontrolled data redundancy
and preventing inconsistency
• Program-Data independence
• Flexible access to shared data
• Benefits of centralized control of data
A Database Environment
Reports
End Users
Application
Program A
.
.
.
Application
Program X
DBMS
Database
Query
Update
Reports
Levels of Data Definition in
Databases
• User view of a DBMS is the basis for
modeling steps
• Data models define the logical
relationships among the elements to
support the basic process
• Logical framework to base the physical
design of databases and the
development of application programs
Levels of Data Definition in
Databases
• DBMS defines a database
• Schema
– Overall logical view of the relationships
between data in a database
• Subschema
– Logical view of data relationships needed
to support specific end user applications to
access the database
• Physical
Levels of Data Definition in
Databases
• DBMS provides the language, DDL
Data Definition Language to define the
databases on the three levels (Schema,
Subschema, and Physical), and the
DML Data Manipulation Language to
access records, change values of
attributes, and delete or insert records
Data Models or How to
Represent Relationships
Between Data
• Data Model is a method for organizing
databases on the logical, schema, and
subschema levels.
• The main concern is how to represent
relationships among database records
• Relationships are based on logical data
structures or models
• DBMS are designed to provide end
users with quick, easy access to
•
•
•
•
Data Models or How to Represent
Relationships
Between Data
Data Model Structures
Hierarchical
Network
Relational
Data Models or How to Represent
Relationships
Between Data: Hierarchical
• Used by early mainframe DBMS
• Relationships between records form a hierarchy or tree-like
structure.
• Records are dependent and arranged in multilevel structures of
one root record and any number of subordinate levels.
• Relationships among the records are one-to-many as each data
element is related only to one element above it.
• The data element or record at the highest level is the root
element. Any data element can be accessed by moving
progressively downward from the root and along the branches
of the tree to locate the record.
Data Models or How to Represent
Relationships
Between Data: Hierarchical
• Advantages
– Ease with which data can be stored and retrieved in
structured, routine types of transactions
– Ease with which data can be extracted for reporting
purposes
– Routine types of transaction processing are fast and
efficient.
• Disadvantages
– Hierarchical one-to-many relationships must be specified in
advance and are not flexible.
– Cannot easily handle ad hoc requests for information
– Modifying a hierarchical database structure is complex
– Redundancy
A Hierarchical Database
CLASS
COURSE
NUMBER
MIS 101
SECTION
ROOM
NUMBER
1
525
MEETING
DAYS
MEETING
HOURS
MWF
10-11 a.m.
PROFESSOR
PROFESSOR
NUMBER
221
PROFESSOR
NUMBER
STUDENT
NUMBER
Turkel
02579
02311
01171
STUDENT
NAME
Young
Volkonsky
Johnson
STUDENT
STUDENT
STUDENT
Data Models or How to Represent
Relationships
Between Data: Network
• Can represent more complex logical
relationships and is used by mainframe
DBMS packages
• Allows many-to-many relationships among
records meaning that the network model can
access a data element by following several
paths due to multiple relationships
Data Models or How to Represent
Relationships
Between Data: Network
• Advantages
– More flexible than the hierarchical model
– Ability to provide sophisticated logical relationships among
the records
• Disadvantages
– Network many-to-many relationships must be specified in
advance
– User is limited to retrieving data that can be access using
the established links between records
– Cannot easily handle ad hoc requests for information
– Requires knowledge of a programming language
A Network Database
PROFESSOR
STUDENT
PROFESSOR PROFESSOR
NUMBER
NAME
221
STUDENT
NUMBER
Turkel
01171
Links to Records of
Other Classes Taught
by This Professor
STUDENT
NAME
Johnson
Links to Records of
Other Classes Taken
by This Student
Links to Records of
Other Students
Taking This Class
COURSE
NUMBER
MIS 101
SECTION
ROOM
NUMBER
1
525
MEETING
DAYS
MWF
MEETING
HOURS
10-11 a.m.
Data Models or How to Represent
Relationships
Between Data: Relational
• Most popular Database Structure
• Used by most microcomputer DBMS
packages as well as many minicomputer and
mainframe systems
• Data elements within the database are stored
in the form of simple tables which are related
if they contain common fields
• DBMS packages based on the relational
model can link data elements from various
tables to provide information to users
Data Models or How to Represent
Relationships
Between Data: Relational
• Advantages
– Flexible in that ad hoc information request can be handled
– Easy for programmers
– End users can use Relational Structures with little effort or
training.
– Easier to maintain than the hierarchical and network
models
• Disadvantages
– Cannot process large amounts of business transactions as
quickly and efficiently as the hierarchical and network
models
Relational Databases
• A collection of tables that is relatively easy to use and
understand offering flexibility for the data and ability to modify
• All records in a relational database must have a unique primary
key
• Relational Systems support three principal operations on tables
without any predefined access paths
• Select- from a specified row that satisfies a given condition
• Project- selects the attribute values
• Join- builds a new table from two specified tables
Relational Databases
• The power of the relational model
derives from the join operation.
– Records are related through a join
operation rather than links so that a
predefined access path is not needed
– The join operation is time consuming
requiring access to many records stored
(on disk) to find the needed records
SQL- A Relational Query
Language
• International standard access language fro defining
and manipulating data in databases
• Data-Definition-and-Management Language for
DBMS and some nonrelational ones
• SQL is used an independent query language to
define objects in a database, enter, and access the
data
• Embedded SQL is for programming in procedural
languages (host) such as C, COBOL, PL/1, PL/SQL
to access from an application program
• For the end user environment, SQL is “hidden” by
more friendly user interfaces
SQL- A Relational Query
Language
• Principal Facilities of SQL
• Data Definition
• Data Manipulation
Designing a Relational
Database
• Database design progresses from the design of the
logical of the schema and the subschema to the
physical level.
• Logical Design or Data Modeling is to design the
schema and the subschema.
• A relational database consists of tables (relations)
describing the attributes of a particular class of
entities.
• Logical Design begins with identifying the entity
classes to be represented in the database and
establishing relationships or interactions between the
entities represented to access the data.
Designing a Relational
Database
• Entity-Relationship (E-R) Diagrams
are used to perform data modeling.
– Diagrams include rectangles for entities
and diamonds for relationships including
attributes
Designing a Relational
Database
• Normalization is the simplification of the
logical view of data in relational databases
meaning that all fields contain single data
elements, distinct records, and each table
describes only a single class of entities.
• The objective of normalization is to prevent
replication of data with all of its negative
consequences.
Designing a Relational
Database
• Physical Design is after the Logical Design
of the database.
• Fields are specified as to their length and the
nature of the data (numeric, characters).
• A principal objective is to minimize the
number of time consuming disk accesses that
are necessary in order to answer typical
database queries.
• Frequently, indexes are provided to ensure
fast access for queries.
The Data Dictionary
• A software module and database
containing descriptions and definitions
of the structure, data elements,
interrelationships, and other
characteristics of an organization’s
database
The Data Dictionary
• Schema, Subschemas, and Physical Schema
• Which applications and users may retrieve and/or
modify the specific data
• Cross reference information such as which programs
use what data and the users receive particular reports
• Where individual data elements originate, and who
is responsible for maintaining the data
• Standard naming conventions for database entities
• Integrity rules for the data
• Where the data are stored in geographically
distributed databases
The Data Dictionary
• Contains all the data definitions and
information necessary to identify data
ownership
• Ensures security and privacy of the
data and the information used during
the development and maintenance of
applications which rely on the database
Managing the Data Resource
of an Organization
• Database technology enables organizations to
control their data as a resource, but it does not
automatically produce organizational control of data
• Components of Information Resources Management:
Organizational and Technical Means
• Ensure that a firm systematically accumulates data in
its databases
• Maintains the data over time
• Provides the appropriate access
Managing the Data Resource
of an Organization
• Principal Components of Information
Resource Management
• Organizational Processes
– Information planning and data modeling
• Enabling Technologies
– DBMS and a Data Dictionary
• Organizational Functions
– Data and database administration
Managing the Data Resource
of an Organization
• Database Administration Functional
Units to Manage the Data
• Data Administrator (DA)
• Database Administrator (DBA)
Managing the Data Resource of an
Organization:
Data Administrator
• Person who has the central responsibility for an organization’s
data
• Establish policies and specific procedures for collecting,
validating, sharing, and inventorying data to be stored and
databases and for making information accessible to members of
the organization and possible persons outside of it
• Data administration is a policy making function and the DA
should have access to senior corporate management.
• Key person involved in the strategic planning of the data
resource
• Often defines the principal data entities, their attributes, and the
relationship among them
Managing the Data Resource of an
Organization:
Database Administrator
• A specialist responsible for maintaining standards for the
development, maintenance and security of an organization’s
databases
• Creating the databases and carrying out the policies of the data
administrator
• In large organizations, the DBA function is performed by a
group of professionals; in a small firm, a programmer/ analyst
may perform the DBA function, while one of the managers acts
as the DA.
• Schema and subschemas of the database are most often defined
by the DBA, who has the requisite technical knowledge to
detail the physical layout of the databases with a view toward
optimizing system performance for the expected pattern of
database usage
Managing the Data Resource of
an Organization
• Joint Responsibilities of the DA and DBA
• Maintaining the Data Dictionary
• Standardizing names and other aspects of
data definition
• Providing Backup
• Provide security and privacy for the data
stored in the database
• Establish a disaster recovery plan for the
databases
Developmental Trends in
Database Management
• Distributed Databases
• Data Warehousing
• Rich Databases (includes objectoriented databases)
Developmental Trends in
Database Management
• Distributed Databases
–
–
–
–
–
Spread across several physical locations
Data are placed where they are used most often
Entire database is available to each authorized user
Local work groups (LAN)
Departments at regional offices (WAN)
• Branch offices, manufacturing plants, and other work sites
– Databases includes segments of both common operational
and user as well as data generated only at the user’s own
site
Developmental Trends in
Database Management
• Data Warehouse Databases
– Stores data from current and previous years that has been
extracted from operational and management databases
– Central source of data
– Standardized and integrated for use by managers and other
end user professionals
– Objective of a corporate data warehouse
• To continually select data from the operational databases
• Transform the data into a uniform format
• Open the warehouse to the ends through a friendly and
consistent interface
– Data mining
• Automated discovery of potentially significant relationships
among categories of data
Developmental Trends in
Database Management
• Systems supporting a data warehouse
• Extract and Prepare Data
– The first subsystem extracts the data from the operational systems
– Many are older legacy systems that get “scrubbed” by removing
errors and inconsistencies
• Store Data in the Warehouse
– The second support component is the DBMS that will manage the
warehouse data.
• Provide Access and Analysis Capabilities
– The third subsystem is made up of the query tools to access the
data
– Includes OLAP (OnLine Analytical Processing) and other DSS
tools supporting data analysis
Developmental Trends in
Database Management
• Object-Oriented and Other Rich Databases
– With expanded capabilities of information technology, the
content of databases is becoming richer
– Traditional databases include largely numerical data or
short fragments of text organized into well structured
records
– As processing and storage capabilities expand with
growing telecommunications, the knowledge is supported
with rich data
– Geographic Information Systems
– Object-Oriented Databases
– Hypertext and Hypermedia Databases
– Image Databases and Other Text Databases
A System with a Distributed Database
Site 1
Site 3
.
.
.
Database
Fragment 1
Telecommunications
Network
Database
Fragment 3
Site 2
Users have
access to the
entire database
over the network
.
.
.
Database
Fragment 2
Key Terms
Bit
Byte
Field
Record
File
Primary Key
Database
Access (to a record)
Sequential Access
Direct Access
Sequential File
Indexed-Sequential
File
Direct File
Database Management
System (DBMS)
Program-Data
Independence
Schema
Subschema
Physical Schema
Data Definition
Language (DDL)
Data Manipulation
Language (DML)
Data Model
Hierarchical Model
Network Model
Relational Model
Structured Query
Language (SQL)
Logical Design (data
modeling)
Entity-Relationship
(E-R) Diagram
Normalization
Physical Design
Data Dictionary
Data Administrator
(DA)
Database
Administrator (DBA)
Distributed Database
Object-Oriented
Database
Data Warehouse
OLAP Software
Data Mining
Geographical
Information System
(GIS)
Descargar

Database Management