Transaction Isolation In the Sedna Native XML DBMS
c Peter Pleshachkov Leonid Novak
Institute for System Programming of the Russian Academy of Sciences,
B. Kommunisticheskaya, 25, Moscow 109004, Russia
peter@ispras.ru, novak@ispras.ru
Ph.D. Advisor S. D. Kuznetsov
Abstract
XML has become the most important tech-
nique to exchange data in World Wide Web.
As consequence, an interest to native XML
databases has surfaced. Concurrency con-
trol methods for traditional databases are
not adequate for XML databases, because
they do not capture the specic of XML
data model. In this paper we propose a lock-
ing mechanism, developed in Sedna, which
allows to achive a high degree of concur-
rency and takes into account the proper-
ties of structural and manipulation parts of
XML model.
1 Introduction
The use of the extensible markup language XML [13]
for electronic data interchange leads to an enormous
growth of the number of XML documents. The num-
ber of dierent applications supporting XML grows
rapidly too. So the challenge of isolating dierent
applications in XML environment from each other
is very actual. There are essentially three possibili-
ties of storing XML documents: to use le system,
RDBMS [4] or a native XML database management
system (XDBMS for short). The rst and second
approaches poss an obvious imperfections from an
isolation point of view [10, 2], and this is the reason
why synchronizing protocols for XDBMSes are ex-
tensively being studied and developed now. This pa-
per introduces synchronizing method of XML-data,
based on locking techniques, which was developed in
Sedna [8]. Our synchronizing method takes into ac-
count the properties of structural and manipulation
parts of XML model.
Sedna is a full-featured native XML database
management system, which is being developed by
the MODIS R&D group of ISP RAS. It supports
XQuery [14] language for query facilities. In order
to support update facilities XQuery language was
extended. Sedna update language is very close to
[11].
There are several protocols [1, 7], which allow to
synchronize hierarchical data via data locking, but
Proceedings of the Spring Young Researcher's Collo-
quium On Database and Information Systems SYR-
CoDIS, St.-Petersburg, Russia, 2004
most of them exploit the concept of intention locks.
It means that for locking a subtree all ancestors of
the root of this subtree must be locked in the in-
tention mode. Thus, the operation of locking an
entire subtree of XML document is needed informa-
tion about nodes, which are distributed over XML
document, i.e. the locking operation does not poss
the locality property.
In Sedna physical representation of XML docu-
ments [5], using of these protocols leads to a very
expensive locking operation, because all ancestors
of subtree root are stored in dierent blocks and all
these blocks must be read to the main memory.
This is the main reason why we do not incorpo-
rate these protocols in Sedna. Our locking method
allows to lock subtree without locking ancestors of
the root in intention mode. To lock the whole sub-
tree only descriptor of the root node is needed. We
have achieved this result by using numbering scheme
for locking nodes and subtrees of XML document.
The rest of the paper is organized as follows. In
Section 2 we introduce the storage schema of XML
documents, which we use in Sedna. In Section 3 we
introduce some of the basics relating to the synchro-
nization methods. In Section 4 we present our lock-
ing method, which allows to acquire the arbitrary
entire subtrees of XML documents in ecient way.
In section 5 we present logical locks, which are use-
ful for prevention con
icts at the logical level such
as lost updates, dirty reads and unrepeatable reads.
In Section 7 we discuss the locks, which are useful
for preventing con
icts at the physical level, and we
name them as physical locks. Finally, in Section 8
and 9, we give a brief overview of related work about
XML synchronization and conclusion of the paper.
2 XML storage model
In this section we will describe the main concepts of
Sedna XML storage model, over which the locking
techniques have been developed.
The choice of our physical representation of XML
documents is based on two fundamental principles
of XML-data management:
The eciency of XPath evaluation.
The eciency of propagating updates.
Consider the main characteristics of physical rep-
resentation of XML documents in Sedna.
1. A descriptive schema driven storage stategy
which consists of clustering nodes of an XML
document according to their positions in the
descriptive schema of the XML document. De-
scriptive schema presents a concise and accu-
rate structure summary of the XML document.
Formally speaking, descriptive schema is a tree.
Every path of the document has exactly one
path in the descriptive schema, and, vice versa,
every path of the descriptive schema is a path
of the document.
2. Each descriptive schema node is labeled with
an XML node kind name (e.g. element, at-
tribute, text, etc.) and has a pointer to data
blocks where nodes corresponding to the de-
scriptive schema node are stored. Some schema
nodes depending on their node kinds are also
labeled with names. Data blocks belonging to
one schema node are linked via pointers into a
bidirectional list. Nodes are ordered between
blocks in document order.
3. The structural part and text values of nodes
are separated. Text values are stored in blocks
according to the well-known slotted-page struc-
ture method [12] developed specically for data
of variable length. The structural part of a node
re
ects its relationship to other nodes (i. e. par-
ent, children, sibling nodes) and is presented in
the form of node descriptor.
4. Each node descriptor contains a pointer to a
numbering scheme. A numbering scheme as-
signs a unique numeric number to each node of
an XML document according to some scheme-
specic rules. The numeric numbers encode
information about the relative position of the
node in the document. Thus, the main purpose
of a numbering scheme is to provide mechanisms
to quickly determine the structural relationship
between a pair of nodes. It provides the mecha-
nism of quickly determining the following struc-
tural relationships between a pair of nodes: (1)
ancestor-descendant relationship, (2) the doc-
ument order relationship. Numbering scheme
allows to support many XQuery-specic opera-
tions.
3 Synchronization Preliminaries
In this section we give an overview of concurrency
control method that are of interest here.
To ensure serializability of transactions we use a
well-known strict two phase locking protocol. We
use logical locks to prevent con
icts at the logical
level such as lost updates, dirty reads and unre-
peatable reads. For prevention physical con
icts we
use physical locks. A problem at the physical level
can occur if one transaction follows a pointer to a
record on some page, while the other transaction up-
dates a second record on the same page and causes
Figure 1: Compatibility Matrix for lock Modes
a data compaction routine to reassign record loca-
tions. Physical locking is handled by setting and
holding locks on one or more pages during the exe-
cution of a single physical operation. Logical locking
is handled by setting locks on such objects as nodes
and subtrees and holding them either until they are
explicitly released or to the end of the transaction.
For synchronizing read and write operations, we in-
troduce three kinds of locks: shared locks (S), ex-
clusive locks (X) and update locks (U). Read access
requires a shared lock while write access requires an
exclusive lock. An update lock supports read with
(potential) write access and prevents further shared
locks for the same object. An update lock can be
converted to an exclusive lock after the release of
the existing read locks or back to an shared lock if
no update action is needed. The compatibility ma-
trix for lock modes is depicted in Figure 1.
For two phase locking protocol the standard rules
have to be obeyed. Before performing an operation,
the corresponding lock has to be acquired. During
lock acquisition a check for con
icting locks is per-
formed, if a con
ict exists the lock requiring trans-
action is blocked, and locks are held till the end of
transaction. If a transaction is blocked, the wait
graph is updated and if it contains a cycle, the trans-
action that completes the cycle is aborted. The se-
lection of a victim is based on the relative ages of
transactions in deadlock cycle. In general, the the
youngest transaction is selected as the victim.
Maintaining locks, granting and declining lock re-
quests are managed by a lock manager component.
In Sedna logical locks are implemented by means
of numbering scheme, which was introduced in Sec-
tion 2. Numbering scheme also can be used for pre-
venting phantoms, because locking of the interval
of numbering numbers, allows to lock all nodes (in-
cluding nodes which are not presented in database),
which labeled with numeric numbers included in this
interval. For example, the interval of numeric num-
bers, which includes all nodes of the certain subtree
(including phantoms) can be calculated by the root
of this subtree.
Our main contribution in this paper is using and
adopting numbering scheme for logical locking mech-
anisms (see Section 5).
4 Numeric Schema as Basics for
Locking
In this section we introduce the details of Sedna
numbering schema, and the main idea of our locking
methos based on the certain properties of numbering
scheme.
A numbering schema of XML document provides
a one-to-one mapping of nodes of XML document
onto numeric numbers. In other words, each node of
XML document has a unique numeric number.
The numeric number of the node consists of two
components. The rst one is a prex and the second
one is a delimiter. Prex is a string of characters,
while delimiter is a character. We will refer to the
numeric number of node n as (pn; dn), where pn is
the prex of node n and dn is its delimeter, or simply
as N if prex and delimeter of numeric number are
not signicant.
Below we give a set of denitions concerning nu-
meric numbers.
Denition 1 Let (pn1 ; dn1 ) and (pn2 ; dn2 ) are the
numeric numbers of nodes n1 and n2 of some XML
document correspondingly. We regard that the nu-
meric number of n1 is less (<) than numeric num-
ber of n2 if and only if pn1 pn2 , where is the
lexlexicographical comparison of strings.
Denition 2 Let (pn1 ; dn1 ) and (pn2 ; dn2 ) are the
numeric numbers of nodes n1 and n2 of some XML
document correspondingly. We regard that inter-
val of numeric numbers [(pn1 ; dn1 ); (pn2 ; dn2 )] (or
simply [pn1 ; pn2 ]) consists of all numeric numbers
(pni ; dni ), which satisfy the condition pn1 pni
pn2 .
Denition 3 Let p1 and p2 are the strings of char-
acters. Then the string p1 p2 is the concatenation
of strings p1 and p2.
The idea of numbering scheme, which provides (1)
and (2) mechanisms (these mechanisms were intro-
duced in Section 2) is based on the following facts:
For two given strings p1 and p2 such that p1
p2 there exists string p such that p1 p p2.
For example, if p1 = \abc00 and p2 = \abcd00,
then p = \abca00 (the number of such p is in-
nite).
Assume, that node n is the root of the entire
subtree in the document. Then the string in-
terval (pn; pn dn), sets the range of numeric
numbers for all descendants of node n.
Numbering numbers are asigned to the nodes of
a document in the following way:
For two given nodes n1 and n2, n1 preceeds n2
in the document order if and only if pn1 pn2 .
For two given nodes n1 and n2, n1 is an ancestor
of n2 if and only if pn1 pn2 pn1 dn1
!
"#
!
#
$
!%
&!
'
"(
)*
+
%
,
-
"
(
'
&
Figure 2: Example of an XML document
The exact algorithm of assigning numeric num-
bers to the nodes of XML document is not very sig-
nicant here and we don't present it here.
The idea of locking entire subtree of XML docu-
ment using numeric numbers is based on the funda-
mental property of numeric numbers, which is stated
in the following proposition
Proposition 1 Let (pn; dn) is the numeric number
of node n. Then the interval [pn; pn dn] includes the
numeric numbers of all descendants of node n.
Thus, the numeric numbers can be used for lock-
ing nodes and subtrees of XML document. The lock-
ing of the node is implemented of locking its numeric
number. The locking of the subtree (assume node n
is the root of this subtree) is corresponds to the lock-
ing of interval [pn; pn dn]. The correctness of this
idea is guaranteed by proposition 1 .
Below we consider the example of using numeric
schema for locking purposes.
Figure 3 depicts a small XML document lib.xml.
The document contains the list of three books where
each book is described by title, price (optional) and
author, respectively.
Figure 3 shows the tree representation of the
structure of the XML document dened in Figure 2.
The outer element lib is the document node. Nested
elements are connected with edges in tree. To each
node of the tree (except text nodes) the numeric
number is assigned.
To lock the rst book element transaction is re-
quired to lock interval ["ab"; "abm"], while to lock
the last book elemnt transaction is needed to lock
interval ["ap"; "apm"]. To lock the whole XML doc-
ument interval ["a"; "az"] must be locked.
Figure 3: Example of assigning numbering numbers
to the XML document
The lock all book elements of the document
the following three intervals must be locked:
["ab"; "abm"], ["ah"; "ahm"] and ["ap"; "apm"] or
simply one interval ["ab"; "apm"] should be locked.
5 Logical Locks
Logical locks serve for preventing con
icts at the log-
ical level. As mentioned in Section 4 the set of nodes
is locked by means of intervals of numeric numbers.
The locking of one node with numeric number (p; d)
corresponds to the locking of interval [p; p], while the
locking of the whole subtree corresponds to the lock-
ing of interval [p; p d], where (p; d) is the numeric
number of the root of the subtree to be locked.
Assume that there are m active transactions, and
the intervals Iij (j = 1::ki) are locked by transaction
Ti with modes Mij correspondingly. The request for
interval Iiki+1 with mode Miki+1 by transaction Ti
will be granted by lock manager if and only if one of
the following statements is obeyed:
Iiki+1 does not intersect with all intervals ac-
quired by transactions Tj , j = 1::m; j 6= i.
if Iiki+1 intersects with several intervals ac-
quired by transactions Tj , j = 1::m; j 6= i, then
Miki+1 is compatible (according to Figure 1)
with each of these intervals.
Actually, intervals can be useful for preventing
phantoms [3]. We demonstrate this idea on example.
Assume that transaction T1 retrieves all books (see
lib.xml) by the following path expression: /lib/book,
while the transaction T2 inserts the price element as
the rst child of the rst book element. It is obvious
that the price element is the phantom for transac-
tion T1. But the transactions T1 and T2 will not
run concurrently because interval ["ab"; "apm"] in-
cludes the numeric numbers for price element to be
inserted. The algorithm of assigning numeric num-
bers to the new elements ensures it.
So far, we only considered the locking method
for one node or the whole subtree of XML docu-
ment. Below we present some extensions of our lock-
ing method which allow to increasing the degree of
concurrency.
Sometimes, transaction is needed to lock the part
of the subtree, for example all nodes of the subtree
with depth less than or equal to n, while the depth
of the whole subtree is greater than n. In this case
the lock request will consist of two parts. The rst
one is the interval of numeric numbers, which covers
the whole subtree, and the second one is the part of
descriptive schema to which the nodes to be locked
belong. Actually, the part of descriptive scheme con-
sists of the set of scheme nodes. For example, the
part of descriptive scheme, which covers book ele-
ment in lib document is the set of nodes: (book,
title, price, author). We will refer to the part of
descriptive scheme using S letter.
Thus, two locks (I1; S1) and (I2; S2) are compat-
ible if one of the following statements is obeyed:
I1 does not intersect with I2
I1 intersects with I2, but their modes M1 and
M2 are compatible.
The rst and second statements are not obeyed,
but the descriptive scheme parts S1 and S2 does
not intersect.
It is obvious, that the using of descriptive scheme
allows to achieve higher degree of concurrency, but
the logic of lock manager becomes more complex.
6 Logical locks escalation
In Sedna, the lock manager component maintains a
count of the locks held by the transaction. If the
number of locks held by one transaction becomes
too large then the lock manager runs the escalation
procedure: the conversion of many ne-granularity
locks into a single coarse-granularity lock.
To tune the lock escalation, we introduce one pa-
rameter: the escalation thresold. If lock manager
detects that the number of locks acquired by one
transaction exceeds the percentage thresold value
dened by the escalation threshold then the locks
held by this transaction are replaced with a suit-
able single lock, which covers all these locks. In our
locking method the escalation can be implemented
by means of replacing a set of intervals Ii with one
interval I, which covers all Ii.
It is obvious, there is a trade-o to be observed
for lock escalation. On the one hand lock escalation
leads to the decreasing of concurrency, but on the
other hand a reduction of the number of held locks
and the number of calls to the lock manager leads
to saving lock manager overhead.
7 Physiacal Locks
Physical locks serve for preventing con
icts at the
physical level. Physical locks are held during the
evaluation of one physical operation. In most rela-
tional databases the granularity of physical locks is
block.
In Sedna we also use blocks as granule for physical
locking. And the simplest way to ensure physical
consistency is to lock the block, where the needed
nodes are stored.
To understand the interrelations between logi-
cal and physical locks consider the evaluation of
the XPath expression /lib/book[title="Data On the
Web"].
We start evaluation of the query with traversing
the descriptive schema (note: in this paper we do not
consider the synchronization of descriptive schema).
The result of traverse is one schema node that con-
tain pointer to the list of blocks where the descrip-
tors of book nodes are stored. Then transaction will
lock the rst block from the list. Traversing this
block transaction will lock the blockes, where the ti-
tle node descriptors of the book nodes are stored. If
transaction nd a book, which title is equal to "Data
on the Web" then the one will lock this book node
at the logical level. When the list of book blocks will
be passed the physical locks can be released, while
the logical locks on the book nodes, which satisfy to
the predicate will be released only at the end of the
transaction.
8 Related Work
Processing of concurrent querying and update of
XML-data has received only little attention so far.
In [9], the synchronization of concurrent trans-
actions is considered in the context of DOM API.
The authors present three types of locks. Node locks
are acquired for the actual nodes, navigational locks
are acquired on virtual navigation edges to synchro-
nize operations on the navigation paths, while log-
ical locks are introduced to prevent phantom prob-
lem. Authors oer rich options to enhance trans-
action concurrency. But synchronization of non-
navigational APIs (like XQuery) is part of future
work.
In [10] the discussion is also based on the DOM
API, several isolation protocols are proposed. But
node locks are not acquired in a hierarchical context,
and lock granularity is xed for each protocol.
Grabs et. al [6] proposed a combination of well-
known granularity locking and predicate locking
which provides high concurrency, but their locking is
applicable to only restricted XML documents with
simple XPath query for transaction access.
9 Conclusion
Ecient concurrent processing of updates and
queries of XML-data in a consistent and reliable
way is an important practical problem. There are
only few extensions of commercial database systems,
which poorly support XML document processing.
In our paper we presented the physical represen-
tation of XML documents, which we use in Sedna.
Based on this representation, we have introduced
the locking mechanism, which allows acquiring the
nodes and subtrees of XML document in ecient
way. The logical and physical locks have been dis-
cussed. The idea how phantoms problem can be
solved by means of intervals of numeric numbers
is also presented. We pointed out that descriptive
scheme knowledge can improve the degree of concur-
rency.
Future work includes ecient deadlock detection
mechanism for proposed locking method. A recovery
mechanism is also one of the next steps in our plan.
References
[1] P. A. Bernstein, V. Hadzilacos, and N. Good-
man, "Concurrency Control and Recovery in
Database Systems" Addison-Wesley, 1987.
[2] S. Dekeyser, J. Hidders, J. Paredaens, "A trans-
action model for XML databases", World Wide
Web Journal, Kluwer, 2003.
[3] K. P. Eswaran, J. Gray, R. Lorie, and I. Traiger,
"The notions of consistency and predicate locks
in a database systems", Comm. of ACM, Vol.
19, No. 11, pp. 624-633, November 1976.
[4] D. Florescu and D. Kossman, "Storing and
Querying XML Data using an RDBMS", IEEE
Data Engineering Bulletin, 1999.
[5] A. Fomichev, M. Grinev, S. Kuznetsov, "De-
scriptive Schema Driven XML Storage", Sub-
mitted to ADBIS 2004.
[6] T. Grabs, K. Bohmd, and H. Schek, "XMLTM:
ecient transaction management for XML doc-
uments", Proc. of the 19th CIKM Conference,
pp 142-152, 2002.
[7] J. Gray, and A. Reuter, "Transaction Process-
ing: Concepts and Techniques", Morgan Kauf-
mann, 1993.
[8] M. Grinev, A. Fomichev, S. Kuznetsov, K.
Antipin, A. Boldakov, D. Lizorkin, L. Novak,
M. Rekouts, P. Pleshachkov, "Sedna: A Na-
tive XML DBMS", Submitted to International
Workshop on XQuery Implementation, Experi-
ence and Perspectives (XIME-P), 2004.
[9] M. P. Haustein, and Theo Harder, "taDOM:
a Tailored Synchronization Concept with Tun-
able Lock Granularity for the DOM API",
In Proc. ADBIS Conference, LNCS 2798,
Springer, 2003.
[10] S. Helmer, C.-C Kanne, and G. Moerkotte,
"Isolation in XML Bases", Technical report,
University of Mannheim, Germany, 2001.
[11] P. Lehti, "Design and Implementation of a
Data Manipulation Processor for an XML
Query Language", Technische Universitt
Darmstadt Technical Report No. KOM-D-149,
http://www.ipsi.fhg.de/ lehti/diplomarbeit.pdf,
August, 2001.
[12] A. Silberschatz, H. Korth, S. Sudarshan,
"Database System Concepts", Third Edition,
McGraw-Hill, 1997.
[13] World Wide Web Consortium, "Extensible
Markup Language (XML) 1.0 (2nd edition)",
W3C Recommendation.
[14] World Wide Web Consortium, "XQuery 1.0:
An XML Query Language", W3C Working
Draft, 13 November 2003.
1
Introduction to database management
systems
Database System Concepts 1.2 ©Silberschatz, Korth and Sudarshan
Database management systems module
Myself: researcher in INRIA Futurs, Ioana.Manolescu@inria.fr
The course: follows (part of) the book
"Database System Concepts", Fourth Edition
Abraham Silberschatz, Henry F. Korth, S. Sudarshan
(minor modifications to slides)
All slides are at: http://www.cs.yale.edu/homes/avi/db-book/
The content taught here is basic and can be found in many other
good books (e.g. Gardarin; Ramakrishnan)
2
©Silberschatz, Database System Concepts 1.3 Korth and Sudarshan
What we are going to see
Introduction to database management systems
What's inside
Why it's interesting
Relational databases: the success story in databases
Relational model
Relational algebra
Most successful relational query language: SQL
Most successsful commercial product implementing SQL: Oracle
Database System Concepts 1.4 ©Silberschatz, Korth and Sudarshan
Chapter 1: Introduction
Purpose of Database Systems
View of Data
Data Models
Data Definition Language
Data Manipulation Language
Transaction Management
Storage Management
Database Administrator
Database Users
Overall System Structure
3
©Silberschatz, Database System Concepts 1.5 Korth and Sudarshan
Database Management System (DBMS)
Collection of interrelated data
Set of programs to access the data
DBMS contains information about a particular enterprise
DBMS provides an environment that is both convenient and
efficient to use.
Database Applications:
Banking: all transactions
Airlines: reservations, schedules
Universities: registration, grades
Sales: customers, products, purchases
Manufacturing: production, inventory, orders, supply chain
Human resources: employee records, salaries, tax deductions
Databases touch all aspects of our lives
Database System Concepts 1.6 ©Silberschatz, Korth and Sudarshan
Purpose of Database System
In the early days, database applications were built on top of
file systems
Drawbacks of using file systems to store data:
Data redundancy and inconsistency
Multiple file formats, duplication of information in different files
Difficulty in accessing data
Need to write a new program to carry out each new task
Data isolation —multiple files and formats
Integrity problems
Integrity constraints (e.g. account balance > 0) become part
of program code
Hard to add new constraints or change existing ones
4
©Silberschatz, Database System Concepts 1.7 Korth and Sudarshan
Purpose of Database Systems (Cont.)
Drawbacks of using file systems (cont.)
Atomicity of updates
Failures may leave database in an inconsistent state with partial
updates carried out
E.g. transfer of funds from one account to another should either
complete or not happen at all
Concurrent access by multiple users
Concurrent accessed needed for performance
Uncontrolled concurrent accesses can lead to inconsistencies
– E.g. two people reading a balance and updating it at the same
time
Security problems
Database systems offer solutions to all the above problems
Database System Concepts 1.8 ©Silberschatz, Korth and Sudarshan
Levels of Abstraction
Physical level describes how a record (e.g., customer) is stored.
Logical level: describes data stored in database, and the
relationships among the data.
type customer = record
name : string;
street : string;
city : integer;
end;
View level: application programs hide details of data types.
Views can also hide information (e.g., salary) for security
purposes.
5
©Silberschatz, Database System Concepts 1.9 Korth and Sudarshan
View of Data
An architecture for a database system
Database System Concepts 1.10 ©Silberschatz, Korth and Sudarshan
Instances and Schemas
Similar to types and variables in programming languages
Schema – the logical structure of the database
e.g., the database consists of information about a set of customers and
accounts and the relationship between them)
Analogous to type information of a variable in a program
Physical schema: database design at the physical level
Logical schema: database design at the logical level
Instance – the actual content of the database at a particular point in time
Analogous to the value of a variable
Physical Data Independence – the ability to modify the physical schema
without changing the logical schema
Applications depend on the logical schema
In general, the interfaces between the various levels and components should
be well defined so that changes in some parts do not seriously influence others.
6
©Silberschatz, Database System Concepts 1.11 Korth and Sudarshan
Data Models
A collection of tools for describing
data
data relationships
data semantics
data constraints
Entity-Relationship model
Relational model
Other models:
object-oriented model
semi-structured data models
Older models: network model and hierarchical model
Database System Concepts 1.12 ©Silberschatz, Korth and Sudarshan
Entity-Relationship Model
Example of schema in the entity-relationship model
7
©Silberschatz, Database System Concepts 1.13 Korth and Sudarshan
Entity Relationship Model (Cont.)
E-R model of real world
Entities (objects)
E.g. customers, accounts, bank branch
Relationships between entities
E.g. Account A-101 is held by customer Johnson
Relationship set depositor associates customers with accounts
Widely used for database design
Database design in E-R model usually converted to design in the
relational model (coming up next) which is used for storage and
processing
Database System Concepts 1.14 ©Silberschatz, Korth and Sudarshan
Relational Model
Example of tabular data in the relational model
customername
Customer-id
customerstreet
customercity
accountnumber
Johnson
Smith
Johnson
Jones
Smith
192-83-7465
019-28-3746
192-83-7465
321-12-3123
019-28-3746
Alma
North
Alma
Main
North
Palo Alto
Rye
Palo Alto
Harrison
Rye
A-101
A-215
A-201
A-217
A-201
Attributes
8
©Silberschatz, Database System Concepts 1.15 Korth and Sudarshan
A Sample Relational Database
Database System Concepts 1.16 ©Silberschatz, Korth and Sudarshan
Data Definition Language (DDL)
Specification notation for defining the database schema
E.g.
create table account (
account-number char(10),
balance integer)
DDL compiler generates a set of tables stored in a data
dictionary
Data dictionary contains metadata (i.e., data about data)
database schema
Data storage and definition language
language in which the storage structure and access methods
used by the database system are specified
Usually an extension of the data definition language
9
©Silberschatz, Database System Concepts 1.17 Korth and Sudarshan
Data Manipulation Language (DML)
Language for accessing and manipulating the data organized by
the appropriate data model
DML also known as query language
Two classes of languages
Procedural – user specifies what data is required and how to get
those data
Nonprocedural – user specifies what data is required without
specifying how to get those data
SQL is the most widely used query language
Database System Concepts 1.18 ©Silberschatz, Korth and Sudarshan
SQL
SQL: widely used non-procedural language
E.g. find the name of the customer with customer-id 192-83-7465
select customer.customer-name
from customer
where customer.customer-id = ‘192-83-7465’
E.g. find the balances of all accounts held by the customer with
customer-id 192-83-7465
select account.balance
from depositor, account
where depositor.customer-id = ‘192-83-7465’ and
depositor.account-number = account.account-number
Application programs generally access databases through one of
Language extensions to allow embedded SQL
Application program interface (e.g. ODBC/JDBC) which allow SQL
queries to be sent to a database
10
©Silberschatz, Database System Concepts 1.19 Korth and Sudarshan
Database Users
Users are differentiated by the way they expect to interact with
the system
Application programmers – interact with system through DML
calls
Sophisticated users – form requests in a database query
language
Specialized users – write specialized database applications that
do not fit into the traditional data processing framework
Naïve users – invoke one of the permanent application programs
that have been written previously
E.g. people accessing database over the web, bank tellers, clerical
staff
Database System Concepts 1.20 ©Silberschatz, Korth and Sudarshan
Database Administrator
Coordinates all the activities of the database system; the
database administrator has a good understanding of the
enterprise’s information resources and needs.
Database administrator's duties include:
Schema definition
Storage structure and access method definition
Schema and physical organization modification
Granting user authority to access the database
Specifying integrity constraints
Acting as liaison with users
Monitoring performance and responding to changes in
requirements
11
©Silberschatz, Database System Concepts 1.21 Korth and Sudarshan
Transaction Management
A transaction is a collection of operations that performs a single
logical function in a database application
Transaction-management component ensures that the database
remains in a consistent (correct) state despite system failures
(e.g., power failures and operating system crashes) and
transaction failures.
Concurrency-control manager controls the interaction among the
concurrent transactions, to ensure the consistency of the
database.
Database System Concepts 1.22 ©Silberschatz, Korth and Sudarshan
Storage Management
Storage manager is a program module that provides the
interface between the low-level data stored in the database and
the application programs and queries submitted to the system.
The storage manager is responsible to the following tasks:
interaction with the file manager
efficient storing, retrieving and updating of data
12
©Silberschatz, Database System Concepts 1.23 Korth and Sudarshan
Overall System Structure
Database System Concepts 1.24 ©Silberschatz, Korth and Sudarshan
Application Architectures
Two-tier architecture: E.g. client programs using ODBC/JDBC to
communicate with a database
Three-tier architecture: E.g. web-based applications, and
applications built using “middleware”