OODB

Object Oriented Database Systems

1.1 Modern Database Applications involve complex, specialised data structures

design (CAD), engineering examples:
- building layout
- mechanics
- electronics (VLSI)
- software production
- chemical structures

cartography (GIS: Geographic Information Systems)
- maps
- land registers

image processing
knowledge engineering
industrial production (CAM, CIM)
others, e.g. office automation

1.2 Characteristics of Design Applications

revolve around artifacts
objects built out of other objects
iterative
multiple levels of abstraction
tasks are shared among designers

1.3 What Object-Oriented Design is good at:

cf. OODB System Manifesto (Atkinson et al., 1989)

O1 complex objects
O2 object identity
O3 encapsulation
O4 types or classes
O5 inheritance
O6 overriding, overloading and late binding
O7 computational completeness
O8 extensibility

1.4 What Relational Databases are good at:

cf. OODB System Manifesto (Atkinson et al., 1989)

D1 Persistence
D2 Storage Management
D3 Concurrency
D4 Recovery
D5 Ad Hoc Query Facility

1.5 Requirements of Modern Database Applications

complex data structures: modeling, maintenance, access (O1, O2, O3, O6, D1)
extensible type system (O4, O5, O8)
navigation and query (D5)
high performance (O7, D2, D3, D4)

If only relational databases were used, then the complex data structures and the type system would have to be maintained by external programs. These would have to be specially written for each new application.

If object oriented programming languages (without databases) were used, then special procedures would have to be written to store, access, navigate and query the data. Storage management, concurrency and recovery mechanisms would have to be specially written for each new application.

1.6 Relational Database Issues:

Because the relational model is so simple, relational databases ...

are fast and efficient
have a "simple" formal model and semantics
support data independence (physical & logical)

But users cannot ...

define types (only fixed number of built-in types is available)
express nested relationships: e.g. ((Street, Number) City)
represent/manipulate complex entities as a single unit
sufficiently express data that does not map well to tables
write methods (database cannot represent behavior)

Users must explicitly ...

manage various types of relationships (e.g. is-a, association, aggregation)
define keys (integrity problems)
write procedures for versioning
write procedures for long duration transactions

Because SQL is not computationally complete ...

some computations are not possible, e.g. find all rooms near the location of room B2
transitive closure is not computable without recursion (parts explosion problem); recursive queries were only added to the standard with SQL:1999
some applications require external programming language
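
The parts explosion problem can be sketched in a general-purpose language as a traversal of a part-of graph. This is only an illustration: the `PARTS` data and the `explode` function are hypothetical names, not taken from any particular system.

```python
# Hypothetical bill of materials: part -> direct sub-parts.
PARTS = {
    "car": ["engine", "wheel"],
    "engine": ["piston", "valve"],
    "wheel": ["tyre", "rim"],
    "piston": [],
    "valve": [],
    "tyre": [],
    "rim": [],
}


def explode(part, parts=PARTS):
    """Return all direct and indirect sub-parts of `part` (transitive closure)."""
    seen = set()
    stack = list(parts.get(part, []))
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # already visited; this also guards against cycles
        seen.add(current)
        stack.extend(parts.get(current, []))
    return seen
```

The loop runs until no new parts are found, so the number of steps depends on the data. A fixed, non-recursive query cannot express this.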

1.7 Semantic Data Models

Entity Relationship Model
Extended Relational Model
Semantic Data Model
Functional Data Model
Object Oriented Model

2.1 Object identity (O2)

Entities may not have identifiers:

an entity may not have a unique name (e.g., literals versus objects)
an entity may have more than 1 unique name (e.g., references)
an entity may change its name over a period of time

2.2 Examples

1> a:= 5

2> b:= 5

3> c:= a

4> a:= 6

5> b:= "Hello World"

2.3 Examples - continued

equal values: a:= 5 and b:= 5
equal variable names but different values: a:= 5 and a:= 6
different variables: b:= 5 and b:= "Hello World"
c:= a can mean either a deep copy or a shallow copy, depending on the assignment semantics of the language
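
The different readings of an assignment such as c:= a can be sketched in Python, where plain assignment shares the object and copying is explicit. The variable names are chosen for illustration only. Other languages may give assignment a different meaning.

```python
import copy

a = [1, [2, 3]]          # a complex value: a list that contains a list

alias = a                # plain assignment: a second name for the same object
shallow = copy.copy(a)   # new outer list, but the inner list is shared
deep = copy.deepcopy(a)  # new outer list and new inner list

a[1].append(4)           # change the nested part through `a`

# Who observes the change to the nested list?
alias_sees_change = alias[1] == [2, 3, 4]
shallow_sees_change = shallow[1] == [2, 3, 4]
deep_sees_change = deep[1] == [2, 3, 4]
```

Only the deep copy is unaffected by the update, because it shares nothing with the original.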

2.4 Identity versus Value (or State)

different objects can have the same value
identity: objects are identical if they have the same identifier
equality: objects are equal if they have the same value(s)
identity implies equality, but equality does not imply identity
shallow equality: only the first level is compared by value, nested objects must be identical (same pointer)
deep equality: values are compared at all levels down
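
A minimal sketch of the three tests, assuming a toy `Obj` class whose state is a dictionary of fields. `Obj`, `identical`, `shallow_equal` and `deep_equal` are hypothetical names, and the deep test assumes acyclic object structures.

```python
class Obj:
    """Toy object: its state is a dict of named fields."""

    def __init__(self, **fields):
        self.fields = fields


def identical(x, y):
    # Same object, i.e. same identifier.
    return x is y


def shallow_equal(x, y):
    # First level only: atomic values must be equal, sub-objects must be identical.
    if x.fields.keys() != y.fields.keys():
        return False
    for key in x.fields:
        u, v = x.fields[key], y.fields[key]
        if isinstance(u, Obj) or isinstance(v, Obj):
            if u is not v:
                return False
        elif u != v:
            return False
    return True


def deep_equal(x, y):
    # All levels: sub-objects are compared recursively by value.
    if isinstance(x, Obj) and isinstance(y, Obj):
        return (x.fields.keys() == y.fields.keys()
                and all(deep_equal(x.fields[k], y.fields[k]) for k in x.fields))
    return x == y


suburb_a = Obj(name="Suburb A")
suburb_b = Obj(name="Suburb A")         # same value, different object
e1 = Obj(name="Ann", suburb=suburb_a)
e2 = Obj(name="Ann", suburb=suburb_a)   # shares suburb_a with e1
e3 = Obj(name="Ann", suburb=suburb_b)   # refers to an equal but distinct suburb
```

Here `e1` and `e2` are shallow equal, while `e1` and `e3` are only deep equal. None of the three are identical to each other.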

The Relational Model is Value-Based:

instances (rows) are identified by primary keys
keys are user-defined and can be changed by users
results in the need for referential integrity

2.5 Object Identity

object identity is independent of value and updates
no misleading references to objects
there is a function I that maps an object into its identity

Object Identifiers ...

are (system-wide) unique
are managed by the system
never change during object-lifetime
are never reused after object deletion
do not carry any semantics
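
These properties can be sketched with a toy in-memory store. `ObjectStore` and its methods are hypothetical and only illustrate the idea. A real system would also have to make the counter persistent.

```python
import itertools


class ObjectStore:
    """Toy store that hands out object identifiers (OIDs)."""

    def __init__(self):
        # Monotonically increasing counter: never reset, so OIDs are never reused.
        self._next_oid = itertools.count(1)
        self._objects = {}  # OID -> current value (state)

    def create(self, value):
        oid = next(self._next_oid)  # managed by the system, carries no meaning
        self._objects[oid] = value
        return oid

    def update(self, oid, value):
        # The value changes, the identifier does not.
        self._objects[oid] = value

    def delete(self, oid):
        del self._objects[oid]

    def get(self, oid):
        return self._objects[oid]
```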

2.6 Object Sharing

Example:

employee database - two employees live in the same suburb
suburb: equal value or identical object?

In Relational Databases:

reference by foreign key for entity instances
reference by value for attributes

Object Identity implies:

different structures can refer to the same object
avoids ambiguity and redundancy
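
The suburb example can be sketched as follows. The class and attribute names are assumptions made for illustration.

```python
class Suburb:
    def __init__(self, name, postcode):
        self.name = name
        self.postcode = postcode


class Employee:
    def __init__(self, name, suburb):
        self.name = name
        self.suburb = suburb  # a reference to a Suburb object, not a copy


# Identical object: both employees refer to the same suburb.
shared = Suburb("Northside", "1000")
alice = Employee("Alice", shared)
bob = Employee("Bob", shared)

# Equal value: this employee carries her own copy of the suburb data.
carol = Employee("Carol", Suburb("Northside", "1000"))

shared.postcode = "1001"  # one update, visible through every reference
```

After the update Alice and Bob see the new postcode, while Carol's copy is stale. This is the redundancy that sharing avoids.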

2.7 Advantages of Object Identity:

facilitates object sharing
users do not need to worry about it; it is managed by the system
objects can still have additional user-controlled names; these names can be different in different applications and can be changed freely
semantics of retrieval and manipulation are clear
consistency rules can be easily specified

But the system has more to do:

operations for object assignment, deep and shallow copy needed
tests for equality needed
complex objects can be graphs: need to be managed by the system
the semantics of the system is more complicated

3.1 Complex Objects (O1)

Complex objects are built from simple objects using constructors:

data abstraction: types
- a type defines a representation and a set of operations
- representation = any other type
- operation = program (method) that can access the representation

type constructors: tuple, set, array
- orthogonality of objects and constructors: constructors can be used for any object
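
A sketch of how the tuple, set and array constructors can be nested freely, here with Python dataclasses, frozensets and tuples standing in for the constructors. The `Point`, `Room` and `Floor` types are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class Point:  # tuple constructor: a record with named fields
    x: float
    y: float


@dataclass(frozen=True)
class Room:  # tuple that contains an array of tuples
    name: str
    outline: Tuple[Point, ...]


@dataclass(frozen=True)
class Floor:  # tuple that contains a set of tuples
    level: int
    rooms: FrozenSet[Room]


b2 = Room("B2", (Point(0, 0), Point(4, 0), Point(4, 3), Point(0, 3)))
b3 = Room("B3", (Point(4, 0), Point(8, 0), Point(8, 3), Point(4, 3)))
ground = Floor(0, frozenset({b2, b3}))
building = [ground]  # array of floors
```

Each constructor is applied to the result of another one, which is what orthogonality means here.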

3.2 Object Description

Attributes vs. Properties

Properties => Unidirectional
Attributes => Bidirectional
- Can be modeled as a pair of properties with inverse.
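
A bidirectional link modeled as a pair of properties with an inverse might look like the sketch below. `Department`, `Employee` and `assign` are hypothetical names. In an OODB the system, not the application, would normally keep the two directions consistent.

```python
class Department:
    def __init__(self, name):
        self.name = name
        self.members = set()  # property: department -> employees


class Employee:
    def __init__(self, name):
        self.name = name
        self.department = None  # inverse property: employee -> department


def assign(employee, department):
    """Set employee.department and keep the inverse, department.members, in step."""
    if employee.department is not None:
        employee.department.members.discard(employee)
    employee.department = department
    if department is not None:
        department.members.add(employee)
```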

Attribute Values

simple types (literals, strings, integers) versus abstract data types
single-valued versus multi-valued (set-values)

Attribute Domain

set of values of similar type

Class Attributes

associate a value with a type/class which applies to class as a whole, e.g. minimum salary of class employee

3.3 Association and Aggregation

Association: is-associated-to relationship

associate objects from several independent classes
when an association instance is deleted, the participating objects continue to exist

Aggregation: has-attribute and is-part-of relationships

building composite objects from their component objects
e.g. aggregate attribute values of an object to form the whole object
e.g. aggregate objects that are related by a particular relationship instance into a higher level aggregate object
if an aggregate instance is deleted the component objects are also deleted
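
The different deletion semantics of association and aggregation can be sketched with a toy store. The `Database` class and its methods are hypothetical, and objects are represented by their names for brevity.

```python
class Database:
    """Toy store with aggregation (is-part-of) and association links."""

    def __init__(self):
        self.objects = set()
        self.parts = {}            # whole -> set of component objects
        self.associations = set()  # pairs of associated objects

    def create(self, name, part_of=None):
        self.objects.add(name)
        self.parts.setdefault(name, set())
        if part_of is not None:
            self.parts[part_of].add(name)
        return name

    def associate(self, a, b):
        self.associations.add((a, b))

    def dissociate(self, a, b):
        # Deleting an association instance leaves both participants alive.
        self.associations.discard((a, b))

    def delete(self, name):
        # Deleting an aggregate cascades to its components.
        for part in list(self.parts.pop(name, ())):
            self.delete(part)
        self.objects.discard(name)
        for components in self.parts.values():
            components.discard(name)
        self.associations = {(a, b) for (a, b) in self.associations
                             if name not in (a, b)}
```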

3.4 Operations for Complex Objects

retrieve object and subobjects
- subject to retrieval predicates
- restricted to attributes/components of interest

create and delete objects/subobjects (structure-building operations)
- deletion with/without components

copy objects

navigation within object structure

3.5 Sharing Revisited

object identity facilitates object sharing
objects have independent existence
but parts of objects may not have independent existence
dependent parts only exist while container exists

Sharing parts is dangerous if parts do not have independent existence!

	independent (own existence)	dependent (no own existence)
sharable	e.g. module, class	e.g. public method
not sharable	e.g. private class	e.g. private method

4.1 Encapsulation (O3)

2 Levels in Object-Oriented Programming:

1) specification is visible for application programs
- interface describes allowable operations

2) implementation is encapsulated, hidden
- data part (state, values, attributes)
- procedural part (operations, methods)

-> encapsulation, logical data independence

application programs are protected from implementation details
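
A sketch of the two levels: two hypothetical employee classes offer the same specification (`yearly_salary`, `give_raise`) but hide different representations. Python marks hidden members with a leading underscore by convention only.

```python
class EmployeeV1:
    """Implementation 1: stores the yearly salary."""

    def __init__(self, name, yearly_salary):
        self.name = name
        self._yearly = yearly_salary  # hidden data part

    def yearly_salary(self):
        return self._yearly

    def give_raise(self, percent):
        self._yearly += self._yearly * percent / 100


class EmployeeV2:
    """Implementation 2: same interface, but stores the monthly salary."""

    def __init__(self, name, yearly_salary):
        self.name = name
        self._monthly = yearly_salary / 12  # different hidden representation

    def yearly_salary(self):
        return self._monthly * 12

    def give_raise(self, percent):
        self._monthly += self._monthly * percent / 100


def total_payroll(employees):
    # Application code: uses the specification only.
    return sum(e.yearly_salary() for e in employees)
```

`total_payroll` works unchanged with either implementation, because it never touches the hidden data part.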

2 Levels in Relational Databases:

1) data
2) program (ad hoc query language + programming language)

-> data independent from programming

allows ad hoc queries
but table-specific methods cannot be defined

4.2 Encapsulation: Pros and Cons

Pros:

extensibility, software engineering

Cons:

ad hoc queries and similar operations that bypass the interface are not allowed (but not all ad hoc queries raise maintainability issues, so there is no reason to prohibit all of them)
query optimization is harder, and there is a lack of an underlying theory

4.3 Overriding, Overloading and Late Binding (O6)

overriding: operations written at the top level are redefined by subclasses
overloading: different programs under the same name; the context determines which one is used
late binding: the program is selected at run time, not at compile time

-> hides complexity from application programs
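
A sketch of the three mechanisms in Python. The shape classes and the `display` function are hypothetical. Python has no built-in overloading, so `functools.singledispatch` is used here as the closest standard equivalent.

```python
from functools import singledispatch


class Shape:
    def area(self):
        raise NotImplementedError

    def describe(self):
        # Late binding: which area() runs is decided at run time
        # from the class of the receiving object.
        return f"{type(self).__name__} with area {self.area()}"


class Rectangle(Shape):
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def area(self):  # overrides Shape.area
        return self.width * self.height


class Square(Rectangle):
    def __init__(self, side):
        super().__init__(side, side)


@singledispatch
def display(value):  # overloading: one name, several programs
    return f"value {value!r}"


@display.register(int)
def _(value):
    return f"integer {value}"


@display.register(Shape)
def _(value):
    return value.describe()
```

The application program only calls `describe` or `display`. Which code actually runs depends on the object it passes in.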

4.4 Computational Completeness (O7)

programming languages are usually complete
SQL is not complete, but SQL + programming language is complete
different from "resource completeness"

4.5 Extensibility (O8)

users can define their own types, methods, etc
no distinction in usage between user-defined and system types
there may be a performance difference between user-defined and system types

5.1 Types or Classes (O4)

Types (e.g. C++, Java)

summarize common features of a set of objects
type-checking at compile-time for consistency

Classes (e.g. Smalltalk)

similar to types but can be manipulated at run-time
object factory (for creating new objects)
object warehouse (extension = all instances of a class)
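
The factory and warehouse roles of a class can be sketched in Python, where classes are themselves run-time objects. The `Employee` class, its `extension` list and the `select` method are assumptions made for illustration.

```python
class Employee:
    extension = []  # the "warehouse": all instances created so far

    def __init__(self, name, salary):
        self.name = name
        self.salary = salary
        Employee.extension.append(self)  # the factory registers each new object

    @classmethod
    def select(cls, predicate):
        """Query the extension of the class."""
        return [e for e in cls.extension if predicate(e)]


ann = Employee("Ann", 50000)
bob = Employee("Bob", 40000)

# Classes can be manipulated at run time: create a subclass dynamically.
Manager = type("Manager", (Employee,), {})
cy = Manager("Cy", 90000)

well_paid = Employee.select(lambda e: e.salary > 45000)
```

The manager also appears in the extension of `Employee`, since an object is a member of its class and of its superclasses.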

5.2 Natural Types versus Role Types

natural types: e.g. gender, species

object belongs to at most one class of each natural type
"classification"

role types: e.g. role of employee, customer, family relationships

object can have different roles, simultaneously or at different times

5.3 Classes/Types: Pros and Cons

Pros:

simplification, modularization, encapsulation
operations/attributes can apply to instance or to all class members simultaneously
user-definable

Cons:

role types imply that objects can be members of different classes
class library: large vocabulary of classes and methods

6.1 Class Hierarchy

Hierarchy of Classes, Subclasses and Superclasses

e.g. programmer -> computer scientist -> employee
is-a relationship between subclass/class
member-of relationship: objects are members of a class and its superclasses

Specialisation/Generalisation

facilitates incremental design
specialisation: top-down conceptual refinement
- separation based on differentiating features
generalisation: bottom-up conceptual synthesis
- grouping based on suppressing differences
normally a combination of both specialisation and generalisation processes is employed

6.2 Multiple and Flexible Hierarchies

tree hierarchy: each class has one immediate superclass
poly-hierarchy: class can have several immediate superclasses
- because of role types: class may belong to different superclasses
- different contexts may require different hierarchies
type lattice: a unique smallest common superclass and a unique largest common subclass exist for each set of classes (i.e., multiple paths exist, but it can be calculated where they intersect)

6.3 Inheritance (O5)

Attribute Inheritance

object has specific class attributes
object inherits attributes from superclasses

Complexity of Inheritance

simple inheritance (in tree hierarchies)
multiple inheritance (in type lattices and poly-hierarchies)
- can lead to name conflicts
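
A name conflict under multiple inheritance, sketched in Python. The class names are hypothetical. Python resolves the conflict silently through its method resolution order, whereas other languages may reject the class or require explicit renaming.

```python
class Employee:
    def identifier(self):
        return "employee number"


class Student:
    def identifier(self):
        return "student number"


class TeachingAssistant(Employee, Student):
    # Inherits identifier() from both superclasses: a name conflict.
    # Python picks Employee.identifier because Employee is listed first.
    pass


class ExplicitTeachingAssistant(Employee, Student):
    def identifier(self):
        # Resolve the conflict explicitly by naming the superclass.
        return (Employee.identifier(self), Student.identifier(self))
```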

Degree of Inheritance

selective inheritance
default inheritance: can be overridden at lower levels

6.4 Types of Inheritance

substitution inheritance
- based on behavior not values
- an instance of subtype A can be used in any context in which an instance of supertype B is expected

inclusion inheritance
- based on classification structure
- an instance of A is also an instance of B

constraint inheritance
- subcase of inclusion inheritance
- instance of A has the same operations and fields as instance of B
- but: instance of A fulfills certain further constraints
- e.g. teenager -> person

specialization inheritance
- subcase of inclusion inheritance
- but: instance of A has some extra fields compared to instance of B
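
Constraint and specialization inheritance can be contrasted in a short sketch. The classes are hypothetical, and the age range for a teenager (13 to 19) is an assumption made for the example.

```python
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age


class Teenager(Person):
    """Constraint inheritance: same fields as Person, plus a constraint."""

    def __init__(self, name, age):
        if not 13 <= age <= 19:
            raise ValueError("a teenager must be between 13 and 19")
        super().__init__(name, age)


class Employee(Person):
    """Specialization inheritance: one extra field compared to Person."""

    def __init__(self, name, age, salary):
        super().__init__(name, age)
        self.salary = salary


def introduce(person):
    # Written for Person; any instance of a subclass can be substituted.
    return f"{person.name} is {person.age}"
```

Both subclasses are cases of inclusion inheritance: every `Teenager` and every `Employee` is also a `Person`.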

6.5 Pros and Cons of Inheritance

Pros:

code reuse
additional semantics are represented
modeling discipline

Cons:

ambiguity and name conflicts in multiple inheritance
maintenance

6.6 OO Typing System

substitutability
static type checking
mutability
subtyping by specialisation

A type system cannot be built with all 4 properties; any 3 can be chosen, whichever are the most important

7.1 Comparison

	Pros, Why is it useful	Cons, Why is it difficult	How is it implemented in Object-Oriented Programming	How is it implemented in Relational Databases
O1 complex objects	modularity, object sharing	operations needed	attributes, constructors	only system-defined types (e.g. date)
O2 object identity	consistency	system must maintain it	object ID	keys, referential integrity
O3 encapsulation	extensibility, software engineering	optimization, no ad hoc queries	implementation, specification	data separate from program
O4 types or classes	modularization, user-definable	role types, maintenance	is implemented	tables, no methods, no pointers
O5 inheritance	code reuse	name conflicts, maintenance	attribute and method inheritance	---
O6 overriding, overloading and late binding	simplification for user	no compile time type checking	via class hierarchy	---
O7 computational completeness	Church-Turing thesis	---	is complete	needs programming language
O8 extensibility	hide complexity	optimization	user defined types	---
D1 persistence	easier for programmer	difficult for complex structures, some data must be transient	---	is implemented
D2 storage management	easier for programmer	---	memory allocation	is implemented
D3 concurrency	multiple users	many possible application programs	threads	is implemented
D4 recovery	stability, security	many possible application programs	---	is implemented
D5 ad hoc query facility	direct data access	difficult for complex structures	---	is implemented

7.2 Summary

Relational model and OO model have conflicting advantages/disadvantages
There may never be a single widely accepted OODB model (as relational algebra is for relational databases)
Different approaches (OO, Relational DB or OODB) may be necessary for different applications