Data Warehouses
1.1 Business Intelligence (BI)
- Decision Support Systems (DSS)
- Data Warehouses (DW)
- Online Transaction Processing (OLTP): traditional relational db
- Online Analytical Processing (OLAP): data analysis for BI
- Extraction, Transportation, and Transformation (ETT) solutions
1.2 What is a Data Warehouse?
From Oracle Manual:
"A data warehouse is a relational database that is designed for query and
analysis rather than transaction processing. It usually contains historical
data that is derived from transaction data, but it can include data from
other sources. It separates analysis workload from transaction workload
and enables an organisation to consolidate data from several sources."
1.3 Inmon's characteristics of data warehouses
- Subject Oriented (built for specific focus, specific questions)
- Integrated (from heterogeneous sources)
- Nonvolatile (data does not changed after warehouse is created)
- Time Variant (comparisons between data at different times)
1.4 Data warehouses are materialised views
Data warehouses typically have
- more indexes
- fewer joins
- more derived data and aggregates
than OLTP databases.
Furthermore, data warehouses need not be normalised!
1.5 Features of data warehouses
Data warehouses are
- expected to be optimised for ad-hoc queries
- updated via ETT at scheduled times (there is no direct data entry)
- expected to process large amounts of data (VLDB)
- store historical data
2.1 Data Warehousing Schemas
- Star schema (one fact table with several dimension tables).
- Snowflake schema (in third normal form)
2.2 OLAP Operations
- Drill Across or Drill-Down (navigating along dimension hierarchy)
- Roll-Up (aggregating data at next level in the same dimension)
- Slice (isolating one layer)
- Dice (forming a subset of a cube)
- Pivoting (switching columns and rows in a table or cube)