Introduction

CMU 15-445/645 (FALL 2021), DATABASE SYSTEMS

course website

online videos

This is a course about designing and implementing database system rather than how to use a database system !

If you want to learn how to develop apps with database, go to other courses.

Project

You will build your own database system over the semester, c++17 is used.

DataBase

Definition

What is a Database ?

Organized collection of inter-related data that models some aspect of the real-world.

What is a Database Management System ?

A software which allow applications to store and analyze infomation in a database.

What is a Data Model ?

A collection of concepts for describing the data in database.

  • Relational
  • Key/Value
  • Graph
  • Matrix

Relational Model

Artist(name,year,country)

name year country
Wu-Tang Clan 1992 USA
Notorious BIG 1992 USA
Ice Cube 1989 USA

Relation

A relation is an unordered set that contain the relationship of attribute that represent the entities

  • name,year,country are attributes used to represent the entity artist

  • N-ary relaion is the same with A table with N columns

  • The whole table, including the attributes, is a relation

Tuple

A set of attribute values(aka domain) in the relation

  • (Wu-Tang Clan,1992,USA) is one tuple

  • NULL is a special value which is a member of every domain

Primary key

A relation’s primary key uniquely identifies a single tuple

  • name is the primary key of relation name, year,country
  • some popular DBS automatic create primary key if user do not define one, for example AUTO_INCREATMENT in MySQL
Foreign key

A foreign key specifies that an attribute from one relation has to map to a tuple in another relation

join table

artist_id and album_id is both foregin key and primary key

DML(Data Manipulate Language)

Methods to store and retrieve information from a database.

Procedural
  • The query specify the strategy DBSM should use to find the result

  • How to retrieve the data, Relational Algebra

Declarative
  • The query specify only what data is wanted and do not kown how to find it

  • What is the data, Relational Calculus

Relational Algebra

Foundamental operations to retrieve and manipulate tuples in a relation, based on the idea of set algebra

  • Each operation takes 1 or more relations as input and produce 1 new relation
  • We can chain operations to compose complex operation

operations

Operations

Predication

$$
\sigma \longrightarrow \text{WHERE}
$$

choose a subset of the tuples from a relation that satisfies a selection predicate.

  • Predication acts as a filter to retain only tuples that fulfills its qualifying requirement
  • Can combine multiple predicates using conjunction/disjunction

predicate

1
SELECT * FROM R WHERE a_id='a2';

Projection

$$
\pi \longrightarrow \text{SELECT}
$$

generate a relation with tuples that contains only the specified attributes

  • can rearrage the order of attributes
  • can manipulate the values

projections

1
SELECT b_id-100,a_id FROM R WHERE a_id ='a2';

Union

same with set union

union

Intersection

same with set intersection

intersection

Difference

same with set difference

difference

Product

generate a relation that contain all possible combinations of tuples from the input relations

product

Join

$$
\Join
$$

generate a realtion that contains all tuples that are a combination of two tuples(one from each input relation) with a common value for one or more attributes

join