Relational Model
It is primary data model which is used for designing commercial databases. It was introduced by E.F.Codd of IBM.
This model is based on set theory relation concept. In this data and relationship among data are represented through relations (tables – arrangement of information in rows and columns). It is implementation data model which is used to implement database.
Concepts of relational data model
1. Relation, tuple and attribute
2. Keys
3. Query language
4. Relational algebra
5. Relational calculus
6. E.F.Codd rules
Relation, tuple and attribute
Relation
A relation is table in which data is arranged in rows and columns.
The properties of a relation are
a) All items in any given column of a table are of the same kind.
b) For any given row, each column must have an atomic value
c) Each row is unique
d) The ordering of rows within a relation is insignificant
e) The columns are assigned distinct names
f) The ordering of columns within a relation is also insignificant
Tuple
A row in relational data model is called tuple.
The number of tuples in a relation is called cardinality of the relation.
Attribute
A column in relational data model is called attribute.
The number of attributes in a relation is called degree of the relation.
Example – Student Relation
Roll no
|
Name
|
Address
|
DOB
|
1
|
Amit
|
Patna
|
16-12-1990
|
2
|
Sumit
|
Ranchi
|
17-08-1995
|
3
|
Premchand
|
Garhwa
|
02-02-1993
|
Cardinality=3; Degree=4
Keys
a) Super Key
b) Candidate Key
c) Primary Key
d) Foreign Key
Super Key
It is a set of one or more attributes that, taken collectively, allows us to identify uniquely a tuple in a relation.
Example – rollno, rollno+name, emailid, emailid + name, mobileno, aadharno, mobileno+aadharno an so on
Candidate key
The super keys for which no proper subset is a super key are called candidate keys. It is a super key with minimum possible number of attributes.
Example – rollno, mobileno, aadharno, emailed
Primary key
A candidate key chosen by the database designer for the unique identification of tuples within a relation is called primary key.
Example rollno in student, custcode in customer, deptno in department, empno in employee, accno in account table
We can have only one primary key in a table. Primary key is unique as well as mandatory. A primary key composed of two or more columns is called composite key.
The candidate keys other than primary key are called alternate key.
Foreign Key
It is used to establish parent child or master detail relationship between two relations.
It is a set of one or more attributes which references primary key of some other relation for its value.
Example – rollno is foreign key in marksheet table, accno is foreign key in transaction table
NOTE - Record must be entered in the master (parent) table before detail (child) table and record must be deleted in detail table before master table.
Query Language
It is a language which is used to extract information from the database.
Types of query language
1. Procedural Query language – In this a user instructs the system to perform a sequence of operations on the database to extract data. The user specifies “how data is to be extracted from the database?” Example – relational algebra
2. Non Procedural Query language – In this the user describes the required information to be extracted without giving a specific procedure for extracting the information. The user specifies “what data is to be extracted from the database?” Example – relational calculus
Relational Algebra
It consists of operations that can be performed on relation or table.
Relational Algebra Operations
1. Project (Π)
2. Select (σ)
3. Rename (ρ)
4. Union (U)
5. Intersection (∩)
6. Set Difference (-)
7. Cartesian Product (X)
8. Join (|><|)
9. Division (÷)
Project (Π)
It is unary operation. It is used to select attributes from a relation.
Example
Πrollno,name(Student)
Select (σ)
It is also unary operation. It selects tuples from a relation that satisfy a given condition.
Example
1. σ rollno=1(student)
2. σaddress=”ranchi”(student)
3. σdob>=”01-01-1990”(student)
4. σdob>=”01-01-1990” Λ dob<=”31-12-1990”(student)
5. Πrollno,name (σ~(address=”ranchi”)(student))
Note – symbols V stands for OR, ~ stands for NOT , Λ stands for AND operation
Rename (ρ)
It is a unary operation which is used to rename a relation or attributes of a relation.
Example
1. ρ stud(Student)
2. ρ stud(roll,sname,saddress,sdob)(Student)
Union (U)
It is binary operation. A U B is valid if the following two conditions hold:
a) No. of attributes in relation A= No. of attributes in relation B
b) Domain of ith attribute in relation A is same as domain of ith in relation B, for all i.
AUB returns rows which are present either in relation A or relation B or both. The duplicate values are presented only once.
Example
1. Πrollno,name (student) U Πcustcode,custname (customer)
2. Πrollno,name,address (σ(address=”ranchi”)(student)) U Πrollno,name,address (σ(address=”patna”)(student))
Intersection (∩)
It is binary operation. A∩B is valid if the following two conditions hold:
c) No. of attributes in relation A= No. of attributes in relation B
d) Domain of ith attribute in relation A is same as domain of ith in relation B, for all i.
A∩B returns rows which are present in both the relation A and B. The duplicate values are presented only once.
Example
Πrollno,name (student) ∩ Πcustcode,custname (customer)
Πrollno,name,address (σ(address=”ranchi”)(student)) ∩ Πrollno,name,address (σ(dob>=”01-01-1990”)(student))
Set Difference (-)
It is binary operation. A-B is valid if the following two conditions hold:
a) No. of attributes in relation A= No. of attributes in relation B
b) Domain of ith attribute in relation A is same as domain of ith in relation B, for all i.
A-B returns rows which are present in the relation A, but not in B. The duplicate values are presented only once.
Example
Πrollno,name (student) - Πcustcode,custname (customer)
Πrollno,name,address (σ(address=”ranchi”)(student)) - Πrollno,name,address (σ(σ(dob>=”01-01-1990”)(student))
Cartesian Product (X)
It is a binary operation. It combines each and every tuple of relation A to each and every tuple of relation B.
# attributes (A X B)= # attributes(A) + # attributes(B)
# tuples (A X B)= # tuples (A) X # tuples(B)
Where symbol # denotes number of
Example
Employee
Empno
|
Name
|
Salary
|
Deptno
|
101
|
Mark
|
20000
|
D1
|
102
|
Smith
|
30000
|
D1
|
103
|
John
|
10000
|
D2
|
Department
Deptno
|
Dname
|
D1
|
HRD
|
D2
|
IT
|
Employee X Department
Empno
|
Name
|
Salary
|
Deptno
|
Deptno
|
DName
|
101
|
Mark
|
20000
|
D1
|
D1
|
HRD
|
101
|
Mark
|
20000
|
D1
|
D2
|
IT
|
102
|
Smith
|
30000
|
D1
|
D1
|
HRD
|
102
|
Smith
|
30000
|
D1
|
D2
|
IT
|
103
|
John
|
10000
|
D2
|
D1
|
HRD
|
103
|
John
|
10000
|
D2
|
D2
|
IT
|
Join ( |><| )
It is special form of Cartesian or cross product of two relations. It is combination of Cartesian product and selection operation. It performs Cartesian product first and then selection operation.
Types of join
1. Θ Join (theta join)
2. Equi join
3. Natural join
4. Outer Join
Θ Join (theta join)
R |><| Θ S = σ Θ(R X S), Θ is condition
Example
Employee |><| salary>10000 Department
Empno
|
Name
|
Salary
|
Deptno
|
Deptno
|
DName
|
101
|
Mark
|
20000
|
D1
|
D1
|
HRD
|
101
|
Mark
|
20000
|
D1
|
D2
|
IT
|
102
|
Smith
|
30000
|
D1
|
D1
|
HRD
|
102
|
Smith
|
30000
|
D1
|
D2
|
IT
|
Equi Join
In this rows are joined on the basis of values of a common attribute between the two relations. It enforces equality of common attribute in both the relation.
Employee |><| employee.deptno=department.deptno Department
Empno
|
Name
|
Salary
|
Deptno
|
Deptno
|
DName
|
101
|
Mark
|
20000
|
D1
|
D1
|
HRD
|
102
|
Smith
|
30000
|
D1
|
D1
|
HRD
|
103
|
John
|
10000
|
D2
|
D2
|
IT
|
Natural Join
It is most common and general form of join. It is same as equi join, but in this common attribute appears only once in the output. It performs a Cartesian product of its two relations then performs a selection forcing equality on those attributes that appear in both relations and finally removes duplicate attributes.
Employee |><| Department
Empno
|
Name
|
Salary
|
Deptno
|
DName
|
101
|
Mark
|
20000
|
D1
|
HRD
|
102
|
Smith
|
30000
|
D1
|
HRD
|
103
|
John
|
10000
|
D2
|
IT
|
Outer Join
It is an extension of the natural join operation. It deals with missing information. There are 3 forms of outer join
1. Left outer join ]><|
2. Right outer join |><[
3. Full outer join ]><[
Left outer join ]><|
It takes all the tuples in the left relation even if they don’t match with any tuple in the right relation. The tuples that don’t match with any tuple in the right relation is filled with “null” values for all other attributes from the right relation.
Example
Employee
Empno
|
Name
|
Salary
|
Deptno
|
101
|
Mark
|
20000
|
D1
|
102
|
Smith
|
30000
|
D1
|
103
|
john
|
10000
|
D2
|
104
|
David
|
15000
|
D4
|
Department
Deptno
|
Dname
|
D1
|
HRD
|
D2
|
IT
|
D3
|
MKTG
|
Employee ]><| Department
Empno
|
Name
|
Salary
|
Deptno
|
DName
|
101
|
Mark
|
20000
|
D1
|
HRD
|
102
|
Smith
|
30000
|
D1
|
HRD
|
103
|
John
|
10000
|
D2
|
IT
|
104
|
David
|
15000
|
D4
|
Null
|
Right outer join |><[
It takes all the tuples in the right relation even if they don’t match with any tuple in the left relation. The tuples that don’t match with any tuple in the left relation is filled with “null” values for all other attributes from the left relation.
Employee |><[ Department
Empno
|
Name
|
Salary
|
Deptno
|
DName
|
101
|
Mark
|
20000
|
D1
|
HRD
|
102
|
Smith
|
30000
|
D1
|
HRD
|
103
|
John
|
10000
|
D2
|
IT
|
Null
|
Null
|
Null
|
D3
|
MKTG
|
Full outer join ]><[
It takes all the tuples from both the relations even if they don’t match with any tuple in the other relation.
It produces union of left and right outer join.
Employee ]><[ Department
Empno
|
Name
|
Salary
|
Deptno
|
DName
|
101
|
Mark
|
20000
|
D1
|
HRD
|
102
|
Smith
|
30000
|
D1
|
HRD
|
103
|
John
|
10000
|
D2
|
IT
|
Null
|
Null
|
Null
|
D3
|
MKTG
|
104
|
David
|
15000
|
D4
|
Null
|
Division (÷)
It is binary operation. It is used in cases where the query involves “for all” clause or phrase. It is used to denote the condition where a given relation “R” is to be split based on its association with every tuples in the other relation “S”.
Let the set of attributes in R be X and set of attribute in S be Y.
Then attributes in R÷S , Z=X-Y
and tuples in R÷S, t Є R÷S if for all tuples t in R÷S association (t,s) subset of R
Example
EmpProject
EmpName
|
Project name
|
Harsh
|
ERP
|
Gaurav
|
ERP
|
Ashish
|
SCM
|
Harsh
|
SCM
|
Amit
|
CRM
|
Gaurav
|
SCM
|
Manish
|
ERP
|
Harsh
|
CRM
|
Gaurav
|
CRM
|
Amit
|
SCM
|
Manish
|
SCM
|
Manish
|
CRM
|
Ankit
|
ERP
|
Mohit
|
CRM
|
HarshProject
Project name
|
ERP
|
CRM
|
SCM
|
EmpProject ÷ HarshProject
EmpName
|
Harsh
|
Gaurav
|
Manish
|
It returns name of employee who have worked in all the projects in which Harsh has worked.
Relational calculus
It is a non procedural query language. It is used to extract data from relation. In this we don’t have to specify the procedure. We just have to specify data to be extracted.
It is of two types
1. Tuple relation calculus
2. Domain relation calculus
Tuple relation calculus
A query in tuple relational calculus is expressed as
{ t|P(t)}, where t denotes tuple variable and p denotes predicate or condition.
Example
Assume Employee table (empno,name,address,salary)
1. { t|t Є Employee} – It returns all the rows of relation employee
2. { t|t Є Employee Λ t[address]=”ranchi”} – It returns the details of employee whose address is ranchi
3. { t| there exists s Є Employee (t[empno]= t[empno] Λ t[address]=s[address]} – It returns the empno and address of all the employees
4. { t| there exists s Є Employee (t[empno]= t[empno] Λ t[address]=s[address] Λ t[salary]>20000} – It returns the empno and address of all the employees whose salary > 20000
Domain relation calculus
A query in tuple relational calculus is expressed as
{ <x1,x2,x3….xn>|P(x1,x2,x3….xn)}, where x1,x2,x3….xn denote domain variables and p denotes predicate or condition.
Example
Assume Employee table (empno,name,address,salary)
1. { < x1,x2,x3,x4 >| <x1,x2,x3,x4> Є Employee} – It returns all the rows of relation employee
2. { < x1,x2,x3,x4 >| <x1,x2,x3,x4> Є Employee Λ x3=”ranchi”} – It returns the details of employee whose address is ranchi
3. { < x1,x3>| there exists < x2,x4> (<x1,x2,x3,x4>Є Employee)} – It returns the empno and address of all the employees
4. { < x1,x3>| there exists < x2,x4> (<x1,x2,x3,x4>Є Employee Λ x4>20000 )} – It returns the empno and address of all the employees whose salary > 20000
E.F CODD Rules
E.F.Codd rules were given by E.F.Codd, a pioneer of relational data model. There are a set of 13 rules (0 to 12) which describe what is required from a database management system in order for it to be considered RDBMS.
1. Rule No 0 – Foundation Rule – DBMS must manage database entirely through its relational capabilities only.
2. Rule No 1- Information Rule – All information in a Relational database is represented explicitly at the logical level in exactly one way by values in tables.
3. Rule No 2 – Guaranteed Access Rule – Each and every data in a relational database is guaranteed to be logically accessible by using a combination of table name, primary key value and column name.
4. Rule No 3 – Systematic Treatment of null – Null values (not blank or 0) are supported for representing missing information or inapplicable information in a systematic way, independent of data type.
5. Rule No 4 – Dynamic Online Catalog – Database description is represented at the logical level in the same way as ordinary data.
6. Rule No 5 – Data Language – A relational system may support several language for data definition, data manipulation, data control, data query, transaction control, integrity constraint definition, session control and so on.
7. Rule No 6 – View Updating Rule – All view that are theoretically updatable are also updatable by the system.
8. Rule No 7 – High Level insert, update and delete – The capability of handling a relation as single operand applies to not only data retrieval but also to insert, update and delete operation.
9. Rule No 8 – Physical Data Independence – Application programs and terminal activities remain logically unimpaired whenever any changes are made in either storage representation or access methods.
10. Rule No 9 – Logical Data Independence - Application programs and terminal activities remain logically unimpaired whenever any changes are made in logical structure of the database.
11. Rule No 10 – Integrity Independence – Integrity constraints specific to a particular relation must be definable in the relational data sublanguage and storable in the catalog, not in the application program.
12. Rule No 11 – Distribution Independence – The end user must not be able to see that data is distributed over various locations. Users should always get the impression that the data is located at one single site.
13. Rule No 12 – Non Subversion Rule – If a relational system has a low level language, that low level language can’t be used to subvert or bypass the integrity rules and constraints expressed in higher level relational language.
Conversion of ER Data Model to Relation Data Model
In relation data model everything has to be represented with relations only. We use foreign key to implement all kinds of relationship between relations