Relational Algebra is a theoretical language used to manipulate and query data in a relational database. It forms the foundation of SQL (Structured Query Language) and provides a formal way to describe how data should be retrieved or modified. Unlike SQL, which is declarative (you say what you want), relational algebra is procedural—it tells how to get the result.
Why Relational Algebra?
Relational Algebra is used because:
- It provides a formal framework for database operations.
- It helps in query optimization.
- It’s used as a base to design SQL queries.
- It ensures data is retrieved logically and efficiently.
Basic Concepts
A relation in relational algebra is simply a table. Each relation consists of tuples (rows) and attributes (columns). The operations in relational algebra are performed on one or more relations to return another relation as a result.
Types of Relational Algebra Operations
Relational algebra operations are divided into two types:
- Basic Operations (also called Fundamental)
- Derived Operations
1. Basic Relational Algebra Operations
These operations are essential for any relational query:
a. Selection (σ)
Selects rows from a table that satisfy a given condition.
Syntax:σ<condition>(Relation)
Example:σage > 18(Students)
→ selects all students older than 18.
b. Projection (π)
Chooses specific columns from a table.
Syntax:π<column1, column2>(Relation)
Example:πname, age(Students)
→ gets only name and age columns.
c. Union (∪)
Combines tuples from two relations (no duplicates).
Both relations must be union-compatible (same number and type of columns).
Syntax:Relation1 ∪ Relation2
Example:πname(Science) ∪ πname(Arts)
→ list of all unique student names in Science or Arts.
d. Set Difference (-)
Returns tuples from the first relation that are not in the second.
Syntax:Relation1 - Relation2
Example:πname(Science) - πname(Arts)
→ names only in Science, not in Arts.
e. Cartesian Product (×)
Combines every tuple of the first relation with every tuple of the second.
Used to join tables when there is no condition.
Syntax:Relation1 × Relation2
Example:Students × Courses
→ every student matched with every course.
f. Rename (ρ)
Used to rename a relation or its attributes.
Syntax:ρ(new_name, Relation)
Example:ρ(S, Students)
→ renames Students table as S.
2. Derived Relational Algebra Operations
These are built using the basic operations.
a. Join (⨝)
Combines related tuples from two relations using a condition.
Types of Join:
- Theta Join (θ): uses any condition (like
=
,<
,>
) - Equi Join: condition is equality (
=
) - Natural Join: automatically joins on common attributes
Example:Students ⨝ Students.course_id = Courses.id
→ joins both tables where course_id matches.
b. Intersection (∩)
Returns common tuples between two relations.
Syntax:Relation1 ∩ Relation2
Example:πname(Science) ∩ πname(Arts)
→ names common to both Science and Arts.
c. Division (÷)
Used when you want to find tuples in one relation that are associated with all tuples in another.
Example:
If you want students who are enrolled in all courses, use division.
Example for Better Understanding
Let’s consider two tables:
Students
RollNo | Name | Age |
---|---|---|
101 | Raj | 18 |
102 | Kirti | 20 |
103 | Aman | 17 |
Courses
RollNo | Course |
---|---|
101 | Math |
102 | English |
103 | Math |
Now:
σAge > 18(Students)
→ Raj and Kirti.πName(Students)
→ Raj, Kirti, Aman.Students ⨝ Students.RollNo = Courses.RollNo
→ shows which student takes which course.
Features of Relational Algebra
- Mathematically sound: based on set theory and logic.
- Flexible: Can represent complex queries.
- Non-duplicative: Relations are sets, so no duplicate rows.
- Foundation for SQL: Concepts directly map to SQL keywords.
Differences Between SQL and Relational Algebra
Feature | Relational Algebra | SQL |
---|---|---|
Type | Procedural | Declarative |
Syntax | Mathematical symbols | English-like commands |
Execution | Step-by-step operations | Just states the outcome |
Duplicates | Automatically removed | May allow duplicates |
Conclusion
Relational Algebra is a core concept in DBMS that helps understand how database queries are processed and optimized. Even though modern users write queries in SQL, database engines use relational algebra internally to plan and execute those queries efficiently. Understanding it helps developers write better queries and build faster application.