The Entity-Relationship (ER) Model is a high-level conceptual data model used to describe data requirements during database design. It captures the structure of data in terms of entities, their attributes, and relationships among entities — independent of any particular DBMS.
Key idea: Design at the conceptual level first (what data exists, how it relates), then map to the logical level (relational schema).
An entity is a real-world “thing” that can be distinctly identified (e.g., a specific student, a specific course).
An entity set is a collection of similar entities sharing the same structure (e.g., all Students, all Courses).
| Strong Entity | Weak Entity | |
|---|---|---|
| Has its own key? | Yes | No — identified by owner’s key + partial key |
| Existence dependent? | Independent | Depends on owner entity |
| Notation | Rectangle | Double rectangle |
| Relationship to owner | — | Identifying relationship (double diamond) |
Example: ORDER_ITEM is a weak entity — it cannot be uniquely identified without knowing which ORDER it belongs to. The line_number alone is not unique across all orders, but (order_id, line_number) is.
The attribute(s) that, combined with the owner’s key, uniquely identify a weak entity instance. Shown with a dashed underline in ER diagrams.
| Type | Description | Example |
|---|---|---|
| Simple | Atomic, cannot be subdivided | age, gender |
| Composite | Can be divided into sub-parts | name → {first_name, last_name} |
| Single-valued | One value per entity | SSN |
| Multi-valued | Multiple values per entity | phone_numbers, degrees_held |
| Derived | Computed from other attributes | age derived from birth_date |
| Null | Attribute doesn’t apply or is unknown | optional phone number |
address → {street_address, city, state, zip_code}
street_address → {house_number, street_name}
In a relational mapping, composite attributes are flattened — only the leaf-level simple attributes are stored.
Stored in a separate relation with a foreign key to the owning entity. Example: EMPLOYEE_PHONE(emp_id, phone_number).
Typically not stored — computed on the fly. If stored for performance, a trigger must maintain consistency.
A relationship is an association among two or more entities. A relationship set is a set of relationships of the same type.
Relationships can have their own attributes. Example: WORKS_FOR(employee, department) might have start_date as a relationship attribute.
In M:N relationships, these attributes become columns in the junction table. In 1:N relationships, they can be placed in the N-side entity table.
Cardinality defines how many entity instances on one side can relate to how many on the other.
| Type | Meaning | Example |
|---|---|---|
| 1:1 | One entity associates with at most one other | Person — Passport |
| 1:N | One entity associates with many | Department — Employee |
| M:N | Many entities on both sides | Student — Course |
| Type | Meaning | Notation |
|---|---|---|
| Total (mandatory) | Every entity MUST participate | Double line |
| Partial (optional) | Entity may or may not participate | Single line |
Example: Every EMPLOYEE must work for a DEPARTMENT (total participation of EMPLOYEE). But a DEPARTMENT may exist without employees initially (partial participation of DEPARTMENT).
More precise than crow’s foot. (min, max) on the relationship line next to the entity:
(1,1): must participate in exactly 1 relationship instance(0,1): optional, at most 1(0,N): optional, unlimited(1,N): must participate in at least 1Multiple entity sets are combined into a higher-level superclass. Example: CAR and TRUCK are generalized into VEHICLE.
A higher-level entity is divided into sub-entities based on distinguishing characteristics.
Subclasses inherit all attributes and relationships of the superclass.
Disjointness:
d in ER notation)o)Completeness:
Used when a relationship participates in another relationship. A relationship set is treated as a higher-level entity set.
Example: EMPLOYEE works on PROJECT (the WORKS_ON relationship). A MANAGER manages this employee-project combination. Instead of a ternary relationship, represent WORKS_ON as an aggregation and have MANAGES relate MANAGER to the aggregated WORKS_ON.
Example:
ORDER(order_id, order_date, customer_id)
ORDER_ITEM(order_id, line_number, quantity, product_id)
PK: (order_id, line_number)
FK: order_id → ORDER
Three options — choose based on participation:
| Participation | Preferred Strategy |
|---|---|
| Total on both sides | Merge into a single table |
| Total on one side (say E1) | Add FK + relationship attributes into E1’s table (E1 has total participation, so FK will never be NULL) |
| Partial on both sides | FK in either table; choose the side more likely to participate |
Why prefer total-participation side? Placing the FK on the partial side creates NULLs for non-participating entities — wasted space and complicates queries.
Example:
DEPARTMENT(dept_id, dept_name, location)
EMPLOYEE(emp_id, name, salary, dept_id) ← FK dept_id → DEPARTMENT
Example:
STUDENT(student_id, name)
COURSE(course_id, title)
ENROLLMENT(student_id, course_id, grade, semester)
PK: (student_id, course_id)
FK: student_id → STUDENT, course_id → COURSE
Example:
EMPLOYEE(emp_id, name)
EMPLOYEE_PHONE(emp_id, phone_number)
PK: (emp_id, phone_number)
FK: emp_id → EMPLOYEE
Three approaches:
Option A — Single table with type flag (for total, overlapping):
type discriminator columnOption B — Table per subclass (for disjoint specialization):
Option C — Superclass table + subclass tables (most common):
Given an ER diagram, write the relational schema with all PKs, FKs, and constraints. Common traps: