✅ 🇬🇧 Entité-Association Model - MCD (student)
Catégorie | Cours |
---|---|
Ordre d'apprentissage | 13 |
Statut | préparé |
Cours de Julie Chaumard
Data Dictionary
The Marathon Information System Project
The Marathon des Sables organizing committee aims to develop an information system to manage the events for which it is responsible. The project is now underway. The first step consists of identifying the content of the database.
Objective of This Step
The objective of this step is to establish a comprehensive list of the metadata in our database and the business rules that should be applied to them.
This list is formalized as a data dictionary, which serves several purposes:
• It defines the metadata in the context of the project.
• It provides a comprehensive inventory of all metadata.
• It must be understandable and validated by all stakeholders.
Metadata Definition Process
To define the metadata, several input elements are required. These inputs can take different forms:
• Documents: specifications, existing editions, etc.
• Existing IT applications.
• Discussions with users.
The project team is responsible for organizing workshops to exchange information and identify the inputs for the data dictionary.
For the Marathon project, the organizing team presents documents during these workshops, such as the starting list, bib numbers, route details, and results. Similarly, the document used for distributing bibs to runners is collaboratively developed.
At the end of this step, a list of metadata and their associated business rules should be established.
Data Dictionary
Label | Type | Size | Description |
---|---|---|---|
Defines a metadata item within the information system. This information must be single-valued and non-decomposable. Single-valued means that this information can take only one value within the project context. | |||
Runner’s last name | String | 50 | The runner’s last name must match the one on their identity card. |
Runner’s date of birth | Date | DD/MM/YYYY | The runner’s date of birth must match the one on their identity card. |
Runner’s age | Integer | 3 | The runner’s age is calculated as the difference between the current date and their date of birth. |
Decomposing Information
The decomposition of information depends heavily on the context in which it is used.
Example: The metadata “Address”
• A postal address is often stored as a single string for informational purposes.
• However, in the context of postal mail delivery, the address needs to be decomposed into multiple metadata fields: floor, number, street, postal code, city, following the standardized requirements set by postal services.
Business Rules
Business rules are usually implemented in the application’s code. However, one of the project steps is to validate the metadata model against these rules.
Number | Description |
1 | Within an event, a runner can participate in only one race. |
2 | Children under 10 years old cannot participate in races. |
3 | A runner’s age category is determined based on their age on race day. |
4 | Each runner joins a club for one year. |
5 | A runner can only start a race if their registration is complete. |
6 | For each race, runners are ranked by gender and age category, from fastest to slowest. |

Bib Number Description
The workshop begins with a description of the bib number, which includes the following information:
• The event name.
• The race date.
• The race type.
• The bib number.
• The runner’s first name.
• The starting corral.
The bib printer requires a file containing all this information to generate the bibs. Some details may be pre-printed on the bib.
Data Dictionary
Name | Type | Size | Description |
---|---|---|---|
Event | String | 254 | Event name, e.g., Marathon du Médoc. |
Race date | Date | DD/MM/YYYY | |
Race type | String | 20 | Name of the race type. The race type corresponds to the race distance: Marathon, Half-Marathon, I-Run Kid’s. |
Bib number | Numeric | 5 | Bib number > 0. |
Runner’s first name | String | 50 | The runner’s first name must match the one on their identity card. |
Starting corral | String | 1 | Indicates the starting corral assigned to the runner based on their predicted race time. |
During registration, runners must provide their estimated race time. Starting corrals are organized based on these times to ensure a smooth start.
Addition of “Estimated Time” Metadata
Name | Type | Size | Description |
---|---|---|---|
Estimated time provided by the runner | Numeric | 4 | Estimated race time given at registration. Measured in minutes. The maximum time is 360 min for the marathon and 180 min for the half-marathon. |
The organizing team explains that the bib number is crucial because it links all runner-related information.
Key questions to consider:
• Does a runner keep the same bib number across multiple editions?
• Is there a maximum number of bib numbers in an event?
• Is the bib number assigned per race or per runner category?
• How are bib numbers assigned?
• If a runner withdraws, is their bib number reassigned?
After discussions, the following decisions are made:
• Registering for a race assigns a unique bib number for the entire event.
• Bib numbers are assigned in order of registration.
Name | Type | Size | Description |
---|---|---|---|
Bib number | Numeric | 5 | Bib number > 0. Unique within an event. |
Race Pass Analysis
The Race Pass allows runners to collect their bib number. It contains information about the runner, their race, and their registration status.
Name | Type | Size | Description |
Runner’s name | String | 50 | Matches their identity card. |
Race date | Date | DD/MM/YYYY | |
Race start time | Time | HH:MM | |
Runner’s gender | String | 1 | M (Male), F (Female). |
Bib number | Numeric | 5 | Displayed as a QR code on the Race Pass. |
Registration status | String | 26 | Can be: registered, pending registration, complete. |
Registration date | Date | DD/MM/YYYY | |
T-Shirt size | String | 3 | S, M, L, XL, XXL. |
Registrations close on a set date for runners who have completed their registration and payment. Some documents may still be pending (e.g., medical certificate, ID copy).
New Business Rule
Number | Description |
5 | A runner can only start a race if their registration is complete. |
Route Analysis
During this first workshop, the organizers provided the marathon route. They would like to add an image of the route to the Race Pass. This metadata is added to the data dictionary:
Name | Type | Size | Description |
Race route | Image | - | Image of the race route. |
Event | String | 254 | Name of the event, e.g., Marathon du Médoc. |
Runner’s last name | String | 50 | The runner’s last name must match their identity card. |
Runner’s first name | String | 50 | The runner’s first name must match their identity card. |
Race date | Date | DD/MM/YYYY | Date of the race. |
Race start time | Time | HH:MM | Time of the race start. |
Runner’s gender | String | 1 | M (Male), F (Female). |
Bib number | Numeric | 5 | Bib number > 0. Must be displayed as a QR (Quick Response) code on the Race Pass. |
Registration status | String | 26 | Can take the following values: registered, pending registration, complete. |
Registration date | Date | DD/MM/YYYY | Date of race registration. |
T-Shirt size | String | 3 | The runner’s shirt size. Possible values: S, M, L, XL, XXL. |
Estimated time provided by the runner | Numeric | 4 | Estimated race time provided by the runner at registration, in minutes. The maximum time allowed is 360 minutes for the marathon and 180 minutes for the half-marathon. |
Actual time recorded by the runner | Numeric | 4 | Actual time recorded by the runner during the race, in minutes. |
Starting corral | String | 1 | Indicates the starting corral where the runner must position themselves, assigned based on their estimated time at registration. |
Runner’s age | Integer | 3 | The runner’s age is calculated as the difference between the current date and their birthdate. |
Age category label | String | - | Possible values: Poussins (10-11 years), Benjamins (12-13 years), Minimes (14-15 years), Cadets (16-17 years), Juniors (18-19 years), Espoirs (20-22 years), Seniors (23-39 years), Masters (40+ years). |
Minimum age for a category | Numeric | 2 | Minimum age for an age category. |
Maximum age for a category | Numeric | 2 | Maximum age for an age category. |
Championship name | String | 50 | A championship is a collection of events. A ranking is established per championship. |
Championship start date | Date | DD/MM/YYYY | Start date of a championship. |
Championship end date | Date | DD/MM/YYYY | End date of a championship. |
Club name | String | 50 | Name of an athletics club. |
Club city | String | 50 | City where the club is registered. |
Runner’s club membership year | Numeric | 4 | The year a runner is registered with a club. |
License number | Numeric | 6 | Federation license number, applicable to a club membership for a given year. |
Race type | String | 20 | Name of the race type. The race type is linked to the race distance: Marathon, Half-Marathon, I-Run Kid’s. |
Race type distance | Numeric | Decimal, 3 decimal places | Distance in kilometers for a specific race type. |
Business Rules
Number | Description |
1 | In an event, a runner can participate in only one race. |
2 | Children under 10 years old cannot participate in races. |
3 | A runner’s age category is calculated based on their age on race day. |
4 | Each runner must join a club for one year. |
5 | A runner can only start a race if their registration is complete. |
6 | Each race has a ranking based on gender and age category, with runners classified from fastest to slowest. |
ERD (Entity-Relationship Diagram)
Draw ERD with Draw.io
Database Construction
First Step
The first step of the project for the “Marathon” organizing committee involved establishing the data dictionary and business rules. The next step is to build the database.
Objectives
The modeling of a database serves several objectives:
- Represent the real-world subject matter.
- Centralize and organize metadata.
- Eliminate metadata redundancy.
- Follow a normalization process to ensure compatibility with IT tools.
Approach
To build the database, it is necessary to follow an approach that defines several models:
- Conceptual Data Model (CDM): An abstraction of reality, independent of implementation choices.
- Logical Data Model (LDM): An adaptation of the conceptual model based on implementation choices.
- Physical Data Model (PDM): The actual implementation of the logical model within a Database Management System (DBMS).
The Conceptual Data Model (CDM)
This first model represents the envisioned solution without considering technical constraints or implementation choices.
It must be understandable to be shared with non-technical stakeholders.
The entity-relationship model formalizes real-world objects and their relationships.
“The entity-relationship model (ERD) is a data model or diagram used for high-level descriptions of conceptual data models. It provides a graphical representation of such models using entities and relationships.”
Entity
An entity is an object uniquely identified and characterized by attributes.
Example 1
The first entity RUNNER is defined with the following attributes:
- CO_LASTNAME
- CO_FIRSTNAME
- CO_BIRTHDATE
- CO_GENDER
A best practice is to use a prefix for each entity. This rule becomes important when working with multiple attributes from different entities simultaneously.
The Unified Modeling Language (UML) proposes another notation. An entity is represented as a rectangle divided into two sections:
- The first section contains the name of the entity.
- The second section contains the attributes.

At this level of formalization, the term entity is sometimes used interchangeably for both metadata and its corresponding data.
However, it is more appropriate to use the term instance to refer to an entity’s actual data.
Identifier
Once the attributes have been defined in the data dictionary, it is necessary to define an identifier.
Each entity must have an identifier that meets the following properties:
- Unique: Each value must identify one and only one instance of the entity.
- Stable: The identifier must not change based on another piece of information (e.g., time).
- Single-valued: It must contain only one piece of information, not a list of values.
- Minimal: It must not be decomposable into multiple pieces of information.
An identifier can be of two types:
- Natural: When it is meaningful and defined by users.
- Artificial: When it is necessary to add one.
For runners, there is no natural identifier, so we must add one:
CO_ID
(The notation standard requires underlining the identifier.)

Relationships (Associations)
Once entities have been defined with their identifiers and attributes, it is essential to define the relationships between them.
These relationships are called associations and are represented by a line connecting entities.
- The name of the association is written in an oval on the line.
The ADHESION relationship expresses:
- A runner’s membership in a club for a given year.
- The fact that club members are also runners.

Defining Cardinalities
Once an association is established, cardinalities define the number of occurrences each entity can have in the relationship.
To determine these, we must answer minimum and maximum occurrence questions for each entity involved in the association:
- How many clubs is a runner linked to at a minimum for one year? →
0
- How many clubs is a runner linked to at a maximum for one year? →
N
(multiple)
- How many runners is a club linked to at a minimum for one year? →
1
- How many runners is a club linked to at a maximum for one year? →
N
(multiple)
- How many runners is a year linked to at a minimum for a club? → 0
- How many runners is a year linked to at a maximum for a club? →
N
(multiple)

Attributes of a Relationship
An association can have attributes.
In this project, the license number and membership fee amount are attributes related to the three entities involved.
These attributes are therefore added to the ADHESION association.

Similarly, we define the association between the RACE and RACE TYPE entities.
Cardinalities of the RACE - RACE TYPE Association

The association being established, we must now define its cardinalities:
- How many race types is a race linked to at a minimum? →
1
- How many race types is a race linked to at a maximum? →
1
- How many races is a race type linked to at a minimum? →
0
- How many races is a race type linked to at a maximum? →
N
(multiple)

The Marathon Database: Conceptual Data Model (CDM)
Once the entities, relationships, identifiers, and cardinalities are defined,
the Conceptual Data Model (CDM) for the Marathon database is as follows:

Functional Integrity Constraint (CIF)
CIF in this Conceptual Data Model (CDM)
✔ Between EVENT and TEST (COMPOSES)
Functional Integrity Constraint (CIF)
For the cardinality pair (1,1) and (1,N), the association is called a Functional Integrity Constraint (CIF).
This concept is used during the implementation of the database.
The Functional Integrity Constraint (CIF) is a rule that ensures a relationship (association) adheres to certain cardinality constraints when implemented in a database.
Explanation with Cardinalities (1,1) and (1,N)
When an association has these cardinalities:
- (1,1) means that an entity is necessarily linked to only one other entity.
- (1,N) means that an entity can be linked to multiple occurrences of another entity.
Applied Example
- EVENT (1,1) — (1,N) TEST
Interpretation:
- Each test belongs to only one event (1,1).
- An event can consist of multiple tests (1,N).
📌 CIF present → The test functionally depends on the event.
This imposes a functional relationship:
Each TEST belongs to a single EVENT, but an EVENT can group multiple TESTS.
Why is this concept important in databases?
When implementing a relational database, this constraint ensures:
- Data integrity: A test cannot be linked to multiple events at the same time.
- Optimization of relationships: A foreign key is correctly placed to avoid unnecessary duplicates.
- Consistency with business rules: The data structure follows the rules defined in the conceptual model.
Representation in a Conceptual Data Model (CDM)
In a Conceptual Data Model (CDM), a Functional Integrity Constraint (CIF) is represented by an arrow pointing from the dependent entity to the determining entity.
┌───────────┐ 1,1 ┌──────────────┐
│ TEST. │────▶ │ EVENT │
└───────────┘ 1,N └──────────────┘
Time in the Conceptual Data Model
This model represents the reality of a runner who can belong to multiple clubs throughout their career
and who receives a new license each year.
Thus, the model captures the various pieces of information that will be stored over time.
However, there is a limitation in these models: the dynamics of time are not considered.
As a result, the model allows a runner to join multiple clubs in the same year.

Objective of This Model
The goal of this model is to represent the historical record of a runner’s annual club memberships,
along with the associated license.
A new business rule is introduced:
A runner cannot have multiple club memberships for the same year.

Association Management Rules
Two statements are both true:
- A runner can belong to multiple clubs.
- A runner can belong to only one club.
With the following clarification:
- A runner can belong to multiple clubs throughout their career.
- A runner can belong to only one club per given year.
The identifier for the ADHESION association cannot include the CL_ID attribute.
It is therefore reduced to the attributes CO_ID and AD_YEAR.
The link between ADHESION and CLUB is symbolized by an arrow.

Solution to Address This Modeling Limitation
The chosen solution to overcome this limitation is to exclude the CL_ID attribute from the ADHESION relation’s key.
This is covered by the business rule:
“Each runner is a member of only one club per given year.”
Explanations:
Explanation of the Model and the Constraint on Runners’ Club Memberships
In this Conceptual Data Model (CDM), we manage runners’ memberships to a club through the ADHESION entity.
The objective is to correctly model the relationship between RUNNER and CLUB while considering a time constraint:
- A runner can belong to multiple clubs, but not at the same time.
- A runner can be a member of only one club per given year.
Why are these rules important?
1️⃣ A runner can belong to multiple clubs throughout their career.
- Example: A runner may be in Club A in 2022, then in Club B in 2023.
- Therefore, a runner can have multiple memberships, but in different years.
2️⃣ A runner cannot be in more than one club in a given year.
- Example: In 2023, they cannot be simultaneously registered in two different clubs.
- This enforces a uniqueness constraint: a runner can have only one membership per year.
Why exclude CL_ID from the primary key of the ADHESION relationship?
If CL_ID were included in the primary key of ADHESION, a runner could have multiple memberships for the same year (one per club), which would violate the business rule.
📌 Solution:
- The primary key of ADHESION should be (CO_ID, AN_ANNEE) instead of (CO_ID, AN_ANNEE, CL_ID).
- This ensures that a runner can have only one membership per year, even if multiple clubs exist.
Symbolization with an Arrow:
- The link between ADHESION and CLUB is represented by an arrow because CL_ID is not part of the primary key of ADHESION.
- This means that a runner’s club is dependent information of their membership but does not form part of the unique identifier of the membership.
How Will This Be Handled in the Database?
When transitioning to the relational model (RDM), this constraint will be enforced through:
✔ A primary key on (CO_ID, AN_ANNEE) to prevent multiple memberships in the same year.
✔ A foreign key to CLUB to store the club membership information.
Conclusion
💡 Thanks to this solution:
✅ A runner can belong to multiple clubs over their career.
✅ A runner can be registered in only one club per year.
✅ The model adheres to business constraints.
Agence digitale Parisweb.art
Tout savoir sur Julie, notre directrice de projets digitaux :
https://www.linkedin.com/in/juliechaumard/