Mathematical models

A model is an informative and simplified representation of a system, aimed at understanding, communicating, explaining, or designing aspects of interest of that system. In the realm of sports science, mathematical models are used for:

๏ Analysing athlete/human performance in exercise

๏ Evaluating training effects

๏ Predicting training effects or, more generally, performance

๏ Optimising training protocols, their effects, and/or exercise performance

[Figure: Apple Watch showing activity tracking with calories, steps, and distance]

Today, models are everywhere. They estimate our VO₂max, our calories, our steps, our sleep quality, and countless other metrics. All these models—whether embedded in your smartwatch, fitness app, or training software—share common elements, which we will discover throughout this course.


🥱 Boring stuff first: ESSENTIAL DEFINITIONS

A helpful way to think about a model is as a shipping service: you provide an input, and it delivers an output. You put something into the model, and you get something out.

$$x \;\rightarrow\; f(\text{params}) \;\rightarrow\; y$$

All models share a few essential components:

  • Variables — Symbols used to represent elements that can vary. These are the quantities that change over time or between scenarios.

  • Parameters — Symbols used to represent fixed values within a given context. These do not vary during a specific application of the model, although they can be adjusted between models.

  • Input variable — Typically something easy to measure, obtain, or control. We usually denote this with x.

  • Output variable — Usually the goal of the model: a quantity that is difficult to measure directly, or that we want to predict. This is typically represented with y.

  • Relationships (governing equations) — Mathematical expressions that link x to y in a well-defined (univocal) way — meaning that each input corresponds to a single, unique output.


The EASIEST of the EXAMPLES

I don't want you to get lost in details, semantics, or complex explanations that don't make sense just yet.
We'll introduce more terminology and formal language later. For now, let's start with something that everyone can relate to: an example so simple it resonates immediately.

Let's consider this: maximum heart rate declines with age. How?
A widely known model suggests that maximum heart rate decreases by 1 beat per minute (bpm) for every year of age.
It also proposes that, for any given age, the maximum heart rate can be estimated using the formula:

$$\mathrm{HR}_{\max} = 220 - \text{age}$$

This is a model that takes age as input and returns the predicted maximum heart rate as output.
In this case, the model is linear, as it follows the general form of a straight-line equation:

$$y = a \cdot x + b$$

Here we have two parameters:

  • b = 220 is the intercept (in units of bpm),
  • a = -1 is the slope (in units of bpm/year).

As we'll discuss later in the course, this model may not be appropriate for every population, age group, or gender.
That's why we often adjust the parameters to better fit specific data or populations.
When we change the values of a and b, the model's predictions will change too.

Try it yourself!
Use the sliders to manually adjust the parameters and observe how the predicted maximum heart rate changes.

Resulting model: HRmax = 220 - 1 * Age
For example, for a 50-year-old, your model predicts an average HRmax of 170 bpm.
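As a minimal sketch of the same idea in code, the linear model can be written as a small function whose parameters have default values. The function name `predict_hr_max` and the alternative parameter values are assumptions made for illustration only; they are not a published model.

```python
def predict_hr_max(age, a=-1.0, b=220.0):
    """Linear model y = a * x + b: age in years in, predicted HRmax in bpm out."""
    return a * age + b


# Default parameters reproduce the classic "220 - age" rule.
print(predict_hr_max(50))  # 170.0

# Changing the parameters changes the prediction for the same input.
# These alternative values are purely illustrative, not a published model.
print(predict_hr_max(50, a=-0.5, b=200.0))  # 175.0
```

The structure (a straight line) stays the same in both calls; only the parameter values differ.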

This example should help you understand that behind any model, there is a structure dictated by its underlying equation.
Once that structure is defined, the values of the parameters can still significantly influence the model's response.
In this case, the model was linear, and the two parameters were the slope and intercept of the line described by the single equation.
You will learn throughout this course that this is far from the only structure available—but it is certainly the most common and the simplest to understand.

💡Tip

The complexity of a model is not only determined by the number of parameters, but also by the complexity of its structure—that is, the equation that defines the model.
During this course, you will see that two models might have the same number of parameters, yet very different mathematical formulations, resulting in significantly different complexities.


🪄 THE PROCESS OF “CREATING” A MODEL

Modeling is a structured process that starts with observation of a phenomenon in the real world, such as acute or chronic physiological adaptations during exercise. From there, we move into conceptualization, where we ask: What kind of process or relationship do I want to describe? This leads to formulation, where we select the appropriate mathematical tools (e.g., differential equations, regression models, neural networks) to express the concept rigorously.

Next comes identification, where we fit the model to experimental data: this is the parameter estimation process. At this stage, we must ask: Am I solving the correct equations for the problem at hand? Once parameters are fitted, we proceed to validation, where we test whether the model correctly solves the equations and makes accurate predictions. This phase often involves evaluation with data that we did not use during the identification process.

🚨Danger

I want to stress again the fundamental principle that to validate a model, you must not use the same data that were used during the calibration (or training) phase! Please keep this in mind. This is one of the most common and underestimated mistakes in sports science.
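The principle can be sketched in a few lines of code. The example below is only an illustration under stated assumptions: the data are synthetic (generated from the "220 - age" rule plus random noise, not real measurements), the 30/10 split is arbitrary, and the helper names `fit_line` and `rmse` are hypothetical.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least-squares estimates of slope a and intercept b for y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def rmse(xs, ys, a, b):
    """Root mean squared error between observations and model predictions."""
    return math.sqrt(sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys)) / len(xs))


# Synthetic data (NOT real measurements): ages and noisy HRmax values.
random.seed(0)
ages = [random.uniform(20, 70) for _ in range(40)]
hr_max = [220 - age + random.gauss(0, 5) for age in ages]

# Split once: the first 30 points calibrate the model, the last 10 validate it.
train_x, train_y = ages[:30], hr_max[:30]
test_x, test_y = ages[30:], hr_max[30:]

a, b = fit_line(train_x, train_y)            # identification uses training data only
train_error = rmse(train_x, train_y, a, b)   # error on the data used for fitting
test_error = rmse(test_x, test_y, a, b)      # error on data the model has never seen

print(f"a = {a:.2f} bpm/year, b = {b:.1f} bpm")
print(f"RMSE on calibration data: {train_error:.2f} bpm")
print(f"RMSE on held-out data:    {test_error:.2f} bpm")
```

Only the held-out error says something about how the model behaves on new athletes; the calibration error alone tends to be too optimistic.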

Finally, accreditation answers the practical question: Can I use this model in everyday practice? A model is only useful if it performs well enough to support decision-making or enhance understanding in a real-world context. In summary, good modeling is not just about technical skill, but about choosing the right questions, building meaningful representations, and verifying that they work in the wild.


MODEL EXAMPLE: MUSCLE FORCE-LENGTH RELATIONSHIP

Let's consider a classic example in biomechanics: how the ability to generate torque during an isometric leg extension exercise varies with knee angle. Torque is primarily generated by the knee extensor muscles (like the quadriceps), and their force production depends heavily on muscle length, which is determined by joint angle.

We often observe that torque is maximal at a specific "optimal" knee angle, and decreases when the joint is either too extended or too flexed. This behavior reflects a fundamental physiological principle: if a muscle operates at a length different from its optimal length, its ability to generate force is reduced.

This phenomenon, known as the force-length relationship, is nonlinear, and a simple way to describe it mathematically is with a parabolic (quadratic) function.

A quadratic function is symmetric about its vertex and can model a single peak, which makes it a convenient first choice in this context.

$$y = a \cdot x^2 + b \cdot x + c$$

where x is the knee angle (in degrees), y is the torque generated at that angle, and a, b, and c are parameters to be identified from the data.

To "learn" or "find" the values of the parameters (i.e. to calibrate the model), I collect torque data at a few selected knee angles. Using these measurements, I fit the parameters a, b, and c. Fitting consists of finding the parameter values that minimise the distance between the experimental data and the model predictions. This distance is typically expressed as the Root Mean Squared Error (RMSE).
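To make the fitting step concrete, here is a minimal sketch that fits a quadratic by least squares and reports the RMSE. It is an illustration under stated assumptions: the torque values are hypothetical, the helper names (`solve_3x3`, `fit_quadratic`, `rmse`) are invented for this example, and the normal-equations approach is just one simple way to minimise the sum of squared errors (which also minimises the RMSE). It is not necessarily the optimisation method used in the course.

```python
import math


def solve_3x3(matrix, rhs):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [r] for row, r in zip(matrix, rhs)]  # augmented matrix
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, 3):
            factor = m[row][col] / m[col][col]
            for k in range(col, 4):
                m[row][k] -= factor * m[col][k]
    solution = [0.0, 0.0, 0.0]
    for row in (2, 1, 0):  # back substitution
        known = sum(m[row][k] * solution[k] for k in range(row + 1, 3))
        solution[row] = (m[row][3] - known) / m[row][row]
    return solution


def fit_quadratic(xs, ys):
    """Least-squares a, b, c for y = a*x^2 + b*x + c via the normal equations."""
    s = [sum(x ** p for x in xs) for p in range(5)]                   # sums of x^0 .. x^4
    t = [sum(y * x ** p for x, y in zip(xs, ys)) for p in range(3)]   # sums of y*x^0 .. y*x^2
    normal = [[s[4], s[3], s[2]],
              [s[3], s[2], s[1]],
              [s[2], s[1], s[0]]]
    return solve_3x3(normal, [t[2], t[1], t[0]])


def rmse(xs, ys, a, b, c):
    """Root mean squared error between measured and predicted torque."""
    residuals = [y - (a * x ** 2 + b * x + c) for x, y in zip(xs, ys)]
    return math.sqrt(sum(r ** 2 for r in residuals) / len(residuals))


# Hypothetical isometric torque measurements (illustrative values only).
angles_deg = [30, 45, 60, 75, 90]
torques_nm = [85.0, 110.0, 120.0, 112.0, 88.0]

a, b, c = fit_quadratic(angles_deg, torques_nm)
print(f"a = {a:.4f}, b = {b:.3f}, c = {c:.2f}")
print(f"RMSE = {rmse(angles_deg, torques_nm, a, b, c):.2f} Nm")
```

In practice, a library routine for polynomial fitting would usually replace the hand-written solver; it is spelled out here only to show what "minimising the distance" amounts to.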

📝Note

I don't need to test my athlete/patient at every possible knee angle from full flexion to full extension. This is the power of using a model: it helps minimize the number of required measurements while still enabling accurate predictions.

Once the model is calibrated, we can find the knee angle that maximizes torque.
This corresponds to the vertex of the parabola, the point where torque is highest:

$$x^* = -\frac{b}{2a}$$

At this angle x*, the torque reaches its maximum value.
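For readers who want to see where the vertex formula comes from, here is a short derivation sketch. It uses only the quadratic model defined above and standard calculus; the symbol y* for the peak torque is introduced here for convenience.

```latex
\begin{aligned}
\frac{dy}{dx} &= 2 a x + b = 0
  \quad\Longrightarrow\quad x^* = -\frac{b}{2a} \\
y^* &= a (x^*)^2 + b x^* + c = c - \frac{b^2}{4a} \\
\frac{d^2 y}{dx^2} &= 2a
\end{aligned}
```

The second derivative equals 2a, so the vertex is a maximum only when a is negative.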

💡Tip

The value of a must be negative! We expect the parabola to be concave, that is, to open downward so that its vertex is a maximum.

In summary: A simple quadratic model can capture key features of the force-length relationship in leg extension.
It allows you to make informed decisions with fewer measurements — and extract useful insights like the optimal angle for torque production.

Try it yourself!
Use the sliders to manually adjust the parameters and observe how they affect the fit of the model. Observe the variation in the RMSE as you change the parameters. I bet you cannot find a better combination of parameters than the one I initially provided. How do I know? I did not conduct the fitting manually but automatically, by means of an optimisation process. This concept will be discussed again in the next module.

Root Mean Square Error (RMSE): 1.26 (lower values indicate a better fit)

Optimal Angle: 61.2°, Max Torque: 118.8 Nm


💡Spreadsheet

You can find the data and models at this link.


MODEL EXAMPLE: MUSCLE FORCE-VELOCITY RELATIONSHIP

When modelling muscle function, another key relationship to consider is how muscle velocity affects force generation.

Observation: There is a clear relationship between how fast a muscle shortens (its contraction velocity) and how much force it can produce.
Concept: The faster the contraction, the lower the force the muscle can express. This is particularly true for concentric (shortening) actions.
Formula: To capture this inverse relationship, we use the Hill equation — a classic model that resembles a hyperbolic curve.
Identification & validation: Once data are collected (e.g., from isokinetic tests), we can calibrate the parameters and validate the model against observations.

$$(F + a) \cdot (v_M + b) = ({F_0}^M + a) \cdot b$$

Solving for $F$, the force produced by the muscle at a given contraction velocity $v_M$, gives:

$$F = \frac{{F_0}^M \cdot b - a \cdot v_M}{v_M + b}$$
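The intermediate algebra is short, and spelling it out may help. This is a worked sketch using only the symbols already defined in the Hill equation above:

```latex
\begin{aligned}
(F + a)(v_M + b) &= ({F_0}^M + a)\, b \\
F + a &= \frac{({F_0}^M + a)\, b}{v_M + b} \\
F &= \frac{({F_0}^M + a)\, b - a\,(v_M + b)}{v_M + b}
   = \frac{{F_0}^M \cdot b - a \cdot v_M}{v_M + b}
\end{aligned}
```

The a·b terms cancel in the numerator, which leaves the compact form used in the rest of this section.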

One useful performance indicator is the maximum shortening velocity of the muscle, $v_{MAX}$. We can estimate it by substituting $F = 0$ into the Hill model:

$$v_{MAX} = \frac{{F_0}^M \cdot b}{a}$$
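A small sketch can show how the force falls as velocity rises and that it reaches zero at the maximum shortening velocity. The parameter values below are hypothetical and chosen only to produce round numbers; the function names are likewise assumptions made for this example.

```python
def hill_force(v, f0, a, b):
    """Force at shortening velocity v from the Hill equation solved for F."""
    return (f0 * b - a * v) / (v + b)


def max_shortening_velocity(f0, a, b):
    """Velocity at which the predicted force drops to zero (F = 0)."""
    return f0 * b / a


# Hypothetical parameter values, chosen only to illustrate the shape of the curve.
F0 = 1000.0  # maximal isometric force, N
A = 250.0    # Hill parameter a, N
B = 0.3      # Hill parameter b, m/s

v_max = max_shortening_velocity(F0, A, B)
for v in (0.0, 0.3, 0.6, 0.9, v_max):
    print(f"v = {v:.2f} m/s -> F = {hill_force(v, F0, A, B):7.1f} N")
```

At zero velocity the model returns the isometric force, and the printed forces decrease monotonically towards zero at `v_max`.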

In practice, during isokinetic leg extension exercises, we typically test at different angular velocities (e.g., 90°/s, 180°/s).
If you have already conducted isometric tests, you can use the peak torque estimated from the quadratic model (explained previously) to provide ${F_0}^M$, the maximal isometric force.

The power-velocity relationship then becomes:

$$P = F \cdot v_M = v_M \cdot \frac{{F_0}^M \cdot b - a \cdot v_M}{v_M + b}$$
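The power curve is zero at both ends (no velocity, or no force) and peaks somewhere in between. The sketch below locates that peak in two ways. The closed-form expression is not part of the original text: it is derived here by setting dP/dv = 0. The parameter values are hypothetical and the function names are assumptions.

```python
import math


def hill_power(v, f0, a, b):
    """Power P = F * v, with F taken from the Hill force-velocity relationship."""
    return v * (f0 * b - a * v) / (v + b)


def velocity_at_peak_power(f0, a, b):
    """Velocity where dP/dv = 0: positive root of a*v^2 + 2*a*b*v - f0*b^2 = 0."""
    return b * (math.sqrt(1.0 + f0 / a) - 1.0)


# Hypothetical parameter values (illustration only).
F0 = 1000.0  # maximal isometric force, N
A = 250.0    # Hill parameter a, N
B = 0.3      # Hill parameter b, m/s

v_max = F0 * B / A
v_opt = velocity_at_peak_power(F0, A, B)

# Cross-check the closed form with a brute-force search on a fine velocity grid.
grid = [v_max * i / 1000 for i in range(1001)]
v_grid = max(grid, key=lambda v: hill_power(v, F0, A, B))

print(f"Peak power velocity (closed form): {v_opt:.3f} m/s")
print(f"Peak power velocity (grid search): {v_grid:.3f} m/s")
print(f"Peak power: {hill_power(v_opt, F0, A, B):.1f} W")
```

Both approaches should agree to within the grid spacing, which is a quick sanity check on the algebra.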


📚Biblio

You can read about applications of this model in the paper written by Muollo et al. 2021.

📝Note

You can find the same kind of relationship between maximal power in a 5-second sprint and cadence in cycling. You can check out this paper by Douglas et al., 2021, where a quadratic relationship is used to model the relationship between cadence and deliverable torque.


MODEL EXAMPLE: VO₂max ESTIMATION AND DOPING IDENTIFICATION

Observation: Professional cyclists sometimes climb at extraordinary speeds, raising questions about the physiological plausibility of such performances.
Concept: We want to build a model that estimates VO₂max based on climbing performance (i.e. vertical ascent speed).
Formulation: We apply a simplified version of the cyclist's equation of motion, focusing on the power needed to overcome gravity during a climb.

In professional cycling, climbing high mountain passes at competitive speeds requires exceptional physiological capacity — often exceeding 6 W/kg sustained for 40+ minutes.

What kind of model can we use to estimate the required VO₂max?

📝Note

I'm a big fan of Ross Tucker, the exercise physiologist behind the SportScientist website and the Real Science of Sport Podcast.
His Tour de France analyses inspired the examples used in this module. Check them out here.

▸ We make the simplifying assumption that most of the required power output is used to gain elevation (i.e., to increase gravitational potential energy).
▸ This gives us a basic mechanical model:

$$P = \frac{m \cdot g \cdot h}{\Delta t}$$

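Here m is the mass being lifted, g is the gravitational acceleration, h is the elevation gained, and $\Delta t$ is the time taken to climb.
This formula tells us that power $P$ is inversely proportional to the time taken to climb.
The faster the ascent (i.e. the shorter the time $\Delta t$), the higher the required power output.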
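As a quick numerical illustration of this formula, here is a minimal sketch. The climb (70 kg, 1000 m of elevation gain, 40 minutes) is hypothetical, g = 9.81 m/s² is the usual approximation, and the function name is an assumption. The model deliberately ignores air and rolling resistance, as stated above.

```python
G = 9.81  # gravitational acceleration, m/s^2 (standard approximation)


def climbing_power(mass_kg, height_gain_m, duration_s):
    """Mechanical power needed to gain elevation: P = m * g * h / delta_t."""
    return mass_kg * G * height_gain_m / duration_s


# Hypothetical climb: 70 kg lifted through 1000 m of elevation gain in 40 minutes.
mass = 70.0
power = climbing_power(mass, height_gain_m=1000.0, duration_s=40 * 60)
print(f"Power: {power:.1f} W ({power / mass:.2f} W/kg)")

# Halving the time doubles the required power (inverse proportionality).
faster = climbing_power(mass, height_gain_m=1000.0, duration_s=20 * 60)
print(f"Power at half the time: {faster:.1f} W")
```

Note that in this simplified model the power per kilogram equals g·h/Δt, so it depends only on the vertical ascent speed and not on the mass itself.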

Climb Power Estimator (example output)

  • Road gradient: 10.0%
  • Required power: 283.4 W
  • Average power: 4.05 W/kg
  • VO₂ demand: 4.06 L O₂/min
  • Estimated VO₂max: 4.51 L O₂/min
  • VO₂max per kg: 64.5 mL O₂/min/kg

[Chart: Physiological Realm]

Disclaimer: Limits from Mujika and Padilla, Physiological and performance characteristics of male professional road cyclists, Sports Medicine, 2002.

📚Biblio

An inverse relationship between VO₂max and cycling efficiency in elite cyclists was documented by Lucia et al., 2002. VO₂max and power output limits referenced in this module are from Mujika and Padilla, 2002.

💡Spreadsheet

You can find the data and models at this link.


Checkout

Well done, you have completed the first module of this course.

👮🏻‍♂️Checkpoint

At the end of Module-1 you should be able to answer these questions with confidence:

  1. What are the essential building blocks of a mathematical model (variables, parameters, inputs, outputs, relationships), and how do they work together like a "shipping service"?
  2. What is the difference between linear and nonlinear models, and why might a quadratic model be more appropriate than a linear one for describing muscle force-length relationships?
  3. How can parameters like 'a' and 'b' in the heart rate model (HR_max = 220 - age) be adjusted to change model predictions, and what does this reveal about model calibration?
  4. How does the force-length relationship in knee extensors demonstrate the power of mathematical modeling to minimize required measurements while maximizing predictive capability?