Does urolithin A help runners recover?
Learn to train smart, run fast, and be strong with this endurance performance nerd alert from Thomas Solomon, PhD.
Evaluating the Impact of Urolithin A Supplementation on Running Performance, Recovery, and Mitochondrial Biomarkers in Highly Trained Male Distance Runners.
Whitfield et al. (2025) Sports Med
What type of study is this?
◦ This study is a randomised controlled trial. (NOTE: This is the “gold standard” approach for determining whether a treatment has a causal effect on an outcome of interest. In such a study, a sample of people representing the population of interest is randomised to receive the treatment or a no-treatment placebo (control), and the outcome of interest is measured before and after exposure to the treatment and control.)
What was the authors’ hypothesis or research question?
◦ The authors hypothesised that taking 1,000 milligrams per day of urolithin A for 4 weeks during an intensified altitude camp would reduce muscle damage and inflammation, improve mitochondrial biology, and translate into better endurance performance.
What did the authors do to test the hypothesis or answer the research question?
◦ The researchers enrolled 42 highly trained male middle- and long-distance runners who completed a 5-week camp that included a 3-week training block at roughly 1,700 to 2,200 meters. Participants were randomised to urolithin A (n=22) or placebo (n=20). (NOTE: Urolithin A is a compound produced by gut bacteria after eating certain foods, like pomegranates and berries.) Everyone completed pre- and post-testing for body composition, haemoglobin mass, running economy, and maximal oxygen uptake (V̇O2max). (NOTE: Running economy is the rate of energy expenditure, measured in kilojoules [kJ], kilocalories [kcal] or oxygen consumption [V̇O2], per kilogram body mass (kg) per unit of distance, i.e. per 1 kilometre travelled. A runner with a lower energy cost per kilometre has a higher economy than a runner with a higher energy cost. V̇O2max is the maximal rate of oxygen consumption your body can achieve during exercise. It is a measure of cardiorespiratory fitness and indicates the size of your engine, i.e., your maximal aerobic power, which contributes to endurance performance.) During camp, all athletes performed a weekly downhill run to provoke muscle damage, with capillary blood collected for creatine kinase (a muscle damage marker) and C-reactive protein (an inflammatory marker). A subset of 11 athletes did a 3,000-meter track time trial before and after supplementation, and a separate biopsy subset (n=11 urolithin A, n=9 placebo) provided muscle for proteomics and mitochondrial respiration assays.
What did the authors find?
◦ The 3,000-meter performance did not improve in either group after 4 weeks. However, ratings of perceived exertion (RPE) during the time trial were lower with urolithin A, and creatine kinase responses were smaller after the race, versus placebo. (NOTE: RPE is a simple way to score how hard exercise feels to you, not to a machine. You pick a number on a scale (often 1–10), where low numbers mean “this feels easy” and high numbers mean “I’m really pushing it.” It blends how heavy your breathing is, how tired your muscles feel, and how much effort you think you’re putting in.) Maximal oxygen uptake rose within both groups, with a larger within-group increase reported for urolithin A (about 5.4 percent; 66.4 to 70.0 milliliters per kilogram per minute) than placebo (about 3.6 percent; 66.4 to 68.7 milliliters per kilogram per minute), but the time-by-treatment interaction was not significant, meaning that the change over time did not differ between groups. Muscle proteomics showed upregulation of mitochondrial protein-containing complexes and downregulation of inflammatory pathways with urolithin A; mitophagy markers leaned higher (medium effect size), yet high-resolution mitochondrial respiration (a measure of oxygen consumption) in permeabilised muscle fibers did not change.
◦ The authors concluded that 4 weeks of urolithin A improved recovery-related signals and perceived effort but did not enhance 3,000-meter performance in highly trained male runners.
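For the number nerds, here is a minimal sketch of the percent-change arithmetic behind the maximal oxygen uptake figures above. It uses only the rounded group means quoted in this summary; the function name `percent_change` and the `groups` dictionary are my own illustrative choices, not anything from the paper. Because the inputs are rounded means, the output may not match the reported percentages exactly (the placebo value lands nearer 3.5 than 3.6 percent), which I assume is down to rounding or to percentages being calculated from individual data.

```python
def percent_change(pre: float, post: float) -> float:
    """Relative change from pre to post, expressed as a percentage of pre."""
    return (post - pre) / pre * 100.0


# Rounded group means (ml/kg/min) as quoted in the summary above.
groups = {
    "urolithin A": (66.4, 70.0),
    "placebo": (66.4, 68.7),
}

for name, (pre, post) in groups.items():
    print(f"{name}: {pre} -> {post} ml/kg/min ({percent_change(pre, post):.1f}% change)")
```

Keep in mind that comparing two within-group percentages like this is not a test of the treatment effect; that job belongs to the time-by-treatment interaction, which was not significant.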
What were the strengths?
◦ The trial used randomisation, identical placebo capsules, and double-blinding, with training and diet closely standardised in a residential training camp. (NOTE: Blinding is when people in a study don’t know which treatment they’re getting. It stops expectations or beliefs (from patients or researchers) from skewing the results. “Single-blind” means participants don’t know; “double-blind” means participants and researchers don’t know; “triple-blind” means that the participants, researchers, and data analysts are kept in the dark. The goal is simple: fair tests and trustworthy findings.) The protocol was pre-registered, the primary outcomes were declared in advance, and statistical methods were appropriate and transparent. (NOTE: Preregistration is when a detailed description of a study plan is deposited in an open-access repository before collecting the study data. It promotes transparency and accountability, and boosts research integrity. Without preregistration, it is easier for scientists to change outcomes after seeing the data, selectively report “exciting” results, or run many analyses and only show the ones that work, which can introduce bias and weaken the trustworthiness of the findings.) The team also went beyond field tests by adding biopsies, proteomics, and respiration to link performance-adjacent outcomes with muscle biology.
What were the limitations?
◦ The total sample was modest and split across performance and biopsy substudies. This can sap statistical power and make it harder to spot true but small effects. (NOTE: Statistical power is the probability that a statistical test will correctly detect a real effect if there is one: a true positive. In jargon, power is the probability that a statistical test correctly rejects a false null hypothesis. Higher statistical power reduces the risk of a false negative, i.e. failing to detect a true effect, also called a Type II error. Power is typically influenced by sample size, effect size, significance level, and variability in the data, with a common target being at least 80% (or 0.8).) There was no formal power calculation, and the non-significant treatment interaction for maximal oxygen uptake means we should be cautious about over-reading the larger within-group change. (NOTE: A power calculation is a way to figure out how many people or data points you need in a study so you can reliably spot a real effect if it exists. It balances four things: the size of the effect you care about, how much random variation there is, how strict you are about false alarms, and how likely you want to be to detect the effect. In plain terms: it helps you avoid running a study that’s too small to be useful or so big that it wastes time and money.) The performance test was a single 3,000-meter time trial per phase, and runners were male only, which limits generalisability to other populations. (NOTE: Generalisability is about how far you can confidently stretch a study’s findings beyond the specific people, place, and conditions that were tested. In simple terms, it asks: “If this result is true here, how likely is it to also be true in other groups or real-world settings?” It’s closely linked to external validity, which is the overall strength of those broader conclusions.) Finally, the sponsor manufactured the supplement and some authors were employees, which raises the usual questions about industry involvement, although conflicts of interest were explicitly stated.
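To make the power point concrete, here is a rough sketch of what a power calculation can look like. It is not the authors' analysis (they did not report one). It uses the textbook normal approximation for a simple two-group comparison, which is a simplification of the pre/post design used in this trial. The standardised effect size of 0.5, the 5% significance level, and the 80% power target are my own assumptions for illustration; only the group sizes (22 and 20) come from the study. The function names are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, SD 1)


def n_per_group(effect_size: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate participants needed per group for a two-sided,
    two-group comparison of means (normal approximation)."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = _STD_NORMAL.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


def approx_power(effect_size: float, n1: int, n2: int, alpha: float = 0.05) -> float:
    """Approximate power for group sizes n1 and n2 (normal approximation;
    ignores the negligible chance of rejecting in the wrong direction)."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    noncentrality = effect_size * sqrt(n1 * n2 / (n1 + n2))
    return _STD_NORMAL.cdf(noncentrality - z_alpha)


ASSUMED_EFFECT_SIZE = 0.5  # hypothetical "medium" standardised effect, not from the paper

print("Needed per group:", n_per_group(ASSUMED_EFFECT_SIZE))
print("Approximate power with 22 vs 20:", round(approx_power(ASSUMED_EFFECT_SIZE, 22, 20), 2))
```

Under these assumptions, the groups in this trial would be well short of the conventional 80% target for a medium-sized effect, and the smaller performance and biopsy subsets even more so. A smaller assumed effect would demand more runners still.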
How was the study funded, and are there any conflicts of interest that may influence the findings?
◦ The study reports research funding from Amazentis SA. Three authors were employees of Amazentis SA; one author reported travel support from the sponsor; another reported prior employment and advisory roles with nutrition companies; and one author sits on the journal’s editorial board (but was not involved in decisions for this paper).
How can you apply these findings to your training or coaching practice?
◦ For coaches and endurance nerds, there’s a practical signal: urolithin A reduced muscle-damage markers and made the same race distance feel easier, which might help athletes train hard more often, especially during camps at altitude. But since the stopwatch didn’t budge over 4 weeks, performance-minded folks should treat this as a potential recovery aid, not a magic speed potion. In other words, nothing to see here, move along, move along.
◦ One last thought: If something makes hard efforts feel easier and trims muscle-damage signals without moving the finish time after 4 weeks, does it still help over a long race season with races stacked and legs tired? Future research should answer that question.
What is my Rating of Perceived scientific Enjoyment (RPsE)?
7 out of 10 → I experienced moderate scientific enjoyment because the design was strong and the mechanistic depth was excellent, but the split samples and lack of a power calculation limit my confidence about small performance effects.
Important: Don’t make any major changes to your daily habits based on the findings of one study, especially if the study is small (e.g., fewer than 30 participants in a randomised controlled trial or fewer than 5 studies in a meta-analysis) or poor quality (e.g., high risk of bias or low quality of evidence). What do other trials in this field show? Do they confirm the findings of this study or have mixed outcomes? Is there a high-quality systematic review and meta-analysis evaluating the entirety of the evidence in this field? If so, what does the analysis show? What is the risk of bias or certainty of evidence across the included studies?
◦ Risk of bias in meta-analysis refers to the potential for systematic errors in the studies included in the analysis, which can lead to misleading or invalid results. Assessing this risk is crucial to ensure the conclusions drawn from the combined data are reliable.
◦ A low quality of evidence means that, in general, studies in this field have several limitations. This could be due to inconsistency in effects between studies, a large range of effect sizes between studies, and/or a high risk of bias (caused by inappropriate controls, a small number of studies, small numbers of participants, poor/absent randomization processes, missing data, inappropriate methods/statistics). When the quality of evidence is low, there is more doubt and less confidence in the overall effect of an intervention, and future studies could easily change overall conclusions. The best way to improve the quality of evidence is for scientists to conduct large, well-controlled, high-quality randomized controlled trials.
◦ Certainty of evidence tells us how confident we are that the results reflect the true effect. It’s based on factors like study design, risk of bias, consistency, directness, and precision. Low certainty means more doubt and less confidence, and that future studies could easily change the conclusions. High certainty means that the current evidence is so strong and consistent that future studies are unlikely to change conclusions.
Access to education is a right, not a privilege
Equality in education, health, and sustainability matters deeply to me. I was fortunate to be born into a social welfare system where higher education was free. Sadly, that's no longer true. That's why I created Veohtu: to make high-quality exercise science and sports nutrition education freely available to folks from all walks of life. All the content is free, and always will be.