The Endurance Performance Nerd Alert
Learn to train smart, run fast, and be strong with Thomas Solomon, PhD 
November 2025

Use this Nerd Alert of the latest exercise science and sports nutrition research to improve your running performance or coaching practice.
The research studies are divided into subtopics —
media debunking,
null findings,
training methods,
sports nutrition,
supplements,
recovery,
athlete health,
injuries and rehab,
the placebo effect,
masters athletes, and
female athlete physiology
— but I’ve also provided a deeper dive into 4 studies:
Does durability predict marathon speed?
Do compression socks help runners?
Do teen athletes know sports nutrition?
Does post-exercise stretching aid recovery?
And, there’s my beer of the month to wash it all down.
All the interesting papers I found this month are immediately below.
Dig in and evaluate the authors’ findings by clicking on the titles to access the full papers.
Evaluate each paper thoughtfully—be sceptical, not cynical. To guide you, consider using the framework I applied when doing my deep dives. This approach will help you assess the quality of a study while also appreciating the complexity and nuance of scientific research.
General training methods and performance
Durability of Parameters Associated With Endurance Running in Marathoners. Hunter et al. (2025) Eur J Sport Sci.
Influence of Trail Running Footwear Foam on Running Economy and Perceptual Metrics. Muzeau et al. (2025) Eur J Sport Sci.
Persistent Improvements in Running Economy With Advanced Footwear Technology During Prolonged Running in Trained Male Runners. Madsen et al. (2025) Scand J Med Sci Sports.
Tailoring exercise intensity: Acute and chronic effects of constant-speed and heart rate-clamped exercise in healthy, inactive adults. Mazzolari et al. (2025) J Sci Med Sport.
Prolonged running reduces speed at the moderate-to-heavy intensity transition without additional reductions due to increased eccentric load. Barrett et al. (2025) Eur J Appl Physiol.
Comparative effects of low-load blood flow restriction training and high-load resistance training on physical performance in college 800-m runners: a randomized control trial. Yu et al. (2025) Front Physiol.
Muscle Oxygenation Threshold in More and Less Active Muscles and 3,000-m Running Pace. Cirino et al. (2025) Int J Sports Med.
Can Level Ground Biomechanics Predict Uphill and Downhill Running Economy? Steele et al. (2025) J Appl Biomech.
A comparative study of lactate threshold testing outcomes: walking vs. running. Larssen et al. (2025) J Sports Med Phys Fitness.
Examination of individual responses to acute ketone monoester ingestion during maximal aerobic exercise at altitude may generate physiological insights. McCarthy et al. (2025) J Appl Physiol (1985).
Effect of work-to-rest ratio on acute responses to repeated-sprint training in hypoxia in elite track endurance cyclists. Barratt et al. (2025) J Sports Sci.
Persisting elevation of total hemoglobin mass after altitude training in elite swimmers: a potential role of prolonged erythrocyte survival. Carin et al. (2025) Am J Physiol Heart Circ Physiol.
Sports nutrition and hydration
Effects of ketogenic diet on muscle mass, strength, aerobic metabolic capacity, and endurance in adults: a systematic review and meta-analysis. Wang et al. (2025) J Health Popul Nutr.
General and sport-specific nutrition knowledge and behaviors of adolescent athletes. Gibbs et al. (2025) J Int Soc Sports Nutr.
Under Consumed and Overestimated: Discrepancies in Race-Day Carbohydrate Intake Among Endurance Athletes. Lanpir et al. (2025) Eur J Sport Sci.
The Effect of Heat Stress and Dehydration on Carbohydrate Use During Endurance Exercise: A Systematic Review and Meta-Analysis. Mougin et al. (2025) Sports Med.
Sports supplements
Ergogenic effects of supplement combinations on endurance performance: a systematic review and meta-analysis of randomized controlled trials. Zart et al. (2025) J Int Soc Sports Nutr.
Recovery
Wearing Compression Socks During Running Does Not Change Physiological, Running Performance, and Perceptual Outcomes: A Systematic Review With Meta-Analysis. Telles et al. (2025) J Sport Rehabil.
Effects of post-exercise stretching versus no stretching on lower limb muscle recovery and performance: a meta-analysis. Zhang et al. (2025) Front Physiol.
Athlete health
Impact of Relative Energy Deficiency in Sport (REDs) on Bone Health in Elite Athletes: A Retrospective Analysis. von et al. (2025) J Cachexia Sarcopenia Muscle.
Injury and rehab
Association of Shoe Cushioning Perception and Comfort With Injury Risk in Leisure-Time Runners: A Secondary Analysis of a Randomised Trial. Malisoux et al. (2025) Eur J Sport Sci.
Acute Kidney Injury Biomarkers in Marathon Runners: Systematic Review and Meta-Analysis. Leucuța et al. (2025) Medicina (Kaunas).
Female athletes and sex differences
Sex Differences in the Impact of Exercise Volume on Subclinical Coronary Atherosclerosis: A Meta-Analysis. Abdelaziz et al. (2025) JACC Adv.
My deep dives
Does durability predict marathon speed?
Hunter et al. (2025) Eur J Sport Sci: Durability of Parameters Associated With Endurance Running in Marathoners.
What type of study is this?
This study is an observational studyAn observational study is where researchers observe what naturally occurs without intervening — no treatment is assigned. I.e., the researchers watch and learn, but don’t interfere. Observational studies are used in epidemiology and can have different study designs, including cross-sectional, case-control, and cohort study designs. linking data from pre–post lab testing to real marathon race performance.
What was the authors’ hypothesis or research question?
The authors aimed to test whether the durability of key endurance parameters after prolonged running is associated with marathon performance.
What did the authors do to test the hypothesis or answer the research question?
The researchers recruited 23 London Marathon entrants; 18 people (11 male and 7 female, mean age 41) completed all parts of the study and were analysed. Each runner completed two lab visits: a baseline incremental treadmill test to determine V̇O2peakV̇O2peak is the highest oxygen uptake measured during a test, even if VO₂max wasn’t fully reached. I.e., it is the best effort recorded, but not always your max., lactate thresholdLactate threshold is the exercise intensity at which lactate starts to accumulate rapidly in the blood, signalling a shift toward more glycolytic (glucose-using) and anaerobic energy use. It is typically the intensity at which effort begins to feel hard and fatigue builds faster. speed, fractional utilisation Fractional utilisation is the percentage of your VO₂max that you can sustain at a given effort, often measured at the lactate threshold. It describes how much of your engine’s max power you can use steadily without fatigue. at lactate threshold, and running economyThe rate of energy expenditure (measured in kiloJoules [KJ], kilocalories [kcal] or oxygen consumption [V̇O2]) per kilogram body mass (kg) per unit of distance, i.e. per 1 kilometre travelled. A runner with a lower energy cost per kilometre has a higher economy than a runner with a higher energy cost.; and, on a separate day, a 90-minute run at the individual’s lactate-threshold speed followed immediately by the same incremental test. Race files from the 2024 London Marathon were used to compute marathon speed and heart-rate–to–speed “decoupling.” Outcomes were the pre–post changes in peak oxygen uptake, lactate-threshold speed, fractional utilisation at lactate threshold, running economy, and their associations with marathon speed, plus the onset and magnitude of in-race decoupling.
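The heart-rate–to–speed “decoupling” metric can be computed directly from race split data. The paper’s exact algorithm isn’t reproduced here; a common approach compares the mean HR:speed ratio in the second half of the run against the first half. A minimal sketch, assuming made-up split data (the function name and inputs are illustrative, not the authors’ code):

```python
def decoupling_pct(hr, speed):
    """Percent rise in the heart-rate-to-speed ratio between the first
    and second half of a run. A positive value means heart rate drifted
    upward relative to speed (cardiovascular 'decoupling')."""
    n = len(hr) // 2
    first = sum(h / s for h, s in zip(hr[:n], speed[:n])) / n
    second = sum(h / s for h, s in zip(hr[n:], speed[n:])) / (len(hr) - n)
    return (second - first) / first * 100

# Hypothetical splits: speed fades while heart rate holds steady,
# so the HR:speed ratio climbs in the second half.
hr    = [150, 151, 152, 153, 155, 157, 158, 159]      # beats per minute
speed = [13.0, 13.0, 12.9, 12.8, 12.4, 12.1, 11.9, 11.7]  # km/h
print(round(decoupling_pct(hr, speed), 1))  # → 11.6
```

A runner whose ratio stays flat (near 0%) would be considered more “durable” by this metric; in the study, decoupling began around 27 km on average.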
What did the authors find?
Eighteen runners completed both lab visits and the marathon, averaging 3 hours 17 minutes (range 2 hours 27 minutes to 4 hours 14 minutes). After the 90-minute run, V̇O2peak fell from 56.7 to 53.4 millilitres per kilogram per minute (p-valueA p-value is a statistical measure that indicates the probability that the result is at least as extreme as that observed if the null-hypothesis was true. If P is small, the observed difference is big enough to disprove (reject) the null hypothesis. In very basic terms, P equals the probability that the effect could be explained by random chance, and a P-value of less than 0.05 means the results look so promising that there’s only a 1-in-20 (or 5%) chance that they would have occurred if the treatment had no effect at all. Common thresholds for statistical significance are 0.05, 0.01, and 0.001. < 0.001; large effect), and lactate threshold speed fell from 12.8 to 12.1 kilometres per hour (p < 0.001; very large effect). Fractional utilisation and running economy did not change meaningfully. At lactate threshold, oxygen uptake and carbohydrate oxidation decreased, while fat oxidation increased; several ventilatory variables also shifted. Marathon speed correlated strongly with baseline lactate threshold speed (r-valuePearson’s r-value represents the correlation coefficient, which is a statistic that measures the strength and direction of a linear relationship between two variables, ranging from -1 to +1. An r value close to +1 indicates a strong positive correlation, close to -1 a strong negative correlation, and around 0 no linear relationship. = 0.937, p < 0.001) and with baseline peak oxygen uptake (r = 0.809, p < 0.001). Importantly, the percentage change in lactate-threshold speed from fresh to fatigued was also associated with marathon speed (r = 0.680, p < 0.01), indicating that smaller deteriorations tracked with faster races. 
Heart-rate–to–speed decoupling occurred around 27 kilometres on average and did not relate to performance in this cohort.
The authors concluded that a laboratory measure of durability — i.e., the post-fatigue drop in lactate threshold speed — is associated with marathon performance.
What were the strengths?
The design connects lab physiology to a real race, not just time trials, which is rare and useful. The protocol was clear, instruments and calibration were described, outcomes were pre-specified, and statistics were transparent with effect sizesA standardised measure of the magnitude of an effect of an intervention. Unlike p-values, effect sizes show the size of the effect and how meaningful it might be. Common effect size measures include standardised mean difference (SMD), Cohen’s d, Hedges’ g, eta-squared, and correlation coefficients. and correction for multiple testing. The marathon analysis used verified split data, and the paper reported exact measurement times, testing environment, and missing data. In short, the methods are tidy and detailed enough to replicate.
What were the limitations?
A sample of just 18 participants reduces statistical powerStatistical power is the probability that a statistical test will correctly detect a real effect if there is one: a true positive. (In jargon: power is the probability that a statistical test correctly rejects a false null hypothesis). Higher statistical power reduces the risk of a false negative (failing to detect a true effect; or a Type II error). Power is typically influenced by sample size, effect size, significance level, and variability in the data, with a common target being at least 80% (or 0.8). and makes correlation estimates (r-valuesPearson’s r-value represents the correlation coefficient, which is a statistic that measures the strength and direction of a linear relationship between two variables, ranging from -1 to +1. An r value close to +1 indicates a strong positive correlation, close to -1 a strong negative correlation, and around 0 no linear relationship.) wobbly, and means some true effectsWhen a statistical test fails to detect an effect or difference when there actually is one. I.e, “a missed detection”. Studies with a small sample size (N, number of participants) are more likely to produce false negative results. might have been missed. The 90-minute run covered about half a marathon, so durability may be underestimated for the final stages of a marathon race when things really bite. Furthermore, dietary intake, pacing strategy, and hydration during the race were not controlled, and the analyses were largely unadjusted correlations without taking such confounding factorsA third factor that influences both the exposure (or intervention) and the outcome, creating a false association (or effect). Adjusting for a confounding factor can change the story. into account.
How was the study funded, and are there any conflicts of interest that may influence the findings?
The authors reported no specific funding, and declared no conflicts of interest.
How can you apply these findings to your training or coaching practice?
For coaches and marathoners, this is immediately actionable. If you can test lactate threshold speed fresh and then again after about 90 minutes at threshold pace, the percentage drop gives you a handle on your durability, which actually ties to your speed on race day. It also nudges us to consider carbohydrate availability and hydration during long runs because those likely shape the drop. The force is strong with this one, though it still needs bigger samples and longer fatiguing runs to mimic the late-race fade.
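The durability check itself is simple arithmetic on the two lactate-threshold speeds. As a minimal sketch, using the group means reported in the study (12.8 km/h fresh vs 12.1 km/h after the 90-minute run):

```python
def durability_drop_pct(fresh_lt_speed, fatigued_lt_speed):
    """Percentage fall in lactate-threshold speed from a fresh test to a
    repeat test after ~90 minutes at threshold pace. Smaller drops
    tracked with faster marathon speeds in Hunter et al. (2025)."""
    return (fresh_lt_speed - fatigued_lt_speed) / fresh_lt_speed * 100

# Group means from the study: 12.8 km/h fresh, 12.1 km/h fatigued.
print(round(durability_drop_pct(12.8, 12.1), 1))  # → 5.5
```

So the average runner in this cohort lost roughly 5.5% of threshold speed; tracking your own percentage drop over a training block would be the practical application, though no threshold for a “good” drop has been validated.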
What is my Rating of Perceived scientific Enjoyment?
RP(s)E = 6 out of 10.
I experienced moderate scientific enjoyment because although the lab-to-race link is elegant and the methods are transparent, the small number of participants, limited control of confounding factorsA third factor that influences both the exposure (or intervention) and the outcome, creating a false association (or effect). Adjusting for a confounding factor can change the story., and a half-marathon-length low-intensity fatigue stimulus reduce my confidence in the findings.
Important: Don’t make any major changes to your daily habits based on the findings of one study, especially if the study is small (e.g., less than 30 participants in a randomised controlled trial or less than 5 studies in a meta-analysis) or poor quality (e.g., high risk of biasRisk of bias in meta-analysis refers to the potential for systematic errors in the studies included in the analysis, which can lead to misleading or invalid results. Assessing this risk is crucial to ensure the conclusions drawn from the combined data are reliable. or low quality of evidenceA low quality of evidence means that, in general, studies in this field have several limitations. This could be due to inconsistency in effects between studies, a large range of effect sizes between studies, and/or a high risk of bias (caused by inappropriate controls, a small number of studies, small numbers of participants, poor/absent randomization processes, missing data, inappropriate methods/statistics). When the quality of evidence is low, there is more doubt and less confidence in the overall effect of an intervention, and future studies could easily change overall conclusions. The best way to improve the quality of evidence is for scientists to conduct large, well-controlled, high-quality randomized controlled trials.). What do other trials in this field show? (Follow the link to explore those trials.) Do they confirm the findings of this study or have mixed outcomes?
Do compression socks help runners?
Telles et al. (2025) J Sport Rehabil: Wearing Compression Socks During Running Does Not Change Physiological, Running Performance, and Perceptual Outcomes: A Systematic Review With Meta-Analysis.
What type of study is this?
This study is a systematic reviewA systematic review answers a specific research question by systematically collating all known experimental evidence, which is collected according to pre-specified eligibility criteria. A systematic review helps inform decisions, guidelines, and policy. with meta-analysisA meta-analysis quantifies the overall effect size of a treatment by compiling effect sizes from all studies of that treatment..
What was the authors’ hypothesis or research question?
The authors aimed to test whether wearing compression socks during running improves physiology, running performance, or perceptual outcomes compared with regular socks or placebo sleeves.
What did the authors do to test the hypothesis or answer the research question?
The authors pooled 16 trials with 284 runners in the meta-analysis. Participants included non-injured runners, but their age and sex were not consistently reported across studies. Interventions were below-knee compression socks or sleeves versus regular socks or placebo. Outcomes covered heart rate, V̇O2maxV̇O2max is the maximal rate of oxygen consumption your body can achieve during exercise. It is a measure of cardiorespiratory fitness and indicates the size of your engine, i.e., your maximal aerobic power, which contributes to endurance performance., blood lactate, respiratory exchange ratio (RER)RER is the ratio of carbon dioxide you breathe out to oxygen you use. It hints at the type of fuel your body is predominantly burning: around 0.7 = mostly fat, around 1.0 = mostly carbs, >1.0 = all-out effort., running speed, time to exhaustion, total time, perceived exertion (RPE), and lower-limb soreness. Risk of biasRisk of bias in a meta-analysis refers to the potential for systematic errors in the studies included in the analysis. Such errors can lead to misleading/invalid results and unreliable conclusions. This can arise because of issues with the way participants are selected (randomisation), how data is collected and analysed, and how the results are reported. was assessed, and certainty of evidenceCertainty of evidence tells us how confident we are that the published results accurately reflect the true effect. It’s based on factors like study design, risk of bias, consistency, directness, precision, and publication bias. High certainty means that the current evidence is so strong and consistent that future studies are unlikely to change conclusions. Whereas, low certainty means more doubt and less confidence, and that future studies could easily change current conclusions. 
was assessed with GRADEGRADE, which stands for Grading of Recommendations Assessment, Development and Evaluation, is a standardised and structured approach used to assess the certainty of evidence in meta-analyses. It evaluates how “confident” researchers are in the results of studies and the recommendations that follow from them. GRADE rates a body of evidence as “high”, “moderate”, “low”, or “very low” certainty using a set of standardised criteria..
What did the authors find?
Across physiological outcomes, pooled effects showed no benefit of compression socks. Heart rate during running (10 trials, n=197) had a mean difference (MD) of −0.82 beats per minute (95% confidence interval [CI]A measure of uncertainty used in Frequentist statistics. The 95% confidence interval is a plausible range of values within which the true value (e.g., the true treatment effect) would be found 95% of the time if the data were repeatedly collected in different samples of people. If this range of values (the confidence interval) crosses zero, there is little confidence that the average value is the true effect. If the confidence interval does not cross zero, we can be confident that the average value is the true effect. −2.03 to 0.39; I-squared (I2)I-squared (I2) is a statistic used in meta-analysis to quantify the percentage of variation across studies that is due to heterogeneity rather than chance. A low I2 value (25-50%) indicates that the findings are relatively consistent between studies, whereas a high I2 value (>75%) suggests considerable variability among the study results, which may affect the reliability of the overall conclusions. =0%; moderate certainty of evidenceA moderate quality of evidence means that, in general, studies in this field have some limitations. This could be due to somewhat inconsistent effects between studies, a moderate range of effect sizes between studies, and/or a moderate risk of bias (caused by a small to medium number of studies, small to medium numbers of participants, partially described randomisation processes, some missing data, some inappropriate methods/statistics). When the quality of evidence is moderate, there is some doubt and only moderate confidence in the overall effect of an intervention, and future studies could change overall conclusions. The best way to improve the quality of evidence is for scientists to conduct large, well-controlled, high-quality randomised controlled trials.).
Percent of maximal heart rate (3 trials, n=45) showed a mean difference of −0.68 percentage points (95% CI −2.19 to 0.83; I2=0%; low certainty of evidenceA low quality of evidence means that, in general, studies in this field have several limitations. This could be due to inconsistency in effects between studies, a large range of effect sizes between studies, and/or a high risk of bias (caused by inappropriate controls, a small number of studies, small numbers of participants, poor/absent randomisation processes, missing data, inappropriate methods/statistics). When the quality of evidence is low, there is more doubt and less confidence in the overall effect of an intervention, and future studies could easily change overall conclusions. The best way to improve the quality of evidence is for scientists to conduct large, well-controlled, high-quality randomised controlled trials.). Blood lactate post-run (7 trials, n=108) showed a mean difference of 0.30 (95% CI −0.39 to 0.98; I2=0%; low certainty). VO₂ during running (7 trials, n=98) showed a mean difference of −0.18 (95% CI −1.04 to 0.68; I2=0%; very low certainty), and post-run (3 trials, n=33) showed a mean difference of −0.39 (95% CI −3.27 to 2.49; very low certainty). Respiratory exchange ratio during running showed no difference (3 trials, n=44; standardised mean difference near 0; P=.33; low certainty). For performance, total running time (5 trials, n=73) showed a standardised mean difference (SMD)The standardised mean difference (SMD) is a statistical measure used to compare the mean (average) value between two groups, expressed in terms of standard deviations rather than the value’s original units. The SMD is calculated as the mean difference or mean change in the variable divided by the standard deviation of the variable at baseline.
The SMD is often used in meta-analysis because it allows researchers to combine results from studies that use different measurement scales, providing a common metric for effect size across studies. of 0.06 (95% CI −0.27 to 0.38; I2=0%; moderate certainty); running speed (3 trials, n=49) showed a mean difference of −0.24 (95% CI −0.79 to 0.31; I2=0%; very low certainty ); time to exhaustion (4 trials, n=51) showed an SMD of −0.26 (95% CI −0.65 to 0.13; I2=0%; low). Perceptual outcomes were also unchanged: perceived exertion during running (13 trials, n=236, SMD 0.06, 95% CI −0.17 to 0.29; I2=33%; moderate certainty), and lower-limb soreness post-run (3 trials, n=42, SMD 0.08, 95% CI −0.35 to 0.51; I2=0%; very low certainty). The authors downgraded the certainty of evidenceCertainty of evidence tells us how confident we are that the published results accurately reflect the true effect. It’s based on factors like study design, risk of bias, consistency, directness, precision, and publication bias. High certainty means that the current evidence is so strong and consistent that future studies are unlikely to change conclusions. Whereas, low certainty means more doubt and less confidence, and that future studies could easily change current conclusions. for a high risk of biasRisk of bias in a meta-analysis refers to the potential for systematic errors in the studies included in the analysis. Such errors can lead to misleading/invalid results and unreliable conclusions. This can arise because of issues with the way participants are selected (randomisation), how data is collected and analysed, and how the results are reported. and possible publication biasPublication bias in meta-analysis occurs when studies with significant results are more likely to be published than those with non-significant findings, leading to distorted conclusions. 
This bias can inflate effect sizes and misrepresent the true effectiveness of interventions, making it crucial to identify and correct for it in research..
The authors concluded that compression socks do not improve physiological, performance, or perceptual outcomes during running versus regular socks, though they also do not seem to cause harm.
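For readers curious how those pooled mean differences, confidence intervals, and I2 values arise, here is a minimal inverse-variance fixed-effect sketch. The per-study numbers are invented for illustration, not the review’s data, and real meta-analyses typically also fit random-effects models:

```python
import math

def pool_fixed(effects, ses):
    """Inverse-variance fixed-effect pooling of per-study mean
    differences, returning the pooled estimate, its 95% CI, and
    Higgins' I-squared heterogeneity statistic (as a percentage)."""
    w = [1 / se ** 2 for se in ses]          # weight = 1 / variance
    pooled = sum(wi * e for wi, e in zip(w, effects)) / sum(w)
    se_pooled = math.sqrt(1 / sum(w))
    ci = (pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled)
    q = sum(wi * (e - pooled) ** 2 for wi, e in zip(w, effects))  # Cochran's Q
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, ci, i2

# Hypothetical mean differences (bpm) and standard errors from three
# small crossover trials -- not the actual data from Telles et al.
md, ci, i2 = pool_fixed([-1.2, 0.4, -0.6], [1.0, 1.5, 1.2])
print(round(md, 2), round(i2))  # → -0.67 0
```

With effects this consistent, Q falls below its degrees of freedom and I2 is truncated to 0%, which is exactly the pattern seen across most outcomes in this review: a pooled CI straddling zero with no meaningful heterogeneity.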
What were the strengths?
The review used a protocol pre-registrationPreregistration is when a detailed description of a study plan is deposited in an open-access repository before collecting the study data. It promotes transparency and accountability, and boosts research integrity., searched 5 major databases without language limits, and applied Cochrane risk of bias (RoB2) toolThe Cochrane Risk of Bias 2 (RoB 2) tool is a standardised instrument developed by Cochrane for assessing the risk of bias in randomised controlled trials (RCTs). It is widely used to evaluate the internal validity of results from studies in a systematic review by examining bias arising from the randomisation process, deviations from intended interventions, missing outcome data, the measurement of the outcome, and the selection of the reported result. and GRADEGRADE, which stands for Grading of Recommendations Assessment, Development and Evaluation, is a standardised and structured approach used to assess the certainty of evidence in meta-analyses. It evaluates how “confident” researchers are in the results of studies and the recommendations that follow from them. GRADE rates a body of evidence as “high”, “moderate”, “low”, or “very low” certainty using a set of standardised criteria. to rate certainty of evidenceCertainty of evidence tells us how confident we are that the published results accurately reflect the true effect. It’s based on factors like study design, risk of bias, consistency, directness, precision, and publication bias. High certainty means that the current evidence is so strong and consistent that future studies are unlikely to change conclusions. Whereas, low certainty means more doubt and less confidence, and that future studies could easily change current conclusions. — basically the full modern testing kit.
What were the limitations?
All pooled estimates came from randomised controlled trialsThe “gold standard” approach for determining whether a treatment has a causal effect on an outcome of interest. In such a study, a sample of people representing the population of interest is randomised to receive the treatment or a no-treatment placebo (control), and the outcome of interest is measured before and after the exposure to treatment/control. with crossoverCrossover means that all subjects completed all interventions (control and treatment) usually with a wash-out period in between. designs, but most trials were rated as high risk of biasRisk of bias in a meta-analysis refers to the potential for systematic errors in the studies included in the analysis. Such errors can lead to misleading/invalid results and unreliable conclusions. This can arise because of issues with the way participants are selected (randomisation), how data is collected and analysed, and how the results are reported. because of a lack of blindingBlinding is when people in a study don’t know which treatment they’re getting. It stops expectations or beliefs (from patients or researchers) from skewing the results. “Single-blind” means participants don’t know; “double-blind” means participants and researchers don’t know; “triple-blind” means that the participants, researchers, and data analysts are kept in the dark. The goal is simple: fair tests and trustworthy findings. and allocation concealmentAllocation concealment is the step that hides the next treatment assignment before a patient enters a trial. It prevents staff from guessing or peeking, so they can’t steer patients to one group or another. It happens at enrollment, before blinding, and guards against selection bias., as well as reporting biasOnly certain results—often positive or exciting—get written up or highlighted. The full picture stays hidden., which drag confidence down.
How was the study funded, and are there any conflicts of interest that may influence the findings?
The study was supported by the Rio de Janeiro research foundation and Brazil’s federal agency for graduate support; one author held an Australian NHMRC Investigator Grant. The paper provides funding acknowledgements but does not include an explicit conflicts-of-interest statement.
How can you apply these findings to your training or coaching practice?
For coaches and runners, the message is clear: wear compression socks if you like the feel, but don’t expect performance magic — “Nothing to see here, move along, move along.”
What is my Rating of Perceived scientific Enjoyment?
RP(s)E = 9 out of 10.
I experienced high scientific enjoyment because the methods were thorough, current, and transparent even if the answer is a boring null.
Important: Don’t make any major changes to your daily habits based on the findings of one study, especially if the study is small (e.g., less than 30 participants in a randomised controlled trial or less than 5 studies in a meta-analysis) or poor quality (e.g., high risk of biasRisk of bias in meta-analysis refers to the potential for systematic errors in the studies included in the analysis, which can lead to misleading or invalid results. Assessing this risk is crucial to ensure the conclusions drawn from the combined data are reliable. or low quality of evidenceA low quality of evidence means that, in general, studies in this field have several limitations. This could be due to inconsistency in effects between studies, a large range of effect sizes between studies, and/or a high risk of bias (caused by inappropriate controls, a small number of studies, small numbers of participants, poor/absent randomization processes, missing data, inappropriate methods/statistics). When the quality of evidence is low, there is more doubt and less confidence in the overall effect of an intervention, and future studies could easily change overall conclusions. The best way to improve the quality of evidence is for scientists to conduct large, well-controlled, high-quality randomized controlled trials.). What do other trials in this field show? (Follow the link to explore those trials.) Do they confirm the findings of this study or have mixed outcomes? Is there a high-quality systematic review and meta-analysis evaluating the entirety of the evidence in this field? (Follow the link to explore those reviews.) If so, what does the analysis show? What is the risk of bias or certainty of evidenceCertainty of evidence tells us how confident we are that the results reflect the true effect. It’s based on factors like study design, risk of bias, consistency, directness, and precision. 
Low certainty means more doubt and less confidence, and that future studies could easily change the conclusions. High certainty means that the current evidence is so strong and consistent that future studies are unlikely to change conclusions. across the included studies? I’ve written a deep-dive article on this topic; check it out at veohtu.com/recoverymagictool.
Do teen athletes know sports nutrition?
Gibbs et al. (2025) J Int Soc Sports Nutr: General and sport-specific nutrition knowledge and behaviors of adolescent athletes.
What type of study is this?
This study is a cross-sectionalA cross-sectional study is a type of observational study where the exposure and outcome are measured at a single point in time, giving a snapshot of a population—what’s happening right now. Cross-sectional studies are used in health surveys, prevalence studies, or for hypothesis generation, and can show prevalence (how common something is) and associations (but not cause and effect). observational studyAn observational study is where researchers observe what naturally occurs without intervening — no treatment is assigned. I.e., the researchers watch and learn, but don’t interfere. Observational studies are used in epidemiology and can have different study designs, including cross-sectional, case-control, and cohort study designs..
What was the authors’ hypothesis or research question?
The authors aimed to examine adolescent athletes’ general and sport-specific nutrition knowledge and behaviors, and to compare these between males and females.
What did the authors do to test the hypothesis or answer the research question?
The researchers surveyed 194 adolescent athletes from Michigan, Arizona, and California in the Peak Health and Performance program. There were 63 males and 131 females with a mean age of 14.9 years. They used items from the SPAN survey to capture food groups and drinks, and created Healthy and Unhealthy Food Index scores. A 20-item knowledge test covered macronutrients, micronutrients, hydration, and weight management, modeled on prior athlete/coach questionnaires and aligned with major nutrition position statements. Outcomes included diet patterns by sex, adherence to sport-nutrition practices (e.g., hydration checks, pre/post-event eating, supplement use), and the proportion of correct answers on knowledge items.
What did the authors find?
Females ate fruit more often than males (1.50 vs 1.15 times per day; p-valueA p-value is a statistical measure that indicates the probability that the result is at least as extreme as that observed if the null-hypothesis was true. If P is small, the observed difference is big enough to disprove (reject) the null hypothesis. In very basic terms, P equals the probability that the effect could be explained by random chance, and a P-value of less than 0.05 means the results look so promising that there’s only a 1-in-20 (or 5%) chance that they would have occurred if the treatment had no effect at all. Common thresholds for statistical significance are 0.05, 0.01, and 0.001.=0.04). Males drank sugary beverages more often (1.12 vs 0.76 times per day; p=0.004) and reported greater dairy intake (3.70 vs 2.35 times per day; p<0.001). Healthy Food Index scores were similar by sex, but the Unhealthy Food Index was higher in males (8.63 vs 6.20; p=0.002). For sport behaviors, males more often monitored hydration status and used sport supplements; females more often used vitamin/mineral supplements; post-competition eating was slightly higher in females (p=0.07). Knowledge was generally low: both sexes scored under 50% correct overall. Females did better on identifying all USDA MyPlate food groups (76.8% [96/125] vs 61.4% [35/57]; adjusted odds ratioThe odds ratio is a measure of association between an exposure and an outcome, representing the odds of the outcome occurring with the exposure compared to the odds without the exposure. An OR of 1 indicates no association; greater than 1 indicates increased odds; less than 1 indicates decreased odds. 2.27, 95% confidence interval [CI]A measure of uncertainty used in Frequentist statistics. The 95% confidence interval is a plausible range of values within which the true value (e.g., the true treatment effect) would be found 95% of the time if the data were repeatedly collected in different samples of people. 
If this range of values (the confidence interval) crosses zero, there is little confidence that the average value is the true effect. If the confidence interval does not cross zero, we can be confident that the average value is the true effect. 1.12 to 4.59), while males more often knew that most athlete carbohydrates should be complex (adjusted odds ratioThe odds ratio is a measure of association between an exposure and an outcome, representing the odds of the outcome occurring with the exposure compared to the odds without the exposure. An OR of 1 indicates no association; greater than 1 indicates increased odds; less than 1 indicates decreased odds. 0.46, 95% CIA measure of uncertainty used in Frequentist statistics. The 95% confidence interval is a plausible range of values within which the true value (e.g., the true treatment effect) would be found 95% of the time if the data were repeatedly collected in different samples of people. If this range of values (the confidence interval) crosses zero, there is little confidence that the average value is the true effect. If the confidence interval does not cross zero, we can be confident that the average value is the true effect. 0.24 to 0.89, favoring males) and that protein should provide about 10% to 35% of daily calories (66.7% vs 45.6%; adjusted odds ratioThe odds ratio is a measure of association between an exposure and an outcome, representing the odds of the outcome occurring with the exposure compared to the odds without the exposure. An OR of 1 indicates no association; greater than 1 indicates increased odds; less than 1 indicates decreased odds. 0.39, 95% CI 0.20 to 0.78, favoring males). Females were more likely to endorse that saturated fats can delay recovery (54.4% vs 36.8%; adjusted OR 2.15, 95% CI 1.11 to 4.18).
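To see where numbers like these come from, here is a minimal Python sketch (not the authors’ code) that computes a crude, unadjusted odds ratio with a Wald 95% confidence interval from the MyPlate item counts reported above (96 of 125 females vs 35 of 57 males correct). The crude estimate will differ somewhat from the paper’s adjusted odds ratio of 2.27 because the authors adjusted for age, ethnicity, and years in sport.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Crude odds ratio with a Wald 95% confidence interval.
    a/b = correct/incorrect in group 1; c/d = correct/incorrect in group 2."""
    or_ = (a / b) / (c / d)
    se_log = math.sqrt(1/a + 1/b + 1/c + 1/d)  # SE of log(OR)
    lo = math.exp(math.log(or_) - z * se_log)
    hi = math.exp(math.log(or_) + z * se_log)
    return or_, lo, hi

# MyPlate item: 96 of 125 females correct vs 35 of 57 males correct
or_, lo, hi = odds_ratio_ci(96, 125 - 96, 35, 57 - 35)
```

Running this gives a crude OR of about 2.08 (95% CI 1.06 to 4.09), in the same ballpark as the reported adjusted OR of 2.27 (1.12 to 4.59).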
The authors concluded that adolescent athletes under-consume recommended foods, over-consume sugary drinks, and show low general and sport-specific nutrition knowledge, with some sex differences that could guide targeted education.
What were the strengths?
The study clearly described the setting, participants, measures, and statistics, and it adjusted key comparisons for age, ethnicity, and years in sport. The diet screen tool has prior validityValidity means you're measuring what you think you're measuring. If a test claims to measure exercise performance, validity asks: Does it really? So, validity is about accuracy, and a valid tool hits the target. Without validity, results might look good but lead you in the wrong direction., and the knowledge instrument was content-validated by domain experts and aligned with major position standsA position stand is a detailed policy recommendation published by a society that describes a course of action for practice. and consensus statementsA consensus statement is a collective opinion of a society that is used to develop evidence-based guidelines., which is exactly what we want for a quick, real-world snapshot in youth sport.
What were the limitations?
It’s a convenience sample across athletes from the authors’ local areas, so representativeness is uncertain. All outcomes were self-reported in a single time window, so misclassification is possible and data could be unreliable. Access to foods and cost/availability were not measured, which might be confounding factorsA third factor that influences both the exposure (or intervention) and the outcome, creating a false association (or effect). Adjusting for a confounding factor can change the story. for diet patterns. The paper did not report the response rate or conduct a non-response analysis. And, odds ratiosThe odds ratio is a measure of association between an exposure and an outcome, representing the odds of the outcome occurring with the exposure compared to the odds without the exposure. An OR of 1 indicates no association; greater than 1 indicates increased odds; less than 1 indicates decreased odds. were used for several knowledge items; that’s fine for cross-sectional contrasts, but they don’t translate neatly into practical risk without knowing the absolute frequencies, which were only partially reported.
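That last limitation can be made concrete. When an outcome is common, the odds ratio overstates the corresponding risk ratio; the Zhang–Yu approximation converts one to the other given a baseline risk. A quick sketch, using illustrative numbers from the study:

```python
def or_to_rr(odds_ratio, baseline_risk):
    """Approximate the risk ratio implied by an odds ratio at a given
    baseline risk in the reference group (Zhang & Yu approximation)."""
    return odds_ratio / (1 - baseline_risk + baseline_risk * odds_ratio)

# The MyPlate item's adjusted OR of 2.27 with a common outcome
# (61.4% of males answering correctly) implies a much smaller risk ratio...
rr_common = or_to_rr(2.27, 0.614)
# ...whereas with a rare outcome the OR approximates the RR closely:
rr_rare = or_to_rr(2.27, 0.05)
```

Here the "impressive" OR of 2.27 corresponds to a risk ratio of only about 1.28 for the common outcome, but roughly 2.13 for the rare one, which is exactly why absolute frequencies matter.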
How was the study funded, and are there any conflicts of interest that may influence the findings?
The authors report internal research support from Michigan State University Extension and no conflicts of interest.
How can you apply these findings to your training or coaching practice?
For coaches and parents, the findings are actionable even if the exact numbers shift in future studies, because nutrition education should begin young. Work with registered sports nutritionists or dietitians to build simple routines around pre- and post-session eating, hydration checks, and cutting back on sugary drinks; integrate bite-size lessons on carbs, protein ranges, and fats into practice briefings. And, for coaches designing a youth program, a short curriculum that embeds nutrition into team culture could be the way to go. Basically, treat nutrition like training: not a taboo subject, but one that is discussed openly.
What is my Rating of Perceived scientific Enjoyment?
RP(s)E = 6 out of 10.
I experienced moderate scientific enjoyment because the methods are transparent and practical, but several limitations (see above) keep my confidence in the findings low until larger, high-quality studies are published.
Important: Don’t make any major changes to your daily habits based on the findings of one study, especially if the study is small (e.g., less than 30 participants in a randomised controlled trial or less than 5 studies in a meta-analysis) or poor quality (e.g., high risk of biasRisk of bias in meta-analysis refers to the potential for systematic errors in the studies included in the analysis, which can lead to misleading or invalid results. Assessing this risk is crucial to ensure the conclusions drawn from the combined data are reliable. or low quality of evidenceA low quality of evidence means that, in general, studies in this field have several limitations. This could be due to inconsistency in effects between studies, a large range of effect sizes between studies, and/or a high risk of bias (caused by inappropriate controls, a small number of studies, small numbers of participants, poor/absent randomization processes, missing data, inappropriate methods/statistics). When the quality of evidence is low, there is more doubt and less confidence in the overall effect of an intervention, and future studies could easily change overall conclusions. The best way to improve the quality of evidence is for scientists to conduct large, well-controlled, high-quality randomized controlled trials.). What do other trials in this field show? (Follow the link to explore those trials.) Do they confirm the findings of this study or have mixed outcomes? Is there a high-quality systematic review and meta-analysis evaluating the entirety of the evidence in this field? (Follow the link to explore those reviews.) If so, what does the analysis show? What is the risk of bias or certainty of evidenceCertainty of evidence tells us how confident we are that the results reflect the true effect. It’s based on factors like study design, risk of bias, consistency, directness, and precision. 
Low certainty means more doubt and less confidence, and that future studies could easily change the conclusions. High certainty means that the current evidence is so strong and consistent that future studies are unlikely to change conclusions. across the included studies? I’ve written several deep-dive articles on sports nutrition; check them out at veohtu.com/healthyeatingpattern, veohtu.com/performancenutrition, and veohtu.com/trainingnutrition.
Does post-exercise stretching aid recovery?
Zhang et al. (2025) Front Physiol: Effects of post-exercise stretching versus no stretching on lower limb muscle recovery and performance: a meta-analysis.
What type of study is this?
This study is a systematic reviewA systematic review answers a specific research question by systematically collating all known experimental evidence, which is collected according to pre-specified eligibility criteria. A systematic review helps inform decisions, guidelines, and policy. with meta-analysisA meta-analysis quantifies the overall effect size of a treatment by compiling effect sizes from all studies of that treatment..
What was the authors’ hypothesis or research question?
The authors aimed to clarify whether post-exercise stretching facilitates muscle recovery and enhances subsequent performance.
What did the authors do to test the hypothesis or answer the research question?
The researchers pooled randomized controlled trials and assessed changes in muscle soreness, strength, performance, flexibility, and pain threshold using standardised mean differencesThe standardised mean difference (SMD) is a statistical measure used to compare the mean (average) value between two groups, expressed in terms of standard deviations rather than the value’s original units. The SMD is calculated as the mean difference or mean change in the variable divided by the standard deviation of the variable at baseline. The SMD is often used in meta-analysis because it allows researchers to combine results from studies that use different measurement scales, providing a common metric for effect size across studies. (SMDs). In total, 15 studies with 465 participants were included; most involved healthy adults, with 1 study in minors, and samples ranged across male-only, female-only, and mixed-sex cohorts.
What did the authors find?
Post-exercise stretching did not meaningfully change any outcome versus no stretching. Pooled effects were trivial for soreness (standardised mean difference [SMD]The standardised mean difference (SMD) is a statistical measure used to compare the mean (average) value between two groups, expressed in terms of standard deviations rather than the value’s original units. The SMD is calculated as the mean difference or mean change in the variable divided by the standard deviation of the variable at baseline. The SMD is often used in meta-analysis because it allows researchers to combine results from studies that use different measurement scales, providing a common metric for effect size across studies. = −0.06, 95% confidence interval (CI)A measure of uncertainty used in Frequentist statistics. The 95% confidence interval is a plausible range of values within which the true value (e.g., the true treatment effect) would be found 95% of the time if the data were repeatedly collected in different samples of people. If this range of values (the confidence interval) crosses zero, there is little confidence that the average value is the true effect. If the confidence interval does not cross zero, we can be confident that the average value is the true effect. = −0.32 to 0.19, p-valueA p-value is a statistical measure that indicates the probability that the result is at least as extreme as that observed if the null-hypothesis was true. If P is small, the observed difference is big enough to disprove (reject) the null hypothesis. In very basic terms, P equals the probability that the effect could be explained by random chance, and a P-value of less than 0.05 means the results look so promising that there’s only a 1-in-20 (or 5%) chance that they would have occurred if the treatment had no effect at all. 
Common thresholds for statistical significance are 0.05, 0.01, and 0.001.=0.63; I-squared [I2]I-squared (I2) is a statistic used in meta-analysis to quantify the percentage of variation across studies that is due to heterogeneity rather than chance. A low I2 value (25-50%) indicates that the findings are relatively consistent between studies, whereas a high I2 value (>75%) suggests considerable variability among the study results, which may affect the reliability of the overall conclusions. = 35%), flexibility (−0.06, −0.31 to 0.20, p=0.67; I2 0%), and pain threshold (−0.02, −0.41 to 0.37, p=0.93; I2 0%), and small but non-significant for performance (0.18, −0.11 to 0.46, p=0.22; I2 10%) and strength (0.27, −0.14 to 0.68, p=0.19; I2 0%). Subgrouping by stretching frequency (3 or more sessions versus fewer than 3) showed no between-subgroup differences; for example, soreness was SMD=−0.41 (95% CI −0.96 to 0.13; I2 60.5%) with 3 or more sessions versus 0.12 (−0.13 to 0.38; I2 0%) with fewer than 3. Meta-regressionWhereas a meta-analysis asks, “What’s the overall effect across studies?”, a meta-regression asks, “Why do effects differ?”. It links study results to study features (e.g., sample size, age of participants, or how a treatment was delivered) to see which features predict bigger or smaller effects. The findings from a meta-regression can help design future studies to test hypotheses based on which features might be relevant. explored stretching type and training level but did not identify any meaningful moderators.
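For readers curious how pooled SMDs and I2 values like those above are produced, here is a simplified fixed-effect, inverse-variance pooling sketch with hypothetical study inputs. The review itself may well have used a random-effects model, so treat this as an illustration of the mechanics, not a reproduction of the authors’ analysis.

```python
import math

def pool_fixed(effects, ses):
    """Fixed-effect inverse-variance pooling of effect sizes (e.g., SMDs),
    returning the pooled estimate, its standard error, and I-squared (%)."""
    weights = [1 / se**2 for se in ses]           # inverse-variance weights
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    se_pooled = math.sqrt(1 / sum(weights))
    # Cochran's Q and the I-squared heterogeneity statistic
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, se_pooled, i2

# Hypothetical SMDs and standard errors from three small trials
pooled, se, i2 = pool_fixed([-0.10, 0.05, -0.12], [0.25, 0.30, 0.28])
```

With these made-up inputs, the pooled SMD is trivially negative (about −0.06) and I2 is 0%, mirroring the "trivial effect, low heterogeneity" pattern reported for soreness and flexibility.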
The authors concluded that post-exercise stretching, used alone as a recovery strategy, does not significantly improve soreness, strength, performance, flexibility, or pain threshold.
What were the strengths?
The review did the right “boring but important” things: a wide, transparent search across 8 databases; protocol pre-registrationPreregistration is when a detailed description of a study plan is deposited in an open-access repository before collecting the study data. It promotes transparency and accountability, and boosts research integrity.; explicit inclusion criteria; and risk of biasRisk of bias in a meta-analysis refers to the potential for systematic errors in the studies included in the analysis. Such errors can lead to misleading/invalid results and unreliable conclusions. This can arise because of issues with the way participants are selected (randomisation), how data is collected and analysed, and how the results are reported. assessments. Publication biasPublication bias in meta-analysis occurs when studies with significant results are more likely to be published than those with non-significant findings, leading to distorted conclusions. This bias can inflate effect sizes and misrepresent the true effectiveness of interventions, making it crucial to identify and correct for it in research. was checked, and GRADEGRADE, which stands for Grading of Recommendations Assessment, Development and Evaluation, is a standardised and structured approach used to assess the certainty of evidence in meta-analyses. It evaluates how “confident” researchers are in the results of studies and the recommendations that follow from them. GRADE rates a body of evidence as “high”, “moderate”, “low”, or “very low” certainty using a set of standardised criteria. was used to rate the certainty of evidenceCertainty of evidence tells us how confident we are that the published results accurately reflect the true effect. It’s based on factors like study design, risk of bias, consistency, directness, precision, and publication bias. High certainty means that the current evidence is so strong and consistent that future studies are unlikely to change conclusions. 
Whereas low certainty means more doubt and less confidence, and that future studies could easily change current conclusions.. Tidy work.
What were the limitations?
Many of the included trials were small and methodologically mixed, with moderate overall quality and a high risk of biasRisk of bias in a meta-analysis refers to the potential for systematic errors in the studies included in the analysis. Such errors can lead to misleading/invalid results and unreliable conclusions. This can arise because of issues with the way participants are selected (randomisation), how data is collected and analysed, and how the results are reported.. The number of studies per outcome was also modest (as low as 3), measurement tools varied widely, and most samples were healthy young adults, reducing generalizability to older adults and other populations.
How was the study funded, and are there any conflicts of interest that may influence the findings?
The authors state they received no financial support for the research, and reported no conflicts of interest.
How can you apply these findings to your training or coaching practice?
For athletes and coaches eyeing a recovery boost, these findings provide a reality check. If you like stretching, keep it for comfort or routine, but don’t expect it to accelerate recovery or boost next-session performance on its own — mix it with sleep, protein, and smart easy days. So “nothing to see here, move along, move along…”.
What is my Rating of Perceived scientific Enjoyment?
RP(s)E = 9 out of 10.
I experienced high scientific enjoyment because the review hit all the quality beats for a meta-analysis: broad database coverage (8 databases, comfortably more than the minimum of 3), a preregistered protocol, formal risk-of-bias assessment, planned publication-bias checks, and GRADE certainty appraisal.
Important: Don’t make any major changes to your daily habits based on the findings of one study, especially if the study is small (e.g., less than 30 participants in a randomised controlled trial or less than 5 studies in a meta-analysis) or poor quality (e.g., high risk of biasRisk of bias in meta-analysis refers to the potential for systematic errors in the studies included in the analysis, which can lead to misleading or invalid results. Assessing this risk is crucial to ensure the conclusions drawn from the combined data are reliable. or low quality of evidenceA low quality of evidence means that, in general, studies in this field have several limitations. This could be due to inconsistency in effects between studies, a large range of effect sizes between studies, and/or a high risk of bias (caused by inappropriate controls, a small number of studies, small numbers of participants, poor/absent randomization processes, missing data, inappropriate methods/statistics). When the quality of evidence is low, there is more doubt and less confidence in the overall effect of an intervention, and future studies could easily change overall conclusions. The best way to improve the quality of evidence is for scientists to conduct large, well-controlled, high-quality randomized controlled trials.). What do other trials in this field show? (Follow the link to explore those trials.) Do they confirm the findings of this study or have mixed outcomes? Is there a high-quality systematic review and meta-analysis evaluating the entirety of the evidence in this field? (Follow the link to explore those reviews.) If so, what does the analysis show? What is the risk of bias or certainty of evidenceCertainty of evidence tells us how confident we are that the results reflect the true effect. It’s based on factors like study design, risk of bias, consistency, directness, and precision. 
Low certainty means more doubt and less confidence, and that future studies could easily change the conclusions. High certainty means that the current evidence is so strong and consistent that future studies are unlikely to change conclusions. across the included studies? I’ve written a deep-dive article on this topic; check it out at veohtu.com/recoverymagictool.
My beer of the month
(Rating of Perceived beer Enjoyment)
8 out of 10
Access to education is a right, not a privilege
Equality in education, health, and sustainability matters deeply to me. I was fortunate to be born into a social welfare system where higher education was free. Sadly, that's no longer true. That's why I created Veohtu: to make high-quality exercise science and sports nutrition education freely available to folks from all walks of life. All the content is free, and always will be.
Every day is a school day.
Empower yourself to train smart.
Be informed. Stay educated. Think critically.