What are the difficulties in arriving at a consensus regarding the prevalence of long COVID?

NB: When considering information concerning groups of people with ME/CFS and those with long COVID, it is important to remember that ME/CFS is a symptom-based clinical diagnosis, not a mechanistic one. It is clear there is a high degree of shared pathophysiology between ME/CFS and long COVID, and the two diagnostic labels are not mutually exclusive. Importantly, some individuals with long COVID meet ME/CFS diagnostic criteria or have a dual diagnosis.

Key points

There are many difficulties faced when attempting to estimate the prevalence of long COVID (LC). In this, the first of two articles, ME Research UK highlights the following issues:
- There is no validated biomarker to detect LC. Rather, LC is identified based on the presence and duration of certain symptoms, and there are several different published definitions. Notably, the published definitions of LC are not consistent, meaning that the findings from prevalence studies using different definitions may not be comparable.
- Prevalence of LC varies depending on the method used to identify the study population, for example through self-report, hospital records or GP records. Prevalence also varies depending on where the data come from, for example LC clinics compared with a sample of the general population.
- LC prevalence also varies with the type of study design used. For example, studies that follow participants over time may estimate a different prevalence from those that look at a single time point.
While estimates of the prevalence of LC exist, they are limited by a number of factors such as the definition used, how representative the study population is of the general population, and how reliable the methods are.
Prevalence estimates change over time, and vary between countries and between research studies, meaning that it is not always appropriate to compare estimates directly or to extrapolate figures beyond the populations for which they were originally calculated.

Introduction

In this, the first of two articles discussing the difficulties faced when attempting to reach a consensus regarding a prevalence estimate for long COVID (LC), ME Research UK will highlight:

- Differences between definitions of LC and the impact these have on research.
- The influence of using different methods to identify cases of LC.
- Difficulties in identifying cases of LC, especially now that COVID-19 testing is no longer routine.
- How study design can influence the estimated prevalence of a disease.

Differences in definitions used

Ideally, LC would be reliably diagnosed using a measurable biological indicator (biomarker) over a set period of time – more than 12 weeks according to NICE – following a confirmed COVID-19 infection. Such a biomarker would clearly distinguish LC from other diseases, and would have a high ability both to detect those with LC (sensitivity) and to identify those without the disease (specificity).
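To make sensitivity and specificity concrete, the following is a minimal Python sketch showing how the two measures are calculated, and how an imperfect test changes the prevalence that a study would observe. All counts and the 5% "true" prevalence are hypothetical values chosen for illustration; they are not taken from any study of LC.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of people WITH the condition that the test correctly flags."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Share of people WITHOUT the condition that the test correctly clears."""
    return true_neg / (true_neg + false_pos)


def apparent_prevalence(true_prev: float, sens: float, spec: float) -> float:
    """Share of a population that would test positive:
    true positives plus false positives."""
    return sens * true_prev + (1 - spec) * (1 - true_prev)


# Hypothetical validation counts (assumptions for illustration only).
sens = sensitivity(true_pos=90, false_neg=10)    # 90 of 100 true cases detected
spec = specificity(true_neg=800, false_pos=200)  # 800 of 1,000 non-cases cleared

# Hypothetical true prevalence of 5%.
observed = apparent_prevalence(true_prev=0.05, sens=sens, spec=spec)
print(f"Sensitivity: {sens:.0%}, specificity: {spec:.0%}")
print(f"Apparent prevalence: {observed:.1%}")
```

Under these assumed figures, roughly 23.5% of the population would be classed as cases even though only 5% truly have the condition, because false positives among the much larger unaffected group outnumber the true cases.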

However, as with ME/CFS, there is no validated biomarker to detect LC. Rather, LC is identified based on the presence and duration of certain symptoms, and there are several different published definitions of LC, including: NASEM 2024, WHO 2021, the United States National Center for Health Statistics (US-NCHS) definition, the United Kingdom Office for National Statistics (UK-ONS) definition, and NICE 2020.

More information on the definitions

| Source | Term used | Definition |
| --- | --- | --- |
| NASEM 2024 | Long COVID | An infection-associated chronic condition that occurs after SARS-CoV-2 infection and is present for at least 3 months as a continuous, relapsing and remitting, or progressive disease state that affects one or more organ systems. |
| WHO 2021 | Post COVID-19 condition | Post COVID-19 condition occurs in individuals with a history of probable or confirmed SARS-CoV-2 infection, usually 3 months from the onset of COVID-19 with symptoms that last for at least 2 months and cannot be explained by an alternative diagnosis. Common symptoms include fatigue, shortness of breath, cognitive dysfunction but also others which generally have an impact on everyday functioning. Symptoms may be new onset, following initial recovery from an acute COVID-19 episode, or persist from the initial illness. Symptoms may also fluctuate or relapse over time. A separate definition may be applicable for children. |
| United States National Center for Health Statistics (US-NCHS) definition | Ever Long COVID and Current Long COVID | Ever Long COVID: A “yes” response to the survey question, “Did you have any symptoms lasting 3 months or longer that you did not have prior to having COVID-19?” among those who reported receiving either a positive test or a doctor’s diagnosis of COVID-19 and were symptomatic. Current Long COVID: Based on meeting the definition of ever Long COVID plus the presence of symptoms at time of interview. |
| United Kingdom Office for National Statistics (UK-ONS) definition | Long COVID | Reported at least one symptom that appeared after SARS-CoV-2 infection, lasted more than 4 weeks and was not explained by an alternative diagnosis. |
| NICE 2020 (last updated 24th January 2024) | Post-COVID-19 syndrome | Signs and symptoms that develop during or after an infection consistent with COVID-19, continue for more than 12 weeks and are not explained by an alternative diagnosis. It usually presents with clusters of symptoms, often overlapping, which can fluctuate and change over time and can affect any system in the body. Post-COVID-19 syndrome may be considered before 12 weeks while the possibility of an alternative underlying disease is also being assessed. In addition to the clinical case definitions, the term ‘long COVID’ is commonly used to describe signs and symptoms that continue or develop after acute COVID-19. It includes both ongoing symptomatic COVID-19 (from 4 to 12 weeks) and post-COVID-19 syndrome (12 weeks or more). |

Notably, the published definitions of LC are not consistent. This means that when research studies use different definitions to identify the illness, the resulting prevalence estimates may not be comparable. Regrettably, some studies also use definitions developed for the purpose of their research rather than published definitions, further complicating matters.

Differences in prevalence according to definition were illustrated in a study by Wisk and colleagues, published in 2025. The researchers took five different LC definitions from previously published papers and applied them to 4,575 participants from the INSPIRE cohort in the US: 3,521 who had a history of COVID-19 (those with “self-reported symptoms suggestive of acute SARS-CoV-2 infection at the time of a SARS-CoV-2 test” followed by a further test confirming history of infection) and 1,054 who did not.

Results showed that depending on the criteria used, the prevalence of LC amongst those who had a history of COVID-19 infection ranged from:

30.8% to 42.0% at 3 months after COVID infection.
14.2% to 21.9% at 6 months after COVID infection.

Interestingly, Wisk and colleagues also applied the symptom-based criteria to the COVID-negative population, and found that similar proportions met the long COVID criteria:

28.08% to 40.32% at 3 months
14.60% to 23.27% at 6 months

These findings not only demonstrate variation in prevalence by definition, but also suggest that criteria which require the presence of only one long COVID symptom may not be a reliable method for identifying true cases of the illness.

It is worth noting that some of the LC ‘cases’ identified among those who were ‘COVID-negative’ may have been in people who had experienced an asymptomatic COVID-19 infection, or who did not take a COVID test. Only those who “self-reported symptoms suggestive of acute SARS-CoV-2 infection at the time of a SARS-CoV-2 test” were eligible for the further testing to confirm COVID history. Participants who did not self-report the relevant symptoms, or who had not done a COVID test despite having the illness, would therefore have been allocated to the COVID-negative group despite potentially having a history of infection.

More details on the LC definitions used

| Reference | Location | LC definition used | Symptoms assessed (number of symptoms required for a diagnosis) |
| --- | --- | --- | --- |
| Jones et al, 2021 | UK | Self-diagnosed, clinician-diagnosed, or test-confirmed COVID-19 and had symptoms of COVID-19 that lasted more than 4 weeks | 12 (≥1) |
| Pagen et al, 2023 | The Netherlands | Had one or more of the 44 prelisted symptoms at 3 months after positive COVID test | 44 (≥1) |
| Sudre et al, 2021 | UK, US and Sweden | Symptoms that persisted more than 4 weeks (28 days), more than 8 weeks (56 days) or more than 12 weeks (84 days) following COVID | 14 (≥1) |
| Thaweethai et al, 2023 | US and Puerto Rico | Participants met the PASC score threshold (more than or equal to 12 points); the symptoms were: PEM, fatigue, brain fog, dizziness, gastrointestinal symptoms, palpitations, changes in sexual desire or capacity, loss of or change in smell or taste, thirst, chronic cough, chest pain, abnormal movements | 12 (≥1) |
| Yoo et al, 2022 | US | Persistent COVID-19 symptoms in the 90-day post-discharge survey (or the 60-day survey if the 90-day survey was incomplete) | 9 (≥1) |


Differences depending on the method of identification

As with all prevalence estimates, the prevalence of LC varies depending on the method used to identify the study population, including those with the disease.

Cases may be identified through self-report (for example in an online survey), hospital records, GP records (using either medical codes or free-text notes), recruitment through or data from LC clinics, or a sample of the general population.

Differences in prevalence estimates by method of identification have been clearly demonstrated by a study using electronic health record data in Scotland. Results indicated that even within the health record data, there were significant variations in the prevalence estimate based on the method used to identify cases among the 4,676,390 participants.

Clinical codes identified fewest cases (1,092, 0.02%), followed by free text (8,368, 0.2%), sick notes (14,469, 0.3%), and what was termed an ‘operational definition’ based on patterns of clinical interactions recorded in the electronic health records (64,193, 1.4%).

Overall, 1.7% were identified as having LC using one or more methods of identification.

Interestingly, there was limited overlap in the cases identified by the different measures, yet all measures considered indicated a similar trend in the prevalence of LC over time.
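The percentages reported for the Scottish study can be reproduced from the case counts and the number of participants. The short Python sketch below does this, and also adds the four counts together as a rough check on overlap. The variable and function names are illustrative only, and the simple sum is not a figure reported by the study.

```python
POPULATION = 4_676_390  # participants in the Scottish health-record study

# Cases identified by each method, as reported in the text above.
cases_by_method = {
    "clinical codes": 1_092,
    "free text": 8_368,
    "sick notes": 14_469,
    "operational definition": 64_193,
}


def prevalence_pct(cases: int, population: int) -> float:
    """Prevalence as a percentage of the population."""
    return 100 * cases / population


for method, cases in cases_by_method.items():
    print(f"{method}: {prevalence_pct(cases, POPULATION):.2f}%")

# Adding the counts treats every case as unique to one method, so this
# is an upper bound on the combined figure, not a study result.
naive_sum = sum(cases_by_method.values())
print(f"sum of all methods: {prevalence_pct(naive_sum, POPULATION):.2f}%")
```

The simple sum comes to about 1.88% of participants, only slightly above the 1.7% identified by one or more methods. This is what would be expected if relatively few people were picked up by more than one method.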

Difficulties identifying cases of LC

As COVID-19 is still prevalent but routine testing has ceased, identifying cases of LC is becoming more complex. While cases of LC which developed following a symptomatic COVID-19 infection accompanied by a positive COVID test are more clear-cut, those which arise following mild COVID symptoms and no COVID test, an unknown virus with COVID-like symptoms, or an asymptomatic COVID infection not confirmed with a test, are much harder to capture. These difficulties in identifying cases of LC may give the impression that the illness is becoming rarer, but in reality an apparent decline in new cases could reflect limitations in measurement.

It is also worth noting that research suggests that the symptoms of long COVID and ME/CFS overlap, and studies have shown that a proportion of individuals with LC meet ME/CFS criteria such as the Canadian Consensus Criteria (CCC) or the 2021 NICE guidelines. However, not everyone with LC meets diagnostic criteria for ME/CFS. Whether or not a person has undiagnosed long COVID does not affect whether or not they qualify for an ME/CFS diagnosis (as ME/CFS is a symptom-based diagnosis).

Variation by study design

There is also variation in LC prevalence based on the type of study design used:

- Cross-sectional: provides a snapshot at a single point in time, but does not show how the number of people with LC changes over time.
- Longitudinal: takes repeated measures over time of the number of people with LC. This gives an indication of how prevalence changes, but is limited by loss to follow-up, which may bias the results, especially if those with LC are systematically more or less likely to drop out.
- Case-control: often produces an “artificial” prevalence, as the ratio of cases to controls is typically set by the researcher rather than reflecting the true population. This makes case-control studies better suited to identifying risk factors than to measuring population burden.
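The point about case-control studies can be shown with simple arithmetic. In the Python sketch below, the share of cases in the sample is fixed entirely by the number of controls recruited per case. The function name and the numbers used are hypothetical and chosen only for illustration.

```python
def sample_case_share(n_cases: int, controls_per_case: int) -> float:
    """Share of a case-control sample made up of cases.

    This depends only on the recruitment ratio chosen by the
    researcher, not on how common the condition is in the population.
    """
    n_controls = n_cases * controls_per_case
    return n_cases / (n_cases + n_controls)


# Hypothetical study recruiting 100 cases at different matching ratios.
for ratio in (1, 2, 4):
    share = sample_case_share(n_cases=100, controls_per_case=ratio)
    print(f"1:{ratio} matching -> cases are {share:.0%} of the sample")
```

With one control per case, half of the sample has the condition whatever its true frequency in the population, which is why the proportion of cases in such a study should not be read as a prevalence estimate.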

Whilst a well-designed study without any limitations can provide an accurate prevalence estimate, any flaws in design can lead to systematic error which can distort findings.

For example:

- If the sample population is not representative of the target population, the prevalence may be skewed. For example, if only those who were admitted to hospital with a COVID-19 infection are included, the prevalence of LC may be artificially high.
- Studies with high non-response rates can overestimate prevalence if individuals with LC are more likely to participate than those without.
- Using different methods to ascertain cases (e.g., self-reported surveys versus laboratory tests) can significantly alter prevalence, with screening tools and self-reported information tending to overestimate prevalence compared with the use of strict LC criteria.
- Clearly defining those with LC is essential, as including ineligible individuals as ‘cases’ can overestimate the prevalence.
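The effect of non-response can be illustrated with a short calculation. The Python sketch below assumes a hypothetical true prevalence of 5% and hypothetical response rates of 60% among people with LC and 30% among people without; none of these figures come from a real survey.

```python
def observed_prevalence(
    true_prev: float,
    response_rate_cases: float,
    response_rate_noncases: float,
) -> float:
    """Prevalence among survey respondents when people with and
    without the condition respond at different rates."""
    responding_cases = true_prev * response_rate_cases
    responding_noncases = (1 - true_prev) * response_rate_noncases
    return responding_cases / (responding_cases + responding_noncases)


# Hypothetical figures: people with LC are twice as likely to respond.
biased = observed_prevalence(
    true_prev=0.05, response_rate_cases=0.60, response_rate_noncases=0.30
)
# Same survey if both groups responded at the same rate.
unbiased = observed_prevalence(
    true_prev=0.05, response_rate_cases=0.30, response_rate_noncases=0.30
)
print(f"Unequal response rates: {biased:.1%}")
print(f"Equal response rates:   {unbiased:.1%}")
```

Under these assumptions the survey would report a prevalence of about 9.5%, almost double the true figure, whereas equal response rates would recover the true 5%.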

Importantly, these ‘flaws’ are not usually due to researcher error; rather, they normally arise from factors such as insufficient resources or poor quality of the available data. It is important that any limitations, alongside their implications, are clearly reported by the researchers in the discussion section of their papers.

Summary

While estimates of the prevalence of LC exist, they are limited by a number of factors such as the definition used, how representative the study population is of the general population, and how reliable the methods are. Additionally, prevalence estimates change over time, and vary between countries and between research studies, meaning that it is not always appropriate to compare estimates directly or to extrapolate figures beyond the populations for which they were originally calculated.

Part 2 of this article will discuss further difficulties faced when attempting to arrive at a consensus regarding the prevalence of long COVID.
