Comparison of health measures between survey self-reports and electronic health records among Millennium Cohort Study participants receiving Veterans Health Administration care

BMC Medical Research Methodology volume 25, Article number: 81 (2025)

Surveys are a useful tool for eliciting self-reported health information, but the accuracy of such information may vary. We examined the agreement between self-reported health information and medical record data among 116,288 military service members and veterans enrolled in a longitudinal cohort.

Millennium Cohort Study participants who separated from service and registered for health care in the Veterans Health Administration (VHA) by September 18, 2020, were eligible for inclusion. Baseline and follow-up survey responses (2001–2016) about 39 medical conditions, health behaviors, height, and weight were compared with analogous information from VHA and military medical records. Medical record diagnoses were classified as one qualifying ICD code in any diagnostic position between October 1, 1999, and September 18, 2020. Additional analyses were restricted to medical record diagnoses occurring before survey self-report and using specific diagnostic criteria (two outpatient or one inpatient ICD code). Positive, negative, and overall (Youden’s J) agreement was calculated for categorical outcomes; Bland–Altman plots were examined for continuous measures.

Among 116,288 participants, 71.8% self-reported a diagnosed medical condition. Negative agreement between self-reported and VHA medical record diagnoses was > 90% for most (80%) conditions, but positive agreement was lower (6.4% to 56.3%). Mental health conditions were more frequently recorded in medical records, while acute conditions (e.g., bladder infections) were self-reported at a higher frequency. Positive agreement was lower when analyses were restricted to medical record diagnoses occurring prior to survey self-report. Specific diagnostic criteria resulted in higher overall agreement.

While negative agreement between self-reported and medical record diagnoses was high in this population, positive and overall agreement were not strong and varied considerably by health condition. Though the limitations of survey-reported health conditions should be considered, using multiple data sources to examine health outcomes in this population may have utility for research, clinical planning, or public health interventions.

Understanding the level of concordance between conditions found in health records and patient self-reports is crucial for determining accurate disease prevalence and assessing patient–provider communication [1, 2]. Medical records are frequently utilized in epidemiological research as the criterion standard of patient health information, but they are subject to limitations such as non-standardized coding, fragmentation of information from multiple facilities, delayed recording, and subjective reporting by the provider [3,4,5,6,7]. Self-reports may provide a more complete health history but are similarly affected by limitations due to recall bias, patient health literacy, and other individual factors [1, 5, 8].

Several studies have investigated the agreement between medical records and patient self-reported conditions with inconsistent results, often varying by condition type, age, health status, sample size, and study design. The Millennium Cohort Study previously compared medical conditions reported in self-administered surveys collected from 37,798 U.S. service members and veterans and linked objective health records maintained in the Military Health System Data Repository (MDR) [1]. The study found near-perfect negative agreement and moderate positive agreement between these two sources, suggesting that self-report may be sufficient to exclude a history of conditions not otherwise documented in the health record [1]. Another large cross-sectional study comparing health surveys with health records found a higher average prevalence of conditions in self-reports and noted that the survey had higher sensitivity in identifying symptomatic conditions, such as chronic allergies, neck and back pain, and arthritis [9]. Other studies have reported a similar overrepresentation of symptom-based conditions among self-reports, including migraine headaches [1, 9], prostatitis [1, 7], urinary tract infections [7], fractures [10], and osteoarthritis [1, 6, 9, 11]. Several studies have provided evidence for high agreement (kappa ≥ 0.70) among chronic conditions, such as diabetes, hypertension, and cancer, and conditions with well-defined diagnostic criteria, including myocardial infarction and stroke [1, 2, 4, 6, 8, 10, 12]. In addition to varied agreement across conditions, multiple studies have reported significant differences in concordance by age [2,3,4, 10], number of comorbidities [2, 3, 6], and frequency of health care utilization [2, 10]. A longer archival period (or longer enrollment period) also appeared to be associated with increased average concordance by allowing more time for conditions to be documented in the medical record [1, 2, 8]. 
Two studies used multiple strategies to ascertain cases from health records that included augmentation with lab tests, pharmacy records, and physician notes [11, 12]. None of these studies, however, compared self-report and health records agreement across different case ascertainment criteria and time periods.

Reliability of available patient health information is particularly important among military populations, where force readiness depends on the health of service members. Among veterans, a higher frequency of multiple physical and mental health conditions may signal a need for improved health care delivery to address complex comorbidities [13]. Through linkage with triennial survey data from the U.S. Department of Defense (DoD) Millennium Cohort Study and the Veterans Health Administration (VHA) of the Department of Veterans Affairs, the current study will augment previous research in this population to compare up to 20 years of self-reported survey data from multiple survey administration cycles with VHA medical record diagnoses among participants who utilized VHA for care [1].

The objective of this analysis was to compare diagnoses of 39 conditions of interest in medical records (VHA inpatient, outpatient, and purchased care; United States Renal Data System [USRDS]; and MDR) with diagnoses reported on the Millennium Cohort Study survey. The aims were to determine (1) the agreement between positive responses for the 39 conditions of interest on the Millennium Cohort Study survey with a corresponding diagnosis in medical records, (2) the agreement between negative responses for the 39 conditions of interest on the Millennium Cohort Study survey without a corresponding diagnosis in medical records, and (3) a measure of overall concordance between the two sources.

The Millennium Cohort Study is the DoD’s largest prospective cohort study of U.S. military service members and veterans. Initiated in 2001 in response to the National Defense Authorization Act for Fiscal Year 1999, which directed the DoD to establish a longitudinal study examining the health impacts of deployments and associated exposures, the study has collected self-report, medical record, and administrative data on military service and health outcomes spanning up to 21 years [14,15,16,17]. Detailed descriptions of the Study and its findings from the past two decades are available elsewhere [14, 15]. The initial panel and each of three subsequent panels of Millennium Cohort Study participants were selected as a random sample from Defense Manpower Data Center (DMDC) rosters of military personnel from all service branches (Army, Navy, Coast Guard, Air Force, and Marine Corps) and Reserve/National Guard personnel, weighted to oversample for specific populations [14, 16, 17]. While all panels oversampled for female service members, Panel 1 (enrolled 2001–2003) oversampled for Reserve/National Guard members and personnel with previous deployment history [16], Panels 2 (2004–2005) and 3 (2007–2008) oversampled for Marine Corps personnel, and Panel 4 (2011–2013) oversampled for married service members. This analysis examined data from participants from these first four enrollment panels.

Millennium Cohort Study baseline questionnaires and triennial follow-up surveys include validated survey instruments to measure military-related exposures and participant health outcomes, including physician-diagnosed conditions, self-reported symptoms, mental health assessment, physical and functional status, alcohol use, tobacco use, sleep patterns, life experiences, and occupational exposures [14,15,16,17]. Survey responses are augmented with data from DMDC personnel files to capture baseline participant characteristics, such as sex, age, education level, marital status, race and ethnicity, pay grade, deployment experience, service branch, length of service, and military occupation [17], and linked with MDR records to capture medical encounters occurring within the Military Health System (MHS). Participants continue to participate in the study even after separating from service and records are also linked to VHA records to capture medical encounters occurring within multiple healthcare systems. For the purposes of this study, veteran characteristics were assessed using the most recent VHA record and included whether participants had a service-connected disability and VHA health care utilization, which was categorized based on average number of VHA encounters per year, with “regular users” defined as those with at least 1 encounter per year on average, “irregular users” as those with fewer than 1 encounter per year on average but at least 1 encounter for the period of observation, and “high frequency users” as those with more than 1 encounter per year every year during the period of observation. The period of observation for determining VHA health care utilization was measured from the first VHA encounter on record to the last encounter date available for each participant prior to September 18, 2020.

International Classification of Diseases, Ninth Revision (ICD-9) and Tenth Revision (ICD-10), and Current Procedural Terminology (CPT) codes for each of the 39 conditions of interest were selected by a clinician and verified by the research team (see Additional File 1). CPT and ICD procedure codes were additionally used to identify kidney failure requiring dialysis. Relevant diagnostic codes were extracted from VHA files (inpatient, outpatient, and purchased care encounters), USRDS data (for kidney failure requiring dialysis), and MDR medical records (inpatient and outpatient encounters) for the time period beginning at the start of fiscal year 1999 and ending on September 18, 2020. For determining the presence of a condition in medical records, sensitive criteria defined a positive case as the presence of any relevant diagnostic code (ICD-9, ICD-10, or CPT) located in any diagnostic position found in either inpatient or outpatient medical records for the 39 conditions of interest. Additional analyses utilized specific criteria, which defined a positive case in the medical record as the presence of at least two outpatient codes or one inpatient diagnostic code located in any diagnostic position. For example, if only one diagnostic code for a given condition was identified in outpatient medical records only, a participant would be flagged as a positive case for that condition based on sensitive criteria but would not be identified as a case using specific criteria. For both sensitive and specific criteria, the first encounter date or admission date when a given condition was noted defined the diagnosis date.
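The two case definitions can be written out compactly. The following is a minimal Python sketch, not the SAS code used in the study. The `Encounter` record and all function names are hypothetical, the input is assumed to be already filtered to one participant and one condition, and outpatient codes are counted whether or not they share a date, a detail the text does not specify.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Sequence


@dataclass(frozen=True)
class Encounter:
    """One qualifying diagnostic code (hypothetical record layout)."""
    encounter_date: date  # encounter or admission date
    setting: str          # "inpatient" or "outpatient"


def sensitive_case(encounters: Sequence[Encounter]) -> bool:
    """Sensitive criteria: any qualifying code in any setting."""
    return len(encounters) > 0


def specific_case(encounters: Sequence[Encounter]) -> bool:
    """Specific criteria: one inpatient code or at least two outpatient codes."""
    inpatient = sum(e.setting == "inpatient" for e in encounters)
    outpatient = sum(e.setting == "outpatient" for e in encounters)
    return inpatient >= 1 or outpatient >= 2


def diagnosis_date(encounters: Sequence[Encounter]) -> Optional[date]:
    """First date on which the condition was noted, or None if never noted."""
    dates = [e.encounter_date for e in encounters]
    return min(dates) if dates else None
```

A participant with a single outpatient code is a case under `sensitive_case` but not under `specific_case`, which matches the example given in the text.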

Self-reported cases were assessed via a survey item asking, “Has a doctor or other health professional told you that you have any of the following conditions?” The item referred to ever having been told at baseline and to the past 3 years at follow-up. If participants responded “Yes” to a specific condition at any survey assessment between 2001 and 2016, they were classified as a positive case at all subsequent waves, with the diagnosis date corresponding to the survey date at first self-report. Otherwise, if they did not respond “Yes” at any wave and responded “No” at any wave, they were classified as a negative case. Conditions with missing responses at all available survey waves, which account for < 0.02% of all participants across the 39 conditions, were set to missing.
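The self-report rule reduces to a short decision over a participant's survey waves. The sketch below is illustrative only; the `Wave` tuple layout and the function name are assumptions, and responses are assumed to be coded as the strings "Yes" and "No" with `None` for a missing response.

```python
from datetime import date
from typing import Optional, Sequence, Tuple

# (survey date, response), where response is "Yes", "No", or None if missing
Wave = Tuple[date, Optional[str]]


def classify_self_report(
    waves: Sequence[Wave],
) -> Tuple[Optional[bool], Optional[date]]:
    """Return (case status, date of first self-report) for one condition.

    True  : "Yes" at any wave; the date is the earliest "Yes" survey date.
    False : never "Yes" and at least one "No".
    None  : missing at every available wave.
    """
    yes_dates = [survey_date for survey_date, response in waves if response == "Yes"]
    if yes_dates:
        return True, min(yes_dates)
    if any(response == "No" for _, response in waves):
        return False, None
    return None, None
```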

For primary analyses, no restrictions were placed on the temporal sequence such that diagnostic codes in medical records could appear at any time in relation to survey self-report. This approach recognizes the clinical circumstance in which a patient may be informed of a likely diagnosis with confirmation of that diagnosis not appearing in the medical record until a future time. Additional analyses restricted diagnostic codes to those appearing prior to the date of survey self-report to better inform the interpretation of concordance between a self-reported condition and a diagnosis confirmed in the medical record. Codes indicating personal history of a condition were not utilized in additional analyses investigating temporality because these codes lack a date of diagnosis.

Objective height and weight measurements from VHA vital signs medical records were compared with survey-reported height and weight, with weight measurements considered if taken within 1 year of a corresponding survey. For instances in which more than one weight was taken within 1 year of a survey, the weight recorded closest to the date of survey was used. When multiple height measurements were available, the mode was used for both medical records and survey responses. For instances in which all available height measurements were different, the first height recorded was used. Weights < 80 lb or > 500 lb and heights < 48 inches or > 95 inches were omitted from analysis as extreme values.
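These selection rules can be sketched in Python as follows. The function names are hypothetical, and several details are assumptions because the text leaves them open: "within 1 year" is taken as 365 days on either side of the survey date, extreme values are dropped before selection, and a tie between equally frequent heights is broken in favor of the one recorded first.

```python
from collections import Counter
from datetime import date
from typing import Optional, Sequence, Tuple


def plausible_weight(lb: float) -> bool:
    """Weights < 80 lb or > 500 lb are treated as extreme values."""
    return 80 <= lb <= 500


def plausible_height(inches: float) -> bool:
    """Heights < 48 in or > 95 in are treated as extreme values."""
    return 48 <= inches <= 95


def select_height(heights: Sequence[float]) -> Optional[float]:
    """Mode of the plausible heights, given in recording order.

    If every value is different, each has a count of 1, so the first
    recorded height is returned.
    """
    valid = [h for h in heights if plausible_height(h)]
    if not valid:
        return None
    counts = Counter(valid)
    top = max(counts.values())
    return next(h for h in valid if counts[h] == top)


def select_weight(
    measurements: Sequence[Tuple[date, float]],
    survey_date: date,
    window_days: int = 365,  # assumption: "within 1 year" means 365 days
) -> Optional[float]:
    """Plausible weight recorded closest to the survey date, within the window."""
    candidates = [
        (abs((measured - survey_date).days), weight)
        for measured, weight in measurements
        if plausible_weight(weight)
    ]
    candidates = [c for c in candidates if c[0] <= window_days]
    if not candidates:
        return None
    return min(candidates, key=lambda c: c[0])[1]
```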

Problem drinking was assessed on the Millennium Cohort Study survey using 5 items from the Patient Health Questionnaire (PHQ) assessing risky alcohol use behaviors, with probable problem drinking defined as positive endorsement of at least 1 of the 5 items (e.g., driving a car after drinking too much) in the last 12 months [18]. Self-reported alcohol use in the medical record was taken from VHA health factors data and assessed using the Alcohol Use Disorders Identification Test-Concise (AUDIT-C) alcohol dependence screening tool, with scores of 3 or more for women or 4 or more for men indicating alcohol misuse [19]. The AUDIT-C assessment closest in time prior to a completed survey date was compared with the survey PHQ responses. If more than one AUDIT-C alcohol assessment was completed on the same day for a participant, the highest score was used.
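The two alcohol measures and the rule for pairing them might look like this in Python. This is a sketch under stated assumptions: the function names and the "female"/"male" coding are hypothetical, and "prior to" is read as strictly before the survey date.

```python
from datetime import date
from typing import Optional, Sequence, Tuple


def phq_problem_drinking(items: Sequence[bool]) -> bool:
    """Probable problem drinking: at least 1 of the 5 PHQ alcohol items endorsed."""
    if len(items) != 5:
        raise ValueError("expected responses to exactly 5 PHQ alcohol items")
    return any(items)


def audit_c_misuse(score: int, sex: str) -> bool:
    """Alcohol misuse: AUDIT-C score of 3 or more (women) or 4 or more (men)."""
    if sex not in ("female", "male"):  # hypothetical coding
        raise ValueError("sex must be 'female' or 'male'")
    if not 0 <= score <= 12:  # AUDIT-C totals range from 0 to 12
        raise ValueError("AUDIT-C score must be between 0 and 12")
    return score >= (3 if sex == "female" else 4)


def select_audit_c(
    assessments: Sequence[Tuple[date, int]], survey_date: date
) -> Optional[int]:
    """Score from the assessment closest in time before the survey.

    If several assessments share that date, the highest score is used.
    """
    prior = [(d, s) for d, s in assessments if d < survey_date]
    if not prior:
        return None
    # Tuples compare by date first, then score, so max() picks the latest
    # date and, within that date, the highest score.
    return max(prior)[1]
```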

Participants were classified as ever smokers via survey responses if they reported at any survey wave that they had smoked 100 cigarettes or more in their lifetime. Smoking was assessed in medical records via VHA health factors data, using the smoking assessments recorded closest to a corresponding survey date.

Demographic, military, and veteran characteristics were examined by enrollment panel and among those who self-reported one or more conditions. The prevalence of the 39 conditions of interest in survey records, VHA medical records only, and combined VHA-MDR medical records were examined, as well as the agreement between these sources. Positive and negative agreement were calculated for the comparisons of self-reported conditions with those diagnosed in medical records [20, 21]. For the purposes of this study, we calculated positive agreement as 2a/[N + (a − d)] and negative agreement as 2d/[N − (a − d)], where “a” represents true positives for the condition, “d” represents true negatives for the condition, and N represents the total number of individuals. Youden’s J statistic was used to measure overall concordance between the medical record and survey responses [22] because this measure accounts for both sensitivity and specificity and is therefore independent of the prevalence of conditions of interest [22, 23]. Positive and negative agreement and Youden’s J were also calculated to compare health behaviors (smoking and alcohol use) between survey self-reports and medical records. Bland–Altman plots were used to examine the agreement between self-reported and medical record measurements of height and weight and evaluate bias between the mean differences of these measurements [24].
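As a worked illustration of these measures, the sketch below computes them from a 2 × 2 table. The cell labels b and c and the function name are assumptions: b counts participants positive by self-report only and c those positive in the medical record only. Youden's J is computed as sensitivity + specificity − 1, with the medical record treated as the reference, as in the sensitivity and specificity reported for self-report.

```python
from typing import NamedTuple


class Agreement(NamedTuple):
    positive: float
    negative: float
    youden_j: float


def agreement(a: int, b: int, c: int, d: int) -> Agreement:
    """Agreement measures from a 2 x 2 table.

    a: positive in both sources      b: self-report positive, record negative
    c: self-report negative, record positive      d: negative in both sources
    """
    n = a + b + c + d
    if (a + c) == 0 or (b + d) == 0 or n + (a - d) == 0 or n - (a - d) == 0:
        raise ValueError("a measure is undefined for this table")
    positive = 2 * a / (n + (a - d))  # same as 2a / (2a + b + c)
    negative = 2 * d / (n - (a - d))  # same as 2d / (2d + b + c)
    sensitivity = a / (a + c)         # self-report "Yes" among record positives
    specificity = d / (b + d)         # self-report "No" among record negatives
    return Agreement(positive, negative, sensitivity + specificity - 1)


# Made-up counts, for illustration only (not study data)
result = agreement(a=40, b=10, c=20, d=30)
print(f"PA={result.positive:.3f} NA={result.negative:.3f} J={result.youden_j:.3f}")
```

With these made-up counts, N = 100, so positive agreement is 80/110 ≈ 0.727, negative agreement is 60/90 ≈ 0.667, and J = 40/60 + 30/40 − 1 ≈ 0.417.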

The main analyses compared survey self-reported conditions with medical record conditions classified using sensitive criteria and occurring at any time in relation to survey completion from VHA medical records only or from combined VHA-MDR medical records. Three additional variations of the main analyses were conducted: (1) with medical record diagnoses classified using specific criteria and occurring at any time in relation to survey completion, (2) with medical record diagnoses classified using sensitive criteria and temporally restricted to those occurring prior to survey completion, and (3) with medical record diagnoses classified using specific criteria and temporally restricted to those occurring prior to survey completion. The sensitivity and specificity of self-report in detecting medical record diagnoses by each case ascertainment strategy are reported in Additional File 2. Additional supplemental analyses investigated whether concordance of survey and medical record diagnosis varied by VHA user frequency. SAS software, version 9.4 (Cary, NC) was used for all analyses.

Of the 133,163 Millennium Cohort Study participants who had separated from service during the study period, 116,288 participants (or 87.3%) were identified in VHA records. Demographic, military, and veteran characteristics among these 116,288 participants are listed in Table 1. Among participants, 71.8% self-reported ever being told by a health professional that they had one or more of the 39 medical conditions on the survey. Specifically, a higher proportion of those who were female, in older age groups, married or separated/divorced, or had a service-connected disability, self-reported one or more medical conditions. Overall, 57.4% of participants were classified as high frequency VHA users.

Table 1 Characteristics of Millennium Cohort Study participants, by panel and number of conditions (N=116,288)

The most prevalent condition identified by self-report, VHA records only, and VHA-MDR records was depression (26.5%, 39.5%, and 51.2% respectively), followed by migraine headaches (23.4%) and tinnitus (22.0%) for self-report surveys; posttraumatic stress disorder (PTSD; 30.3%) and sleep apnea (26.3%) for VHA records only; and sinusitis (41.0%) and sleep apnea (37.5%) for VHA-MDR records (Table 2). Mental health conditions, such as depression and PTSD, and endocrine/metabolic conditions, such as thyroid conditions and diabetes, were identified more often in medical records than in survey self-reports. Conditions associated with changing symptom severity, such as chronic fatigue syndrome, rheumatoid arthritis, and stomach/duodenal/peptic ulcer, were self-reported on surveys at a higher prevalence than what appeared in medical records. Most other conditions had comparable prevalence estimates between surveys and medical records or were slightly higher for surveys.

Table 2 Prevalence and agreement between self-report and medical record conditions, sensitive, at any time criterion (N=116,288)

Overall, the positive agreement between self-report and VHA records only was low, ranging from 6.4% for any other hepatitis to 56.3% for hypertension (Table 2). Negative agreement across the 39 conditions ranged from 78.3% for depression to 99.7% for kidney failure. Concordance between self-report and VHA records ranged from slight (e.g., Youden J = 0.09 for cirrhosis) to moderate (e.g., Youden J = 0.45 for multiple sclerosis, migraine headaches, and asthma). When combined VHA-MDR records were examined, positive agreement across all conditions was comparable or higher (with the exception of hepatitis B and hepatitis C), negative agreement was comparable or lower, and concordance was comparable or lower (with the exception of gallstones).

The prevalence of, and agreement between, self-reported conditions and VHA records using varying case ascertainment strategies and by VHA user frequency (irregular, regular, or high) were examined. For medical record diagnoses occurring prior to the survey date, specific criteria resulted in higher concordance than sensitive criteria, as assessed by Youden’s J, for all diagnoses, but no consistent patterns were observed with respect to the magnitude of positive or negative agreement (see Additional File 3). Comparison of diagnoses occurring ever versus prior to survey date using specific criteria resulted in a lower Youden’s J but a higher positive agreement for nearly all diagnoses. No consistent pattern of differences across diagnoses was seen for positive agreement or Youden’s J comparing high and regular users of VHA health care (see Additional File 4). Irregular users of VHA health care had lower levels of positive agreement for most diagnoses compared with regular or high users.

Sensitivity analyses (Additional File 5) examined the prevalence of depression in VHA only and combined VHA-MDR records when ICD codes relevant to Unspecified Depressive Disorder were excluded. In excluding these codes, the agreement between self-reported conditions and VHA records remained consistent with the results reported in Table 2.

Both ever smoking and problem drinking (Table 3) were more prevalent in VHA records (61.4% and 43.8%, respectively) than in survey self-reports (50.8% and 27.1%, respectively). Positive agreement between self-reported and VHA records was 83.2% for ever smoking and 49.2% for problem drinking, while negative agreement was 78.5% for ever smoking and 72.1% for problem drinking. Concordance was substantial for ever smoking (Youden J = 0.65) and fair for problem drinking (Youden J = 0.23).

Table 3 Prevalence and agreement of smoking and alcohol use between self-report and VHA records

Agreement between heights is presented in a Bland–Altman plot (Fig. 1). The mean difference in height between VHA and self-reported records was −0.12 inches (SD = 1.07); 1.4% of paired measurements were more than 2 standard deviations from the mean difference. The reference line representing zero mean difference between both methods was within the confidence lines representing 2 standard deviations from the mean, suggesting no difference between the measures. Greater data clustering below the lower confidence line versus the upper confidence line suggests a skew in the data toward heights being higher in self-reported records than in VHA records. Agreement between weights by survey year is presented in Fig. 2. The mean difference in weight was 6.44 lb (SD = 12.21) at the 2001 survey point, 5.68 lb (SD = 11.31) at 2004, 5.99 lb (SD = 11.77) at 2007, 5.58 lb (SD = 11.67) at 2011, and 4.88 lb (SD = 11.28) at 2014. The proportion of paired measurements more than 2 standard deviations from the mean difference was 4.3% at 2001, 4.1% at 2004, 3.9% at 2007, 4.0% at 2011, and 3.6% at 2014. The reference lines representing zero difference between both methods were within the confidence lines at all time points, suggesting no difference between the measures. Greater data clustering above the upper confidence line versus the lower confidence line suggests a skew in the data toward weights being higher in VHA medical records compared with self-reported results.
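The summary quantities behind a Bland–Altman plot can be computed as in the sketch below. It is illustrative only: the function name is hypothetical, differences are taken as medical record minus self-report (the direction implied by the reported signs), and the sample standard deviation is used because the text does not say which estimator was applied.

```python
from statistics import mean, stdev
from typing import Dict, Sequence


def bland_altman_summary(
    record: Sequence[float], self_report: Sequence[float]
) -> Dict[str, float]:
    """Mean difference, SD, mean +/- 2 SD lines, and share of pairs outside them."""
    if len(record) != len(self_report):
        raise ValueError("measurements must be paired")
    if len(record) < 2:
        raise ValueError("at least two pairs are needed to estimate the SD")
    diffs = [r - s for r, s in zip(record, self_report)]
    mean_diff = mean(diffs)
    sd = stdev(diffs)  # assumption: sample SD (n - 1 denominator)
    lower, upper = mean_diff - 2 * sd, mean_diff + 2 * sd
    outside = sum(1 for x in diffs if x < lower or x > upper) / len(diffs)
    return {
        "mean_diff": mean_diff,
        "sd": sd,
        "lower": lower,
        "upper": upper,
        "prop_outside": outside,
    }
```

A negative `mean_diff` means self-reported values are higher on average, as seen for height; a positive value means record values are higher, as seen for weight.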

Fig. 1 Bland–Altman plot comparing heights self-reported on the Millennium Cohort Study (MCS) survey and in Veterans Health Administration (VHA) medical records. SD, standard deviation

Fig. 2 Bland–Altman plot comparing weights self-reported on the Millennium Cohort Study (MCS) survey and in Veterans Health Administration (VHA) medical records. SD, standard deviation

While other studies examining agreement between self-reported medical conditions and medical record data exist, to our knowledge, this is the first that incorporated all of the following features: (1) employed two different diagnostic criteria in medical records for case definitions designed to be more sensitive or more specific; (2) employed different temporal restrictions for capture of medical record diagnoses; (3) examined differences in agreement by incorporating medical records from a second care setting to capture diagnoses; and (4) utilized self-reported data from a survey administered repeatedly to participants triennially for up to 20 years. Previous work from the Millennium Cohort Study was only able to examine agreement between three years’ worth of self-report and medical record data from a single data source (i.e., MDR) among participants from the first enrollment panel [1]. This study builds upon and largely corroborates findings from this foundational study by adding over a decade of data from both the MHS and VHA, as well as incorporating 3 additional enrollment panels of service members in the population.

This study observed high levels of negative agreement between self-reported and medical record confirmed diagnoses, with 31 of 39 diagnoses indicating negative agreement > 90%, and only one diagnosis (i.e., depression) indicating negative agreement < 80%. A negative self-report of a medical condition therefore corresponds strongly with the absence of that condition in medical records. Positive agreement was much lower, with the highest value observed for hypertension (56.3%), followed by 3 other diagnoses with values > 50% (i.e., migraine headache, depression, PTSD), and 4 others with values between 40 and 50% (i.e., asthma, sleep apnea, hearing loss, tinnitus). The low levels of positive agreement in this study are not surprising, as most conditions examined also have a low prevalence, which can impact the level of positive agreement. But in the context of this specific study, the generally low levels of positive agreement may also reflect incorrect self-report of conditions that have not been medically confirmed, self-reported conditions that were not captured due to nonresponse at specific survey waves across time, or failure of the medical record to capture diagnoses due to a lack of proper coding, insufficient frequency of medical care encounters to generate a diagnostic code, incorrect code entry in the medical record, or care received in medical settings not included in these analyses (e.g., outside of VHA or MDR records).

Incorporating MDR records with VHA records provided greater coverage of continuity of care over time, captured a higher prevalence of conditions, and resulted in small increases in positive agreement for most diagnoses (< 10%, except for sinusitis, peptic ulcer, gallstones, pancreatitis, and bladder infection) but little change in overall agreement with self-report. As expected, temporal expansion capturing diagnoses occurring at any time in the medical record as opposed to prior to self-report generally resulted in a higher level of positive agreement for sensitive diagnostic criteria but not for specific criteria. Increasing the time frame for medical record review may have resulted in better capture of valid diagnoses if a delay occurred between the occurrence of a medical condition and entry into the medical record. Positive agreement between medical records and self-reports was lowest among irregular users of VHA health care compared with regular or high frequency users, since more encounters within the health system allow for better capture of diagnoses in the medical record.

The highest agreement of any comparison was seen for self-report of ever smoking. This result is consistent with a systematic review of published studies comparing self-reported smoking with biochemical measures that concluded that self-report is accurate in most studies [25]. Agreement between measures of problem drinking was of a much lower magnitude, likely because the survey and the medical record used different self-report tools. The AUDIT-C is an extensively validated screen for unhealthy alcohol use, while the PHQ is a screening instrument for detecting the probable presence of multiple mental health conditions [18, 26]. Each tool has its own utility, and our results suggest that the two do not provide equivalent information. Comparison of self-reported and objectively measured height and weight mirrored what has been reported in other populations, with height slightly overestimated and weight underestimated in self-reports [27]. The net effect of this error is to underestimate body mass index, a frequently used proxy measure of body adiposity [28].

Despite the appearance of questions about the presence of medical conditions in national surveys such as those in the U.S. Behavioral Risk Factor Surveillance System, the value of responses to such questions in terms of accurate identification of health conditions of interest is not extensively documented. The available research on self-reported medical diagnoses suggests some limitations to this method. Disparate results can be seen for the same diagnoses, as was shown in research demonstrating high sensitivity (88%) for self-reported diagnosis of rheumatoid arthritis in a systematic review [29], which contrasts with the results of a community-based survey in Norway, where only 19.1% of participants reporting a diagnosis of rheumatoid arthritis were confirmed by medical record review to have this condition [30], and a survey of primary care patients in Boston, where 32% of participants reporting this condition were confirmed as having it from medical record review [31]. Additional difficulties in assessing accuracy of self-reported medical conditions arise from problems intrinsic to the medical record, which has been shown to contain important omissions and inaccuracies [32, 33].

The comparison of existing literature with our results is challenging due to methodologic differences, but some findings can be noted. A meta-analysis that included 22 epidemiological studies of hypertension found that 42.1% of participants diagnosed with this condition reported having it [34]. Similarly, we noted 56.3% positive agreement between medical record diagnosis and self-reported diagnosis. As observed previously in a review of diagnoses resulting in hospitalization, as well as a previous report from the first participants in the Millennium Cohort Study [1, 35], we observed that accuracy of self-reported medical diagnoses varied by condition. While more common conditions were identified similarly between self-reports and VHA records, there were some notable discrepancies between these methods. For instance, mental health (e.g., depression, PTSD) and metabolic conditions (e.g., thyroid conditions, diabetes) were more frequently captured through the medical records, while acute conditions (e.g., bladder infections) were self-reported more often. This may have implications for modes of assessment that better capture cases of conditions, depending on the chronicity and severity of specific medical diagnoses.

This analysis had several limitations. The self-administered survey did not allow for the participant to ask questions to help interpret or clarify any item about which they were unsure. Participants with a condition (e.g., emphysema) may not have responded affirmatively if they recognized it only by a different name (e.g., COPD). Because of the longitudinal nature of this study, some participants may not have responded to all follow-up surveys. Thus, missing data or nonresponse may also contribute to under-documentation of self-reported conditions. Although we compiled extensive lists of codes to capture medical record diagnoses, some may have been overlooked or omitted. We also did not conduct individual medical record reviews due to the large number of participants. Such a review might have detected conditions that were present but either incorrectly coded or not coded (such as those mentioned only in provider notes). Additionally, the medical record review may not have included all sources of care received and may have missed the occurrence of a diagnosis made in a medical setting not included in this analysis. For example, active duty personnel gain automatic eligibility to seek care within the MHS upon entry into the military and all medical encounters among active duty service members should be accounted for in MDR records. However, National Guard and Reserve personnel are only eligible to receive care within the MHS under specific circumstances (e.g., if they have been activated for 30 days or more, if they purchase healthcare plans that allow them to access the MHS) and as such, may routinely seek care outside of the MHS. Likewise, while the use of VHA care is high in our study population, participants may potentially use other sources of care that are not covered by the VA and would not be reflected in the medical records. 
These potential gaps in documentation of health conditions within the medical records may partly explain the low overall positive agreement in this study. As of 2015, 62% of all separated Operations Enduring Freedom, Iraqi Freedom, and New Dawn era veterans had obtained health care from the VA [36]. There are also known differences in sociodemographic and health-related characteristics between veterans who use VA health care and those who do not [37, 38], so the veterans examined in this study may not be representative of all U.S. veterans. Lastly, although Youden’s J statistic has the benefit of not depending on prevalence, it assumes that one measure is accurate enough to serve as the reference against which the other is evaluated. Because we cannot assure that either self-report or medical record diagnoses are inherently accurate in this study, our findings and the levels of concordance observed should be interpreted with this in mind.
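
To make the caveat about Youden’s J concrete, the following is a minimal sketch of how positive agreement, negative agreement, and Youden’s J can be computed from a 2×2 table. It assumes the medical record is treated as the reference, so that positive and negative agreement correspond to sensitivity and specificity of self-report; this may not match the study's exact definitions. The class name, function name, and counts are hypothetical and are not taken from the study data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts cross-classifying self-report against the medical record."""
    both_yes: int     # self-reported and recorded
    self_only: int    # self-reported, not recorded
    record_only: int  # recorded, not self-reported
    both_no: int      # neither self-reported nor recorded


def agreement(table: TwoByTwo) -> dict[str, float]:
    """Return positive agreement, negative agreement, and Youden's J.

    Assumption: the medical record is the reference measure, so
    positive agreement is the share of recorded cases that were
    self-reported, and negative agreement is the share of
    non-recorded participants who did not self-report.
    """
    record_yes = table.both_yes + table.record_only
    record_no = table.self_only + table.both_no
    if record_yes == 0 or record_no == 0:
        raise ValueError("Both record-positive and record-negative groups are required.")
    positive = table.both_yes / record_yes
    negative = table.both_no / record_no
    # Youden's J = sensitivity + specificity - 1
    return {
        "positive": positive,
        "negative": negative,
        "youden_j": positive + negative - 1.0,
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    example = TwoByTwo(both_yes=50, self_only=100, record_only=50, both_no=900)
    for name, value in agreement(example).items():
        print(f"{name}: {value:.3f}")
```

Swapping the reference measure (treating self-report as the standard instead) changes the denominators and therefore the results, which is why the choice of reference matters when neither source is known to be accurate.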

This study demonstrated that negative self-reports of any of the 39 medical conditions surveyed showed a high level of agreement with absence of documentation of the condition in the medical record. Positive agreement was observed at a lower level and varied greatly, depending on the type of condition and related factors such as chronicity. The value and limitations of survey-reported medical conditions should be taken into consideration when using this information for research or planning of clinical care or public health interventions.

I am a military service member or employee of the U.S. Government. This work was prepared as part of my official duties. Title 17, U.S.C. §105 provides that copyright protection under this title is not available for any work of the U.S. Government. Title 17, U.S.C. §101 defines a U.S. Government work as work prepared by a military service member or employee of the U.S. Government as part of that person’s official duties. Report No. 24-23 was supported by the Military Operational Medicine Research Program, Defense Health Program, and Department of Veterans Affairs under work unit no. 60002. The views expressed in this article are those of the authors and do not necessarily reflect the official policy or position of the Department of the Navy, Department of Defense, Department of Veterans Affairs, nor the U.S. Government. The study protocol was approved by the Naval Health Research Center Institutional Review Board in compliance with all applicable federal regulations governing the protection of human subjects. Research data were derived from approved Naval Health Research Center Institutional Review Board protocol number NHRC.2000.0007 and VA Puget Sound Institutional Review Board project #1587777.

The VHA and DoD datasets used and/or analyzed during the current study are not publicly available due to security protocols and privacy regulations. DoD datasets may be made available on reasonable request by the Naval Health Research Center Institutional Review Board (contact phone +1 619 553 8400) and will require data use agreements to be developed. VA-affiliated researchers can apply to access the VA’s Millennium Cohort Study data at the study website.

AUDIT-C: Alcohol Use Disorders Identification Test
COPD: Chronic obstructive pulmonary disease
CPT: Current Procedural Terminology
DMDC: Defense Manpower Data Center
DoD: Department of Defense
ICD: International Classification of Diseases
MDR: Military Health System Data Repository
MHS: Military Health System
PHQ: Patient Health Questionnaire
PTSD: Posttraumatic stress disorder
SD: Standard deviation
USRDS: United States Renal Data System
VHA: Veterans Health Administration

In addition to the authors, the Millennium Cohort Study team includes Anna Baccetti, MPH; Jennifer N. Belding, PhD; Satbir K. Boparai, MBA; Marvin A. Brown, Jr.; Nathan Carnes, PhD; Sheila F. Castañeda, PhD; Rebecca A. Consigli; Toni Rose Geronimo-Hara, MPH; Judith Harbertson, PhD; Lauren Jackson, BS; Isabel G. Jacobson, MPH; Claire K. Kolaja, MPH; Cynthia A. LeardMann, MPH; Crystal L. Lewis, EdD; David Moreno Ignacio; Erin L. Richard, PhD; Anna C. Rivera, MPH; Neika Sharifian, PhD; Beverly D. Sheppard; Daniel W. Trone, PhD; Javier Villalobos, MS; Jennifer L. Walstrom; Yunnuo Zhu, MPH. The authors also appreciate contributions from Rayna K. Matsuno, PhD, the Deployment Health Research Department, and Leidos, Inc. We greatly appreciate the contributions of the Millennium Cohort Study participants. Finally, we are greatly indebted to the team of founding investigators (Dr. Gregory Gray, Dr. Margaret A.K. Ryan, Dr. Edward Boyko, Dr. Rick Riddle, Dr. Timothy Wells, Dr. Paul Amoroso, Dr. Tomoko Hooper, and Dr. Gary Gackstetter) whose foresight laid the framework for the Study’s continued success.

Not applicable.

The Millennium Cohort Study is funded through the Military Operational Medicine Research Program, Defense Health Program, and Department of Veterans Affairs Cooperative Studies Program under work unit no. 60002. The funding agencies had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

    Authors

    1. William Culpepper

    2. Rudolph P. Rull

    3. Edward J. Boyko

    Research design – All authors; Data collection from surveys or medical records – FC, EH, RR, EB; Statistical analysis – FC; Drafting the manuscript – EB, FC, NS; Revising the manuscript – All authors; Approval of the final manuscript – All authors.

    Correspondence to Felicia R. Carey.

    The study protocol was approved by the Naval Health Research Center Institutional Review Board in compliance with all applicable federal regulations governing the protection of human subjects. Research data were derived from approved Naval Health Research Center Institutional Review Board protocol number NHRC.2000.0007. Written or electronic informed consent was obtained for all participants.

    Not applicable.

    The authors declare no competing interests.

    A list of authors and their affiliations appears at the end of the paper.

    Additional File 1. ICD-9, ICD-10, and CPT codes for the 39 conditions of interest. Lists all ICD-9, ICD-10, and CPT codes used for case definitions of the 39 conditions of interest, and provides additional data specifications for “kidney failure requiring dialysis.”

    Additional File 2. Sensitivity and specificity of self-report in detecting medical record conditions, by case ascertainment strategy. Provides the sensitivity and specificity between self-report and medical record diagnoses for the 39 conditions of interest, by different case ascertainment strategies and time-based criteria.

    Additional File 3. Prevalence and agreement between self-report and medical record conditions, by case ascertainment strategy. Provides results from supplemental analyses examining the prevalence and agreement for the 39 conditions of interest, by different case ascertainment strategies and time-based criteria.

    Additional File 4. Prevalence and agreement between self-report and medical record conditions by VHA user frequency, sensitive, at any time criteria. Provides results from supplemental analyses examining the prevalence and agreement for the 39 conditions of interest, by VHA user frequency.

    Additional File 5. Prevalence and agreement between self-report and medical record diagnoses of depression excluding unspecified depressive disorder, sensitive, at any time criterion. Provides results from sensitivity analyses examining the prevalence and agreement of depression, excluding ICD codes for Unspecified Depressive Disorder [311 (ICD-9) and F32.9 (ICD-10)].

    Cite this article

    Carey, F.R., Hu, E.Y., Stamas, N. et al. Comparison of health measures between survey self-reports and electronic health records among Millennium Cohort Study participants receiving Veterans Health Administration care. BMC Med Res Methodol 25, 81 (2025). https://doi.org/10.1186/s12874-025-02529-x

    • DOI: https://doi.org/10.1186/s12874-025-02529-x

    Origin: BioMed Central