A knowledge-based clinical decision support system for personalized health examination items in China: design and evaluation

BMC Medical Informatics and Decision Making volume 25, Article number: 183 (2025)

Health examination identifies risk factors and diseases at an early stage through a series of health examination items. In China, however, the incidence of consulting services for health examination items is low and the current health examination item package is insufficiently personalized. Therefore, we created and evaluated a clinical decision support system (CDSS) for personalized health examination items.

An ontology with the data properties as the core design was created to guide the knowledge expression. A knowledge graph composed of ontology-guided property graphs was developed to provide rich and clear decision-making knowledge. The system, including the web for primary care clinicians and the app for participants, was constructed to directly assist primary care clinicians through personalized and interpretable health examination item recommendations. The enter rate and mapping rate were created to evaluate the system’s capability to process input health feature data. The two-step expert evaluation was designed to assess whether recommendations with several health examination items were appropriate for participants. The system recommendations and existing packages were compared to the expert’s gold standard.

There were 15 classes, 2-level class hierarchies, 3 types of object properties, and 16 types of data properties in the health examination item recommendation ontology. Several different data properties could express a piece of complex decision-making knowledge and reduce the number of classes. There were 584 classes, 781 object properties, and 1094 data properties in the knowledge graph. Retrospective data from 70 participants, with a total of 472 health features, were selected for system evaluation. The ontology covered 96.2% of the health features, and 56.4% of the health features entered into the system had corresponding health examination items. The precision and recall of the system were 96.3% and 84.8%, respectively, compared with 72.5% and 69.1% for the existing packages.

The performance of this system was close to that of experts and exceeded that of the current non-personalized health examination item packages. This system could improve the personalization of health examination items and the health examination consultation services, and promote participants’ engagement in the health examination.


A health examination is an individual preventive examination [1] that assesses a participant’s overall health through a series of screening tests [2], identifies previously unrecognized symptoms or risk factors for diseases [3, 4], and enables intervention at an early stage [5, 6]. Health examinations mainly target the asymptomatic population and exclude people who require medical examinations due to illness or injury [7, 8]. The focus is on screening and risk assessment of high-incidence chronic non-communicable diseases and their risk factors. Some evidence suggests that health examinations can reduce risk factors associated with increased mortality [1, 4] and may improve public health and patient outcomes [2, 3].

Health examination items need to be tailored to the individual participant, and they are conducted in a highly heterogeneous manner depending on the participant’s age, medical or family history, and the clinical resources available to different healthcare providers [2, 9, 10]. In contrast to screening items for specific diseases, one of the main concerns of a comprehensive and individualized health examination is determining which health examination items are necessary for each participant [11]. In some health examination institutions, the health examination item decision-making process can be summarized into three stages (ask & collect – analyze & recommend – decide): First, primary care clinicians ask for and collect the participant’s personal health-related information, such as age, family history, and current medical history; then, they analyze the health-related information and recommend the health examination items based on their knowledge and experience [1, 12]; finally, primary care clinicians and participants jointly decide on the final health examination items [13, 14]. In China, primary care clinicians are mainly responsible for screening for new diseases and providing preventive care.

In practice, the provision rate of consultation services for health examination items is low, with the most common barriers being lack of time during office visits and insufficient physician expertise, experience, and consulting skills [15, 16]. Because of the large number of participants and the scarcity of primary care clinicians, particularly in developing countries such as China, primary care clinicians need to respond quickly to participant needs for comprehensive and personalized health examination items within a short consultation time [1]. This requires primary care clinicians with extensive experience who can make quick decisions, yet experienced primary care clinicians are even scarcer [17]. Additionally, participants expect primary care clinicians to spend more time explaining the reasons for health examination item recommendations [14, 15].

Currently, experts in China call for personalized health examination item decisions driven by individual health status [7, 8], but there is no gold standard for personalized health examination items. Most health examination institutions respond to the two barriers above with economically oriented health examination item packages (see Additional file 1 for an example of the health examination item package) [18,19,20]. A package is a combination of multiple fixed health examination items, and several packages have been developed at different costs. A package does not incorporate the health features of participants or the relationships between those features and the items. In actual clinical scenarios, participants mainly choose a package based on cost. As a result, some health examination items relevant to the participant’s health are not carried out, while other items unrelated to it are. This places an unnecessary financial burden on participants and raises the possibility of exam-related injury. In this study, the term “health examination item package” refers to a format that combines several fixed health examination items. It is not a software module or app for mobile devices. We refer to the current health examination item package as the “package”.

The rapid development of medical information technologies has improved the decision-making process in healthcare: mobile technology has saved time [21,22,23], knowledge bases have decreased inequalities in expertise and experience [24,25,26], and CDSS has assisted in decision-making directly [27]. Mobile technology addresses only the gathering of personal health-related information to save time [28]; it does not fully process and utilize this information [29, 30]. Knowledge bases are the cornerstones of the CDSS [24] and help computers organize, express, and utilize decision-making knowledge. There are two knowledge bases in the field of health examination, constructed by the United States Preventive Services Task Force (USPSTF) [25, 31, 32] and the Canadian Task Force on Preventive Health Care (CTFPHC) [26]. These knowledge bases contain screening knowledge for some specific diseases. Using them requires primary care clinicians to spend considerable time searching and learning on their own before making a decision; the knowledge bases cannot directly assist primary care clinicians by processing health-related information and then recommending health examination items [33]. One decision support system directly assists primary care clinicians by taking 7 specific features as input (age, weight, height, gender, pregnancy, tobacco use, and sexual activity) [27]. However, because its input data are simple, it is difficult to make informed decisions based on complex health-related information, which results in limited utilization of the knowledge base [25] and weak decision support capabilities. It is also not suited to the national circumstances of China. It is therefore necessary to improve the CDSS for personalized health examination items [24].

Evaluating the system’s capability to recommend personalized health examination items is also challenging. There is no gold standard for personalized health examination items. The system provides a health examination regimen that includes several health examination items, so its assessment is more complex than that of disease diagnosis (yes or no) [34] or disease classification (A or B or C) [35, 36]. The appropriate regimen also varies with the opinions of different medical professionals. We could only assume that medical professionals with greater expertise make better decisions. The recommendations for health examination items are therefore obtained through discussion and consensus among several experienced medical experts and are used as the gold standard. In addition, evaluating the personalization of health examination items requires eliminating time and cost biases and avoiding additional harm to participants. A novel evaluation framework should be developed to systematically assess the effectiveness of the proposed decision support system.

Several research articles discuss how decision support systems can help primary care clinicians make better decisions about health examination items, and what factors need to be considered. The first research area intends to discuss the recent decision support systems in the health examination.

The United States Preventive Services Task Force (USPSTF) has created and implemented an evidence-based knowledge base [25] to provide information on clinical preventive services [31, 32], such as counseling services, preventive medication, and specific disease screening recommendations for certain populations. There are 12 disease categories, 133 topic areas, 4 age groups, and 5 recommendation grades in the knowledge base. This approach is designed to help primary care clinicians determine whether preventive services for a specific disease are appropriate for a patient’s needs. Moreover, the USPSTF developed a decision support application [27] to help primary care clinicians use the proposed knowledge base directly. This application takes 7 specific features (age, weight, height, gender, pregnancy, tobacco use, and sexual activity) and then provides specific screening recommendations, service frequency, and risk factor information for several diseases.

The Canadian Task Force on Preventive Health Care (CTFPHC) [26, 37] has developed a knowledge base of clinical practice guidelines to assist primary care providers in delivering preventive healthcare, and to provide preventive screening recommendations for specific diseases for populations of various ages, genders, and family histories. The Task Force’s primary audience is primary healthcare professionals.

Alaa et al. [11] designed a decision support system for learning and implementing a tailored breast cancer screening policy, which assisted clinicians in choosing which sequence of screening items should be performed for women with different features. The screening policy was learned from data in the electronic health record using supervised learning and clustering algorithms to identify subgroups of patients, learn the policy best suited for each subgroup, and prompt screening item recommendations.

Snezana et al. [38] developed an ontology for newborn screening follow-up and translational research, designed to help clinicians involved in translational research. The ontology contains 1850 classes, 104 object properties, and 4 data properties. Hier et al. [39] created a neurological examination ontology, which contains 1100 concepts.

Personalized preventive service is a healthcare paradigm that emphasizes the features of individual participants (e.g., health history, environments, and lifestyles) rather than a “one-size-fits-all” approach to medicine [10]. The second research topic focuses on the key elements that must be considered when selecting health examination items. We reviewed guidelines and expert consensus, serving as the foundation of this study, to provide a solid basis for the construction of the proposed decision support system.

In USPSTF clinical preventive services guidelines and published recommendations [25, 31], the population features for specific disease screening are age, gender, medical history, family history, surgery history, reproductive history, sex life, smoking, and living environment. The recommendations for screening include screening items, frequency, and recommended grade. This knowledge base and application are mainly used by primary care clinicians to make decisions about screening items for their patients.

In CTFPHC published guidelines for specific disease screening [26], the population features involve: age, gender, ethnicity, medical history, reproductive history, sex life, smoking, diet, and taboo. The recommendations include screening items, frequency, and recommended grade. The core audience is primary healthcare professionals, and collaborative decision-making with patients is also vital.

In the Chinese Expert Consensus on Health Examination Items [7, 8], experts formulated the basic reference for carrying out health examination services, including the structure and main content of health examination items, the framework for collecting personal health-related information, and the homepage of health examination reports. According to the expert consensus, health examination items should be tailored to the specific conditions of each participant. Existing health examination item packages fail to meet these requirements. The health examination item policy named “1 + X” includes basic items and optional items. The basic items “1” are required for performing preventive examinations, and the optional items “X” are tailored to the population to satisfy personalized and diverse preventive healthcare needs. The personal health-related information contains 7 dimensions: health history, physical symptoms, lifestyle, environmental health, mental health, sleep health, and health literacy. It specifies what health-related information needs to be collected when formulating health examination items. This consensus provides guidance for standardizing the decision-making process for health examination items.

There are numerous knowledge-based models that provide decision support. We review and discuss two types of knowledge-based models: building an ontology and instantiating it, and building an ontology as the scheme layer with a knowledge graph as the data layer.

Hernández et al. [40] built the ontology focused on the head and neck cancer domain, which contains 502 classes. Samwald et al. [41] developed an ontology and decision support rules for enabling clinical pharmacogenomics, which represents 336 single nucleotide polymorphisms, 665 haplotypes, 22 rules related to drug-response phenotypes, and 308 clinical decision support rules. Lokala et al. [42] proposed a drug abuse ontology, which comprises 315 classes, 31 relationships, and 814 instances. Taçyıldız et al. [43] constructed a tracking ontology, semantic Web rules, and an inference engine for obesity management, which involves 8 classes, 6 object properties, 59 data properties, 963 individuals, and 116 rules.

Chen et al. [44] generated lung cancer graph from a hospital EMR of approximately 1 million patients, which contains 187 related biomedical concepts and 188 horizontal biomedical relations. Wu et al. [45] constructed an ontology for automatic diagnosis of COVID-19 infection, which includes the knowledge graph and diagnosis rules. Wang et al. [46] also proposed the ontology to build the scheme layer of the knowledge graph for unmanned combat vehicle decision making. Sung et al. [47] developed an ontology to describe the relationships between diseases and symptoms, and created a knowledge base based on the ontology for self-medication users to search over-the-counter medicines.

The purpose of this paper was to design and evaluate a CDSS for personalized health examination items in China (see Fig. 1). We (1) designed an ontology-guided and knowledge graph-based system to directly assist primary care clinicians and (2) created a two-step expert evaluation to assess whether personalized items were appropriate for participants in the absence of a gold standard. The knowledge graph addresses the lack of personalized health examination items, and the system provides decision support for healthcare providers and participants in practical applications. Our preliminary related work involved the publication of an Expert consensus on recommended adult individualized health examination items [48]. This study represents an important initial step toward achieving the personalized health examination.

This study developed a knowledge graph-based clinical decision support system. An ontology was utilized as the scheme layer of the knowledge graph to structure the expression of decision-making knowledge and support the property graph. The property graph and graph database were employed as the data layer of the knowledge graph to express and store the decision-making knowledge. Figure 1 shows the system architecture.

The system consists of a Service Engine, a Mobile App, and a Web Platform (see Additional file 2 for system screenshot and Additional file 6 for system technology structure). In order to improve the user experience, the mobile application uses the WeChat mini app. Primary care clinicians usually work in front of the computer, and the Web Platform is implemented in the form of the web page.

In the user module (see Additional file 6 for decision-making process and user interaction), primary care clinicians can see participants’ personal health data and the health examination item recommendations given by the system; they should assess the system recommendations and work with participants to decide on the final health examination items. Participants are responsible for providing the individual health-related information required by the system through questionnaires, and they make decisions in collaboration with primary care clinicians.

In the developer module, healthcare professionals who have worked in the field of health examination for many years are responsible for reviewing, correcting, and updating the decision-making knowledge of health examination items. Researchers at the intersection of medicine and informatics are responsible for collecting, summarizing, and encoding decision-making knowledge, as well as for the development and maintenance of this system.

The system database stores personal health information, including health-related questionnaire data and health examination history data, as well as decision result data from the system and primary care clinicians.

We created an ontology as the scheme layer. A widely used ontology engineering process [49] was followed to create the health examination item recommendation ontology (HEIRO). The proposed ontology is an abstract expression of the concepts and relationships that guide the computer representation of health examination item decision-making knowledge. Two healthcare professionals reviewed the decision-making knowledge. The Protégé [50] editor was used to develop the ontology. We compared the proposed ontology with SNOMED CT [51] and LOINC [52], two important terminology standards in the medical domain, to illustrate the purpose and significance of our ontology.

First, we defined the domain and scope of HEIRO: a model that structures and standardizes the decision-making knowledge of health examination items and supports the execution of decision-making processes. We defined the scope of the health examination items and summarized the health feature descriptions.

Secondly, two existing health examination-related ontologies in BioPortal, Neurologic Examination Ontology [39] and Ontology for Newborn Screening Follow-up and Translational Research [38], are designed for the diagnosis and treatment of specific diseases or populations and primarily model examination items, examination results, and diseases. However, this study emphasizes participants’ health features, which are more complex than health examination items. As a result, the existing ontologies are difficult to reuse for formulating personalized health examination items for participants and providing decision support for healthcare providers in China. Therefore, we developed the HEIRO.

Third, we compiled terms, which describe concepts related to the decision-making process. These terms were mainly guided by the Chinese expert consensus on health examination items [7, 8], and also referred to USPSTF [25, 31, 32] and CTFPHC [26].

Fourth, the classes were organized into two main levels. Level 1 consisted of 2 terms representing the core topics: health examination items and health features. The health features are not limited to diseases; they span 10 dimensions, such as lifestyle, physical symptoms, and environmental health, which cannot be observed in a typical hospital setting.

Fifth, we used the data property to identify several subclasses of level 2, and restricted their range and data type (see Fig. 2A). In addition, we designed several different data properties to accurately describe each complex health feature (see Fig. 2B). In this way, the types of classes were reduced and the clarity of knowledge expression was improved. This distinction further highlights the difference between the existing ontologies and the proposed one.

Sixth, the axioms were employed to restrict properties to complete the precise semantics of classes.

Next, we created instances of classes in the knowledge graph construction. Another key role of the proposed ontology is to guide the expression, storage, and transmission of information throughout the decision support process. Specifically, this involves collecting and storing participants’ health-related information in the user module and system database, representing the formulated health examination items in the service engine, expressing healthcare providers’ decision results in the user module, and guiding the data structure for interactions and transmissions.

The ontology-guided property graph [53] and graph database were used as the data layer. The property graph is suitable for representing complex instances and relationships, offers an intuitive structure for reviewing, modifying, and updating knowledge, and provides efficient and accurate application in real-world scenarios.

The nodes in the knowledge graph are instances of the ontology. We gathered the relevant knowledge and transformed it into the property graph. Whether there is an instance of the ontology in the knowledge graph also depends on the content of decision-making knowledge. Our preliminary related work involved the publication of an expert consensus on recommended adult individualized health examination items [48]. Two healthcare professionals reviewed the decision-making knowledge. Neo4j [54] was used to store and visualize knowledge. The construction process was as follows:

First, we defined the scope of health examination items. We gathered health examination items from a health examination institution and five hospitals from different regions (see Additional file 3 for detailed information about health examination items from different regions). Guided by the Chinese Expert Consensus on Health Examination Items [7, 8], we determined the scope of health examination items and divided them into basic and optional items (the health examination item policy named “1 + X”). The basic items “1” were required, and the optional items “X” were tailored to the participant’s health status. This distinction is represented by the data property “category” in the property graph.

Second, part of the decision-making knowledge came from the book “Health Examination and Health Management” [55] (see Additional file 3 for details) and from expert experience. We extracted and expressed decision-making knowledge in the form of “health feature descriptions - health examination items” (see Fig. 2B). Then, healthcare professionals reviewed the selected knowledge and supplemented the descriptions of health features that lacked mapping relationships.

Third, the graph structure can describe complex instances and relationships. On the one hand, the property graph can represent an instance with a node with multiple properties, which is easier to understand and apply in practice. On the other hand, the relationship between features and items is also diverse, including one-to-one, one-to-many, many-to-many. We deconstructed the health feature descriptions, and expressed them through relevant data properties (see Fig. 2B). Following this, we established mapping relationships between the health examination items and the corresponding health features.

Finally, we coded the knowledge using Neo4j software and formed the health examination item set, the health feature set, and their mapping relationships.
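To make the structure described above concrete, the following is a minimal sketch of how a health feature node carrying several data properties can be linked to health examination items. It uses plain Python objects rather than Neo4j, and every node name, property, and mapping in it is a hypothetical illustration, not content taken from the actual knowledge graph.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """A property-graph node: a label, a name, and data properties."""
    label: str               # "HealthFeature" or "HealthExaminationItem"
    name: str
    properties: tuple = ()   # sorted (key, value) pairs, so the node is hashable


def node(label, name, **props):
    """Build a node whose keyword arguments become data properties."""
    return Node(label, name, tuple(sorted(props.items())))


# Hypothetical items; "category" separates basic ("1") from optional ("X") items.
BP = node("HealthExaminationItem", "blood pressure measurement", category="1")
CT = node("HealthExaminationItem", "low-dose chest CT", category="X")
FBG = node("HealthExaminationItem", "fasting blood glucose", category="X")
ITEMS = [BP, CT, FBG]

# Hypothetical health features, each one node with several data properties.
SMOKING = node("HealthFeature", "smoking", category="lifestyle", status="current")
DIABETES_FH = node("HealthFeature", "diabetes", category="family history")

# Hypothetical feature-to-item mapping relationships (edges of the graph).
EDGES = [(SMOKING, CT), (DIABETES_FH, FBG)]


def recommend(participant_features):
    """Return all basic items plus optional items linked to the given features."""
    names = {i.name for i in ITEMS if dict(i.properties)["category"] == "1"}
    for feature, item in EDGES:
        if feature in participant_features:
            names.add(item.name)
    return sorted(names)


print(recommend({SMOKING}))
```

In this sketch a participant with no matching features still receives the basic item, which mirrors the “1 + X” split; a real deployment would query the graph database instead of scanning a Python list.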

The graph representation is easy to understand and offers a flexible structure, which facilitates the review, modification, and updating of knowledge within the graph. In practical applications, this approach is more efficient than rule-based methods for knowledge retrieval because it uses a single node to represent information with multi-dimensional properties. When searching for a target node, we can easily find the corresponding health examination items.

We assessed the personalization of the health examination items proposed by the system through retrospectively collecting health examination data and expert evaluation, which eliminated time and cost biases.

The data were acquired from Chinese People’s Liberation Army General Hospital Hainan Hospital from 1/1/2005 to 9/9/2021. The inclusion criteria were: (1) Completed health examination after January 1, 2015; (2) The number of health examinations was greater than 2; (3) The diagnosis and positive findings of the two consecutive health examinations of participants overlapped by more than 90%, and the time interval was less than two years. These criteria ensured that the health status of the participants and the health examination items had not changed significantly between the two consecutive health examinations. The number of participants in the evaluation was determined mainly by the number of experts, the workload of each expert, and the reliability of the evaluation results.

We randomly selected 70 participants from those who met the inclusion criteria, each with a record of two consecutive health examinations (named “prior” and “later” health examination). We extracted demographics (date of birth, gender), diagnosis and positive finding, health examination item (in the form of packages), and health examination date. This study was approved by the ethics committee of Zhejiang University’s School of Public Health (No. ZGL202112-2) and the Hainan Medical Ethics Committee (No. 00824482406). All retrospective data used in this study were derived from existing and anonymized datasets.

Our goal was to evaluate the system’s capability to process input health feature data and recommend personalized health examination items. The demographics, diagnoses, and positive findings of the prior health examination were entered into the system as health features. Then, the system recommended health examination items. Referring to the existing research on effectiveness evaluation of decision support systems based on expert experience [34,35,36], we invited 11 experienced primary care clinicians as experts to evaluate the system. Every expert had worked for more than 20 years in formulating health examination items (see Additional file 4 for details). Finally, we compared the recommendations proposed by the system with those of the experts, as well as with the existing packages of the later health examination.

First, the capability to process input health feature data was examined through the Enter Rate and Mapping Rate, which were calculated based on the given formulas. The enter rate indicates whether the ontology-based data structure we created can handle various forms of health feature data. The mapping rate indicates whether the health feature data entered into the system correspond to health examination items, and it represents the richness of decision-making knowledge in the knowledge graph.

$$\text{Enter Rate}=\frac{N_{enter}}{N_{feature}}$$

$$\text{Mapping Rate}=\frac{N_{mapping}}{N_{feature}}$$

where $N_{feature}$ was the total number of health features, $N_{enter}$ was the number of health features that could be entered, and $N_{mapping}$ was the number of health features that had corresponding health examination items.
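As a small worked illustration of the two rates, the sketch below computes them from a list of health features and two sets marking which features could be entered and which had corresponding items. The function name, argument layout, and toy feature names are assumptions made for this example and do not come from the study data.

```python
def enter_and_mapping_rate(features, enterable, mapped):
    """Return (enter rate, mapping rate) over the same denominator N_feature.

    features  -- list of health feature identifiers (N_feature = its length)
    enterable -- set of features the data structure could accept (N_enter)
    mapped    -- set of features with corresponding items (N_mapping)
    """
    n_feature = len(features)
    if n_feature == 0:
        raise ValueError("at least one health feature is required")
    n_enter = sum(1 for f in features if f in enterable)
    n_mapping = sum(1 for f in features if f in mapped)
    return n_enter / n_feature, n_mapping / n_feature


# Toy data: four features, three can be entered, two map to items.
features = ["hypertension", "smoking", "snoring", "unusual finding"]
enterable = {"hypertension", "smoking", "snoring"}
mapped = {"hypertension", "smoking"}

enter_rate, mapping_rate = enter_and_mapping_rate(features, enterable, mapped)
print(enter_rate, mapping_rate)  # 0.75 0.5
```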

Second, we evaluated the system’s capability to recommend personalized health examination items through a two-step expert evaluation based on the Delphi method [56], including pre-evaluation and formal evaluation (see Fig. 3). There is no gold standard for personalized health examination items. We assume that medical professionals with greater expertise make better decisions. The recommendations for health examination items are obtained through discussions and consensus among several experienced medical experts and are finally used as the gold standard. The details were as follows:

(1) Pre-evaluation: We designed the expert evaluation document (see Additional file 4 for detailed information about the evaluation form) and conducted the pre-evaluation. We invited 7 primary care clinicians to evaluate the system recommendations for 10 participants. During this pre-evaluation, the primary responsibility of the primary care clinician was to experience the entire evaluation process and provide some advice, which helped us improve the validity and scientific integrity of the expert evaluation process.

(2) Formal evaluation: We analyzed the pre-evaluation results and improved the expert evaluation design. In round one, we invited 11 primary care clinicians (including the 7 primary care clinicians from the pre-evaluation) to evaluate the system recommendations for the remaining 60 participants. Each primary care clinician evaluated system recommendations for 30 participants. Each participant was evaluated by at least 5 randomly assigned primary care clinicians. The evaluation results were classified as recommended, uncertain, not recommended, and supplementary (see Fig. 3). We counted the expert evaluation results and sorted out the uncertain and supplementary items. In round two, we held an online meeting to discuss the uncertain and supplementary items. After these two rounds of formal evaluation, we obtained the final health examination items that participants should undergo, which were used as the gold standard.
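The round-one tallying step can be pictured with the following sketch. The text does not state the rule used to decide which items went to the round-two discussion, so the strict-majority threshold below is purely an assumption for illustration, as are the function and label names.

```python
from collections import Counter


def triage_item(votes, threshold=0.5):
    """Sort one recommended item into accept / reject / discuss.

    votes     -- list of expert ratings for the item, each one of
                 "recommended", "uncertain", "not recommended"
    threshold -- assumed share of experts that must agree (hypothetical rule)
    """
    if not votes:
        raise ValueError("at least one expert vote is required")
    counts = Counter(votes)
    n = len(votes)
    if counts["recommended"] / n > threshold:
        return "accept"      # goes directly into the gold standard
    if counts["not recommended"] / n > threshold:
        return "reject"      # left out of the gold standard
    return "discuss"         # carried into the round-two meeting


print(triage_item(["recommended"] * 4 + ["uncertain"]))
```

Supplementary items, which experts add themselves, would bypass this tally and go straight to the discussion round.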

We compared system recommendations and existing packages to the gold standard through Precision and Recall. Precision reflects the extent to which the recommendations made for the participants are correct, and recall reflects the sensitivity of the system, that is, the extent to which the items in the gold standard are recovered. They were defined as:

$$\text{Precision}=\frac{TP}{TP+FP}$$

$$\text{Recall}=\frac{TP}{TP+FN}$$

where TP (true positive) refers to the recommendations that were proposed by both the system and the experts, FP (false positive) refers to the recommendations that were proposed only by the system and not by the experts, FN (false negative) refers to the recommendations that were not proposed by the system but actually should have been proposed.
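A minimal sketch of this comparison for a single participant follows, treating the recommendations and the gold standard as sets of item names. The item names are invented for the example, and how counts were pooled across participants in the study is not specified here, so the sketch makes no assumption about it.

```python
def precision_recall(recommended, gold):
    """Compute precision and recall of recommended items against a gold standard."""
    recommended, gold = set(recommended), set(gold)
    tp = len(recommended & gold)   # proposed by both the system and the experts
    fp = len(recommended - gold)   # proposed by the system only
    fn = len(gold - recommended)   # expected by the experts but not proposed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy example with hypothetical items: 3 hits, 1 extra, 2 misses.
system_items = {"item A", "item B", "item C", "item D"}
gold_items = {"item A", "item B", "item C", "item E", "item F"}

print(precision_recall(system_items, gold_items))  # (0.75, 0.6)
```

The same function can be applied to a fixed package by passing the package’s items as `recommended`.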

The HEIRO abstractly expressed the decision-making knowledge of health examination items. Fig. 4 depicts the 15 classes, 2-level class hierarchies, 3 types of object properties, and 16 types of data properties of HEIRO. Data properties are the heart of the ontology design, which echoes the property graph (see Table 1). The health examination items were divided into two categories, named “1” and “X”. There were 10 subclasses of health feature. The subclasses of health history, physical symptom, lifestyle, environmental health, mental health, and health literacy were expressed through data properties. The “category” property classifies the subclasses of health features and health examination items, totaling seven categories.

Through a systematic comparison with existing ontologies, we identified 8 overlapping elements with LOINC and 12 with SNOMED CT. The key differences primarily lie in the property representation and application purpose. A detailed comparison, including the classes and properties of HEIRO along with their definitions and distinctions from existing ontologies, is provided in the Additional File 7.

Table 1 Description of the object property (OP) and data property (DP) in HEIRO


The knowledge graph consisted of ontology-guided property graphs that represent the decision-making knowledge (see Fig. 5). The decision-making knowledge could be expressed in several ways: data properties were used to describe the subclasses of health examination items and health features, which increases the clarity of the knowledge graph; and a health feature with complex information can be represented as a single node with multiple data properties. As the complexity of the knowledge increases, the property graph makes it easy to update. In practical health examination applications, the knowledge represented by the property graph has a more cohesive internal structure, resulting in more efficient knowledge querying. This approach enables efficient and accurate decision support.
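The idea of one node carrying several data properties can be illustrated with a small in-memory property graph. This is a sketch only and not the graph database used in the study. The property names `category` and `frequency` and the relationship `hasRecommendation` come from the text, while the class names, the example feature, and the example item are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    label: str                                       # ontology class of the node
    name: str
    properties: dict = field(default_factory=dict)   # data properties

@dataclass
class Edge:
    source: Node
    relation: str                                    # object property
    target: Node

class PropertyGraph:
    def __init__(self):
        self.nodes = []
        self.edges = []

    def add_node(self, label, name, **properties):
        node = Node(label, name, properties)
        self.nodes.append(node)
        return node

    def add_edge(self, source, relation, target):
        self.edges.append(Edge(source, relation, target))

    def recommendations_for(self, name, **properties):
        """Items linked by hasRecommendation to a feature node whose data properties match."""
        return [
            edge.target.name
            for edge in self.edges
            if edge.relation == "hasRecommendation"
            and edge.source.name == name
            and all(edge.source.properties.get(k) == v for k, v in properties.items())
        ]

graph = PropertyGraph()
# Hypothetical feature and item, for illustration only
smoking = graph.add_node("HealthFeature", "smoking", category="lifestyle", frequency="daily")
lung_test = graph.add_node("HealthExaminationItem", "lung function test",
                           category="instrument examination")
graph.add_edge(smoking, "hasRecommendation", lung_test)
items = graph.recommendations_for("smoking", frequency="daily")
```

Adding a new piece of knowledge in this layout means adding a node or a property rather than a new class, which is the flexibility the paragraph describes.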

The proposed knowledge graph contained 584 classes, 781 object properties, and 1094 data properties (see Table 2). There were 315 health examination items, 269 health features, and 466 mapping relationships “hasRecommendation”. The main health features were health history and physical symptom. The “frequency” property (n = 49) went into great detail and depth on the frequency of occurrence of physical symptoms, behavioral habits, and mental conditions in daily life. The “type” property (n = 23) described the type of medication, surgical site, as well as behavioral habits in detail and in depth. The number of classes and properties depends on the ontology design, the content of the decision-making knowledge, and actual application scenarios.

Table 2 The number of classes, object properties (OP), and data properties (DP) in the knowledge graph


The decision-making process for health examination items involves the participants (app), the primary care clinicians (web), and the system. The proposed ontology guides the expression, storage, and transmission of information in the decision support process. The recommendation process of the system is depicted in Fig. 6. Participants’ health-related information was transformed into structured health feature data and entered into the system. The system then provided a list of personalized health examination item recommendations together with the reasons for them, which made the recommendations interpretable. Finally, primary care clinicians reviewed the system recommendations and decided on the final health examination items with the participants.
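The flow described above (structured features in, recommendations with reasons out, then clinician review) can be sketched as follows. The knowledge entries, feature names, and the helpers `recommend` and `clinician_review` are hypothetical and only illustrate the shape of the process, not the system's actual rules.

```python
# Hypothetical knowledge entries: (conditions on a health feature, examination item).
KNOWLEDGE = [
    ({"name": "hypertension", "category": "health history"}, "electrocardiogram"),
    ({"name": "cough", "frequency": "often"}, "chest X-ray"),
]

def recommend(features):
    """Return (item, reason) pairs for every entry matched by a structured feature."""
    results = []
    for feature in features:
        for conditions, item in KNOWLEDGE:
            if all(feature.get(key) == value for key, value in conditions.items()):
                reason = ", ".join(f"{key}={value}" for key, value in conditions.items())
                results.append((item, f"matched health feature: {reason}"))
    return results

def clinician_review(recommendations, rejected=(), added=()):
    """Final item list after the clinician removes or supplements items."""
    kept = [item for item, _ in recommendations if item not in rejected]
    return kept + [item for item in added if item not in kept]

participant = [
    {"name": "hypertension", "category": "health history"},
    {"name": "cough", "frequency": "rarely"},  # does not satisfy the "often" condition
]
recommendations = recommend(participant)
final_items = clinician_review(recommendations, added=["lipid panel"])
```

Keeping the matched conditions alongside each item is what lets the clinician and the participant see why an item was proposed.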

We selected 70 participants for system evaluation. In pre-evaluation, 7 primary care clinicians evaluated system recommendations for 10 participants (see Table 3). The primary care clinician experienced the entire evaluation process and provided some advice. We collected advice for enhancing the expert evaluation design (see Additional file 4 for the statement of advice for expert document design in pre-evaluation), including evaluation form content, evaluation criteria, and result processing.

Table 3 Participants characteristics of expert evaluation


In the formal evaluation, 11 primary care clinicians evaluated the system recommendations for the remaining 60 participants (see Table 3). In round one, we collected a total of 11 expert evaluation documents (see Table 4). Among the system-recommended items, the evaluation results were uncertain for 21 items (recommended, 1215; not recommended, 26). System-generated items that experts believed should not be implemented are referred to as “not recommended”, and items on which experts could not reach an agreement are referred to as “uncertain”. In addition to the system-generated items, experts supplemented 283 items. The uncertain and supplementary items were discussed in round two. After the two rounds of formal evaluation, we obtained the final health examination items (1433, an average of 24 health examination items per participant), which served as the gold standard. These figures illustrate the workload and difficulty of the evaluation.

Table 4 Formal evaluation results in round one and round two


The system’s capability to process input health feature data is shown in Fig. 7A. There were 472 health features in total (including demographics and health history, an average of 8 health features per participant), which again illustrates the difficulty of the evaluation. Of these health features, 96.2% conformed to our ontology-based data structure, and 56.4% of the health features entered into the system had corresponding health examination items in the knowledge graph. This demonstrated the system’s capability to handle various types of health feature data, as well as the richness of the decision-making knowledge in the knowledge graph.
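The two rates can be sketched as simple ratios. This assumes that the enter rate is the share of all health features that fit the ontology-based data structure, and that the mapping rate is taken over the features that were entered. The text does not spell out the denominators, so both are assumptions, and the feature names are toy values.

```python
def enter_and_mapping_rate(features, fits_ontology, has_items):
    """Return (enter_rate, mapping_rate) for a list of health features."""
    entered = [f for f in features if fits_ontology(f)]
    mapped = [f for f in entered if has_items(f)]
    enter_rate = len(entered) / len(features) if features else 0.0
    mapping_rate = len(mapped) / len(entered) if entered else 0.0
    return enter_rate, mapping_rate

# Toy data, not from the study
features = ["age", "hypertension", "cough", "free-text note"]
ontology_terms = {"age", "hypertension", "cough"}   # features the data structure can hold
mapped_terms = {"hypertension"}                     # features linked to at least one item

enter_rate, mapping_rate = enter_and_mapping_rate(
    features,
    fits_ontology=lambda f: f in ontology_terms,
    has_items=lambda f: f in mapped_terms,
)
```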

We compared system recommendations and existing packages to the gold standard to evaluate the system’s capability to recommend personalized health examination items (see Fig. 7B). The system recommended 1262 health examination items for the 60 participants and the packages recommended 1366. Of the 1262 system recommendations (package, 1366), 47 (package, 376) were classified as unnecessary. In addition, primary care clinicians added 218 items (package, 443) that had not been recommended. In terms of precision, 96.3% (package, 72.5%) of the proposed recommendations were correct for participants. In terms of recall, 84.8% (package, 69.1%) of the recommendations that should have been proposed were proposed by the system. Overall, the system outperformed the packages and was close to the experts.
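As a check on the arithmetic, the reported precision and recall can be reproduced from the counts in this paragraph, assuming that TP equals the number of recommended items minus those classified as unnecessary, and that FN equals the number of items added by the clinicians:

$$Precision_{system}=\frac{1262-47}{1262}=\frac{1215}{1262}\approx 0.963,\qquad Recall_{system}=\frac{1215}{1215+218}=\frac{1215}{1433}\approx 0.848$$

$$Precision_{package}=\frac{1366-376}{1366}=\frac{990}{1366}\approx 0.725,\qquad Recall_{package}=\frac{990}{990+443}=\frac{990}{1433}\approx 0.691$$

In both cases TP + FN equals 1433, the size of the gold standard.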

Moreover, in Fig. 8, we created matrices for the system and the existing packages to compare their decision-making capability for health examination items in physique, laboratory, and instrument examinations. In physique examination, the packages proposed slightly more correct items than the system (TP, 417:413) and missed fewer necessary items (FN, 12:16), but also proposed more incorrect items (FP, 50:0). In laboratory and instrument examinations, the system was stronger than the packages: it proposed more correct items (TP, 389:304 and 413:269), fewer incorrect items (FP, 0:93 and 47:233), and missed fewer necessary items (FN, 56:141 and 146:290). The system’s recommendations were comparable to those of experts and superior to those of packages, but more extensive decision-making knowledge is required to reduce recommendation inadequacy (FN).
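The per-category counts quoted above can be turned into per-category precision and recall, and summed to confirm that they agree with the overall totals. The sketch below uses the TP, FP, and FN values as stated in the text; the dictionary layout and function names are illustrative choices.

```python
# (TP, FP, FN) per examination category, as quoted in the text for Fig. 8
SYSTEM = {"physique": (413, 0, 16), "laboratory": (389, 0, 56), "instrument": (413, 47, 146)}
PACKAGE = {"physique": (417, 50, 12), "laboratory": (304, 93, 141), "instrument": (269, 233, 290)}

def metrics(tp, fp, fn):
    """Return (precision, recall) for one set of counts."""
    return tp / (tp + fp), tp / (tp + fn)

def totals(counts):
    """Sum TP, FP, and FN across categories."""
    return tuple(sum(values[i] for values in counts.values()) for i in range(3))

for name, counts in (("system", SYSTEM), ("package", PACKAGE)):
    for category, (tp, fp, fn) in counts.items():
        precision, recall = metrics(tp, fp, fn)
        print(f"{name:8s} {category:11s} precision={precision:.3f} recall={recall:.3f}")
    overall_p, overall_r = metrics(*totals(counts))
    print(f"{name:8s} {'overall':11s} precision={overall_p:.3f} recall={overall_r:.3f}")
```

The category counts sum to TP = 1215, FP = 47, FN = 218 for the system and TP = 990, FP = 376, FN = 443 for the packages, matching the overall figures reported for Fig. 7B.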

To the best of our knowledge, this study is the first to construct a CDSS for personalized health examination items in China, which could ameliorate the problem that the current health examination item package is not personalized enough. The main contributions are: (1) The HEIRO with data properties as the core design provides guidance for the expression of decision-making knowledge in property graphs, and provides reference for the collection of health-related information; (2) The knowledge graph is composed of ontology-guided property graphs and stored in a graph database, which provides rich and clear decision-making knowledge and a flexible data model; (3) This knowledge-based system provides interpretable health examination item recommendations and can be deployed in health examination settings to assist primary care clinicians directly, which further increases participants’ engagement in health examination; (4) The two-step expert evaluation was created to assess the system’s capability to recommend personalized health examination items. In the absence of a gold standard, determining whether recommendations (containing several health examination items) were appropriate for participants (with complex health features) was more challenging than deciding yes-or-no or classification results. Overall, the system’s performance was comparable to that of experts and better than that of packages.

Related health examination ontologies, such as the Neurologic Examination Ontology [39] and the Ontology for Newborn Screening Follow-up and Translational Research [38], mainly designed classes and class hierarchies to represent data entities in a certain type of examination or a specific population screening scenario. In HEIRO, the design centers on data properties and on the overall health examination of the individual, rather than on a specific disease [11] or a specific group [38]. Through a systematic comparison, we found 8 overlapping elements with LOINC and 12 with SNOMED CT (see Additional file 7), with notable differences in property representation and application purpose. The primary differences lie in the use of data properties. Unlike LOINC and SNOMED CT, which often represent similar concepts as classes, HEIRO used the data property “hasCategory” to reduce the number of classes and used several different data properties to express a piece of complex decision-making knowledge, improving the clarity and richness of knowledge expression. This design is essential for constructing the knowledge graph and further highlights the novelty of our approach. The evaluation results showed that the ontology can cover 96.2% of the health feature data. Although this performance is generally sufficient, the ontology still requires expansion. Moreover, experts recommended adding the properties of degree and weight to health features and health examination items (see Additional file 4 for the statement of advice for the health examination item recommendation in formal evaluation).

The knowledge graph was constructed using property graphs under the direction of the HEIRO and included the mapping relationships between health features and health examination items. Compared with some disease screening knowledge bases [25, 26], it had a flexible data model for easy knowledge updating, and the system could directly assist primary care clinicians by providing interpretable recommendations for health examination items. Experts also agreed with our decision-making strategy for formulating personalized health examination items (see Additional file 4 for the primary care clinicians’ attitude towards personalized health examination). There are three aspects to note about the mapping rate result (56.4%): first, decision-making knowledge needs to be constantly updated; second, basic items can cover items corresponding to some health features to make up for insufficient decision-making knowledge; third, not all health features that occur require further examination.

The evaluation results demonstrated that the system’s performance (precision, 0.963; recall, 0.848) was closer to that of experts and superior to that of packages (precision, 0.725; recall, 0.691). For the system, the main deficiency lies in FN, which is consistent with the issue highlighted by the mapping rate result. However, it is inappropriate to simply attach more health examination items to health features: more items are not necessarily better, and doing so may increase the number of incorrect items (FP) [11]. The packages contained more health examination items (1366) than the system recommended (1262). The system (47) and the packages (376) differed greatly in FP, with the packages generating more incorrect items. This may create an unnecessary financial burden on participants and expose them to potential harm from some items. Among the three types of health examination items, the main deficiencies were in laboratory and instrument examinations, which should be the focus of future research. This may be because the physique examination had fewer total items and more basic items. In addition, the expert advice contained many suggestions on specific laboratory and instrument examinations (see Additional file 4 for details).

To the best of our knowledge, this study is the first to construct a knowledge graph-based decision support system for health examination item recommendation in China. Our study has several strengths. First, we created a knowledge graph composed of HEIRO-guided property graphs to express knowledge clearly and flexibly. Second, the system can assist primary care clinicians directly by providing interpretable health examination item recommendations. Primary care clinicians and participants both knew why a certain health examination was performed, which further increased the acceptance of the health examination among participants. Third, we designed the two-step expert evaluation to determine whether comprehensive recommendations with several health examination items were appropriate for participants, which was more challenging than a simple yes-or-no evaluation and mainly considered the personalized needs of health examination, independent of money and resources.

Some potential weaknesses need to be acknowledged. First, the decision-making knowledge needs to be constantly updated to make up for insufficient knowledge. Second, the evaluation data only involved demographics and health history, although these are two relatively important factors in health examination item decision-making. Future prospective experiments should cover more comprehensive health features, system usage, customer satisfaction, etc. Third, previous health examination data should be deeply mined by artificial intelligence methods for decision-making. Fourth, the usability of the system remains to be evaluated in a controlled experiment in a future study.

The proposed CDSS for personalized health examination items in China can assist primary care clinicians directly through interpretable recommendations. The system’s performance was close to that of experts and outperformed the current non-personalized health examination item packages. This indicates that the system could improve the personalization of health examination items and health examination consultation services, and could also increase participants’ engagement in health examination.

The raw datasets analyzed during the current study are not publicly available due to personal data privacy protections but are available from the corresponding author on reasonable request. The detailed evaluation result datasets for each participant generated during this study are included in additional file 5.

CDSS: Clinical decision support system
USPSTF: United States Preventive Services Task Force
CTFPHC: Canadian Task Force on Preventive Health Care
HEIRO: Health examination item recommendation ontology
OP: Object property
DP: Data property
TP: True positive
FP: False positive
FN: False negative


All retrospective and anonymized data used in the expert evaluation of this article were obtained from Chinese People’s Liberation Army General Hospital Hainan Hospital. The expert evaluation was supported by eleven primary care clinicians from Chinese People’s Liberation Army General Hospital. The authors would like to thank Dr. Guanglin Zhong and Dr. Xiaoyuan Huyan for their contributions during the development of the decision-making knowledge.

This work was supported by the Key Research and Development Programs of Ningxia (No. 2023BEG02021), Guangxi (No. AB21196010), and Henan (No. 251111313700), Hainan Health Science and Technology Innovation Project of China (No. WSJK2025MS187), and Hainan Provincial Natural Science Foundation of China (No. 522RC611).

Authors

Yutong She
Huilong Duan
Ning Deng

DW, JA and SN conceptualized the system and designed the study. JA and SN acquired the evaluation data. DW and YS analyzed the evaluation data. DW drafted the original manuscript. DW, DH and ND revised the manuscript. All authors read and approved the final manuscript.

Correspondence to Ning Deng.

The present study was approved by the ethics committee of Zhejiang University’s School of Public Health (No. ZGL202112-2) and the Hainan Medical Ethics Committee (No. 00824482406), and conducted following the guidelines of the Declaration of Helsinki. All methods were performed in accordance with relevant guidelines and regulations. Data were anonymized at the point of extraction and no participant identifiable data is reported in the analysis. The study was approved by the above-mentioned ethics committee prior to data extraction and waived the need for participant consent because any data were collected and analyzed on a fully anonymized basis. This study obtained permission from the Chinese People’s Liberation Army General Hospital Hainan Hospital to use anonymous data during expert evaluation as informed consent. Zhejiang University and Chinese People’s Liberation Army General Hospital are both participating institutions in the National Key Research and Development Program of China (No. 2020YFC2003403). The evaluators who participated in the system evaluation had information about the study and they gave written consent to participate in expert evaluation as informed consent.

Not applicable.

The authors declare no competing interests.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.


Wu, D., An, J., Nan, S. et al. A knowledge-based clinical decision support system for personalized health examination items in China: design and evaluation. BMC Med Inform Decis Mak 25, 183 (2025). https://doi.org/10.1186/s12911-025-03019-2
