Open Access Review

Survey on Knowledge Representation Models in Healthcare

Batoul Msheik 1,*, Mehdi Adda, Hamid Mcheick and Mohamed Dbouk

Computer Science Department, Université du Québec à Chicoutimi, 555, Boul De l’Université, Chicoutimi, QC G7H 2B1, Canada

Département de Mathématiques, Informatique et Génie, Université du Québec à Rimouski, 300 Allée des Ursulines, Rimouski, QC G5L 3A1, Canada

Computer Science Department, Université Libanaise, Hadath, Beirut 6573/14, Lebanon

* Author to whom correspondence should be addressed.

Information 2024, 15(8), 435; https://doi.org/10.3390/info15080435

Submission received: 9 April 2024 / Revised: 17 June 2024 / Accepted: 25 June 2024 / Published: 26 July 2024

(This article belongs to the Special Issue Knowledge Representation and Ontology-Based Data Management)


Figure 1. Data, information, knowledge and wisdom chain.
Figure 2. Bayesian network representation.
Figure 3. Ontology representation.
Figure 4. Decision tree representation.
Figure 5. Neural network graph.
Figure 6. Unified modeling language representation.
Figure 7. Frame representation.
Figure 8. Semantic network graph.
Figure 9. Knowledge representation model categorization.
Figure 10. Satisfaction ratio for knowledge representation models based on medical domain requirements.
Figure 11. Citation number for knowledge representation models (over the last decade).


Abstract

Knowledge representation models, which aim to present data in a structured and comprehensible manner, have gained popularity as a research focus in the pursuit of human-level intelligence. Humans possess the ability to understand, reason and interpret knowledge. They acquire knowledge through their experiences and utilize it to carry out various actions in the real world. Machines can perform similar tasks through a process known as knowledge representation and reasoning. In this survey, we present a thorough analysis of knowledge representation models and their crucial role in information management within the healthcare domain. We provide an overview of various models, including ontologies, first-order logic and rule-based systems. We classify knowledge representation models into five categories based on their type, such as graphical, mathematical and other types. We compare these models based on four criteria (heterogeneity, interpretability, scalability and reasoning) in order to determine the most suitable model that addresses healthcare challenges and achieves a high level of satisfaction.

Keywords: knowledge representation model; healthcare requirements; heterogeneity; interpretability; scalability; reasoning

1. Introduction

As the volume and diversity of data rapidly increase and the usage domain of knowledge expands, there is an urgent need to represent and process knowledge [1], because it is not feasible to extract processed “knowledge” directly from extensive data repositories and storage systems. According to [1], “Data is defined as simple facts, information is defined as organized data, and knowledge is defined as the ability to understand the meaning of information”.

To gain a thorough understanding of any subject or concept, it is essential to have a precise and accurate portrayal of its relevant context. This highlights the importance of finding methods to represent knowledge through visual structures (such as trees and frames) or logical frameworks (like first-order logic and second-order logic) [2]. Knowledge representation models (KRMs) can facilitate the sharing and organizing of scientific knowledge, making it easier to comprehend and enhancing reasoning and inference processes [2].

KRMs are widely used in various domains such as expert systems, robotics, artificial intelligence and natural language processing. A study conducted in 2022 shows that 80% of applications in these fields utilize KRMs for representation and reasoning [3]. They have a significant impact on society, facilitating communication among groups and individuals in domains such as healthcare, education, science and engineering, finance, society and politics [4]. In the healthcare field, KRMs have been developed to effectively integrate essential components that healthcare providers work with, including electronic medical records, healthcare information systems and clinical practice guidelines [5]. The representation of vast amounts of information can assist in the creation of systems used for treatment plans and precise diagnoses [4].

There are still many requirements for expressing knowledge in the medical field, such as heterogeneity, interpretability, scalability and reasoning [6,7]. Because healthcare data are highly heterogeneous, simply combining raw data or model outputs from each perspective would mean missing the chance to explore potential connections and relationships between entities from different perspectives, which could enable strong reasoning [6,8]. Model interpretability refers to its ability to be easily understood and explained by end-users. This is especially important in healthcare, as interpretable models allow healthcare experts to make informed and data-driven decisions, leading to personalized care provision [9].

In addition, scalability refers to the ability of a system or model to manage larger volumes of data without affecting performance. This is particularly important in healthcare, where the volume and complexity of data are rapidly increasing. A scalable model can help healthcare professionals better understand patient needs and improve treatment outcomes [10].

This article comprehensively reviews KRMs in general, with a specific focus on their applications in healthcare. It includes an overview of recent surveys and papers on the topic, emphasizing the significance of these models. This review process is critical for determining the functional relevance of each model and understanding its potential applications. The study also highlights the specific requirements of the healthcare sector and compares KRMs to assess their efficiency using a satisfaction ratio formula. This formula, calculated for each model, aids in identifying the most suitable one based on healthcare needs.

The remainder of this paper is organized as follows: Section 2 defines concepts and types of knowledge. Section 3 presents a comprehensive overview of general knowledge representation models. Section 4 emphasizes the significance of knowledge representation models in the healthcare field. Section 5 compares these models based on medical domain requirements, including interpretability, heterogeneity, reasoning and scalability. Section 6 discusses the results. Finally, Section 7 provides a brief conclusion and outlines future work.

2. Background of Knowledge Concepts and Types

2.1. Data, Information, Knowledge and Wisdom

Data are fundamental facts used for transactions. They are of no use until they are organized, structured, interpreted and formatted for a specific purpose in order to generate accurate information [6]. Although all data can be considered as information, not all information is simply data, highlighting the importance of transforming data into meaningful information [7]. For instance, if Sami has a temperature of 38 °C, this is considered raw data. Providing information from this data point involves adding context: Sami’s normal body temperature is usually around 36.5 °C. Therefore, the information indicates that Sami’s current temperature is elevated above the normal range. Knowledge involves a deep comprehension of this information [6]. In this example, knowing that a temperature of 38 °C is generally considered a mild fever suggests that Sami may be experiencing some illness or infection. Wisdom entails intelligently analyzing this knowledge to make judgments and decisions [6]. For example, based on the understanding that a temperature of 38 °C indicates a mild fever and taking into account other factors such as Sami’s symptoms and overall health condition, one might decide to closely monitor Sami’s temperature and seek advice from a healthcare professional if necessary. This example will serve as a foundation that can be expanded upon depending on the model used. Figure 1 represents the DIKW model, illustrating its four components.
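The four DIKW stages of the Sami example can be made concrete with a short sketch. The code below is only an illustration under stated assumptions: the names `Reading`, `to_information`, `to_knowledge` and `to_wisdom` are hypothetical, and the 38 °C cut-off and the decision logic simply mirror the wording of the example rather than any clinical guideline.

```python
from dataclasses import dataclass

FEVER_THRESHOLD_C = 38.0  # assumption: 38 C treated as the mild-fever cut-off


@dataclass
class Reading:
    patient: str
    temperature_c: float  # data: the raw measurement
    baseline_c: float     # context that turns the measurement into information


def to_information(r):
    """Information: the raw value compared with the patient's usual temperature."""
    delta = r.temperature_c - r.baseline_c
    return f"{r.patient}: {delta:+.1f} C relative to baseline"


def to_knowledge(r):
    """Knowledge: interpreting the information with a rule of thumb."""
    return "mild fever" if r.temperature_c >= FEVER_THRESHOLD_C else "no fever"


def to_wisdom(r, other_symptoms):
    """Wisdom: a judgement combining knowledge with further factors."""
    if to_knowledge(r) == "no fever":
        return "no action needed"
    return "seek medical advice" if other_symptoms else "monitor temperature"


sami = Reading("Sami", temperature_c=38.0, baseline_c=36.5)
print(to_information(sami))
print(to_knowledge(sami))
print(to_wisdom(sami, other_symptoms=[]))
```

Each function consumes the output of the previous stage, which is the sense in which the chain builds from data towards wisdom.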

2.2. Knowledge Types

Understanding the different types of knowledge is crucial because it greatly influences how models are represented. Additionally, the specific domain of the knowledge also affects the type of model that is used [8]. Knowledge can be categorized into three main categories: explicit knowledge, tacit knowledge and implicit knowledge, as shown in Table 1. Regardless of the type, all knowledge needs to be represented and expressed in a suitable format and language in order to be processed by a computer.

Tacit knowledge is acquired naturally through learning and experience, and it becomes explicit knowledge when we document or represent it in various ways. In contrast, implicit knowledge, which is a component of this innate knowledge, can be acquired incidentally.

3. Knowledge Representation Models (KRMs)

According to Shrobe et al., “A knowledge representation is a fragmentary theory of intelligent reasoning; it should provide the set of inferences the representation allows or recommends” [14]. Knowledge representation involves a formal description of knowledge that emphasizes computer processing as a means of representation. KRMs are used to organize and describe data in a standardized model, facilitating access and handling by other components [15]. There are various models, each with its own strengths and weaknesses. This survey categorizes KRMs into five main categories based on types of representation. Each category offers distinct advantages and disadvantages depending on the specific problem domain, highlighting the importance of selecting the appropriate model for each use case.

3.1. Graphical Representation Models

The graphical representation model focuses on visually depicting relationships between variables and components using graphs [16]. These models typically utilize nodes and edges to represent entities and their interconnections, visually illustrating knowledge relationships. Below are brief descriptions of the graphical models considered in this survey.

A Bayesian network is a powerful model that combines probabilistic and graph-based representations [17]. It finds extensive application across domains such as machine learning, data mining and diagnosis due to its ability to perform evidence-based inference that aligns with human intuition [18]. To illustrate, consider the example mentioned earlier represented in a Bayesian network graph (Figure 2). In this graph, tables are populated with arbitrary probability values. Each table represents the probabilities based on whether specific conditions are satisfied or not. For example, the probability of a patient having a disease might be 0.9, while the probability of having both a high temperature and cough could be 0.8. Conversely, the probability of not having a disease might be 0.6, even if the probability of having a high temperature alone (without a cough) is 0.9.
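To show what evidence-based inference means in practice, the following sketch computes the probability of disease given observed symptoms by enumeration. It is a minimal illustration, not the network of Figure 2: the structure (Disease influencing Fever and Cough independently) and every probability value are hypothetical assumptions chosen for readability.

```python
# Hypothetical network: Disease -> Fever, Disease -> Cough.
# All numbers below are illustrative assumptions, not values from Figure 2.
P_DISEASE = 0.1
P_FEVER_GIVEN = {True: 0.9, False: 0.2}  # P(fever | disease present / absent)
P_COUGH_GIVEN = {True: 0.8, False: 0.3}  # P(cough | disease present / absent)


def joint(disease, fever, cough):
    """Joint probability, factorised along the edges of the network."""
    p_d = P_DISEASE if disease else 1 - P_DISEASE
    p_f = P_FEVER_GIVEN[disease] if fever else 1 - P_FEVER_GIVEN[disease]
    p_c = P_COUGH_GIVEN[disease] if cough else 1 - P_COUGH_GIVEN[disease]
    return p_d * p_f * p_c


def posterior_disease(fever, cough):
    """P(disease | fever, cough), obtained by normalising the joint."""
    with_disease = joint(True, fever, cough)
    without_disease = joint(False, fever, cough)
    return with_disease / (with_disease + without_disease)


print(round(posterior_disease(fever=True, cough=True), 3))
```

With these assumed tables, observing both symptoms raises the belief in disease well above the prior of 0.1, which is the kind of intuitive update the text refers to.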

An ontology is a structured method of representing knowledge by defining concepts and their relationships, typically expressed in languages such as RDF and OWL [19]. Figure 3 displays the ontology representation for the previously mentioned example. Ontologies have a wide range of applications in various fields, such as biology, medicine, cultural heritage, accounting and social media [19].
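The idea of concepts linked by typed relationships can be sketched with plain subject-predicate-object triples, the data model that RDF is built on. The example below is a simplified illustration and does not use any RDF or OWL tooling; the vocabulary (`subClassOf`, `hasSymptom`, `ClinicalFinding`) and the helper names are hypothetical and do not reproduce Figure 3.

```python
# Each fact is a (subject, predicate, object) triple.
# The vocabulary is a hypothetical simplification for illustration only.
TRIPLES = {
    ("Fever", "subClassOf", "Symptom"),
    ("Cough", "subClassOf", "Symptom"),
    ("Symptom", "subClassOf", "ClinicalFinding"),
    ("Sami", "type", "Patient"),
    ("Sami", "hasSymptom", "Fever"),
}


def superclasses(cls, triples=TRIPLES):
    """All classes reachable from cls by following subClassOf transitively."""
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if s == current and p == "subClassOf" and o not in found:
                found.add(o)
                frontier.append(o)
    return found


def has_finding_of_kind(patient, kind, triples=TRIPLES):
    """True if the patient has a symptom that is, or specialises, the given kind."""
    for s, p, o in triples:
        if s == patient and p == "hasSymptom":
            if o == kind or kind in superclasses(o, triples):
                return True
    return False


print(sorted(superclasses("Fever")))
print(has_finding_of_kind("Sami", "ClinicalFinding"))
```

The second query returns a fact that was never stated explicitly, which is a small instance of the reasoning that class hierarchies make possible.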

A decision tree is a graphical model used to represent knowledge, where each node corresponds to a test or decision based on an attribute or feature [20], as represented in Figure 4. Decision trees find applications in various fields, including engineering, education, law, business, healthcare and finance [21].
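A decision tree of this kind can be written down as a nested structure in which every internal node tests one attribute. The sketch below is illustrative only: the attributes, thresholds and recommendations are hypothetical assumptions based on the running example and are not taken from Figure 4.

```python
# Internal nodes are dicts holding a test; leaves are plain strings.
# Attributes, thresholds and outcomes are hypothetical.
TREE = {
    "attribute": "temperature_c",
    "threshold": 38.0,
    "below": "no action",
    "at_or_above": {
        "attribute": "cough",
        "threshold": 1,  # 1 means a cough is present
        "below": "monitor temperature",
        "at_or_above": "consult a healthcare professional",
    },
}


def decide(node, patient):
    """Walk from the root to a leaf, applying one attribute test per node."""
    while isinstance(node, dict):
        value = patient[node["attribute"]]
        node = node["at_or_above"] if value >= node["threshold"] else node["below"]
    return node


print(decide(TREE, {"temperature_c": 38.5, "cough": 1}))
```

Because the path from root to leaf is explicit, the reason for each decision can be read directly from the tests that were taken.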

A neural network is a machine learning model that mimics the structure and functionality of the human brain, comprising interconnected layers of nodes or neurons [22] as shown in Figure 5. Neural networks find applications across diverse domains including finance, healthcare and education [22].

Unified Modeling Language (UML) is rooted in the object-oriented paradigm, providing a framework for knowledge representation where relationships are defined using associations, aggregations and generalizations [23]. Figure 6 illustrates the previously mentioned example. The UML class diagram, a key component of UML, finds versatile applications across various domains including banking, finance, internet, aerospace and healthcare [23].

A frame is a model that represents concepts as structured sets of attributes and values [24] as illustrated in Figure 7. It finds application in diverse fields such as education, finance and others. Frames are designed primarily for straightforward one-to-one relationships, which can sometimes limit their ability to represent intricate or nuanced relationships between entities.
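A frame can be sketched as a named collection of slots (attributes) with values. The example below additionally lets a frame fall back on a parent frame for missing slots, a common feature of frame systems that is assumed here rather than described in the text; the `Frame` class and the slot names are hypothetical.

```python
class Frame:
    """A named set of slots; missing slots are looked up in the parent frame."""

    def __init__(self, name, parent=None, **slots):
        self.name = name
        self.parent = parent
        self.slots = dict(slots)

    def get(self, slot):
        frame = self
        while frame is not None:
            if slot in frame.slots:
                return frame.slots[slot]
            frame = frame.parent  # fall back to the more general frame
        raise KeyError(slot)


# Hypothetical frames for the running example.
person = Frame("Person", normal_temperature_c=36.5)
patient = Frame("Patient", parent=person, status="under observation")
sami = Frame("Sami", parent=patient, temperature_c=38.0)

print(sami.get("temperature_c"), sami.get("normal_temperature_c"))
```

Each slot holds a single value, which reflects the one-to-one character of frames noted above.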

Tuple-based representation is a model that uses sets of data, called tuples, to represent knowledge. In relational databases, a tuple corresponds to a record, where each tuple represents a specific item. For example, in a database table, each row is a tuple associated with a particular entity or object [25]. This model is straightforward and easy to use, as data can be represented in various forms such as tables or lists. In Table 2, a tuple would represent the record or row related to Sami in the context of the database.
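The correspondence between a tuple and a table row can be illustrated with a typed named tuple. The column names and the second row below are hypothetical, since the sketch is not meant to reproduce the contents of Table 2.

```python
from typing import NamedTuple


class PatientRecord(NamedTuple):
    """One row of a hypothetical patient table."""
    name: str
    temperature_c: float
    cough: bool


table = [
    PatientRecord("Sami", 38.0, True),
    PatientRecord("Lina", 36.6, False),  # hypothetical second row
]


def select(rows, predicate):
    """Relational-style selection: keep the rows that satisfy the predicate."""
    return [row for row in rows if predicate(row)]


feverish = select(table, lambda row: row.temperature_c >= 38.0)
print(feverish)
```

The simplicity is visible here: a record is just an ordered set of values, and querying amounts to filtering rows.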

A semantic network uses nodes and edges to visually represent information [26]. For instance, within the medical field, the nodes depicted in Figure 8 represent various aspects such as diseases, symptoms or treatments. The connections (edges) between these nodes highlight dependencies and relationships between different concepts, providing an effective graphical representation model for understanding complex data structures.

3.2. Learning-Based Models

Learning-based models encompass machine learning algorithms capable of learning relationships and dependencies from datasets, enabling them to perform predictions and classifications.

Random forest, an ensemble of decision trees, is used for classification or regression tasks, with each tree trained on a distinct subset of the data [27]. A key advantage of random forest is its ability to manage high-dimensional datasets effectively [27], making it widely applicable across various industries [28]. In our example, a random forest could represent multiple patients in the same graph. Other models such as neural networks, decision trees and Bayesian networks are commonly employed in machine learning [29,30]. Naïve Bayes, a supervised machine learning model, represents data using probability theory, allowing the probability that an instance with given features belongs to a class to be calculated [31].
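The Naïve Bayes calculation mentioned above can be sketched in a few lines for categorical features. This is a minimal illustration under stated assumptions: the six training rows are invented, the function names are hypothetical, and add-one (Laplace) smoothing is an implementation choice that is not discussed in the text.

```python
from collections import Counter, defaultdict


def train(rows, labels):
    """Count how often each feature value occurs within each class."""
    class_counts = Counter(labels)
    feature_counts = defaultdict(Counter)  # (class, feature index) -> value counts
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            feature_counts[(label, i)][value] += 1
    return class_counts, feature_counts


def predict_proba(model, row, n_values=2):
    """P(class | row), assuming features are independent given the class."""
    class_counts, feature_counts = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / total  # class prior
        for i, value in enumerate(row):
            # Add-one smoothing avoids zero probabilities for unseen values.
            score *= (feature_counts[(label, i)][value] + 1) / (count + n_values)
        scores[label] = score
    norm = sum(scores.values())
    return {label: s / norm for label, s in scores.items()}


# Invented training data: each row is (fever, cough).
rows = [(True, True), (True, False), (False, False),
        (False, True), (True, True), (False, False)]
labels = ["ill", "ill", "healthy", "healthy", "ill", "healthy"]

model = train(rows, labels)
proba = predict_proba(model, (True, True))
print(proba)
```

The independence assumption is what keeps the model simple, and it is also the source of the difficulty with complex feature relationships discussed later in the comparison.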

3.3. Rule-Based Models

Rule-based models operate on a set of rules that derive conclusions from expressed knowledge. In these systems, information is structured as a collection of “if–then” statements, where each rule consists of a condition (“if”) and an action (“then”) [31]. For example, in a medical context, when a new case is presented, symptoms are matched against conditions in the rules, and relevant rules are applied to make a diagnosis. Fuzzy logic, a mathematical approach that addresses uncertainty and imprecision in data, is one such method that allows for partial truth values ranging from 0 to 1 [32]. While traditional rule-based systems provide binary decisions (disease present or absent), fuzzy logic enables a more nuanced assessment, indicating the level or degree of disease presence [32,33]. Rule-based systems find applications in diagnosing medical conditions, evaluating financial investments, optimizing manufacturing processes and more. For instance:

Rule 1: If the patient has medium fever and joint pain, then they take Advil.

Rule 2: If the patient has high fever and severe cough, then they visit the physician.
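The two rules above, and the contrast between crisp and fuzzy conditions, can be sketched as follows. The code is a hedged illustration: the fact keys, the `apply_rules` helper and the temperature bounds of the fuzzy membership function are hypothetical assumptions, not values from the cited works.

```python
# Each rule is (name, condition over the facts, action).
RULES = [
    ("Rule 1",
     lambda f: f.get("fever") == "medium" and f.get("joint_pain", False),
     "take Advil"),
    ("Rule 2",
     lambda f: f.get("fever") == "high" and f.get("cough") == "severe",
     "visit the physician"),
]


def apply_rules(facts):
    """Return the (rule name, action) pairs whose conditions match the facts."""
    return [(name, action) for name, condition, action in RULES if condition(facts)]


def high_fever_degree(temp_c, low=37.5, high=39.5):
    """Fuzzy membership in 'high fever': 0 at or below low, 1 at or above high.

    The bounds are hypothetical and chosen only for illustration.
    """
    return min(1.0, max(0.0, (temp_c - low) / (high - low)))


print(apply_rules({"fever": "high", "cough": "severe"}))
print(high_fever_degree(38.5))
```

A crisp rule either fires or does not, whereas the membership function returns a degree between 0 and 1 that a fuzzy rule base could combine with other degrees before deciding.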

3.4. Mathematical Models

Mathematical models represent terms and rules using logical and probabilistic notation, allowing for the generation of new knowledge. This category encompasses several subcategories, including logic models and probabilistic models, some of which are also considered graphical models and have been previously explained.

Propositional logic is built on propositions, that is, declarative statements that can be classified as either true or false, and serves as a method to represent knowledge in a logical or mathematical format [33]. However, propositional logic has limitations in representing complex sentences or natural language statements [33], and it cannot describe statements in terms of their properties or logical relationships. An example can be expressed as:

(T ∧ H) → C

This statement means that if Sami has a temperature above 38 °C (T) and also has a headache (H), then Sami should consult a healthcare professional (C).
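Because each proposition is simply true or false, the formula can be checked exhaustively. The sketch below enumerates all eight truth assignments of T, H and C; the function names are hypothetical.

```python
from itertools import product


def implies(p, q):
    """Material implication: false only when p is true and q is false."""
    return (not p) or q


def formula(t, h, c):
    """(T and H) -> C from the example above."""
    return implies(t and h, c)


# Collect every assignment (T, H, C) under which the formula is false.
falsifying = [
    (t, h, c)
    for t, h, c in product([True, False], repeat=3)
    if not formula(t, h, c)
]
print(falsifying)
```

The only falsifying assignment is the one in which Sami has both a high temperature and a headache but does not consult a healthcare professional.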

First-order logic, also known as predicate logic, offers a more sophisticated way to describe information about objects and their relationships [34]. Unlike propositional logic, which deals with simple true or false statements, first-order logic recognizes that the world is more complex and operates similarly to natural language, allowing for nuanced expressions. The example below is stated for all patients x in a set X and all times t:

If the patient x has a temperature greater than 38 °C (denoted as 1 if true and 0 if false) and also has a cough (similarly treated as 1 if true and 0 if false), then both conditions must be true for their sum to equal 2:

(Temperature(x) > 38) + Cough(x) = 2

For each patient x belonging to X at time t, the following statement holds true:

∀x∀t: ((Temperature(x,t) > 38) + Cough(x,t) = 2) → (VirusDetection(x,t) = 1 ∧ ConsultHealthcare(x,t) = 1)

This indicates that if a patient x at time t meets the criteria of having both a temperature greater than 38 °C and a cough (sum equals 2), it implies positive virus detection (VirusDetection(x,t) = 1) and recommends healthcare consultation (ConsultHealthcare(x,t) = 1).
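Over a finite set of patients and times, the quantified statement can be applied and checked mechanically. The following sketch is illustrative only: the observations are invented, and the helper names `infer` and `rule_holds` are hypothetical.

```python
# Invented observations: (patient, time) -> (temperature in C, cough present).
observations = {
    ("Sami", 1): (38.5, True),
    ("Sami", 2): (37.0, True),
    ("Lina", 1): (36.8, False),
}


def meets_criteria(temperature, cough):
    """Both conditions hold exactly when their 0/1 indicators sum to 2."""
    return int(temperature > 38) + int(cough) == 2


def infer(obs):
    """Apply the rule to every (x, t) pair and return the derived facts."""
    derived = {}
    for key, (temperature, cough) in obs.items():
        if meets_criteria(temperature, cough):
            derived[key] = {"VirusDetection": 1, "ConsultHealthcare": 1}
    return derived


def rule_holds(obs, derived):
    """Check the implication for all x and t in the finite domain."""
    return all(
        not meets_criteria(temperature, cough)
        or (derived.get(key, {}).get("VirusDetection") == 1
            and derived.get(key, {}).get("ConsultHealthcare") == 1)
        for key, (temperature, cough) in obs.items()
    )


derived = infer(observations)
print(derived)
print(rule_holds(observations, derived))
```

The universal quantifiers become a loop over the domain, which is what distinguishes this from the single fixed proposition of the previous subsection.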

Second-order logic extends first-order logic by allowing quantification over sets or collections of objects. While first-order logic quantifies over individual objects like people, animals or numbers, second-order logic permits quantification over collections such as properties, relations and functions [34]. In second-order logic, we use P to quantify over properties (such as temperature, cough and consulting healthcare), allowing us to directly describe these properties and their relationships. For example, if the patient x has a temperature greater than 38 °C (denoted as 1 if true and 0 if false) and also has a cough (similarly treated as 1 if true and 0 if false), then both conditions must be true for their sum to equal 2:

P(Temperature(x) > 38) + P(Cough(x)) = 2

For each patient x, property P and time t, the following statement holds true:

∀x∀P∀t: (P(Temperature(x,t) > 38) + P(Cough(x,t)) = 2) → (P(VirusDetection(x,t) = 1) ∧ P(ConsultHealthcare(x,t) = 1))

3.5. Hybrid Models

Hybrid models combine two or more KRMs to leverage the strengths of each and mitigate their individual limitations [35]. Combining ontologies with rule-based systems, for instance, overcomes many limitations by providing a more comprehensive and adaptable representation of knowledge. This hybrid approach excels in handling complex and uncertain information while offering easier management and maintenance over time [36].

Another effective combination pairs ontologies with Bayesian networks, which enhances knowledge representation by structuring domain knowledge and modelling uncertainty and probabilistic relationships effectively [37]. Limitations such as the difficulty of accurately capturing probabilities in complex systems can emerge when either model is used alone. When both are integrated, the ontology offers a structured representation of domain knowledge, while the Bayesian network models uncertainty and probabilistic relationships among concepts and entities [36,37], thereby mitigating some of these challenges.

Additionally, combining neural networks with semantic networks harnesses their respective strengths to develop a more adaptable and robust knowledge representation system [38]. Neural networks excel in processing data but often lack interpretability, whereas semantic networks are effective at representing concepts and their relationships but typically do not learn directly from data [38]. Integrating these two approaches yields a system that combines the data processing capabilities of neural networks with the structured representation strengths of semantic networks, resulting in a more comprehensive and meaningful knowledge representation framework.

Figure 9 illustrates the basic categories and their subcategories for KRMs. The five main categories are distinguished by the beige colour, while models belonging to more than one category are marked in blue. Models that belong exclusively to one category are listed in regular text. Additionally, the mathematical category includes two subcategories highlighted in pink. Table 3 provides a detailed overview of the advantages and disadvantages of each model.

In the previous section, we classified different models utilized for data representation, such as mathematical, graphical and logical models, and examined their respective strengths and weaknesses. Now, we will investigate specific applications of KRMs in healthcare during the past decade.

4. Importance of Knowledge Representation Models in the Medical Domain

Healthcare is increasingly embracing tele-monitoring systems, which give patients self-management tools [51]. The effective management and interpretation of vast, complex data are crucial for accurate diagnoses, optimal treatment planning and improved patient outcomes. KRMs play a crucial role in providing a structured framework for organizing and representing medical data. This allows healthcare professionals to store and utilize valuable information effectively, including risk factors, treatments, symptoms and other patient-related data [52]. Various methods are used to represent different types of medical data, each with unique strengths and weaknesses. For example, Bayesian networks can model risk factors, decision trees are effective for treatment pathways, fuzzy logic-based models manage symptoms, while UML or frame notations are useful for demographic variables [52,53,54,55].

Over the past decade, the International Workshop on Knowledge Representation for Health Care (KR4HC) has highlighted the prevalence of several generic models in healthcare. Ontologies constitute 31%, semantic web-related formalisms account for 26%, decision trees and rules represent 19%, logic covers 14%, and probabilistic models encompass 10% of the represented knowledge models. These contributions primarily aim to formalize and represent medical knowledge [56].

In the healthcare field, KRMs provide benefits to various stakeholders such as doctors, patients, hospitals and research institutions across multiple dimensions including disease diagnosis, monitoring and treatment [57]. The monitoring process involves analyzing relevant parameters, measurements and activities over time to maintain control over a patient’s health, monitor treatment progress and provide better care by detecting changes [58]. Organizing medical knowledge through KRMs establishes rules and relationships across different contexts, facilitating decision support during monitoring. This enables medical staff to interpret data and make informed decisions for improved patient care [59]. Furthermore, disease diagnosis involves identifying specific illnesses based on established classifications, which enables accurate diagnosis [60]. Knowledge representation and reasoning techniques assist doctors in decision-making and inferring new information about diseases from previously represented data [61,62]. For example, Shi et al. [63] developed a knowledge graph based on medical texts and applied semantic reasoning to enhance disease diagnosis.

Table 4 provides an overview of the utilization of KRMs in the healthcare domain.

The next section outlines the specific requirements and challenges within healthcare, demonstrating how KRMs can address these needs.

5. Requirements in the Medical Domain

In healthcare and medicine, knowledge plays a crucial role. Healthcare professionals depend on understanding the workings of the human body and follow established methods outlined in clinical guidelines for diagnosis, monitoring and treatment [88]. Understanding healthcare organization is also vital for effective patient management. In this section, we evaluate different KRMs against specific requirements, with the objective of choosing a model that meets most of these needs. Before selecting a model, however, it is important to define and fully understand the significance of each requirement.

Heterogeneity: KRMs must manage various types of information sources in healthcare, such as data from sensors, like GPS sensors, which require interpretation, and user profiles that are updated less frequently and generally do not need additional interpretation. An effective model in the medical domain should be capable of managing these diverse information types to ensure accurate decisions and comprehensive representations [89].

Interpretability: Interpretability is crucial as it enables medical staff and patients to understand results easily and comprehend how inputs are processed and how outputs are derived. Therefore, when developing KRMs for the medical field, prioritizing interpretability is essential to ensure practical usability and acceptance by healthcare professionals [90].

Reasoning: Reasoning refers to the ability to draw deductions and inferences from the information stored in the knowledge base, even when that information is incomplete or uncertain. This capability is vital in medical domain models as it empowers healthcare providers to make informed decisions based on available information [91].

Scalability: Scalability is the model’s ability to manage large volumes of data or knowledge efficiently. It is a critical factor in design as it directly impacts performance and efficiency. Models that are not scalable may struggle with larger datasets or more complex problems, resulting in slower or less effective performance [92].

The evaluation of KRMs in Table 5 is based on the four essential requirements explained earlier. We use the term “satisfy” to indicate supported characteristics, “not satisfy” for characteristics that are not supported and “partial satisfaction” for characteristics where support is unclear or not fully met.

To effectively represent knowledge, it is crucial to utilize models with clear representations. Our analysis, based on approximately 28 articles, examines and evaluates the KRMs against these four key criteria.

As previously discussed, there are five primary categories of representations, each encompassing multiple models. Table 5 assesses these models according to specific medical needs within this domain, as previously outlined. It is important to note that certain models have inherent limitations. For instance, Naïve Bayes struggles with certain data types like numerical data [110]. Additionally, propositional logic lacks essential quantifiers such as “for all” (∀) and “there exists” (∃), which are necessary for expressing statements about entire classes of objects or specific instances [106].

In terms of interpretability, models such as Markov models are considered only partially interpretable. The complexity of the data limits the level of detail and explanation that can be derived from these models [109]. Similarly, a random forest comprises multiple decision trees, which makes it challenging to comprehend the relationships among them and hinders clear and intuitive interpretation of the entire model [112]. Regarding reasoning, several models such as propositional logic, frames, UML and semantic networks have limitations in making inferences based on data representation [76,100,103].

Lastly, concerning scalability, both frames and semantic networks may struggle to scale when managing large volumes of interconnected information. Naïve Bayes, on the other hand, faces challenges in capturing complex relationships, and its scalability may be impacted as the number of features increases.

We can derive a general equation regardless of the number of requirements and their respective weights in the medical domain. Consider R, a set of n requirements:

R = {r1, r2, r3, …, rn}

And consider W, a set of n different weights corresponding to these requirements:

W = {w1, w2, w3, …, wn}

Therefore, a general formula to calculate the total satisfaction ratio (TSR) can be described as shown in Formula (1):

TSR = \frac{\sum_{k=1}^{n} valueOf(r_{k}) \cdot w_{k}}{\sum_{k=1}^{n} w_{k}}

Depending on the requirements in the healthcare domain—heterogeneity, interpretability, reasoning and scalability—we select the model with the highest satisfaction ratio using Formula (2):

TSR = \frac{valueOf(heterogeneity) \cdot w_{1} + valueOf(interpretability) \cdot w_{2} + valueOf(reasoning) \cdot w_{3} + valueOf(scalability) \cdot w_{4}}{\sum_{k=1}^{4} w_{k}}

To calculate the overall ratio for the set of requirements, we assign a weight factor “w” to each requirement based on its importance. While adjusting these weights depending on the context remains a future challenge we aim to address, our survey currently assumes equal priority among the four requirements. Therefore, each weight factor is set to 1, enabling us to apply Formula (3) for calculating the overall ratio (TSR):

TSR = \frac{valueOf(heterogeneity) + valueOf(interpretability) + valueOf(reasoning) + valueOf(scalability)}{4}

We used Table 5, replacing “satisfy”, “partial satisfaction” and “not satisfy” with 1, 0.5 and 0, respectively, in each column. Using Formula (3), we calculated the satisfaction ratio for each model listed in the table by summing its four values and dividing the result by 4. The satisfaction ratios of all models are presented in Figure 10, which reflects their performance against the healthcare requirements mentioned earlier. Figure 11 displays citation numbers indicating the usage of each model in the medical field, with higher numbers indicating greater adoption. According to our survey, ontology achieved the highest ratio and citation number, indicating its suitability for meeting healthcare requirements.
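The calculation described above can be sketched as a small function that covers both the weighted and the equal-weight case. The ratings in the example are hypothetical and are not taken from Table 5; the names `SCORE` and `tsr` are likewise assumptions made for illustration.

```python
# Numeric values assigned to the three ratings, as described in the text.
SCORE = {"satisfy": 1.0, "partial satisfaction": 0.5, "not satisfy": 0.0}


def tsr(ratings, weights=None):
    """Total satisfaction ratio: weighted mean of the requirement values.

    With no weights given, every requirement has weight 1, which reduces
    the general formula to the equal-weight form divided by the number
    of requirements.
    """
    if weights is None:
        weights = {name: 1.0 for name in ratings}
    weighted_sum = sum(SCORE[ratings[name]] * weights[name] for name in ratings)
    return weighted_sum / sum(weights[name] for name in ratings)


# Hypothetical ratings for one model (not a row of Table 5).
example = {
    "heterogeneity": "satisfy",
    "interpretability": "satisfy",
    "reasoning": "partial satisfaction",
    "scalability": "not satisfy",
}

print(tsr(example))  # (1 + 1 + 0.5 + 0) / 4 = 0.625
print(tsr(example, {"heterogeneity": 1, "interpretability": 1,
                    "reasoning": 2, "scalability": 1}))  # 3 / 5 = 0.6
```

The second call shows how a context-dependent weighting, left as future work in this survey, would change the ratio for the same ratings.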

6. Results and Discussion

As discussed in previous sections, each model in the medical domain possesses unique strengths and weaknesses. KRMs are crafted to interpret complex information, making it comprehensible and accessible to machines [115]. According to Figure 10, the ontology model achieves the highest satisfaction ratio with respect to the healthcare requirements mentioned earlier.

Ontology is recognized as a robust semantic structure that facilitates knowledge representation, integration and reasoning [116]. In 1997, Borst defined an ontology as a “formal specification of a shared conceptualization” [117]. Studer later refined this definition, stating that an ontology is a “formal, explicit specification of a shared conceptualization” [118,119]. In this context, a conceptualization is an abstract model that simplifies real-world representation using objects, concepts, entities and their relationships within a specific domain. Explicitness requires precise definitions of all concepts and constraints to prevent misinterpretation and ensure clear understanding of symbols [119]. Formality demands that the ontology be machine-readable. Lastly, “shared” implies that an ontology is consensual knowledge accepted by a group or community [119]. Reusing existing ontologies is beneficial as it saves time, leverages proven effective applications and facilitates interoperability with existing tools [119].

Ontology provides a structured representation of concepts and relationships within a domain, making it well-suited for the complexity and diversity inherent in the medical field [120]. Biomedical ontologies play a critical role in enabling efficient retrieval and reuse of machine-readable information stored in databases [120].
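To make the idea of concepts, relationships and reasoning concrete, the following is a minimal sketch of an ontology fragment stored as subject-predicate-object triples, with a simple inference step that follows `is_a` links. The concept names (`Asthma`, `RespiratoryDisease`, `Wheezing`, `Lung`), the predicates and the helper functions are hypothetical and chosen only for illustration; a production system would more likely rely on a standard ontology language and a dedicated reasoner.

```python
from collections import defaultdict

# Hypothetical mini-ontology stored as (subject, predicate, object) triples.
TRIPLES = [
    ("Asthma", "is_a", "RespiratoryDisease"),
    ("RespiratoryDisease", "is_a", "Disease"),
    ("Asthma", "has_symptom", "Wheezing"),
    ("RespiratoryDisease", "affects", "Lung"),
]


def build_index(triples):
    """Index triples as subject -> predicate -> set of objects."""
    index = defaultdict(lambda: defaultdict(set))
    for subject, predicate, obj in triples:
        index[subject][predicate].add(obj)
    return index


def ancestors(index, concept):
    """Return every class reachable from `concept` by following is_a links."""
    found, stack = set(), [concept]
    while stack:
        for parent in index[stack.pop()]["is_a"]:
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def inherited(index, concept, predicate):
    """Collect values of `predicate` asserted on the concept or on any ancestor."""
    values = set(index[concept][predicate])
    for parent in ancestors(index, concept):
        values |= index[parent][predicate]
    return values


index = build_index(TRIPLES)
print(ancestors(index, "Asthma"))             # classes inferred through is_a
print(inherited(index, "Asthma", "affects"))  # property inherited from a parent class
```

The `affects` relation is never stated for `Asthma` directly; it is derived from its parent class, which is the kind of inference that an explicit class hierarchy makes possible.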

7. Conclusions and Future Work

This survey analyzes various KRMs and their significant role in organizing and structuring diverse types of information. KRMs are categorized into five types based on their representation, including logical models, semantic networks, frame-based models, object-oriented models and rule-based models, with a focus on their application in the medical domain. Publications from the past decade were evaluated to assess KRMs against four critical requirements in the medical field: reasoning, interpretability, heterogeneity and scalability. This evaluation enabled a comparison of the models’ advantages, disadvantages and capabilities. The study concludes by offering insights that can inspire future research in the medical domain, particularly for diseases such as asthma. It underscores the need for a standardized model to represent disease knowledge, which is crucial for the early diagnosis, treatment and management of conditions. Overall, this survey provides a comprehensive analysis of KRMs in the medical domain and their potential to enhance healthcare outcomes. However, further work is needed to apply the selected model in practice and confirm that it meets the specific requirements of the medical domain. Demonstrating the capabilities of the chosen model on a concrete condition such as asthma will be essential for establishing its effectiveness in healthcare.

Funding

This research received no external funding.

Conflicts of Interest

The authors declare no conflict of interest.

References

Stafford, S.P. Data, information, knowledge, and wisdom. Knowl. Manag. Organ. Intell. Learn. Complex. 2009, 3, 179. [Google Scholar]
Kamala, S.P.R.; Justus, S. A Study on Knowledge Representation Models. Eur. J. Mol. Clin. Med. 2020. [Google Scholar]
Praveenkumar, T.; Sabhrish, B.; Saimurugan, M.; Ramachandran, K. Pattern recognition based on-line vibration monitoring system for fault diagnosis of automobile gearbox. Measurement 2018, 114, 233–242. [Google Scholar] [CrossRef]
Abu-Salih, B. Domain-specific knowledge graphs: A survey. J. Netw. Comput. Appl. 2021, 185, 103076. [Google Scholar] [CrossRef]
Lenz, R.; Miksch, S.; Peleg, M.; Reichert, M.; Riano, D.; Teije, A. Process Support and Knowledge Representation in Health Care; Springer: Tallinn, Estonia, 2013. [Google Scholar]
Wang, F.; Preininger, A. AI in health: State of the art, challenges, and future directions. Yearb. Med. Inform. 2019, 28, 16–26. [Google Scholar] [CrossRef] [PubMed]
Vellido, A. The importance of interpretability and visualization in machine learning for applications in medicine and health care. Neural Comput. Appl. 2019, 32, 18069–18083. [Google Scholar] [CrossRef]
Shi, L.; Li, S.; Yang, X.; Qi, J.; Pan, G.; Zhou, B. Semantic Health Knowledge Graph: Semantic Integration of Heterogeneous Medical Knowledge and Services. BioMed Res. Int. 2017, 2017, 2858423. [Google Scholar] [CrossRef] [PubMed]
Weng, C.; Wu, X.; Luo, Z.; Boland, M.R.; Theodoratos, D.; Johnson, S.B. EliXR: An approach to eligibility criteria extraction and representation. J. Am. Med. Inform. Assoc. 2011, 18, i116–i124. [Google Scholar] [CrossRef]
Blasch, E.; Kadar, I.; Salerno, J.; Kokar, M.M.; Das, S.; Powell, G.M.; Corkill, D.D.; Ruspini, E.H. Issues and challenges of knowledge representation and reasoning methods in situation assessment (Level 2 Fusion). Signal Process. Sens. Fusion Target Recognit. XV 2006, 6235, 355–368. [Google Scholar]
Cooper, P. Data, information, knowledge and wisdom. Anaesth. Intensive Care Med. 2017, 18, 55–56. [Google Scholar] [CrossRef]
Lee, J.; Lapira, E.; Bagheri, B.; Kao, H.-A. Recent advances and trends in predictive manufacturing systems in big data environment. Manuf. Lett. 2013, 1, 38–41. [Google Scholar] [CrossRef]
Bright, C.; Kay, A.; Feeney, A. The effects of domain and type of knowledge on category-based inductive reasoning. In Proceedings of the Annual Meeting of the Cognitive Science Society, Portland, Oregon, USA, 11–14 August 2010; Volume 32. [Google Scholar]
Lieto, A.; Minieri, A.; Piana, A.; Radicioni, D.P. A knowledge-based system for prototypical reasoning. Connect. Sci. 2015, 27, 137–152. [Google Scholar] [CrossRef]
Sowa, J.F. Common Logic. A Framework for a Family of Logic-Based Languages. 2008. Available online: http://www.jfsowa.com/talks/clprop.htm (accessed on 16 June 2024).
Holsapple, C.W. Knowledge and its attributes. In Handbook on Knowledge Management 1: Knowledge Matters; Springer: Berlin/Heidelberg, Germany, 2004; pp. 165–188. [Google Scholar]
Nobécourt, J.; Brigitte, B. Md: A Modelling Language to Build a Formal Ontology in Either Description Logics or Conceptual Graphs. In International Conference on Knowledge Engineering and Knowledge Management; Springer: Berlin/Heidelberg, Germany, 2000; Volume 1937. [Google Scholar]
Edward, E.O.; Nlerum, P.A. Knowledge representation in artificial intelligence and expert systems using inference rule. Int. J. Sci. Eng. Res. 2020, 11, 1886–1900. [Google Scholar]
Chandrasegaran, S.K.; Ramani, K.; Sriram, R.D.; Horváth, I.; Bernard, A.; Harik, R.F.; Gao, W. The evolution, challenges, and future of knowledge representation in product design systems. Comput.-Aided Des. 2012, 45, 204–228. [Google Scholar] [CrossRef]
Pinker, S. A theory of graph comprehension. In Artificial Intelligence and the Future of Testing; Psychology Press: London, UK, 2014; pp. 73–126. [Google Scholar]
Wang, F.; Li, H.; Dong, C.; Ding, L. Knowledge representation using non-parametric Bayesian networks for tunneling risk analysis. Reliab. Eng. Syst. Saf. 2019, 191, 106529. [Google Scholar] [CrossRef]
Frank, A.U. Tiers of ontology and consistency constraints in geographical information systems. Int. J. Geogr. Inf. Sci. 2001, 15, 667–678. [Google Scholar] [CrossRef]
Azad, M.; Chikalov, I.; Moshkov, M. Representation of Knowledge by Decision Trees for Decision Tables with Multiple Decisions. Procedia Comput. Sci. 2020, 176, 653–659. [Google Scholar] [CrossRef]
Myles, A.J.; Feudale, R.N.; Liu, Y.; Woody, N.A.; Brown, S.D. An introduction to decision tree modeling. J. Chemom. 2004, 18, 275–285. [Google Scholar] [CrossRef]
Levine, D.S.; Aparicio, M., IV (Eds.) Neural Networks for Knowledge Representation and Inference; Psychology Press: London, UK, 2013. [Google Scholar]
Hotz, L.; Felfernig, A.; Stumptner, M.; Ryabokon, A.; Bagley, C.; Wolter, K. Configuration Knowledge Representation and Reasoning; Morgan Kaufmann: San Francisco, CA, USA, 2014. [Google Scholar]
Minock, M. Knowledge Representation using Schema Tuple Queries; KRDB: Umea, Sweden, 2003. [Google Scholar]
Kenett, Y.N.; Faust, M. A Semantic Network Cartography of the Creative Mind. Trends Cogn. Sci. 2019, 23, 271–274. [Google Scholar] [CrossRef] [PubMed]
Sipper, M.; Moore, J.H. Conservation machine learning: A case study of random forests. Sci. Rep. 2021, 11, 3629. [Google Scholar] [CrossRef] [PubMed]
Prajwala, T.R. A comparative study on decision tree and random forest using R tool. Int. J. Adv. Res. Comput. Commun. Eng. 2015, 4, 196–199. [Google Scholar]
Jadhav, S.D.; Channe, H.P. Comparative study of K-NN, naive Bayes and decision tree classification techniques. Int. J. Sci. Res. 2016, 5, 1842–1845. [Google Scholar]
Prieto, J.; Corchado, J.M. A Review of k-NN Algorithm Based on Classical and Quantum Machine Learning. In Distributed Computing and Artificial Intelligence, Special Sessions, 17th International Conference; Springer Nature: L’Aquila, Italy, 2020; Volume 1242, p. 189. [Google Scholar]
Schuster-Böckler, B.; Bateman, A. An Introduction to Hidden Markov Models. Curr. Protoc. Bioinform. 2007, 18, A.3A.1–A.3A.9. [Google Scholar] [CrossRef] [PubMed]
Rushdi, A.M.; Rushdi, M.A. Mathematics and examples of the modern syllogistic method of propositional logic. Math. Appl. Inf. Syst. Bentham Sci. Publ. Emir. Sharjah United Arab. Emir. 2018, 6, 123–167. [Google Scholar]
Bartels, P.H.; Hiessl, H. Expert systems in histopathology. II. Knowledge representation and rule-based systems. Anal. Quant. Cytol. Histol. 1989, 11, 147–153. [Google Scholar] [PubMed]
Lifschitz, V.; Morgenstern, L.; Plaisted, D. Knowledge representation and classical logic. Found. Artif. Intell. 2008, 3, 3–88. [Google Scholar]
Frick, M.; Grohe, M. The complexity of first-order and monadic second-order logic revisited. Ann. Pure Appl. Log. 2004, 130, 3–31. [Google Scholar] [CrossRef]
Fan, C.-Y.; Chang, P.-C.; Lin, J.-J.; Hsieh, J. A hybrid model combining case-based reasoning and fuzzy decision tree for medical data classification. Appl. Soft Comput. 2011, 11, 632–644. [Google Scholar] [CrossRef]
Setiawan, F.A.; Budiardjo, E.K.; Basaruddin, T.; Aminah, S. A Systematic Literature Review on Combining Ontology with Bayesian Network to Support Logical and Probabilistic Reasoning. In Proceedings of the 2017 International Conference on Software and e-Business, Hong Kong, China, 28–30 December 2017. [Google Scholar]
Yan, Z.; Zhang, H.; Jia, Y.; Breuel, T.; Yu, Y. Combining the best of convolutional layers and recurrent layers: A hybrid network for semantic segmentation. arXiv 2016, arXiv:1603.04871. [Google Scholar]
Vu, D.-H.; Vu, T.-S.; Luong, T.-D. An efficient and practical approach for privacy-preserving Naive Bayes classification. J. Inf. Secur. Appl. 2022, 68, 103215. [Google Scholar] [CrossRef]
Nikam, S.S. A comparative study of classification techniques in data mining algorithms. Orient. J. Comput. Sci. Technol. 2015, 8, 13–19. [Google Scholar]
Fletcher, S.; Islam, M.Z. Decision tree classification with differential privacy: A survey. ACM Comput. Surv. 2019, 52, 1–33. [Google Scholar] [CrossRef]
Kotsiantis, S.B.; Zaharakis, I.; Pintelas, P. Supervised machine learning: A review of classification techniques. Emerg. Artif. Intell. Appl. Comput. Eng. 2007, 160, 3–24. [Google Scholar]
Uusitalo, L. Advantages and challenges of Bayesian networks in environmental modelling. Ecol. Model. 2007, 203, 312–318. [Google Scholar] [CrossRef]
Schlüter, F. A survey on independence-based Markov networks learning. Artif. Intell. Rev. 2012, 42, 1069–1093. [Google Scholar] [CrossRef]
Weng, C.; Gennari, J.H.; Fridsma, D.B. User-centered semantic harmonization: A case study. J. Biomed. Inform. 2007, 40, 353–364. [Google Scholar] [CrossRef] [PubMed]
Kulakowski, K.; Nalepa, G.J. Using UML state diagrams for visual modeling of business rules. In Proceedings of the 2008 International Multiconference on Computer Science and Information Technology, Wisla, Poland, 20–22 October 2008. [Google Scholar]
Borge-Holthoefer, J.; Arenas, A. Semantic Networks: Structure and Dynamics. Entropy 2010, 12, 1264–1302. [Google Scholar] [CrossRef]
Konopka, B.M. Biomedical ontologies—A review. Biocybern. Biomed. Eng. 2015, 35, 75–86. [Google Scholar] [CrossRef]
Callahan, T.J.; Tripodi, I.J.; Pielke-Lombardo, H.; Hunter, L.E. Knowledge-Based Biomedical Data Science. Annu. Rev. Biomed. Data Sci. 2020, 3, 23–41. [Google Scholar] [CrossRef] [PubMed]
Tian, S.; Yang, W.; Le Grange, J.M.; Wang, P.; Huang, W.; Ye, Z. Smart healthcare: Making medical care more intelligent. Glob. Health J. 2019, 3, 62–65. [Google Scholar] [CrossRef]
Anwar, S.S.; Ahmad, U.; Khan, M.M.; Haider, M.F.; Akhtar, J. Artificial Intelligence in Healthcare: An Overview. Int. J. Eng. Res. Adv. Technol. 2020, 6, 38–45. [Google Scholar]
Narula, S.; Shameer, K.; Omar, A.M.S.; Dudley, J.T.; Sengupta, P.P. Machine-Learning Algorithms to Automate Morphological and Functional Assessments in 2D Echocardiography. J. Am. Coll. Cardiol. 2016, 68, 2287–2295. [Google Scholar] [CrossRef] [PubMed]
Tylman, W.; Waszyrowski, T.; Napieralski, A.; Kamiński, M.; Trafidło, T.; Kulesza, Z.; Kotas, R.; Marciniak, P.; Tomala, R.; Wenerski, M. Real-time prediction of acute cardiovascular events using hardware-implemented Bayesian networks. Comput. Biol. Med. 2016, 69, 245–253. [Google Scholar] [CrossRef] [PubMed]
Ruiz-Fernández, D.; Torra, A.M.; Soriano-Payá, A.; Marín-Alonso, O.; Palencia, E.T. Aid decision algorithms to estimate the risk in congenital heart surgery. Comput. Methods Programs Biomed. 2016, 126, 118–127. [Google Scholar] [CrossRef] [PubMed]
Gulshan, V.; Peng, L.; Coram, M.; Stumpe, M.C.; Wu, D.; Narayanaswamy, A.; Venugopalan, S.; Widner, K.; Madams, T.; Cuadros, J.; et al. Development and Validation of a Deep Learning Algorithm for Detection of Diabetic Retinopathy in Retinal Fundus Photographs. JAMA 2016, 316, 2402–2410. [Google Scholar] [CrossRef] [PubMed]
Yoo, K.D.; Noh, J.; Lee, H.; Kim, D.K.; Lim, C.S.; Kim, Y.H.; Lee, J.P.; Kim, G.; Kim, Y.S. A Machine Learning Approach Using Survival Statistics to Predict Graft Survival in Kidney Transplant Recipients: A Multicenter Cohort Study. Sci. Rep. 2017, 7, 8904. [Google Scholar] [CrossRef] [PubMed]
Sohn, S.; Larson, D.W.; Habermann, E.B.; Naessens, J.M.; Alabbad, J.Y.; Liu, H. Detection of clinically important colorectal surgical site infection using Bayesian network. J. Surg. Res. 2016, 209, 168–173. [Google Scholar] [CrossRef] [PubMed]
Esteva, A.; Kuprel, B.; Novoa, R.A.; Ko, J.; Swetter, S.M.; Blau, H.M.; Thrun, S. Dermatologist-level classification of skin cancer with deep neural networks. Nature 2017, 542, 115–118. [Google Scholar] [CrossRef] [PubMed]
Lezcano-Valverde, J.M.; Salazar, F.; León, L.; Toledano, E.; Jover, J.A.; Fernandez-Gutierrez, B.; Soudah, E.; González-Álvaro, I.; Abasolo, L.; Rodriguez-Rodriguez, L. Development and validation of a multivariate predictive model for rheumatoid arthritis mortality using a machine learning approach. Sci. Rep. 2017, 7, 10189. [Google Scholar] [CrossRef] [PubMed]
Khan, R.S.; Zardar, A.A.; Bhatti, Z. Artificial Intelligence based Smart Doctor using Decision Tree Algorithm. arXiv 2018, arXiv:1808.01884. [Google Scholar]
Wang, C.; Zhang, J.J.; Tang, B.B.; Fu, S.Y. Comparison of equivalent current systems for the substorm event of 8 March 2008 derived from the global PPMLR-MHD model and the KRM algorithm. J. Geophys. Res. 2011, 116, A07207. [Google Scholar] [CrossRef]
Bleakley, K.; Yamanishi, Y. Supervised prediction of drug–target interactions using bipartite local models. Bioinformatics 2009, 25, 2397–2403. [Google Scholar] [CrossRef] [PubMed]
Li, L.; Wang, P.; Yan, J.; Wang, Y.; Li, S.; Jiang, J.; Sun, Z.; Tang, B.; Chang, T.-H.; Wang, S.; et al. Real-world data medical knowledge graph: Construction and applications. Artif. Intell. Med. 2020, 103, 101817. [Google Scholar] [CrossRef] [PubMed]
Xie, J.; Jiang, J.; Wang, Y.; Guan, Y.; Guo, X. Learning an expandable EMR-based medical knowledge network to enhance clinical diagnosis. Artif. Intell. Med. 2020, 107, 101927. [Google Scholar] [CrossRef] [PubMed]
Monroy, N.; Altuve, M. Analysis of the observation sequence duration of hidden Markov models for QRS complex detection in single-lead ECG recordings. In Proceedings of the 2018 Computing in Cardiology Conference (CinC), Maastricht, The Netherlands, 23–26 September 2018; Volume 45. [Google Scholar]
Sandag, G.A.; Tedry, N.E.; Lolong, S. Classification of Lower Back Pain Using K-Nearest Neighbor Algorithm. In Proceedings of the 2018 6th International Conference on Cyber and IT Service Management (CITSM), Parapat, Indonesia, 7–9 August 2018. [Google Scholar]
Lin, H.; Long, E.; Ding, X.; Diao, H.; Chen, Z.; Liu, R.; Huang, J.; Cai, J.; Xu, S.; Zhang, X.; et al. Prediction of myopia development among Chinese school-aged children using refraction data from electronic medical records: A retrospective, multicentre machine learning study. PLOS Med. 2018, 15, e1002674. [Google Scholar] [CrossRef] [PubMed]
Kar, S.; Majumder, D.D. A mathematical theory of shape and neuro-fuzzy methodology-based diagnostic analysis: A comparative study on early detection and treatment planning of brain cancer. Int. J. Clin. Oncol. 2017, 18, 349–681. [Google Scholar] [CrossRef] [PubMed]
Johansson, F.D.; Collins, J.E.; Yau, V.; Guan, H.; Kim, S.C.; Losina, E.; Sontag, D.; Stratton, J.; Trinh, H.; Greenberg, J.; et al. Predicting Response to Tocilizumab Monotherapy in Rheumatoid Arthritis: A Real-world Data Analysis Using Machine Learning. J. Rheumatol. 2021, 48, 1364–1370. [Google Scholar] [CrossRef] [PubMed]
Tran, T.Q.B.; du Toit, C.; Padmanabhan, S. Artificial intelligence in healthcare—The road to precision medicine. J. Hosp. Manag. Health Policy 2021, 5, 29. [Google Scholar] [CrossRef]
Allalou, A.; Nalla, A.; Prentice, K.J.; Liu, Y.; Zhang, M.; Dai, F.F.; Ning, X.; Osborne, L.R.; Cox, B.J.; Gunderson, E.P.; et al. A Predictive Metabolic Signature for the Transition From Gestational Diabetes Mellitus to Type 2 Diabetes. Diabetes 2016, 65, 2529–2539. [Google Scholar] [CrossRef]
Bertaud-Gounot, V.; Duvauferrier, R.; Burgun, A. Ontology and medical diagnosis. Inform. Health Soc. Care 2012, 37, 51–61. [Google Scholar] [CrossRef] [PubMed]
Nyangaresi, V.O.; El-Omari, N.K.T.; Nyakina, J.N. Efficient Feature Selection and ML Algorithm for Accurate Diagnostics. J. Comput. Sci. Res. 2022, 4, 10–19. [Google Scholar] [CrossRef]
Kasbekar, P.U.; Goel, P.; Jadhav, S.P. A Decision Tree Analysis of Diabetic Foot Amputation Risk in Indian Patients. Front. Endocrinol. 2017, 8, 25. [Google Scholar] [CrossRef] [PubMed]
Topuz, K.; Zengul, F.D.; Dag, A.; Almehmi, A.; Yildirim, M.B. Predicting graft survival among kidney transplant recipients: A Bayesian decision support model. Decis. Support Syst. 2018, 106, 97–109. [Google Scholar] [CrossRef]
Boldsen, J.K.; Engedal, T.S.; Pedraza, S.; Cho, T.-H.; Thomalla, G.; Nighoghossian, N.; Baron, J.-C.; Fiehler, J.; Østergaard, L.; Mouridsen, K. Better Diffusion Segmentation in Acute Ischemic Stroke Through Automatic Tree Learning Anomaly Segmentation. Front. Neurosci. 2018, 12, 21. [Google Scholar] [CrossRef] [PubMed]
Putra, R.M.; Adiwijaya; Utama, D.Q. Snake bite classification using Chain code and K nearest neighbour. J. Phys. Conf. Ser. 2019, 1192, 012015. [Google Scholar] [CrossRef]
Dan, N.; Long, T.; Jia, X.; Lu, W.; Gu, X.; Iqbal, Z.; Jiang, S. A feasibility study for predicting optimal radiation therapy dose distributions of prostate cancer patients from patient anatomy using deep learning. Sci. Rep. 2019, 9, 1076. [Google Scholar]
Dash, D.P.; Kolekar, M.H. Hidden Markov model based epileptic seizure detection using tunable Q wavelet transform. J. Biomed. Res. 2020, 34, 170–179. [Google Scholar] [CrossRef] [PubMed]
Kovačević, Ž.; Pokvić, L.G.; Spahić, L.; Badnjević, A. Prediction of medical device performance using machine learning techniques: Infant incubator case study. Health Technol. 2020, 10, 151–155. [Google Scholar] [CrossRef]
Tran, L.; Li, Y.; Nocera, L.; Shahabi, C.; Xiong, L. MultiFusionNet: Atrial Fibrillation Detection With Deep Neural Networks. AMIA Summits Transl. Sci. Proc. 2020, 2020, 654–663. [Google Scholar] [PubMed]
Jiang, X.; Ding, H.; Shi, H.; Li, C. Novel QoS optimization paradigm for IoT systems with fuzzy logic and visual information mining integration. Neural Comput. Appl. 2020, 32, 16427–16443. [Google Scholar] [CrossRef]
Chen, Z.; Chen, J.; Zhou, J.; Lei, F.; Zhou, F.; Qin, J.-J.; Zhang, X.-J.; Zhu, L.; Liu, Y.-M.; Wang, H.; et al. A risk score based on baseline risk factors for predicting mortality in COVID-19 patients. Curr. Med Res. Opin. 2021, 37, 917–927. [Google Scholar] [CrossRef] [PubMed]
Lucas, P.J.; Cabral, C.; Hay, A.D.; Horwood, J. A systematic review of parent and clinician views and perceptions that influence prescribing decisions in relation to acute childhood infections in primary care. Scand. J. Prim. Health Care 2015, 33, 11–20. [Google Scholar] [CrossRef] [PubMed]
Bettini, C.; Brdiczka, O.; Henricksen, K.; Indulska, J.; Nicklas, D.; Ranganathan, A.; Riboni, D. A survey of context modelling and reasoning techniques. Pervasive Mob. Comput. 2010, 6, 161–180. [Google Scholar] [CrossRef]
Kassam, A.; Kassam, N. Artificial intelligence in healthcare: A Canadian context. Health Manag. Forum 2019, 33, 5–9. [Google Scholar] [CrossRef] [PubMed]
Ahmed, M.U.; Barua, S.; Begum, S. Artificial Intelligence, Machine Learning and Reasoning in Health Informatics—Case Studies. In Signal Processing Techniques for Computational Health Informatics; Springer: Berlin/Heidelberg, Germany, 2021; pp. 261–291. [Google Scholar]
Ben Charif, A.; Zomahoun, H.T.V.; Gogovor, A.; Abdoulaye Samri, M.; Massougbodji, J.; Wolfenden, L.; Ploeg, J.; Zwarenstein, M.; Milat, A.J.; Rheault, N.; et al. Tools for assessing the scalability of innovations in health: A systematic review. Health Res. Policy Syst. 2022, 20, 34. [Google Scholar] [CrossRef] [PubMed]
Kyrimi, E.; Dube, K.; Fenton, N.; Fahmi, A.; Neves, M.R.; Marsh, W.; McLachlan, S. Bayesian networks in healthcare: What is preventing their adoption? Artif. Intell. Med. 2021, 116, 102079. [Google Scholar] [CrossRef] [PubMed]
Nguyen, L. Overview of Bayesian Network. Sci. J. Math. Stat. 2013. [Google Scholar]
Mengshoel, O.J. Understanding the scalability of Bayesian network inference using clique tree growth curves. Artif. Intell. 2010, 174, 984–1006. [Google Scholar] [CrossRef]
Jain, D. Knowledge engineering with markov logic networks: A review. Evol. Knowl. Theory Appl. 2011, 16, 50–75. [Google Scholar]
Faghih-Roohi, S.; Xie, M.; Ng, K.M. Accident risk assessment in marine transportation via Markov modelling and Markov Chain Monte Carlo simulation. Ocean Eng. 2014, 91, 363–370. [Google Scholar] [CrossRef]
Amith, M.F.; He, Z.; Bian, J.; Lossio-Ventura, J.A.; Tao, C. Assessing the practice of biomedical ontology evaluation: Gaps and opportunities. J. Biomed. Inform. 2018, 80, 1–13. [Google Scholar] [CrossRef] [PubMed]
Park, M.J.; Lee, J.; Lee, C.H.; Lin, J.; Serres, O.; Chung, C.W. An efficient and scalable management of ontology. In Proceedings of the Advances in Databases: Concepts, Systems and Applications: 12th International Conference on Database Systems for Advanced Applications, DASFAA 2007, Bangkok, Thailand, 9–12 April 2007; Springer: Berlin/Heidelberg, Germany, 2007. [Google Scholar]
Ma, X.; Ma, C.; Wang, C. A new structure for representing and tracking version information in a deep time knowledge graph. Comput. Geosci. 2020, 145, 104620. [Google Scholar] [CrossRef]
Weinstein, M.C.; O’Brien, B.; Hornberger, J.; Jackson, J.; Johannesson, M.; McCabe, C.; Luce, B.R. Principles of Good Practice for Decision Analytic Modeling in Health-Care Evaluation: Report of the ISPOR Task Force on Good Research Practices—Modeling Studies. Value Health 2003, 6, 9–17. [Google Scholar] [CrossRef] [PubMed]
Zhang, D.; Yao, L.; Chen, K.; Wang, S.; Haghighi, P.D.; Sullivan, C. A Graph-Based Hierarchical Attention Model for Movement Intention Detection from EEG Signals. IEEE Trans. Neural Syst. Rehabil. Eng. 2019, 27, 2247–2253. [Google Scholar] [CrossRef] [PubMed]
Enos, J.R. Merging system architecture and social network analysis to better understand emergent networks of systems. In Proceedings of the International Annual Conference of the American Society for Engineering Management, Charlotte, NC, USA, 26–29 October 2016; American Society for Engineering Management (ASEM): Huntsville, AL, USA, 2016. [Google Scholar]
Monte-Serrat, D.M.; Cattani, C. Interpretability in neural networks towards universal consistency. Int. J. Cogn. Comput. Eng. 2021, 2, 30–39. [Google Scholar] [CrossRef]
Navigli, R.; Bevilacqua, M.; Conia, S.; Montagnini, D.; Cecconi, F. Ten Years of BabelNet: A Survey. IJCAI 2021, 4559–4567. [Google Scholar]
Park, S.H. Semantic network analysis of presidential debates in 2007 election in Korea. Korean J. Commun. Inf. 2009, 45, 220–254. [Google Scholar]
Song, S.; Lin, Y.; Guo, B.; Di, Q.; Lv, R. Scalable Distributed Semantic Network for knowledge management in cyber physical system. J. Parallel Distrib. Comput. 2018, 118, 22–33. [Google Scholar] [CrossRef]
Lakemeyer, G.; Levesque, H.J. A first-order logic of limited belief based on possible worlds. In Proceedings of the International Conference on Principles of Knowledge Representation and Reasoning, Rhodes, Greece, 12–18 September 2020; Volume 17. [Google Scholar]
Laurent, S. Reasoning with propositional logic: From sat solvers to knowledge compilation. In A Guided Tour of Artificial Intelligence Research: Volume II: AI Algorithms; Springer: Berlin/Heidelberg, Germany, 2020; pp. 115–152. [Google Scholar]
García-García, J.; Enríquez, J.; Domínguez-Mayo, F. Characterizing and evaluating the quality of software process modeling language: Comparison of ten representative model-based languages. Comput. Stand. Interfaces 2018, 63, 52–66. [Google Scholar] [CrossRef]
Kalcheva, N.; Todorova, M.; Marinova, G. Naive Bayes Classifier, Decision Tree and AdaBoost Ensemble Algorithm–Advantages and Disadvantages. In Proceedings of the 6th ERAZ Conference Proceedings (Part of ERAZ Conference Collection), Online, 21 May 2020. [Google Scholar]
Ahmad, M.A.; Eckert, C.; Teredesai, A. Interpretable machine learning in healthcare. In Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics, New York, NY, USA, 15 August 2018. [Google Scholar]
Jackins, V.; Vimal, S.; Kaliappan, M.; Lee, M.Y. AI-based smart prediction of clinical disease using random forest classifier and Naive Bayes. J. Supercomput. 2021, 77, 5198–5219. [Google Scholar] [CrossRef]
Alghunaim, S.; Al-Baity, H.H. On the Scalability of Machine-Learning Algorithms for Breast Cancer Prediction in Big Data Context. IEEE Access 2019, 7, 91535–91546. [Google Scholar] [CrossRef]
Renganathan, V. Overview of artificial neural network models in the biomedical domain. Bratisl. Med J. 2019, 120, 536–540. [Google Scholar] [CrossRef] [PubMed]
Ajami, H.; Mcheick, H. Ontology-Based Model to Support Ubiquitous Healthcare Systems for COPD Patients. Electronics 2018, 7, 371. [Google Scholar] [CrossRef]
Eshghishargh, A.; Milton, S.; Egan, G.F.; Lonie, A.; Kolbe, S.; Killeen, N.E.; Lohrey, J.M. An ontology-based semantic question complexity model and its applications in neuroinformatics. Front. Neurosci. 2015, 9. [Google Scholar]
Khattak, A.M.; Batool, R.; Pervez, Z.; Khan, A.M.; Lee, S. Ontology Evolution and Challenges. J. Inf. Sci. Eng. 2013, 29, 851–871. [Google Scholar]
Ashburner, M.; Ball, C.A.; Blake, J.A.; Botstein, D.; Butler, H.; Cherry, J.M.; Davis, A.P.; Dolinski, K.; Dwight, S.S.; Eppig, J.T.; et al. Gene ontology: Tool for the unification of biology. Nat. Genet. 2000, 25, 25–29. [Google Scholar] [CrossRef] [PubMed]
Shahpori, R.; Doig, C. Systematized Nomenclature of Medicine–Clinical Terms direction and its implications on critical care. J. Crit. Care 2010, 25, 364.e1–364.e9. [Google Scholar] [CrossRef] [PubMed]
Hastings, J.; Owen, G.; Dekker, A.; Ennis, M.; Kale, N.; Muthukrishnan, V.; Turner, S.; Swainston, N.; Mendes, P.; Steinbeck, C. ChEBI in 2016: Improved services and an expanding collection of metabolites. Nucleic Acids Res. 2015, 44, D1214–D1219. [Google Scholar] [CrossRef] [PubMed]
Tao, C.; Jiang, G.; Oniki, T.A.; Freimuth, R.R.; Zhu, Q.; Sharma, D.; Pathak, J.; Huff, S.M.; Chute, C.G. A semantic-web oriented representation of the clinical element model for secondary use of electronic health records data. J. Am. Med Inform. Assoc. 2012, 20, 554–562. [Google Scholar] [CrossRef]

Figure 1. Data, information, knowledge and wisdom chain.

Figure 2. Bayesian network representation.

Figure 3. Ontology representation.

Figure 4. Decision tree representation.

Figure 5. Neural network graph.

Figure 6. Unified modeling language representation.

Figure 7. Frame representation.

Figure 8. Semantic network graph.

Figure 9. Knowledge representation model categorization.

Figure 10. Satisfaction ratio for knowledge representation models based on medical domain requirements.

Figure 11. Citation number for knowledge representation models (over the last decade).

Table 1. Types of knowledge.

Knowledge Types	Description	Examples
Explicit knowledge	A form of knowledge that can be expressed in various formats such as plain text, documents, spreadsheets, databases, images, etc., and can exist in structured or unstructured forms [10,11].	Scientific formulas; recipes for a chef; books. For example: The capital of France is Paris.
Implicit knowledge	Knowledge gained through incidental activities, or without awareness that learning is occurring [12].	How to walk, run, ride a bicycle or swim.
Tacit knowledge	This type of knowledge is subjective, based on personal experience, and often difficult to express. It resides in the human brain in an inexpressible form and is of an intellectual nature. This original form of knowledge is primarily obtained through learning and experiences [13].	Doctor’s diagnosis; musician’s improvisation.

Table 2. Tuple representation.

Name	Age	Sex	Description	Temperature	Cough
Sami	32	Male	Patient	38 °C	Yes
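As an illustration of the tuple representation in Table 2, the row can be sketched as a typed, immutable record. The field names follow the table's column headers; the class name `PatientRecord`, the helper `has_fever_and_cough` and the 38.0 °C threshold are assumptions made for this example and are not clinical guidance from the survey.

```python
from typing import NamedTuple


class PatientRecord(NamedTuple):
    """One row of Table 2; field names follow the table's column headers."""
    name: str
    age: int
    sex: str
    description: str
    temperature_c: float  # body temperature in degrees Celsius
    cough: bool


# The single row shown in Table 2.
sami = PatientRecord("Sami", 32, "Male", "Patient", 38.0, True)


def has_fever_and_cough(patient: PatientRecord, threshold_c: float = 38.0) -> bool:
    """Hypothetical rule over the tuple; the threshold is illustrative only."""
    return patient.temperature_c >= threshold_c and patient.cough


print(sami)
print(has_fever_and_cough(sami))
```

A named tuple keeps the positional character of the tuple model while allowing fields to be read by name, which makes simple rules over a record easier to follow.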

Table 3. A comprehensive overview: advantages and disadvantages of knowledge models.

Model	Advantages	Disadvantages
Naïve Bayes [39,40]	Simple; manages noisy data and missing values.	Inability to handle continuous features.
Decision tree [41]	Intuitive interpretation; manages missing values.	Difficulty in designing large trees; low ability to manage noisy data.
Random forest [42]	Ability to manage noisy data; suitable for large and heterogeneous data.	Requires more memory; difficult implementation and complicated in some cases; low ability to manage missing values.
Bayesian network [43]	Handles missing or incomplete data; graphical structure facilitates interoperability.	Memory requirements increase with the number of variables. Large and complex networks may face scalability issues.
Markov representation [44]	Ability to represent large data; robustness in handling noisy data.	Future transitions depend only on the current state, not on the entire history. This memoryless property can be limiting, especially when considering long-term effects or complex dependencies.
UML [45,46]	Graphical representation for software design.	Becomes complex for large systems.
Frame [47]	Hierarchical organization of knowledge; easy to construct.	Difficulty in representing complex relationships.
Semantic network [48]	Effective when representing relationships between entities.	Faces challenges with large graphs.
Ontology [49]	Structured representation of data; allows navigation of complex processes. Enables integration of domain knowledge for high-level reasoning.	Creating an ontology model can be time-consuming.
First-order logic [50]	Expressive; handles complex relationships.	Faces challenges in representing certain types of data.
Second-order logic [50]	Adds expressiveness over first-order logic.	Increased complexity in reasoning.
Propositional logic [50]	Simple and easy to understand.	Limited expressiveness for complex relationships.

Table 4. Utilization of knowledge representation models (KRMs) in the healthcare domain.

Ref.	KRMs	Health Domain
[64]	Mathematical representation models	Disease diagnosis.
[65]	Random forest	Treatment for rheumatoid arthritis.
[66]	Random forest	Detection of inflammatory bowel disease.
[67]	Decision tree	Patient and disease diagnosis.
[68]	Ontology	Data representation in medical domain.
[69]	K-nearest neighbours	Diagnosis and data in X-ray image.
[70]	Neural network	Finding signs of diabetic retinopathy.
[71,72]	Naïve Bayes	Data representation in congenital heart surgery.
[73]	Decision tree	Evaluation of the risk of amputation in individuals with diabetic foot.
[74]	Random forest	Antibody data representation in kidney transplantation.
[75]	Naïve Bayes	Data representation from ultrasound images.
[76]	Bayesian network	Used in treatment process.
[77]	Decision tree	Specific details in the diagnosis process.
[78]	K-nearest neighbours	Forecasting the risk of retinopathy.
[79]	Random forest	Monitoring for school children.
[80]	Decision tree	Detection of spread patterns in recent ischemic stroke.
[81]	K-nearest neighbours	Detection of thyroid disease.
[82]	Neural network	Cancer treatment with advanced therapy techniques.
[83]	Markov model	Disease diagnosis using ECG.
[84]	Decision tree	Patient monitoring.
[85]	Neural network	Identification of coronary calcium.
[86]	Fuzzy logic	Medical record representation for diagnosis prediction.
[87]	Random forest	Diagnosing chronic kidney disease (CKD).

Table 5. Assessment of knowledge representation models (KRMs) against medical domain requirements.

KRMs	Heterogeneity	Interpretability	Reasoning	Scalability
Bayesian network	Satisfied [93]	Satisfied [94]	Satisfied [93]	Partially satisfied [94]
Markov representation	Satisfied [75]	Partially satisfied [95]	Satisfied [95]	Satisfied [96]
Ontology	Satisfied [97]	Satisfied [97]	Satisfied [97]	Satisfied [98]
UML	Satisfied [99]	Satisfied [99]	Partially satisfied [100]	Partially satisfied [99]
Frame	Satisfied [101]	Satisfied [99]	Not satisfied [100]	Partially satisfied [101]
Tuple-based representation	Satisfied [102]	Satisfied [102]	Partially satisfied [102]	Partially satisfied [102]
Semantic network	Satisfied [103]	Satisfied [103]	Partially satisfied [104]	Partially satisfied [104]
First-order logic	Partially satisfied [105]	Satisfied [105]	Satisfied [105]	Partially satisfied [105]
Second-order logic	Satisfied [106]	Satisfied [107]	Satisfied [106]	Partially satisfied [106]
Propositional logic	Partially satisfied [106]	Satisfied [106]	Partially satisfied [107]	Not satisfied [107]
Rule-based system	Satisfied [108]	Satisfied [108]	Satisfied [109]	Partially satisfied [108]
Naïve Bayes	Partially satisfied [110]	Partially satisfied [110]	Satisfied [110]	Satisfied [110]
Random forest	Satisfied [111]	Partially satisfied [112]	Satisfied [112]	Satisfied [112]
Decision tree	Satisfied [112]	Satisfied [111]	Satisfied [112]	Partially satisfied [112]
Neural network	Satisfied [113]	Not satisfied [114]	Satisfied [114]	Satisfied [114]
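
One way to summarize Table 5 is to turn its three ratings into numbers and average them per model. The sketch below does this under an explicit assumption: full satisfaction counts as 1, partial satisfaction as 0.5 and no satisfaction as 0, with all four criteria weighted equally. These weights, and the names `WEIGHTS`, `satisfaction_ratio` and `rank_models`, are illustrative choices made here and are not necessarily the scoring behind the satisfaction ratio reported in the survey.

```python
# Assumed weights (not from the survey):
# S = satisfied, P = partially satisfied, N = not satisfied.
WEIGHTS = {"S": 1.0, "P": 0.5, "N": 0.0}

# Ratings transcribed from Table 5, one letter per criterion, in column order:
# heterogeneity, interpretability, reasoning, scalability.
TABLE_5 = {
    "Bayesian network": "SSSP",
    "Markov representation": "SPSS",
    "Ontology": "SSSS",
    "UML": "SSPP",
    "Frame": "SSNP",
    "Tuple-based representation": "SSPP",
    "Semantic network": "SSPP",
    "First-order logic": "PSSP",
    "Second-order logic": "SSSP",
    "Propositional logic": "PSPN",
    "Rule-based system": "SSSP",
    "Naïve Bayes": "PPSS",
    "Random forest": "SPSS",
    "Decision tree": "SSSP",
    "Neural network": "SNSS",
}


def satisfaction_ratio(ratings):
    """Average the assumed weights over the rated criteria."""
    return sum(WEIGHTS[r] for r in ratings) / len(ratings)


def rank_models(table):
    """Return model names ordered from highest to lowest ratio."""
    return sorted(table, key=lambda m: satisfaction_ratio(table[m]), reverse=True)
```

With these assumed weights, ontology is the only model rated as fully satisfying all four criteria, so `rank_models(TABLE_5)` places it first. Changing the weights or weighting the criteria unequally would change the ordering of the remaining models.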

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Msheik, B.; Adda, M.; Mcheick, H.; Dbouk, M. Survey on Knowledge Representation Models in Healthcare. Information 2024, 15, 435. https://doi.org/10.3390/info15080435
