CN110349639B - Multi-center medical term standardization system based on general medical term library - Google Patents
- Publication number
- CN110349639B (CN201910629244.9A)
- Authority
- CN
- China
- Prior art keywords
- term
- medical
- terms
- module
- mapping
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/21—Design, administration or maintenance of databases
- G06F16/215—Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/242—Query formulation
- G06F16/2433—Query languages
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/25—Integrating or interfacing systems involving database management systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3344—Query execution using natural language analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3346—Query execution using probabilistic model
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H10/00—ICT specially adapted for the handling or processing of patient-related medical or healthcare data
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Probability & Statistics with Applications (AREA)
- Artificial Intelligence (AREA)
- Epidemiology (AREA)
- General Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Primary Health Care (AREA)
- Public Health (AREA)
- Mathematical Physics (AREA)
- Quality & Reliability (AREA)
- Medical Treatment And Welfare Office Work (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a multi-center medical term standardization system based on a general medical term library. The system comprises a source database, a database connection management module, a pre-analysis module, a term mapping unit, an incremental update module, an exception handling module, and a multi-center interaction module. The invention addresses medical term standardization across multiple medical data centers and keeps the expression of medical terms consistent among them; it automatically scans and analyzes the source database of each medical data center and, on that basis, automatically maps medical terms that carry standard codes; it takes the complexity of medical term mapping into account by providing a stepwise escalation from automatic mapping to fuzzy matching mapping and then to custom term mapping; and its incremental update mechanism reuses earlier mapping records, which reduces the workload of subsequent mapping and improves the degree of standardization of medical term mapping.
Description
Technical Field
The invention belongs to the field of term standardization, and particularly relates to a multi-center medical term standardization system based on a general medical term library.
Background
With the rapid development of medical informatization, the types and scale of medical data are growing quickly, and analyzing and mining the medical data of multiple medical data centers (referred to as "multi-center" for short) to support clinical decisions, medical management services, and scientific research has become an inevitable trend. However, the relevant standards for medical terms in China are lacking, the standard system is incomplete, and there are many medical information system vendors, so term names and codes are highly heterogeneous between medical data centers and even within a single medical data center, and a large amount of semi-structured and unstructured data is present. Mature international term sets see only limited use domestically, and the existing mapping relations among international standard term sets are difficult to apply to the standardization of domestic medical terms because of language barriers. For these reasons, medical information systems cannot interoperate, and standardizing and sharing medical data among multiple medical data centers is difficult.
The general medical term library is a standard library of medical concept terms that covers the whole medical process and takes internationally used medical term sets as its core, such as the Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT), the International Classification of Diseases (ICD-10), the standardized nomenclature for clinical drugs (RxNorm), and Logical Observation Identifiers Names and Codes (LOINC). Once multi-center medical data have been mapped to the unified general medical term library, operations such as big data analysis can be carried out conveniently. Before multi-center medical data can be analyzed, however, standardizing the terms of and cleaning the medical data from different medical information systems is a major problem.
The prior art schemes [CN 201510922676, a method and a system for automatically constructing a mapping relationship of medical terms based on word segmentation codes], [CN 201710101827, a method and a device for data standardization processing of medical big data], and [CN 201710152584, a method and a device for determining medical synonyms] mostly start from Chinese word segmentation: they segment medical terms using string matching and similar segmentation methods, then calculate the similarity between medical terms and select the term with the highest similarity to establish a mapping relationship with the target term. These schemes address only the matching of Chinese medical terms rather than term standardization across whole medical information systems, and they cover only mappings among Chinese medical terms, not standardization between Chinese medical terms and foreign standard medical term sets.
In the prior art patent document [CN 201610173625, a method and system for automatically standardizing a medical data dictionary], a cloud-based data dictionary standardization model is established mainly at the logical level, and the term sets of all medical data centers must be extracted to the cloud for unified mapping.
At present, a widely used approach is for information technicians together with doctors and other personnel with medical background knowledge to determine, one by one, the mapping relationship between the data in a medical system and the general medical term library, and then to perform semi-automatic mapping by executing SQL scripts and the like to obtain standardized medical terms. Another approach is to require medical personnel to enter data in a standardized format at data entry time. However, the current methods have significant disadvantages:
1. The prior art focuses only on establishing mapping relationships between medical terms and does not address the standardization of medical terms across whole medical information systems.
2. The existing schemes target one specific data model, so they lack general applicability; they are also limited to mapping between Chinese medical terms and cannot establish mapping relations with an international general medical term library.
3. For medical data coded with an international general medical term set, the existing mapping relations among term sets are not fully exploited. Medical terms in a medical data center that do not use a standard medical term set, and terms defined by the medical data center itself, are generally handled by fuzzy matching or simply discarded, and no complete mapping process and mechanism is established.
4. For term mapping that must involve personnel with medical background knowledge, neither a reasonably friendly interactive interface nor a standardized manual review and exception handling mechanism is provided.
5. Because the data are cluttered, conventional medical term standardization does not combine term mapping with subsequent data cleaning, so the mapping relations among terms cannot actually be used to complete term standardization, the quality of the mapped data cannot be guaranteed, and subsequent data analysis results are seriously affected.
6. For medical data that have been mapped and cleaned, no detailed quality assessment mechanism is established to ensure the accuracy of the term mapping and data cleaning.
7. The handling required after the relevant international general medical term library is updated, and the subsequent incremental update mechanism, are not considered.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a multi-center medical term standardization system based on a general medical term library, which solves the multi-center medical term standardization problem based on a plurality of standardized medical term sets, simplifies the medical term mapping operation and enriches the whole process of medical term standardization.
The purpose of the invention is achieved by the following technical scheme: a multi-center medical term standardization system based on a general medical term library comprises a source database, a database connection management module, a pre-analysis module, a term mapping unit, an incremental update module, an exception handling module and a multi-center interaction module;
the source database is distributed in the preposed servers of the medical data centers and stores the service data of the medical data centers;
the database connection management module: managing information required by accessing the source database, and providing support for the term mapping tool to access and modify the source database;
the pre-analysis module: automatically scanning the source database, counting the occurrence frequency of each medical term in the original medical data, suggesting that terms whose occurrence frequency is below a set threshold be discarded, and sending terms whose occurrence frequency is greater than or equal to the set threshold to the term mapping unit for subsequent term mapping;
the term mapping unit comprises an automatic mapping module, a fuzzy matching module and a self-defined term module;
the automatic mapping module: supporting automatic mapping of medical terms, and realizing multidirectional mapping for terms using international universal medical term library standard codes according to the mapping relation among the existing universal medical term library standard codes;
the fuzzy matching module: for medical terms which cannot be mapped directly according to the mapping relation between the standard codes in the existing medical term library, performing a traversal query of the general medical term library by fuzzy matching, and providing the several groups of standard medical terms with the highest similarity for selection as the target terms of the term mapping;
the custom term module: for medical terms which cannot depend on the mapping relation between standard codes in the existing medical term library and cannot be matched with target terms in the existing general medical term library in a fuzzy manner, after a user generates a self-defined term application, the medical terms are sent to a multi-center interaction module to be checked and fed back;
the multi-center interaction module: after receiving the self-defined term application of each medical data center sent by the self-defined term module, auditing the self-defined terms, adding the self-defined terms which are approved as standard terms into the general medical term library, and sending the standard terms to each medical data center to keep the general medical term libraries of each medical data center consistent;
the incremental update module: for a source database which has already undergone medical term standardization mapping and which then produces incremental data through ongoing business activity, calling the historical mapping relation records generated by the term mapping unit to complete term standardization mapping on the incremental data;
the exception handling module: recording the execution process of each module, generating an error log aiming at the error occurrence condition, and backtracking the whole medical term mapping process according to the error log.
Further, the system also comprises a data cleaning module which formulates cleaning rules, assigns a weight to each data element, and screens out data with severe missing values, including the cleaning of dirty data at the structure level and at the instance level.
Further, the database connection management module specifically includes: the JDBC module is formed by classes and interfaces written by a programming language, a uniform access interface is provided for various databases, and the functions of establishing connection with the database or other data sources, sending SQL commands to the database and processing the returned results of the database are realized.
Further, after the database connection management module realizes connection to the source database, the pre-analysis module automatically scans the structural information of all data in the source database and the statistical information of specific fields thereof through the module to generate a statistical form, which includes two parts:
firstly, summarizing statistics on all tables in a source database, wherein the summarized statistics comprises field names, numerical value types, maximum lengths of all values, total rows in the tables and the proportion of null values in each table;
secondly, statistics on the detailed information and occurrence frequency of specific terms in a specific table, with the terms sorted by occurrence frequency in descending order so that subsequent term mapping can process the most frequent terms first.
Further, the automatic mapping module: for terms in the source database that use standard codes of an international general medical term library, once the standard to which the codes belong has been identified, a target term set to be mapped is selected; if a referenceable mapping relation exists between the codes of the standard term set to which the terms belong and the codes of the target term set, mapping SQL statements are generated automatically, the terms in the source database are mapped automatically, and the corresponding data are loaded.
Further, in the fuzzy matching module, a specific method of fuzzy matching is as follows:
(1) Term segmentation: performing word segmentation on all terms in the general medical term library and counting the frequency of each segment to serve as the base word frequency; the source medical term M that needs fuzzy matching is also segmented before matching.
(2) Fuzzy matching: the difference in probability between medical terms is used as the measure of similarity, and the specific operation is as follows:
(2.1) screening from the general medical term library all terms that contain any of the segments of M, segmenting them, and combining them into a term set A;
(2.2) calculating the average weighted probability D of term M and of each term in term set A from the segment probabilities, wherein n is the number of segments obtained from a term, and P1, P2, P3, P4 … Pn are the probabilities of those segments in the base word frequency;
(2.3) subtracting the average weighted probability of each standard term in term set A from that of the term M needing fuzzy matching and taking the negative of the absolute difference as the matching degree, wherein the larger the matching degree, the higher the similarity of the two, and the formula is as follows:
S(M,A) = -|D(M) - D(A)|
Further, the custom term module: constraints are defined in advance to avoid conflicts between custom terms and known standard terms. When custom terms are added, the added custom standard terms must be kept consistent among all the medical data centers and repeated addition must be prevented, so that the multi-center medical data can be shared after it has been standardized through term mapping. Before a custom term is added, a request to add it is submitted to the multi-center interaction module; the request contains the custom term to be added, a detailed description of the custom term, and the code of the custom term. After the responsible operators of the multi-center interaction module approve the request and confirm that no custom code for an identical or similar medical term already exists, a custom standard term code is generated, and the automatic mapping module is then called to complete the term mapping and load the data covered by the term. If the request is not approved, either the existing custom term code is returned so that the medical data center can complete the subsequent mapping, or the reason why the custom term could not be generated is returned, an error document is generated, and the user is prompted.
Furthermore, the multi-center interaction module is responsible for coordinating and unifying the general medical term libraries and term codes thereof of all the medical data centers, and the personnel with the highest authority of the multi-center interaction module checks and coordinates the use problem of the custom standard terms.
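The request-and-audit flow described above can be sketched as follows. This is only an illustrative sketch in Python: the class names `CustomTermRequest` and `MultiCenterRegistry`, the `approved_by_auditor` flag standing in for the human review, and the placeholder codes are all assumptions made for illustration and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass
class CustomTermRequest:
    """Content of a request to add a custom term (field names are hypothetical)."""
    term: str
    description: str
    proposed_code: str
    center: str


class MultiCenterRegistry:
    """Central registry that audits custom term requests (hypothetical design)."""

    def __init__(self, standard_terms):
        self.terms = dict(standard_terms)   # term name -> code
        self.centers = {}                   # center name -> local copy of the library

    def register_center(self, center):
        self.centers[center] = dict(self.terms)

    def review(self, request, approved_by_auditor):
        # A term that already exists is never added twice; its code is returned.
        if request.term in self.terms:
            return ("existing", self.terms[request.term])
        if not approved_by_auditor:
            return ("rejected", "not approved by auditor")
        if request.proposed_code in self.terms.values():
            return ("rejected", "code already in use")
        self.terms[request.term] = request.proposed_code
        # Broadcast the new term so every center's library stays consistent.
        for library in self.centers.values():
            library[request.term] = request.proposed_code
        return ("approved", request.proposed_code)


registry = MultiCenterRegistry({"standard term": "STD-1"})
registry.register_center("center_a")
registry.register_center("center_b")
first = registry.review(
    CustomTermRequest("local term", "illustrative description", "CUS-1", "center_a"), True
)
second = registry.review(
    CustomTermRequest("local term", "duplicate request", "CUS-2", "center_b"), True
)
```

Here the second request is answered with the code already assigned to the same term instead of creating a duplicate, which mirrors the "return the existing custom term code" branch in the text.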
Further, the incremental update module is used for subsequent medical term standardization in a medical data center that has already run the medical term mapping; the incremental data are updated mainly according to the records of earlier term standardization mappings generated by the term mapping unit, and the custom term module is executed again for medical terms which still cannot be mapped.
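A minimal sketch of this reuse of earlier mapping records, assuming the history can be represented as a simple dictionary from source term to standard code. The function name `apply_incremental_update` and the placeholder terms and codes are hypothetical.

```python
def apply_incremental_update(increment, mapping_history):
    """Map incremental terms with the recorded history; collect the rest.

    increment:        iterable of source terms found in the new data
    mapping_history:  dict of source term -> standard code from earlier runs
    """
    mapped, unresolved = {}, []
    for term in increment:
        if term in mapping_history:
            mapped[term] = mapping_history[term]
        else:
            # Terms with no recorded mapping go back to the custom term workflow.
            unresolved.append(term)
    return mapped, unresolved


history = {"term one": "STD-1", "term two": "STD-2"}
mapped, unresolved = apply_incremental_update(
    ["term one", "term three", "term two"], history
)
```

Only terms absent from the history need further handling, which is where the reduction in follow-up workload described in the text would come from.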
Further, the exception handling module stores all logs produced while the system runs and records whether each module operates normally. It classifies and saves error logs, covering errors that occur while the system runs, errors that occur when a module is called, and errors that occur when a module maps a single term. It classifies and saves terms that were not mapped successfully, including terms discarded in the automatic analysis module and terms discarded in the custom term module, and generates a failed-term document. By setting timestamps on the database, the exception handling module supports database backtracking, allowing a user to roll the matched database back to its data as of a specified date.
The invention has the following beneficial effects: it systematically addresses medical term standardization across multiple medical data centers and keeps the expression of medical terms consistent among them; it automatically scans and analyzes the source database of each medical data center and, on that basis, automatically maps medical terms that carry standard codes; it takes the complexity of medical term mapping into account by providing a stepwise escalation from automatic mapping to fuzzy matching mapping and then to custom term mapping; and its incremental update mechanism reuses earlier mapping records, which reduces the workload of subsequent mapping and improves the standardization of medical term mapping.
Drawings
FIG. 1 is a system flow diagram;
FIG. 2 is a system data flow diagram;
FIG. 3 is a schematic diagram of JDBC implementing database connection management;
FIG. 4 is a flow chart of medical term standardized mapping;
FIG. 5 is a diagram of the multi-center interaction principle.
Detailed Description
The invention is described in further detail below with reference to the figures and specific examples.
As shown in FIG. 1, the multi-center medical term standardization system based on a general medical term library provided by the present invention includes a source database, a database connection management module, a pre-analysis module, a term mapping unit, an incremental update module, an exception handling module, and a multi-center interaction module, and may further include a data cleaning module;
the source database is distributed in the prepositive servers of the medical data centers and stores service data of medical information systems such as HIS, LIS, PACS, EMR and the like of the medical data centers, wherein the service data comprises basic information of patients, information of treatment, cost information, diagnosis information, medication information, operation information, inspection information, examination information, text case history information and nursing vital sign information;
the database connection management module: managing (including loading, modifying, storing) information needed to access the source database, providing support for the term mapping tool to access and modify different types of source databases;
a pre-analysis module: automatically scanning the source database, counting the occurrence frequency of each medical term in the original medical data, suggesting that terms whose occurrence frequency is below a set threshold be discarded, and sending terms whose occurrence frequency is greater than or equal to the set threshold to the term mapping unit for subsequent term mapping;
the term mapping unit comprises an automatic mapping module, a fuzzy matching module and a self-defined term module;
an automatic mapping module: supporting automatic mapping of medical terms, realizing multidirectional mapping for terms using standard codes of the international universal medical term library according to the mapping relation among the standard codes of the existing universal medical term library, and only needing to perform quality control on mapping results;
a fuzzy matching module: for medical terms which cannot be mapped directly according to the mapping relation between the standard codes in the existing medical term library, traversal query can be performed in the general medical term library in a fuzzy matching mode, and several groups of standard medical terms with highest similarity are provided for selection as target terms of the term mapping;
a custom term module: for medical terms which cannot depend on the mapping relation between standard codes in the existing medical term library and cannot be matched with target terms in the existing general medical term library in a fuzzy manner, after a user generates a self-defined term application (which can be determined by technical personnel and doctors together), the self-defined term application is sent to a multi-center interaction module for auditing and feeding back the self-defined term application;
a multi-center interaction module: after receiving the self-defined term application of each medical data center sent by the self-defined term module, auditing the self-defined terms, adding the self-defined terms which are approved as standard terms into the general medical term library, and sending the standard terms to each medical data center to keep the general medical term libraries of each medical data center consistent;
an incremental update module: for a source database which has already undergone medical term standardization mapping and which then produces incremental data through ongoing business activity, calling the historical mapping relation records generated by the term mapping unit to complete term standardization mapping on the incremental data;
an exception handling module: recording the execution process of each module, in particular generating an error log when errors occur, so that the whole medical term mapping process can be traced back according to the error log;
a data cleaning module: formulating cleaning rules, assigning a weight to each data element, screening out data with severe missing values, and improving data quality.
The specific implementation of each module is as follows:
I. Database connection management module
This module manages (including loading, modifying, and storing) the information needed to access the source database; the source database and the target database may physically be the same database system. It is implemented mainly as a JDBC module composed of classes and interfaces written in the Java programming language, which provides a uniform access interface to various databases and gives the system good cross-platform behavior. Its main functions are establishing connections to databases or other data sources, sending SQL commands to the database, and processing the results the database returns. A schematic diagram is shown in FIG. 3.
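The load, modify, store, and connect responsibilities can be illustrated with a short sketch. The patent specifies JDBC in Java; the sketch below instead uses Python's standard `sqlite3` module as a stand-in for a uniform database access layer, and the names `ConnectionInfo` and `ConnectionManager` as well as the JSON storage format are assumptions made only for illustration.

```python
import json
import sqlite3
from dataclasses import dataclass, asdict


@dataclass
class ConnectionInfo:
    """Information needed to reach one source database (hypothetical fields)."""
    name: str
    driver: str      # only "sqlite" is handled in this sketch
    location: str    # file path or ":memory:"


class ConnectionManager:
    """Loads, modifies, and stores connection information, and opens connections."""

    def __init__(self):
        self._infos = {}

    def add(self, info):
        self._infos[info.name] = info

    def modify(self, name, **changes):
        for field, value in changes.items():
            setattr(self._infos[name], field, value)

    def dumps(self):
        # Serialize every registered connection so it can be stored.
        return json.dumps([asdict(info) for info in self._infos.values()])

    def loads(self, text):
        for item in json.loads(text):
            self.add(ConnectionInfo(**item))

    def connect(self, name):
        info = self._infos[name]
        if info.driver != "sqlite":
            raise ValueError(f"unsupported driver: {info.driver}")
        return sqlite3.connect(info.location)


manager = ConnectionManager()
manager.add(ConnectionInfo("center_a", "sqlite", ":memory:"))
conn = manager.connect("center_a")
conn.execute("CREATE TABLE patient (patient_id INTEGER, name TEXT)")
conn.execute("INSERT INTO patient VALUES (1, 'example')")
row_count = conn.execute("SELECT COUNT(*) FROM patient").fetchone()[0]
```

Supporting another database type in this design would mean adding a branch in `connect`, which is the role the JDBC driver layer plays in the described system.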
II. Pre-analysis module
After the database connection management module has connected to the source database, this module automatically scans the structural information of all data in the source database and the statistical information of specific fields, and generates a statistical table A with two parts:
First, summary statistics for all tables in the source database, including field name, value type, maximum length over all values, total number of rows in the table, and proportion of null values, as follows:
Table name | Column name | Value type | Maximum length | Row count | Null ratio |
PATIENT | Patient identification | NUMBER | 8 | 3000 | 0 |
PATIENT | Name | VARCHAR2 | 20 | 3000 | 0 |
PATIENT | Date of birth | DATE | 10 | 3000 | 0 |
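A summary of this kind could be produced roughly as follows. The sketch assumes an SQLite database reached through Python's `sqlite3` module rather than the JDBC connection of the described system; the function name `summarize_table` and the sample rows are hypothetical.

```python
import sqlite3


def summarize_table(conn, table):
    """Return one tuple per column:
    (table, column, declared type, max value length, row count, null ratio)."""
    total = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
    summary = []
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
    for cid, column, col_type, *rest in conn.execute(f'PRAGMA table_info("{table}")'):
        max_len, nulls = conn.execute(
            f'SELECT MAX(LENGTH("{column}")), '
            f'SUM(CASE WHEN "{column}" IS NULL THEN 1 ELSE 0 END) FROM "{table}"'
        ).fetchone()
        null_ratio = (nulls or 0) / total if total else 0.0
        summary.append((table, column, col_type, max_len, total, null_ratio))
    return summary


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patient (patient_id INTEGER, name TEXT, birth_date TEXT)")
conn.executemany(
    "INSERT INTO patient VALUES (?, ?, ?)",
    [(1, "Li", "1990-01-01"), (2, "Wang", "1980-01-01"), (3, None, "1975-05-05")],
)
summary = summarize_table(conn, "patient")
```

Each tuple corresponds to one row of the summary table above; table and column names are interpolated into the SQL text, so this sketch should only be run against trusted schema names.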
Second, statistics on the detailed information and occurrence frequency of specific terms in a specific table. The terms are sorted by occurrence frequency in descending order so that subsequent term mapping can process the most frequent terms first. The system can suggest whether terms with extremely low occurrence frequency need to take part in subsequent term mapping. When no threshold is defined, all terms take part in the mapping by default; the user can adjust the parameter according to the specific situation to set the minimum occurrence frequency below which terms do not take part in subsequent term mapping. This greatly simplifies the subsequent term mapping process, reduces workload, and improves data quality.
Code | Sex | Frequency |
Z03.001 | Male | 200 |
Z03.002 | Female | 100 |
For example: a non-standard term A occurs N2 times in a data set containing N1 records in total, so the frequency of A is P = N2/N1. Let M be the minimum occurrence frequency set for participating in mapping. If P ≥ M, A is a mapping object; if P < M, A is a non-standard term with extremely low occurrence frequency and does not take part in subsequent term mapping. M is a threshold set by the user according to actual conditions.
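The threshold rule can be sketched as below, taking a term's frequency to be its occurrence count divided by the total number of values (an assumed reading of the example). The function name `split_by_frequency`, the parameter `min_frequency` (playing the role of M), and the sample values are hypothetical.

```python
from collections import Counter


def split_by_frequency(values, min_frequency=0.0):
    """Split distinct terms into those to map and those suggested for discarding.

    With the default threshold of 0.0 every term takes part in mapping.
    """
    counts = Counter(values)
    total = sum(counts.values())
    to_map, to_discard = [], []
    # most_common() orders terms from most to least frequent.
    for term, count in counts.most_common():
        frequency = count / total
        target = to_map if frequency >= min_frequency else to_discard
        target.append((term, count, frequency))
    return to_map, to_discard


values = ["Male"] * 200 + ["Female"] * 100 + ["M"] * 2
to_map, to_discard = split_by_frequency(values, min_frequency=0.01)
```

With these sample values the rare entry "M" falls below the 1% threshold and is only suggested for discarding; the final decision stays with the user, as the text describes.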
The documents generated by the module can be exported in formats such as PDF, Excel, and CSV.
III. Automatic mapping module
For terms in the source database that use standard codes of an international general medical term library, once the standard to which the codes belong has been identified, a target term set to be mapped is selected. If a referenceable mapping relation exists between the codes of the standard term set to which the terms belong and the codes of the target term set, mapping SQL statements are generated automatically, the terms in the source database are mapped automatically, and the corresponding data are loaded.
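The generation of a mapping SQL statement might look like the following sketch. It assumes the existing code-to-code mapping relation is stored in a table with `source_code` and `target_code` columns; the function `build_mapping_sql`, the table and column names, and the placeholder codes are all hypothetical, and SQLite is used in place of the actual source database.

```python
import sqlite3


def build_mapping_sql(source_table, code_column, target_column, mapping_table):
    """Generate an UPDATE that fills target codes from a code-to-code mapping table."""
    return (
        f"UPDATE {source_table} SET {target_column} = ("
        f"SELECT m.target_code FROM {mapping_table} AS m "
        f"WHERE m.source_code = {source_table}.{code_column}) "
        f"WHERE {code_column} IN (SELECT source_code FROM {mapping_table})"
    )


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE diagnosis (id INTEGER, source_code TEXT, standard_code TEXT);
    CREATE TABLE code_map (source_code TEXT, target_code TEXT);
    INSERT INTO diagnosis VALUES (1, 'SRC-1', NULL), (2, 'SRC-2', NULL), (3, 'LOCAL-9', NULL);
    INSERT INTO code_map VALUES ('SRC-1', 'TGT-A'), ('SRC-2', 'TGT-B');
""")
sql = build_mapping_sql("diagnosis", "source_code", "standard_code", "code_map")
conn.execute(sql)
mapped = conn.execute("SELECT id, standard_code FROM diagnosis ORDER BY id").fetchall()
```

Rows whose code has no entry in the mapping table are left untouched, so they remain available for the fuzzy matching and custom term steps.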
IV. Fuzzy matching module
The medical terms that could not be mapped automatically are fuzzy matched one by one against the standard terms in the general medical term library, yielding the recommended standard terms for mapping and the codes of the standard term sets they belong to. Fuzzy matching generally recommends several standard terms as matching candidates, and a professional with a medical knowledge background must manually determine the unique matching object. After the mapping relation has been determined, the automatic mapping module is called to complete the mapping of the medical term and the loading of the data it covers. The specific method of fuzzy matching is as follows:
(1) Term segmentation
A medical term is mostly composed of several words and phrases; here, a medical term is subdivided into several words and phrases according to specific rules.
(1.1) Using this method, all terms in the general medical term library are segmented, and the frequency of each segment is counted to serve as the base word frequency.
(1.2) The source medical terms that need fuzzy matching are also segmented before matching. For example, segmenting the term M yields [segment 1, segment 2, …, segment n].
(2) Fuzzy matching
The invention uses the difference in probability between medical terms as the measure of similarity. The specific operations are as follows:
(2.1) All terms containing any of the segments of the term M are screened from the general medical term library and segmented, forming a term set A {a, b, c, d, e, …};
(2.2) The average weighted probability D is calculated for the term M and for every term in the term set A using the following formula, where n is the number of segments obtained from the term, and P1, P2, P3, P4, …, Pn are the probabilities of the respective segments in the basic word frequency:
(2.3) The average weighted probability of each standard term in the term set A is subtracted from that of the term M requiring fuzzy matching, and the negative value is taken as the matching degree; the larger the matching degree, the higher the similarity of the two terms. The formula is as follows:
S(M,A) = -|D(M) - D(A)|
Taking the term "donkey-hide gelatin oral liquid for prolonging life" as an example:
a) segment the terms of the general medical term library and obtain the probability of each segment;
b) segment the term "donkey-hide gelatin oral liquid for prolonging life" to obtain "donkey-hide gelatin \ longevity \ oral liquid". Look up the corresponding probabilities in the basic word frequency to obtain p1 for donkey-hide gelatin, p2 for longevity, and p3 for oral liquid, and calculate the average probability D(M) over the segments;
c) query the general medical term library for all terms containing donkey-hide gelatin, longevity, or oral liquid, and segment them to obtain the term set A {["donkey-hide gelatin", "calcium", "oral liquid"], ["donkey-hide gelatin", "granule"], ["donkey-hide gelatin", "blood enriching", "oral liquid"], …}, then obtain D(a), D(b), D(c), …;
d) compute the matching degrees and sort them:
Term to be matched | General medical term library term | Matching degree |
---|---|---|
Donkey-hide gelatin oral liquid for prolonging life | Donkey-hide gelatin calcium oral liquid | S(M,a) |
 | Donkey-hide gelatin oral liquid for enriching blood | S(M,c) |
 | Donkey-hide gelatin granules | S(M,b) |
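The worked example can be sketched end to end as follows. Several points are assumptions rather than statements of the patented method: the formula for D is not reproduced in this text, so a plain arithmetic mean of the segment probabilities is assumed; segments that do not occur in the library are given probability 0; and the matching degree is taken as the negated absolute difference, following the wording of step (2.3). All function names and the sample library are hypothetical.

```python
from collections import Counter

def mean_prob(segments, freq):
    # Assumed form of D: arithmetic mean; unseen segments count as probability 0.
    return sum(freq.get(s, 0.0) for s in segments) / len(segments)

def rank_candidates(source_segments, library):
    """library: {term name: list of segments}. Returns [(name, S)], best first."""
    counts = Counter(s for segs in library.values() for s in segs)
    total = sum(counts.values())
    freq = {s: n / total for s, n in counts.items()}  # basic word frequency
    d_m = mean_prob(source_segments, freq)
    wanted = set(source_segments)
    scored = [
        (name, -abs(d_m - mean_prob(segs, freq)))
        for name, segs in library.items()
        if wanted & set(segs)  # step (2.1): shares at least one segment with M
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

library = {
    "gelatin calcium oral liquid": ["gelatin", "calcium", "oral liquid"],
    "gelatin granules": ["gelatin", "granules"],
    "gelatin blood-enriching oral liquid": ["gelatin", "blood enriching", "oral liquid"],
    "ibuprofen tablets": ["ibuprofen", "tablets"],
}
ranked = rank_candidates(["gelatin", "longevity", "oral liquid"], library)
```

Only the three terms that share a segment with M are scored. Because the score depends solely on averaged segment frequencies, candidates can receive equal or nearly equal matching degrees, which is consistent with leaving the final choice to a reviewer with a medical background.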
Fifth, custom term module
Under complex conditions, some terms cannot be matched with the international universal medical term library; this is especially true of domestic medical data centers, whose data is redundant and contains many medical terms related to traditional Chinese medicine and traditional treatment methods. The custom term module can define the necessary constraints in advance to prevent custom terms from conflicting with known standard terms; for example, custom terms can be required to use a defined coding range.
When custom terms are added, the added custom standard terms must be kept consistent across all medical data centers and repeated addition must be prevented, so that multi-center medical data can be shared after it has been standardized through term mapping. Therefore, when a medical data center performs term standardization mapping on its medical data, it submits a report requesting the addition of a custom term to the multi-center interaction module before adding the term. The report contains the custom term to be added, a detailed description of the custom term, and the code of the custom term (generated automatically by the system). If the relevant central operators approve the request, having determined that no custom code for a similar or duplicate medical term exists, a custom standard term code is generated, and the automatic mapping module can then be called to complete term mapping and loading of the covered data. If the request is not approved, either the existing custom term code is returned so that the medical data center can complete the subsequent mapping, or the reason for the failure to generate the custom term is returned, an error document is generated, and the user is prompted. The operation of the custom term module is shown schematically in Fig. 4.
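The request and audit flow described above might be sketched as follows. The reserved code range, the class name `CustomTermRegistry`, and the duplicate check by normalized name are all assumptions for illustration; the patent only requires that custom terms use a defined coding range and that duplicates across centers be avoided.

```python
from dataclasses import dataclass, field

# Assumed reserved range for custom term codes (illustrative values only).
CUSTOM_CODE_RANGE = range(2_000_000_000, 2_100_000_000)

@dataclass
class CustomTermRegistry:
    terms: dict = field(default_factory=dict)         # normalized name -> code
    descriptions: dict = field(default_factory=dict)  # code -> description
    next_code: int = CUSTOM_CODE_RANGE.start

    def review(self, name, description, approved):
        """Handle one request; `approved` stands in for the operators' decision."""
        key = name.strip().lower()
        if key in self.terms:
            # A matching custom term already exists: return its code for reuse.
            return {"status": "existing", "code": self.terms[key]}
        if not approved:
            return {"status": "rejected", "reason": "not approved by reviewer"}
        if self.next_code not in CUSTOM_CODE_RANGE:
            return {"status": "rejected", "reason": "custom code range exhausted"}
        code = self.next_code
        self.next_code += 1
        self.terms[key] = code
        self.descriptions[code] = description
        return {"status": "created", "code": code}

registry = CustomTermRegistry()
first = registry.review("Example herbal oral liquid", "submitted by center 1", True)
repeat = registry.review(" example herbal oral liquid ", "submitted by center 2", True)
denied = registry.review("Unclear term", "", False)
```

The second request returns the code created by the first, so both centers end up mapping to the same custom standard term.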
Sixth, multi-center interaction module
To achieve data standardization and data sharing among the medical information systems of the medical data centers, all medical data centers must use a unified general medical term library and unified medical term set codes. The invention adds custom standard terms centrally after a submitted request has been audited, which prevents the medical data centers from expressing the same term differently when defining custom standard terms. Submission, auditing, and authorization all require interaction among multiple medical data centers. The multi-center interaction module is responsible for coordinating and unifying the general medical term libraries and term codes of all medical data centers, and its highest-authority personnel audit and coordinate the use of custom standard terms. The multi-center custom term interaction network is shown in Fig. 5.
Seventh, incremental update module
For a medical data center that has already run medical term mapping, subsequent term standardization mainly updates the incremental data according to the historical term standardization mapping records generated by the term mapping unit; the custom term module is executed again for medical terms that still cannot be mapped to a standard term.
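A minimal sketch of this incremental step is shown below, assuming the historical mapping record can be read as a dictionary from source term to standard code. The function name and the sample codes are hypothetical.

```python
def apply_incremental(new_records, mapping_history):
    """Map newly added records using previously confirmed mappings.

    new_records     -- list of (record_id, source_term) pairs added since the last run
    mapping_history -- {source_term: standard_code} saved by earlier mapping runs
    Returns (mapped, unmapped): mapped is a list of (record_id, standard_code);
    unmapped is the sorted list of terms to send back through the custom term module.
    """
    mapped, unmapped = [], set()
    for record_id, term in new_records:
        code = mapping_history.get(term)
        if code is None:
            unmapped.add(term)
        else:
            mapped.append((record_id, code))
    return mapped, sorted(unmapped)

history = {"asprin": "STD-0001", "paracetamol tab": "STD-0002"}  # made-up codes
increment = [(101, "asprin"), (102, "new herbal mix"), (103, "paracetamol tab")]
mapped, unmapped = apply_incremental(increment, history)
```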
Eighth, exception handling module
Log storage: all logs produced while the system runs are stored, recording whether each module runs normally. Error logs are classified and saved, covering errors that occur while the system is running, errors that occur when a module is called, and errors that occur when a module maps an individual term. Terms that were not mapped successfully, including terms ignored in the automatic analysis module and terms ignored in the custom term module, are likewise classified and saved, and a failed-term document is generated. By setting timestamps on the database, the exception handling module supports database backtracking, allowing the user to roll the matched database back to its state on a specified date.
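One way timestamp-based backtracking could work is sketched below. It assumes that every mapping change is appended to a history with its timestamp, so that the state on any date can be rebuilt by replaying the history up to that date. This storage layout and the name `state_as_of` are assumptions; the patent only states that timestamps are set on the database.

```python
from datetime import datetime

def state_as_of(history, cutoff):
    """Rebuild the mapping state as it stood at `cutoff` (inclusive).

    history -- append-only list of (timestamp, source_term, standard_code);
               a standard_code of None marks a mapping that was withdrawn
    """
    state = {}
    for ts, term, code in sorted(history, key=lambda row: row[0]):
        if ts > cutoff:
            break
        if code is None:
            state.pop(term, None)
        else:
            state[term] = code
    return state

history = [
    (datetime(2019, 7, 1), "term A", "C1"),
    (datetime(2019, 7, 5), "term B", "C2"),
    (datetime(2019, 7, 9), "term A", "C3"),  # term A was later remapped
]
before_remap = state_as_of(history, datetime(2019, 7, 6))
latest = state_as_of(history, datetime(2019, 7, 10))
```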
Ninth, data cleaning module
After the standardized mapping of medical terms is completed, cleaning the medical data is essential to improve its quality for subsequent data mining and analysis. A common data cleaning strategy is provided that mainly cleans dirty data at the structure level and at the instance level. Structure-level dirty data violates data schema or integrity constraint requirements, for example out-of-range data values, broken attribute dependencies, broken uniqueness constraints, and broken referential integrity. Instance-level dirty data involves erroneous attribute values and broken dependencies between attributes, for example missing values, duplicate records, contradictory records, and reference errors. Cleaning aims to satisfy the integrity, uniqueness, authority, legality, and consistency of the data as far as possible, reduce data redundancy, and improve data quality.
1) Structure-level cleaning rules: a unified data schema definition (including data types); a unified integrity constraint definition; a unified functional dependency definition.
2) Instance-level cleaning rules: analyze the dirty data, formulate cleaning rules, evaluate and verify them, and record each cleaning action in the log for tracing.
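The cleaning rules might be applied as in the following sketch, which handles three of the cases named above (missing values, out-of-range values, duplicate records) and logs every action for tracing. The record layout, field names, and bounds are hypothetical.

```python
def clean_records(records, required, valid_range):
    """Drop dirty records and log why each one was dropped.

    records     -- list of dicts with hashable values
    required    -- field names that must be present and non-empty
    valid_range -- {field: (low, high)} inclusive bounds for numeric fields
    Returns (kept, log), where log holds (record index, reason) pairs.
    """
    kept, log, seen = [], [], set()
    for i, rec in enumerate(records):
        missing = [f for f in required if rec.get(f) in (None, "")]
        if missing:
            log.append((i, "missing value: " + ", ".join(missing)))
            continue
        bad = [
            f for f, (low, high) in valid_range.items()
            if rec.get(f) is not None and not low <= rec[f] <= high
        ]
        if bad:
            log.append((i, "out of range: " + ", ".join(bad)))
            continue
        key = tuple(sorted(rec.items()))
        if key in seen:
            log.append((i, "duplicate record"))
            continue
        seen.add(key)
        kept.append(rec)
    return kept, log

records = [
    {"patient_id": 1, "age": 40},
    {"patient_id": 2, "age": 200},     # age outside the allowed range
    {"patient_id": None, "age": 30},   # missing identifier
    {"patient_id": 1, "age": 40},      # exact duplicate of the first record
]
kept, log = clean_records(records, ["patient_id"], {"age": (0, 130)})
```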
As current data mining and analysis place ever higher demands on the quantity and quality of data, the invention provides a collaborative mode designed to enable data sharing among multiple medical data centers (mainly hospitals) while fully ensuring the data security of each center, so that sharing medical data can optimize medical processes, accelerate related scientific research, and ultimately improve the quality of medical services for patients. The premise of data sharing among multiple medical data centers is the standardization of medical data, which has two parts: standardization of the data structure and standardization of medical terms. The invention addresses the latter. The technical points of the invention are summarized as follows:
1. Through the interaction among the modules, the database in the medical information system is automatically analyzed and scanned, and statistical information such as the occurrence frequency of medical terms in the database is returned, providing a practical basis for subsequent medical term mapping and performance optimization.
2. For data that uses international universal medical term set codes, the medical data covered by these terms is mapped automatically according to the existing mapping relations between medical term set codes.
3. For data in a medical data center that is not coded with a standard term set, such as custom medical terms or medical terms unique to China (for example, traditional Chinese medicines), information such as the occurrence frequency of each term in the medical data center is provided, so that the relevant personnel can carry out reasonable, well-founded fuzzy matching or directly add custom standard terms.
4. The interaction requirements between medical data centers are handled on a regular schedule, so that after the medical terms of each center have been standardized, the general medical term libraries of all centers remain uniform and data sharing can be realized.
5. Data is cleaned according to the cleaning strategy to ensure data quality.
6. All errors and exceptions are recorded in the log, which facilitates error checking, quality evaluation, and similar functions.
7. The established mapping relations between the medical terms in a medical data center and the international universal standard medical term sets are fully reused, so that subsequent terms in that medical data center can be mapped and standardized semi-automatically or even automatically.
The above are merely examples of the present invention and are not intended to limit its scope. Any modification, equivalent replacement, or improvement made without inventive effort and within the spirit and principle of the present invention falls within the scope of protection of the present invention.
Claims (9)
1. A multi-center medical term standardization system based on a general medical term library is characterized by comprising a source database, a database connection management module, a pre-analysis module, a term mapping unit, an increment updating module, an exception handling module and a multi-center interaction module;
the source database is distributed in the preposed servers of the medical data centers and stores the service data of the medical data centers;
the database connection management module: managing information required by accessing the source database, and providing support for the term mapping tool to access and modify the source database;
the pre-analysis module: automatically scanning the source database, counting the occurrence frequency of each medical term in the original medical data, issuing a discard suggestion for terms whose occurrence frequency is less than a set threshold, and sending the terms whose occurrence frequency is greater than or equal to the set threshold to the term mapping unit for subsequent term mapping;
the term mapping unit comprises an automatic mapping module, a fuzzy matching module and a self-defined term module;
the automatic mapping module: supporting automatic mapping of medical terms, and realizing multidirectional mapping for terms using international universal medical term library standard codes according to the mapping relation among the existing universal medical term library standard codes;
the fuzzy matching module: for medical terms that cannot be mapped directly according to the mapping relation between standard codes in the existing medical term library, traversing and querying the general medical term library by fuzzy matching, and providing several groups of standard medical terms with the highest similarity for selection as the target terms to which the terms are mapped; the specific method of fuzzy matching is as follows:
(1) term segmentation: segmenting all terms in the general medical term library, and counting the frequency of each segment to serve as the basic word frequency; segmenting a source medical term M requiring fuzzy matching before matching;
(2) fuzzy matching: using the difference in probability between medical terms as the measure of similarity, the specific operations being as follows:
(2.1) screening all terms containing the segments from the general medical term library, and segmenting them to form a term set A;
(2.2) calculating the average weighted probability of the term M and of all terms in the term set A using the following formula, wherein n is the number of segments obtained from each term, and P1, P2, P3, P4, …, Pn are the probabilities of the respective segments in the basic word frequency:
(2.3) subtracting the average weighted probability of each standard term in the term set A from that of the term M requiring fuzzy matching, and taking the negative value as the matching degree, wherein the larger the matching degree, the higher the similarity of the two, the formula being as follows:
S(M,A) = -|D(M) - D(A)|
the custom term module: for medical terms which cannot depend on the mapping relation between standard codes in the existing medical term library and cannot be matched with target terms in the existing general medical term library in a fuzzy manner, after a user generates a self-defined term application, the medical terms are sent to a multi-center interaction module to be checked and fed back;
the multi-center interaction module: after receiving the self-defined term application of each medical data center sent by the self-defined term module, auditing the self-defined terms, adding the self-defined terms which are approved as standard terms into the general medical term library, and sending the standard terms to each medical data center to keep the general medical term libraries of each medical data center consistent;
the incremental update module: for a source database that has already undergone medical term standardization mapping and has generated incremental data for business reasons, calling the historical mapping relation records generated by the term mapping unit to complete term standardization mapping on the incremental data;
the exception handling module: recording the execution process of each module, generating an error log aiming at the error occurrence condition, and backtracking the whole medical term mapping process according to the error log.
2. The system according to claim 1, further comprising a data cleansing module for formulating cleansing rules, weighting each data element, and screening out heavily missing data, including cleansing dirty data at both the structural level and the instance level.
3. The system for standardizing multicenter medical terms based on a generic medical term library according to claim 1, wherein the database connection management module specifically comprises: the JDBC module is formed by classes and interfaces written by a programming language, a uniform access interface is provided for various databases, and the functions of establishing connection with the database or other data sources, sending SQL commands to the database and processing the returned results of the database are realized.
4. The system according to claim 1, wherein the pre-analysis module automatically scans the structure information of all data in the source database and the statistical information of specific fields thereof to generate a statistical table, which comprises two parts:
first, summary statistics on all tables in the source database, comprising field names, value types, the maximum length of all values, the total number of rows in each table, and the proportion of null values in each table;
second, detailed statistics on specific terms in a specific table and their occurrence frequencies, with the terms sorted by occurrence frequency in descending order so that subsequent term mapping can preferentially select and process the terms with higher occurrence frequencies.
5. The system of claim 1, wherein the automatic mapping module is configured to: for terms in the source database that use standard codes of the international universal medical term library, identify the standard term set to which the codes belong and then select a target term set to be mapped to; if a referenceable mapping relation exists between the codes of the standard term set used in the source database and the codes of the target term set, automatically generate mapping SQL statements for the terms, automatically map the terms in the source database, and complete the corresponding data loading.
6. The system of claim 1, wherein the custom term module is configured to: defining constraints in advance to avoid conflict between the custom terms and the known standard terms; when the user-defined terms are added, the consistency of the added user-defined standard terms needs to be kept among all the medical data centers, repeated addition is prevented, and meanwhile, the data sharing of the multi-center medical data can be realized after the multi-center medical data are subjected to term mapping standardization; before adding the custom terms, a request for adding the custom terms is submitted to the multi-center interaction module, and the request content comprises the following steps: custom terms to be added, detailed descriptions of the custom terms, code of the custom terms; after the auditing of the relevant operators of the multi-center interaction module is passed, determining that no custom code similar to the repeated medical terms exists, generating a custom standard term code, and then calling the automatic mapping module to complete term mapping and loading of covered data; if the audit is not passed, returning the existing custom term code for the medical data center to complete the subsequent mapping or returning the reason of the failure of generating the custom term, generating an error document and prompting a user.
7. The system according to claim 1, wherein the multi-center interactive module is responsible for coordinating and unifying the universal medical term libraries and term codes thereof of the medical data centers, and the highest-authority personnel of the multi-center interactive module review and coordinate the use of the custom standard terms.
8. The system according to claim 1, wherein the incremental update module is used for the subsequent medical term standardization process of a medical data center that has already run medical term mapping, mainly updating the incremental data according to the historical term standardization mapping records generated by the term mapping unit, and executing the custom term module again for medical terms that still cannot be mapped to a standard term.
9. The system of claim 1, wherein the exception handling module is configured to: store all logs produced while the system runs and record whether each module runs normally; classify and save error logs, comprising errors that occur while the system is running, errors that occur when a module is called, and errors that occur when a module maps an individual term; classify and save terms that were not mapped successfully, including terms ignored in the automatic analysis module and terms ignored in the custom term module, and generate a failed-term document; the exception handling module supports database backtracking by setting timestamps on the database, allowing a user to roll the matched database back to its state on a specified date.
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910629244.9A CN110349639B (en) | 2019-07-12 | 2019-07-12 | Multi-center medical term standardization system based on general medical term library |
PCT/CN2020/083586 WO2020233256A1 (en) | 2019-07-12 | 2020-04-07 | General medical termbase-based multi-center medical terminology standardization system |
JP2021533326A JP7093593B2 (en) | 2019-07-12 | 2020-04-07 | Multi-center medical term standardization system based on general-purpose medical term library |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910629244.9A CN110349639B (en) | 2019-07-12 | 2019-07-12 | Multi-center medical term standardization system based on general medical term library |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110349639A (en) | 2019-10-18 |
CN110349639B (en) | 2022-01-04 |
Family
ID=68176052
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910629244.9A Active CN110349639B (en) | 2019-07-12 | 2019-07-12 | Multi-center medical term standardization system based on general medical term library |
Country Status (3)
Country | Link |
---|---|
JP (1) | JP7093593B2 (en) |
CN (1) | CN110349639B (en) |
WO (1) | WO2020233256A1 (en) |
-
2019
- 2019-07-12 CN CN201910629244.9A patent/CN110349639B/en active Active
-
2020
- 2020-04-07 JP JP2021533326A patent/JP7093593B2/en active Active
- 2020-04-07 WO PCT/CN2020/083586 patent/WO2020233256A1/en active Application Filing
Also Published As
Publication number | Publication date |
---|---|
JP2022508350A (en) | 2022-01-19 |
WO2020233256A1 (en) | 2020-11-26 |
CN110349639A (en) | 2019-10-18 |
JP7093593B2 (en) | 2022-06-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110349639B (en) | Multi-center medical term standardization system based on general medical term library | |
Barateiro et al. | A survey of data quality tools. | |
US20170068748A1 (en) | Hybrid data storage system and method and program for storing hybrid data | |
US20080301168A1 (en) | Generating database schemas for relational and markup language data from a conceptual model | |
CN112801488B (en) | Real-time control optimization method and system for clinical test quality | |
US20150142821A1 (en) | Database system for analysis of longitudinal data sets | |
CN108564991A (en) | ICD-based error identification system for digitally coded medical records and identification method thereof | |
CN111984640A (en) | Portrait construction method based on multi-element heterogeneous data | |
JP2006318422A (en) | Apparatus and method for dimension table processing, apparatus and method for extracting dimension hierarchy, and program | |
CN118116611B (en) | Database construction method based on multi-source medical and nutritional big data fusion integration | |
CN113450928A (en) | Drug test data control method and system | |
Daniel et al. | Managing Data Quality in Business Intelligence Applications. | |
Moro et al. | Schema advisor for hybrid relational-XML DBMS | |
Hu | Research on monitoring system of daily statistical indexes through big data | |
US20150356130A1 (en) | Database management system | |
CN117290304A (en) | Retrieval system and method established on the basis of medical big data | |
Babur et al. | Model analytics for industrial MDE ecosystems | |
CN114706878A (en) | Method and device for checking SQL (structured query language) statements | |
Mellner et al. | The Karolinska hospital information system | |
Romanchikova et al. | A framework for user-configurable data quality assurance of electronic patient records | |
CN114566240A (en) | Electronic medical record data conversion method and device, data model and platform | |
Dorok | Towards genome analysis on modern database systems | |
Si-Said et al. | Preface to QMMQ 2018 | |
CN113485990A (en) | Multi-dimensional intelligent data cleaning method and system based on blood transfusion big data | |
CN117038002A (en) | Method and device for generating observation variable in drug evaluation research |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |