CN117992654A - Common information processing method and device for managing real world data platform across queues - Google Patents
- Publication number
- CN117992654A (CN202410246934.7A)
- Authority
- CN
- China
- Prior art keywords
- data
- processed
- cross
- queue
- platform
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/283—Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/284—Relational databases
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/951—Indexing; Web crawling techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/70—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Public Health (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Artificial Intelligence (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Computation (AREA)
- Life Sciences & Earth Sciences (AREA)
- Medical Informatics (AREA)
- Biomedical Technology (AREA)
- Pathology (AREA)
- Epidemiology (AREA)
- General Health & Medical Sciences (AREA)
- Primary Health Care (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention provides a common information processing method and device for managing real world data platforms across queues. The method comprises: building a real data platform; collecting data to be processed; performing cross-queue data treatment on the data to be processed and classifying it to obtain cross-queue data; and performing common information processing on the cross-queue data and carrying out data fusion. Based on the real world data platform, various types of data can be integrated, including structured data, unstructured data, and multimedia data, so that users can conveniently share and exchange data, which facilitates cooperation and sharing across departments and institutions. In addition, the system is highly scalable and flexible: the back-end server can be scaled horizontally according to actual requirements to meet the needs of large-scale data exchange and sharing. Multiple data formats and interface standards are also supported, so that the system can be integrated and interoperate seamlessly with other systems.
Description
Technical Field
The invention relates to the technical field of computers, in particular to a method and a device for processing common information of a real world data platform for cross-queue management.
Background
With the acceleration of digital transformation and the explosive growth of data volume, real world data platforms have become an important tool for data analysis and decision making at medical centers. However, conventional real world data platforms face two problems: differences among multi-source, heterogeneous cross-queue data prevent the data from being studied effectively, and data held on various storage media cannot be stored efficiently. Both problems degrade data quality and data availability, which in turn lowers the quality and efficiency of data analysis and decision making.
Disclosure of Invention
The present invention aims to provide a common information processing method and apparatus for managing real world data platforms across queues that overcomes or at least partially solves the above-mentioned problems.
In order to achieve the above purpose, the technical scheme of the invention is specifically realized as follows:
One aspect of the present invention provides a method for processing common information of a real world data platform for cross-queue management, comprising: building a real data platform; collecting data to be processed; performing cross-queue data treatment on the data to be processed, and classifying the data to be processed to obtain cross-queue data; and carrying out common information processing on the cross-queue data and carrying out data fusion.
Wherein, before collecting the data to be processed, the method further comprises: setting rules for the data to be processed, which includes setting the data range, the data type, the data format and the data value range of the data uploaded to the database.
Wherein collecting the data to be processed includes: transmitting the data to be processed through API interface docking; collecting the data to be processed through log files; collecting the data to be processed through manual reporting on the data platform; and/or obtaining the data to be processed through OCR recognition.
Wherein performing cross-queue data treatment on the data to be processed and classifying the data to be processed includes: dividing the data to be processed into different categories according to its timestamp, source and type characteristics, and tagging the data, wherein a tag comprises keywords, phrases, or metadata describing the characteristics, attributes, or meaning of the data, and the tag content comprises department classification, specific disease diagnosis and follow-up outcome.
Wherein performing common information processing on the cross-queue data includes: judging whether the cross-queue data is duplicated according to a preset rule, including judging according to the timestamp and/or the ID; and de-duplicating according to the content of the cross-queue data, including de-duplicating text data by using a text similarity algorithm. The data fusion includes: performing data fusion by means of cluster analysis, association rule mining or text mining.
Wherein building the real data platform comprises: adopting the Python-based web framework Django to realize interactive operation between users and data; adopting a MySQL database as the back-end data store; and adopting a B/S website framework development mode with the following application structure layers: a presentation layer, a business logic layer and a data management layer.
Another aspect of the present invention provides a commonality information processing apparatus for managing a real world data platform across queues, applied to the real data platform, comprising: the acquisition module is used for acquiring data to be processed; the treatment module is used for carrying out cross-queue data treatment on the data to be processed and classifying the data to be processed to obtain cross-queue data; and the processing module is used for carrying out common information processing on the cross-queue data and carrying out data fusion.
Wherein the apparatus further comprises a setting module for setting rules for the data to be processed. The setting module sets the rules by setting the data range, the data type, the data format and the data value range of the data uploaded to the database.
The acquisition module acquires the data to be processed in the following ways: transmitting the data to be processed through API interface docking; collecting the data to be processed through log files; collecting the data to be processed through manual reporting on the data platform; and/or obtaining the data to be processed through OCR recognition.
The treatment module performs cross-queue data treatment on the data to be processed and classifies it as follows: the data to be processed is divided into different categories according to its timestamp, source and type characteristics, and the data is tagged, wherein a tag comprises keywords, phrases, or metadata describing the characteristics, attributes, or meaning of the data, and the tag content comprises department classification, specific disease diagnosis and follow-up outcome.
The processing module performs common information processing on the cross-queue data as follows: judging whether the cross-queue data is duplicated according to a preset rule, including judging according to the timestamp and/or the ID; and de-duplicating according to the content of the cross-queue data, including de-duplicating text data by using a text similarity algorithm. The processing module performs data fusion by means of cluster analysis, association rule mining or text mining.
Wherein the real data platform adopts the Python-based web framework Django to realize interactive operation between users and data; adopts a MySQL database as the back-end data store; and adopts a B/S website framework development mode with the following application structure layers: a presentation layer, a business logic layer and a data management layer.
Therefore, the common information processing method and device for managing real world data platforms across queues can integrate various types of data based on the real world data platform, including structured data, unstructured data, and multimedia data, so that users can conveniently share and exchange data, which promotes cooperation and sharing across departments and institutions. In addition, the system is highly scalable and flexible: the back-end server can be scaled horizontally according to actual requirements to meet the needs of large-scale data exchange and sharing. Multiple data formats and interface standards are also supported, so that the system can be integrated and interoperate seamlessly with other systems.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the description of the embodiments will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart of a method for processing common information of a real world data platform for cross-queue management according to an embodiment of the present invention;
FIG. 2 is a frame structure diagram of a real world data platform provided by an embodiment of the present invention;
FIG. 3 is a functional block diagram of a real world data platform according to an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a common information processing device for managing real world data platforms across queues according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
Fig. 1 shows a flowchart of a method for processing common information of a real world data platform for cross-queue management according to an embodiment of the present invention. Referring to fig. 1, the method includes:
S1, building a real data platform.
Specifically, the real data platform is a real world data platform based on cross-queue data management and processing commonality information technology.
As an optional implementation manner of the embodiment of the present invention, building a real data platform includes: adopting the Python-based web framework Django to realize interactive operation between users and data; adopting a MySQL database as the back-end data store; and adopting a B/S website framework development mode with the following application structure layers: a presentation layer, a business logic layer and a data management layer.
In a specific implementation, the design approach for building the real data platform provided by the embodiment of the invention is as follows:
The real data platform provided by the embodiment of the invention adopts the Python-based web framework Django. This fully featured server-side web framework connects the database with the web pages, thereby realizing interactive operation between users and data.
The real data platform provided by the embodiment of the invention can meet several requirements:
Firstly, the website has good interactivity: the interface is clear and easy to operate, so that medical staff can use it conveniently in their daily medical workflow;
Secondly, users are divided according to user permissions: an administrator can log in to the back end through Django's built-in admin application and add, delete, modify and query data in the database. For the time being, an authorized access user has no permission to modify website data, but can freely obtain data and use all functional modules of the website to perform simple data analysis;
Thirdly, data warehouse and data integration: the platform has a centralized back-end data warehouse for storing and managing the uploaded medical data, such as electronic medical records, laboratory results, and image data. It also supports data integration, combining data from different sources and systems so that medical workers can access and analyze the data through a unified interface;
Fourth, data security and privacy protection: medical data involves patients' personal privacy information, so the real data platform provided by the embodiment of the invention needs strict data security measures, including data encryption, access control, and identity verification, to ensure the confidentiality and integrity of the data;
Fifth, data standardization and interoperability: the medical data platform should support standardized data formats and protocols for data exchange and sharing with other medical systems. For example, the HL7 (Health Level Seven) or FHIR (Fast Healthcare Interoperability Resources) standards are used to achieve seamless integration with electronic medical record systems, laboratory information systems, and the like.
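The permission split described in the second requirement above (administrators may add, delete, modify and query data, while authorized access users may only read) can be illustrated with a minimal sketch. This is not the platform's implementation, which the text says relies on Django's admin application; the role names, action names and the `is_allowed` and `require` helpers are hypothetical.

```python
# Hypothetical role and action names. The text only states that administrators
# may add, delete, modify and query data, while authorized access users may
# read data but not modify it.
ROLE_ACTIONS = {
    "administrator": {"add", "delete", "change", "view"},
    "authorized_user": {"view"},
}


def is_allowed(role, action):
    """Return True if the given role may perform the given action."""
    return action in ROLE_ACTIONS.get(role, set())


def require(role, action):
    """Raise PermissionError when the role lacks the requested action."""
    if not is_allowed(role, action):
        raise PermissionError(f"{role!r} may not {action!r} platform data")
```

Unknown roles receive no permissions by default, which keeps the check fail-closed.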
The real data platform provided by the embodiment of the invention adopts a MySQL database as a back-end data storage library.
MySQL is a widely used and well tested database management system with good reliability and stability. It performs well when handling large amounts of data and highly concurrent access, and can ensure safe storage of and reliable access to medical data.
The user interface of the real data platform provided by the embodiment of the invention combines basic medical process specifications with the characteristics of different specialty diseases. Functions such as patient information management, medical records, diagnosis, and treatment plans help medical staff with their daily work, while department-specific functions and specialty disease data tagging are also provided to facilitate screening and searching.
The real data platform provided by the embodiment of the invention adopts a B/S website framework development mode (Browser/Server structure, namely a browser combined with a server) and comprises three application structure layers: a presentation (UI) layer, a business logic layer and a data management layer.
Presentation (UI) layer: used for user interaction and the exchange of information data, giving users visual pages on which they can obtain, search and analyze data. User requests are submitted from the presentation layer to the business logic layer. The presentation layer is the set of Template web pages in the Django framework, and Ajax is used to handle the presentation of static data on the presentation layer.
Business logic layer: its core consists of the Django controller (view.py) and the URL resolver (url.py). Beyond these two core components, the business logic layer combines other functional modules to implement more complex functions: Django models (models.py) define the data model, forms (forms.py) process user input, filters (filters.py) filter and process data, and decorators (decorators.py) enhance the performance and security of functions or views. The layer responds to requests from the presentation layer and retrieves data from the data layer, including the individual data tables in the MySQL database, for example static data such as gene sequence data and association data.
Data management layer (data layer): mainly used for receiving and processing requests for data. It responds to requests to add, delete and modify database records and to requests for static data, the latter including requests from the search module and from the download module. After a request has been processed, the data is returned to the business logic layer and finally presented on the presentation layer.
In the following, a specific example is used to describe the real data platform provided by an embodiment of the present invention. Fig. 2 shows a frame structure diagram of the real world data platform, and fig. 3 shows a functional block diagram of the real world data platform. Referring to fig. 2 and fig. 3, the real world data platform provided by the embodiment of the present invention includes the following aspects:
1. User function design and implementation.
The real world data platform provided by the embodiment of the invention registers the functional modules accessed by users in the database for permission management. The combination adopted is Django + captcha + smtplib, which implements the following sequence of functional modules: Register -> Captcha Confirm -> Email Confirm -> Register Success -> Login -> Logout.
2. Implementation of the data management module.
The real world data platform provided by the embodiment of the invention designs the table structures for all data in the MySQL database. The data tables include, but are not limited to, user data, demographic information, transfer information, treatment records, medical history, physical examination, specialty examination, diagnosis, laboratory tests and examinations, surgical treatment, postoperative recovery and medical treatment, follow-up visits, adverse events, and rehabilitation. The tables are linked structurally through the uniquely identifying patient ID number, which serves as the primary key.
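The idea of linking every table through a unique patient ID can be sketched as follows. This is an illustration only: Python's built-in `sqlite3` module stands in for the MySQL database named in the text so that the sketch runs without a server, and all table names, column names and sample values are hypothetical.

```python
import sqlite3

# sqlite3 stands in for MySQL here; table and column names are hypothetical
# and are not taken from the patent.
SCHEMA = """
CREATE TABLE patient (
    patient_id TEXT PRIMARY KEY,
    sex        TEXT,
    birth_year INTEGER
);
CREATE TABLE diagnosis (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    patient_id TEXT NOT NULL REFERENCES patient(patient_id),
    disease    TEXT NOT NULL
);
CREATE TABLE follow_up (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    patient_id TEXT NOT NULL REFERENCES patient(patient_id),
    outcome    TEXT NOT NULL
);
"""


def build_demo_db():
    """Create an in-memory database whose tables are linked by patient_id."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO patient VALUES ('P001', 'F', 1970)")
    conn.execute(
        "INSERT INTO diagnosis (patient_id, disease) VALUES ('P001', 'hypertension')"
    )
    conn.execute(
        "INSERT INTO follow_up (patient_id, outcome) VALUES ('P001', 'stable')"
    )
    conn.commit()
    return conn


def patient_summary(conn, patient_id):
    """Join the per-patient tables through the shared patient ID."""
    rows = conn.execute(
        """
        SELECT p.patient_id, d.disease, f.outcome
        FROM patient AS p
        JOIN diagnosis AS d ON d.patient_id = p.patient_id
        JOIN follow_up AS f ON f.patient_id = p.patient_id
        WHERE p.patient_id = ?
        """,
        (patient_id,),
    )
    return rows.fetchall()
```

Because each table carries the same `patient_id`, records uploaded from different sources can be joined into a single view of one patient.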
3. Implementation of the data retrieval module.
To implement the search function, the real world data platform provided by the embodiment of the invention adopts the software combination Haystack + Whoosh + Django. Haystack is the most mature modular search application for Django; it is widely used and is still being iterated. It has a unified, mature API that allows users to switch between different search back ends without modifying their code. The features of Whoosh include: fielded indexing and searching; fast indexing and retrieval; pluggable scoring algorithms (including BM25F), text analysis, storage, and posting formats; a powerful query language; and a pure-Python spell checker.
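To give a feel for what a relevance scoring algorithm of this family computes, the following is a minimal sketch of plain BM25 ranking over whitespace-tokenized text. It is not Whoosh's code and not the fielded BM25F variant mentioned above; the function name `bm25_scores`, the tokenization and the parameter values `k1` and `b` are assumptions chosen for illustration.

```python
import math
from collections import Counter


def bm25_scores(query, documents, k1=1.2, b=0.75):
    """Score each document against the query with plain (non-fielded) BM25."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs if n_docs else 0.0
    # Document frequency: the number of documents in which each term occurs.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            tf = term_freq.get(term, 0)
            if tf == 0:
                continue  # a term absent from the document contributes nothing
            df = doc_freq[term]
            idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
            # Length normalization: long documents are penalized via b.
            norm = tf + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores
```

Documents that mention a query term more often, or that are shorter than average, receive higher scores; documents without any query term score zero.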
4. Implementation of the analysis function.
The real world data platform provided by the embodiment of the invention integrates the functional modules for classification, tag setting, data fusion and de-duplication, and embeds data analysis plug-ins and APIs, such as Bokeh, TensorFlow/Keras, and Matplotlib, into the web pages.
5. Implementation of the download module.
The real world data platform provided by the embodiment of the invention uses FileResponse from the Django framework. For each file on each download page (per download /), the corresponding download link (per download/filename) maps to a downloadFile function in the view file (view.py), and the download is served by calling FileResponse.
S2, collecting data to be processed.
Specifically, the scope and mode of patient data collection are determined. Provided that ethical standards are met, the participating units, which include medical and health institutions at all levels and clinical research institutions, collect and report the medical data. The collection scope includes relevant medical data such as case information and follow-up information, and the collection mode is selected according to the specific situation.
As an alternative implementation of the embodiment of the present invention, collecting data to be processed includes: transmitting the data to be processed through API interface docking; collecting the data to be processed through log files; collecting the data to be processed through manual reporting on the data platform; and/or obtaining the data to be processed through OCR recognition. Specifically, the data collection modes include API interface docking transmission, log file collection, manual reporting on the data platform, OCR recognition, and the like.
As an optional implementation manner of the embodiment of the invention, before collecting the data to be processed, the common information processing method for managing the real world data platform across queues further comprises: setting rules for the data to be processed, which includes setting the data range, the data type, the data format and the data value range of the data uploaded to the database.
Specifically, the data range, data type, data format and data value range of the data uploaded to the database are specified according to current medical data standards. These standards include published national standards, such as the health informatics patient health card data standards; industry standards, such as the health information data element catalog; and group standards for the relevant diseases.
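The four kinds of rule named above (data range, data type, data format and value range) could be checked before upload roughly as follows. This is a minimal sketch under stated assumptions: the field names, the ID pattern, the date format and the numeric limits in `RULES` are hypothetical and are not drawn from any of the standards cited in the text.

```python
import re
from datetime import datetime

# Hypothetical rule set for illustration; real rules would follow the
# applicable national, industry or group standards.
RULES = {
    "patient_id": {"type": str, "pattern": r"P\d{6}"},
    "visit_date": {"type": str, "date_format": "%Y-%m-%d"},
    "systolic_bp": {"type": int, "min": 50, "max": 300},
}


def validate_record(record, rules=RULES):
    """Return a list of rule violations; an empty list means the record passes."""
    errors = []
    # Data range: only fields named in the rules may be uploaded.
    for field in record:
        if field not in rules:
            errors.append(f"{field}: field not allowed")
    for field, rule in rules.items():
        if field not in record:
            errors.append(f"{field}: missing")
            continue
        value = record[field]
        # Data type: bool is rejected explicitly because it subclasses int.
        if not isinstance(value, rule["type"]) or isinstance(value, bool):
            errors.append(f"{field}: expected {rule['type'].__name__}")
            continue
        # Data format: regular expression or date layout.
        if "pattern" in rule and not re.fullmatch(rule["pattern"], value):
            errors.append(f"{field}: bad format")
        if "date_format" in rule:
            try:
                datetime.strptime(value, rule["date_format"])
            except ValueError:
                errors.append(f"{field}: bad date")
        # Value range: inclusive numeric bounds.
        if "min" in rule and value < rule["min"]:
            errors.append(f"{field}: below {rule['min']}")
        if "max" in rule and value > rule["max"]:
            errors.append(f"{field}: above {rule['max']}")
    return errors
```

Returning every violation, rather than stopping at the first one, lets a reporting unit correct a rejected record in one pass.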
S3, performing cross-queue data treatment on the data to be processed, and classifying the data to be processed to obtain the cross-queue data.
As an optional implementation manner of the embodiment of the present invention, performing cross-queue data management on data to be processed, and classifying the data to be processed includes: according to the timestamp, the source and the type characteristics of the data to be processed, the data to be processed is divided into different categories, and data tagging is realized, wherein the tag comprises: keywords, phrases, or metadata describing the characteristics, attributes, or meaning of the data, the tag content comprising: department classification, specific disease diagnosis and follow-up outcome.
Specifically, the real world data platform provided by the embodiment of the invention performs cross-queue data treatment on the converted data in three respects. Data classification: the data is divided into different categories according to characteristics such as its timestamp, source and type. Data tagging: a tag can be a keyword, a phrase or metadata that describes the characteristics, attributes or meaning of the data, and the tag content can be department classification, specific disease diagnosis, follow-up outcome, and the like. Data standardization: the definitions, units and value ranges of the various test and examination data are unified.
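The classification and tagging steps described above might look like the following sketch. The record fields (`timestamp`, `source`, `type`, `text`), the choice of grouping by year, and the keyword table are all assumptions made for illustration; the text does not specify how categories or tags are derived.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical keyword table: keyword found in the text -> (tag name, tag value).
KEYWORD_TAGS = {
    "cardiology": ("department", "cardiology"),
    "hypertension": ("diagnosis", "hypertension"),
    "recovered": ("follow_up_outcome", "recovered"),
}


def classify(records):
    """Group records into categories keyed by (year, source, type)."""
    groups = defaultdict(list)
    for rec in records:
        year = datetime.fromisoformat(rec["timestamp"]).year
        groups[(year, rec["source"], rec["type"])].append(rec)
    return dict(groups)


def tag_record(rec):
    """Return a copy of the record with tags derived from keywords in its text."""
    text = rec.get("text", "").lower()
    tags = {}
    for keyword, (name, value) in KEYWORD_TAGS.items():
        if keyword in text:
            tags[name] = value
    return {**rec, "tags": tags}
```

For example, a record whose text mentions "hypertension" and "recovered" would receive a diagnosis tag and a follow-up outcome tag, which later supports screening by specialty disease.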
S4, carrying out common information processing on the cross-queue data and carrying out data fusion.
As an optional implementation manner of the embodiment of the present invention, performing common information processing on the cross-queue data includes: judging whether the cross-queue data is duplicated according to a preset rule, including judging according to the timestamp and/or the ID; and de-duplicating according to the content of the cross-queue data, including de-duplicating text data by using a text similarity algorithm;
the data fusion includes: performing data fusion by means of cluster analysis, association rule mining or text mining.
Specifically, the real world data platform provided by the embodiment of the invention performs common information processing on the cross-queue data to realize data fusion, for which various methods may be adopted, including weighted averaging, data interpolation, and wavelet transformation. The platform adopts multiple de-duplication strategies, including rule-based and content-based de-duplication. Rule-based de-duplication judges whether data is duplicated according to a preset rule, for example according to fields such as the timestamp and the ID; content-based de-duplication works on the content of the data, for example by applying a text similarity algorithm to text data. Data induction is also realized, including cluster analysis, association rule mining, and text mining.
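The two de-duplication strategies can be sketched as follows. The text does not name a specific text similarity algorithm, so this example uses the ratio from `difflib.SequenceMatcher` in the Python standard library as one possible choice; the record fields (`id`, `timestamp`, `text`) and the 0.9 threshold are likewise assumptions.

```python
from difflib import SequenceMatcher


def dedupe_by_rule(records):
    """Rule-based: keep the first record seen for each (id, timestamp) pair."""
    seen = set()
    unique = []
    for rec in records:
        key = (rec["id"], rec["timestamp"])
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


def dedupe_by_content(records, threshold=0.9):
    """Content-based: drop a record whose text nearly matches a kept record."""
    kept = []
    for rec in records:
        is_duplicate = any(
            SequenceMatcher(None, rec["text"], other["text"]).ratio() >= threshold
            for other in kept
        )
        if not is_duplicate:
            kept.append(rec)
    return kept
```

Running the rule-based pass first is cheap and removes exact re-uploads; the content-based pass then catches records that arrive under different IDs from different queues but carry nearly identical text. The pairwise comparison is quadratic in the number of records, so a production system would need blocking or indexing to scale.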
Therefore, the common information processing method for managing the real-world data platform across queues provided by the embodiment of the invention can, based on the real-world data platform, integrate various types of data, including structured data, unstructured data, and multimedia data, so that users can conveniently share and exchange data, which promotes cooperation and sharing across departments and institutions. In addition, the system is highly scalable and flexible: the back-end servers can be scaled horizontally according to actual requirements to meet the needs of large-scale data exchange and sharing. The system also supports multiple data formats and interface standards, so that it can be integrated and can interoperate with other systems.
Fig. 4 is a schematic structural diagram of the common information processing device for managing real world data platforms across queues provided by the embodiment of the invention. The device applies the method described above; only its structure is briefly described below, and for matters not covered here, please refer to the related description of the common information processing method for managing real world data platforms across queues. Referring to fig. 3, the common information processing device for managing real world data platforms across queues provided by the embodiment of the present invention is applied to a real data platform and includes:
The acquisition module is used for acquiring data to be processed;
The governance module is used for performing cross-queue data governance on the data to be processed and classifying the data to be processed to obtain cross-queue data;
and the processing module is used for carrying out common information processing on the cross-queue data and carrying out data fusion.
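By way of non-limiting illustration, the division of work among the three modules may be sketched as follows. The class names, method names, and the simplified logic inside each module (classification by type only, fusion by keeping one record per ID) are assumptions made for this example, not the device's actual implementation.

```python
class AcquisitionModule:
    """Collects data to be processed from a list of source callables."""
    def __init__(self, sources):
        self.sources = sources

    def collect(self):
        return [rec for source in self.sources for rec in source()]

class GovernanceModule:
    """Classifies collected records; here simply by their 'type' field."""
    def govern(self, records):
        categories = {}
        for rec in records:
            categories.setdefault(rec["type"], []).append(rec)
        return categories

class ProcessingModule:
    """De-duplicates each category by ID and returns the fused result."""
    def process(self, categories):
        fused = {}
        for category, recs in categories.items():
            unique = {rec["id"]: rec for rec in recs}  # later duplicates overwrite earlier ones
            fused[category] = list(unique.values())
        return fused

# Two hypothetical sources that both report record "1".
source_a = lambda: [{"id": "1", "type": "lab"}, {"id": "2", "type": "imaging"}]
source_b = lambda: [{"id": "1", "type": "lab"}]

collected = AcquisitionModule([source_a, source_b]).collect()
result = ProcessingModule().process(GovernanceModule().govern(collected))
```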
As an optional implementation of the embodiment of the present invention, the common information processing apparatus for managing real world data platforms across queues further includes a setting module, which is used for setting rules for the data to be processed. The setting module sets the rules by setting the data range, data type, data format, and data value range for data uploaded to the database.
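By way of non-limiting illustration, rules covering data range, data type, data format, and value range may be expressed and checked as follows. The field names, the regular expressions, and the numeric limits are hypothetical; here the data range is interpreted as the set of fields that a record must contain.

```python
import re

# Hypothetical rule set: the keys define the data range (required fields),
# and each entry defines the expected type plus an optional format or value range.
RULES = {
    "patient_id": {"type": str, "format": r"P\d{6}"},
    "visit_date": {"type": str, "format": r"\d{4}-\d{2}-\d{2}"},
    "systolic_bp": {"type": int, "range": (50, 300)},
}

def validate(record, rules=RULES):
    """Return a list of rule violations; an empty list means the record passes."""
    errors = []
    for field, rule in rules.items():
        if field not in record:
            errors.append(f"{field}: missing")
            continue
        value = record[field]
        if not isinstance(value, rule["type"]):
            errors.append(f"{field}: expected {rule['type'].__name__}")
            continue
        if "format" in rule and not re.fullmatch(rule["format"], value):
            errors.append(f"{field}: bad format")
        if "range" in rule:
            low, high = rule["range"]
            if not low <= value <= high:
                errors.append(f"{field}: out of range")
    return errors

good = {"patient_id": "P000123", "visit_date": "2024-03-05", "systolic_bp": 120}
bad = {"patient_id": "123", "visit_date": "2024-03-05", "systolic_bp": 400}
```

Calling `validate(good)` yields no violations, while `validate(bad)` reports the malformed ID and the out-of-range value, so records can be rejected before they are uploaded.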
As an optional implementation of the embodiment of the present invention, the acquisition module acquires the data to be processed in one or more of the following ways: receiving the data to be processed through an API interface; collecting the data to be processed from log files; collecting the data to be processed through manual reporting on the data platform; and/or recognizing the data to be processed by OCR.
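By way of non-limiting illustration, three of the acquisition channels may be normalized into a common record shape as follows. The JSON payload layout, the `timestamp|id|message` log format, and the `channel` field are assumptions made for this example; image recognition is omitted because it would require an external recognition engine.

```python
import io
import json

def from_api(payload):
    """Parse a JSON array of records as it might arrive through an API call."""
    return [dict(rec, channel="api") for rec in json.loads(payload)]

def from_log(stream):
    """Parse log lines of the assumed form 'timestamp|id|message'."""
    records = []
    for line in stream:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        timestamp, rec_id, message = line.split("|", 2)
        records.append({"timestamp": timestamp, "id": rec_id,
                        "text": message, "channel": "log"})
    return records

def from_manual(form):
    """Wrap a manually reported form submission as a single record."""
    return [dict(form, channel="manual")]

api_payload = '[{"id": "a1", "text": "lab result uploaded"}]'
log_stream = io.StringIO(
    "2024-03-05T10:00:00|l1|device reading stored\n"
    "\n"
    "2024-03-05T10:05:00|l2|note with | pipe kept\n"
)

collected = (
    from_api(api_payload)
    + from_log(log_stream)
    + from_manual({"id": "m1", "text": "follow-up completed"})
)
```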
As an optional implementation of the embodiment of the invention, the governance module performs cross-queue data governance on the data to be processed and classifies the data to be processed in the following manner: dividing the data to be processed into different categories according to its timestamp, source, and type characteristics, and tagging the data, wherein a tag includes a keyword, a phrase, or metadata describing the characteristics, attributes, or meaning of the data, and the tag content includes a department classification, a specific disease diagnosis, and a follow-up outcome.
As an optional implementation of the embodiment of the invention, the processing module performs commonality information processing on the cross-queue data in the following manner: determining, according to a preset rule, whether the cross-queue data is duplicated, including determining whether the cross-queue data is duplicated according to the timestamp and/or the ID; and de-duplicating based on the content of the cross-queue data, including de-duplicating text data by using a text similarity algorithm. The processing module performs data fusion by cluster analysis, association rule mining, or text mining.
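By way of non-limiting illustration, association rule mining over tagged records may be sketched as follows. The sketch is restricted to rules between pairs of tags and uses the standard support and confidence measures; the thresholds and the tag sets are assumptions made for this example, and the embodiment does not specify a particular mining algorithm.

```python
from collections import Counter
from itertools import combinations

def mine_pair_rules(transactions, min_support=0.5, min_confidence=0.8):
    """Return rules (lhs, rhs, support, confidence) between pairs of items."""
    n = len(transactions)
    item_counts = Counter(item for t in transactions for item in set(t))
    pair_counts = Counter(
        pair for t in transactions for pair in combinations(sorted(set(t)), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # fraction of records containing both items
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]  # P(rhs | lhs)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)

# Each transaction is the set of tags attached to one record (hypothetical values).
tagged_records = [
    ["diabetes", "hypertension", "endocrinology"],
    ["diabetes", "endocrinology"],
    ["diabetes", "hypertension", "endocrinology"],
    ["hypertension", "cardiology"],
]

rules = mine_pair_rules(tagged_records)
```

With these inputs, only the pair of tags that always occur together passes both thresholds; pairs that co-occur in half of the records fail the confidence threshold.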
As an optional implementation of the embodiment of the invention, the real world platform uses Django, a Python-based web framework, to implement interaction between users and data; uses a MySQL database as the back-end data store; and is developed in the B/S (browser/server) website architecture with the following application layers: a presentation layer, a business logic layer, and a data management layer.
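By way of non-limiting illustration, a fragment of a Django settings module that selects MySQL as the back-end store may look as follows. The application name `rwd_platform`, the environment variable names, and the default values are hypothetical; the comments indicate one possible mapping of the three layers onto Django components.

```python
import os

# Presentation layer: Django templates and views served to the browser (B/S mode).
# Business logic layer: view and service code inside the application package.
# Data management layer: Django ORM models stored in MySQL.
INSTALLED_APPS = [
    "django.contrib.auth",
    "django.contrib.contenttypes",
    "django.contrib.sessions",
    "rwd_platform",  # hypothetical application package
]

DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",  # Django's built-in MySQL backend
        "NAME": os.environ.get("RWD_DB_NAME", "rwd_platform"),
        "USER": os.environ.get("RWD_DB_USER", "rwd"),
        "PASSWORD": os.environ.get("RWD_DB_PASSWORD", ""),
        "HOST": os.environ.get("RWD_DB_HOST", "127.0.0.1"),
        "PORT": os.environ.get("RWD_DB_PORT", "3306"),
        "OPTIONS": {"charset": "utf8mb4"},
    }
}
```

Reading credentials from the environment keeps them out of the source tree; a real deployment would also need a MySQL client driver installed alongside Django.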
Therefore, the common information processing device for managing the real-world data platform across queues provided by the embodiment of the invention can, based on the real-world data platform, integrate various types of data, including structured data, unstructured data, and multimedia data, so that users can conveniently share and exchange data, which promotes cooperation and sharing across departments and institutions. In addition, the system is highly scalable and flexible: the back-end servers can be scaled horizontally according to actual requirements to meet the needs of large-scale data exchange and sharing. The system also supports multiple data formats and interface standards, so that it can be integrated and can interoperate with other systems.
The foregoing is merely exemplary of the present application and is not intended to limit the present application. Various modifications and variations of the present application will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. which come within the spirit and principles of the application are to be included in the scope of the claims of the present application.
Claims (12)
1. A method for processing commonality information of a real world data platform for cross-queue management, comprising:
Building a real data platform;
Collecting data to be processed;
Performing cross-queue data treatment on the data to be processed, and classifying the data to be processed to obtain cross-queue data;
and carrying out common information processing on the cross-queue data and carrying out data fusion.
2. The method of claim 1, further comprising, prior to collecting the data to be processed: setting rules for the data to be processed, wherein setting the rules comprises: setting the data range, data type, data format, and data value range for data uploaded to the database.
3. The method of claim 1, wherein the acquiring the data to be processed comprises:
receiving the data to be processed through an API interface connection;
collecting the data to be processed from log files;
collecting the data to be processed through manual reporting on the data platform; and/or
recognizing the data to be processed by OCR.
4. The method of claim 1, wherein performing cross-queue data governance on the data to be processed and classifying the data to be processed comprises:
dividing the data to be processed into different categories according to its timestamp, source, and type characteristics, and tagging the data, wherein a tag comprises a keyword, a phrase, or metadata describing the characteristics, attributes, or meaning of the data, and the tag content comprises a department classification, a specific disease diagnosis, and a follow-up outcome.
5. The method of claim 1, wherein
performing commonality information processing on the cross-queue data comprises:
determining, according to a preset rule, whether the cross-queue data is duplicated, including determining whether the cross-queue data is duplicated according to the timestamp and/or the ID; and de-duplicating according to the content of the cross-queue data, including de-duplicating text data by using a text similarity algorithm; and
the data fusion comprises:
performing data fusion by cluster analysis, association rule mining, or text mining.
6. The method of claim 1, wherein the building a real data platform comprises:
Adopting a Python-based web framework Django to realize interactive operation between a user and data;
adopting a MySQL database as a back-end data storage library;
the B/S website framework development mode is adopted, and the following application structure level is adopted: a presentation layer, a business logic layer and a data management layer.
7. A commonality information processing apparatus for managing real world data platforms across queues, applied to real data platforms, comprising:
The acquisition module is used for acquiring data to be processed;
the governance module is used for performing cross-queue data governance on the data to be processed and classifying the data to be processed to obtain cross-queue data;
and the processing module is used for carrying out common information processing on the cross-queue data and carrying out data fusion.
8. The apparatus of claim 7, further comprising a setting module for setting rules for the data to be processed, wherein the setting module sets the rules by: setting the data range, data type, data format, and data value range for data uploaded to the database.
9. The apparatus of claim 7, wherein the acquisition module acquires the data to be processed by:
receiving the data to be processed through an API interface connection;
collecting the data to be processed from log files;
collecting the data to be processed through manual reporting on the data platform; and/or
recognizing the data to be processed by OCR.
10. The apparatus of claim 7, wherein the governance module performs cross-queue data governance on the data to be processed and classifies the data to be processed by:
dividing the data to be processed into different categories according to its timestamp, source, and type characteristics, and tagging the data, wherein a tag comprises a keyword, a phrase, or metadata describing the characteristics, attributes, or meaning of the data, and the tag content comprises a department classification, a specific disease diagnosis, and a follow-up outcome.
11. The apparatus of claim 7, wherein
the processing module performs common information processing on the cross-queue data by:
determining, according to a preset rule, whether the cross-queue data is duplicated, including determining whether the cross-queue data is duplicated according to the timestamp and/or the ID; and de-duplicating according to the content of the cross-queue data, including de-duplicating text data by using a text similarity algorithm; and
the processing module performs data fusion by:
performing data fusion by cluster analysis, association rule mining, or text mining.
12. The apparatus of claim 7, wherein the real world platform employs a Python-based web framework Django to enable user interaction with data; adopting a MySQL database as a back-end data storage library; the B/S website framework development mode is adopted, and the following application structure level is adopted: a presentation layer, a business logic layer and a data management layer.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410246934.7A CN117992654A (en) | 2024-03-05 | 2024-03-05 | Common information processing method and device for managing real world data platform across queues |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117992654A (en) | 2024-05-07 |
Family ID=90893029
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410246934.7A (Pending) CN117992654A (en) | 2024-03-05 | 2024-03-05 | Common information processing method and device for managing real world data platform across queues |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117992654A (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110502559A (en) * | 2019-07-25 | 2019-11-26 | 浙江公共安全技术研究院有限公司 | A kind of data/address bus and transmission method of credible and secure cross-domain data exchange |
CN111917887A (en) * | 2020-08-17 | 2020-11-10 | 普元信息技术股份有限公司 | System for realizing data governance under big data environment |
CN115132366A (en) * | 2022-06-29 | 2022-09-30 | 电子科技大学 | Multi-source data processing method and system based on health and medical big data standard library |
CN115359863A (en) * | 2022-06-17 | 2022-11-18 | 四川大学华西医院 | Intelligent management internet platform, control method, construction method, equipment and terminal |
CN115831350A (en) * | 2022-11-21 | 2023-03-21 | 暨南大学附属第一医院(广州华侨医院) | Distributed intelligent diagnosis method, device, equipment and storage medium for head and neck tumors |
CN116469571A (en) * | 2023-04-18 | 2023-07-21 | 山东浪潮智慧医疗科技有限公司 | Method and system for constructing specific disease map of real world data |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||