CN116204492A - Metadata quality determination method and device - Google Patents

Info

Publication number
CN116204492A
Authority
CN
China
Prior art keywords
metadata
quality
document
determining
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211705259.7A
Other languages
Chinese (zh)
Inventor
邢小龙
刘兆平
李环亚
赵铭
林镇锋
周海
田松林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Southern Power Grid Digital Platform Technology Guangdong Co ltd
Original Assignee
China Southern Power Grid Digital Platform Technology Guangdong Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Southern Power Grid Digital Platform Technology Guangdong Co ltd
Priority to CN202211705259.7A
Publication of CN116204492A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/11 File system administration, e.g. details of archiving or snapshots
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/11 File system administration, e.g. details of archiving or snapshots
    • G06F16/116 Details of conversion of file system types or formats
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/16 File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • G06F16/164 File meta data generation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a metadata quality determining method and device, wherein the method comprises the following steps: determining a metadata process document according to the metadata change content; determining metadata quality according to the metadata process document, and generating a metadata quality document; the metadata quality document is used for indicating the metadata quality; and checking the metadata quality according to the metadata quality document. Corresponding metadata process documents are determined according to the changing content of the metadata, the changed metadata quality is determined according to the metadata process documents, corresponding metadata quality documents are generated, metadata quality verification is carried out, the standardization level of metadata quality determination is improved, and the efficiency and accuracy of metadata quality determination are further improved.

Description

Metadata quality determination method and device
Technical Field
The present invention relates to the field of metadata technologies, and in particular, to a method and an apparatus for determining metadata quality.
Background
Metadata describes the attributes or state of target data. While software or a project is running, the quality of the metadata needs to be monitored so that the data quality and the changes to the metadata can be tracked in real time. In existing metadata quality determination methods, however, the corresponding process documents are filled in manually whenever metadata changes, and data quality is ensured through manual verification. Because such methods have a low degree of standardization, the efficiency and accuracy of metadata quality determination are low. Improving the standardization of metadata quality determination is therefore particularly important.
Disclosure of Invention
The technical problem to be solved by the invention is that existing metadata quality determination methods have a low degree of standardization, so that the efficiency and accuracy of metadata quality determination are low.
In order to solve the above technical problem, a first aspect of the present invention discloses a metadata quality determining method, including:
determining a metadata process document according to the metadata change content;
determining metadata quality according to the metadata process document, and generating a metadata quality document; the metadata quality document is used for indicating the metadata quality;
and checking the metadata quality according to the metadata quality document.
As an optional implementation manner, before the determining the metadata process document according to the metadata change content, the method further includes:
establishing a metadata reference library; the metadata reference library is used for storing the metadata;
and determining the metadata change content according to the metadata reference library.
As an alternative embodiment, the metadata process document includes at least one of: a metadata change list, a metadata responsibility matrix, and a metadata quality list;
the determining the metadata process document according to the metadata change content comprises the following steps:
determining the metadata change list according to the data attribute of the metadata;
determining the metadata responsibility matrix according to the user relation of the metadata;
determining the metadata quality list according to the data content of the metadata;
and determining the metadata process document according to the metadata change list, the metadata responsibility matrix and the metadata quality list.
As an alternative embodiment, the determining metadata quality according to the metadata process document and generating a metadata quality document includes:
determining the data type, the data coding and the data annotation of the metadata through a regular expression according to the metadata process document;
and determining the quality of the metadata according to the data type, the data coding and the data annotation of the metadata, and generating a metadata quality document.
As an alternative embodiment, the generating the metadata quality document includes:
determining a document format of the metadata quality document;
and generating the metadata quality document through data stream export according to the document format.
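The export step can be illustrated with a minimal sketch. It assumes CSV as the document format and uses hypothetical column names (`field`, `data_source`, `data_type`); neither is fixed by the text above. Rows are written to a text stream one at a time rather than being assembled in memory first.

```python
import csv
import io
from typing import Iterable, Mapping, TextIO

# Hypothetical column layout; the source does not prescribe one.
COLUMNS = ["field", "data_source", "data_type"]


def export_quality_document(rows: Iterable[Mapping[str, str]], stream: TextIO) -> int:
    """Write rows to `stream` as CSV, one row at a time; return the row count."""
    writer = csv.DictWriter(stream, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    count = 0
    for row in rows:  # `rows` may be a generator, so nothing is buffered here
        writer.writerow(row)
        count += 1
    return count


# Usage: stream three generated rows into an in-memory buffer.
buffer = io.StringIO()
written = export_quality_document(
    (
        {"field": f"f{i}", "data_source": "system generation", "data_type": "basic data"}
        for i in range(3)
    ),
    buffer,
)
```

Any writable text stream, such as an open file, could take the place of the in-memory buffer.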
As an optional implementation manner, the verifying the metadata quality according to the metadata quality document includes:
determining a metadata quality check flow;
and according to the metadata quality document, checking the metadata quality through the metadata quality checking flow to obtain a checking result.
As an optional implementation manner, after verifying the metadata quality according to the metadata quality document, the method further includes:
if the data quality of at least one piece of target metadata does not meet the preset requirement, marking the target metadata;
and correcting the target metadata according to a preset correction flow.
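The marking and correction steps can be sketched as follows. The record structure, the `no_padding` requirement, and the `strip_padding` correction are all hypothetical stand-ins for the "preset requirement" and "preset correction flow", which the text does not specify.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class MetadataRecord:
    name: str
    attributes: Dict[str, str]
    flagged: bool = False
    problems: List[str] = field(default_factory=list)


# A check returns an error message, or None when the record passes.
Check = Callable[[MetadataRecord], Optional[str]]


def no_padding(record: MetadataRecord) -> Optional[str]:
    """Hypothetical requirement: attribute values have no surrounding whitespace."""
    bad = [key for key, value in record.attributes.items() if value != value.strip()]
    return f"padded values: {', '.join(bad)}" if bad else None


def verify_and_mark(records: List[MetadataRecord], checks: List[Check]) -> List[MetadataRecord]:
    """Run every check on every record, mark the failures, and return them."""
    for record in records:
        results = (check(record) for check in checks)
        record.problems = [msg for msg in results if msg is not None]
        record.flagged = bool(record.problems)
    return [record for record in records if record.flagged]


def strip_padding(record: MetadataRecord) -> None:
    """Hypothetical correction flow matching the requirement above."""
    record.attributes = {key: value.strip() for key, value in record.attributes.items()}


records = [
    MetadataRecord("rated_voltage", {"comment": "rated voltage"}),
    MetadataRecord("run_status", {"comment": " run status "}),
]
flagged_before = verify_and_mark(records, [no_padding])
for record in flagged_before:
    strip_padding(record)
flagged_after = verify_and_mark(records, [no_padding])
```

Re-running the verification after correction confirms that the marked records now meet the requirement.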
In a second aspect, the present application provides a metadata quality determination apparatus, the apparatus comprising:
the document determining module is used for determining a metadata process document according to the metadata change content;
the quality determining module is used for determining the quality of the metadata according to the metadata process document and generating a metadata quality document; the metadata quality document is used for indicating the metadata quality;
and the quality verification module is used for verifying the quality of the metadata according to the metadata quality document.
As an alternative embodiment, the apparatus further comprises a reference library building module for, before the document determination module determines the metadata process document based on the metadata change content,
establishing a metadata reference library; the metadata reference library is used for storing the metadata;
and determining the metadata change content according to the metadata reference library.
As an alternative embodiment, the metadata process document includes at least one of: a metadata change list, a metadata responsibility matrix, and a metadata quality list;
the specific manner in which the document determination module determines the metadata process document according to the metadata change content comprises:
determining the metadata change list according to the data attribute of the metadata;
determining the metadata responsibility matrix according to the user relation of the metadata;
determining the metadata quality list according to the data content of the metadata;
and determining the metadata process document according to the metadata change list, the metadata responsibility matrix and the metadata quality list.
As an alternative embodiment, the specific manner of determining the metadata quality and generating the metadata quality document by the quality determining module according to the metadata process document includes:
determining the data type, the data coding and the data annotation of the metadata through a regular expression according to the metadata process document;
and determining the quality of the metadata according to the data type, the data coding and the data annotation of the metadata, and generating a metadata quality document.
As an alternative embodiment, the specific manner in which the quality determination module generates the metadata quality document includes:
determining a document format of the metadata quality document;
and generating the metadata quality document through data stream export according to the document format.
As an optional implementation manner, the quality verification module verifies the metadata quality according to the metadata quality document, including:
determining a metadata quality check flow;
and according to the metadata quality document, checking the metadata quality through the metadata quality checking flow to obtain a checking result.
As an alternative embodiment, the apparatus further comprises a correction module for, after the quality verification module verifies the quality of the metadata based on the metadata quality document,
if the data quality of at least one piece of target metadata does not meet the preset requirement, marking the target metadata;
and correcting the target metadata according to a preset correction flow.
Another metadata quality determination apparatus is disclosed in a third aspect of the present invention, the apparatus comprising:
a memory storing executable program code;
a processor coupled to the memory;
the processor invokes the executable program code stored in the memory to perform the metadata quality determination method disclosed in the first aspect of the present invention.
A fourth aspect of the present invention discloses a computer storage medium storing computer instructions which, when called, are adapted to perform the metadata quality determination method disclosed in the first aspect of the present invention.
Compared with the prior art, the embodiment of the invention has the following beneficial effects: corresponding metadata process documents are determined according to the changing content of the metadata, the changed metadata quality is determined according to the metadata process documents, corresponding metadata quality documents are generated, metadata quality verification is carried out, the standardization level of metadata quality determination is improved, and the efficiency and accuracy of metadata quality determination are further improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a metadata quality determination method according to an embodiment of the present invention;
FIG. 2 is a flow chart of a metadata quality determination method disclosed in a second embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a metadata quality determining apparatus according to a third embodiment of the present invention;
FIG. 4 is a schematic diagram of another metadata quality determination apparatus according to the third embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a metadata quality determining apparatus according to a fourth embodiment of the present invention.
Detailed Description
In order that those skilled in the art may better understand the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the invention without inventive effort fall within the scope of the invention.
The terms "first", "second", and the like in the description, the claims, and the above-described figures are used to distinguish between different objects and not necessarily to describe a sequential or chronological order. Furthermore, the terms "comprise" and "have", as well as any variations thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, apparatus, product, or device that comprises a list of steps or elements is not limited to those listed, but may optionally include other steps or elements that are not listed or that are inherent to such a process, method, product, or device.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the invention. Appearances of this phrase in various places in the specification do not necessarily all refer to the same embodiment, nor do they refer to separate or alternative embodiments that are mutually exclusive of other embodiments. Those skilled in the art will understand, both explicitly and implicitly, that the embodiments described herein may be combined with other embodiments.
Metadata describes the attributes or state of target data. While software or a project is running, the quality of the metadata needs to be monitored so that the data quality and the changes to the metadata can be tracked in real time. In existing metadata quality determination methods, however, the corresponding process documents are filled in manually whenever metadata changes, and data quality is ensured through manual verification. A process document for a metadata change may involve a large amount of copying and pasting; when the scope of the change is large, this wastes time and is error-prone. In addition, reviewers differ in how they understand the rules and how thoroughly they apply them, so manual audit results tend to deviate. Because of this low degree of standardization, the efficiency and accuracy of metadata quality determination are low, and improving the standardization of metadata quality determination is therefore very important.
With the method provided by the application, a series of process documents can be generated from the metadata change content, and metadata quality documents can then be generated, which improves the auditing efficiency of technicians. Specifically, technicians can generate related process documents such as the metadata change definition, the data quality list, and the responsibility matrix. These process documents mainly serve to raise the degree of standardization of the system's operation and maintenance process, to supervise system changes from every dimension, and to make explicit the content of data changes, their scope of impact, the responsible persons, data quality inspection, and similar work. In practice, a technician only needs to edit the first- to fourth-level function names corresponding to some of the data in the document; the other document content required for process management can then be completed efficiently and accurately. The auditing efficiency of technicians can also be improved: regular expressions are used to identify the type, coding, and annotation information of the metadata indicated in the metadata quality document, the metadata is parsed, the various requirements of the data quality rules are recognized, the metadata quality is checked automatically, and data quality problems in the metadata are identified accurately.
The invention discloses a metadata quality determining method and device, which are used for determining corresponding metadata process documents according to the changing content of metadata, determining the changed metadata quality according to the metadata process documents, generating corresponding metadata quality documents, and performing metadata quality verification, so that the standardization level of metadata quality determination is improved, and the efficiency and accuracy of metadata quality determination are further improved.
Example 1
Referring to fig. 1, fig. 1 is a flowchart illustrating a metadata quality determining method according to an embodiment of the invention. As shown in fig. 1, the metadata quality determination method may include the following operations:
s101, determining a metadata process document according to metadata change content;
Corresponding metadata process documents can be determined according to changes in the metadata's data format, data content, annotations, and the like. The changes are determined relative to a preset reference: a reference library can be established in advance, and the metadata in the reference library serves as the baseline for determining the metadata change content. In some scenarios, technicians also need to supply the multi-level function page names for newly added metadata, such as the first- to fourth-level function names of metadata in a smart grid scenario.
As an optional implementation manner, before the determining the metadata process document according to the metadata change content, the method further includes:
establishing a metadata reference library; the metadata reference library is used for storing the metadata;
as previously described, the metadata reference library may be used to store metadata, or data indicated by metadata, as a reference for metadata changes.
And determining the metadata change content according to the metadata reference library.
By monitoring the differences in metadata relative to metadata stored in a metadata reference library, changes in metadata, such as changes in metadata format, attributes, or content, can be determined.
By establishing a metadata reference library for storing metadata, the metadata is used as a reference for metadata change comparison, the standardization level of metadata quality determination is improved, and the efficiency and accuracy of metadata quality determination are further improved.
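The comparison against the reference library can be sketched as a dictionary diff. This is only an illustration: the in-memory representation (field name mapped to an attribute dictionary) and the attribute names `type` and `comment` are assumptions, not part of the described method.

```python
from typing import Any, Dict

# Assumed representation: field name -> attributes (format, annotation, ...).
Metadata = Dict[str, Dict[str, Any]]


def diff_against_reference(reference: Metadata, current: Metadata) -> Dict[str, Dict[str, Any]]:
    """Return added, removed, and modified fields relative to the reference library."""
    changes: Dict[str, Dict[str, Any]] = {"added": {}, "removed": {}, "modified": {}}
    for name, attrs in current.items():
        if name not in reference:
            changes["added"][name] = attrs
        elif attrs != reference[name]:
            old = reference[name]
            # Keep only the attributes that differ, as (old, new) pairs.
            changes["modified"][name] = {
                key: (old.get(key), attrs.get(key))
                for key in sorted(set(old) | set(attrs))
                if old.get(key) != attrs.get(key)
            }
    for name, attrs in reference.items():
        if name not in current:
            changes["removed"][name] = attrs
    return changes


reference = {
    "voltage": {"type": "decimal(10,3)", "comment": "rated voltage"},
    "status": {"type": "char(1)", "comment": "run status"},
}
current = {
    "voltage": {"type": "decimal(12,3)", "comment": "rated voltage"},
    "install_date": {"type": "date", "comment": "installation date"},
}
changes = diff_against_reference(reference, current)
```

The resulting `changes` dictionary plays the role of the metadata change content from which the process documents are then determined.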
S102, determining metadata quality according to the metadata process document, and generating a metadata quality document; the metadata quality document is used for indicating the metadata quality;
according to the metadata process document, the change condition of metadata can be determined, the quality of the metadata before and after the change is further determined, and a corresponding metadata quality document is generated. The metadata quality document is used to indicate the metadata quality, and specific content can be seen in the following embodiments.
As an alternative embodiment, the metadata process document includes at least one of: a metadata change list, a metadata responsibility matrix, and a metadata quality list;
At least one of a metadata change list, a metadata responsibility matrix, and a metadata quality list may be determined based on the change in metadata, thereby determining a process document corresponding to the changed metadata. Specifically, the determining the metadata process document according to the metadata change content includes:
determining the metadata change list according to the data attribute of the metadata;
determining the metadata responsibility matrix according to the user relation of the metadata;
determining the metadata quality list according to the data content of the metadata;
and determining the metadata process document according to the metadata change list, the metadata responsibility matrix and the metadata quality list.
It should be noted that only one or two of these three types of documents may be used, for example, the metadata responsibility matrix may be determined only from the user relationship of the metadata, and the other two types of documents need not be determined, and the final process document may only include the metadata responsibility matrix, which is the same for other types of documents.
The metadata change list is determined through the data format, the source and other attributes of the metadata, the metadata responsibility matrix is determined through the user management and the user authority relationship of the metadata, the metadata quality list is determined through the data content of the metadata, and then the metadata process document is determined through the metadata change list, the metadata responsibility matrix and the metadata quality list, so that the standardization level of metadata quality determination is improved, and the efficiency and the accuracy of metadata quality determination are further improved.
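How the three sub-documents combine into one process document, with any subset allowed, can be sketched as below. The class, the function, and all attribute names (`format`, `source`, `content`, `owner`, `permissions`) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ProcessDocument:
    change_list: Optional[List[dict]] = None
    responsibility_matrix: Optional[Dict[str, dict]] = None
    quality_list: Optional[List[dict]] = None

    def parts(self) -> List[str]:
        """Names of the sub-documents that are actually present."""
        return [name for name, value in vars(self).items() if value is not None]


def build_process_document(changed, users=None, with_change_list=True, with_quality_list=True):
    """Assemble a process document from changed metadata and optional user relations."""
    doc = ProcessDocument()
    if with_change_list:  # from data attributes such as format and source
        doc.change_list = [
            {"field": name, "format": attrs.get("format"), "source": attrs.get("source")}
            for name, attrs in changed.items()
        ]
    if users is not None:  # from the user management and permission relations
        doc.responsibility_matrix = {name: users[name] for name in changed if name in users}
    if with_quality_list:  # from the data content
        doc.quality_list = [
            {"field": name, "content": attrs.get("content")} for name, attrs in changed.items()
        ]
    if not doc.parts():
        raise ValueError("a process document needs at least one sub-document")
    return doc


changed = {
    "rated_voltage": {
        "format": "decimal(10,3)",
        "source": "manual entry",
        "content": "rated voltage in volts",
    }
}
users = {"rated_voltage": {"owner": "asset team", "permissions": ["read", "write"]}}
full_doc = build_process_document(changed, users)
matrix_only = build_process_document(changed, users, with_change_list=False, with_quality_list=False)
```

`matrix_only` corresponds to the case described above in which the final process document contains only the responsibility matrix.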
S103, checking the quality of the metadata according to the metadata quality document.
According to the determined metadata quality document and the preset metadata quality rule requirements, whether each dimension in the metadata quality document meets the metadata quality rule requirements or not can be further checked.
For example, in one scenario, verification of metadata quality may be performed as required by the following metadata quality rules:
2.0 For each field in the metadata quality document, fill in the page names down to the level of the page the field belongs to, starting from the first-level page. For example, if the field is on a third-level page, the first-, second-, and third-level page names must all be filled in.
2.1 The [data source] field in Table 1 of the metadata quality document may only take one of four values: "manual entry", "system generation", "system calculation", or "system transfer". When the data source is "manual entry", the corresponding "entry function page" content must be supplied; when the data source is "system calculation", the data type must be "statistical data" and the corresponding "statistical caliber" content must be supplied; when the data source is "system transfer", the corresponding "consistency" content must be supplied;
2.2 The [data type] field in Table 1 of the metadata quality document may only take one of two values: "basic data" or "statistical data". If "statistical data" is entered, the data source must be "system calculation", and the corresponding "statistical caliber" must give the calculation formula of the statistical data. If "basic data" is entered, any items left as "/" need to be supplemented and completed. (Fields used for statistical analysis all count as statistical data; all others count as basic data.)
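Rules 2.1 and 2.2 are cross-field constraints and can be checked mechanically. The sketch below is one possible reading, assuming each row of Table 1 is a dictionary with hypothetical keys (`data_source`, `data_type`, `entry_page`, `statistical_caliber`, `consistency`) and English renderings of the allowed values.

```python
# English renderings of the allowed values; the key names are hypothetical.
ALLOWED_SOURCES = {"manual entry", "system generation", "system calculation", "system transfer"}
ALLOWED_TYPES = {"basic data", "statistical data"}


def check_source_and_type(row: dict) -> list:
    """Return the rule 2.1 / 2.2 violations found in one row of Table 1."""
    errors = []
    source, dtype = row.get("data_source"), row.get("data_type")

    def blank(key: str) -> bool:
        # Treat a missing value, an empty string, and "/" as not filled in.
        return (row.get(key) or "").strip() in ("", "/")

    if source not in ALLOWED_SOURCES:
        errors.append(f"invalid data source: {source!r}")
    if dtype not in ALLOWED_TYPES:
        errors.append(f"invalid data type: {dtype!r}")
    if source == "manual entry" and blank("entry_page"):
        errors.append("manual entry requires the entry function page")
    if source == "system calculation" or dtype == "statistical data":
        if not (source == "system calculation" and dtype == "statistical data"):
            errors.append("'system calculation' and 'statistical data' must be used together")
        if blank("statistical_caliber"):
            errors.append("statistical data requires a statistical caliber")
    if source == "system transfer" and blank("consistency"):
        errors.append("system transfer requires a consistency requirement")
    return errors


good_row = {
    "data_source": "system calculation",
    "data_type": "statistical data",
    "statistical_caliber": "sum(load) / count(days)",
}
bad_row = {"data_source": "manual entry", "data_type": "statistical data", "entry_page": "/"}
good_errors = check_source_and_type(good_row)
bad_errors = check_source_and_type(bad_row)
```

Returning a list of messages rather than a boolean lets a reviewer see every violated rule for a row at once.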
2.3 Each field of the metadata quality document should describe its quality-standard constraints comprehensively along five dimensions: "normativity", "integrity", "timeliness", "consistency", and "accuracy". If a dimension has no requirement, fill in "/". In principle, every field must have a corresponding data quality standard, so all five quality dimensions cannot be "/" at the same time:
A. Normativity: the definition and values of the data must meet the relevant specifications, for example for data type, data precision, character length, data format, uniqueness, and codes, and must comply with national standards, industry standards, the company's relevant standards and specifications, and business objectives.
B. Integrity: the data items required by the business must be defined in the system and must be entered; it is generally required that such data items cannot be null.
C. Timeliness: specifies the timeliness and time-period requirements for acquiring, entering, updating, processing, and deleting data between different systems or between different businesses in the same system.
D. Consistency: specifies that the values of the same data item should be consistent between different systems or between different businesses in the same system, and that business and logical relationships between associated data must remain correct and complete.
E. Accuracy: specifies that the data must agree with the actual state of the real-world object it describes, for example that a value does not fall outside a reasonable range.
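The constraint in rule 2.3 that every dimension is filled in, and that not all five are "/", can be sketched as a small check. The dictionary keys are hypothetical English names for the five dimensions.

```python
# Hypothetical English keys for the five quality dimensions.
DIMENSIONS = ("normativity", "integrity", "timeliness", "consistency", "accuracy")
NO_REQUIREMENT = "/"


def check_dimensions(row: dict) -> list:
    """Return rule 2.3 violations for one field of the metadata quality document."""
    missing = [dim for dim in DIMENSIONS if not (row.get(dim) or "").strip()]
    if missing:
        return [f"unfilled dimensions (use '/' for no requirement): {', '.join(missing)}"]
    if all(row[dim].strip() == NO_REQUIREMENT for dim in DIMENSIONS):
        return ["at least one of the five dimensions must state a requirement"]
    return []


valid = {
    "normativity": "value class: (3) volts",
    "integrity": "cannot be empty",
    "timeliness": "/",
    "consistency": "/",
    "accuracy": "(0, n)",
}
all_slash = {dim: "/" for dim in DIMENSIONS}
incomplete = {**valid, "timeliness": ""}
```

Calling `check_dimensions` on `valid` yields no errors, while `all_slash` and `incomplete` each yield one message.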
1) Normativity dimension
Fill in using the format "class: description".
The class values include: code class, amount class, value class, percentage class, date class, time class, date time class, and other class.
When the class is "coding class", the "description" is a coding rule that conforms to the business specification;
when the class is "code class", the "description" takes the form "code value 1=meaning 1, code value 2=meaning 2, ...", such as "code class: 0=suspend, 1=run, 2=archive, 3=invalidate, 4=invalidate";
when the class is "amount class", the "description" is (number of decimal places) + monetary unit, such as "amount class: (4) ten thousand yuan";
when the class is "value class", the "description" is (number of decimal places) + unit of the value, such as "value class: (3) volts";
when the class is "percentage class", the "description" is (number of decimal places)%, such as "percentage class: (2)%";
when the class is "date class", the "description" follows the pattern YYYY-MM-DD, where YYYY represents the year, MM the month, and DD the day, such as "date class: YYYY/MM";
when the class is "time class", the "description" follows the pattern HH:MI:SS.NNN, where HH represents hours, MI minutes, SS seconds, and NNN milliseconds, such as "time class: MI:SS";
when the class is "date time class", the "description" combines the date and time patterns, separated by "T", in the form YYYY-MM-DDTHH:MI:SS.NNN, such as "date time class: MM-DDTHH:MI";
when the class is "other class", the "description" should be comprehensive, accurate, and clear, such as "other class: the telephone number is a landline or mobile number and conforms to the numbering rules of the telecom operator. Domestic landline format: "area code-number", such as 020-888888434; domestic mobile format: an 11-digit number whose first 3 digits fall within the mobile number segments, such as 13900001234; international format: "+country code-area code-number" or "+country code-number", such as +86-020-888801234 or +86-13900001234."
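The "class: description" entries lend themselves to regular-expression checks of the kind the method mentions. The patterns below are a rough sketch under stated assumptions: the class names are English renderings, and the patterns are deliberately permissive (for example, they do not reject a repeated date token), so they are not the rule set actually used.

```python
import re
from typing import Optional

# Permissive token patterns; e.g. "MM/MM" would also pass the date pattern.
DATE = r"(?:YYYY|MM|DD)(?:[-/](?:YYYY|MM|DD))*"
TIME = r"(?:HH|MI|SS)(?::(?:HH|MI|SS))*(?:\.NNN)?"

PATTERNS = {
    "coding class": re.compile(r".+", re.DOTALL),
    "code class": re.compile(r"\w+=[^,]+(?:,\s*\w+=[^,]+)*"),
    "amount class": re.compile(r"\(\d+\)\s*\S.*"),
    "value class": re.compile(r"\(\d+\)\s*\S.*"),
    "percentage class": re.compile(r"\(\d+\)\s*%"),
    "date class": re.compile(DATE),
    "time class": re.compile(TIME),
    "date time class": re.compile(DATE + "T" + TIME),
    "other class": re.compile(r".+", re.DOTALL),
}

# Split an entry at its first colon into the class name and the description.
ENTRY = re.compile(r"\s*(?P<cls>[^:]+?)\s*:\s*(?P<desc>.*\S)\s*", re.DOTALL)


def check_normativity(entry: str) -> Optional[str]:
    """Return an error message for a normativity entry, or None if it looks valid."""
    match = ENTRY.fullmatch(entry)
    if match is None:
        return "entry is not in 'class: description' form"
    cls, desc = match["cls"].lower(), match["desc"]
    pattern = PATTERNS.get(cls)
    if pattern is None:
        return f"unknown class: {cls}"
    if pattern.fullmatch(desc) is None:
        return f"description does not fit {cls}"
    return None
```

For instance, `check_normativity("percentage class: (2)%")` passes, while an entry whose description does not follow the bracketed decimal-places form for its class is reported.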
2.3.1 For fields with suffixes such as code, number, or ID, the normativity should be of the coding class: the applicable coding rule;
2.3.2 For fields with suffixes such as type, category, mode, status, flag, or yes/no, the normativity should be of the code class: 0=enumerated value, 1=enumerated value, 2=enumerated value;
2.3.3 For fields with suffixes such as cost, amount, price, money, or fee, the normativity should be of the amount class: (4) ten thousand yuan (fill in according to the actual situation);
2.3.4 For fields with suffixes such as number, capacity, or quantity, the normativity should be of the value class: (3) volts (fill in according to the actual situation);
2.3.5 For fields with suffixes such as XX rate, XX proportion, growth rate, period-on-period change, or year-on-year change, the normativity should be of the percentage class: (2)% (fill in according to the actual situation);
2.3.6 For fields with suffixes such as date, the normativity should be of the date class: YYYY/MM (fill in according to the actual situation);
2.3.7 For fields with suffixes such as time, the normativity should be of the time class: MI:SS (fill in according to the actual situation);
2.3.8 For fields with suffixes such as date and time, the normativity should be of the date time class: MM-DDTHH:MI (fill in according to the actual situation);
2.3.9 For fields such as telephone number, ID card number, or certificate number, the normativity should be of the other class: the telephone number is a landline or mobile number and conforms to the numbering rules of the telecom operator. Domestic landline format: "area code-number", such as 020-888888434; domestic mobile format: an 11-digit number whose first 3 digits fall within the mobile number segments, such as 13900001234; international format: "+country code-area code-number" or "+country code-number", such as +86-020-8888882334 or +86-13900001234. The description should be comprehensive, accurate, and clear. For a field with no business meaning, such as a remarks field, the entry may be "other class: data type (data length)", filled in according to the actual situation.
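Rules 2.3.1 to 2.3.9 amount to a lookup from field-name suffix to the expected normativity class. The sketch below assumes snake_case English field names and hypothetical suffix tokens; the original rules refer to suffixes of the actual field names, which this does not reproduce.

```python
# Hypothetical English suffix tokens standing in for the suffixes named in the rules.
SUFFIX_RULES = [
    (("code", "id", "no"), "coding class"),
    (("type", "category", "mode", "status", "flag"), "code class"),
    (("cost", "amount", "price", "fee"), "amount class"),
    (("count", "capacity", "quantity"), "value class"),
    (("rate", "ratio", "proportion"), "percentage class"),
    (("date",), "date class"),
    (("time",), "time class"),
    (("datetime",), "date time class"),
]


def expected_class(field_name: str) -> str:
    """Infer the expected normativity class from the last token of a field name."""
    last_token = field_name.lower().split("_")[-1]
    for suffixes, cls in SUFFIX_RULES:
        if last_token in suffixes:
            return cls
    return "other class"  # e.g. telephone numbers or remarks fields
```

Matching on the whole last token, rather than on a trailing substring, avoids classifying a name such as `last_update` as a date field.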
2) Timeliness dimension
Fill in using the format "XX system should transfer XX data to XX system within XX (time point)" or "XX system should transfer XX data to XX system every XX (time period)". For example, "the investment planning system should transfer the new project information from the 26th of the previous month to the 25th of the current month to the financial system on the 25th of each month".
3) Consistency dimension
Fill in using the format "XX (business object information item name) in XX system should be consistent with XX (information item name or description) of XX (business object information item name) in XX system". For example, "the birth year and month should be consistent with the 7th to 14th digits of the ID card number in the system".
4) Integrity dimension
Fill out "cannot be empty".
5) Accuracy dimension
Constraint requirements on the range of values. "[" means greater than or equal to, "]" means less than or equal to, "(" means greater than, ")" means less than, and "n" means no constraint. For example, "(0, n)" means greater than 0.
When these formats cannot express the requirement, the constraint is described comprehensively, accurately, and clearly with reference to the filling format requirements above.
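The range notation of the accuracy dimension can be turned into an executable check. The following is a minimal Python sketch, not part of the embodiment: the function name `parse_range` and the restriction to plain decimal bounds are assumptions made for illustration.

```python
import re
from typing import Callable

# Bracket, lower bound, upper bound, bracket; "n" stands for "no constraint".
_BOUND = r"(n|-?\d+(?:\.\d+)?)"
_RANGE = re.compile(r"^([\[(])\s*" + _BOUND + r"\s*,\s*" + _BOUND + r"\s*([\])])$")


def parse_range(spec: str) -> Callable[[float], bool]:
    """Turn a constraint such as "(0, n)" into a predicate on numeric values."""
    m = _RANGE.match(spec.strip())
    if m is None:
        raise ValueError(f"unrecognised range constraint: {spec!r}")
    left, low, high, right = m.groups()
    lo = None if low == "n" else float(low)
    hi = None if high == "n" else float(high)

    def check(value: float) -> bool:
        # "(" excludes the lower bound, "[" includes it.
        if lo is not None and (value < lo or (left == "(" and value == lo)):
            return False
        # ")" excludes the upper bound, "]" includes it.
        if hi is not None and (value > hi or (right == ")" and value == hi)):
            return False
        return True

    return check
```

With this sketch, `parse_range("(0, n)")` accepts any value greater than 0, and a specification that does not follow the notation raises `ValueError` so that it can be handled by a free-text description instead.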
2.4, the filling personnel and their contact information are filled in completely.
2.5, the data quality standard number is filled in (format: CSG-XX-DQxxxxxx, where "XX" is the initial letters of the system name and "xx" is a two-digit number in the range 01-99), and the data quality standard numbers in Table 1 are consistent in number and content with those in Table 2.
The above is only one specific example of metadata quality check rules in a smart grid application scenario; in practice the rules can be designed according to specific service requirements.
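The format classes in items 2.3.6 to 2.3.9 lend themselves to pattern checks. The following Python sketch is illustrative only: the table `FORMAT_PATTERNS`, the function `matches_format`, and the exact patterns are assumptions. In particular, the landline pattern assumes a 3- or 4-digit area code and does not fix the length of the subscriber number, and the mobile pattern only checks for 11 digits starting with 1 rather than a real list of number segments.

```python
import re

# Hypothetical patterns for the format classes; adjust to the actual standard.
FORMAT_PATTERNS = {
    "date": re.compile(r"\d{4}/(0[1-9]|1[0-2])"),                 # YYYY/MM
    "time": re.compile(r"[0-5]\d:[0-5]\d"),                       # MI:SS
    "datetime": re.compile(
        r"(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])T([01]\d|2[0-3]):[0-5]\d"
    ),                                                            # MM-DDTHH:MI
    "mobile": re.compile(r"1\d{10}"),                             # 11 digits
    "landline": re.compile(r"0\d{2,3}-\d+"),                      # area code-number
    "international": re.compile(r"\+\d{1,3}(-\d+){1,2}"),         # +country code-...
}


def matches_format(kind: str, value: str) -> bool:
    """Return True when the whole value matches the pattern of the given class."""
    return FORMAT_PATTERNS[kind].fullmatch(value) is not None
```

For instance, `matches_format("mobile", "13900001234")` succeeds, while a 10-digit value fails the same check.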
This embodiment provides a metadata quality determining method, comprising: determining a metadata process document according to the metadata change content; determining metadata quality according to the metadata process document, and generating a metadata quality document, the metadata quality document being used for indicating the metadata quality; and checking the metadata quality according to the metadata quality document. The corresponding metadata process document is determined according to the change content of the metadata, the quality of the changed metadata is determined according to the metadata process document, the corresponding metadata quality document is generated, and the metadata quality is verified. This raises the standardization level of metadata quality determination and thereby improves its efficiency and accuracy.
Example two
Referring to fig. 2, fig. 2 is a flowchart illustrating a metadata quality determining method according to a second embodiment of the present invention. As shown in fig. 2, on the basis of any other embodiment, the method includes:
S201, determining a metadata process document according to metadata change content;
S202, determining the data type, the data coding and the data annotation of the metadata through a regular expression according to the metadata process document;
Regular expressions can segment text and match patterns within it, so the data type, data coding, and data annotation at the relevant positions of the metadata can be determined by regular expressions; they may also be determined by a preset algorithm for specific values and semantics, or manually by a technician.
S203, determining the quality of the metadata according to the data type, the data coding and the data annotation of the metadata, and generating a metadata quality document;
The metadata quality document may be generated based on the data type, data coding, and data annotation of the metadata as segmented by the regular expression.
As an alternative embodiment, the generating the metadata quality document includes:
Determining a document format of the metadata quality document;
and generating the metadata quality document through data stream export according to the document format.
According to the file format of the metadata quality document and the form required for its specific content, corresponding data stream export rules can be determined, and the metadata quality document is exported, by means of data stream technology, from the database that stores the data related to the metadata process document.
The corresponding export standard is determined by presetting the document format of the metadata quality document, and the data is exported into the metadata quality document in a standard form through the data stream, so that the standardization level of metadata quality determination is improved, and the efficiency and accuracy of metadata quality determination are further improved.
S204, checking the quality of the metadata according to the metadata quality document.
For a specific description of S201 and S204 in this embodiment, reference may be made to the detailed description of S101 and S103 in the first embodiment, which is not repeated here.
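The data stream export of the metadata quality document in S203 can be sketched as follows. This is a minimal Python illustration under stated assumptions: the table name `metadata_quality`, the column list `COLUMNS`, the CSV document format, and the function `export_quality_document` are all hypothetical, and SQLite stands in for whatever database holds the data of the metadata process document.

```python
import csv
import sqlite3

# Assumed document format: a fixed, ordered list of columns written as CSV.
COLUMNS = ["field_name", "data_type", "annotation", "quality"]


def export_quality_document(conn: sqlite3.Connection, out) -> int:
    """Stream rows from the (hypothetical) metadata_quality table into `out`."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(COLUMNS)
    query = "SELECT " + ", ".join(COLUMNS) + " FROM metadata_quality ORDER BY field_name"
    count = 0
    # Iterating over the cursor fetches rows one at a time instead of
    # loading the whole result set into memory.
    for row in conn.execute(query):
        writer.writerow(row)
        count += 1
    return count
```

Because the column list is fixed in advance, every exported document has the same layout, which is the point of presetting the document format before export.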
As an optional implementation manner, the verifying the metadata quality according to the metadata quality document includes:
determining a metadata quality check flow;
and according to the metadata quality document, checking the metadata quality through the metadata quality checking flow to obtain a checking result.
As described in the first embodiment, the metadata quality verification process may be a predetermined set of metadata quality verification criteria, or a set of intelligent algorithms designed for metadata quality verification, which automatically verifies the content segmented and matched by the regular expression to obtain a corresponding verification result.
By determining the metadata quality verification process, further metadata quality verification is carried out on the metadata quality document according to the preset metadata quality verification process, the standardization level of metadata quality determination is improved, and further the efficiency and accuracy of metadata quality determination are improved.
As an optional implementation manner, after verifying the metadata quality according to the metadata quality document, the method further includes:
if the data quality of at least one piece of target metadata does not meet the preset requirement, marking the target metadata;
and correcting the target metadata according to a preset correction flow.
After the metadata quality check is completed, automated marking and/or automated correction may also be performed. For example, when the data quality of at least one piece of target metadata does not meet the preset requirement, the target metadata can be marked and then processed manually by a technician; alternatively, after marking, the target metadata can be corrected automatically according to a preset automatic correction flow corresponding to the data quality requirement.
The marked metadata is automatically corrected according to a preset flow by marking the metadata which does not accord with the preset standard, so that the standardization and automation level of the metadata quality determination are improved, and the efficiency and accuracy of the metadata quality determination are further improved.
In this metadata quality determining method, regular expression matching is used to determine the data type, data coding, and data annotation of the metadata reflected in the metadata process document; the metadata quality is then determined and the metadata quality document is generated. This raises the standardization level of metadata quality determination and thereby improves its efficiency and accuracy.
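As an illustration of S202 and S203, the sketch below extracts the data type, data coding, and data annotation from one line of a metadata process document and derives simple quality findings. The line layout `<name> <TYPE>[(<length>)] <coding> -- <annotation>`, the names `FIELD_LINE`, `parse_field`, and `quality_issues`, and the two rules are assumptions for illustration; the embodiment does not fix a concrete document layout.

```python
import re
from typing import Optional

# Hypothetical line layout: <name> <TYPE>[(<length>)] <coding> -- <annotation>
FIELD_LINE = re.compile(
    r"^(?P<name>\w+)\s+"
    r"(?P<type>[A-Za-z]+)(?:\((?P<length>[\d,\s]+)\))?\s+"
    r"(?P<coding>[\w-]+)"
    r"(?:\s+--\s*(?P<annotation>.*))?$"
)


def parse_field(line: str) -> Optional[dict]:
    """Split one field line into name, type, length, coding and annotation."""
    m = FIELD_LINE.match(line.strip())
    return m.groupdict() if m else None


def quality_issues(field: dict) -> list:
    """Derive illustrative quality findings from the parsed parts."""
    issues = []
    if not field.get("annotation"):
        issues.append("missing annotation")
    if field["type"].upper() in {"VARCHAR", "CHAR"} and not field.get("length"):
        issues.append("missing length")
    return issues
```

A line that does not follow the assumed layout yields `None`, which can itself be recorded as a finding or passed to a technician for manual review.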
Example three
An embodiment of the present invention further provides a metadata quality determining apparatus for implementing the foregoing method. Referring to fig. 3, fig. 3 is a schematic structural diagram of a metadata quality determining apparatus disclosed in the embodiment of the present invention. As shown in fig. 3, on the basis of any other embodiment, the apparatus includes:
a document determining module 31 for determining a metadata process document according to the metadata change content;
a quality determination module 32 for determining metadata quality from the metadata process document and generating a metadata quality document; the metadata quality document is used for indicating the metadata quality;
a quality checking module 33 for checking the metadata quality according to the metadata quality document.
Corresponding metadata process documents are determined according to the changing content of the metadata, the changed metadata quality is determined according to the metadata process documents, corresponding metadata quality documents are generated, metadata quality verification is carried out, the standardization level of metadata quality determination is improved, and the efficiency and accuracy of metadata quality determination are further improved.
Referring to fig. 4, fig. 4 is a schematic structural diagram of another metadata quality determining apparatus according to a third embodiment of the present invention. As shown in fig. 4, as an alternative embodiment, the apparatus further includes a reference library establishment module 34 for, before the document determination module 31 determines the metadata process document according to the metadata change content,
establishing a metadata reference library; the metadata reference library is used for storing the metadata;
and determining the metadata change content according to the metadata reference library.
By establishing a metadata reference library for storing metadata, the stored metadata serves as the baseline against which metadata changes are compared. This raises the standardization level of metadata quality determination and thereby improves its efficiency and accuracy.
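Determining the metadata change content from the reference library amounts to comparing the current metadata against the stored baseline. A minimal Python sketch, assuming both are available as dictionaries that map a field name to its definition (the function `diff_metadata` and this representation are hypothetical):

```python
def diff_metadata(reference: dict, current: dict) -> dict:
    """Compare current metadata with the reference library baseline."""
    added = {k: current[k] for k in current.keys() - reference.keys()}
    removed = {k: reference[k] for k in reference.keys() - current.keys()}
    changed = {
        k: (reference[k], current[k])
        for k in reference.keys() & current.keys()
        if reference[k] != current[k]
    }
    return {"added": added, "removed": removed, "changed": changed}
```

The three parts of the result correspond to new fields, deleted fields, and fields whose definition differs from the baseline, which together could feed a metadata change manifest.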
As an alternative embodiment, the metadata process document includes at least one of: a metadata change manifest, a metadata responsibility matrix, and a metadata quality manifest;
the specific manner in which the document determination module 31 determines the metadata process document according to the metadata change content includes:
determining the metadata change manifest according to the data attribute of the metadata;
determining the metadata responsibility matrix according to the user relation of the metadata;
determining the metadata quality manifest according to the data content of the metadata;
and determining the metadata process document according to the metadata change manifest, the metadata responsibility matrix and the metadata quality manifest.
The metadata change manifest is determined from attributes of the metadata such as data format and source, the metadata responsibility matrix is determined from the user management and user authority relationships of the metadata, and the metadata quality manifest is determined from the data content of the metadata; the metadata process document is then determined from these three. This raises the standardization level of metadata quality determination and thereby improves its efficiency and accuracy.
As an alternative embodiment, the specific manner in which the quality determination module 32 determines the metadata quality from the metadata process document and generates the metadata quality document includes:
determining the data type, the data coding and the data annotation of the metadata through a regular expression according to the metadata process document;
and determining the quality of the metadata according to the data type, the data coding and the data annotation of the metadata, and generating a metadata quality document.
The data type, the data coding and the data annotation of the metadata reflected in the metadata process document are determined through matching of the regular expression, so that the metadata quality is determined, the metadata quality document is generated, the standardization level of the metadata quality determination is improved, and the efficiency and the accuracy of the metadata quality determination are improved.
As an alternative embodiment, the specific manner in which the quality determination module 32 generates the metadata quality document includes:
determining a document format of the metadata quality document;
and generating the metadata quality document through data stream export according to the document format.
The corresponding export standard is determined by presetting the document format of the metadata quality document, and the data is exported into the metadata quality document in a standard form through the data stream, so that the standardization level of metadata quality determination is improved, and the efficiency and accuracy of metadata quality determination are further improved.
As an alternative embodiment, the specific manner in which the quality checking module 33 checks the metadata quality according to the metadata quality document includes:
determining a metadata quality check flow;
and according to the metadata quality document, checking the metadata quality through the metadata quality checking flow to obtain a checking result.
By determining the metadata quality verification process, further metadata quality verification is carried out on the metadata quality document according to the preset metadata quality verification process, the standardization level of metadata quality determination is improved, and further the efficiency and accuracy of metadata quality determination are improved.
As shown in fig. 4, as an alternative embodiment, the apparatus further comprises a correction module 35 for, after the quality checking module 33 checks the metadata quality according to the metadata quality document,
if the data quality of at least one piece of target metadata does not meet the preset requirement, marking the target metadata;
and correcting the target metadata according to a preset correction flow.
Metadata that does not meet the preset standard is marked, and the marked metadata is corrected automatically according to a preset flow. This raises the standardization and automation level of metadata quality determination and thereby improves its efficiency and accuracy.
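The marking and correction steps can be sketched in Python as follows. The class `MetadataItem`, the rule table `CHECKS`, the correction table `CORRECTIONS`, and both example rules are hypothetical; the embodiment leaves the preset requirements and the preset correction flow open.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class MetadataItem:
    name: str
    annotation: str
    flags: List[str] = field(default_factory=list)


def _lowercase_name(item: MetadataItem) -> None:
    item.name = item.name.lower()


# Hypothetical preset requirements: rule name -> predicate that must hold.
CHECKS: Dict[str, Callable[[MetadataItem], bool]] = {
    "name_is_lowercase": lambda item: item.name == item.name.lower(),
    "annotation_not_empty": lambda item: bool(item.annotation.strip()),
}
# Hypothetical preset correction flow: only some rules can be fixed automatically.
CORRECTIONS: Dict[str, Callable[[MetadataItem], None]] = {
    "name_is_lowercase": _lowercase_name,
}


def mark_and_correct(items: List[MetadataItem]) -> List[MetadataItem]:
    """Mark failing items, auto-correct where possible, return the rest."""
    needs_manual = []
    for item in items:
        # Mark: record every rule the item fails.
        item.flags = [rule for rule, passes in CHECKS.items() if not passes(item)]
        # Correct: apply the preset correction where one exists, then re-check.
        for rule in list(item.flags):
            fix = CORRECTIONS.get(rule)
            if fix is not None:
                fix(item)
                if CHECKS[rule](item):
                    item.flags.remove(rule)
        if item.flags:
            needs_manual.append(item)
    return needs_manual
```

Items returned by `mark_and_correct` keep their flags, so a technician can see which requirement each one still fails.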
Example four
Referring to fig. 5, fig. 5 is a schematic diagram of a metadata quality determining apparatus according to a fourth embodiment of the present invention. As shown in fig. 5, the metadata quality determination apparatus may include:
a processor 291 and a memory 292 in which executable program code is stored; a communication interface 293 and a bus 294 may also be included. The processor 291, the memory 292, and the communication interface 293 may communicate with each other via the bus 294. The communication interface 293 may be used for information transfer. The processor 291 is coupled to the memory 292 and may call the logic instructions (executable program code) in the memory 292 to perform the metadata quality determination method described in any of the embodiments above.
Further, the logic instructions in memory 292 described above may be implemented in the form of software functional units and stored in a computer-readable storage medium when sold or used as a stand-alone product.
The memory 292 is a computer readable storage medium, and may be used to store a software program, a computer executable program, and program instructions/modules corresponding to the methods in the embodiments of the present application. The processor 291 executes functional applications and data processing by running software programs, instructions and modules stored in the memory 292, i.e., implements the methods of the method embodiments described above.
The memory 292 may include a program storage area and a data storage area; the program storage area may store an operating system and an application program required for at least one function; the data storage area may store data created according to the use of the terminal device, etc. Further, the memory 292 may include high-speed random access memory, and may also include non-volatile memory.
Embodiments of the present invention also provide a computer-readable storage medium having stored therein computer-executable instructions that, when invoked, are adapted to implement the method described in any of the embodiments.
Embodiments of the present invention also disclose a computer program product comprising a non-transitory computer readable storage medium storing a computer program, and the computer program is operable to cause a computer to perform the steps of the metadata quality determination method described in any of the embodiments.
The apparatus embodiments described above are merely illustrative: the modules described as separate components may or may not be physically separate, and the components shown as modules may or may not be physical modules, i.e., they may be located in one place or distributed over a plurality of network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the present invention without creative effort.
From the above detailed description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus a necessary general hardware platform, or of course by means of hardware. Based on such understanding, the foregoing technical solutions may be embodied, in essence or in part, in the form of a software product that may be stored in a computer-readable storage medium, including Read-Only Memory (ROM), Random Access Memory (RAM), Programmable Read-Only Memory (PROM), Erasable Programmable Read-Only Memory (EPROM), One-Time Programmable Read-Only Memory (OTPROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Compact Disc Read-Only Memory (CD-ROM) or other optical disc memory, magnetic disc memory, tape memory, or any other computer-readable medium that can be used to carry or store data.
Finally, it should be noted that the metadata quality determining method and apparatus disclosed in the embodiments of the present invention are only preferred embodiments of the present invention, used to illustrate its technical solution and not to limit it. Although the invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions recorded in the various embodiments can still be modified, or some of their technical features can be replaced by equivalents, and that such modifications and substitutions do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments.

Claims (10)

1. A method of metadata quality determination, the method comprising:
determining a metadata process document according to the metadata change content;
determining metadata quality according to the metadata process document, and generating a metadata quality document; the metadata quality document is used for indicating the metadata quality;
and checking the metadata quality according to the metadata quality document.
2. The method of claim 1, wherein prior to determining the metadata process document based on the metadata change content, the method further comprises:
establishing a metadata reference library; the metadata reference library is used for storing the metadata;
and determining the metadata change content according to the metadata reference library.
3. The method of claim 1, wherein the metadata process document comprises at least one of: a metadata change manifest, a metadata responsibility matrix, and a metadata quality manifest;
the determining the metadata process document according to the metadata change content comprises the following steps:
determining the metadata change manifest according to the data attribute of the metadata;
determining the metadata responsibility matrix according to the user relation of the metadata;
determining the metadata quality manifest according to the data content of the metadata;
and determining the metadata process document according to the metadata change manifest, the metadata responsibility matrix and the metadata quality manifest.
4. The method of claim 1, wherein determining a metadata quality from the metadata process document and generating a metadata quality document comprises:
determining the data type, the data coding and the data annotation of the metadata through a regular expression according to the metadata process document;
and determining the quality of the metadata according to the data type, the data coding and the data annotation of the metadata, and generating a metadata quality document.
5. The method of claim 4, wherein generating a metadata quality document comprises:
determining a document format of the metadata quality document;
and generating the metadata quality document through data stream export according to the document format.
6. The method of claim 1, wherein verifying metadata quality from the metadata quality document comprises:
determining a metadata quality check flow;
and according to the metadata quality document, checking the metadata quality through the metadata quality checking flow to obtain a checking result.
7. The method of any of claims 1-6, wherein after verifying metadata quality from the metadata quality document, the method further comprises:
if the data quality of at least one piece of target metadata does not meet the preset requirement, marking the target metadata;
and correcting the target metadata according to a preset correction flow.
8. A metadata quality determination apparatus, the apparatus comprising:
a document determining module, used for determining a metadata process document according to the metadata change content;
a quality determining module, used for determining the quality of the metadata according to the metadata process document and generating a metadata quality document; the metadata quality document is used for indicating the metadata quality;
and a quality checking module, used for checking the quality of the metadata according to the metadata quality document.
9. A metadata quality determination apparatus, the apparatus comprising:
a memory storing executable program code;
a processor coupled to the memory;
the processor invokes the executable program code stored in the memory to perform the metadata quality determination method of any of claims 1-7.
10. A computer storage medium storing computer instructions which, when invoked, are operable to perform the metadata quality determination method of any one of claims 1-7.
CN202211705259.7A 2022-12-29 2022-12-29 Metadata quality determination method and device Pending CN116204492A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211705259.7A CN116204492A (en) 2022-12-29 2022-12-29 Metadata quality determination method and device

Publications (1)

Publication Number Publication Date
CN116204492A true CN116204492A (en) 2023-06-02

Family

ID=86512119

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211705259.7A Pending CN116204492A (en) 2022-12-29 2022-12-29 Metadata quality determination method and device

Country Status (1)

Country Link
CN (1) CN116204492A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination