WO2021135280A1 - Data check method for distributed storage system, and related apparatus - Google Patents

Publication number
WO2021135280A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
target data
crc check
check value
preset
Prior art date
Application number
PCT/CN2020/110952
Other languages
French (fr)
Chinese (zh)
Inventor
冯龙
何营
白战豪
Original Assignee
浪潮电子信息产业股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 浪潮电子信息产业股份有限公司 filed Critical 浪潮电子信息产业股份有限公司
Publication of WO2021135280A1 publication Critical patent/WO2021135280A1/en

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/07: Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08: Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10: Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1004: Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's to protect a block of data words, e.g. CRC or checksum
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27: Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor

Definitions

  • This application relates to the field of distributed storage technology, in particular to a data verification method of a distributed storage system; it also relates to a data verification device, equipment and computer-readable storage medium of a distributed storage system.
  • Mass distributed storage systems usually consist of thousands of hard disks and dozens or hundreds of storage nodes. Each hard disk is managed by an independent storage engine, that is, managed by an independent process. If there are errors in the stored data due to hard disk hardware errors, firmware defects, power supply temperature, etc., it is usually difficult to obtain such anomalies in a timely and effective manner. In a multi-copy distributed storage system, if the wrong data is not found and recovered in time, it may lead to the risk of data loss or data inconsistency. Therefore, it is particularly important to verify the data and find the wrong data in time. At present, the technical solution for data verification is to calculate the CRC (Cyclic Redundancy Check) check value of the data of multiple copies, and compare the CRC check values between the copies.
  • CRC: Cyclic Redundancy Check
  • the purpose of this application is to provide a data verification method and related device for a distributed storage system, which can accurately identify the location of data abnormalities and ensure the effective execution of data verification.
  • this application provides a data verification method for a distributed storage system, including:
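  • when writing the target data, calculating the CRC check value of the data of each preset check interval length in the target data includes:
  • if the start position and the end position of the write interval corresponding to the target data are integer multiples of the preset check interval length, the CRC check value of the data of each preset check interval length in the target data is calculated directly.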
  • when writing the target data, calculating the CRC check value of the data of each preset check interval length in the target data includes:
  • if the start position and/or end position of the write interval of the target data is not an integer multiple of the preset check interval length, the written data from a position before the start position that is an integer multiple of the preset check interval length up to the start position is read from the hard disk, and/or the written data from the end position up to a position after the end position that is an integer multiple of the preset check interval length is read from the hard disk; the CRC check value of the data of each preset check interval length in the written data and the target data is then calculated.
  • storing the second CRC check value in the database includes:
  • the second CRC check value and the offset in the object of the data corresponding to the second CRC check value are stored in the database in the form of a map.
  • the reading target data from the hard disk includes:
  • the target data is read from the hard disk when a read request is received or when an active verification is triggered.
  • when a read request is received, reading the target data from the hard disk and calculating the CRC check value of the data of each preset check interval length in the target data includes:
  • if the start position and the end position of the read interval corresponding to the target data are integer multiples of the preset check interval length, the target data is read directly and the CRC check value of the data of each preset check interval length in the target data is calculated;
  • if the start position and/or end position of the read interval of the target data is not an integer multiple of the preset check interval length, the data from a position before the start position that is an integer multiple of the preset check interval length up to the start position is read from the hard disk together with the target data, and/or the data from the end position up to a position after the end position that is an integer multiple of the preset check interval length is read together with the target data; the CRC check value of the data of each preset check interval length in the read data and the target data is then calculated.
  • it also includes:
  • if an error occurs in the target data, the target data is added to the reconstruction queue for data reconstruction.
  • this application also provides a data verification device for a distributed storage system, including:
  • the calculation module is used to read the target data from the hard disk, and calculate the CRC check value of the data of each preset check interval length in the target data to obtain the first CRC check value;
  • the reading module is used to read, from the database, the second CRC check value that was obtained when the target data was written by calculating the CRC check value of the data of each preset check interval length in the target data;
  • the comparison module is configured to compare each of the first CRC check values with the corresponding second CRC check value;
  • the determining module is configured to determine that an error occurs in the target data in the hard disk if the first CRC check value is inconsistent with the corresponding second CRC check value.
  • this application also provides a data verification device for a distributed storage system, including:
  • Memory used to store computer programs
  • the processor is used to implement the steps of the data verification method of the distributed storage system as described above when the computer program is executed.
  • the present application also provides a computer-readable storage medium that stores a computer program; when the computer program is executed by a processor, it implements the steps of the data verification method of the distributed storage system described above.
  • the data verification method of the distributed storage system includes: reading target data from a hard disk, and calculating the CRC check value of the data of each preset check interval length in the target data to obtain the first CRC check value; reading, from the database, the second CRC check value that was obtained when the target data was written by calculating the CRC check value of the data of each preset check interval length in the target data; comparing each first CRC check value with the corresponding second CRC check value; and, if a first CRC check value is inconsistent with the corresponding second CRC check value, determining that an error has occurred in the target data on the hard disk.
  • the data verification method calculates the CRC check value of the data to be written whenever data is written, and stores it in the database.
  • during subsequent data verification, the CRC check value of the data to be checked is calculated and compared with the corresponding CRC check value stored in the database to determine whether the data has an error.
  • in this method the storage engine performs data verification independently, so verification can be completed without using the data in the copies held by other nodes. The method can therefore both accurately identify the location of abnormal data and ensure that data verification is carried out effectively.
  • the data verification device, equipment, and computer-readable storage medium of the distributed storage system provided in this application all have the above technical effects.
  • FIG. 1 is a schematic flowchart of a data verification method of a distributed storage system provided by an embodiment of the application
  • FIG. 2 is a schematic diagram of calculating a CRC check value provided by an embodiment of the application.
  • FIG. 3 is another schematic diagram of calculating a CRC check value provided by an embodiment of the application.
  • FIG. 4 is a schematic diagram of a data verification device of a distributed storage system provided by an embodiment of the application.
  • FIG. 5 is another schematic diagram of calculating a CRC check value provided by an embodiment of the application.
  • the core of this application is to provide a data verification method and related device for a distributed storage system, which can accurately identify the location of data abnormalities and ensure the effective execution of data verification.
  • FIG. 1 is a schematic flowchart of a data verification method of a distributed storage system according to an embodiment of the application; referring to FIG. 1, the data verification method includes:
  • S101 Read the target data from the hard disk, and calculate the CRC check value of the data of each preset check interval length in the target data to obtain the first CRC check value;
  • this application provides a data verification method in which the storage engine performs data verification independently. After the data check starts, the data that needs to be checked, that is, the above target data, is read from the hard disk; the CRC check value of the data of each preset check interval length in the target data is then calculated to obtain one or more first CRC check values. The data segments over which the CRC check values are calculated do not overlap.
  • the preset check interval length is the length of the data over which one CRC check value is calculated. For example, a preset check interval length of 4KB means that one CRC check value is calculated for every 4KB of data.
  • if the length of the target data is 8KB, two first CRC check values are calculated.
  • the specific value of the length of the preset check interval is not limited in this application, and can be set differently according to actual needs.
  • the specific calculation method for calculating the CRC check value of the data is not described in detail here in this application, and the existing related technology can be referred to.
  • before the data is written to the hard disk, the CRC check value of the data of each preset check interval length in the data to be written is calculated to obtain the second CRC check value, and the calculated second CRC check value is stored in the database.
  • the second CRC check value calculated and stored in the database when the target data is written is further read from the database, and subsequently based on each first CRC check value and The corresponding second CRC check value determines whether the target data has an error.
  • when writing the target data, calculating the CRC check value of the data of each preset check interval length in the target data includes: if the start position and the end position of the write interval corresponding to the target data are integer multiples of the preset check interval length, directly calculating the CRC check value of the data of each preset check interval length in the target data.
  • this embodiment corresponds to the case where the target data is aligned data.
  • the so-called aligned data refers to data in which the start position and the end position of the writing interval in the data buffer are both integral multiples of the length of the preset check interval.
  • if the target data is aligned data, the target data is divided directly into data fragments of the preset check interval length, and the CRC check value of each data fragment is calculated.
  • for example, with a preset check interval length of 4KB, the start position of the data write interval is 4KB, the end position is 16KB, and the write length is 12KB. The target data is divided into 3 data fragments, and the CRC check value of each data fragment is calculated, giving 3 CRC check values: CRC1 to CRC3.
  • when writing the target data, calculating the CRC check value of the data of each preset check interval length in the target data includes: if the start position and/or end position of the write interval of the target data is not an integer multiple of the preset check interval length, reading from the hard disk the written data from a position before the start position that is an integer multiple of the preset check interval length up to the start position, and/or reading from the hard disk the written data from the end position up to a position after the end position that is an integer multiple of the preset check interval length; and calculating the CRC check value of the data of each preset check interval length in the written data and the target data.
  • this embodiment corresponds to the case where the target data is non-aligned data.
  • the so-called non-aligned data refers to data whose starting position and/or ending position of the writing interval in the data buffer is not an integer multiple of the length of the preset check interval.
  • if the target data is unaligned data, data is first read from the hard disk into the data buffer to fill in the unaligned part. Specifically, if only the start position of the write interval is not an integer multiple of the preset check interval length, the written data from the position before the start position that is an integer multiple of the preset check interval length up to the start position is read from the hard disk.
  • if only the end position of the write interval is not an integer multiple of the preset check interval length, the written data from the end position up to the position after the end position that is an integer multiple of the preset check interval length is read from the hard disk.
  • if neither the start position nor the end position is an integer multiple of the preset check interval length, both the written data before the start position and the written data after the end position, as described above, are read from the hard disk.
  • the data read from the hard disk and the target data are then treated as a whole, and the CRC check value of the data of each preset check interval length is calculated. Preferably, only the minimum amount of data needed to fill in the unaligned part is read from the hard disk.
  • for example, with a preset check interval length of 4KB, the start position of the data write interval is 3KB, the end position is 15KB, and the write length is 12KB. Neither the start position nor the end position is an integer multiple of the preset check interval length, so the written data from 0KB to 3KB and the written data from 15KB to 16KB are first read from the hard disk. The data from 0KB to 16KB is then treated as a whole and divided into 4 data fragments, and a second CRC check value is calculated for each fragment: 0KB to 4KB, 4KB to 8KB, 8KB to 12KB, and 12KB to 16KB.
  • the calculated second CRC check value is further stored in the database, and the target data is written to the hard disk.
  • the storing of the second CRC check value in the database includes: the second CRC check value and the offset of the data corresponding to the second CRC check value in the object are stored in the database in the form of a map.
  • the data on the hard disk includes business data and database data that records index information and metadata information.
  • Business data is stored on the hard disk in units of objects, and the metadata, index information, and extended attributes of the objects are stored in the database.
  • this embodiment adds content called verification data to the object metadata to record the verification information of the data stored on the disk for a certain object.
  • the second CRC check value of the data of each preset check interval length is calculated and recorded in the Key-Value database, where Key is the object name and Value is the object metadata information.
  • the CRC check value of the object and the offset of the data corresponding to the CRC check value in the object are recorded in the database in the form of a map.
  • the organization of the map is as follows:
  • offset characterizes the offset of the data corresponding to the CRC check value in the object.
  • CRC_LENGTH is the length of the preset check interval.
  • the first CRC check value of each data fragment of the target data is compared with the second CRC check value of the same data fragment; if the first CRC check value of one or more data fragments is inconsistent with the corresponding second CRC check value, it is determined that the target data has an error. Conversely, if the first CRC check value of every data fragment is consistent with the corresponding second CRC check value, the target data has no error.
  • the above-mentioned reading of target data from the hard disk includes: reading the target data from the hard disk when a read request is received or when an active verification is triggered.
  • in this embodiment, data verification is performed at two times: when a read request is received and the target data is read (in this case the target data is the data to be read), and when active verification is triggered, in which case the target data is read and then verified.
  • when a read request is received, reading the target data from the hard disk and calculating the CRC check value of the data of each preset check interval length in the target data includes: determining the start position and the end position of the read interval corresponding to the target data according to the read request; if the start position and the end position of the read interval are integer multiples of the preset check interval length, directly reading the target data and calculating the CRC check value of the data of each preset check interval length in the target data; if the start position and/or end position of the read interval is not an integer multiple of the preset check interval length, reading from the hard disk the target data together with the data from a position before the start position that is an integer multiple of the preset check interval length up to the start position, and/or the data from the end position up to a position after the end position that is an integer multiple of the preset check interval length; and calculating the CRC check value of the data of each preset check interval length in the read data and the target data.
  • whether the target data is aligned data is determined according to the read request. If the target data is aligned data, the data of the corresponding offset and length is read directly from the hard disk, and the CRC check value of the data of each preset check interval length in the target data is calculated. If the target data is non-aligned data, other data is read from the hard disk in addition to the target data, so that the target data and the additionally read data together constitute aligned data. The target data and the additionally read data are then treated as a whole, and the CRC check value of the data of each preset check interval length is calculated.
  • for example, the start position of the read interval corresponding to the target data is 3KB and the end position is 15KB. Neither is an integer multiple of the preset check interval length, so the data from 0KB to 3KB and the data from 15KB to 16KB are also read from the hard disk. The data from 0KB to 16KB is treated as a whole, and the CRC check value of each 4KB of data is calculated.
  • the target data and the additionally read data are treated as a whole and the first CRC check values are calculated; each calculated first CRC check value is compared with the corresponding second CRC check value.
  • the additionally read data is then trimmed from the data buffer, and only the target data is returned to the client.
  • the verification period, that is, the above-mentioned preset period, can be configured in advance; for example, data verification is triggered every other week. When the check time is reached according to the preset period, data verification is triggered: the objects in the database are traversed, the object data, which is the target data, is read, and the first CRC check value of the target data is calculated. The second CRC check value stored in the database is read and compared with the corresponding first CRC check value.
  • for the method of calculating the first CRC check value of the object data, reference may be made to the above-mentioned embodiment; it is not repeated in this application.
  • if the target data has an error, the target data is added to the reconstruction queue for data reconstruction.
  • the data reconstruction process of the distributed storage system is triggered, and the erroneous data is added to the data reconstruction queue as missing data to perform data reconstruction.
  • normal data is read from other copies and written to the local storage engine to restore the local data.
  • the data reconstruction process can use the original reconstruction process of the distributed storage system, which will not be repeated in this application.
  • the data verification method calculates the CRC check value of the data to be written every time data is written and stores it in the database; during subsequent data verification, it calculates the CRC check value of the data to be verified and compares it with the corresponding CRC check value stored in the database to determine whether the data has an error.
  • in this method the storage engine performs data verification independently, so verification can be completed without using the data in the copies held by other nodes. The method can therefore both accurately identify the location of abnormal data and ensure that data verification is carried out effectively.
  • the present application also provides a data verification device of a distributed storage system; the device described below and the method described above may be read with reference to each other.
  • the data verification device includes:
  • the calculation module is used to read the target data from the hard disk, and calculate the CRC check value of the data of each preset check interval length in the target data to obtain the first CRC check value;
  • the reading module is used to read, from the database, the second CRC check value that was obtained when the target data was written by calculating the CRC check value of the data of each preset check interval length in the target data;
  • the comparison module is configured to compare each of the first CRC check values with the corresponding second CRC check value;
  • the determining module is configured to determine that an error occurs in the target data in the hard disk if the first CRC check value is inconsistent with the corresponding second CRC check value.
  • This application also provides a data verification device for a distributed storage system, the data verification device comprising: a memory and a processor; wherein the memory is used to store a computer program; the processor is used to execute the computer program to implement the following steps :
  • This application also provides a computer-readable storage medium with a computer program stored on the computer-readable storage medium, and when the computer program is executed by a processor, the following steps are implemented:
  • the computer-readable storage medium may include any medium that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disk.
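
The read path described earlier in this section (widening an unaligned read interval to check-interval boundaries, verifying each interval, trimming the extra data, and queuing erroneous data for reconstruction) can be sketched as follows. This is only an illustration under stated assumptions: `zlib.crc32` stands in for the unspecified CRC algorithm, the 4KB interval is taken from the examples, a `bytearray` stands in for the hard disk, and the names `build_crc_map`, `verified_read`, and `rebuild_queue`, as well as returning `None` on a mismatch, are hypothetical choices not taken from the application.

```python
import zlib
from collections import deque

CHECK_INTERVAL = 4 * 1024  # assumed preset check interval length (4KB)


def build_crc_map(disk, interval=CHECK_INTERVAL):
    """Hypothetical helper: CRC of every interval on the stand-in disk."""
    return {
        off: zlib.crc32(bytes(disk[off:off + interval]))
        for off in range(0, len(disk), interval)
    }


def verified_read(disk, crc_map, start, end, rebuild_queue,
                  interval=CHECK_INTERVAL):
    """Read [start, end), verifying every check interval that it touches."""
    # Widen the read interval to the enclosing interval boundaries.
    a_start = (start // interval) * interval
    a_end = -(-end // interval) * interval  # round up
    buf = bytes(disk[a_start:a_end])

    # First CRC values come from the data just read; second CRC values
    # come from the stored map.
    bad = [
        a_start + i
        for i in range(0, len(buf), interval)
        if zlib.crc32(buf[i:i + interval]) != crc_map.get(a_start + i)
    ]
    if bad:
        # Assumption: erroneous data is queued and nothing is returned.
        rebuild_queue.append((start, end, bad))
        return None

    # Trim the additionally read data; return only the requested range.
    return buf[start - a_start:end - a_start]
```

With a 16KB stand-in disk, a read of 3KB to 15KB is widened to 0KB to 16KB, four intervals are checked, and only the requested 12KB is returned.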

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Quality & Reliability (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Debugging And Monitoring (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A data check method for a distributed storage system, and a related apparatus. The method comprises: reading target data from a hard disk, and calculating a CRC value of data of each preset check interval length in the target data to obtain a first CRC value; reading, from a database, a second CRC value that was obtained when the target data was written by calculating the CRC value of the data of each preset check interval length in the target data; respectively comparing each first CRC value with the corresponding second CRC value; and if the first CRC value is inconsistent with the corresponding second CRC value, determining that the target data in the hard disk has an error. According to the data check method, the position of a data anomaly can be accurately identified, and an effective data check can be guaranteed.

Description

Data verification method and related device for a distributed storage system
This application claims priority to the Chinese patent application filed with the Chinese Patent Office on December 31, 2019, with application number 201911411044.2 and the invention title "Data verification method and related device for a distributed storage system", the entire contents of which are incorporated into this application by reference.
Technical field
This application relates to the field of distributed storage technology, in particular to a data verification method for a distributed storage system; it also relates to a data verification device, equipment, and computer-readable storage medium for a distributed storage system.
Background
A mass distributed storage system usually consists of thousands of hard disks and tens to hundreds of storage nodes. Each hard disk is managed by an independent storage engine, that is, by an independent process. If stored data becomes erroneous because of hard disk hardware errors, firmware defects, power supply or temperature problems, and the like, such anomalies are usually difficult to detect in a timely and effective manner. In a multi-copy distributed storage system, erroneous data that is not found and recovered in time may lead to data loss or data inconsistency. It is therefore particularly important to verify data so that erroneous data is found in time. At present, the technical solution for data verification is to calculate the CRC (Cyclic Redundancy Check) check value of the data of each of multiple copies and compare the CRC check values between the copies; when the CRC check values are not equal, the data of some copy is considered to have an error. With this approach, however, in the case of two copies, if the CRC check values of the two copies differ, it can only be concluded that the data of one of the copies has an error; it cannot be determined which copy is wrong. Moreover, obtaining the check values of the two copies from two storage nodes for comparison is highly complex. If the node where a copy is located fails, for example because it has lost power, data verification cannot be performed effectively.
Therefore, how to overcome the above technical defects has become a technical problem that those skilled in the art urgently need to solve.
Summary of the invention
The purpose of this application is to provide a data verification method and related device for a distributed storage system that can accurately identify the location of data abnormalities and ensure that data verification is carried out effectively.
To solve the above technical problems, this application provides a data verification method for a distributed storage system, including:
reading target data from a hard disk, and calculating the CRC check value of the data of each preset check interval length in the target data to obtain a first CRC check value;
reading, from a database, a second CRC check value that was obtained when the target data was written by calculating the CRC check value of the data of each preset check interval length in the target data;
comparing each first CRC check value with the corresponding second CRC check value;
if a first CRC check value is inconsistent with the corresponding second CRC check value, determining that an error has occurred in the target data on the hard disk.
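
The four steps above can be sketched in Python as follows. This is a minimal illustration rather than the applicant's implementation: the application does not name a CRC algorithm, so `zlib.crc32` is an assumption, as are the 4KB `CHECK_INTERVAL`, the function names `block_crcs` and `find_corrupt_offsets`, and the modeling of the stored second CRC check values as a plain dictionary keyed by offset.

```python
import zlib

CHECK_INTERVAL = 4 * 1024  # assumed preset check interval length (4KB)


def block_crcs(data, base_offset=0, interval=CHECK_INTERVAL):
    """Return {offset: crc} with one CRC per non-overlapping interval."""
    return {
        base_offset + i: zlib.crc32(data[i:i + interval])
        for i in range(0, len(data), interval)
    }


def find_corrupt_offsets(data_on_disk, stored_crcs, base_offset=0,
                         interval=CHECK_INTERVAL):
    """Compare first CRCs (from disk) with second CRCs (stored at write time).

    Returns the offsets of the intervals whose CRC values disagree.
    """
    first = block_crcs(data_on_disk, base_offset, interval)
    return [off for off, crc in first.items() if stored_crcs.get(off) != crc]
```

Because one CRC is kept per interval, a mismatch identifies the affected interval directly, without consulting copies on other nodes.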
Optionally, when writing the target data, calculating the CRC check value of the data of each preset check interval length in the target data includes:
if the start position and the end position of the write interval corresponding to the target data are integer multiples of the preset check interval length, directly calculating the CRC check value of the data of each preset check interval length in the target data.
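
The aligned case can be illustrated with a short sketch. It assumes a 4KB check interval and `zlib.crc32` as the CRC function; `is_aligned` and `crcs_for_aligned_write` are hypothetical names introduced only for this example.

```python
import zlib

CHECK_INTERVAL = 4 * 1024  # assumed preset check interval length (4KB)


def is_aligned(start, end, interval=CHECK_INTERVAL):
    """True if both ends of the write interval fall on interval boundaries."""
    return start % interval == 0 and end % interval == 0


def crcs_for_aligned_write(start, data, interval=CHECK_INTERVAL):
    """Compute one CRC per interval for an aligned write, keyed by offset."""
    end = start + len(data)
    if not is_aligned(start, end, interval):
        raise ValueError("write interval is not aligned to the check interval")
    return {
        start + i: zlib.crc32(data[i:i + interval])
        for i in range(0, len(data), interval)
    }
```

A 12KB write starting at 4KB and ending at 16KB is aligned and yields three CRC values, one each for the offsets 4KB, 8KB, and 12KB.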
可选的,写所述目标数据时,计算所述目标数据中每预设校验区间长度的数据的CRC校验值,包括:Optionally, when writing the target data, calculating the CRC check value of the data of each preset check interval length in the target data includes:
若所述目标数据的写入区间的起始位置和/或终止位置不是所述预设校验值计算长度的整数倍,则从所述硬盘中读取所述起始位置前为所述预设校验区间长度的整数倍的位置至所述起始位置的已写入数据,和/或,从所述硬盘中读取所述终止位置至所述终止位置后为所述预设校验区间长度的整数倍的位置的已写入数据;If the start position and/or end position of the write interval of the target data is not an integer multiple of the calculated length of the preset check value, then read the start position from the hard disk before the preset It is assumed that the written data from the position that is an integer multiple of the length of the check interval to the start position, and/or the preset check is read from the end position to the end position from the hard disk The written data at the position that is an integer multiple of the interval length;
计算所述已写入数据与所述目标数据中每预设校验区间长度的数据的CRC校验值。Calculate the CRC check value of the data of each preset check interval length in the written data and the target data.
Optionally, storing the second CRC check values in the database includes:
storing, in the database in the form of a map, each second CRC check value together with the offset within the object of the data to which that second CRC check value corresponds.
Optionally, reading the target data from the hard disk includes:
reading the target data from the hard disk when a read request is received or when active verification is triggered.
Optionally, reading the target data from the hard disk when a read request is received, and calculating the CRC check value of each piece of data of the preset check interval length in the target data, includes:
determining, according to the read request, the start position and the end position of the read interval corresponding to the target data;
if the start position and the end position of the read interval corresponding to the target data are integer multiples of the preset check interval length, directly reading the target data and calculating the CRC check value of each piece of data of the preset check interval length in the target data;
if the start position and/or the end position of the read interval of the target data is not an integer multiple of the preset check interval length, reading from the hard disk the target data together with the data from a position before the start position that is an integer multiple of the preset check interval length up to the start position, and/or the data from the end position up to a position after the end position that is an integer multiple of the preset check interval length; and calculating the CRC check value of each piece of data of the preset check interval length in the additionally read data and the target data.
Optionally, the method further includes:
when an error occurs in the target data, adding the target data to a reconstruction queue for data reconstruction.
To solve the above technical problem, this application also provides a data verification apparatus for a distributed storage system, including:
a calculation module, configured to read target data from a hard disk and calculate a CRC check value for each piece of data of the preset check interval length in the target data, to obtain first CRC check values;
a reading module, configured to read, from a database, second CRC check values, which were obtained when the target data was written by calculating a CRC check value for each piece of data of the preset check interval length in the target data;
a comparison module, configured to compare each first CRC check value with the corresponding second CRC check value;
a determining module, configured to determine that an error has occurred in the target data on the hard disk if a first CRC check value is inconsistent with the corresponding second CRC check value.
To solve the above technical problem, this application also provides a data verification device for a distributed storage system, including:
a memory, configured to store a computer program;
a processor, configured to implement the steps of the data verification method for a distributed storage system described above when executing the computer program.
To solve the above technical problem, this application also provides a computer-readable storage medium that stores a computer program; when the computer program is executed by a processor, the steps of the data verification method for a distributed storage system described above are implemented.
The data verification method for a distributed storage system provided by this application includes: reading target data from a hard disk and calculating a CRC check value for each piece of data of the preset check interval length in the target data, to obtain first CRC check values; reading, from a database, second CRC check values, which were obtained when the target data was written by calculating a CRC check value for each piece of data of the preset check interval length in the target data; comparing each first CRC check value with the corresponding second CRC check value; and, if a first CRC check value is inconsistent with the corresponding second CRC check value, determining that an error has occurred in the target data on the hard disk.
It can be seen that, compared with technical solutions that verify data based on the CRC check values of the data in multiple replicas, the data verification method provided by this application calculates the CRC check values of the data to be written each time data is written and stores them in the database. During subsequent data verification, it calculates the CRC check values of the data to be verified and compares them with the corresponding CRC check values stored in the database, thereby determining whether the data contains an error. In this method the storage engine verifies data independently, so verification can be completed without relying on the data in replicas on other nodes. The method can therefore precisely identify the location of abnormal data and ensure that data verification is carried out effectively.
The data verification apparatus, device, and computer-readable storage medium for a distributed storage system provided by this application all have the above technical effects.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of this application more clearly, the drawings used in the description of the prior art and the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of this application, and those of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic flowchart of a data verification method for a distributed storage system provided by an embodiment of this application;
FIG. 2 is a schematic diagram of calculating CRC check values provided by an embodiment of this application;
FIG. 3 is another schematic diagram of calculating CRC check values provided by an embodiment of this application;
FIG. 4 is a schematic diagram of a data verification apparatus for a distributed storage system provided by an embodiment of this application;
FIG. 5 is yet another schematic diagram of calculating CRC check values provided by an embodiment of this application.
Detailed Description
The core of this application is to provide a data verification method and related apparatus for a distributed storage system that can precisely identify the location of abnormal data and ensure that data verification is carried out effectively.
To make the purposes, technical solutions, and advantages of the embodiments of this application clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are some, but not all, of the embodiments of this application. All other embodiments obtained by those of ordinary skill in the art based on the embodiments in this application without creative effort fall within the protection scope of this application.
Please refer to FIG. 1, which is a schematic flowchart of a data verification method for a distributed storage system provided by an embodiment of this application. As shown in FIG. 1, the data verification method includes:
S101: Read target data from the hard disk, and calculate a CRC check value for each piece of data of the preset check interval length in the target data, to obtain first CRC check values;
Specifically, this application provides a data verification approach in which the storage engine verifies data independently. After data verification starts, the data to be verified, namely the target data, is read from the hard disk, and a CRC check value is calculated for each piece of data of the preset check interval length in the target data, yielding one or more first CRC check values. The pieces of the target data over which CRC check values are calculated do not overlap. The preset check interval length is the length of the data over which one CRC check value is calculated. For example, a preset check interval length of 4KB means that a CRC check value is calculated for every 4KB of data; if the target data is 8KB long, two first CRC check values are obtained. This application does not limit the specific value of the preset check interval length, which can be set according to actual needs. The specific way of calculating a CRC check value is not described here; refer to existing related techniques.
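As an illustration of step S101, the following minimal sketch computes one CRC value per non-overlapping interval. It assumes CRC-32 as provided by Python's zlib module, whereas this application does not prescribe a particular CRC variant; the names CRC_LENGTH and interval_crcs are hypothetical.

```python
import zlib

CRC_LENGTH = 4 * 1024  # assumed preset check interval length (4KB, as in the example above)


def interval_crcs(data: bytes, crc_length: int = CRC_LENGTH) -> list[int]:
    """Return one CRC-32 value per non-overlapping interval of crc_length bytes."""
    if len(data) % crc_length != 0:
        raise ValueError("data length must be a multiple of the check interval length")
    return [
        zlib.crc32(data[i:i + crc_length])
        for i in range(0, len(data), crc_length)
    ]


target = bytes(8 * 1024)  # 8KB of target data
print(len(interval_crcs(target)))  # prints 2
```

With 8KB of target data and a 4KB interval, two first CRC check values result, matching the example in the text.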
S102: Read, from the database, second CRC check values, which were obtained when the target data was written by calculating a CRC check value for each piece of data of the preset check interval length in the target data;
Specifically, each time data is written, the CRC check value of each piece of data of the preset check interval length in the data to be written is calculated before the data is written to the hard disk, yielding second CRC check values, and the calculated second CRC check values are stored in the database. After the first CRC check values have been calculated in step S101, the second CRC check values that were calculated and stored in the database when the target data was written are read from the database, so that whether the target data contains an error can subsequently be determined from each first CRC check value and the corresponding second CRC check value.
In one specific implementation, calculating, when the target data is written, the CRC check value of each piece of data of the preset check interval length in the target data includes: if the start position and the end position of the write interval corresponding to the target data are integer multiples of the preset check interval length, directly calculating the CRC check value of each piece of data of the preset check interval length in the target data.
Specifically, this implementation corresponds to the case where the target data is aligned data. Aligned data is data for which both the start position and the end position of the corresponding write interval in the data buffer are integer multiples of the preset check interval length. When the target data is aligned data, it is divided directly in units of the preset check interval length into multiple data segments, and the CRC check value of each data segment is calculated.
For example, referring to FIG. 2 and taking a preset check interval length of 4KB, suppose the start position of the write interval is 4KB, the end position is 16KB, and the write length is 12KB. The target data is then divided in units of 4KB into three data segments, and the CRC check value of each data segment is calculated, yielding three second CRC check values: CRC1 to CRC3.
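The aligned case can be sketched as follows. This is an illustrative sketch only: it assumes CRC-32 from Python's zlib module, and the helper names is_aligned and crcs_for_aligned_write are hypothetical. Each CRC is keyed by the offset of its segment so that it can later be matched against a stored value.

```python
import zlib

CRC_LENGTH = 4 * 1024  # assumed preset check interval length


def is_aligned(start: int, end: int, crc_length: int = CRC_LENGTH) -> bool:
    """True when both ends of the interval are multiples of the check interval length."""
    return start % crc_length == 0 and end % crc_length == 0


def crcs_for_aligned_write(start: int, data: bytes,
                           crc_length: int = CRC_LENGTH) -> dict[int, int]:
    """Split an aligned write into segments and return {segment offset: CRC-32}."""
    end = start + len(data)
    if not is_aligned(start, end, crc_length):
        raise ValueError("write interval is not aligned")
    return {
        start + i: zlib.crc32(data[i:i + crc_length])
        for i in range(0, len(data), crc_length)
    }


# The FIG. 2 example: a 12KB write starting at 4KB and ending at 16KB.
crcs = crcs_for_aligned_write(4 * 1024, bytes(12 * 1024))
print(sorted(crcs))  # prints [4096, 8192, 12288]
```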
In another specific implementation, calculating, when the target data is written, the CRC check value of each piece of data of the preset check interval length in the target data includes: if the start position and/or the end position of the write interval of the target data is not an integer multiple of the preset check interval length, reading from the hard disk the already-written data from a position before the start position that is an integer multiple of the preset check interval length up to the start position, and/or reading from the hard disk the already-written data from the end position up to a position after the end position that is an integer multiple of the preset check interval length; and calculating the CRC check value of each piece of data of the preset check interval length in the already-written data and the target data.
Specifically, this implementation corresponds to the case where the target data is non-aligned data. Non-aligned data is data for which the start position and/or the end position of the corresponding write interval in the data buffer is not an integer multiple of the preset check interval length. When the target data is non-aligned data, data is first read from the hard disk into the data buffer to pad the unaligned parts. If only the start position of the write interval is not an integer multiple of the preset check interval length, the already-written data from a position before the start position that is an integer multiple of the preset check interval length up to the start position is read from the hard disk. If only the end position of the write interval is not an integer multiple of the preset check interval length, the already-written data from the end position up to a position after the end position that is an integer multiple of the preset check interval length is read from the hard disk. If neither the start position nor the end position is an integer multiple of the preset check interval length, both pieces of already-written data are read from the hard disk. The already-written data and the target data are then treated as a whole, and the CRC check value of each piece of data of the preset check interval length is calculated. Preferably, only the minimum amount of data needed to pad the unaligned parts is read from the hard disk.
For example, referring to FIG. 3 and taking a preset check interval length of 4KB, suppose the start position of the write interval is 3KB, the end position is 15KB, and the write length is 12KB. Because neither the start position nor the end position is an integer multiple of the preset check interval length, the already-written data from the 0KB position to the 3KB position and from the 15KB position to the 16KB position is first read from the hard disk. The data from the 0KB position to the 16KB position is then treated as a whole and divided into four data segments, and the second CRC check values of the data from 0KB to 4KB, from 4KB to 8KB, from 8KB to 12KB, and from 12KB to 16KB are calculated, yielding four second CRC check values: CRC1 to CRC4.
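A sketch of the non-aligned case follows. It rounds the write interval outward to the nearest interval boundaries, which corresponds to the preferred minimum padding, and models the hard disk as an in-memory bytes object that already holds data up to the padded end. CRC-32 from zlib and the names aligned_bounds and crcs_for_write are assumptions made for illustration.

```python
import zlib

CRC_LENGTH = 4 * 1024  # assumed preset check interval length


def aligned_bounds(start: int, end: int,
                   crc_length: int = CRC_LENGTH) -> tuple[int, int]:
    """Round start down and end up to the nearest multiples of crc_length."""
    lo = (start // crc_length) * crc_length
    hi = -(-end // crc_length) * crc_length  # ceiling division
    return lo, hi


def crcs_for_write(disk: bytes, start: int, data: bytes,
                   crc_length: int = CRC_LENGTH) -> dict[int, int]:
    """Pad a write with already-written data from disk, then CRC each interval."""
    end = start + len(data)
    lo, hi = aligned_bounds(start, end, crc_length)
    head = disk[lo:start]  # already-written data before the write
    tail = disk[end:hi]    # already-written data after the write
    whole = head + data + tail
    return {
        lo + i: zlib.crc32(whole[i:i + crc_length])
        for i in range(0, len(whole), crc_length)
    }


# The FIG. 3 example: a 12KB write from 3KB to 15KB is padded to 0KB..16KB.
disk = bytes(16 * 1024)
crcs = crcs_for_write(disk, 3 * 1024, b"\x01" * (12 * 1024))
print(aligned_bounds(3 * 1024, 15 * 1024))  # prints (0, 16384)
print(sorted(crcs))  # prints [0, 4096, 8192, 12288]
```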
After the second CRC check values are calculated, they are stored in the database and the target data is written to the hard disk.
Storing the second CRC check values in the database includes: storing, in the database in the form of a map, each second CRC check value together with the offset within the object of the data to which that second CRC check value corresponds.
Specifically, the data on the hard disk comprises business data and database data that records index information and metadata. Business data is stored on the hard disk in units of objects, while each object's metadata, index information, and extended attributes are stored in the database. Referring to FIG. 4, to support data self-verification, this embodiment adds content named check data to the object metadata, which records the check information of the data that an object stores on disk. When data is written to an object, the second CRC check value of each piece of data of the preset check interval length is calculated and recorded in a Key-Value database, where the Key is the object name and the Value is the object's metadata. The object's CRC check values and the offsets within the object of the data to which they correspond are recorded in the database in the form of a map, organized as follows:
Key: offset; the offset within the object of the data to which the CRC check value corresponds.
Value: check value; the CRC check value calculated from the data in the [offset, CRC_LENGTH] interval of the object, that is, the data of length CRC_LENGTH starting at offset. CRC_LENGTH is the preset check interval length.
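The map layout described above might be modelled as in the sketch below. A plain Python dict stands in for the Key-Value database, and the metadata field name "checksums" and the function record_checksums are hypothetical; this application does not specify how the metadata is serialized.

```python
import zlib

CRC_LENGTH = 4 * 1024  # assumed preset check interval length

# Stand-in for the Key-Value database: object name -> object metadata.
kv_db: dict[str, dict] = {}


def record_checksums(db: dict, object_name: str, start: int,
                     aligned_data: bytes, crc_length: int = CRC_LENGTH) -> None:
    """Store {offset: CRC-32} for each interval in the object's metadata."""
    meta = db.setdefault(object_name, {"checksums": {}})
    for i in range(0, len(aligned_data), crc_length):
        # Key: offset within the object; Value: CRC of the interval at that offset.
        meta["checksums"][start + i] = zlib.crc32(aligned_data[i:i + crc_length])


record_checksums(kv_db, "obj-1", 0, bytes(8 * 1024))
print(sorted(kv_db["obj-1"]["checksums"]))  # prints [0, 4096]
```

Keying the map by offset lets a later partial overwrite replace only the entries for the intervals it touches.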
S103: Compare each first CRC check value with the corresponding second CRC check value;
S104: If a first CRC check value is inconsistent with the corresponding second CRC check value, determine that an error has occurred in the target data on the hard disk.
Specifically, after the first CRC check values of the target data have been calculated and the second CRC check values of the target data have been read, the first CRC check value of each data segment of the target data is compared with the second CRC check value of the same data segment. If the first CRC check value of one or more data segments is inconsistent with the corresponding second CRC check value, it is determined that an error has occurred in the target data. Conversely, if the first CRC check value of every data segment is consistent with the corresponding second CRC check value, no error has occurred in the target data.
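Steps S103 and S104 can be sketched as a per-segment comparison that reports which offsets fail, which is how a mismatch localizes the damaged data. The sketch assumes CRC-32 from zlib; the function name find_corrupt_offsets is hypothetical, and treating a missing stored value as a mismatch is a design choice of this sketch, not something stated in this application.

```python
import zlib

CRC_LENGTH = 4 * 1024  # assumed preset check interval length


def find_corrupt_offsets(data_from_disk: bytes, start: int,
                         stored: dict[int, int],
                         crc_length: int = CRC_LENGTH) -> list[int]:
    """Return the offsets whose freshly computed CRC differs from the stored CRC."""
    bad = []
    for i in range(0, len(data_from_disk), crc_length):
        first_crc = zlib.crc32(data_from_disk[i:i + crc_length])  # computed on read
        second_crc = stored.get(start + i)  # recorded at write time, None if absent
        if first_crc != second_crc:
            bad.append(start + i)
    return bad


good = bytes(8 * 1024)
stored = {
    0: zlib.crc32(good[:CRC_LENGTH]),
    CRC_LENGTH: zlib.crc32(good[CRC_LENGTH:]),
}
# Flip one byte at position 5000, which lies in the second 4KB segment.
damaged = good[:5000] + b"\xff" + good[5001:]
print(find_corrupt_offsets(damaged, 0, stored))  # prints [4096]
```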
Further, reading the target data from the hard disk includes: reading the target data from the hard disk when a read request is received or when active verification is triggered.
Specifically, in this embodiment data verification is performed at two points: after a read request is received, the target data (here, the data requested by the read) is verified as it is read; and after active verification is triggered, the target data is read and verified.
Reading the target data from the hard disk when a read request is received, and calculating the CRC check value of each piece of data of the preset check interval length in the target data, includes: determining, according to the read request, the start position and the end position of the read interval corresponding to the target data; if the start position and the end position of the read interval are integer multiples of the preset check interval length, directly reading the target data and calculating the CRC check value of each piece of data of the preset check interval length in the target data; if the start position and/or the end position of the read interval is not an integer multiple of the preset check interval length, reading from the hard disk the target data together with the data from a position before the start position that is an integer multiple of the preset check interval length up to the start position, and/or the data from the end position up to a position after the end position that is an integer multiple of the preset check interval length; and calculating the CRC check value of each piece of data of the preset check interval length in the additionally read data and the target data.
Specifically, after a read request is received, whether the target data is aligned data is determined from the read request. If the target data is aligned data, data of the requested length is read directly from the hard disk at the requested offset, and the CRC check value of each piece of data of the preset check interval length in the target data is calculated. If the target data is non-aligned data, additional data is read from the hard disk along with the target data, so that the target data and the additionally read data together form aligned data. The target data and the additionally read data are then treated as a whole, and the CRC check value of each piece of data of the preset check interval length is calculated.
For example, referring to FIG. 5 and taking a preset check interval length of 4KB, suppose the start position of the read interval corresponding to the target data is 3KB and the end position is 15KB, so that the data from the 3KB position to the 15KB position must be read. Because neither the start position nor the end position of the read interval is an integer multiple of the preset check interval length, the data from the 0KB position to the 3KB position and from the 15KB position to the 16KB position is read from the hard disk in addition to the data from the 3KB position to the 15KB position. The data from the 0KB position to the 16KB position is then treated as a whole, and the CRC check value of each 4KB of data is calculated.
Further, when the target data is non-aligned data, the first CRC check values are calculated over the target data and the additionally read data as a whole. If every calculated first CRC check value is consistent with the corresponding second CRC check value, the additionally read data is trimmed from the data buffer and only the target data is returned to the client.
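The read path, including the final trimming step, could look like the sketch below. The hard disk is modelled as an in-memory bytes object and the stored second CRC check values as a dict keyed by offset; CRC-32 from zlib, the ChecksumError exception, and the verified_read function are all assumptions made for illustration.

```python
import zlib

CRC_LENGTH = 4 * 1024  # assumed preset check interval length


class ChecksumError(Exception):
    """Raised when a computed CRC does not match the stored CRC."""


def verified_read(disk: bytes, stored: dict[int, int], start: int, end: int,
                  crc_length: int = CRC_LENGTH) -> bytes:
    """Read [start, end), verifying every check interval the range touches."""
    lo = (start // crc_length) * crc_length  # round start down
    hi = -(-end // crc_length) * crc_length  # round end up
    buf = disk[lo:hi]                        # target data plus padding
    for i in range(0, len(buf), crc_length):
        if zlib.crc32(buf[i:i + crc_length]) != stored.get(lo + i):
            raise ChecksumError(f"CRC mismatch at offset {lo + i}")
    # Trim the additionally read data; return only what the client asked for.
    return buf[start - lo:end - lo]


disk = bytes(range(256)) * 64  # 16KB of sample data
stored = {
    off: zlib.crc32(disk[off:off + CRC_LENGTH])
    for off in range(0, len(disk), CRC_LENGTH)
}
# The FIG. 5 example: read 3KB..15KB, verify 0KB..16KB, return 12KB.
payload = verified_read(disk, stored, 3 * 1024, 15 * 1024)
print(len(payload))  # prints 12288
```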
For active verification, a verification period, namely the preset period, can be configured in advance; for example, data verification is triggered once a week. Each time the verification time arrives according to the preset period, data verification is triggered: the objects in the database are traversed, the object data (that is, the target data) is read, and the first CRC check values of the object data are calculated. The second CRC check values stored in the database are then read and compared with the corresponding first CRC check values. The first CRC check values of the object data can be calculated as described in the above embodiments, which is not repeated here.
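One pass of such an active verification might be sketched as follows. The sketch covers only the traversal and comparison; how the pass is scheduled (for example, by a timer firing once per SCRUB_PERIOD_SECONDS) is left out. The dict-based stand-ins for the hard disk and the database, the constant SCRUB_PERIOD_SECONDS, and the function scrub_once are hypothetical, and CRC-32 from zlib is assumed.

```python
import zlib

CRC_LENGTH = 4 * 1024                 # assumed preset check interval length
SCRUB_PERIOD_SECONDS = 7 * 24 * 3600  # assumed preset period: one week


def scrub_once(disk_objects: dict[str, bytes],
               db: dict[str, dict[int, int]],
               crc_length: int = CRC_LENGTH) -> list[tuple[str, int]]:
    """Traverse every object in the database and report (name, offset) mismatches."""
    errors = []
    for name, stored in db.items():
        data = disk_objects[name]  # object data read from the hard disk
        for offset, second_crc in stored.items():
            first_crc = zlib.crc32(data[offset:offset + crc_length])
            if first_crc != second_crc:
                errors.append((name, offset))
    return errors


objects = {"a": bytes(4096), "b": b"\x01" * 4096}
db = {name: {0: zlib.crc32(data)} for name, data in objects.items()}
objects["b"] = b"\x02" + objects["b"][1:]  # simulate silent corruption of object b
print(scrub_once(objects, db))  # prints [('b', 0)]
```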
Further, when an error occurs in the target data, the target data is added to a reconstruction queue for data reconstruction. Specifically, when verification finds that data is erroneous, the data reconstruction process of the distributed storage system is triggered, and the erroneous data is added to the data reconstruction queue as lost data to be reconstructed. According to the redundancy rules of the distributed storage system, correct data is read from other replicas and written to the local storage engine, so that the local data is restored. The reconstruction process can reuse the existing reconstruction process of the distributed storage system, which is not described here.
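The hand-off to reconstruction can be illustrated with the toy sketch below. It is not the reconstruction process of any real distributed storage system: a deque stands in for the reconstruction queue, a second dict stands in for a healthy replica on another node, and the names report_error and rebuild_all are hypothetical.

```python
from collections import deque

CRC_LENGTH = 4 * 1024  # assumed preset check interval length

# Stand-in for the reconstruction queue: entries are (object name, offset).
rebuild_queue = deque()


def report_error(name: str, offset: int) -> None:
    """Queue an erroneous interval as lost data to be reconstructed."""
    rebuild_queue.append((name, offset))


def rebuild_all(local: dict[str, bytearray], replica: dict[str, bytes],
                crc_length: int = CRC_LENGTH) -> int:
    """Overwrite each queued interval with the replica's copy; return the count."""
    repaired = 0
    while rebuild_queue:
        name, offset = rebuild_queue.popleft()
        local[name][offset:offset + crc_length] = replica[name][offset:offset + crc_length]
        repaired += 1
    return repaired


local = {"obj": bytearray(b"\xff" * 4096)}  # corrupted local copy
replica = {"obj": bytes(4096)}              # healthy copy on another node
report_error("obj", 0)
print(rebuild_all(local, replica))  # prints 1
```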
In summary, the data verification method provided by this application calculates the CRC check values of the data to be written each time data is written and stores them in the database. During subsequent data verification, it calculates the CRC check values of the data to be verified and compares them with the corresponding CRC check values stored in the database, thereby determining whether the data contains an error. In this method the storage engine verifies data independently, so verification can be completed without relying on the data in replicas on other nodes. The method can therefore precisely identify the location of abnormal data and ensure that data verification is carried out effectively.
This application also provides a data verification apparatus for a distributed storage system. The apparatus described below and the method described above may be read with reference to each other. The data verification apparatus includes:
a calculation module, configured to read target data from a hard disk and calculate a CRC check value for each piece of data of the preset check interval length in the target data, to obtain first CRC check values;
a reading module, configured to read, from a database, second CRC check values, which were obtained when the target data was written by calculating a CRC check value for each piece of data of the preset check interval length in the target data;
a comparison module, configured to compare each first CRC check value with the corresponding second CRC check value;
a determining module, configured to determine that an error has occurred in the target data on the hard disk if a first CRC check value is inconsistent with the corresponding second CRC check value.
This application also provides a data verification device for a distributed storage system. The data verification device includes a memory and a processor, wherein the memory is configured to store a computer program, and the processor is configured to implement the following steps when executing the computer program:
reading target data from a hard disk, and calculating a CRC check value for each piece of data of a preset check interval length in the target data, to obtain first CRC check values; reading, from a database, second CRC check values that were obtained, when the target data was written, by calculating a CRC check value for each piece of data of the preset check interval length in the target data; comparing each first CRC check value with the corresponding second CRC check value; and, if a first CRC check value is inconsistent with the corresponding second CRC check value, determining that an error has occurred in the target data on the hard disk.
For a description of the device provided by this application, refer to the method embodiments above; the details are not repeated here.
This application also provides a computer-readable storage medium on which a computer program is stored. When executed by a processor, the computer program implements the following steps:
reading target data from a hard disk, and calculating a CRC check value for each piece of data of a preset check interval length in the target data, to obtain first CRC check values; reading, from a database, second CRC check values that were obtained, when the target data was written, by calculating a CRC check value for each piece of data of the preset check interval length in the target data; comparing each first CRC check value with the corresponding second CRC check value; and, if a first CRC check value is inconsistent with the corresponding second CRC check value, determining that an error has occurred in the target data on the hard disk.
The computer-readable storage medium may include a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, or any other medium capable of storing program code.
For a description of the computer-readable storage medium provided by this application, refer to the method embodiments above; the details are not repeated here.
Because the possible situations are complex, they cannot all be enumerated and described one by one. Those skilled in the art should appreciate that, under the basic principles of the embodiments provided in this application and in combination with actual conditions, many examples may exist, and all of those that can be obtained without expending sufficient creative effort should fall within the scope of this application.
The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be understood by reference to one another.
The technical solutions provided by this application have been described in detail above. Specific examples are used herein to explain the principles and implementations of this application, and the description of the above embodiments is intended only to help in understanding the method of this application and its core idea. It should be noted that those of ordinary skill in the art may make several improvements and modifications to this application without departing from its principles, and these improvements and modifications also fall within the scope of protection of the claims of this application.
It should also be noted that, in this specification, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "include", "comprise", and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or apparatus that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitation, an element introduced by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or apparatus that includes the element.

Claims (10)

  1. A data verification method for a distributed storage system, characterized by comprising:
    reading target data from a hard disk, and calculating a CRC check value for each piece of data of a preset check interval length in the target data, to obtain first CRC check values;
    reading, from a database, second CRC check values that were obtained, when the target data was written, by calculating a CRC check value for each piece of data of the preset check interval length in the target data;
    comparing each first CRC check value with the corresponding second CRC check value;
    if a first CRC check value is inconsistent with the corresponding second CRC check value, determining that an error has occurred in the target data on the hard disk.
  2. The data verification method according to claim 1, characterized in that calculating, when the target data is written, the CRC check value for each piece of data of the preset check interval length in the target data comprises:
    if the start position and the end position of the write interval corresponding to the target data are integer multiples of the preset check interval length, directly calculating the CRC check value for each piece of data of the preset check interval length in the target data.
  3. The data verification method according to claim 1, characterized in that calculating, when the target data is written, the CRC check value for each piece of data of the preset check interval length in the target data comprises:
    if the start position and/or the end position of the write interval of the target data is not an integer multiple of the preset check value calculation length, reading from the hard disk the already-written data from the position preceding the start position that is an integer multiple of the preset check interval length up to the start position, and/or reading from the hard disk the already-written data from the end position up to the position following the end position that is an integer multiple of the preset check interval length;
    calculating the CRC check value for each piece of data of the preset check interval length in the already-written data together with the target data.
  4. The data verification method according to claim 1, characterized in that storing the second CRC check value in the database comprises:
    storing, in the database in the form of a map, the second CRC check value and the offset, within the object, of the data to which the second CRC check value corresponds.
  5. The data verification method according to claim 1, characterized in that reading the target data from the hard disk comprises:
    reading the target data from the hard disk when a read request is received or when active verification is triggered.
  6. The data verification method according to claim 5, characterized in that reading the target data from the hard disk when a read request is received, and calculating the CRC check value for each piece of data of the preset check interval length in the target data, comprises:
    determining, according to the read request, the start position and the end position of the read interval corresponding to the target data;
    if the start position and the end position of the read interval corresponding to the target data are integer multiples of the preset check interval length, directly reading the target data and calculating the CRC check value for each piece of data of the preset check interval length in the target data;
    if the start position and/or the end position of the read interval of the target data is not an integer multiple of the preset check value calculation length, reading from the hard disk the target data together with the data from the position preceding the start position that is an integer multiple of the preset check interval length up to the start position, and/or reading the target data together with the data from the end position up to the position following the end position that is an integer multiple of the preset check interval length; and calculating the CRC check value for each piece of data of the preset check interval length in the data thus read together with the target data.
  7. The data verification method according to claim 1, characterized by further comprising:
    when an error occurs in the target data, adding the target data to a reconstruction queue for data reconstruction.
  8. A data verification apparatus for a distributed storage system, characterized by comprising:
    a calculation module, configured to read target data from a hard disk and calculate a CRC check value for each piece of data of a preset check interval length in the target data, to obtain first CRC check values;
    a reading module, configured to read, from a database, second CRC check values that were obtained, when the target data was written, by calculating a CRC check value for each piece of data of the preset check interval length in the target data;
    a comparison module, configured to compare each first CRC check value with the corresponding second CRC check value;
    a determination module, configured to determine that an error has occurred in the target data on the hard disk if a first CRC check value is inconsistent with the corresponding second CRC check value.
  9. A data verification device for a distributed storage system, characterized by comprising:
    a memory, configured to store a computer program;
    a processor, configured to implement, when executing the computer program, the steps of the data verification method for a distributed storage system according to any one of claims 1 to 7.
  10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program, and the computer program, when executed by a processor, implements the steps of the data verification method for a distributed storage system according to any one of claims 1 to 7.
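The boundary handling recited in claims 3 and 6 amounts to widening an unaligned range outward to the nearest multiples of the check interval before computing CRC values. A minimal Python sketch follows, under stated assumptions: the 4096-byte interval, the function names, and the use of a `bytes` object as the disk are hypothetical, and the disk is assumed to already hold data out to the aligned end position.

```python
import zlib

INTERVAL = 4096  # hypothetical preset check interval length, in bytes


def aligned_range(start: int, end: int, interval: int = INTERVAL) -> tuple[int, int]:
    """Expand [start, end) outward to the nearest interval boundaries."""
    aligned_start = (start // interval) * interval
    aligned_end = -(-end // interval) * interval  # ceiling division
    return aligned_start, aligned_end


def crcs_for_write(disk: bytes, start: int, data: bytes,
                   interval: int = INTERVAL) -> dict[int, int]:
    """Per-interval CRCs for a write that may start or end off a boundary."""
    end = start + len(data)
    a_start, a_end = aligned_range(start, end, interval)
    head = disk[a_start:start]  # already-written data before the start position
    tail = disk[end:a_end]      # already-written data after the end position
    merged = bytes(head) + data + bytes(tail)
    return {a_start + i: zlib.crc32(merged[i:i + interval])
            for i in range(0, len(merged), interval)}
```

When both positions already fall on interval boundaries, `head` and `tail` are empty and the function reduces to the direct calculation of claim 2.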
PCT/CN2020/110952 2019-12-31 2020-08-25 Data check method for distributed storage system, and related apparatus WO2021135280A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911411044.2 2019-12-31
CN201911411044.2A CN111176885A (en) 2019-12-31 2019-12-31 Data verification method and related device for distributed storage system

Publications (1)

Publication Number Publication Date
WO2021135280A1 true WO2021135280A1 (en) 2021-07-08

Family

ID=70657748

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/110952 WO2021135280A1 (en) 2019-12-31 2020-08-25 Data check method for distributed storage system, and related apparatus

Country Status (2)

Country Link
CN (1) CN111176885A (en)
WO (1) WO2021135280A1 (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111176885A (en) * 2019-12-31 2020-05-19 浪潮电子信息产业股份有限公司 Data verification method and related device for distributed storage system
CN112416891B (en) * 2020-11-26 2023-11-28 北京天融信网络安全技术有限公司 Data detection method, device, electronic equipment and readable storage medium
CN112667431A (en) * 2020-12-23 2021-04-16 江西兴泰科技有限公司 Method for calibrating waveform data of electronic price tag
CN113704150B (en) * 2021-08-13 2023-08-04 苏州浪潮智能科技有限公司 DMA data cache consistency method, device and system in user mode
CN113946468A (en) * 2021-10-28 2022-01-18 北京金山云网络技术有限公司 Data testing method, device, equipment and storage medium
CN114579352A (en) * 2022-04-29 2022-06-03 阿里云计算有限公司 Data reconstruction method and device
CN115729477A (en) * 2023-01-09 2023-03-03 苏州浪潮智能科技有限公司 Distributed storage IO path data writing and reading method, device and equipment

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0596340B1 (en) * 1992-11-04 1999-03-03 Mitsubishi Denki Kabushiki Kaisha Circuit with Reed-Solomon error correction and CRC error detection
US8266499B2 (en) * 2009-05-28 2012-09-11 Kabushiki Kaisha Toshiba CRC protection of data stored in XOR buffer
CN107888344A (en) * 2016-09-29 2018-04-06 中兴通讯股份有限公司 A kind of method, apparatus and system of error detection
US10205470B2 (en) * 2014-02-14 2019-02-12 Samsung Electronics Co., Ltd System and methods for low complexity list decoding of turbo codes and convolutional codes
CN109918226A (en) * 2019-02-26 2019-06-21 平安科技(深圳)有限公司 A kind of silence error-detecting method, device and storage medium
CN111176885A (en) * 2019-12-31 2020-05-19 浪潮电子信息产业股份有限公司 Data verification method and related device for distributed storage system

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101452409B (en) * 2007-12-04 2010-10-13 无锡江南计算技术研究所 Data verification redundant method and device
CN102184260B (en) * 2011-06-09 2013-07-10 中国人民解放军国防科学技术大学 Method for accessing mass data in cloud calculation environment
CN104978336A (en) * 2014-04-08 2015-10-14 云南电力试验研究院(集团)有限公司电力研究院 Unstructured data storage system based on Hadoop distributed computing platform
CN108573007A (en) * 2017-06-08 2018-09-25 北京金山云网络技术有限公司 Method, apparatus, electronic equipment and the storage medium of data consistency detection
CN107807792A (en) * 2017-10-27 2018-03-16 郑州云海信息技术有限公司 A kind of data processing method and relevant apparatus based on copy storage system
CN108875061A (en) * 2018-06-29 2018-11-23 郑州云海信息技术有限公司 A kind of conformance test method and relevant apparatus of distributed file system

Also Published As

Publication number Publication date
CN111176885A (en) 2020-05-19

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 20910095; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 20910095; Country of ref document: EP; Kind code of ref document: A1)