Nothing Special   »   [go: up one dir, main page]

CN112732203B - Regeneration code construction method, file reconstruction method and node repair method - Google Patents

Regeneration code construction method, file reconstruction method and node repair method Download PDF

Info

Publication number
CN112732203B
CN112732203B CN202110348155.4A CN202110348155A CN112732203B CN 112732203 B CN112732203 B CN 112732203B CN 202110348155 A CN202110348155 A CN 202110348155A CN 112732203 B CN112732203 B CN 112732203B
Authority
CN
China
Prior art keywords
block
blocks
coding
code
file
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110348155.4A
Other languages
Chinese (zh)
Other versions
CN112732203A (en
Inventor
朱兵
曾志伟
赵旭煜
王伟平
王建新
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Central South University
Original Assignee
Central South University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Central South University filed Critical Central South University
Priority to CN202110348155.4A priority Critical patent/CN112732203B/en
Publication of CN112732203A publication Critical patent/CN112732203A/en
Application granted granted Critical
Publication of CN112732203B publication Critical patent/CN112732203B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0614Improving the reliability of storage systems
    • G06F3/0619Improving the reliability of storage systems in relation to data integrity, e.g. data losses, bit errors
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638Organizing or formatting or addressing of data
    • G06F3/064Management of blocks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0668Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
    • HELECTRICITY
    • H03ELECTRONIC CIRCUITRY
    • H03MCODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M13/00Coding, decoding or code conversion, for error detection or error correction; Coding theory basic assumptions; Coding bounds; Error probability evaluation methods; Channel models; Simulation or testing of codes
    • H03M13/03Error detection or forward error correction by redundancy in data representation, i.e. code words containing more digits than the source words
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/01Protocols
    • H04L67/10Protocols in which an application is distributed across nodes in the network
    • H04L67/1097Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Security & Cryptography (AREA)
  • Probability & Statistics with Applications (AREA)
  • Quality & Reliability (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention discloses a regeneration code coding method, which comprises the steps of obtaining a combined design and a storage data block; sorting and numbering the blocks, and counting the blocks and positions of the elements to obtain file symbols; encoding an original file to obtain an encoded block; allocating a different file symbol for each coding block; and placing the coding blocks corresponding to the file symbol sets with the consistent elements in the same storage node to obtain the regenerated codes. The invention also discloses a file reconstruction method and a node repair method based on the regeneration code construction method. The method for constructing the regeneration code has loose parameter limitation and can reach the upper bound of Singelton, the file reconstruction and node repair process is simple to operate, and the node repair efficiency is high.

Description

一种再生码构造方法、文件重构方法及节点修复方法A regeneration code construction method, a file reconstruction method and a node repair method

技术领域technical field

本发明具体涉及一种再生码构造方法、文件重构方法及节点修复方法。The invention specifically relates to a regeneration code construction method, a file reconstruction method and a node repair method.

背景技术Background technique

在当今信息爆炸的时代,如何安全可靠地存储海量数据成为了亟需解决的问题。传统的文件存储系统已不能满足可用性和可扩展性等要求,分布式存储系统(DistributedStorage System)应运而生。该系统利用分布式技术,将存储节点通过网络联结起来形成一个能存储海量数据的集群。分布式系统将数据文件分散放置于多个存储节点中,由于网络故障或物理损坏,单个存储节点可能会失效,因此分布式系统必须引入冗余信息来保证可靠性。常见的冗余策略有多副本策略和纠删码策略。多副本策略就是把原文件复制若干倍后再存储,当有节点失效时,替换节点就可以直接从它的副本中复制数据,因此系统中只要存在一个完整的数据副本,源文件就能正常使用;但这种方案存储效率和系统可靠性较低。纠删码策略采用纠删码处理原始文件,常用的纠删码为(n, k)最大距离可分(MaximumDistance Separable, MDS)码,即将原始文件分成大小相等的k份,由这k份文件经线性编码后生成n-k份冗余信息,利用n个节点存储n个线性无关的编码块,终端用户可以通过连接到任意k个节点恢复出原始文件。显然纠删码策略存储效率更高,但是实现也更复杂,应用难度大。In today's era of information explosion, how to safely and reliably store massive data has become an urgent problem to be solved. The traditional file storage system can no longer meet the requirements of availability and scalability, and the distributed storage system (Distributed Storage System) emerges as the times require. The system uses distributed technology to connect storage nodes through the network to form a cluster that can store massive data. Distributed systems place data files in multiple storage nodes. Due to network failure or physical damage, a single storage node may fail. Therefore, distributed systems must introduce redundant information to ensure reliability. Common redundancy strategies include multiple replica strategies and erasure coding strategies. The multi-copy strategy is to copy the original file several times before storing it. When a node fails, the replacement node can directly copy the data from its copy. Therefore, as long as there is a complete copy of the data in the system, the source file can be used normally. However, this scheme has low storage efficiency and system reliability. The erasure code strategy uses erasure codes to process original files. The commonly used erasure codes are ( n , k ) Maximum Distance Separable (MDS) codes, that is, the original file is divided into k parts of equal size, and the k parts of the file are divided into k parts. After linear coding, n - k redundant information is generated, and n nodes are used to store n linearly independent coding blocks. The end user can restore the original file by connecting to any k nodes. Obviously, the erasure coding strategy has higher storage efficiency, but the implementation is also more complicated and the application is difficult.

在分布式存储系统中,利用n个节点存储大小为B的原始数据,每个节点存储的数据量为α,终端用户从n个存储节点中的任意k个下载数据即可恢复出原始文件,该过程被称为文件重构过程。当存储系统中有存储节点失效时,为了保证分布式系统的整体功能,需要对该失效节点进行恢复,该过程被称为节点修复过程。In a distributed storage system, n nodes are used to store the original data of size B, and the amount of data stored in each node is α . The end user can restore the original file by downloading data from any k of the n storage nodes. This process is called the file reconstruction process. When a storage node in the storage system fails, in order to ensure the overall function of the distributed system, the failed node needs to be restored. This process is called a node repair process.

RS(Reed-Solomon)码是一种常见的MDS码,RS码的节点修复过程为:先建立一个新的存储节点,再连接至k个节点下载数据并恢复出原始文件,利用原始文件经线性编码后再向该新节点存储相应的数据。图1为现有技术中RS码修复过程示意图,易见,为恢复一个节点中的存储信息而下载了k个节点的存储数据,修复节点时的数据传输量远大于节点的存储量,即节点的修复过程需要浪费大量的网络带宽。RS (Reed-Solomon) code is a common MDS code. The node repair process of RS code is as follows: first establish a new storage node, then connect to k nodes to download data and restore the original file. After encoding, store the corresponding data to the new node. Fig. 1 is a schematic diagram of the RS code repair process in the prior art, it is easy to see that the stored data of k nodes is downloaded in order to restore the stored information in a node, and the data transmission amount when repairing a node is much greater than the storage capacity of the node, that is, the node The repair process requires a lot of wasted network bandwidth.

为了降低修复过程中的修复带宽,文献[A. G. Dimakis, P. B. Godfrey, Y.Wu, M. J. Wainwright, and K. Ramchandran, “Network coding for distributedstorage systems,” IEEE Trans. Inf. Theory, vol. 56, no. 9, pp. 4539–4551,Sep. 2010] 引入了网络编码的概念,即在修复失效节点时,取较多的节点参与修复,且参与修复的节点先将本节点内的数据经过线性组合后再上传,这能极大地降低修复失效节点的带宽消耗,该文使用信息流图对分布式存储系统进行建模,并对其进行最小割分析,如图2所示。图2中S表示信息源,即原始数据。每个存储节点

Figure 8559DEST_PATH_IMAGE001
由输入节点
Figure 881837DEST_PATH_IMAGE002
、输出节点
Figure 58740DEST_PATH_IMAGE003
和有向边
Figure 308456DEST_PATH_IMAGE004
来表示,有向边上的权值为该节点的数据存储量,其值均为α。DC (Data Collector)表示数据收集器,数据收集器DC收集任意k个节点的数据,就能恢复原始数据;从d个节点获取数据,就能修复失效节点。对图2进行最小割分析,将信息源S和数据收集者DC分开的曲线称为该信息流图的割,该曲线切过的所有有向边的权值和称为割的值。根据最大流-最小割定理,当最小割的值不小于信息源S中的原始数据量大小时,DC能够恢复原始数据。基于对信息流图的最小割分析,Dimakis等人给出了单节点修复带宽γ和单节点存储量α之间的折衷曲线,所谓再生码(Regenerating Codes)即αγ取该折衷曲线上的点时对应的编码。在折衷曲线上存在两类特殊的点,即数据存储量α最小值点和修复带宽γ最小值点,分别对应于:最小存储再生码(Minimum-Storage Regenerating, MSR),称为MSR编码(MSR Codes),In order to reduce the repair bandwidth in the repair process, the literature [AG Dimakis, PB Godfrey, Y.Wu, MJ Wainwright, and K. Ramchandran, “Network coding for distributedstorage systems,” IEEE Trans. Inf. Theory, vol. 56, no. 9, pp. 4539–4551, Sep. 2010] introduced the concept of network coding, that is, when repairing a failed node, more nodes are selected to participate in the repair, and the nodes participating in the repair first linearly combine the data in this node. Re-upload, which can greatly reduce the bandwidth consumption of repairing failed nodes. This paper uses information flow graph to model the distributed storage system, and performs minimum cut analysis on it, as shown in Figure 2. S in Fig. 2 represents the information source, that is, the original data. per storage node
Figure 8559DEST_PATH_IMAGE001
by the input node
Figure 881837DEST_PATH_IMAGE002
, output node
Figure 58740DEST_PATH_IMAGE003
and directed edges
Figure 308456DEST_PATH_IMAGE004
to indicate that the weight on the directed edge is the data storage amount of the node, and its value is α . DC (Data Collector) represents a data collector. The data collector DC collects the data of any k nodes and can restore the original data; obtains data from d nodes, and can repair the failed node. Perform the minimum cut analysis on Figure 2, the curve separating the information source S and the data collector DC is called the cut of the information flow graph, and the weights of all directed edges cut by the curve are called the cut value. According to the maximum flow-minimum cut theorem, when the value of the minimum cut is not less than the original data size in the information source S, the DC can restore the original data. Based on the minimum cut analysis of the information flow graph, Dimakis et al . gave a compromise curve between the repair bandwidth γ of a single node and the storage capacity α of a single node. The corresponding code when the point is clicked. There are two kinds of special points on the compromise curve, namely the minimum point of data storage α and the minimum point of repair bandwidth γ , which correspond to: Minimum-Storage Regenerating (MSR), which is called MSR code (MSR). Codes),

Figure 762571DEST_PATH_IMAGE005
Figure 762571DEST_PATH_IMAGE005

最小带宽再生码(Minimum-Bandwidth Regenerating, MSR),称为MBR编码(MBRCodes),Minimum-Bandwidth Regenerating Codes (MSR), called MBR Codes,

Figure 682117DEST_PATH_IMAGE006
Figure 682117DEST_PATH_IMAGE006

再生码的修复过程如下:新建节点从完好的存储节点中任选d个存储节点下载数据,每个节点下载的数据量为β,即再生码的修复带宽

Figure 221682DEST_PATH_IMAGE007
,且修复带宽随着d的增大而减小,这是因为随着参与修复的辅助节点数目d的增加,每个节点修复时传输的数据量β变小,且β变小的速度比d增大的速度更快,从而使总修复带宽减小。可见再生码的修复带宽优于RS码,但再生码的修复过程中的节点访问数d>k。另外,修复节点需要对其存储的数据执行随机线性网络编码操作,为了满足所有编码包是相互独立的,再生码的运算需要在一个较大的有限域内进行。The repair process of the regeneration code is as follows: the new node downloads data from d storage nodes selected from the intact storage nodes, and the amount of data downloaded by each node is β , which is the repair bandwidth of the regeneration code.
Figure 221682DEST_PATH_IMAGE007
, and the repair bandwidth decreases as d increases, because with the increase of the number of auxiliary nodes d participating in the repair, the amount of data β transmitted by each node during repair becomes smaller, and the speed at which β becomes smaller is faster than d The increase is faster, resulting in a reduction in the total repair bandwidth. It can be seen that the repair bandwidth of the regenerative code is better than that of the RS code, but the number of node visits d > k in the repair process of the regenerative code. In addition, the repair node needs to perform random linear network coding operations on its stored data. In order to satisfy that all coding packets are independent of each other, the operation of the regeneration code needs to be performed in a large finite field.

根据节点中存储的信息是否为校验信息,可以将节点分为:According to whether the information stored in the node is verification information, the nodes can be divided into:

1)系统节点(Systematic Nodes):系统节点存储的是未经编码的原始文件信息;1) Systematic Nodes: System nodes store unencoded original file information;

2)校验节点(Parity Nodes):校验节点存储的是经过编码后的文件信息,也就是冗余信息。2) Parity Nodes: Parity Nodes store encoded file information, that is, redundant information.

根据修复后的节点数据是否与原失效节点数据相同可以将修复策略分为:According to whether the restored node data is the same as the original failed node data, the restoration strategies can be divided into:

1)功能修复(Functional Repair):该修复方案修复后的节点数据与原失效节点数据不一定相同,但修复后生成的节点与其他节点组成的分布式存储系统仍具有MDS特性;1) Functional Repair: The repaired node data is not necessarily the same as the original failed node data, but the distributed storage system composed of the repaired node and other nodes still has MDS characteristics;

2)精确修复(Exact Repair):该修复方案下修复后的节点数据原失效节点数据相同;2) Exact Repair: The repaired node data under this repair scheme is the same as the original failed node data;

3)混合修复(Hybrid Repair):该修复方案对系统节点(存储未编码数据)进行精确修复,而对非系统节点(存储冗余数据)不需精确修复只需修复后使新的分布式系统仍满足 MDS 特性即可。3) Hybrid Repair: This repair scheme accurately repairs system nodes (stores uncoded data), while non-system nodes (stores redundant data) do not need to be accurately repaired, and only needs to be repaired to make a new distributed system. It is sufficient to still meet the MDS characteristics.

基于Dimakis等人提出的再生码,文献[C. Tian, B. Sasidharan, V. Aggarwal,V. A. Vaishampayan, and P. V. Kumar, “Layered exact-repair regenerating codesvia embedded error correction and block designs,” IEEE Trans. Inf. Theory,vol. 61, no. 4, pp. 1933–1947, Apr. 2015]提出了一种基于组合设计放置策略的精确修复再生码。组合设计是一组满足某些特性的子集的集合(子集被称为区组)。一个给定的区组设计由(X,B)指定,带有参数(r,n),rn,即X是具有n个元素的集合,BXr元子集的集合。在该文中主要使用了斯坦纳系(Steiner systems)和重复组合区组设计(DuplicatedCombination Block Design, DCBD)这两种设计,实现了分布式存储系统参数限制为[n,k,d=k=n-1],[n,k=n-2,d=n-1]和[n,k,d>k]的精确构造。这种构造比MSR码和MBR码之间的空间共享性能更好,并且是实现这种性能的第一类编码。此外,该构造可以在参数限制为[n,k,d=k=n-1]的最优功能修复折衷曲线上达到不平凡的点,并且在高码率下渐近最优。但是该文提出的构造方式对参数限制比较严格,只容许1~2个节点的失效,实用性不高。Based on the regeneration codes proposed by Dimakis et al. [C. Tian, B. Sasidharan, V. Aggarwal, VA Vaishampayan, and PV Kumar, “Layered exact-repair regenerating codes via embedded error correction and block designs,” IEEE Trans. Inf. Theory,vol. 61, no. 4, pp. 1933–1947, Apr. 2015] proposed an accurate repair regeneration code based on combinatorial design placement strategy. A combinatorial design is a collection of subsets (the subsets are called blocks) that satisfy certain properties. A given block design is specified by ( X , B ), with parameters ( r , n ), where rn , that is, X is a set with n elements and B is a set of r -element subsets of X. In this paper, two designs of Steiner systems and Duplicated Combination Block Design (DCBD) are mainly used, and the parameters of the distributed storage system are limited to [ n , k , d = k = n -1], [ n , k = n -2, d = n -1] and exact constructions for [ n , k , d > k ]. This construction performs better than spatial sharing between MSR codes and MBR codes, and is the first class of codes to achieve this performance. Furthermore, the construction can reach non-trivial points on the optimal functional repair trade-off curve with parameter constraints [ n , k , d = k = n -1], and is asymptotically optimal at high code rates. However, the construction method proposed in this paper has strict parameter restrictions, and only allows the failure of 1~2 nodes, which is not practical.

发明内容SUMMARY OF THE INVENTION

本发明的目的之一在于提供一种再生码构造方法,该方法的构造过程简单,对参数限制更为宽松。One of the objectives of the present invention is to provide a method for constructing a regenerated code, which has a simple construction process and looser parameter restrictions.

本发明的目的之二在于还提供一种基于所述的再生码构造方法的文件重构方法。Another object of the present invention is to further provide a file reconstruction method based on the regenerated code construction method.

本发明的目的之三在于还提供一种基于所述的再生码构造方法的节点修复方法。The third object of the present invention is to further provide a node repair method based on the regenerated code construction method.

本发明提供的这种再生码构造方法,包括如下步骤:This regeneration code construction method provided by the present invention comprises the following steps:

S1.获取组合设计和存储数据块;S1. Get combined design and storage data blocks;

S2.对区组排序并编号,统计元素出现的区组及位置,得到文件符号;S2. Sort and number the blocks, count the blocks and positions where the elements appear, and get the file symbol;

S3.对数据块进行编码,得到编码块;S3. Encode the data block to obtain an encoded block;

S4.为每个编码块分配一个不同的文件符号;S4. assign each encoded block a different file symbol;

S5.将每个元素一致的文件符号集合所对应的编码块放置在同一存储节点中,得到再生码。S5. The coding blocks corresponding to the file symbol sets that are consistent with each element are placed in the same storage node to obtain a regeneration code.

步骤S1,获取的组合设计具体为,选择满足组合条件的t-设计

Figure 9510DEST_PATH_IMAGE008
λt-平衡数,表示任意t元子集均出现在λ个区组中,其中,In step S1, the obtained combination design is specifically, selecting a t -design that satisfies the combination condition
Figure 9510DEST_PATH_IMAGE008
, λ is the t -balance number, which means that any subset of t elements appears in λ blocks, where,

b为区组个数,

Figure 442765DEST_PATH_IMAGE009
;Let b be the number of blocks,
Figure 442765DEST_PATH_IMAGE009
;

r为元素重复度,

Figure 657846DEST_PATH_IMAGE010
;Let r be the element repetition degree,
Figure 657846DEST_PATH_IMAGE010
;

组合条件为:The combination conditions are:

Figure 950287DEST_PATH_IMAGE011
Figure 950287DEST_PATH_IMAGE011
;

其中,t为子集的元素数;v为设计的阶,具体为设计中不同元素的总数;m为区组容量,具体为一个区组中的元素个数;Among them, t is the number of elements in the subset; v is the order of the design, specifically the total number of different elements in the design; m is the block capacity, specifically the number of elements in a block;

获取的存储数据块具体为,设存储原文件大小为M,将原文件拆分成

Figure 276226DEST_PATH_IMAGE012
个大小相等的数据块。The obtained storage data block is as follows, set the size of the original file to be stored as M, and split the original file into
Figure 276226DEST_PATH_IMAGE012
data blocks of equal size.

步骤S2的文件符号设为

Figure 344414DEST_PATH_IMAGE013
,其中1≤ib,1≤jmb为区组个数,m为区组容量;同时
Figure 464817DEST_PATH_IMAGE014
表示元素h出现在第i个区组中的第j个位置;将区组编号记为B 1,B 2,…,B b ,对于区组B i ,将其中的元素由小到大顺序排列,位置编号记为{1,2,…,m}。The file symbol of step S2 is set to
Figure 344414DEST_PATH_IMAGE013
, where 1≤i≤b , 1≤j≤m , b is the number of blocks, m is the block capacity;
Figure 464817DEST_PATH_IMAGE014
Indicates that the element h appears at the jth position in the ith block; record the block number as B 1 , B 2 ,..., B b , for block B i , arrange the elements in the order from small to large , and the position number is marked as {1,2,…, m }.

步骤S3具体为利用双层编码策略对数据块进行编码;外层获得的MDS码构造如下:Step S3 is specifically to use the double-layer coding strategy to encode the data block; the MDS code structure obtained by the outer layer is as follows:

利用参数为

Figure 978975DEST_PATH_IMAGE015
MDS码对
Figure 233238DEST_PATH_IMAGE016
个数据块进行编码,获得b(m-t+1)个编码块,其中b为区组个数,m为区组容量,t为子集的元素数,λt-平衡数,表示任意t元子集出现在λ个区组;Use parameters as
Figure 978975DEST_PATH_IMAGE015
MDS code pair
Figure 233238DEST_PATH_IMAGE016
Encode the data blocks to obtain b ( m - t +1) coding blocks, where b is the number of blocks, m is the block capacity, t is the number of elements of the subset, λ is the t -balance number, representing any A subset of t -elements occurs in λ blocks;

内层获得的MDS码构造如下:The MDS code obtained by the inner layer is constructed as follows:

将上述利用外层编码得到的b(m-t+1)个编码块均分为b组,利用参数为(m,m-t+1)MDS码对每组中的m-t+1个编码块进行二次编码,最终得到bm个编码块。Divide the above b ( m - t +1) coding blocks obtained by using the outer layer coding into b groups, and use the parameter ( m , m - t +1) MDS code for m - t +1 in each group. The coding block is subjected to secondary coding, and finally bm coding blocks are obtained.

步骤S4具体包括,为上述获得的每一个编码块分配一个文件符号作为标识,不失一般性地,将

Figure 782031DEST_PATH_IMAGE017
的前
Figure 338915DEST_PATH_IMAGE018
个文件符号
Figure 340369DEST_PATH_IMAGE019
作为外层编码生成的冗余块的标识;将
Figure 883477DEST_PATH_IMAGE020
的文件符号
Figure 286776DEST_PATH_IMAGE021
作为内层编码生成的冗余块的标识,其中,
Figure 14561DEST_PATH_IMAGE022
表示元素h出现在第i个区组中的第j个位置,m为区组容量。Step S4 specifically includes assigning a file symbol as an identifier for each coding block obtained above, without loss of generality, using
Figure 782031DEST_PATH_IMAGE017
the former
Figure 338915DEST_PATH_IMAGE018
file symbol
Figure 340369DEST_PATH_IMAGE019
As the identifier of the redundant block generated by the outer encoding; the
Figure 883477DEST_PATH_IMAGE020
file symbol
Figure 286776DEST_PATH_IMAGE021
As the identifier of the redundant block generated by the inner layer coding, where,
Figure 14561DEST_PATH_IMAGE022
Indicates that the element h appears at the jth position in the ith block, and m is the block capacity.

步骤S5具体包括,每当发现一个文件符号

Figure 627945DEST_PATH_IMAGE023
,若其元素h与之前的文件符号
Figure 364957DEST_PATH_IMAGE024
不同,则分配一个新的存储节点并编号为h;对于每一个文件符号
Figure 357183DEST_PATH_IMAGE024
,若其元素h已分配存储节点,则将元素h相同的文化符号
Figure 632700DEST_PATH_IMAGE025
所对应的编码块存放到编码为h的存储节点中,则一个存储节点存储r个编码块,r为元素重复度;得到参数为[n,k=n-t,d=n-t+1]的再生码,n为节点个数,k为下载数据的个数,d为参与修复的节点个数,t为子集的元素数,
Figure 608747DEST_PATH_IMAGE026
表示元素h出现在第i个区组中的第j个位置。Step S5 specifically includes, whenever a file symbol is found
Figure 627945DEST_PATH_IMAGE023
, if its element h is the same as the previous file symbol
Figure 364957DEST_PATH_IMAGE024
is different, a new storage node is allocated and numbered h ; for each file symbol
Figure 357183DEST_PATH_IMAGE024
, if its element h has been allocated a storage node, then the same cultural symbol of element h
Figure 632700DEST_PATH_IMAGE025
The corresponding coding blocks are stored in the storage node coded as h , then one storage node stores r coding blocks, r is the element repetition degree; the obtained parameters are [ n , k = n - t , d = n - t +1 ], n is the number of nodes, k is the number of downloaded data, d is the number of nodes participating in the repair, t is the number of elements of the subset,
Figure 608747DEST_PATH_IMAGE026
Indicates that element h appears at the jth position in the ith block.

本发明还提供一种包括上述再生码构造方法的文件重构方法,包括如下步骤:The present invention also provides a file reconstruction method comprising the above-mentioned regeneration code construction method, comprising the following steps:

A1.根据步骤S1~S5构造再生码;A1. Construct the regeneration code according to steps S1~S5;

A2.连接到存储节点;A2. connect to the storage node;

A3.在下载当前编码块时,进行重构第一次判断;A3. When downloading the current encoding block, make the first judgment of reconstruction;

A4.在完成一个编码块的下载后,进行重构第二次判断;A4. After completing the download of a coding block, the second judgment of reconstruction is performed;

A5.利用内层及外层的MDS编码规则恢复原文件。A5. Use the inner and outer MDS coding rules to restore the original file.

步骤A3的重构第一次判断具体包括,对于当前具有相同区组编号i 0的文件符号

Figure 883870DEST_PATH_IMAGE027
,是否已经下载了m-t+1个文件符号
Figure 996183DEST_PATH_IMAGE028
所代表的编码块,
Figure 190404DEST_PATH_IMAGE029
表示元素h出现在第i 0个区组中的第j个位置,m为区组容量;若判断为是,则停止下载当前编码块;若判断为否,则下载当前编码块;The first judgment of the reconstruction of step A3 specifically includes, for the current file symbol with the same block number i 0
Figure 883870DEST_PATH_IMAGE027
, whether m - t +1 file symbols have been downloaded
Figure 996183DEST_PATH_IMAGE028
represents the coding block,
Figure 190404DEST_PATH_IMAGE029
Indicates that the element h appears at the jth position in the i0th block, and m is the block capacity; if it is judged to be yes, then stop downloading the current coding block; if it is judged to be no, then download the current coding block;

步骤A4的重构第二次判断具体包括,判断是否已经下载了

Figure 388167DEST_PATH_IMAGE030
个编码块;若判断为是,则停止下载过程,若判断为否,则继续下载过程,b为区组个数,t为子集的元素数,λt-平衡数,表示任意t元子集出现在λ个区组中。The second judgment of reconstruction in step A4 specifically includes, judging whether it has been downloaded
Figure 388167DEST_PATH_IMAGE030
If it is judged as yes, then stop the download process, if it is judged as no, then continue the download process, b is the number of blocks, t is the number of elements of the subset, λ is the t -balance number, representing any t element Subsets appear in λ blocks.

本发明还提供了一种包括上述的再生码构造方法的节点修复方法,包括如下步骤:The present invention also provides a node repair method including the above-mentioned regeneration code construction method, comprising the following steps:

B1.根据步骤S1~S5构造再生码;B1. Construct the regeneration code according to steps S1~S5;

B2.确定失效节点的序号;B2. Determine the serial number of the failed node;

B3.在组合设计中确定失效节点出现过的区组序号;B3. Determine the block serial number of the failed node in the combined design;

B4.从n-t+1个节点中下载编码块,共下载r(m-t+1)个编码块,r为元素重复度,n为节点个数,t为子集的元素数,m为区组容量;B4. Download coding blocks from n - t +1 nodes, download r ( m - t +1) coding blocks in total, r is the element repetition degree, n is the number of nodes, t is the number of elements of the subset, m is the area group capacity;

B5.将步骤B4下载的r(m-t+1)个编码块分成r组,每组的编码块具有相同的区组编号iB5. The r ( m - t +1) code blocks downloaded in step B4 are divided into r groups, and the code blocks of each group have the same block number i ;

B6.对每一组编码块利用内层获得的MDS码的编码规则恢复一个编码块;将r组编码块恢复出r个编码块;B6. For each group of coding blocks, use the coding rule of the MDS code obtained by the inner layer to restore a coding block; restore r coding blocks from the r group of coding blocks;

B7.将恢复出的r个编码块存储到一个新的存储节点,并将此节点加入原存储系统中完成修复过程。B7. Store the recovered r encoded blocks in a new storage node, and add this node to the original storage system to complete the repair process.

步骤B2具体为,从剩下的完好节点中对元素编号,记为{h 1,...,h v-1},将元素编号与

Figure 466981DEST_PATH_IMAGE031
设计中的元素对比,确定失效节点的序号h 0;其中v为设计的阶,m为区组容量,λt-平衡数;步骤B3的区组序号记为{i 1,...,i r };步骤B4的文件符号具体为
Figure 309166DEST_PATH_IMAGE032
,表示h 0出现在第i个区组中的第j个位置,
Figure 815234DEST_PATH_IMAGE033
。Step B2 is specifically: number the elements from the remaining intact nodes, denoted as { h 1 ,..., h v -1 }, and compare the element number with the
Figure 466981DEST_PATH_IMAGE031
The elements in the design are compared, and the sequence number h 0 of the failed node is determined; where v is the design order, m is the block capacity, λ is the t -balance number; the block sequence number in step B3 is marked as { i 1 ,..., i r }; The file symbol of step B4 is specifically
Figure 309166DEST_PATH_IMAGE032
, indicating that h 0 appears at the jth position in the ith block,
Figure 815234DEST_PATH_IMAGE033
.

本发明提出的再生码构造方法参数限制宽松且能达到辛格尔顿上界,文件重构和节点修复过程运算简单,同时节点修复效率较高。The regeneration code construction method proposed by the invention has loose parameter restrictions and can reach the Singleton upper bound, the file reconstruction and node repairing process are simple in operation, and the node repairing efficiency is high.

附图说明Description of drawings

图1为现有技术中RS码修复过程示意图。FIG. 1 is a schematic diagram of an RS code repair process in the prior art.

图2为现有技术中分布式存储系统的信息流图。FIG. 2 is an information flow diagram of a distributed storage system in the prior art.

图3为本发明方法的再生码构造方法的流程示意图。FIG. 3 is a schematic flowchart of a method for constructing a regenerative code according to the method of the present invention.

图4为本发明方法的再生码构造方法的实施例存储编码示意图。FIG. 4 is a schematic diagram of storage coding according to an embodiment of a method for constructing a regenerative code according to the method of the present invention.

图5为本发明方法的文件重构方法的流程示意图。FIG. 5 is a schematic flowchart of a file reconstruction method according to the method of the present invention.

图6为本发明方法的文件重构方法的实施例示意图。FIG. 6 is a schematic diagram of an embodiment of a file reconstruction method according to the method of the present invention.

图7为本发明方法的节点修复方法的流程示意图。FIG. 7 is a schematic flowchart of a node repairing method according to the method of the present invention.

图8为本发明方法的节点修复方法的实施例示意图。FIG. 8 is a schematic diagram of an embodiment of a node repairing method according to the method of the present invention.

具体实施方式Detailed ways

如图3为本发明方法的再生码构造方法的流程示意图。本发明提供的这种再生码构造方法,包括如下步骤:FIG. 3 is a schematic flowchart of a method for constructing a regenerative code according to the method of the present invention. This regeneration code construction method provided by the present invention comprises the following steps:

S1.获取组合设计和存储数据块;获取的组合设计具体为,选择满足组合条件的t-设计

Figure 500293DEST_PATH_IMAGE034
λt-平衡数,表示任意t元子集均出现在λ个区组中,其中,S1. Obtain the combination design and store the data block; the obtained combination design is specifically, select the t -design that satisfies the combination condition
Figure 500293DEST_PATH_IMAGE034
, λ is the t -balance number, which means that any subset of t elements appears in λ blocks, where,

b为区组个数,

Figure 976274DEST_PATH_IMAGE035
;Let b be the number of blocks,
Figure 976274DEST_PATH_IMAGE035
;

r为元素重复度,

Figure 63179DEST_PATH_IMAGE010
;Let r be the element repetition degree,
Figure 63179DEST_PATH_IMAGE010
;

组合条件为:The combination conditions are:

Figure 740148DEST_PATH_IMAGE036
Figure 740148DEST_PATH_IMAGE036

其中,t为子集的元素数;v为设计的阶,具体为设计中元素的种类数;m为区组容量,具体为一个区组中的元素个数;Among them, t is the number of elements of the subset; v is the order of the design, specifically the number of types of elements in the design; m is the block capacity, specifically the number of elements in a block;

获取的存储数据块具体为,设存储原文件大小为M,将原文件拆分成相等大小的

Figure 912503DEST_PATH_IMAGE037
个数据块。The obtained storage data block is specifically, set the size of the original file to be stored as M, and split the original file into equal-sized files.
Figure 912503DEST_PATH_IMAGE037
data blocks.

S2.对区组排序并编号,统计元素出现的区组及位置,得到文件符号;文件符号设为

Figure 707021DEST_PATH_IMAGE038
,其中1≤ib,1≤jmb为区组个数,m为区组容量;同时
Figure 382853DEST_PATH_IMAGE039
表示元素h出现在第i个区组中的第j个位置;将区组编号记为B 1,B 2,…,B b ,对于区组B i ,将其中的元素先后顺序排列,编号记为位置{1,2,…,m}。S2. Sort and number the blocks, count the blocks and positions where the elements appear, and get the file symbol; the file symbol is set to
Figure 707021DEST_PATH_IMAGE038
, where 1≤i≤b , 1≤j≤m , b is the number of blocks, m is the block capacity;
Figure 382853DEST_PATH_IMAGE039
Indicates that the element h appears at the jth position in the ith block; the block number is recorded as B 1 , B 2 ,..., B b , for the block B i , the elements in it are arranged in sequence, and the number is marked as is the position {1,2,…, m }.

S3.利用双层编码策略对数据块进行编码;外层获得的MDS码构造如下:S3. The data block is coded using a double-layer coding strategy; the MDS code obtained by the outer layer is constructed as follows:

利用参数为

Figure 230724DEST_PATH_IMAGE040
MDS码对
Figure 15009DEST_PATH_IMAGE041
个数据块进行编码,获得b(m-t+1)个编码块,其中b为区组个数,m为区组容量,t为子集的元素数,λt-平衡数,表示任意t元子集出现在λ个区组;Use parameters as
Figure 230724DEST_PATH_IMAGE040
MDS code pair
Figure 15009DEST_PATH_IMAGE041
Encode the data blocks to obtain b ( m - t +1) coding blocks, where b is the number of blocks, m is the block capacity, t is the number of elements of the subset, λ is the t -balance number, representing any A subset of t -elements occurs in λ blocks;

内层获得的MDS码构造如下:The MDS code obtained by the inner layer is constructed as follows:

将上述利用外层编码得到的b(m-t+1)个编码块均分为b组,利用参数为(m,m-t+1)MDS码对每组中的m-t+1个编码块进行二次编码,最终得到bm个编码块。Divide the above b ( m - t +1) coding blocks obtained by using the outer layer coding into b groups, and use the parameter ( m , m - t +1) MDS code for m - t +1 in each group. The coding block is subjected to secondary coding, and finally bm coding blocks are obtained.

S4.为上述获得的每一个编码块分配一个文件符号作为标识,不失一般性地,将

Figure 973738DEST_PATH_IMAGE017
的前
Figure 35235DEST_PATH_IMAGE042
个文件符号
Figure 788427DEST_PATH_IMAGE043
作为外层编码生成的冗余块的标识;将
Figure 810741DEST_PATH_IMAGE044
的文件符号
Figure 307581DEST_PATH_IMAGE026
作为内层编码生成的冗余块的标识,其中,
Figure 223585DEST_PATH_IMAGE045
表示元素h出现在第i个区组中的第j个位置,m为区组容量。S4. Assign a file symbol as an identifier to each coding block obtained above, without loss of generality, set
Figure 973738DEST_PATH_IMAGE017
the former
Figure 35235DEST_PATH_IMAGE042
file symbol
Figure 788427DEST_PATH_IMAGE043
As the identifier of the redundant block generated by the outer encoding; the
Figure 810741DEST_PATH_IMAGE044
file symbol
Figure 307581DEST_PATH_IMAGE026
As the identifier of the redundant block generated by the inner layer coding, where,
Figure 223585DEST_PATH_IMAGE045
Indicates that the element h appears at the jth position in the ith block, and m is the block capacity.

S5.将每个元素一致的文件符号集合所对应的编码块放置在同一存储节点中,得到再生码。具体包括,每当发现一个文件符号

Figure 537891DEST_PATH_IMAGE046
,若其元素h与之前的文件符号
Figure 906556DEST_PATH_IMAGE047
不同,则分配一个新的存储节点并编号为h;对于每一个文件符号
Figure 207087DEST_PATH_IMAGE048
,若其元素h已分配存储节点,则将元素h相同的文化符号
Figure 243176DEST_PATH_IMAGE049
所对应的编码块存放到编码为h的存储节点中,则一个存储节点存储r个编码块,r为元素重复度;得到参数为[n,k=n-t,d=n-t+1]的再生码,n为节点个数,k为下载数据的个数,d为参与修复的节点个数,t为子集的元素数。S5. The coding blocks corresponding to the file symbol sets that are consistent with each element are placed in the same storage node to obtain a regeneration code. Specifically include, whenever a file symbol is found
Figure 537891DEST_PATH_IMAGE046
, if its element h is the same as the previous file symbol
Figure 906556DEST_PATH_IMAGE047
is different, a new storage node is allocated and numbered h ; for each file symbol
Figure 207087DEST_PATH_IMAGE048
, if its element h has been allocated a storage node, then the same cultural symbol of element h
Figure 243176DEST_PATH_IMAGE049
The corresponding coding blocks are stored in the storage node coded as h , then one storage node stores r coding blocks, r is the element repetition degree; the obtained parameters are [ n , k = n - t , d = n - t +1 ], n is the number of nodes, k is the number of downloaded data, d is the number of nodes participating in the repair, and t is the number of elements of the subset.

在具体实施方式中,给出一个参数M=39,n=8,d=7,k=6的存储编码实例,如图4为本发明方法的再生码构造方法的实施例存储编码示意图。具体为如下步骤:In the specific implementation, an example of storage coding with parameters M=39, n =8, d =7, k =6 is given, as shown in FIG. The specific steps are as follows:

步骤(1)、该分布式存储系统取得39个初始数据块(该处的数据块是指将原文件经过步骤S1后获得),利用步骤S3中的双层编码策略进行编码,得到56个编码块。Step (1), the distributed storage system obtains 39 initial data blocks (the data block here refers to the original file obtained after going through step S1), and uses the double-layer encoding strategy in step S3 to encode, and obtains 56 codes piece.

该参数为[n=8,k=6,d=7] 分布式存储系统,是基于2-(8,4,3)设计来构建的,This parameter is [ n =8, k =6, d =7] Distributed storage system, which is built based on 2-(8,4,3) design,

2-(8,4,3)进排序后如下:{{0,1,2,3},{0,1,2,4},{0,1,5,6},{0,2,5,7},{0,3,4,5},{0,3,6,7},{0,4,6,7},{1,2,6,7},{1,3,4,6},{1,3,5,7},{1,4,5,7},{2,3,4,7},{2,3,5,6},{2,4,5,6}}。2-(8,4,3) is sorted as follows: {{0,1,2,3},{0,1,2,4},{0,1,5,6},{0,2, 5,7},{0,3,4,5},{0,3,6,7},{0,4,6,7},{1,2,6,7},{1,3, 4,6},{1,3,5,7},{1,4,5,7},{2,3,4,7},{2,3,5,6},{2,4, 5,6}}.

步骤(2)、统计其内元素出现的区组及位置,结果如下:

Figure 709143DEST_PATH_IMAGE050
Step (2), count the blocks and positions where the elements appear, and the results are as follows:
Figure 709143DEST_PATH_IMAGE050

Figure 565103DEST_PATH_IMAGE051
,
Figure 565103DEST_PATH_IMAGE051
,

Figure 669326DEST_PATH_IMAGE052
Figure 684555DEST_PATH_IMAGE053
Figure 669326DEST_PATH_IMAGE052
Figure 684555DEST_PATH_IMAGE053

Figure 216030DEST_PATH_IMAGE054
Figure 216030DEST_PATH_IMAGE054
;

步骤(3)、为56个编码块各分配一个文件符号

Figure 293708DEST_PATH_IMAGE055
,其中
Figure 811408DEST_PATH_IMAGE056
表示元素h出现在第i个区组中的第j个位置。Step (3), assign a file symbol to each of the 56 coding blocks
Figure 293708DEST_PATH_IMAGE055
,in
Figure 811408DEST_PATH_IMAGE056
Indicates that element h appears at the jth position in the ith block.

步骤(4)、双层编码策略具体如下,内层编码利用参数为(4, 3)的MDS码进行编码,不失一般性地令每个区组中的最后一个文件符号所对应的编码块作为校验块,那么可得Step (4), the double-layer encoding strategy is as follows, the inner layer encoding uses the MDS code with parameters (4, 3) to encode, without loss of generality, make the encoding block corresponding to the last file symbol in each block group. As a check block, then we can get

Figure 556510DEST_PATH_IMAGE057
Figure 556510DEST_PATH_IMAGE057

Figure 258887DEST_PATH_IMAGE058
Figure 258887DEST_PATH_IMAGE058

外层编码利用参数为(42, 39)的MDS进行编码。在本实施例中,不失一般性地令

Figure 89440DEST_PATH_IMAGE059
所对应的编码块作为校验块,那么可得The outer coding is coded using MDS with parameters (42, 39). In this embodiment, without loss of generality, let
Figure 89440DEST_PATH_IMAGE059
The corresponding coding block is used as the check block, then we can get

Figure 394519DEST_PATH_IMAGE060
Figure 394519DEST_PATH_IMAGE060

Figure 994128DEST_PATH_IMAGE061
Figure 994128DEST_PATH_IMAGE061

Figure 867406DEST_PATH_IMAGE062
Figure 867406DEST_PATH_IMAGE062
.

在本实施例中,为了保证校验符号之间相互独立,文件符号前的系数如xx 2x 3等为范德蒙矩阵不同行的数据。同时模加法运算均是在有限域内进行,本实施例将有限域设置为GF(43)。In this embodiment, in order to ensure that the check symbols are independent of each other, the coefficients before the file symbols, such as x , x 2 , x 3 , etc., are data of different rows of the Vandermonde matrix. At the same time, the modulo addition operations are all performed in a finite field, and in this embodiment, the finite field is set as GF(43).

步骤(5)、为不同的元素分配不同的存储节点,该实施例中共由0~7个元素,则需创建8个存储节点;将每个h一致的符号

Figure 559156DEST_PATH_IMAGE063
所对应的编码块放置在同一存储节点中,那么一个存储节点存储7个编码块,最终得到如图4所示的存储编码。Step (5): Allocate different storage nodes for different elements. In this embodiment, there are 0 to 7 elements in total, and 8 storage nodes need to be created; the symbols that are consistent with each h are
Figure 559156DEST_PATH_IMAGE063
If the corresponding coding blocks are placed in the same storage node, then one storage node stores 7 coding blocks, and finally the storage code shown in Figure 4 is obtained.

如图5为本发明方法的文件重构方法的流程示意图。基于所述再生码构造方法的文件重构方法包括如下步骤:FIG. 5 is a schematic flowchart of a file reconstruction method according to the method of the present invention. The file reconstruction method based on the regenerated code construction method comprises the following steps:

A1.根据步骤S1~S5构造再生码;A1. Construct the regeneration code according to steps S1~S5;

A2.连接到存储节点;A2. connect to the storage node;

文件重构方法主要利用步骤S3生成的外层参数为

Figure 808872DEST_PATH_IMAGE064
的MDS码提供的冗余信息;原始数据块数目为
Figure 262987DEST_PATH_IMAGE065
,即增加了
Figure 307166DEST_PATH_IMAGE066
个冗余块,因此,在文件重构过程中需要收集
Figure 971366DEST_PATH_IMAGE067
个不同的编码块。在本实施方式中,最多取n-t个存储节点,用于下载数据。The file reconstruction method mainly uses the outer parameters generated in step S3 as
Figure 808872DEST_PATH_IMAGE064
The redundancy information provided by the MDS code; the number of original data blocks is
Figure 262987DEST_PATH_IMAGE065
, which increases the
Figure 307166DEST_PATH_IMAGE066
redundant blocks, therefore, need to be collected during file reconstruction
Figure 971366DEST_PATH_IMAGE067
different coding blocks. In this implementation manner, at most n - t storage nodes are selected for downloading data.

A3.在下载当前编码块时,进行重构第一次判断;步骤A2的重构第一次判断具体包括,对于当前具有相同区组编号i 0的文件符号

Figure 759193DEST_PATH_IMAGE068
,是否已经下载了m-t+1个文件符号
Figure 333394DEST_PATH_IMAGE069
所代表的编码块,
Figure 423841DEST_PATH_IMAGE070
表示元素h出现在第i 0个区组中的第j个位置,m为区组容量,若判断为是,则停止下载当前编码块,若判断为否,则下载当前编码块;A3. When downloading the current coding block, the first judgment of reconstruction is performed; the first judgment of reconstruction in step A2 specifically includes, for the current file symbol with the same block number i 0
Figure 759193DEST_PATH_IMAGE068
, whether m - t +1 file symbols have been downloaded
Figure 333394DEST_PATH_IMAGE069
represents the coding block,
Figure 423841DEST_PATH_IMAGE070
Indicates that element h appears in the jth position in the i0th block, m is the block capacity, if it is judged to be yes, then stop downloading the current coding block, if it is judged to be no, then download the current coding block;

A4.在完成一个编码块的下载后,进行重构第二次判断;步骤A3的重构第二次判断具体包括,判断是否已经下载了

Figure 716282DEST_PATH_IMAGE071
个编码块,若判断为是,则停止下载过程,若判断为否,则继续下载过程,b为区组个数,t为子集的元素数,λt-平衡数,表示任意t元子集均出现在λ个区组中;A4. After completing the downloading of a coding block, a second judgment of reconstruction is performed; the second judgment of reconstruction in step A3 specifically includes, judging whether it has been downloaded
Figure 716282DEST_PATH_IMAGE071
If the judgment is yes , the downloading process will be stopped; if the judgment is negative , the downloading process will be continued . The subsets all appear in λ blocks;

A5.利用内层及外层的MDS编码规则恢复原文件;具体为已下载到

Figure 776642DEST_PATH_IMAGE071
个编码块时,若同一区组中编码块数目少于m-t+1个,那么利用外层
Figure 329983DEST_PATH_IMAGE072
MDS编码规则恢复编码块;若同一区组中编码块数目为m-t+1个,那么利用内层(m, m-t+1) MDS编码规则恢复编码块。A5. Use the MDS coding rules of the inner and outer layers to restore the original file; specifically, it has been downloaded to
Figure 776642DEST_PATH_IMAGE071
If the number of coding blocks in the same block is less than m - t +1, then the outer layer is used.
Figure 329983DEST_PATH_IMAGE072
The MDS coding rule restores the coding block; if the number of coding blocks in the same block group is m - t +1, then the inner layer ( m , m - t +1) MDS coding rule is used to restore the coding block.

并不是每一次重构操作都需n-t个存储节点传输数据,而是只需下载

Figure 981544DEST_PATH_IMAGE073
不同的编码块即可结束文件传输过程,进而开始文件重构操作。Not every reconstruction operation requires n - t storage nodes to transfer data, but just download
Figure 981544DEST_PATH_IMAGE073
The different encoding blocks can end the file transfer process and start the file reconstruction operation.

具体实施方式中,给出一个M=39,n=8,d=7,k=6的文件重构实例,如图6为本发明方法的文件重构方法的实施例示意图。具体说明如下:In the specific implementation, an example of file reconstruction with M=39, n=8, d=7, and k=6 is given. FIG. 6 is a schematic diagram of an embodiment of a file reconstruction method according to the method of the present invention. The specific instructions are as follows:

步骤1)、在本实施例中,连接至Node 0~Node 5这6个节点进行文件重构操作。Step 1), in this embodiment, connect to 6 nodes Node 0 to Node 5 to perform file reconstruction operation.

步骤2)、Node 0需下载所有数据,Node 1需下载所有数据,Node 2需下载所有数据,Node 3中下载除文件符号

Figure 495702DEST_PATH_IMAGE074
对应的编码块外的所有数据,Node 4中下载除
Figure 625332DEST_PATH_IMAGE075
对应的编码块外的所有数据,Node 5中下载除
Figure 285377DEST_PATH_IMAGE076
对应的编码块外的所有数据,至此共下载39个编码块,数据下载过程结束。Step 2), Node 0 needs to download all data, Node 1 needs to download all data, Node 2 needs to download all data, and Node 3 downloads all data except the file symbol
Figure 495702DEST_PATH_IMAGE074
All data except the corresponding coding block, downloaded in Node 4 except
Figure 625332DEST_PATH_IMAGE075
All data except the corresponding coding block, downloaded in Node 5 except
Figure 285377DEST_PATH_IMAGE076
For all the data outside the corresponding coding block, a total of 39 coding blocks have been downloaded so far, and the data downloading process ends.

步骤3)、利用外层(42, 39)MDS编码规程进行文件恢复。由于Step 3), using the outer layer (42, 39) MDS coding procedure to restore the file. because

Figure 842260DEST_PATH_IMAGE077
Figure 842260DEST_PATH_IMAGE077

Figure 843714DEST_PATH_IMAGE078
Figure 843714DEST_PATH_IMAGE078

Figure 901669DEST_PATH_IMAGE079
Figure 901669DEST_PATH_IMAGE079

而从Node 0~Node 5下载的数据缺少

Figure 304969DEST_PATH_IMAGE080
所代表的的的编码块,所以从
Figure 767174DEST_PATH_IMAGE081
中消去其余编码块后结果如下:And the data downloaded from Node 0~Node 5 is missing
Figure 304969DEST_PATH_IMAGE080
represented by the encoding block, so from
Figure 767174DEST_PATH_IMAGE081
After eliminating the remaining coding blocks, the result is as follows:

Figure 131290DEST_PATH_IMAGE082
Figure 131290DEST_PATH_IMAGE082

Figure 602723DEST_PATH_IMAGE083
Figure 602723DEST_PATH_IMAGE083

Figure 860529DEST_PATH_IMAGE084
Figure 860529DEST_PATH_IMAGE084

步骤4)、编码过程中将有限域设置为GF(43),且

Figure 24794DEST_PATH_IMAGE085
线性无关,那么利用上述三式能将
Figure 125474DEST_PATH_IMAGE086
所代表的编码块恢复出来。那么至此,所有编码块均已得到。Step 4), the finite field is set to GF(43) in the encoding process, and
Figure 24794DEST_PATH_IMAGE085
Linear independent, then using the above three equations can be
Figure 125474DEST_PATH_IMAGE086
The encoded block represented is recovered. So far, all coding blocks have been obtained.

如图7为本发明方法的节点修复方法的流程示意图,基于所述再生码构造方法的节点修复方法,包括如下步骤:FIG. 7 is a schematic flowchart of the node repair method of the method of the present invention. The node repair method based on the regeneration code construction method includes the following steps:

B1.根据步骤S1~S5构造再生码;B1. Construct the regeneration code according to steps S1~S5;

B2.确定失效节点的序号;具体为,从剩下的完好节点中对元素编号,记为

Figure 869439DEST_PATH_IMAGE087
,将元素编号与
Figure 247331DEST_PATH_IMAGE088
设计中的元素对比,确定失效节点的序号h 0;节点修复方法根据步骤S3的内层参数为(m,m-t+1)的MDS码提供的冗余信息,由于内层参数为(m,m-t+1)的MDS码只有t-1个数据符号作为冗余信息,因此共需下载r(m-t+1)个编码块。B2. Determine the serial number of the failed node; specifically, number the elements from the remaining intact nodes, denoted as
Figure 869439DEST_PATH_IMAGE087
, compare the element number with
Figure 247331DEST_PATH_IMAGE088
The elements in the design are compared, and the serial number h 0 of the failed node is determined; the node repair method is based on the redundant information provided by the MDS code whose inner parameter in step S3 is ( m , m - t +1), since the inner parameter is ( m , m - t +1) of the MDS code has only t -1 data symbols as redundant information, so a total of r ( m - t +1) coding blocks need to be downloaded.

B3.在组合设计中确定失效节点出现过的区组序号,依次记为

Figure 690819DEST_PATH_IMAGE089
;B3. Determine the block serial number of the failed node in the combined design, and record it as
Figure 690819DEST_PATH_IMAGE089
;

B4.从n-t+1个节点中下载编码块,对应文件符号具体为

Figure 154162DEST_PATH_IMAGE090
Figure 967397DEST_PATH_IMAGE091
表示h出现在第i个区组中的第j个位置,共下载r(m-t+1)个编码块;B4. Download the encoded block from n - t +1 nodes, the corresponding file symbol is specifically
Figure 154162DEST_PATH_IMAGE090
,
Figure 967397DEST_PATH_IMAGE091
Indicates that h appears at the jth position in the ith block, and a total of r ( m - t +1) coding blocks are downloaded;

B5.将步骤B4下载的r(m-t+1)个编码块分成r组,每组的编码块具有相同的区组编号iB5. The r ( m - t +1) code blocks downloaded in step B4 are divided into r groups, and the code blocks of each group have the same block number i ;

B6.对每一组编码块利用内层获得的MDS码的编码规则恢复一个编码块;将r组编码块恢复出r个编码块;B6. For each group of coding blocks, use the coding rule of the MDS code obtained by the inner layer to restore a coding block; restore r coding blocks from the r group of coding blocks;

B7.将恢复出的r个编码块存储到一个新的存储节点,并将此节点加入原存储系统中完成修复过程。B7. Store the recovered r encoded blocks in a new storage node, and add this node to the original storage system to complete the repair process.

对于一个t-设计

Figure 199795DEST_PATH_IMAGE092
,即共有v个不同的元素,每个区组有m个元素,任意t个元素相遇λ次(即出现在同一个区组中)。在编码过程的步骤S3中,内层参数为(m,m-t+1)的MDS码提供t-1份冗余信息,那么对于属于同一个区组的编码块能容忍t-1个编码块的失效。进一步地,整个存储系统能容忍t个节点的失效,共失效tr个编码块,该过程中失效的编码不能只由内层(m,m-t+1) MDS码提供的冗余信息进行恢复,而还必须依赖于外层参数为
Figure 830497DEST_PATH_IMAGE093
的MDS码来进行恢复;其中内层(m,m-t+1)MDS码能为这tr个失效编码块提供
Figure 249977DEST_PATH_IMAGE094
份冗余信息,外层
Figure 866903DEST_PATH_IMAGE095
MDS码能为这些冗余文件提供
Figure 829174DEST_PATH_IMAGE096
份冗余信息,因此利用这
Figure 506143DEST_PATH_IMAGE097
份冗余信息能恢复这失效的tr个编码块。for a t -design
Figure 199795DEST_PATH_IMAGE092
, that is, there are v different elements in total, each block has m elements, and any t elements meet λ times (that is, appear in the same block). In step S3 of the encoding process, the MDS code whose inner parameter is ( m , m - t +1) provides t -1 redundant information, then t -1 encodings can be tolerated for the encoding blocks belonging to the same block group Invalidation of blocks. Further, the entire storage system can tolerate the failure of t nodes, and tr code blocks fail in total. The code that fails in this process cannot be recovered only by the redundant information provided by the inner ( m , m - t +1) MDS code. , and must also depend on the outer parameters as
Figure 830497DEST_PATH_IMAGE093
The MDS code is used for recovery; the inner ( m , m - t +1) MDS code can provide the tr failed coding blocks
Figure 249977DEST_PATH_IMAGE094
redundant information, outer layer
Figure 866903DEST_PATH_IMAGE095
MDS codes can provide these redundant files
Figure 829174DEST_PATH_IMAGE096
redundant information, so use this
Figure 506143DEST_PATH_IMAGE097
The redundant information can restore the failed tr coding blocks.

在具体实施方式中,给出一个M=39,n=8,d=7,k=6的节点修复实例,如图8为本发明方法的节点修复方法的实施例示意图。具体说明如下:在本实施例中,令Node 0失效,那么具体修复过程如下:In the specific implementation manner, an example of node repairing with M=39, n =8, d =7, k =6 is given. FIG. 8 is a schematic diagram of an embodiment of the node repairing method of the method of the present invention. The specific description is as follows: In this embodiment, if Node 0 is disabled, the specific repair process is as follows:

步骤一、由于Node 1~Node 7节点完好,那么失效节点的元素编号为0;Step 1. Since the nodes from Node 1 to Node 7 are intact, the element number of the failed node is 0;

步骤二、该分布式存储系统是基于2-(8,4,3)设计的,那么元素0出现过的区组序号有{1,2,3,4,5,6,7}。Step 2: The distributed storage system is designed based on 2-(8,4,3), then the block sequence numbers where element 0 has appeared are {1,2,3,4,5,6,7}.

步骤三、从7个存储节点中下载

Figure 678498DEST_PATH_IMAGE098
所代表的文件,Step 3. Download from 7 storage nodes
Figure 678498DEST_PATH_IMAGE098
the document represented,

Figure 833536DEST_PATH_IMAGE099
Figure 833536DEST_PATH_IMAGE099

Figure 899581DEST_PATH_IMAGE100
,则共下载了21个编码块。
Figure 899581DEST_PATH_IMAGE100
, a total of 21 code blocks are downloaded.

步骤四、将所下载的编码块分为7组,每组中的编码块

Figure 747451DEST_PATH_IMAGE101
具有相同的区组编号i,分组后如下
Figure 407103DEST_PATH_IMAGE102
Step 4. Divide the downloaded coding blocks into 7 groups, and the coding blocks in each group
Figure 747451DEST_PATH_IMAGE101
have the same block number i , grouped as follows
Figure 407103DEST_PATH_IMAGE102

Figure 736803DEST_PATH_IMAGE103
Figure 736803DEST_PATH_IMAGE103
.

步骤五、对于对每一组均利用内层参数为(4, 3)的MDS码的编码规则恢复出一个编码块,例如在第1组中

Figure 532721DEST_PATH_IMAGE104
,且
Figure 285913DEST_PATH_IMAGE105
已获得,那么经过异或操作后即可恢复
Figure 557494DEST_PATH_IMAGE106
。类似地对每一组均进行上述恢复操作,最终能获得
Figure 319914DEST_PATH_IMAGE107
,而这正是失效节点中的编码块。Step 5. For each group, use the encoding rule of the MDS code with the inner parameter of (4, 3) to recover a coding block, for example, in the first group
Figure 532721DEST_PATH_IMAGE104
,and
Figure 285913DEST_PATH_IMAGE105
has been obtained, then it can be restored after XOR operation
Figure 557494DEST_PATH_IMAGE106
. Similarly, the above recovery operation is performed for each group, and finally the
Figure 319914DEST_PATH_IMAGE107
, which is the encoded block in the failed node.

步骤六、将上述7个编码块存储至一个新的存储节点,并将此节点加入原分布式存储系统中。至此,节点修复过程完成。Step 6: Store the above seven coding blocks in a new storage node, and add this node to the original distributed storage system. At this point, the node repair process is complete.

总结本发明提出的码字与Chao Tian提出的码字相关参数,如下表所示:Summarize the relevant parameters of the code word proposed by the present invention and the code word proposed by Chao Tian, as shown in the following table:

Figure 235918DEST_PATH_IMAGE108
Figure 235918DEST_PATH_IMAGE108

证明利用上述编码方案生成的[n,k=n-t,d=n-t+1]分布式存储系统能达到辛格尔顿界,即d min= n-k+1,证明过程如下:Prove that the [ n , k = n - t , d = n - t +1] distributed storage system generated by the above coding scheme can reach the Singleton bound, that is, d min = n - k +1, and the proof process is as follows:

Ⅰ.根据编码过程中的步骤S5,将所有具有相同h

Figure 160011DEST_PATH_IMAGE109
所表示的编码块存储至同一节点中,那么一个节点中存储的文件数
Figure 669621DEST_PATH_IMAGE110
r为元素重复度;并且为每一个不同的元素h分配一个新的存储节点,那么存储系统的总节点数n=v。I. According to step S5 in the encoding process, all the
Figure 160011DEST_PATH_IMAGE109
If the encoded block represented is stored in the same node, then the number of files stored in a node
Figure 669621DEST_PATH_IMAGE110
, r is the element repetition degree; and a new storage node is allocated for each different element h , then the total number of nodes in the storage system is n = v .

Ⅱ.编码过程的步骤S3,由于存在双层MDS编码,那么该分布式系统中能存储的数据块数目

Figure 970152DEST_PATH_IMAGE111
,其中
Figure 6242DEST_PATH_IMAGE112
。II. In step S3 of the encoding process, due to the existence of double-layer MDS encoding, the number of data blocks that can be stored in the distributed system
Figure 970152DEST_PATH_IMAGE111
,in
Figure 6242DEST_PATH_IMAGE112
.

Ⅲ.根据

Figure 101236DEST_PATH_IMAGE113
。根据节点修复过程,一个节点失效需连接至
Figure 347410DEST_PATH_IMAGE114
个节点进行修复。Ⅲ. according to
Figure 101236DEST_PATH_IMAGE113
. According to the node repair process, a node failure needs to be connected to
Figure 347410DEST_PATH_IMAGE114
nodes are repaired.

Ⅳ.辛格尔顿界的左式:Ⅳ. The left form of the Singleton bound:

左式=n-(n-t)+1=t+1Left = n -( n - t )+1= t +1

辛格尔顿界的右式:The right-hand form of the Singleton bound:

Figure 451632DEST_PATH_IMAGE115
Figure 451632DEST_PATH_IMAGE115

由于vm>0,所以上式Since vm > 0, the above formula

Figure 811069DEST_PATH_IMAGE116
Figure 811069DEST_PATH_IMAGE116

Ⅴ.由于编码过程的步骤S1中,选择的t-设计需要满足的要求是

Figure 716446DEST_PATH_IMAGE117
,因此辛格尔顿界右式=t+1。V. As in step S1 of the encoding process, the selected t -design needs to satisfy the requirement that
Figure 716446DEST_PATH_IMAGE117
, so the right-hand Singleton bound = t + 1.

Ⅵ.由于左式=右式,因此证毕。VI. Since the left form = the right form, the proof is completed.

Claims (9)

1. A method for constructing a regenerated code, comprising the steps of:
s1, acquiring a combined design and storage data block; the obtained combination design is specifically that the combination condition is satisfiedt-designing
Figure 87018DEST_PATH_IMAGE001
λIs composed oftEquilibrium number, representing arbitrarytThe meta-subset all appear inλThe number of the blocks is the same as the number of the blocks,
note the bookbThe number of the block groups is used as the number of the block groups,
Figure 387549DEST_PATH_IMAGE002
note the bookrThe degree of repetition of the elements is,
Figure 158059DEST_PATH_IMAGE003
the combination conditions are as follows:
Figure 518633DEST_PATH_IMAGE004
wherein,tis the number of elements of the subset;vthe order of the design, specifically the total number of different elements in the design;mthe block capacity is the number of elements in a block;
the obtained storage data block is specifically that the size of the original storage file is set as M, and the original file is split into
Figure 109014DEST_PATH_IMAGE005
Data blocks of equal size;
s2, sorting and numbering the block groups, and counting the block groups and positions of the elements to obtain file symbols;
s3, coding the data block to obtain a coding block;
s4, distributing a different file symbol for each coding block;
and S5, placing the coding blocks corresponding to the file symbol sets with the consistent elements in the same storage node to obtain the regeneration codes.
2. The method according to claim 1, wherein the file symbol of step S2 is set as
Figure 213237DEST_PATH_IMAGE006
Wherein 1 is less than or equal toib,1≤jmbThe number of the block groups is used as the number of the block groups,mthe block capacity is obtained; at the same time
Figure 103832DEST_PATH_IMAGE006
Presentation elementhAppear atiThe first in the groupjA location; the block number is shown asB 1,B 2,…,B b For a block of blocksB i The elements therein are arranged in descending order, and the position numbers are given as {1,2, …,m}。
3. the method according to claim 2, wherein the step S3 is to encode the data block by using a dual-layer encoding strategy; the structure of the MDS code obtained from the outer layer is as follows:
using a parameter of
Figure 884575DEST_PATH_IMAGE007
MDS code pair
Figure 962253DEST_PATH_IMAGE008
Encoding the data block to obtainb(m-t+1) code blocks, whereinbThe number of the block groups is used as the number of the block groups,mis the capacity of the block group,tis the number of elements of the subset,λis composed oftEquilibrium number, representing arbitrarytThe meta-subset appearing inλEach group of cells;
the MDS code obtained in the inner layer is constructed as follows:
obtained by outer layer codingb(m-t+1) code blocks are equally divided intobSet of using parameters of (m,m-t+1) MDS code pairs in each groupm-tCarrying out secondary coding on +1 coding blocks to finally obtainbmAnd coding the blocks.
4. The method according to claim 3, wherein step S4 comprises assigning a file symbol as an identifier to each of the obtained code blocks, and without loss of generality, assigning a file symbol as an identifier to each of the obtained code blocks
Figure 604587DEST_PATH_IMAGE009
Front of
Figure 615268DEST_PATH_IMAGE010
File symbol
Figure 52065DEST_PATH_IMAGE012
As the identification of the redundant block generated by the outer layer coding; will be provided with
Figure 882618DEST_PATH_IMAGE013
File symbol of
Figure 328643DEST_PATH_IMAGE014
As an identification of the redundant blocks generated by the inner layer coding, wherein,
Figure 177519DEST_PATH_IMAGE016
presentation elementhAppear atiThe first in the groupjThe position of each of the plurality of positions,mthe block size.
5. The method of claim 4, wherein the step S5 is embodied in the form of a packageWhenever a file symbol is found
Figure 316377DEST_PATH_IMAGE006
If it is an elementhWith previous file symbols
Figure 368646DEST_PATH_IMAGE017
Otherwise, a new storage node is allocated and numbered ash(ii) a For each file symbol
Figure 352783DEST_PATH_IMAGE017
If it is an elementhAllocated storage node, then elementhThe same cultural symbols
Figure 72477DEST_PATH_IMAGE018
The corresponding code block is stored to be coded ashOf the storage nodes of (1), one storage node storesrA plurality of code blocks, each of which is encoded,ris the element repetition degree; the obtained parameter is [ 2 ]n,k=n-t,d=n-t+1]The reproduction code of (2) is stored,nthe number of the nodes is the number of the nodes,kin order to download the number of data,din order to determine the number of nodes participating in the repair,tis the number of elements of the subset,
Figure 116656DEST_PATH_IMAGE019
presentation elementhAppear atiThe first in the groupjAnd (4) a position.
6. A file reconstruction method based on the reproduction code construction method according to any one of claims 1 to 5, characterized by comprising the steps of:
A1. constructing a reproduction code according to the steps S1-S5;
A2. connecting to a storage node;
A3. when downloading the current coding block, carrying out reconstruction first judgment;
A4. after the downloading of one coding block is finished, carrying out reconstruction and second judgment;
A5. and restoring the original file by using the MDS coding rules of the inner layer and the outer layer.
7. The method of claim 6, wherein the first determination of reconstruction at step A3 includes, for the current granules having the same numberi 0File symbol of
Figure 656222DEST_PATH_IMAGE020
Whether or not to already downloadm-t+1 file symbol
Figure 696247DEST_PATH_IMAGE021
The code blocks represented by the code blocks are,
Figure 4868DEST_PATH_IMAGE022
presentation elementhAppear ati 0The first in the groupjThe position of each of the plurality of positions,mthe block capacity is obtained; if the judgment result is yes, stopping downloading the current coding block; if not, downloading the current coding block; the second determination of the reconstruction in step a4 specifically includes determining whether the download has been completed
Figure 219949DEST_PATH_IMAGE023
A plurality of coding blocks; if the judgment result is yes, the downloading process is stopped, if the judgment result is no, the downloading process is continued,bthe number of the block groups is used as the number of the block groups,tis the number of elements of the subset,λis composed oftEquilibrium number, representing arbitrarytThe meta-subset appearing inλIn groups of blocks.
8. A node repairing method based on the regenerated code constructing method of any claim 1 to 5, characterized by comprising the steps of:
B1. constructing a reproduction code according to the steps S1-S5;
B2. determining the serial number of the failed node;
B3. determining the block serial number of the failed node in the combined design;
B4. fromn-tIn +1 nodesDownloading coding blocks, co-downloadingr(m-t+1) code blocks, the number of code blocks,rthe degree of repetition of the elements is,nthe number of the nodes is the number of the nodes,tis the number of elements of the subset,mthe block capacity is obtained;
B5. downloaded step B4r(m-t+1) coded block partitionsrGroups, the coding blocks of each group having the same block numberi
B6. Recovering a coding block for each group of coding blocks by using the coding rule of the MDS code obtained by the inner layer; will be provided withrRecovery of block coded blocksrA plurality of coding blocks;
B7. will be recoveredrAnd storing the coding blocks into a new storage node, and adding the node into the original storage system to complete the repair process.
9. The node repairing method according to claim 8, wherein the step B2 is embodied as numbering elements from the remaining intact nodes as ∑h 1,...,h v-1Numbering the elements with
Figure 512390DEST_PATH_IMAGE024
Comparing elements in design and determining serial number of failed nodeh 0(ii) a WhereinvIn order to be a step of the design,mis the capacity of the block group,λis composed oft-a balance number; the block number of step B3 is recorded asi 1,...,i r }; the file symbol of step B4 is specifically
Figure 572750DEST_PATH_IMAGE025
Is shown byh 0Appear atiThe first in the groupjThe position of each of the plurality of positions,
Figure 267037DEST_PATH_IMAGE027
CN202110348155.4A 2021-03-31 2021-03-31 Regeneration code construction method, file reconstruction method and node repair method Active CN112732203B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110348155.4A CN112732203B (en) 2021-03-31 2021-03-31 Regeneration code construction method, file reconstruction method and node repair method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110348155.4A CN112732203B (en) 2021-03-31 2021-03-31 Regeneration code construction method, file reconstruction method and node repair method

Publications (2)

Publication Number Publication Date
CN112732203A CN112732203A (en) 2021-04-30
CN112732203B true CN112732203B (en) 2021-06-22

Family

ID=75596222

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110348155.4A Active CN112732203B (en) 2021-03-31 2021-03-31 Regeneration code construction method, file reconstruction method and node repair method

Country Status (1)

Country Link
CN (1) CN112732203B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116880778B (en) * 2023-09-07 2023-11-21 杭州迅杭科技有限公司 User privacy protection method based on regenerative coding and distributed storage

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103688514A (en) * 2013-02-26 2014-03-26 北京大学深圳研究生院 An Encoding and Storage Node Restoration Method of Minimal Storage Regeneration Code
CN103688515A (en) * 2013-03-26 2014-03-26 北京大学深圳研究生院 An Encoding and Storage Node Restoration Method of Minimal Bandwidth Regenerating Codes
US8719667B2 (en) * 2010-07-26 2014-05-06 Thomson Licensing Method for adding redundancy data to a distributed data storage system and corresponding device
CN104838626A (en) * 2013-01-04 2015-08-12 北京大学深圳研究生院 Coding method for general projective self-repairing codes, and data reconstruction and repair method
CN105721611A (en) * 2016-04-15 2016-06-29 西南交通大学 General method for generating minimal storage regenerating code with maximum distance separable storage code
CN108279995A (en) * 2018-01-30 2018-07-13 北京交通大学 A kind of storage method for the distributed memory system regenerating code based on safety
CN109062724A (en) * 2018-07-21 2018-12-21 湖北大学 A kind of correcting and eleting codes conversion method and terminal
CN110750382A (en) * 2019-09-18 2020-02-04 华中科技大学 Minimum storage regeneration code coding method and system for improving data repair performance
US10608784B2 (en) * 2016-03-15 2020-03-31 ClineHair Commercial Endeavors Distributed storage system data management and security
CN111125014A (en) * 2019-11-19 2020-05-08 长安大学 Construction method of flexible partial repeat code based on U-shaped design
WO2020160142A1 (en) * 2019-01-29 2020-08-06 ClineHair Commercial Endeavors Encoding and storage node repairing method for minimum storage regenerating codes for distributed storage systems
CN111796966A (en) * 2020-05-19 2020-10-20 中大编码有限公司 Construction method of storage rack-aware regeneration code

Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8719667B2 (en) * 2010-07-26 2014-05-06 Thomson Licensing Method for adding redundancy data to a distributed data storage system and corresponding device
CN104838626A (en) * 2013-01-04 2015-08-12 北京大学深圳研究生院 Coding method for general projective self-repairing codes, and data reconstruction and repair method
CN103688514A (en) * 2013-02-26 2014-03-26 北京大学深圳研究生院 An Encoding and Storage Node Restoration Method of Minimal Storage Regeneration Code
CN103688515A (en) * 2013-03-26 2014-03-26 北京大学深圳研究生院 An Encoding and Storage Node Restoration Method of Minimal Bandwidth Regenerating Codes
US10608784B2 (en) * 2016-03-15 2020-03-31 ClineHair Commercial Endeavors Distributed storage system data management and security
CN105721611A (en) * 2016-04-15 2016-06-29 西南交通大学 General method for generating minimal storage regenerating code with maximum distance separable storage code
CN108279995A (en) * 2018-01-30 2018-07-13 北京交通大学 A kind of storage method for the distributed memory system regenerating code based on safety
CN109062724A (en) * 2018-07-21 2018-12-21 湖北大学 A kind of correcting and eleting codes conversion method and terminal
WO2020160142A1 (en) * 2019-01-29 2020-08-06 ClineHair Commercial Endeavors Encoding and storage node repairing method for minimum storage regenerating codes for distributed storage systems
CN110750382A (en) * 2019-09-18 2020-02-04 华中科技大学 Minimum storage regeneration code coding method and system for improving data repair performance
CN111125014A (en) * 2019-11-19 2020-05-08 长安大学 Construction method of flexible partial repeat code based on U-shaped design
CN111796966A (en) * 2020-05-19 2020-10-20 中大编码有限公司 Construction method of storage rack-aware regeneration code

Non-Patent Citations (7)

* Cited by examiner, † Cited by third party
Title
Interference Alignment in Regenerating Codes;Nihar B. Shah et al.;《IEEE TRANSACTIONS ON INFORMATION THEORY》;20100510;全文 *
Minimum Storage Regenerating Codes for;HUAYU ZHANG et al.;《IEEE Access》;20170428;全文 *
On the Optimal Minimum Distance of;Bing Zhu et al.;《IEEE Xplore》;20200514;全文 *
Replication-based Distributed Storage Systems with;Bing Zhu et al.;《IEEE Xplore》;20140508;全文 *
二元再生码在分布式存储系统的应用;侯韩旭 等;《计算机研究与发展》;20131126;全文 *
基于柯西矩阵的最小带宽再生码研究伏;宋海龙,王伟平,肖亚龙;《湖南大学学报(自然科学版)》;20170831;全文 *
基于精确再生码的秘密共享方案;宋海龙,王伟平;《中南大学学报(自然科学版)》;20170430;全文 *

Also Published As

Publication number Publication date
CN112732203A (en) 2021-04-30

Similar Documents

Publication Publication Date Title
JP5167243B2 (en) Storage Allocation and Erasure Coding Techniques for Scalable and Fault Tolerant Storage Systems
CN103688515B (en) The coding of a kind of minimum bandwidth regeneration code and memory node restorative procedure
CN108540520B (en) Locality Repair Coding and Node Fault Repair Method Based on Partially Repeated Codes
CN103688514B (en) A kind of minimum memory regenerates the coding and memory node restorative procedure of code
JP5298393B2 (en) Parallel Reed-Solomon RAID (RS-RAID) architecture, device, and method
CN109643258B (en) Multi-node repair using high-rate minimal storage erase code
CN110750382B (en) Minimal storage regeneration code coding method and system for improving data repair performance
CN107656832A (en) A kind of correcting and eleting codes method of low data reconstruction expense
CN106776112B (en) A kind of locality reparation coding method based on Pyramid code
CN110764950A (en) Hybrid coding method, data repair method, and system based on RS code and regeneration code
CN108347306B (en) Quasi-local reconstruction code coding and node fault repair method in distributed storage system
CN107844272A (en) A kind of cross-packet coding and decoding method for improving error correcting capability
CN109491835A (en) A kind of data fault tolerance method based on Dynamic Packet code
CN103746774A (en) Error resilient coding method for high-efficiency data reading
CN112732203B (en) Regeneration code construction method, file reconstruction method and node repair method
CN108762978A (en) A kind of constructed in groups method of Part portions repetitive cycling code
CN112130772A (en) A blockchain secure storage method based on sparse random erasure coding technology
CN111459710A (en) Erasure code memory recovery method, device and memory system capable of sensing heat degree and risk
CN111679793B (en) Single-disk fault rapid recovery method based on STAR code
CN109947587B (en) Construction method and fault repair method of group repair code for non-uniform fault protection
JP7603647B2 (en) Blockchain transaction data storage method, device, and distributed storage system using the same
CN109521955A (en) Isomery part duplication code construction and conversion method based on layering cross-over design
CN113258938B (en) A Construction Method of Erasure Code for Rapid Repair of Single Node Failure
CN110781025B (en) Construction of Symmetric Partial Repetition Code Based on Complete Graph and Repair Method of Faulty Nodes
Gupta et al. On rack-aware cooperative regenerating codes and epsilon-MSCR codes

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant