CN111147226A - Data storage method, device and storage medium - Google Patents
- Publication number
- CN111147226A (application number CN201811303561.3A)
- Authority
- CN
- China
- Prior art keywords
- storage
- data
- target
- target storage
- hash ring
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L9/00—Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
- H04L9/06—Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols the encryption apparatus using shift registers or memories for block-wise or stream coding, e.g. DES systems or RC4; Hash functions; Pseudorandom sequence generators
- H04L9/0643—Hash functions, e.g. MD5, SHA, HMAC or f9 MAC
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L49/00—Packet switching elements
- H04L49/90—Buffering arrangements
- H04L49/901—Buffering arrangements using storage descriptor, e.g. read or write pointers
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Power Engineering (AREA)
- Computer Security & Cryptography (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a data storage method, a data storage device and a data storage medium, and belongs to the technical field of data processing. The method comprises the following steps: receiving a data storage instruction, wherein the data storage instruction carries a preset key value corresponding to data to be stored; acquiring target hash ring information corresponding to the latest version number from the hash ring information corresponding to the multiple version numbers, wherein the multiple hash ring information is synchronously acquired from the storage management nodes, and the hash ring information corresponding to each version number is determined by the storage management nodes through a preset hash algorithm based on the changed address information of the storage nodes when the storage management nodes monitor that the number of the storage nodes is changed; and determining a first target storage node based on the preset key value and the target hash ring information, and storing the data into the first target storage node. The embodiment of the invention can avoid the need of data migration in the data storage process, improve the storage performance of the system and improve the safety of data storage.
Description
Technical Field
The embodiment of the invention relates to the technical field of data storage, in particular to a data storage method, a data storage device and a storage medium.
Background
In some application scenarios, the storage system may implement data storage based on a consistent hashing algorithm, and the implementation process includes: the method comprises the steps that a storage management node in a storage system constructs an abstract and closed hash ring, a plurality of hash values are uniformly distributed on the hash ring, the corresponding hash value is determined through a hash algorithm based on address information of the storage node in the storage system, the position of the hash value is searched on the hash ring, the storage node is configured to the position, and therefore data storage is achieved based on the storage node on the hash ring.
In some embodiments, the number of storage nodes in the storage system may vary, such as with large scale increases in data volume, which may require the addition of storage nodes in the storage system. For a storage system for implementing data storage based on a consistent hash algorithm, when the number of storage nodes changes, in order to ensure that the stored data can be successfully acquired subsequently, data migration may need to be performed, for example, data of an original storage node on a hash ring may need to be migrated to a newly added storage node.
However, in the above implementation, if the amount of data is very large, the performance of the system may be affected by performing the data migration operation, and a data loss phenomenon may occur during the data migration process, resulting in poor security of data storage.
Disclosure of Invention
The embodiment of the invention provides a data storage method, a data storage device and a data storage medium, which can solve the problem that data migration is needed when the number of storage nodes in a storage system changes. The technical scheme is as follows:
In a first aspect, a data storage method is provided, the method including:
receiving a data storage instruction, wherein the data storage instruction carries a preset key value corresponding to data to be stored;
acquiring target hash ring information corresponding to the latest version number from hash ring information corresponding to a plurality of version numbers, wherein the hash ring information is synchronously acquired from storage management nodes, and the hash ring information corresponding to each version number is determined by the storage management nodes through a preset hash algorithm based on changed address information of the storage nodes when the storage management nodes monitor that the number of the storage nodes is changed;
and determining a first target storage node based on the preset key value and the target hash ring information, and storing the data into the first target storage node.
Optionally, after storing the data in the first target storage node, the method further includes:
receiving a data acquisition instruction, wherein the data acquisition instruction carries the preset key value;
determining a corresponding hash value through the preset hash algorithm based on the preset key value;
and determining a plurality of second target storage nodes based on the hash value and the plurality of hash ring information, and acquiring the data from the plurality of second target storage nodes.
Optionally, when the hash ring information corresponding to each version number includes a hash value interval, the determining a plurality of second target storage nodes based on the hash value and the plurality of hash ring information includes:
for each hash ring information in the plurality of hash ring information, determining a target hash value interval in which the hash value is located in each hash ring information;
and searching the nearest storage node in each target hash value interval on the hash ring along the clockwise direction, and determining the searched storage node as the second target storage node to obtain the plurality of second target storage nodes.
Optionally, the obtaining the data from the plurality of second target storage nodes includes:
sequencing the plurality of second target storage nodes according to the sequence of the version numbers from new to old;
sending a data acquisition request to a first second target storage node in the sequenced second target storage nodes, and stopping sending the data acquisition request when the first second target storage node returns the data; and when the first second target storage node returns an acquisition failure response, sending the data acquisition request to a next second target storage node in the plurality of sorted second target storage nodes until the data is acquired.
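The newest-to-oldest read order described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `ring_hash`, `locate`, `read` and `fetch` are hypothetical, SHA-256 stands in for the unspecified preset hash algorithm, and each hash ring is represented as a sorted list of (position, node) points rather than explicit intervals.

```python
import bisect
import hashlib

def ring_hash(text):
    # Assumption: the first 4 bytes of SHA-256 stand in for the "preset hash algorithm".
    return int.from_bytes(hashlib.sha256(text.encode("utf-8")).digest()[:4], "big")

def locate(points, h):
    # points: sorted list of (position, node).
    # Returns the first node found clockwise from h, wrapping past the end of the ring.
    positions = [pos for pos, _ in points]
    return points[bisect.bisect_left(positions, h) % len(points)][1]

def read(key, rings, fetch):
    # rings: list of {"version": int, "points": [(position, node), ...]}.
    # fetch(node, key) is a hypothetical transport call returning the data,
    # or None to signal an acquisition failure response.
    h = ring_hash(key)
    candidates = [(ring["version"], locate(ring["points"], h)) for ring in rings]
    candidates.sort(key=lambda c: c[0], reverse=True)  # newest version first
    for _version, node in candidates:
        data = fetch(node, key)
        if data is not None:
            return data  # stop sending requests once the data is returned
    return None
```

With this ordering, data written under the latest ring is found on the first request, and data written before a node-count change is found after falling back to an older ring.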
Optionally, before the sorting the plurality of second target storage nodes according to the sequence of the version numbers from new to old, the method further includes:
when second target storage nodes with the same version number are included in the plurality of second target storage nodes, removing the second target storage nodes with the same version number from the plurality of second target storage nodes;
correspondingly, the sorting the plurality of second target storage nodes according to the sequence of the version numbers from new to old comprises:
and sequencing the remaining plurality of second target storage nodes according to the sequence of the version numbers from new to old.
Optionally, before acquiring the target hash ring information corresponding to the latest version number from the hash ring information corresponding to the multiple version numbers, the method further includes:
sending an information synchronization request to the storage management node every other preset time, wherein the information synchronization request is used for requesting the storage management node to return hash ring information corresponding to the plurality of version numbers;
and receiving and storing hash ring information corresponding to a plurality of version numbers sent by the storage management node.
Optionally, the information synchronization request is further configured to request the storage management node to return address information of each storage node, and the receiving and storing hash ring information corresponding to a plurality of version numbers sent by the storage management node includes:
receiving and storing hash ring information corresponding to a plurality of version numbers sent by the storage management node and address information of each storage node;
accordingly, the storing the data into the first target storage node comprises:
storing the data into the first target storage node based on the address information of the first target storage node.
In a second aspect, there is provided a data storage device comprising:
the first receiving module is used for receiving a data storage instruction, and the data storage instruction carries a preset key value corresponding to data to be stored;
the acquisition module is used for acquiring target hash ring information corresponding to the latest version number from the hash ring information corresponding to the multiple version numbers, the multiple hash ring information is synchronously acquired from the storage management node, and the hash ring information corresponding to each version number is determined by a preset hash algorithm based on the changed address information of the storage node when the storage management node monitors that the number of the storage nodes changes;
and the storage module is used for determining a first target storage node based on the preset key value and the target hash ring information, and storing the data into the first target storage node.
Optionally, the apparatus further comprises:
the second receiving module is used for receiving a data acquisition instruction, and the data acquisition instruction carries the preset key value;
the determining module is used for determining a corresponding hash value through the preset hash algorithm based on the preset key value;
the determining module is further configured to determine a plurality of second target storage nodes based on the hash value and the plurality of hash ring information, and obtain the data from the plurality of second target storage nodes.
Optionally, the determining module is configured to:
when the hash ring information corresponding to each version number comprises a hash value interval, determining a target hash value interval of the hash value in each hash ring information for each hash ring information in the plurality of hash ring information;
and searching the nearest storage node in each target hash value interval on the hash ring along the clockwise direction, and determining the searched storage node as the second target storage node to obtain the plurality of second target storage nodes.
Optionally, the determining module is configured to:
sequencing the plurality of second target storage nodes according to the sequence of the version numbers from new to old;
sending a data acquisition request to a first second target storage node in the sequenced second target storage nodes, and stopping sending the data acquisition request when the first second target storage node returns the data; and when the first second target storage node returns an acquisition failure response, sending the data acquisition request to a next second target storage node in the plurality of sorted second target storage nodes until the data is acquired.
Optionally, the determining module is configured to:
when second target storage nodes with the same version number are included in the plurality of second target storage nodes, removing the second target storage nodes with the same version number from the plurality of second target storage nodes;
and sequencing the remaining plurality of second target storage nodes according to the sequence of the version numbers from new to old.
Optionally, the apparatus further comprises:
the synchronization module is used for sending an information synchronization request to the storage management node every other preset time, wherein the information synchronization request is used for requesting the storage management node to return hash ring information corresponding to the plurality of version numbers;
and the receiving and storing module is used for receiving and storing the hash ring information corresponding to the plurality of version numbers sent by the storage management node.
Optionally, the receiving and storing module is further configured to: receiving and storing hash ring information corresponding to a plurality of version numbers sent by the storage management node and address information of each storage node;
the storage module is configured to store the data in the first target storage node based on the address information of the first target storage node.
In a third aspect, a computer-readable storage medium is provided, which stores instructions that, when executed by a processor, implement the data storage method of the first aspect.
In a fourth aspect, there is provided a computer program product containing instructions which, when run on a computer, cause the computer to perform the data storage method of the first aspect described above.
The technical scheme provided by the embodiment of the invention has the following beneficial effects:
A data storage instruction carrying a preset key value corresponding to the data is received, and the target hash ring information corresponding to the latest version number is acquired from the hash ring information corresponding to the plurality of version numbers. The hash ring information corresponding to the plurality of version numbers is synchronized from the storage management node, and the hash ring information corresponding to each version number is generated by the storage management node, based on the address information of the changed storage nodes, after it detects that the number of storage nodes has changed. The target hash ring information corresponding to the latest version number is therefore the most recent hash ring information. A first target storage node is then determined based on the preset key value and the target hash ring information, and the data is stored in that node. In this way, the embodiment of the present invention can avoid data migration during data storage, improve system storage performance, and improve data storage security.
Drawings
In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings based on them without creative effort.
FIG. 1 is a diagram illustrating a hash ring structure in accordance with an illustrative embodiment;
FIG. 2 is a schematic diagram of an implementation environment shown in accordance with an illustrative embodiment;
FIG. 3 is a flow chart illustrating a method of data storage according to an exemplary embodiment;
FIG. 4 is a diagram illustrating a hash ring structure in accordance with an illustrative embodiment;
FIG. 5 is a flow chart illustrating a data storage or read process according to another exemplary embodiment;
FIG. 6 is a schematic diagram illustrating a data storage device in accordance with an exemplary embodiment;
FIG. 7 is a schematic diagram illustrating a data storage device according to another exemplary embodiment;
FIG. 8 is a schematic diagram illustrating a data storage device according to another exemplary embodiment;
FIG. 9 is a block diagram illustrating a computer device 800 according to an exemplary embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be described in detail with reference to the accompanying drawings.
Before describing the data storage method provided by the embodiment of the present invention in detail, terms, application scenarios and implementation environments related to the embodiment of the present invention are briefly described.
First, terms related to the embodiments of the present invention will be briefly described.
Capacity expansion: in a distributed storage system, the increase in the number of storage nodes is referred to as capacity expansion.
Capacity reduction: in a distributed storage system, the case where the number of storage nodes is reduced is called capacity reduction. For example, when a storage node in a storage system is offline for a long time, the storage node is automatically set to be offline, which is called capacity reduction.
Secondly, the application scenarios related to the embodiment of the invention are briefly introduced.
During data storage, the number of storage nodes in the storage system may change: it increases in a capacity expansion scenario and decreases in a capacity reduction scenario. For a storage system that implements data storage based on a consistent hash algorithm, a change in the number of storage nodes may require data migration so that previously stored data can still be read successfully. For example, referring to fig. 1, which is a schematic diagram of a hash ring according to an exemplary embodiment, the hash ring originally includes storage node A, storage node B, and storage node C, and the hash value of a piece of data stored on storage node C is located at position 1 of the hash ring. Data reading generally proceeds as follows: the position of the data's hash value on the hash ring is determined, and the first storage node found clockwise from that position is taken as the storage location of the data. Therefore, when storage node D is added at the position shown in the figure, the data needs to be migrated from storage node C to storage node D so that it can still be read later based on its hash value.
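The effect of adding a node can be illustrated with a toy ring. This is only a sketch: the numeric positions below are hypothetical (the figure gives no values), and `owner` is an illustrative helper name.

```python
import bisect

def owner(ring, key_pos):
    # ring: dict mapping ring position -> node name.
    # Returns the first node at or after key_pos in the clockwise direction,
    # wrapping around to the smallest position if none is found.
    positions = sorted(ring)
    i = bisect.bisect_left(positions, key_pos)
    return ring[positions[i % len(positions)]]

# Hypothetical positions for nodes A, B and C.
old_ring = {100: "A", 200: "B", 300: "C"}

# Node D joins between B and C.
new_ring = dict(old_ring)
new_ring[250] = "D"

key_pos = 220  # plays the role of "position 1": after B, before D and C

before = owner(old_ring, key_pos)  # node that holds the data today
after = owner(new_ring, key_pos)   # node a reader would ask after D joins
```

Here `before` is `"C"` and `after` is `"D"`: a reader that consults only the new ring looks on D, which is why a conventional consistent-hash system has to migrate the data from C to D.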
Next, a brief description will be given of an implementation environment related to an embodiment of the present invention.
Referring to fig. 2, fig. 2 is a schematic diagram illustrating an implementation environment in accordance with an example embodiment. The implementation environment mainly includes a client 210 and a storage system 220, and the storage system 220 includes a storage management node 220a and a storage node 220b. The client 210 may establish a connection with the storage management node 220a and the storage node 220b, respectively, via a network.
In a possible implementation manner, the client 210 may be configured with an SDK (Software Development Kit), and the client 210 may invoke the SDK to implement the data storage method provided by the embodiment of the present invention, in other words, both the operation of the user on the data and the interaction between the client 210 and the storage system 220 may be completed by the SDK. Further, the client 210 may be configured in a computer device.
The storage management node 220a is used for providing storage services, and may be used for allocating storage nodes, or determining hash ring information, for example. In one possible implementation, one or more storage nodes may be randomly selected from the storage system as the storage management node 220a.
The storage node 220b is used for storing data, such as files, audio, video, and pictures. Further, the client 210 may write data to the storage node 220b, or may download data from the storage node 220b. In practice, the storage system 220 generally includes a plurality of storage nodes, and in a storage system 220 that implements data storage based on the consistent hash algorithm, the plurality of storage nodes are all configured on a hash ring.
After the terms, application scenarios and implementation environments related to the embodiments of the present invention are described, the data storage method provided by the embodiments of the present invention will be described in detail with reference to the accompanying drawings.
Referring to fig. 3, fig. 3 is a flowchart illustrating a data storage method according to an exemplary embodiment, where the data storage method may be applied in the implementation environment illustrated in fig. 2, and the data storage method may include the following implementation steps:
Step 301: Receive a data storage instruction, where the data storage instruction carries a preset key value corresponding to the data to be stored.
The data storage instruction may be triggered by a user, where the user may trigger through a preset operation, and the preset operation may include a click operation, a sliding operation, and the like, which is not limited in this embodiment of the present invention.
In addition, each complete datum corresponds to a unique preset key value. The preset key value can be set by a user in a user-defined mode according to actual requirements, and can also be randomly generated by a client, which is not limited in the embodiment of the invention.
For example, the client may be provided with a user interaction interface, the user interaction interface may be provided with an input box and a data storage option, when a user wants to store data, a preset key value corresponding to the data to be stored may be input in the input box, and then, the data storage option may be clicked to trigger the data storage instruction, so that the preset key value corresponding to the data to be stored is carried in the data storage instruction.
It should be noted that, when the preset key value is randomly generated by the client, in a possible implementation manner, the client may display the preset key value to the user, so that the user may know and record the preset key value, so as to facilitate subsequent reading based on the preset key value when reading the data. In another possible implementation manner, the client may also store the preset key value locally, so that the preset key value may be automatically loaded when the data is subsequently read, and thus the data corresponding to the preset key value is read based on the preset key value.
Step 302: Acquire target hash ring information corresponding to the latest version number from the hash ring information corresponding to multiple version numbers, where the hash ring information corresponding to the multiple version numbers is synchronized from the storage management node, and the hash ring information corresponding to each version number is determined by the storage management node through a preset hash algorithm, based on the address information of the changed storage nodes, when the storage management node detects that the number of storage nodes has changed.
The hash ring information includes at least one hash value interval, and the at least one hash value interval is obtained by the storage nodes dividing the hash ring. Typically, the hash value space of the hash ring is [0, 2^32], so the storage nodes cut the hash ring into several intervals. For example, referring to fig. 4, fig. 4 is a schematic diagram of a hash ring according to an exemplary embodiment. The hash ring includes a storage node m, a storage node n, and a storage node q, so the hash ring is divided into three hash value intervals: a first hash value interval, a second hash value interval, and a third hash value interval.
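One way to derive such intervals from the storage node addresses is sketched below. The details are assumptions made for illustration: SHA-256 truncated to 32 bits stands in for the unspecified preset hash algorithm (so positions fall in 0 to 2^32 - 1), the addresses are made up, and `ring_hash` and `build_ring_info` are hypothetical names.

```python
import hashlib

RING_SIZE = 2 ** 32

def ring_hash(text):
    # Assumption: the first 4 bytes of SHA-256 stand in for the "preset hash algorithm".
    return int.from_bytes(hashlib.sha256(text.encode("utf-8")).digest()[:4], "big")

def build_ring_info(addresses):
    # Place every node on the ring at the hash of its address, then describe
    # the ring as intervals (start, end, address): hash values in (start, end]
    # are served by the node located at `end`.
    points = sorted((ring_hash(addr), addr) for addr in addresses)
    intervals = []
    for idx, (pos, addr) in enumerate(points):
        prev_pos = points[idx - 1][0]  # idx 0 wraps around to the last node
        intervals.append((prev_pos, pos, addr))
    return intervals

# Three hypothetical nodes (m, n and q in the figure) yield three intervals.
info = build_ring_info(["10.0.0.1:9000", "10.0.0.2:9000", "10.0.0.3:9000"])
```

The first interval in the list has a start greater than its end, which marks the segment that wraps past the top of the hash space back to the lowest-positioned node.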
For the storage management node, the number change condition of the storage nodes in the storage system can be constantly monitored and managed, so that a plurality of hash ring information can be generated according to the number change condition of the storage nodes. For example, when the capacity of the storage node in the storage system is insufficient, the storage node may be added to the storage system, that is, the user may perform a capacity expansion operation. In implementation, a user may send a node addition request to the storage management node through a user interaction interface in the client, and in some embodiments, the node addition request may carry address information of the storage node to be added. After receiving the node increase request, the storage management node records the address information of the storage nodes with the changed number. Further, the storage management node may determine a new hash ring information by using a preset hash algorithm based on the address information of the storage node after the change in the number, that is, the number of the hash ring information stored in the storage management node is increased at this time.
That is to say, when it is monitored that the number of storage nodes in the storage system changes, the storage management node generates a new hash ring based on the address information of the changed storage nodes, in other words, only one piece of hash ring information is stored in the storage system under the condition that the number of storage nodes does not change, and when the number of storage nodes changes, a plurality of pieces of hash ring information will exist in the storage system. Further, the storage management node may store the generated plurality of hash ring information in the form of a list.
The preset hash algorithm may be set by a user according to implementation requirements, or may be set by the storage management node by default, which is not limited in the embodiment of the present invention.
Further, the storage management node may add a version number to the newly generated hash ring information. In an embodiment, the version number presents an increasing trend as time increases. For example, the version number of the original hash ring information corresponding to the hash ring in the storage system may be set to 1, when the number of storage nodes in the storage system changes for the first time, the storage management node may set the version number corresponding to the hash ring information generated based on the address information of the storage nodes after the change to 2, when the number of storage nodes in the storage system changes for the second time, the storage management node may set the version number corresponding to the hash ring information generated based on the address information of the storage nodes after the change to 3, and so on.
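The version bookkeeping described above might look like the following sketch. `StorageManager` and its methods are hypothetical names, SHA-256 again stands in for the preset hash algorithm, and keeping every earlier ring in a list is taken from the description that the generated hash ring information may be stored in list form.

```python
import hashlib

def ring_hash(text):
    # Assumption: the first 4 bytes of SHA-256 stand in for the "preset hash algorithm".
    return int.from_bytes(hashlib.sha256(text.encode("utf-8")).digest()[:4], "big")

class StorageManager:
    """Keeps one hash ring per node-count change, each tagged with a version number."""

    def __init__(self, addresses):
        self.addresses = list(addresses)
        self.rings = []  # list of {"version": int, "points": [(position, address), ...]}
        self._publish()  # the original ring gets version 1

    def _publish(self):
        # Build a new ring from the current addresses; older rings are kept as they are.
        points = sorted((ring_hash(a), a) for a in self.addresses)
        self.rings.append({"version": len(self.rings) + 1, "points": points})

    def add_node(self, address):
        # Capacity expansion: record the new address and publish a new version.
        self.addresses.append(address)
        self._publish()

    def remove_node(self, address):
        # Capacity reduction: drop the address and publish a new version.
        self.addresses.remove(address)
        self._publish()
```

Because version numbers only increase, a client can identify the ring to write to by taking the largest version, while older versions remain available for locating data written earlier.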
The client may periodically synchronize the hash ring information corresponding to the multiple version numbers from the storage management node. In a possible implementation manner, the client sends an information synchronization request to the storage management node at intervals of a preset duration, where the information synchronization request is used to request the storage management node to return the hash ring information corresponding to the multiple version numbers; the client then receives and stores the hash ring information corresponding to the multiple version numbers sent by the storage management node.
The preset duration may be set by a user according to actual needs in a self-defined manner, or may be set by the client as a default, which is not limited in the embodiment of the present invention.
For example, if the preset duration is one week and the storage management node stores the hash ring information corresponding to the multiple version numbers in list form, the client may synchronize and store that list from the storage management node once a week.
In this way, in the process of storing data, the target hash ring information corresponding to the latest version number may be obtained from the hash ring information corresponding to a plurality of version numbers, that is, the hash ring information determined by the storage management node after the number of storage nodes in the storage system has changed last time is obtained.
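Selecting the target hash ring information then reduces to taking the entry with the largest version number. The snippet below is a small sketch that assumes each synchronized entry is a dictionary with a `version` field; that layout and the name `latest_ring` are illustrative assumptions.

```python
def latest_ring(rings):
    # rings: list of {"version": int, ...} entries synchronized from the management node.
    if not rings:
        raise ValueError("no hash ring information has been synchronized yet")
    return max(rings, key=lambda ring: ring["version"])

# Hypothetical synchronized list; the order of entries does not matter.
synced = [{"version": 2}, {"version": 3}, {"version": 1}]
target = latest_ring(synced)
```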
Further, the information synchronization request is also used to request the storage management node to return address information of each storage node, and at this time, the client receives and stores hash ring information corresponding to a plurality of version numbers sent by the storage management node and address information of each storage node.
That is, the storage management node may synchronize, to the client, address information of each storage node included in the storage system, in addition to the hash ring information corresponding to the plurality of version numbers. Further, the storage management node may synchronize the address information of each storage node to the client in the form of an address information list, i.e., the address information list includes the address information of each storage node.
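A client-side view of this synchronization could be sketched as follows. `SyncingClient`, `fetch` and `sync_if_due` are hypothetical names; `fetch` stands for whatever request actually reaches the storage management node and is assumed to return the ring list together with the address information list.

```python
import time

class SyncingClient:
    """Caches ring versions and node addresses, refreshing them once per interval."""

    def __init__(self, fetch, interval_seconds):
        self.fetch = fetch                # callable returning (rings, addresses)
        self.interval = interval_seconds  # the "preset duration"
        self.rings = []
        self.addresses = {}
        self.last_sync = None

    def sync_if_due(self, now=None):
        # Returns True if a synchronization request was sent on this call.
        now = time.monotonic() if now is None else now
        if self.last_sync is None or now - self.last_sync >= self.interval:
            self.rings, self.addresses = self.fetch()
            self.last_sync = now
            return True
        return False
```

Passing `now` explicitly keeps the sketch easy to exercise without waiting; a real client would more likely call `sync_if_due()` from a timer or before each operation.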
Step 303: Determine a first target storage node based on the preset key value and the target hash ring information.
In a possible implementation manner, determining the first target storage node based on the preset key value and the target hash ring information may include: determining a corresponding hash value through the preset hash algorithm based on the preset key value; determining the hash value interval of the target hash ring information in which the hash value is located; searching clockwise along the hash ring from that interval for the nearest storage node; and determining the storage node found as the first target storage node.
For example, referring to fig. 4, assuming that the hash value determined through the preset hash algorithm based on the preset key value falls within the first hash value interval, the nearest storage node found clockwise along the hash ring is storage node m, and storage node m is determined as the first target storage node.
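The lookup in step 303 can be sketched with a binary search over the node positions. As before, this is an illustration under assumptions: SHA-256 truncated to 32 bits stands in for the preset hash algorithm, the ring is held as a sorted list of (position, node) points, and `ring_hash` and `locate` are hypothetical names.

```python
import bisect
import hashlib

def ring_hash(text):
    # Assumption: the first 4 bytes of SHA-256 stand in for the "preset hash algorithm".
    return int.from_bytes(hashlib.sha256(text.encode("utf-8")).digest()[:4], "big")

def locate(points, key):
    # points: sorted list of (position, node) taken from the target hash ring information.
    # Hash the key, then take the first node clockwise from that position;
    # the modulo wraps around to the first node when the key lies past the last one.
    h = ring_hash(key)
    positions = [pos for pos, _ in points]
    i = bisect.bisect_left(positions, h)
    return points[i % len(points)][1]

# Nodes m, n and q as in the figure, placed by hashing their names.
points = sorted((ring_hash(name), name) for name in ["m", "n", "q"])
first_target = locate(points, "user-42")
```

The same function serves reads as well as writes, provided the same hash function is used in both directions.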
Step 304: Store the data in the first target storage node.
In a possible implementation manner, the client may send a data storage request to the first target storage node, where the data storage request carries data to be stored. And after receiving the data storage request, the first target storage node analyzes the data storage request to extract the data to be stored from the data storage request, and stores the data.
Further, the client may store the data in the first target storage node based on the address information of the first target storage node. That is, the client obtains the address information of the first target storage node from the synchronized address information of each storage node, and sends the data storage request to the address indicated by that address information, so as to store the data in the first target storage node.
It should be noted that, the above steps 301 to 304 are used to describe an implementation process of storing data into the storage system, and next, an implementation process of reading data from the storage system will be described, specifically, refer to the following steps 305 to 308.
Step 305: Receive a data acquisition instruction, where the data acquisition instruction carries the preset key value.
The data acquisition instruction can be triggered by a user, and the user can trigger through the preset operation. For example, in some embodiments, the client may provide a user interaction interface that is provided with an input box, as previously described, and may also be provided with data acquisition options. When a user wants to download data from the storage system, a preset key value set when the data is stored can be input in the input box, and the data acquisition option is clicked to trigger the data acquisition instruction.
Step 306: Determine a corresponding hash value through the preset hash algorithm based on the preset key value.
It should be noted that the preset hash algorithm used herein is a hash algorithm used when storing data, that is, the same hash function is used to determine a corresponding hash value, so that the position of the hash value can be successfully found from the hash ring.
Step 307: Determine a plurality of second target storage nodes based on the hash value and the plurality of pieces of hash ring information.
As described above, the hash ring information corresponding to each version number includes hash value intervals. In this case, determining the plurality of second target storage nodes based on the hash value and the plurality of pieces of hash ring information may include: for each piece of hash ring information, determining the target hash value interval in which the hash value falls; searching clockwise along the corresponding hash ring for the storage node nearest to that target hash value interval; and determining the found storage node as a second target storage node, thereby obtaining the plurality of second target storage nodes.
That is, the client may determine one second target storage node based on the hash value and the hash ring information corresponding to each version number, and thus may determine a plurality of second target storage nodes based on the hash value and the hash ring information corresponding to a plurality of version numbers. For example, assuming that there are three hash ring information, the client may determine three second target storage nodes, where each second target storage node corresponds to a unique version number.
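Resolving one candidate node per ring version can be sketched in Python as follows. This is an illustration only: the hash ring information is assumed to be a mapping from version number to a sorted list of (position, node id) pairs, and the names `locate_node` and `candidate_nodes` are hypothetical.

```python
import bisect


def locate_node(points, hash_value):
    """Clockwise lookup on one ring; points is sorted by position."""
    positions = [position for position, _ in points]
    index = bisect.bisect_left(positions, hash_value)
    return points[index % len(points)][1]


def candidate_nodes(rings, hash_value):
    """Return one (version, node_id) pair for every synchronized ring version."""
    return [
        (version, locate_node(points, hash_value))
        for version, points in rings.items()
    ]
```

Each returned pair keeps its version number so that the candidates can later be ordered from newest to oldest.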
Step 308: Obtain the data from the plurality of second target storage nodes.
In a possible implementation manner, the specific implementation process of obtaining the data from the plurality of second target storage nodes may include: sequencing the plurality of second target storage nodes according to the sequence of version numbers from new to old, sending a data acquisition request to a first second target storage node in the plurality of sequenced second target storage nodes, and stopping the operation of sending the data acquisition request when the first second target storage node returns the data; and when the first second target storage node returns the acquisition failure response, sending the data acquisition request to a next second target storage node in the sorted plurality of second target storage nodes until the data is acquired.
Since the data is stored based on the target hash ring information corresponding to the latest version number, in order to obtain the data quickly, the plurality of second target storage nodes determined are sorted according to the sequence of the version numbers from new to old, for example, as described above, the version numbers corresponding to the plurality of hash ring information may be in an increasing mode with the increase of time, in which case, the plurality of second target storage nodes are sorted according to the sequence of the version numbers from large to small.
Then, starting from the first of the sorted second target storage nodes, the client sends a data acquisition request. If that node returns an acquisition failure response, indicating that the data acquisition failed, the client sends the data acquisition request to the next of the sorted second target storage nodes, and so on until the data is acquired. Conversely, if the node returns the data, indicating that the data acquisition succeeded, the client stops sending data acquisition requests.
Further, before the second target storage nodes are sorted according to the sequence of the version numbers from new to old, when the second target storage nodes include second target storage nodes with the same version numbers, the second target storage nodes with the same version numbers are removed from the second target storage nodes, and at this time, the remaining second target storage nodes are sorted according to the sequence of the version numbers from new to old.
It is understood that, among the plurality of determined second target storage nodes, there may exist second storage nodes having the same version number, and thus, when data is acquired in the above manner, a certain second storage node may be repeatedly read. Therefore, in order to reduce the operation burden of the client, before the plurality of second target storage nodes are sorted according to the sequence of the version numbers from new to old, the deduplication processing may be performed first, that is, the second target storage nodes with the same version number are removed from the plurality of second target storage nodes, and then the remaining plurality of second target storage nodes are sorted, so that the repeated reading may be avoided.
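The newest-first read with deduplication and fallback can be sketched in Python as follows. This is a sketch under assumptions: `fetch` is a hypothetical callable that returns the data or `None` for an acquisition failure response, and deduplication is interpreted here as skipping a storage node that has already been queried, which is one reading of the passage above.

```python
def read_with_fallback(candidates, fetch):
    """Try candidate nodes from the newest ring version to the oldest.

    candidates: list of (version, node_id) pairs, one per ring version.
    fetch: callable taking a node_id and returning the data, or None
           when the node reports an acquisition failure (assumed protocol).
    """
    already_tried = set()
    ordered = sorted(candidates, key=lambda pair: pair[0], reverse=True)
    for _version, node_id in ordered:
        if node_id in already_tried:
            continue  # avoid reading the same storage node twice
        already_tried.add(node_id)
        data = fetch(node_id)
        if data is not None:
            return data  # stop as soon as one node returns the data
    return None  # every candidate node reported a failure
```

Trying the newest version first reflects the reasoning above: data is written according to the latest ring, so the newest candidate is the most likely to hold it.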
In addition, in the data reading process, the client may send a data obtaining request to each second target storage node based on the address information of each second target storage node to read data from the second target storage node.
In the embodiment of the invention, a data storage instruction carrying a preset key value corresponding to data is received, and target hash ring information corresponding to the latest version number is obtained from hash ring information corresponding to a plurality of version numbers. The hash ring information corresponding to the plurality of version numbers is obtained from the storage management node in a synchronous manner, and the hash ring information corresponding to each version number is generated by the storage management node based on the address information of the changed storage node after detecting that the number of the storage nodes is changed, that is, the target hash ring information corresponding to the latest version number is the latest hash ring information. That is, the embodiment of the present invention can avoid data migration during data storage, improve system storage performance, and improve data storage security.
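The management-node side of this summary can be sketched in Python as follows. This is illustrative only: the class name `RingManager`, the method `observe`, the use of MD5 over an address string, and the example addresses are all assumptions, since the source does not specify how the storage management node derives ring positions or assigns version numbers.

```python
import hashlib


def hash_address(address, ring_size=2**32):
    """Place a storage node on the ring by hashing its address (assumed scheme)."""
    digest = hashlib.md5(address.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % ring_size


class RingManager:
    """Keeps one hash ring per version; older versions are never discarded."""

    def __init__(self):
        self.versions = {}   # version number -> sorted (position, address) pairs
        self._known = None   # addresses seen at the last recorded change

    def observe(self, addresses):
        """Record a new ring version if the number of storage nodes changed.

        Returns the new version number, or None when nothing changed.
        """
        current = sorted(set(addresses))
        if self._known is not None and len(current) == len(self._known):
            return None
        self._known = current
        version = len(self.versions) + 1  # versions increase over time
        self.versions[version] = sorted((hash_address(a), a) for a in current)
        return version
```

Because earlier versions stay available, data written under an older ring does not have to be moved when a node is added.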
Referring next to fig. 5, fig. 5 is a flow chart illustrating a data storage or reading according to an example. The data storage and reading process is briefly described herein in connection with fig. 5. Here, the following two scenarios are included:
The first scenario: only one hash ring exists in the storage management node.
1. The client synchronizes the hash ring information from the storage management node periodically, and accordingly, the storage management node returns the hash ring information. Further, the storage management node may synchronize address information of each storage node included in the storage system to the client.
2. When data is required to be uploaded or downloaded, a user can send a data writing instruction or a data acquiring instruction to the client, wherein the data writing instruction or the data acquiring instruction carries a unique preset key value corresponding to the data.
3. The client performs a calculation based on the preset key value and the synchronized hash ring information to determine a target storage node (a first target storage node or a second target storage node).
4. The client sends a data write request or a data acquisition request to the target storage node. Further, the client sends the data write request or the data acquisition request based on the address information of the target storage node.
5. The target storage node stores the data, or returns the data to be downloaded to the client.
The second scenario: multiple hash rings exist in the storage management node.
1. The client synchronizes the hash ring information from the storage management node periodically, and accordingly, the storage management node returns the hash ring information. Further, the storage management node may synchronize address information of each storage node included in the storage system to the client.
2. When data needs to be uploaded, a user can send a data writing instruction to a client, and the data writing instruction carries a unique preset key value corresponding to the data.
3. The client performs a calculation based on the preset key value and the target hash ring information corresponding to the latest version number among the plurality of pieces of synchronized hash ring information, and determines a target storage node (the first target storage node).
4. The client sends a data write request to the target storage node. Further, the client sends a data write request to the target storage node based on the address information of the target storage node.
5. The target storage node stores data.
6. When data needs to be downloaded, a user can send a data acquisition instruction to a client, and the data acquisition instruction carries a unique preset key value corresponding to the data.
7. The client performs a calculation based on the preset key value and the plurality of pieces of synchronized hash ring information to determine the target storage nodes (a plurality of second target storage nodes).
8. The client sends a data acquisition request to a target storage node; further, the client sends the data acquisition request based on the address information of that target storage node. The client then judges whether the download succeeded: if so, it stops; otherwise, it proceeds to step 9.
9. The client sends the data acquisition request to the next target storage node, and repeats this until the data is downloaded successfully.
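The second scenario can be simulated end to end in Python as follows. This is a minimal in-memory sketch, not the patented implementation: plain dictionaries stand in for storage nodes, MD5 stands in for the preset hash algorithm, node names are hashed in place of address information, and the names `Client`, `build_ring`, `put` and `get` are hypothetical.

```python
import bisect
import hashlib


def hash_key(key, ring_size=2**32):
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % ring_size


def build_ring(node_ids):
    """Hash ring information as a sorted list of (position, node_id) pairs."""
    return sorted((hash_key(node_id), node_id) for node_id in node_ids)


def locate_node(points, hash_value):
    positions = [position for position, _ in points]
    index = bisect.bisect_left(positions, hash_value)
    return points[index % len(points)][1]


class Client:
    def __init__(self, rings, nodes):
        self.rings = rings  # version number -> ring points
        self.nodes = nodes  # node_id -> dict standing in for a storage node

    def put(self, key, value):
        """Write using only the ring with the latest version number."""
        latest = self.rings[max(self.rings)]
        node_id = locate_node(latest, hash_key(key))
        self.nodes[node_id][key] = value
        return node_id

    def get(self, key):
        """Read by trying each ring version from newest to oldest."""
        hash_value = hash_key(key)
        tried = set()
        for version in sorted(self.rings, reverse=True):
            node_id = locate_node(self.rings[version], hash_value)
            if node_id in tried:
                continue
            tried.add(node_id)
            if key in self.nodes[node_id]:
                return self.nodes[node_id][key]
        return None
```

For example, keys written while only version 1 exists remain readable after a version 2 ring with an extra node is synchronized, even though nothing is copied to the new node.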
Fig. 6 is a schematic diagram illustrating a structure of a data storage device according to an exemplary embodiment, which may be implemented by software, hardware, or a combination of both. The data storage device may include:
a first receiving module 610, configured to receive a data storage instruction, where the data storage instruction carries a preset key value corresponding to data to be stored;
an obtaining module 620, configured to obtain target hash ring information corresponding to a latest version number from hash ring information corresponding to multiple version numbers, where the multiple hash ring information is obtained from storage management nodes synchronously, and the hash ring information corresponding to each version number is determined by a preset hash algorithm based on address information of a changed storage node when the storage management node monitors that the number of the storage nodes changes;
a storage module 630, configured to determine a first target storage node based on the preset key value and the target hash ring information, and store the data in the first target storage node.
Optionally, referring to fig. 7, the apparatus further includes:
a second receiving module 640, configured to receive a data obtaining instruction, where the data obtaining instruction carries the preset key value;
a determining module 650, configured to determine, based on the preset key value, a corresponding hash value through the preset hash algorithm;
the determining module 650 is further configured to determine a plurality of second target storage nodes based on the hash value and the plurality of hash ring information, and obtain the data from the plurality of second target storage nodes.
Optionally, the determining module 650 is configured to:
when the hash ring information corresponding to each version number comprises a hash value interval, determining a target hash value interval of the hash value in each hash ring information for each hash ring information in the plurality of hash ring information;
and searching the nearest storage node in each target hash value interval on the hash ring along the clockwise direction, and determining the searched storage node as the second target storage node to obtain the plurality of second target storage nodes.
Optionally, the determining module 650 is configured to:
sequencing the plurality of second target storage nodes according to the sequence of the version numbers from new to old;
sending a data acquisition request to a first second target storage node in the sequenced second target storage nodes, and stopping sending the data acquisition request when the first second target storage node returns the data; and when the first second target storage node returns an acquisition failure response, sending the data acquisition request to a next second target storage node in the plurality of sorted second target storage nodes until the data is acquired.
Optionally, the determining module 650 is configured to:
when second target storage nodes with the same version number are included in the plurality of second target storage nodes, removing the second target storage nodes with the same version number from the plurality of second target storage nodes;
and sequencing the remaining plurality of second target storage nodes according to the sequence of the version numbers from new to old.
Optionally, referring to fig. 8, the apparatus further includes:
a synchronization module 660, configured to send an information synchronization request to the storage management node every preset duration, where the information synchronization request is used to request the storage management node to return hash ring information corresponding to the multiple version numbers;
a receiving and storing module 670, configured to receive and store the hash ring information corresponding to the plurality of version numbers sent by the storage management node.
Optionally, the receiving and storing module 670 is further configured to: receiving and storing hash ring information corresponding to a plurality of version numbers sent by the storage management node and address information of each storage node;
the storage module 630 is configured to store the data in the first target storage node based on the address information of the first target storage node.
In the embodiment of the invention, a data storage instruction carrying a preset key value corresponding to data is received, and target hash ring information corresponding to the latest version number is obtained from hash ring information corresponding to a plurality of version numbers. The hash ring information corresponding to the plurality of version numbers is obtained from the storage management node in a synchronous manner, and the hash ring information corresponding to each version number is generated by the storage management node based on the address information of the changed storage node after detecting that the number of the storage nodes is changed, that is, the target hash ring information corresponding to the latest version number is the latest hash ring information. That is, the embodiment of the present invention can avoid data migration during data storage, improve system storage performance, and improve data storage security.
It should be noted that: in the data storage device provided in the foregoing embodiment, when implementing the data storage method, only the division of the functional modules is illustrated, and in practical applications, the functions may be distributed by different functional modules according to needs, that is, the internal structure of the device is divided into different functional modules to complete all or part of the functions described above. In addition, the data storage device provided by the above embodiment and the data storage method embodiment belong to the same concept, and specific implementation processes thereof are detailed in the method embodiment and are not described herein again.
Fig. 9 is a block diagram illustrating a computer device 800 according to an example embodiment. The computer device may be equipped with the above-mentioned client, specifically:
the computer device 800 includes a Central Processing Unit (CPU)801, a system memory 804 including a Random Access Memory (RAM)802 and a Read Only Memory (ROM)803, and a system bus 805 connecting the system memory 804 and the central processing unit 801. The computer device 800 also includes a basic input/output system (I/O system) 806, which facilitates transfer of information between various devices within the computer, and a mass storage device 807 for storing an operating system 813, application programs 814, and other program modules 815.
The basic input/output system 806 includes a display 808 for displaying information and an input device 809, such as a mouse or keyboard, for a user to input information. The display 808 and the input device 809 are both connected to the central processing unit 801 through an input/output controller 810 connected to the system bus 805. The basic input/output system 806 may also include the input/output controller 810 for receiving and processing input from a number of other devices, such as a keyboard, mouse, or electronic stylus. Similarly, the input/output controller 810 also provides output to a display screen, a printer, or another type of output device.
The mass storage device 807 is connected to the central processing unit 801 through a mass storage controller (not shown) connected to the system bus 805. The mass storage device 807 and its associated computer-readable media provide non-volatile storage for the computer device 800. That is, the mass storage device 807 may include a computer-readable medium (not shown) such as a hard disk or CD-ROM drive.
Without loss of generality, computer readable media may comprise computer storage media and communication media. Computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data. Computer storage media includes RAM, ROM, EPROM, EEPROM, flash memory or other solid state memory technology, CD-ROM, DVD, or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices. Of course, those skilled in the art will appreciate that computer storage media is not limited to the foregoing. The system memory 804 and mass storage 807 described above may be collectively referred to as memory.
According to various embodiments of the present application, the computer device 800 may also operate by being connected, through a network such as the Internet, to a remote computer on the network. That is, the computer device 800 may be connected to the network 812 through the network interface unit 811 coupled to the system bus 805, or may be connected to other types of networks or remote computer systems (not shown) using the network interface unit 811.
The memory further includes one or more programs, which are stored in the memory and configured to be executed by the CPU. The one or more programs include instructions for performing the data storage method provided by the embodiments of the present application.
Embodiments of the present application also provide a non-transitory computer-readable storage medium, wherein instructions in the storage medium, when executed by a processor of a computer device, enable the computer device to perform the above-mentioned data storage method.
Embodiments of the present application also provide a computer program product containing instructions, which when run on a computer, cause the computer to execute the above data storage method.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.
Claims (15)
1. A data storage method is applied to a client, and the method comprises the following steps:
receiving a data storage instruction, wherein the data storage instruction carries a preset key value corresponding to data to be stored;
acquiring target hash ring information corresponding to the latest version number from hash ring information corresponding to a plurality of version numbers, wherein the hash ring information is synchronously acquired from storage management nodes, and the hash ring information corresponding to each version number is determined by the storage management nodes through a preset hash algorithm based on changed address information of the storage nodes when the storage management nodes monitor that the number of the storage nodes is changed;
and determining a first target storage node based on the preset key value and the target hash ring information, and storing the data into the first target storage node.
2. The method of claim 1, wherein after storing the data in the first target storage node, further comprising:
receiving a data acquisition instruction, wherein the data acquisition instruction carries the preset key value;
determining a corresponding hash value through the preset hash algorithm based on the preset key value;
and determining a plurality of second target storage nodes based on the hash value and the plurality of hash ring information, and acquiring the data from the plurality of second target storage nodes.
3. The method of claim 2, wherein when the hash ring information corresponding to each version number includes a hash value interval, the determining a plurality of second target storage nodes based on the hash value and the plurality of hash ring information comprises:
for each hash ring information in the plurality of hash ring information, determining a target hash value interval in which the hash value is located in each hash ring information;
and searching the nearest storage node in each target hash value interval on the hash ring along the clockwise direction, and determining the searched storage node as the second target storage node to obtain the plurality of second target storage nodes.
4. The method of claim 3, wherein the acquiring the data from the plurality of second target storage nodes comprises:
sequencing the plurality of second target storage nodes according to the sequence of the version numbers from new to old;
sending a data acquisition request to a first second target storage node in the sequenced second target storage nodes, and stopping sending the data acquisition request when the first second target storage node returns the data; and when the first second target storage node returns an acquisition failure response, sending the data acquisition request to a next second target storage node in the plurality of sorted second target storage nodes until the data is acquired.
5. The method of claim 4, wherein before the sorting the plurality of second target storage nodes in order of version numbers from new to old, the method further comprises:
when second target storage nodes with the same version number are included in the plurality of second target storage nodes, removing the second target storage nodes with the same version number from the plurality of second target storage nodes;
correspondingly, the sorting the plurality of second target storage nodes according to the sequence of the version numbers from new to old comprises:
and sequencing the remaining plurality of second target storage nodes according to the sequence of the version numbers from new to old.
6. The method according to any one of claims 1 to 5, wherein before obtaining the target hash ring information corresponding to the latest version number from the hash ring information corresponding to the plurality of version numbers, the method further includes:
sending an information synchronization request to the storage management node every other preset time, wherein the information synchronization request is used for requesting the storage management node to return hash ring information corresponding to the plurality of version numbers;
and receiving and storing hash ring information corresponding to a plurality of version numbers sent by the storage management node.
7. The method of claim 6, wherein the information synchronization request is further used to request the storage management node to return address information of each storage node, and the receiving and storing hash ring information corresponding to a plurality of version numbers sent by the storage management node comprises:
receiving and storing hash ring information corresponding to a plurality of version numbers sent by the storage management node and address information of each storage node;
accordingly, the storing the data into the first target storage node comprises:
storing the data into the first target storage node based on the address information of the first target storage node.
8. A data storage device configured in a client, the device comprising:
the first receiving module is used for receiving a data storage instruction, and the data storage instruction carries a preset key value corresponding to data to be stored;
the acquisition module is used for acquiring target hash ring information corresponding to the latest version number from the hash ring information corresponding to the multiple version numbers, the multiple hash ring information is synchronously acquired from the storage management node, and the hash ring information corresponding to each version number is determined by a preset hash algorithm based on the changed address information of the storage node when the storage management node monitors that the number of the storage nodes changes;
and the storage module is used for determining a first target storage node based on the preset key value and the target hash ring information, and storing the data into the first target storage node.
9. The apparatus of claim 8, wherein the apparatus further comprises:
the second receiving module is used for receiving a data acquisition instruction, and the data acquisition instruction carries the preset key value;
the determining module is used for determining a corresponding hash value through the preset hash algorithm based on the preset key value;
the determining module is further configured to determine a plurality of second target storage nodes based on the hash value and the plurality of hash ring information, and obtain the data from the plurality of second target storage nodes.
10. The apparatus of claim 9, wherein the determination module is to:
when the hash ring information corresponding to each version number comprises a hash value interval, determining a target hash value interval of the hash value in each hash ring information for each hash ring information in the plurality of hash ring information;
and searching the nearest storage node in each target hash value interval on the hash ring along the clockwise direction, and determining the searched storage node as the second target storage node to obtain the plurality of second target storage nodes.
11. The apparatus of claim 10, wherein the determination module is to:
sequencing the plurality of second target storage nodes according to the sequence of the version numbers from new to old;
sending a data acquisition request to a first second target storage node in the sequenced second target storage nodes, and stopping sending the data acquisition request when the first second target storage node returns the data; and when the first second target storage node returns an acquisition failure response, sending the data acquisition request to a next second target storage node in the plurality of sorted second target storage nodes until the data is acquired.
12. The apparatus of claim 11, wherein the determination module is to:
when second target storage nodes with the same version number are included in the plurality of second target storage nodes, removing the second target storage nodes with the same version number from the plurality of second target storage nodes;
and sequencing the remaining plurality of second target storage nodes according to the sequence of the version numbers from new to old.
13. The apparatus of any one of claims 8-12, wherein the apparatus further comprises:
the synchronization module is used for sending an information synchronization request to the storage management node every other preset time, wherein the information synchronization request is used for requesting the storage management node to return hash ring information corresponding to the plurality of version numbers;
and the receiving and storing module is used for receiving and storing the hash ring information corresponding to the plurality of version numbers sent by the storage management node.
14. The apparatus of claim 13,
the receiving and storing module is further used for: receiving and storing hash ring information corresponding to a plurality of version numbers sent by the storage management node and address information of each storage node;
the storage module is configured to store the data in the first target storage node based on the address information of the first target storage node.
15. A computer-readable storage medium having instructions stored thereon, wherein the instructions, when executed by a processor, implement the steps of any of the methods of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811303561.3A CN111147226B (en) | 2018-11-02 | 2018-11-02 | Data storage method, device and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811303561.3A CN111147226B (en) | 2018-11-02 | 2018-11-02 | Data storage method, device and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111147226A (en) | 2020-05-12 |
CN111147226B (en) | 2023-07-18 |
Family
ID=70516256
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811303561.3A Active CN111147226B (en) | 2018-11-02 | 2018-11-02 | Data storage method, device and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111147226B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113282243A (en) * | 2021-06-09 | 2021-08-20 | 杭州海康威视系统技术有限公司 | Method and device for storing object file |
CN114844911A (en) * | 2022-04-20 | 2022-08-02 | 网易(杭州)网络有限公司 | Data storage method and device, electronic equipment and computer readable storage medium |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101996217A (en) * | 2009-08-24 | 2011-03-30 | 华为技术有限公司 | Method for storing data and memory device thereof |
CN102739704A (en) * | 2011-04-02 | 2012-10-17 | 中兴通讯股份有限公司 | Method and system for data migration in peer-to-peer network |
CN104636286A (en) * | 2015-02-06 | 2015-05-20 | 华为技术有限公司 | Data access method and equipment |
CN106503139A (en) * | 2016-10-20 | 2017-03-15 | 上海携程商务有限公司 | Dynamic data access method and system |
CN106951179A (en) * | 2016-01-07 | 2017-07-14 | 杭州海康威视数字技术股份有限公司 | A kind of data migration method and device |
US20170300552A1 (en) * | 2016-04-18 | 2017-10-19 | Amazon Technologies, Inc. | Versioned hierarchical data structures in a distributed data store |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113282243A (en) * | 2021-06-09 | 2021-08-20 | 杭州海康威视系统技术有限公司 | Method and device for storing object file |
CN114844911A (en) * | 2022-04-20 | 2022-08-02 | 网易(杭州)网络有限公司 | Data storage method and device, electronic equipment and computer readable storage medium |
CN114844911B (en) * | 2022-04-20 | 2024-07-09 | 网易(杭州)网络有限公司 | Data storage method, device, electronic equipment and computer readable storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN111147226B (en) | 2023-07-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
EP3125501B1 (en) | File synchronization method, server, and terminal | |
US20190087439A1 (en) | Data replication from a cloud-based storage resource | |
US10187255B2 (en) | Centralized configuration data in a distributed file system | |
CN104580439B (en) | Method for uniformly distributing data in cloud storage system | |
CN111399764B (en) | Data storage method, data reading device, data storage equipment and data storage medium | |
CN111800468A (en) | Cloud-based multi-cluster management method, device, medium and electronic equipment | |
CN111291062B (en) | Data synchronous writing method and device, computer equipment and storage medium | |
CN113220660A (en) | Data migration method, device and equipment and readable storage medium | |
CN117931756B (en) | FTP file real-time monitoring and analyzing system and method based on Flink | |
CN114422537B (en) | Multi-cloud storage system, multi-cloud data reading and writing method and electronic equipment | |
CN111147226B (en) | Data storage method, device and storage medium | |
CN115004662A (en) | Data synchronization method, data synchronization device, data storage system and computer readable medium | |
JP6610189B2 (en) | Synchronize collaborative documents with online document management systems | |
CN112000850B (en) | Method, device, system and equipment for processing data | |
CN112363980B (en) | Data processing method and device of distributed system | |
CN106570068B (en) | Information recommendation method and device | |
CN111431951B (en) | Data processing method, node equipment, system and storage medium | |
CN113127438A (en) | Method, apparatus, server and medium for storing data | |
CN115129789A (en) | Bucket index storage method, device and medium of distributed object storage system | |
CN111966533B (en) | Electronic file management method, electronic file management device, computer equipment and storage medium | |
CN110677497B (en) | Network medium distribution method and device | |
CN108376104B (en) | Node scheduling method and device and computer readable storage medium | |
CN112035174B (en) | Method, apparatus and computer storage medium for running web service | |
CN108733822A (en) | A kind of file memory method, device, electronic equipment and storage medium | |
CN113010475B (en) | Method and apparatus for storing track data |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||