Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
In the prior art, when a system crashes and data recovery is required, the Redo logs in each node are read, all the Redo logs are merged according to the operation sequence, and the corresponding data pages in the database are then played back according to the merged Redo logs to recover the data. Because the merged Redo logs are strictly sequential, they must be played back one by one in a serial (i.e., single-process) manner, so the data recovery efficiency is very low. In view of this defect in the prior art, the present application proposes a data recovery scheme whose main principle is as follows: the logs of the storage nodes are redistributed so that the operation logs (Redo logs) corresponding to the same data page are allocated to one and only one storage node, and each storage node plays back the operation logs of the same data page according to the operation sequence. Data is thereby recovered in parallel, and the data recovery efficiency is improved.
The foregoing describes the technical principle of embodiments of the present invention. Specific technical solutions of the embodiments of the present invention are described in further detail below through a plurality of embodiments.
Example 1
Fig. 1 is a system block diagram of an embodiment of a data recovery system provided by the present invention, and the structure shown in fig. 1 is only one example of a service system to which the technical solution of the present invention can be applied. As shown in fig. 1, the data recovery system includes: a control node 11 and a plurality of storage nodes 12.
The control node 11 is configured to reassign the operation logs corresponding to the same data page to the same storage node 12 according to the data page (stored in the database 13) corresponding to each operation log (i.e., Redo log) in the storage nodes 12; the plurality of storage nodes 12 are respectively configured to play back the operation logs corresponding to the same data page according to the operation sequence.
The data recovery system provided by the embodiment of the invention can be used to execute the processing flows shown in fig. 4 and fig. 5 below. The data recovery system can be applied to a distributed shared storage database system. When the system crashes and data needs to be recovered, the control node 11 first reads all operation logs from each storage node 12 and then, according to the data page corresponding to each operation log, allocates the operation logs corresponding to the same data page to one and only one storage node 12. That is, after the operation logs are reassigned, each storage node 12 may hold operation logs corresponding to one or more data pages, but the operation logs corresponding to any given data page are stored in only one storage node 12. After the operation logs have been redistributed, each storage node 12 plays back, according to the operation sequence, the operation logs that it stores for each data page, so that the plurality of storage nodes 12 recover data in parallel.
According to the data recovery system provided by the embodiment of the invention, the operation logs in the storage nodes are redistributed, so that the operation logs corresponding to the same data page are uniquely distributed to the same storage node, and then each storage node performs playback operation on the operation logs corresponding to the same data page, thereby realizing parallel recovery of data, greatly improving data recovery efficiency and shortening data recovery time.
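The flow described in this embodiment can be illustrated with a minimal sketch. The record layout (sequence number, page number, value), the modulo placement rule, the use of threads to stand in for storage nodes, and all names below are assumptions made for illustration only and are not part of the disclosure.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical log record: (sequence number, page number, value written to the page).
LOGS_BY_NODE = {
    0: [(1, 7, "a"), (4, 8, "d")],
    1: [(2, 8, "b"), (3, 7, "c")],
}
NUM_NODES = 2


def redistribute(logs_by_node, num_nodes):
    """Control node: send every log of a given page to exactly one storage node."""
    assigned = defaultdict(list)
    for logs in logs_by_node.values():
        for record in logs:
            _, page_no, _ = record
            # Assumed placement rule; any rule that maps a page to one node works.
            assigned[page_no % num_nodes].append(record)
    return assigned


def replay(records):
    """Storage node: play back its own logs in operation (sequence) order."""
    pages = {}
    for _, page_no, value in sorted(records):
        pages[page_no] = value
    return pages


def recover(logs_by_node, num_nodes):
    """Redistribute the logs, then let every storage node replay in parallel."""
    assigned = redistribute(logs_by_node, num_nodes)
    database = {}
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        for pages in pool.map(replay, assigned.values()):
            database.update(pages)  # page sets of different nodes are disjoint
    return database
```

Because no data page is shared between two storage nodes in this sketch, the per-node results can be combined without any ordering constraint between nodes.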
Example 2
Fig. 2 is a schematic structural diagram of a control node in an embodiment of a data recovery system according to the present invention. As shown in fig. 2, in the data recovery system provided by the present invention, the control node 11 may include: a reading module 111, a calculating module 112 and a sending module 113.
The reading module 111 is configured to read the operation logs in the storage nodes 12; the calculating module 112 is configured to obtain the page number of the data page corresponding to each operation log and to calculate a hash value of the page number; the sending module 113 is configured to send each operation log, according to the hash value corresponding to that operation log, to the storage node 12 that uniquely corresponds to the hash value.
In the embodiment of the present invention, when the system crashes and data needs to be recovered, the control node 11 first allocates the operation logs corresponding to the same data page to one and only one storage node 12 according to the data page corresponding to each operation log in the storage nodes 12. Specifically, the reading module 111 reads all the operation logs from each storage node 12, and the calculating module 112 then obtains the page number of the data page corresponding to each operation log and calculates the hash value of each page number. Finally, the sending module 113 sends each operation log to the storage node 12 uniquely corresponding to the hash value calculated for that operation log by the calculating module 112. After the operation logs are reassigned, each storage node 12 may hold operation logs corresponding to one or more data pages, but the operation logs corresponding to any given data page are stored in only one storage node 12.
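The routing performed by the calculating module and the sending module can be sketched as follows. The embodiment does not specify a hash function or how a hash value is mapped to a storage node; the CRC32 hash, the modulo mapping, the dictionary-based log records, and the function names are assumptions chosen only to make the example concrete.

```python
import zlib


def hash_page_number(page_no: int) -> int:
    """Assumed hash function: CRC32 over the 8-byte big-endian page number."""
    return zlib.crc32(page_no.to_bytes(8, "big"))


def target_node(page_no: int, num_nodes: int) -> int:
    """Assumed mapping from a hash value to the index of one storage node."""
    return hash_page_number(page_no) % num_nodes


def route(logs, num_nodes):
    """Group logs by destination node; each log is annotated with its hash value."""
    outbox = {node: [] for node in range(num_nodes)}
    for log in logs:
        h = hash_page_number(log["page_no"])
        outbox[h % num_nodes].append({**log, "hash": h})
    return outbox


# Hypothetical operation logs read from all storage nodes.
logs = [
    {"seq": 1, "page_no": 10},
    {"seq": 2, "page_no": 11},
    {"seq": 3, "page_no": 10},
]
outbox = route(logs, num_nodes=3)
```

Since the hash value depends only on the page number, two operation logs of the same data page always receive the same hash value and therefore the same destination node.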
According to the data recovery system provided by the embodiment of the invention, the hash value of the page number of the data page corresponding to each operation log is calculated to redistribute each operation log, so that the operation logs corresponding to the same data page are uniquely distributed to the same storage node, and then each storage node performs playback operation on the operation logs corresponding to the same data page respectively, thereby realizing parallel recovery of data, greatly improving data recovery efficiency and shortening data recovery time.
Example 3
Fig. 3 is a schematic structural diagram of a storage node in an embodiment of a data recovery system according to the present invention. As shown in fig. 3, in the data recovery system provided by the present invention, the storage node 12 may include: a receiving module 121, a merging module 122 and a playback module 123.
The receiving module 121 is configured to receive the operation logs sent by the control node 11; the merging module 122 is configured to merge the operation logs with the same hash value according to the operation sequence; the playback module 123 is configured to play back the merged operation logs according to the operation sequence.
In the embodiment of the present invention, when the system crashes and data needs to be recovered, the control node 11 first allocates the operation logs corresponding to the same data page to one and only one storage node 12 according to the data page corresponding to each operation log in the storage nodes 12. Then, each storage node 12 plays back, according to the operation sequence, the operation logs that it stores for each data page, so that the plurality of storage nodes 12 recover data in parallel. Specifically, the receiving module 121 receives the operation logs sent by the control node 11, and the merging module 122 then merges the operation logs having the same hash value according to the operation sequence of the operation logs; finally, the playback module 123 plays back the merged operation logs in the above operation sequence.
Further, the merging module 122 may be specifically configured to build a hash table and to store the operation logs with the same hash value in the same record of the hash table according to the above operation sequence.
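One possible way to build such a hash table is sketched below. The field names `hash` and `seq`, and the use of a sequence number to represent the operation sequence, are assumptions for illustration; the embodiment does not prescribe a particular data layout.

```python
from collections import defaultdict


def merge_by_hash(received_logs):
    """Store logs with the same hash value in one record, in operation order."""
    table = defaultdict(list)
    for log in received_logs:
        table[log["hash"]].append(log)
    for record in table.values():
        # "seq" is a hypothetical field standing in for the operation sequence.
        record.sort(key=lambda log: log["seq"])
    return dict(table)


# Hypothetical logs as they might arrive at one storage node, out of order.
received = [
    {"hash": 5, "seq": 9, "op": "update"},
    {"hash": 2, "seq": 4, "op": "insert"},
    {"hash": 5, "seq": 3, "op": "insert"},
]
table = merge_by_hash(received)
```

Each record of the resulting table can then be handed to the playback module as one ordered list.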
In addition, in the embodiment of the present invention, the playback module 123 is further configured to delete the operation logs that have been played back. That is, after an operation log has been played back, it is deleted by the playback module 123 to free up storage space.
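Deleting each operation log once it has been played back can be sketched with a queue, as below. The in-memory `deque`, the `pages` dictionary standing in for the data pages, and the field names are assumptions for illustration only.

```python
from collections import deque


def play_back_and_delete(queue: deque, pages: dict) -> int:
    """Apply logs from the front of the queue, deleting each one after playback."""
    replayed = 0
    while queue:
        log = queue[0]                        # next log in operation order
        pages[log["page_no"]] = log["value"]  # play the log back onto its page
        queue.popleft()                       # delete only after playback is done
        replayed += 1
    return replayed


# Hypothetical merged logs of one hash-table record, already in operation order.
pending = deque([
    {"page_no": 7, "value": "v1"},
    {"page_no": 7, "value": "v2"},
])
pages = {}
count = play_back_and_delete(pending, pages)
```

In this sketch a log is removed only after it has been applied, so an interruption in the middle of playback never loses a log that has not yet taken effect.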
Further, in the data recovery system provided by the embodiment of the present invention, the storage node 12 may further include a detection module 124. The detection module 124 may be configured to compare, while the playback module 123 plays back the operation logs, the version numbers in the operation log currently being played back and in the previous operation log; when the version numbers are inconsistent, to instruct the playback module 123 to pause playback until an operation log with a consistent version number appears; and then to instruct the playback module 123 to continue playback from that operation log.
In the embodiment of the invention, each operation log records two version numbers of the corresponding data page: the version number before the operation and the version number after the operation. These can be used to judge whether an operation log is missing. Specifically, when the playback module 123 plays back the operation logs, the detection module 124 compares the version number after the operation in the previous operation log with the version number before the operation in the current operation log. If the two are inconsistent, an operation log is missing, so the detection module 124 instructs the playback module 123 to pause playback until an operation log with a consistent version number appears, and then instructs the playback module 123 to continue playback from that operation log.
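The version-number check can be illustrated with the following sketch for the operation logs of a single data page. The field names `before`, `after` and `op`, the buffering of held-back logs in a dictionary, and the modelling of a pause as waiting for further arrivals are all assumptions for illustration.

```python
def replay_with_version_check(arrivals, start_version):
    """Replay the logs of one page, pausing whenever a version gap is detected."""
    current = start_version  # version after the last log that was played back
    waiting = {}             # before-version -> log held back while paused
    replayed = []
    for log in arrivals:
        waiting[log["before"]] = log
        # Continue only while the log whose before-version matches is available.
        while current in waiting:
            nxt = waiting.pop(current)
            replayed.append(nxt["op"])
            current = nxt["after"]
    # Any logs left in "waiting" are still blocked behind a missing log.
    return replayed, current, sorted(waiting)


# Hypothetical arrival order in which log B (version 2 -> 3) shows up late.
arrivals = [
    {"before": 1, "after": 2, "op": "A"},
    {"before": 3, "after": 4, "op": "C"},
    {"before": 2, "after": 3, "op": "B"},
]
result = replay_with_version_check(arrivals, start_version=1)
```

Here log C is held back because its before-operation version (3) does not match the after-operation version of log A (2); playback resumes once log B arrives and closes the gap.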
According to the data recovery system provided by the embodiment of the invention, the hash value of the page number of the data page corresponding to each operation log is calculated to redistribute each operation log, so that the operation logs corresponding to the same data page are uniquely distributed to the same storage node, then, each storage node respectively merges the operation logs with the same hash value, and plays back the merged operation logs according to the operation sequence, thereby realizing parallel recovery of data, greatly improving the data recovery efficiency and shortening the data recovery time.
Example 4
Fig. 4 is a flowchart of an embodiment of a data recovery method according to the present invention, where an execution body of the method may be the data recovery system, or may be various terminals or server devices with a software data recovery function, or may be a system or a chip integrated on these devices. As shown in fig. 4, the data recovery method includes the steps of:
S401, the control node allocates the operation logs corresponding to the same data page to one and only one storage node according to the data page corresponding to each operation log in the storage nodes.
In the embodiment of the invention, the data recovery method can be applied to a distributed shared storage database system. When the system crashes and data needs to be recovered, the control node first reads all operation logs from each storage node and then, according to the data page corresponding to each operation log, allocates the operation logs corresponding to the same data page to one and only one storage node. That is, after the operation logs are reassigned, each storage node may hold operation logs corresponding to one or more data pages, but the operation logs corresponding to any given data page are stored in only one storage node.
S402, the storage nodes play back the operation logs corresponding to the same data page according to the operation sequence.
In the embodiment of the invention, after the control node redistributes the operation logs to the storage nodes, each storage node plays back, according to the operation sequence, the operation logs that it stores for each data page, so that the plurality of storage nodes recover data in parallel.
According to the data recovery method provided by the embodiment of the invention, the operation logs in the storage nodes are redistributed, so that the operation logs corresponding to the same data page are uniquely distributed to the same storage node, and then each storage node performs playback operation on the operation logs corresponding to the same data page, thereby realizing parallel recovery of data, greatly improving data recovery efficiency and shortening data recovery time.
Example 5
Fig. 5 is a flowchart of another embodiment of a data recovery method according to the present invention. As shown in fig. 5, on the basis of the embodiment shown in fig. 4, the data recovery method provided in this embodiment may further include the following steps:
S501, the control node reads the operation logs in the storage nodes.
S502, obtaining page numbers of data pages corresponding to the operation logs, and calculating hash values of the page numbers.
S503, according to the hash value corresponding to the operation log, each operation log is sent to a storage node uniquely corresponding to the hash value.
In the embodiment of the invention, when the system crashes and data needs to be recovered, the control node first allocates the operation logs corresponding to the same data page to one and only one storage node according to the data page corresponding to each operation log in the storage nodes. Specifically, the control node reads all operation logs from each storage node, then obtains the page number of the data page corresponding to each operation log, and calculates the hash value of each page number. Finally, each operation log is sent to the storage node uniquely corresponding to the hash value calculated for that operation log. After the operation logs are reassigned, each storage node may hold operation logs corresponding to one or more data pages, but the operation logs corresponding to any given data page are stored in only one storage node.
S504, the storage node receives the operation log sent by the control node.
S505, merging the operation logs with the same hash value according to the operation sequence.
S506, playing back the merged operation logs according to the operation sequence.
In the embodiment of the present invention, each storage node plays back, according to the operation sequence, the operation logs that it stores for each data page, so that the plurality of storage nodes recover data in parallel. Specifically, the storage node receives the operation logs sent by the control node, and then merges the operation logs with the same hash value according to the operation sequence of the operation logs; finally, the merged operation logs are played back according to the operation sequence.
Specifically, step S505 may be to build a hash table and store the operation logs with the same hash value in the same record of the hash table according to the operation sequence.
S507, deleting the operation log which has been played back.
In the embodiment of the invention, after an operation log has been played back, it is deleted to release storage space.
S508, when the operation logs are played back, comparing the version numbers in the current operation log and the previous operation log; when the version number after the operation in the previous operation log is inconsistent with the version number before the operation in the current operation log, pausing playback until an operation log with a consistent version number appears, and then continuing playback from that operation log.
In the embodiment of the invention, each operation log records two version numbers of the corresponding data page: the version number before the operation and the version number after the operation. These can be used to judge whether an operation log is missing. Specifically, when the operation logs are played back, the version number after the operation in the previous operation log is compared with the version number before the operation in the operation log currently being played back. If the two are inconsistent, an operation log is missing, so playback is paused until an operation log with a consistent version number appears, and playback then continues from that operation log.
According to the data recovery method provided by the embodiment of the invention, the hash value of the page number of the data page corresponding to each operation log is calculated to redistribute each operation log, so that the operation logs corresponding to the same data page are uniquely distributed to the same storage node, then each storage node respectively merges the operation logs with the same hash value, and plays back the merged operation logs according to the operation sequence, thereby realizing parallel recovery of data, greatly improving the data recovery efficiency and shortening the data recovery time.
Example 6
The foregoing describes the internal functions and structure of a data recovery system that may be implemented as an electronic device. Fig. 6 is a schematic structural diagram of an embodiment of an electronic device according to the present invention. As shown in fig. 6, the electronic device includes a memory 61 and a processor 62.
A memory 61 for storing a program. In addition to the programs described above, the memory 61 may also be configured to store other various data to support operations on the electronic device. Examples of such data include instructions for any application or method operating on the electronic device, contact data, phonebook data, messages, pictures, videos, and the like.
The memory 61 may be implemented by any type or combination of volatile or non-volatile memory devices, such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disk.
A processor 62 coupled to the memory 61, executing a program stored in the memory 61, causing the electronic device to:
control the control node to reassign the operation logs corresponding to the same data page to one and only one storage node according to the data page corresponding to each operation log in the storage nodes;
and control the storage nodes to respectively play back the operation logs corresponding to the same data page according to the operation sequence.
Further, as shown in fig. 6, the electronic device may further include: communication component 63, power component 64, audio component 65, display 66, and other components. Only some of the components are schematically shown in fig. 6, which does not mean that the electronic device only comprises the components shown in fig. 6.
The communication component 63 is configured to facilitate wired or wireless communication between the electronic device and other devices. The electronic device may access a wireless network based on a communication standard, such as WiFi, 2G, or 3G, or a combination thereof. In one exemplary embodiment, the communication component 63 receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 63 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, Infrared Data Association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
The power component 64 provides power to the various components of the electronic device. The power component 64 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the electronic device.
The audio component 65 is configured to output and/or input audio signals. For example, the audio component 65 includes a microphone (MIC) configured to receive external audio signals when the electronic device is in an operational mode, such as a call mode, a recording mode, or a voice recognition mode. The received audio signals may be further stored in the memory 61 or transmitted via the communication component 63. In some embodiments, the audio component 65 further includes a speaker for outputting audio signals.
The display 66 includes a screen, which may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from a user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensor may sense not only the boundary of a touch or slide action, but also the duration and pressure associated with the touch or slide operation.
Those of ordinary skill in the art will appreciate that: all or part of the steps for implementing the method embodiments described above may be performed by hardware associated with program instructions. The foregoing program may be stored in a computer readable storage medium. The program, when executed, performs steps including the method embodiments described above; and the aforementioned storage medium includes: various media that can store program code, such as ROM, RAM, magnetic or optical disks.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the invention.