CN113626240A - Cluster fault recovery method and device, computer equipment and storage medium - Google Patents

Publication number
CN113626240A
Authority
CN
China
Prior art keywords
application server
message
target application
target
service processing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202110914006.XA
Other languages
Chinese (zh)
Inventor
赵楠
张卫
牛亮亮
李�杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Archforce Financial Technology Co Ltd
Original Assignee
Shenzhen Archforce Financial Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Archforce Financial Technology Co Ltd
Priority to CN202110914006.XA
Publication of CN113626240A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0793 Remedial or corrective actions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0706 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment
    • G06F11/073 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment in a memory management context, e.g. virtual memory or cache management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication
    • G06F9/546 Message passing systems or structures, e.g. queues

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Software Systems (AREA)
  • Retry When Errors Occur (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The application relates to a cluster fault recovery method and device, computer equipment and a storage medium. The method comprises the following steps: when target application programs in all application servers in the cluster fail, determining the quantity of historical messages stored in the message persistence files of all the application servers; determining the application server with the largest number of stored historical messages as a target application server; reading a target historical message from a message persistence file of a target application server through the target application server; and performing service processing on the target historical message in the memory through a target application program in the target application server to recover the memory state data of the target application server. By adopting the method, the final memory state data before the cluster failure can be reliably recovered.

Description

Cluster fault recovery method and device, computer equipment and storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method and an apparatus for recovering a cluster failure, a computer device, and a storage medium.
Background
Clustering technology emerged with the development of computer technology. It allows relatively high gains in performance, reliability and flexibility to be achieved at relatively low cost. A cluster is a group of mutually independent computers that are interconnected by a high-speed network and managed as a single system. To a client interacting with it, a cluster appears as a single stand-alone server. Clusters are deployed to improve availability and scalability. However, when the target application programs in all application servers in a cluster fail, the prior art cannot reliably recover the final memory state data that the cluster held before the failure.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a cluster failure recovery method, device, computer device and storage medium capable of reliably recovering the final memory state data of the cluster before failure.
A method of cluster failure recovery, the method comprising:
when target application programs in all application servers in the cluster fail, determining the quantity of historical messages stored in the message persistence files of all the application servers;
determining the application server with the largest number of stored historical messages as a target application server;
reading a target history message from a message persistence file of the target application server through the target application server;
and performing service processing on the target historical message in a memory through a target application program in the target application server to recover the memory state data of the target application server.
In one embodiment, the method further comprises:
when the memory state data of the target application server are completely recovered, acquiring a new message from a network;
and storing the new message into a message persistence file of the target application server, and performing service processing on the new message based on the memory state data through a target application program in the target application server to obtain a service processing result corresponding to the new message.
In one embodiment, after performing, by the target application program in the target application server, service processing on the new message based on the memory state data to obtain a service processing result corresponding to the new message, the method further includes:
and storing the service processing result corresponding to the new message into a result persistence file of the target application server.
In one embodiment, the target history message is all history messages in the message persistence file of the target application server, or part of history messages specified by preset configuration parameters in the message persistence file of the target application server.
In one embodiment, after the target application program in the target application server performs service processing on the target history message in the memory, the method further includes:
acquiring a service processing result corresponding to the target historical message;
and sending all the service processing results corresponding to the target historical messages to a network, or filtering the service processing results corresponding to the target historical messages and sending the filtered service processing results corresponding to the target historical messages to the network.
In one embodiment, the method further comprises:
when the target application program in at least one application server in the cluster is not failed, acquiring a message from a network through the application server which is not failed;
storing the acquired message into the message persistence file of the non-failure application server, and performing service processing on the acquired message through a target application program in the non-failure application server to obtain a service processing result corresponding to the acquired message.
A cluster failure recovery device, wherein a cluster comprises a plurality of application servers, and each application server runs a target application program, the device comprises:
the determining module is used for determining the quantity of the historical messages stored in the message persistence files of the application servers when the target application programs in all the application servers in the cluster fail; determining the application server with the largest number of stored historical messages as a target application server;
the reading module is used for reading a target historical message from the message persistence file of the target application server through the target application server;
and the processing module is used for performing service processing on the target historical message in the memory through a target application program in the target application server so as to recover the memory state data of the target application server.
A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the following steps when executing the computer program:
when target application programs in all application servers in the cluster fail, determining the quantity of historical messages stored in the message persistence files of all the application servers;
determining the application server with the largest number of stored historical messages as a target application server;
reading a target history message from a message persistence file of the target application server through the target application server;
and performing service processing on the target historical message in a memory through a target application program in the target application server to recover the memory state data of the target application server.
A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the steps of:
when target application programs in all application servers in the cluster fail, determining the quantity of historical messages stored in the message persistence files of all the application servers;
determining the application server with the largest number of stored historical messages as a target application server;
reading a target history message from a message persistence file of the target application server through the target application server;
and performing service processing on the target historical message in a memory through a target application program in the target application server to recover the memory state data of the target application server.
According to the cluster fault recovery method and device, the computer equipment and the storage medium, when the target application programs in all application servers in a cluster fail, the number of historical messages stored in the message persistence file of each application server is determined; the application server with the largest number of stored historical messages is determined as the target application server; a target historical message is read from the message persistence file of the target application server through the target application server; and service processing is performed on the target historical message in the memory through the target application program in the target application server to recover the memory state data of the target application server. In this way, a message persistence file is set up in each application server, so that the target application program synchronously stores each received message into the message persistence file while processing that message in the memory. Because the memory is bound to the process of the target application program, if the target application program exits due to a fault, the memory is reclaimed by the system and the memory state data of the application server is lost. The message persistence file, however, is not affected by the process of the target application program, so the messages stored in it are not lost. When the target application programs in all the application servers in the cluster fail, the target historical messages are read from the message persistence file of the application server with the largest number of stored historical messages and are reprocessed by the target application program, so that the final memory state data before the cluster failure is recovered.
Drawings
FIG. 1 is a diagram illustrating an application scenario of a cluster failure recovery method in an embodiment;
FIG. 2 is a flowchart illustrating a cluster failure recovery method according to an embodiment;
FIG. 3 is a diagram illustrating message persistence and message processing prior to a cluster failure in one embodiment;
FIG. 4 is a diagram illustrating the processing of target historical messages after a cluster failure in one embodiment;
FIG. 5 is a block diagram of a cluster failure recovery device in an embodiment;
FIG. 6 is a block diagram of a cluster failure recovery device in an embodiment;
FIG. 7 is a diagram illustrating an internal structure of a computer device according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
The cluster fault recovery method provided by the application can be applied to the application environment shown in fig. 1. The application environment includes a cluster of multiple application servers 102. Those skilled in the art will understand that the application environment shown in fig. 1 is only a part of the scenario related to the present application, and does not constitute a limitation to the application environment of the present application.
When target application programs in all application servers 102 in the cluster fail, determining the quantity of historical messages stored in the message persistence files of all the application servers 102; determining the application server 102 with the largest number of stored history messages as a target application server; reading a target historical message from a message persistence file of a target application server through the target application server; and performing service processing on the target historical message in the memory through a target application program in the target application server to recover the memory state data of the target application server.
In one embodiment, as shown in fig. 2, a cluster failure recovery method is provided, which is described by taking the application server 102 in fig. 1 as an example, and includes the following steps:
S202, when the target application programs in all the application servers in the cluster fail, the number of history messages stored in the message persistence file of each application server is determined.
The message persistence file is a file stored on a disk of the application server and is used to persistently store messages received from the network. Persistent storage refers to a storage mode in which the stored state is not affected by the running state of the target application program; that is, messages stored in the message persistence file are not lost when the target application program exits due to a fault. The history messages are the messages that were stored in the message persistence file before the target application programs in all application servers in the cluster failed.
Specifically, a monitoring thread in each application server may monitor the health state of the target application program in that application server in real time. When monitoring shows that the target application program in at least one application server in the cluster has not failed, the cluster may continue to receive and process messages through the non-failed application server. When monitoring shows that the target application programs in all the application servers in the cluster have failed, the number of history messages stored in the message persistence file of each application server may be determined.
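The decision described for step S202 can be sketched as follows. This is a minimal illustration, not the patented implementation: the `ClusterAction` enum, the `decide_action` function and the `health` mapping (server ID to "target application program is running") are hypothetical names, and how the monitoring threads share their observations is not specified in the source.

```python
from enum import Enum


class ClusterAction(Enum):
    """Hypothetical outcomes of one monitoring pass."""
    CONTINUE_SERVING = "continue_serving"
    START_RECOVERY = "start_recovery"


def decide_action(health: dict[str, bool]) -> ClusterAction:
    """Decide what the cluster should do given per-server health flags.

    `health` maps a server ID to True when the target application program
    on that server is still running (an assumed representation).
    """
    if not health:
        raise ValueError("cluster has no application servers")
    # Recovery from persistence files starts only when every target
    # application program has failed; otherwise a healthy server keeps serving.
    if any(health.values()):
        return ClusterAction.CONTINUE_SERVING
    return ClusterAction.START_RECOVERY
```

A call such as `decide_action({"server-a": False, "server-b": False})` would select the recovery path, which then proceeds to count the history messages in each message persistence file.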
S204, the application server with the largest number of stored history messages is determined as the target application server.
Wherein the target application server is an application server that is a recovery target.
Specifically, the target application programs in the application servers in the cluster fail at different times, so the message persistence files of the application servers store different numbers of history messages. The application server whose target application program failed earliest has the fewest history messages stored in its message persistence file, and the application server whose target application program failed latest has the most. The application server with the largest number of stored history messages may therefore be determined as the target application server.
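Steps S202 and S204 can be sketched as follows. The sketch assumes that each message persistence file stores one message per line, which the source does not specify; the function names and the tie-break behaviour are likewise assumptions made for illustration.

```python
from pathlib import Path


def count_history_messages(persistence_file: Path) -> int:
    """Count stored messages, assuming one message per non-empty line."""
    if not persistence_file.exists():
        return 0
    with persistence_file.open("r", encoding="utf-8") as f:
        return sum(1 for line in f if line.strip())


def select_target_server(persistence_files: dict[str, Path]) -> str:
    """Return the ID of the server whose persistence file holds the most messages."""
    if not persistence_files:
        raise ValueError("cluster has no application servers")
    counts = {
        server_id: count_history_messages(path)
        for server_id, path in persistence_files.items()
    }
    # On a tie, max() returns the first server encountered. The source does
    # not define a tie-break rule, so this behaviour is an assumption.
    return max(counts, key=counts.get)
```

In a real cluster the counts would have to be gathered from each server's own disk; here the paths are simply passed in as a mapping to keep the example self-contained.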
S206, reading the target history message from the message persistence file of the target application server through the target application server.
Specifically, the target application server may read the target history message from a message persistence file of the target application server.
S208, performing service processing on the target history message in the memory through the target application program in the target application server to recover the memory state data of the target application server.
The memory is affected by the running state of the target application program; that is, the messages held in the memory and the memory state data corresponding to those messages are lost when the target application program exits due to a fault.
Specifically, after the target history message is read, service processing may be performed on it in the memory through the target application program in the target application server, so as to recover the memory state data of the target application server. When the target application program in the target application server exits due to a fault, the messages held in the memory before the fault and the memory state data corresponding to those messages are lost. Because the service processing of subsequent new messages must be based on the memory state data corresponding to the target history messages, losing that data means that new messages cannot be acquired from the network for continued service processing. In this case, the target history messages may be processed again in the memory through the target application program in the target application server, so as to recover the memory state data that the target application server held before the target application program failed. After the memory state data is recovered, the target application server can continue to receive new messages from the network and perform service processing on them through the target application program based on the recovered memory state data.
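The replay in step S208 can be sketched as follows. The service logic here (applying a signed amount to an account balance) and the JSON-lines message encoding are hypothetical stand-ins; the source does not describe the actual service processing or the file format. The point of the sketch is only that re-running the same deterministic processing over the persisted messages, in their original order, rebuilds the same in-memory state.

```python
import json
from typing import Iterable

State = dict[str, int]


def apply_message(state: State, message: dict) -> dict:
    """Hypothetical service processing: apply a signed amount to a balance."""
    account = message["account"]
    state[account] = state.get(account, 0) + message["amount"]
    return {"account": account, "balance": state[account]}


def recover_state(history_lines: Iterable[str]) -> tuple[State, list[dict]]:
    """Rebuild in-memory state by re-processing persisted messages in order."""
    state: State = {}
    results = []
    for line in history_lines:
        if line.strip():  # skip blank lines in the persistence file
            results.append(apply_message(state, json.loads(line)))
    return state, results
```

This approach only reproduces the pre-failure state if the service processing depends on nothing but the messages themselves; any dependence on wall-clock time or external calls would need separate handling.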
In the cluster fault recovery method described above, when the target application programs in all application servers in a cluster fail, the number of historical messages stored in the message persistence file of each application server is determined; the application server with the largest number of stored historical messages is determined as the target application server; a target historical message is read from the message persistence file of the target application server through the target application server; and service processing is performed on the target historical message in the memory through the target application program in the target application server to recover the memory state data of the target application server. In this way, a message persistence file is set up in each application server, so that the target application program synchronously stores each received message into the message persistence file while processing that message in the memory. Because the memory is bound to the process of the target application program, if the target application program exits due to a fault, the memory is reclaimed by the system and the memory state data of the application server is lost. The message persistence file, however, is not affected by the process of the target application program, so the messages stored in it are not lost. When the target application programs in all the application servers in the cluster fail, the target historical messages are read from the message persistence file of the application server with the largest number of stored historical messages and are reprocessed by the target application program, so that the final memory state data before the cluster failure is recovered.
In an embodiment, the cluster failure recovery method further includes: when the memory state data of the target application server is completely recovered, acquiring a new message from the network; and storing the new message into a message persistence file of the target application server, and performing service processing on the new message based on the memory state data through a target application program in the target application server to obtain a service processing result corresponding to the new message.
The new message is a message that the target application server obtains from the network after its memory state data has been completely recovered.
Specifically, when the memory state data of the target application server is completely recovered, the target application server may obtain a new message from the network and store it in the message persistence file of the target application server, so that the new message is not lost the next time the target application program of the target application server fails. Once the new message has been stored in the message persistence file of the target application server, service processing is performed on it asynchronously through the target application program in the target application server based on the memory state data, to obtain a service processing result corresponding to the new message.
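The persist-then-process ordering for new messages can be sketched as follows. The JSON-lines format, the function names and the `process` callback are assumptions made for illustration. For simplicity the sketch calls the processing function directly after the write, whereas the embodiment describes the processing as asynchronous.

```python
import json
import os
from typing import Any, Callable


def persist_message(persistence_path: str, message: dict) -> None:
    """Append one message as a JSON line and force it to disk."""
    with open(persistence_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(message) + "\n")
        f.flush()
        # fsync asks the OS to write its buffers to the storage device, so
        # the message survives a crash of the application process.
        os.fsync(f.fileno())


def handle_new_message(
    persistence_path: str,
    state: dict,
    message: dict,
    process: Callable[[dict, dict], Any],
) -> Any:
    """Persist the message first, then process it against in-memory state."""
    persist_message(persistence_path, message)
    return process(state, message)
```

Writing the message before processing it means a later replay of the file sees every message that could have influenced the in-memory state.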
Optionally, the service processing on the new message may specifically be processing of service logic, such as parsing, calculating, and determining, on the new message.
In the above embodiment, a new message is acquired from the network and processed only after the memory state data of the target application server has been completely recovered, so that the new message can be processed normally and service processing efficiency is improved.
In an embodiment, after the step of performing, by the target application program in the target application server, service processing on the new message based on the memory state data to obtain a service processing result corresponding to the new message, the cluster failure recovery method further includes: and storing the service processing result corresponding to the new message into a result persistence file of the target application server.
The result persistent file is a file stored in a disk of the application server and is used for persistently storing a service processing result after the target application program performs service processing on the message.
Specifically, the target application server may store the service processing result corresponding to the new message in a result persistence file of the target application server.
In the above embodiment, the service processing result is stored in the result persistence file of the target application server, so as to provide data support for corresponding processing of the service processing result subsequently.
In one embodiment, the target history message is all history messages in the message persistence file of the target application server, or part of the history messages specified by the preset configuration parameters in the message persistence file of the target application server.
Specifically, all history messages can be directly acquired from the message persistent file of the target application server as target history messages, and the target history messages are provided for the target application program to be reprocessed so as to recover the memory state data of the target application server. Alternatively, a part of all history messages can be acquired from the message persistence file of the target application server as the target history messages.
For example, the history messages may form a history message sequence in the order in which they were persisted. The history messages between a preset starting position and a preset ending position may then be selected from the history message sequence as the target history messages. Alternatively, the history messages in a designated section of the history message sequence may be discarded, and the remaining history messages in the sequence may be used as the target history messages. For example, suppose the message persistence file of the target application server stored 1000 history messages before the target application program of the target application server failed. If, after the failure, the configuration parameters filter out the 1st to 500th messages of the history message sequence, the 501st to 1000th history messages are used directly as the target history messages.
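The selection by starting and ending position can be sketched as follows. The function name and the 1-based, inclusive `start` and `end` parameters are assumptions standing in for the preset configuration parameters, whose actual form the source does not give.

```python
from typing import Optional, Sequence


def select_target_history(
    history: Sequence[str],
    start: int = 1,
    end: Optional[int] = None,
) -> list[str]:
    """Return the messages at 1-based positions start..end, inclusive.

    With the defaults, all history messages are selected.
    """
    if start < 1:
        raise ValueError("start is a 1-based position and must be >= 1")
    if end is None:
        end = len(history)
    # Convert the 1-based inclusive range into a 0-based half-open slice.
    return list(history[start - 1:end])
```

With 1000 stored messages, `select_target_history(history, start=501)` yields the 501st to 1000th messages, which matches the example in the paragraph above.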
In the above embodiment, a flexible cluster fault recovery mode can be provided by two selection modes of the target history message.
In an embodiment, after step S208, that is, after the step of performing, by the target application program in the target application server, service processing on the target history message in the memory, the cluster failure recovery method further includes: acquiring a service processing result corresponding to the target historical message; and sending the service processing results corresponding to all the target historical messages to the network, or filtering the service processing results corresponding to the target historical messages and sending the filtered service processing results corresponding to the target historical messages to the network.
For example, the service processing results may be filtered according to the breakpoint position, that is, the position in the message sequence that had been reached when the fault occurred. The service processing results corresponding to history messages before the breakpoint position have already been received by the downstream application, so they can be filtered out and need not be sent to the network again.
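The breakpoint filtering can be sketched as follows. The sketch assumes that the breakpoint position is expressed as the number of leading messages whose results the downstream application has already received; the source does not say how the breakpoint is recorded or obtained, and the function name is hypothetical.

```python
def results_to_send(results: list, breakpoint_position: int) -> list:
    """Drop the results that were already delivered before the fault.

    `breakpoint_position` is assumed to be the count of leading messages
    whose results the downstream application has already received.
    """
    if breakpoint_position < 0:
        raise ValueError("breakpoint_position must be non-negative")
    # Results are in the same order as the replayed history messages, so
    # skipping the first `breakpoint_position` entries removes the duplicates.
    return results[breakpoint_position:]
```

For five replayed results and a breakpoint position of 3, only the fourth and fifth results would be sent to the network.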
In the above embodiment, by sending the service processing result corresponding to the filtered target history message to the network, the repeated sending of the same service processing result can be avoided.
In an embodiment, the cluster failure recovery method further includes: when the target application program in at least one application server in the cluster is not failed, acquiring a message from a network through the application server which is not failed; and storing the acquired message into a message persistence file of the application server without failure, and performing service processing on the acquired message through a target application program in the application server without failure to obtain a service processing result corresponding to the acquired message.
Specifically, the target application programs in all the application servers in the cluster generally do not fail at the same moment. When the target application program in at least one application server in the cluster has not failed, the non-failed application server continues to acquire messages from the network to ensure high availability of the cluster. The acquired messages are stored in the message persistence file of the non-failed application server, and at the same time the target application program in the non-failed application server performs service processing on the acquired messages to obtain the corresponding service processing results.
In the above embodiment, as long as there is no failure in the target application program in at least one application server in the cluster, the non-failed application server obtains the message from the network and performs service processing, so as to ensure that the cluster can provide services to the outside normally, and improve the fault tolerance of the cluster.
In an embodiment, after the memory state data of the target application server is completely recovered, the memory state data of the application servers in the cluster except the target application server may also be recovered through other cluster failure recovery methods.
In an embodiment, as shown in fig. 3, before the target application program in the application server fails, the application server may receive a message from the network, persist the received message in the message persistence file on the disk, and perform service processing on the message in the memory through the target application program in the application server to obtain a service processing result corresponding to the message. The application server may persist the service processing result corresponding to the message in the result persistence file on the disk and, at the same time, send the service processing result to the network.
In an embodiment, as shown in fig. 4, after the target application program in the application server fails, the application server may suspend receiving messages from the network, read the target history messages from the message persistence file on the disk, and perform service processing on the target history messages in the memory through the target application program in the application server to obtain the service processing results corresponding to the target history messages. The application server may persist the service processing results corresponding to the target history messages in the result persistence file on the disk and, at the same time, send the filtered service processing results to the network.
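The pre-failure flow of fig. 3 can be sketched end to end as follows. The `ApplicationServer` class, the file names, the JSON-lines format and the counting logic are all hypothetical; the `send` callback stands in for the network. The embodiment describes persisting and sending the result at the same time, while the sketch performs the two steps one after the other for simplicity.

```python
import json
from pathlib import Path
from typing import Callable


class ApplicationServer:
    """Hypothetical model of one server's normal (pre-failure) message flow."""

    def __init__(self, data_dir: Path, send: Callable[[dict], None]) -> None:
        self.message_file = data_dir / "messages.jsonl"  # message persistence file
        self.result_file = data_dir / "results.jsonl"    # result persistence file
        self.send = send                                 # stand-in for the network
        self.state: dict[str, int] = {}                  # in-memory state data

    @staticmethod
    def _append(path: Path, record: dict) -> None:
        with path.open("a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")

    def _process(self, message: dict) -> dict:
        # Hypothetical service logic: count the messages seen per key.
        key = message["key"]
        self.state[key] = self.state.get(key, 0) + 1
        return {"key": key, "count": self.state[key]}

    def on_message(self, message: dict) -> dict:
        self._append(self.message_file, message)  # 1. persist the message
        result = self._process(message)           # 2. process it in memory
        self._append(self.result_file, result)    # 3. persist the result
        self.send(result)                         # 4. send the result onward
        return result
```

Only `state` lives in process memory here, so it is the part that disappears when the process exits, while both files remain on disk.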
It should be understood that although the steps of fig. 2 are shown in sequence, they are not necessarily performed in that sequence. Unless explicitly stated otherwise, the steps are not limited to the exact order shown and described, and may be performed in other orders. Moreover, at least some of the steps in fig. 2 may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times; these sub-steps or stages are not necessarily performed sequentially, and may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
In one embodiment, as shown in fig. 5, there is provided a cluster failure recovery apparatus 500, where a cluster includes a plurality of application servers, each application server runs a target application program, and the cluster failure recovery apparatus 500 includes: a determining module 501, a reading module 502 and a processing module 503, wherein:
A determining module 501, configured to determine, when the target application programs in all application servers in the cluster fail, the number of history messages stored in the message persistence file of each application server; and to determine the application server with the largest number of stored history messages as the target application server.
A reading module 502, configured to read, by the target application server, the target history message from the message persistent file of the target application server.
The processing module 503 is configured to perform service processing on the target history message in the memory through a target application program in the target application server, so as to recover memory state data of the target application server.
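The selection performed by the determining module can be sketched as follows, assuming each message persistence file stores one message per line. The function names and the tie-breaking behaviour (the first server with the highest count wins) are assumptions; the text does not say how ties are resolved.

```python
import os
import tempfile


def count_history_messages(path):
    """Count the messages in a one-message-per-line persistence file."""
    if not os.path.exists(path):
        return 0
    with open(path, encoding="utf-8") as f:
        return sum(1 for line in f if line.strip())


def choose_target_server(message_logs):
    """Return the server whose message persistence file holds the most messages."""
    counts = {server: count_history_messages(path)
              for server, path in message_logs.items()}
    target = max(counts, key=counts.get)  # first maximum wins on a tie
    return target, counts


def demo():
    workdir = tempfile.mkdtemp()
    logs = {}
    for server, n in (("server-1", 3), ("server-2", 5), ("server-3", 4)):
        path = os.path.join(workdir, server + ".log")
        with open(path, "w", encoding="utf-8") as f:
            f.writelines("message %d\n" % i for i in range(n))
        logs[server] = path
    return choose_target_server(logs)
```

Choosing the server with the most stored messages means the replay starts from the most complete record of what the cluster received before the fault.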
In one embodiment, the processing module 503 is further configured to obtain a new message from the network when the memory state data of the target application server is completely recovered; and to store the new message in the message persistence file of the target application server and perform service processing on the new message, based on the memory state data, through the target application program in the target application server to obtain a service processing result corresponding to the new message.
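The ordering constraint in this embodiment (new messages are accepted only once recovery is complete) can be sketched with a toy class. `RecoveringServer`, its in-memory list standing in for the message persistence file, and the running-balance logic are illustrative assumptions rather than the apparatus itself.

```python
class RecoveringServer:
    """Toy application server that refuses new messages until replay finishes."""

    def __init__(self, history):
        self.message_log = list(history)  # stands in for the message persistence file
        self._pending = list(history)     # history messages still to be replayed
        self.state = {}                   # in-memory state, lost when the process exits

    @property
    def recovered(self):
        return not self._pending

    def _apply(self, msg):
        # Hypothetical business logic: running balance per account.
        account = msg["account"]
        self.state[account] = self.state.get(account, 0) + msg["amount"]
        return self.state[account]

    def replay_all(self):
        """Reprocess every pending history message to rebuild the memory state."""
        while self._pending:
            self._apply(self._pending.pop(0))

    def accept_new(self, msg):
        """Persist and process a new message, but only after full recovery."""
        if not self.recovered:
            raise RuntimeError("memory state not fully recovered")
        self.message_log.append(msg)  # persist first
        return self._apply(msg)       # then process against the recovered state
```

Processing a new message against a partially rebuilt state would produce a wrong result, which is why the sketch rejects new messages until the replay queue is empty.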
In one embodiment, the target history message is all history messages in the message persistence file of the target application server, or part of the history messages specified by the preset configuration parameters in the message persistence file of the target application server.
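As an illustration of replaying either all history messages or only a configured part of them, the sketch below filters by sequence number. The configuration keys `replay_from_seq` and `replay_to_seq` are hypothetical; the text only says that preset configuration parameters specify the part to replay.

```python
def select_target_messages(history, config=None):
    """Return all history messages, or the range named by hypothetical config keys."""
    config = config or {}
    start = config.get("replay_from_seq")  # assumed parameter, inclusive
    end = config.get("replay_to_seq")      # assumed parameter, inclusive
    return [
        msg for msg in history
        if (start is None or msg["seq"] >= start)
        and (end is None or msg["seq"] <= end)
    ]


def demo():
    history = [{"seq": n} for n in range(1, 6)]
    everything = select_target_messages(history)
    partial = select_target_messages(history, {"replay_from_seq": 3})
    return everything, partial
```

With no configuration the whole file is replayed; with `replay_from_seq` set to 3, only messages 3 to 5 are selected.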
In one embodiment, the processing module 503 is further configured to, when the target application program in at least one application server in the cluster has not failed, obtain a message from the network through the non-failed application server; and to store the obtained message in the message persistence file of the non-failed application server and perform service processing on the obtained message through the target application program in the non-failed application server to obtain a service processing result corresponding to the obtained message.
Referring to fig. 6, in one embodiment, the cluster failure recovery apparatus 500 further includes: a storage module 504 and a sending module 505, wherein:
A storage module 504, configured to store the service processing result corresponding to the new message in the result persistence file of the target application server.
A sending module 505, configured to obtain the service processing results corresponding to the target history messages; and to send all of the service processing results corresponding to the target history messages to the network, or to filter the service processing results corresponding to the target history messages and send the filtered results to the network.
The cluster failure recovery apparatus determines, when the target application programs in all application servers in the cluster fail, the number of history messages stored in the message persistence file of each application server; determines the application server with the largest number of stored history messages as the target application server; reads the target history message from the message persistence file of the target application server through the target application server; and performs service processing on the target history message in the memory through the target application program in the target application server to recover the memory state data of the target application server. In this way, a message persistence file is provided in each application server, so that the target application program stores each received message in the message persistence file while processing it in the memory. Because the memory is bound to the process of the target application program, if the target application program exits due to a fault, the memory is reclaimed by the system and the memory state data of the application server is lost; the message persistence file, however, is not affected by the process of the target application program, so the messages stored in it are not lost. When the target application programs in all application servers in the cluster fail, the target history message is read from the message persistence file of the application server with the largest number of stored history messages and is reprocessed by the target application program, so as to recover the latest memory state data from before the cluster failure.
For specific limitations of the cluster failure recovery apparatus, reference may be made to the above limitations of the cluster failure recovery method, which are not repeated here. All or some of the modules in the cluster failure recovery apparatus may be implemented by software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor in the computer device in hardware form, or may be stored in a memory in the computer device in software form, so that the processor can call them and execute the operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be the application server 102 in fig. 1, and its internal structure diagram may be as shown in fig. 7. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The database of the computer device is used to store cluster failure recovery data. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a cluster failure recovery method.
Those skilled in the art will appreciate that the architecture shown in fig. 7 is merely a block diagram of some of the structures associated with the disclosed aspects and does not limit the computer devices to which the disclosed aspects apply; a particular computer device may include more or fewer components than those shown, may combine certain components, or may have a different arrangement of components.
In one embodiment, a computer device is provided, comprising a memory and a processor, the memory having a computer program stored therein, the processor implementing the following steps when executing the computer program:
when target application programs in all application servers in the cluster fail, determining the quantity of historical messages stored in the message persistence files of all the application servers;
determining the application server with the largest number of stored historical messages as a target application server;
reading a target historical message from a message persistence file of a target application server through the target application server;
and performing service processing on the target historical message in the memory through the target application program in the target application server, to recover the memory state data of the target application server.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
when the memory state data of the target application server is completely recovered, acquiring a new message from the network;
and storing the new message into a message persistence file of the target application server, and performing service processing on the new message based on the memory state data through a target application program in the target application server to obtain a service processing result corresponding to the new message.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
and storing the service processing result corresponding to the new message into a result persistence file of the target application server.
In one embodiment, the target history message is all history messages in the message persistence file of the target application server, or part of the history messages specified by the preset configuration parameters in the message persistence file of the target application server.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
acquiring a service processing result corresponding to the target historical message;
and sending the service processing results corresponding to all the target historical messages to the network, or filtering the service processing results corresponding to the target historical messages and sending the filtered service processing results corresponding to the target historical messages to the network.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
when the target application program in at least one application server in the cluster is not failed, acquiring a message from a network through the application server which is not failed;
and storing the acquired message into a message persistence file of the application server without failure, and performing service processing on the acquired message through a target application program in the application server without failure to obtain a service processing result corresponding to the acquired message.
In one embodiment, a computer-readable storage medium is provided, having a computer program stored thereon, which when executed by a processor, performs the steps of:
when target application programs in all application servers in the cluster fail, determining the quantity of historical messages stored in the message persistence files of all the application servers;
determining the application server with the largest number of stored historical messages as a target application server;
reading a target historical message from a message persistence file of a target application server through the target application server;
and performing service processing on the target historical message in the memory through the target application program in the target application server, to recover the memory state data of the target application server.
In one embodiment, the computer program when executed by the processor further performs the steps of:
when the memory state data of the target application server is completely recovered, acquiring a new message from the network;
and storing the new message into a message persistence file of the target application server, and performing service processing on the new message based on the memory state data through a target application program in the target application server to obtain a service processing result corresponding to the new message.
In one embodiment, the computer program when executed by the processor further performs the steps of:
and storing the service processing result corresponding to the new message into a result persistence file of the target application server.
In one embodiment, the target history message is all history messages in the message persistence file of the target application server, or part of the history messages specified by the preset configuration parameters in the message persistence file of the target application server.
In one embodiment, the computer program when executed by the processor further performs the steps of:
acquiring a service processing result corresponding to the target historical message;
and sending the service processing results corresponding to all the target historical messages to the network, or filtering the service processing results corresponding to the target historical messages and sending the filtered service processing results corresponding to the target historical messages to the network.
In one embodiment, the computer program when executed by the processor further performs the steps of:
when the target application program in at least one application server in the cluster is not failed, acquiring a message from a network through the application server which is not failed;
and storing the acquired message into a message persistence file of the application server without failure, and performing service processing on the acquired message through a target application program in the application server without failure to obtain a service processing result corresponding to the acquired message.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments can be combined arbitrarily. For the sake of brevity, not all possible combinations of these technical features are described; however, any combination of the technical features should be considered within the scope of the present specification as long as the combination contains no contradiction.
The above embodiments express only several implementations of the present application, and although their description is relatively specific and detailed, they should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims (10)

1. A cluster fault recovery method is characterized in that a cluster comprises a plurality of application servers, and a target application program runs in each application server, and the method comprises the following steps:
when target application programs in all application servers in the cluster fail, determining the quantity of historical messages stored in the message persistence files of all the application servers;
determining the application server with the largest number of stored historical messages as a target application server;
reading a target history message from a message persistence file of the target application server through the target application server;
and performing service processing on the target historical message in a memory through a target application program in the target application server to recover the memory state data of the target application server.
2. The method of claim 1, further comprising:
when the memory state data of the target application server are completely recovered, acquiring a new message from a network;
and storing the new message into a message persistence file of the target application server, and performing service processing on the new message based on the memory state data through a target application program in the target application server to obtain a service processing result corresponding to the new message.
3. The method according to claim 2, wherein after the target application program in the target application server performs service processing on the new message based on the memory state data and obtains a service processing result corresponding to the new message, the method further comprises:
and storing the service processing result corresponding to the new message into a result persistence file of the target application server.
4. The method of claim 1, wherein the target history message is all history messages in the message persistence file of the target application server, or part of history messages specified by preset configuration parameters in the message persistence file of the target application server.
5. The method of claim 1, wherein after the target history message is processed in the memory by the target application program in the target application server, the method further comprises:
acquiring a service processing result corresponding to the target historical message;
and sending all the service processing results corresponding to the target historical messages to a network, or filtering the service processing results corresponding to the target historical messages and sending the filtered service processing results corresponding to the target historical messages to the network.
6. The method of claim 1, further comprising:
when the target application program in at least one application server in the cluster is not failed, acquiring a message from a network through the application server which is not failed;
storing the acquired message into the message persistence file of the non-failure application server, and performing service processing on the acquired message through a target application program in the non-failure application server to obtain a service processing result corresponding to the acquired message.
7. A cluster failure recovery apparatus, wherein a cluster includes a plurality of application servers, and each application server runs a target application program, the apparatus comprising:
the determining module is used for determining the quantity of the historical messages stored in the message persistence files of the application servers when the target application programs in all the application servers in the cluster fail; determining the application server with the largest number of stored historical messages as a target application server;
the reading module is used for reading a target historical message from the message persistence file of the target application server through the target application server;
and the processing module is used for performing service processing on the target historical message in the memory through a target application program in the target application server so as to recover the memory state data of the target application server.
8. The apparatus of claim 7, further comprising:
the acquisition module is used for acquiring a new message from a network when the memory state data of the target application server is completely recovered;
the storage module is used for storing the new message into a message persistence file of the target application server;
the processing module is further configured to perform service processing on the new message based on the memory state data through a target application program in the target application server to obtain a service processing result corresponding to the new message.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the steps of the method of any of claims 1 to 6 are implemented by the processor when executing the computer program.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 6.
CN202110914006.XA 2021-08-10 2021-08-10 Cluster fault recovery method and device, computer equipment and storage medium Withdrawn CN113626240A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110914006.XA CN113626240A (en) 2021-08-10 2021-08-10 Cluster fault recovery method and device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110914006.XA CN113626240A (en) 2021-08-10 2021-08-10 Cluster fault recovery method and device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN113626240A true CN113626240A (en) 2021-11-09

Family

ID=78384012

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110914006.XA Withdrawn CN113626240A (en) 2021-08-10 2021-08-10 Cluster fault recovery method and device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113626240A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114584462A (en) * 2021-12-27 2022-06-03 天翼云科技有限公司 Network service processing method and device

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109669821A (en) * 2018-11-16 2019-04-23 深圳证券交易所 Cluster partial fault restoration methods, server and the storage medium of message-oriented middleware
CN109684128A (en) * 2018-11-16 2019-04-26 深圳证券交易所 Cluster overall failure restoration methods, server and the storage medium of message-oriented middleware
CN112231148A (en) * 2020-10-23 2021-01-15 北京思特奇信息技术股份有限公司 Distributed cache data offline transmission method and device and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information
Address after: Room 2301, building 5, Shenzhen new generation industrial park, 136 Zhongkang Road, Meidu community, Meilin street, Futian District, Shenzhen City, Guangdong Province
Applicant after: Shenzhen Huarui Distributed Technology Co.,Ltd.
Address before: Room 2301, building 5, Shenzhen new generation industrial park, 136 Zhongkang Road, Meidu community, Meilin street, Futian District, Shenzhen City, Guangdong Province
Applicant before: SHENZHEN ARCHFORCE FINANCIAL TECHNOLOGY Co.,Ltd.
WW01 Invention patent application withdrawn after publication
Application publication date: 20211109