CN112631872A - Exception handling method and device of multi-core system
- Publication number
- CN112631872A, CN202011603786.8A
- Authority
- CN
- China
- Prior art keywords
- cpu
- subsystem
- cpus
- core system
- resident
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/3003—Monitoring arrangements specially adapted to the computing system or computing system component being monitored
- G06F11/3006—Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system is distributed, e.g. networked systems, clusters, multiprocessor systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/0706—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment
- G06F11/0709—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment in a distributed system consisting of a plurality of standalone computer nodes, e.g. clusters, client-server systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/0766—Error or fault reporting or storing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/079—Root cause analysis, i.e. error or fault diagnosis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/0793—Remedial or corrective actions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/3065—Monitoring arrangements determined by the means or processing involved in reporting the monitored data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/3089—Monitoring arrangements determined by the means or processing involved in sensing the monitored data, e.g. interfaces, connectors, sensors, probes, agents
- G06F11/3093—Configuration details thereof, e.g. installation, enabling, spatial arrangement of the probes
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The application provides an exception handling method and device for a multi-core system. The method comprises: configuring a multi-core system into a monitoring subsystem and a plurality of processing subsystems, wherein the monitoring subsystem comprises a resident CPU and each processing subsystem comprises at least one CPU, one of which is a main CPU and the others slave CPUs; and, when any CPU of a processing subsystem captures an exception, reporting the captured exception from that CPU to the resident CPU of the monitoring subsystem and outputting exception field data through the main CPU of that processing subsystem. By designing a multi-level exception handling scheme and configuring the functions of the multi-core system accordingly, exception field data are saved to the maximum extent when an exception occurs.
Description
Technical Field
The present disclosure relates to the field of embedded systems, and in particular, to a method and an apparatus for exception handling in a multi-core system.
Background
Existing embedded systems are often multi-core systems. A multi-core system usually comprises a plurality of CPUs, and some CPUs further comprise big and little cores. Different CPUs are manufactured with different processes, work in different power domains, voltage domains and main frequencies, use different power consumption models, and run different scenarios. The industry often uses big-little core systems to reduce system power consumption and improve the performance-to-power ratio.
As the number of cores in an embedded system increases, the system has more peripherals, more complex functions and a larger amount of software code. In addition, the operating environment of an embedded system is often complex; in wireless communication scenarios in particular, the signal environment is variable and susceptible to interference, so the system is prone to exceptions.
To address this problem, the common exception handling approach is to save the running-environment information when a system exception occurs, so that the scene can be reconstructed and the cause of the exception found and fixed. The saved exception field data comprise software and hardware version information, CPU registers, stack information, exception and call-stack information, current task/thread information, data of important memory areas, and the like.
However, this exception handling approach often has the following problems:
in many scenarios, the CPU cannot work normally after a system exception, so the important data required to reconstruct the exception scene cannot be completely saved;
when a CPU enters an infinite loop or runs away, it can no longer handle any system exception, so no exception field data are obtained;
without exception field data, the system exception cannot be correctly analyzed and fixed, and more data can be obtained only when the exception occurs again;
when an exception occurs, the system CPU is suspended, so the necessary restart and recovery operations cannot be performed, which affects subsequent system operation.
Disclosure of Invention
In view of the above, a primary objective of the present application is to provide an exception handling method and apparatus for a multi-core system, which configure the functions of the multi-core system according to a multi-level exception handling scheme and save exception field data to the maximum extent when an exception occurs.
In a first aspect, the present application provides an exception handling method for a multi-core system, including:
configuring a multi-core system into a monitoring subsystem and a plurality of processing subsystems, wherein the monitoring subsystem comprises a resident CPU, each processing subsystem comprises at least one CPU, one of the CPUs is a main CPU, and the other CPUs are slave CPUs;
when any CPU of a processing subsystem captures an exception, that CPU reports the captured exception to the resident CPU of the monitoring subsystem, and the main CPU of that processing subsystem outputs exception field data.
In this method, the functions of the multi-core system are configured by setting up a monitoring subsystem and a plurality of processing subsystems, and each processing subsystem is configured with a main CPU. Specifically, when a processing subsystem comprises a single CPU, that CPU is the main CPU of the processing subsystem; when a processing subsystem comprises a plurality of CPUs, one of them is the main CPU, the others are slave CPUs, and the main CPU manages the slave CPUs. The monitoring subsystem monitors the working state of the CPUs of each processing subsystem and receives, in real time, the exception information they report. When a CPU in any processing subsystem encounters a capturable exception, that CPU reports the captured exception to the resident CPU of the monitoring subsystem, and the exception field data are output through the main CPU of that processing subsystem. In this way, exception field data can be saved to the maximum extent when an exception occurs.
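The configuration and reporting path described above can be illustrated with a minimal Python sketch. It is not the implementation of the application: the class names, the CPU labels and the convention that the first CPU listed in a subsystem is its main CPU are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ProcessingSubsystem:
    name: str
    cpus: List[str]  # assumption: the first entry is the main CPU

    @property
    def main_cpu(self) -> str:
        return self.cpus[0]

    @property
    def slave_cpus(self) -> List[str]:
        return self.cpus[1:]


@dataclass
class MultiCoreSystem:
    resident_cpu: str
    subsystems: List[ProcessingSubsystem]
    # Exceptions reported to the resident CPU: (reporting CPU, subsystem name).
    reports: List[Tuple[str, str]] = field(default_factory=list)
    # Exception field data that were output: (outputting CPU, data).
    outputs: List[Tuple[str, str]] = field(default_factory=list)

    def subsystem_of(self, cpu: str) -> ProcessingSubsystem:
        for subsystem in self.subsystems:
            if cpu in subsystem.cpus:
                return subsystem
        raise KeyError(cpu)

    def on_exception(self, cpu: str, field_data: str) -> str:
        """Report the exception to the resident CPU, then output the
        field data through the main CPU of the affected subsystem."""
        subsystem = self.subsystem_of(cpu)
        self.reports.append((cpu, subsystem.name))
        self.outputs.append((subsystem.main_cpu, field_data))
        return subsystem.main_cpu
```

A subsystem with a single CPU has that CPU as its main CPU and no slave CPUs, which matches the single-CPU case in the text.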
Optionally, the processing subsystems are divided into core subsystems and service subsystems according to a set criterion.
Thus, according to the functional tasks actually executed in the multi-core system, the processing subsystems can be further divided into core subsystems, which execute core tasks, and service subsystems, which execute auxiliary tasks. An exception in a core subsystem affects the operation of the whole multi-core system, so exception handling must be performed on the whole multi-core system when it occurs; an exception in a service subsystem does not affect the operation of the whole multi-core system.
Optionally, the method further includes:
when the processing subsystem to which the CPU capturing the exception belongs is a core subsystem, suspending the other CPUs of the multi-core system through the main CPU of the core subsystem, and restarting the multi-core system through the resident CPU of the monitoring subsystem;
when the processing subsystem to which the CPU capturing the exception belongs is a service subsystem, suspending the other CPUs of the service subsystem through the main CPU of the service subsystem, and restarting the service subsystem through the resident CPU of the monitoring subsystem.
Thus, because the core subsystem executes the core tasks of the multi-core system, an exception in it affects the operation of the whole multi-core system: when the core subsystem is abnormal, the other CPUs of the whole multi-core system need to be suspended, and the whole multi-core system is then restarted by the resident CPU of the monitoring subsystem. The service subsystem executes auxiliary tasks, and an exception in it does not affect the operation of the whole multi-core system: when the service subsystem is abnormal, only the other CPUs of that service subsystem need to be suspended, and the service subsystem is then restarted by the resident CPU of the monitoring subsystem.
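The difference in recovery scope between the two kinds of subsystem can be sketched as a small decision function. This is an illustrative Python sketch only; the names `Kind` and `plan_recovery` are hypothetical, and excluding the resident CPU from the suspended set (since it performs the restart) is an assumption that the text does not state explicitly.

```python
from enum import Enum
from typing import List, Tuple


class Kind(Enum):
    CORE = "core"
    SERVICE = "service"


def plan_recovery(
    kind: Kind,
    subsystem_cpus: List[str],
    all_cpus: List[str],
    main_cpu: str,
    resident_cpu: str,
) -> Tuple[List[str], str]:
    """Return (CPUs the main CPU suspends, what the resident CPU restarts)."""
    if kind is Kind.CORE:
        scope, target = all_cpus, "multi-core system"
    else:
        scope, target = subsystem_cpus, "service subsystem"
    # Assumption: the suspending main CPU and the resident CPU keep running.
    to_suspend = [c for c in scope if c not in (main_cpu, resident_cpu)]
    return to_suspend, target
```

A core-subsystem exception widens the suspension to every other CPU and triggers a full restart, while a service-subsystem exception stays confined to that subsystem.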
Optionally, the method further includes:
when the CPU capturing the exception is a slave CPU, sending the exception field data from that slave CPU to the main CPU of the processing subsystem to which it belongs, and outputting the exception field data through the main CPU.
When a processing subsystem comprises a single CPU, that CPU is its main CPU; when it comprises a plurality of CPUs, one is the main CPU, the others are slave CPUs, and the main CPU manages the slave CPUs. When a slave CPU captures an exception, it reports the captured exception to the resident CPU of the monitoring subsystem and also transmits the exception field data to the main CPU of its processing subsystem, and the main CPU outputs the exception field data.
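The slave-to-main hand-off can be sketched with a simple mailbox. This is a hedged Python illustration: the mailbox mechanism, the `MainCpu` class and the `notify_resident` callback are assumptions, since the text does not specify how the field data travel between CPUs.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple


class MainCpu:
    """Main CPU of a processing subsystem; it alone outputs field data."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.mailbox: Deque[Tuple[str, str]] = deque()
        self.output: List[str] = []

    def receive(self, sender: str, data: str) -> None:
        self.mailbox.append((sender, data))

    def flush(self) -> None:
        # Output everything that has been handed over so far.
        while self.mailbox:
            sender, data = self.mailbox.popleft()
            self.output.append(f"[{self.name}] from {sender}: {data}")


def handle_exception(
    cpu: str, main: MainCpu, data: str, notify_resident: Callable[[str], None]
) -> None:
    notify_resident(cpu)     # report the captured exception to the resident CPU
    main.receive(cpu, data)  # a slave CPU forwards its field data to the main CPU
    main.flush()             # the main CPU outputs the field data
```

When the capturing CPU is the main CPU itself, the same path applies with `cpu == main.name`, so no separate branch is needed in this sketch.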
Optionally, the method further includes:
when the resident CPU of the monitoring subsystem captures an exception, suspending the other CPUs of the multi-core system through the resident CPU and outputting exception field data;
restarting the multi-core system through the resident CPU of the monitoring subsystem.
Thus, when the resident CPU of the monitoring subsystem itself encounters a capturable exception, the monitoring subsystem outputs the exception field data through the resident CPU, suspends the other CPUs of the whole multi-core system through the resident CPU, and restarts the whole multi-core system according to the configured exception handling mode.
Optionally, the method further includes:
checking, through the resident CPU of the monitoring subsystem, the heartbeat data of two adjacent periods of each CPU of each processing subsystem, and judging whether each CPU is abnormal;
when a CPU is judged to be abnormal:
if the processing subsystem to which the CPU belongs is a core subsystem, suspending other CPUs of the multi-core system through the resident CPU, and executing the restart of the multi-core system;
if the processing subsystem to which the CPU belongs is a service subsystem, suspending other CPUs of the service subsystem through the resident CPU, and executing the restart of the service subsystem.
Optionally, the determining whether each CPU is abnormal includes:
when the heartbeat data of two adjacent periods of a CPU are detected to be identical, judging that the CPU is abnormal.
In this method, each CPU in the core subsystem and the service subsystems periodically updates its heartbeat data, which differ from period to period, and stores them in a shared memory area of the multi-core system. The monitoring subsystem periodically (every one or two periods) reads the heartbeat data of each CPU from the shared memory area and saves a local copy. By comparing the newly read heartbeat data with the local copy, the monitoring subsystem checks the heartbeat data of two adjacent periods of each CPU; if they are identical, the CPU that writes those heartbeat data is considered to be stuck in an infinite loop or to have run away. Depending on whether the abnormal CPU belongs to the core subsystem or to a service subsystem, either the other CPUs of the whole multi-core system or the other CPUs of the abnormal service subsystem are suspended, and the whole multi-core system or the abnormal service subsystem is then restarted according to the configured restart task.
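The heartbeat comparison can be sketched as follows. This is a minimal Python illustration under stated assumptions: a plain dictionary stands in for the shared memory area, integer values stand in for the heartbeat data, and the class name `HeartbeatMonitor` is hypothetical.

```python
from typing import Dict, List


class HeartbeatMonitor:
    """Keeps a local copy of the last heartbeat read for each CPU."""

    def __init__(self) -> None:
        self._last: Dict[str, int] = {}

    def check(self, shared: Dict[str, int]) -> List[str]:
        """Read the shared area once and return the CPUs whose heartbeat
        did not change since the previous read."""
        stalled = [
            cpu
            for cpu, beat in shared.items()
            if cpu in self._last and self._last[cpu] == beat
        ]
        self._last = dict(shared)  # refresh the local copy
        return stalled


# Each CPU is expected to write a new value every period.
shared_area = {"A0": 1, "D0": 1}
monitor = HeartbeatMonitor()
monitor.check(shared_area)            # first read: nothing to compare against
shared_area["A0"] = 2                 # A0 updates its heartbeat, D0 does not
stalled = monitor.check(shared_area)  # D0 is flagged
```

The first read only primes the local copy, so a CPU is flagged no earlier than the second read, once two adjacent periods can actually be compared.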
Optionally, the resident CPU is a CPU with the lowest performance or the lowest power consumption in the multi-core system.
Thus, the CPU with the lowest performance or the lowest power consumption in the multi-core system is generally selected as the resident CPU of the monitoring subsystem; it monitors the working states of the other cores in the multi-core system and coordinates exception handling when an exception occurs.
Optionally, outputting the exception field data includes:
outputting the exception field data to a printing system or a file saving system.
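As a rough illustration of this output step, the sketch below writes exception field data to any text sink, which could be a console stream or an open file. The function name, the key=value line format and the example field names are assumptions; the text does not define an output format.

```python
import io
from typing import Dict, TextIO


def output_field_data(data: Dict[str, str], sink: TextIO) -> None:
    """Write each item of exception field data as one key=value line."""
    for key, value in data.items():
        sink.write(f"{key}={value}\n")


# An in-memory buffer stands in for the printing or file saving system.
buffer = io.StringIO()
output_field_data({"cpu": "D1", "pc": "0x1000"}, buffer)
```

Passing `sys.stdout` or a file opened for writing as `sink` would direct the same data to a console or to storage.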
In a second aspect, the present application provides an exception handling apparatus for a multi-core system, including:
a configuration unit, configured to configure the multi-core system into a monitoring subsystem and a plurality of processing subsystems, wherein the monitoring subsystem comprises a resident CPU, each processing subsystem comprises at least one CPU, one of the CPUs is a main CPU, and the other CPUs are slave CPUs;
a processing unit, configured to, when any CPU of a processing subsystem captures an exception, report the captured exception from that CPU to the resident CPU of the monitoring subsystem and output exception field data through the main CPU of that processing subsystem.
In a third aspect, the present application provides a computer device comprising:
one or more processors;
a memory for storing one or more programs;
when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the exception handling method of the multi-core system described above.
In a fourth aspect, the present application provides a computer-readable storage medium having stored thereon a computer program which, when executed by a computer, implements the exception handling method of the multi-core system described above.
These and other aspects of the present application will be more readily apparent from the following description of the embodiment(s).
Drawings
Fig. 1 is a flowchart of an exception handling method of a multi-core system according to an embodiment of the present application;
Fig. 2 is a flowchart of exception handling of an AP subsystem according to an embodiment of the present application;
Fig. 3 is a flowchart of exception handling of a main CPU of a CP subsystem according to an embodiment of the present application;
Fig. 4 is a flowchart of exception handling of a slave CPU of a CP subsystem according to an embodiment of the present application;
Fig. 5 is a flowchart of handling a heartbeat detection exception of a CP subsystem according to an embodiment of the present application;
Fig. 6 is a block diagram of an exception handling apparatus of a multi-core system according to an embodiment of the present application;
Fig. 7 is a schematic structural diagram of a computing device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all the embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present application without making any creative effort, shall fall within the protection scope of the present application.
The terms "first, second, third and the like" or "module A, module B, module C and the like" in the description and in the claims are used to distinguish between similar elements and do not necessarily describe a particular sequential or chronological order; it is to be understood that specific orders or sequences may be interchanged where permissible, so that the embodiments of the present application can be practiced in orders other than those illustrated or described herein.
In the following description, reference numerals indicating steps, such as S110, S120, etc., do not necessarily indicate that the steps are performed in this order; where permissible, the order of steps may be interchanged, or steps may be performed simultaneously.
The term "comprising" as used in the specification and claims should not be construed as being limited to the contents listed thereafter; it does not exclude other elements or steps. It should therefore be interpreted as specifying the presence of the stated features, integers, steps or components as referred to, but does not preclude the presence or addition of one or more other features, integers, steps or components, and groups thereof. Thus, the expression "an apparatus comprising the devices A and B" should not be limited to an apparatus consisting of only the components A and B.
Reference in the specification to "one embodiment" or "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the application. Thus, appearances of the phrases "in one embodiment" or "in an embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment, but may. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments, as would be apparent to one of ordinary skill in the art from this disclosure.
The application provides an exception handling method and device for a multi-core system. A multi-level exception handling scheme is designed and the functions of the multi-core system are configured accordingly, so that exception field data are saved to the maximum extent when an exception occurs and exception handling is performed according to the function of the CPU in which the exception occurred, thereby quickly recovering the working state of the multi-core system.
The embodiments of the present application will be described in detail below with reference to the accompanying drawings.
As shown in fig. 1, an exception handling method for a multi-core system provided in an embodiment of the present application includes:
S101: configuring a multi-core system into a monitoring subsystem and a plurality of processing subsystems, wherein the monitoring subsystem comprises a resident CPU, each processing subsystem comprises at least one CPU, one of the CPUs is a main CPU, and the other CPUs are slave CPUs;
S102: when any CPU of a processing subsystem captures an exception, that CPU reports the captured exception to the resident CPU of the monitoring subsystem, and the main CPU of that processing subsystem outputs exception field data.
In the embodiment of the application, the multi-core system comprises a plurality of CPUs and can be divided, according to function or application scenario, into a monitoring subsystem and a plurality of processing subsystems. The monitoring subsystem may use the CPU with the lowest performance or the lowest power consumption in the multi-core system, which is a single-core CPU, to monitor the working state of each processing subsystem. When an exception occurs, the main CPU of the affected processing subsystem outputs the exception field data, and the resident CPU of the monitoring subsystem coordinates the exception handling of the multi-core system. Specifically, the processing subsystems are divided into a core subsystem and service subsystems according to the importance of the tasks they execute: the core subsystem executes the core tasks of the multi-core system, and the service subsystems execute its auxiliary tasks. An exception in the core subsystem affects the operation of the whole multi-core system, so when it occurs the whole multi-core system must undergo exception handling. An exception in a service subsystem does not affect the operation of the whole multi-core system, so when it occurs only that service subsystem needs exception handling and the other subsystems of the multi-core system can continue to operate normally.
The core subsystem may use the CPU with the highest performance in the multi-core system, which may be a dual-core or multi-core CPU, to execute the core tasks of the multi-core system; its operating state directly affects the overall operating state of the multi-core system. A service subsystem may include a main CPU and a plurality of slave CPUs, which are usually single-core CPUs, with the main CPU having higher performance than the slave CPUs. The main CPU manages the slave CPUs. Each service subsystem executes only its own service task, and the service subsystems are independent of and do not affect one another.
Based on this, according to the subsystem to which the CPU capturing the exception belongs, the following processing can further be performed:
When the processing subsystem to which the CPU capturing the exception belongs is the core subsystem, the other CPUs of the multi-core system are suspended through the main CPU of the core subsystem, and the resident CPU of the monitoring subsystem restarts the multi-core system.
When the processing subsystem to which the CPU capturing the exception belongs is a service subsystem, the other CPUs of the service subsystem are suspended through the main CPU of the service subsystem, and the resident CPU of the monitoring subsystem restarts the service subsystem.
When the CPU capturing the exception is a slave CPU, that slave CPU sends the exception field data to the main CPU of its processing subsystem, and the main CPU outputs the exception field data.
In addition, an exception in the monitoring subsystem also affects the operation of the whole multi-core system. Therefore, when the resident CPU of the monitoring subsystem captures an exception, it suspends the other CPUs of the multi-core system and outputs the exception field data; then, according to the configured restart task, the resident CPU of the monitoring subsystem restarts the multi-core system.
In the embodiment of the application, to determine whether any CPU of the core subsystem or a service subsystem is abnormal, the resident CPU of the monitoring subsystem can check the heartbeat data of two adjacent periods of each CPU of the core subsystem and the service subsystems. The heartbeat data are periodically written by each CPU of the core subsystem and the service subsystems to a shared memory area of the multi-core system, and differ from period to period. The resident CPU of the monitoring subsystem periodically (every one or two periods) reads the heartbeat data of each CPU from the shared memory area and saves a local copy. It then compares the newly read heartbeat data with the local copy, that is, it checks the heartbeat data of two adjacent periods of each CPU; if they are identical, the CPU that writes those heartbeat data is considered to be stuck in an infinite loop or to have run away.
In this embodiment of the application, the heartbeat data may equivalently be replaced by timing data. A watchdog timer is set up in the shared memory area and stores timing data for each CPU. When the multi-core system starts, the initial value of the watchdog timer is set to zero; thereafter, each time a CPU performs a heartbeat update, its timing data are incremented by one. The resident CPU of the monitoring subsystem periodically (every one or two periods) reads the timing data of each CPU from the watchdog timer in the shared memory area and saves a local copy, and then compares the newly read timing data with the local copy, that is, it compares the timing data of two adjacent periods of each CPU. If the timing data of two adjacent periods are identical, the corresponding CPU is determined to be stuck in an infinite loop or to have run away. When the abnormal CPU is located in a service subsystem, the resident CPU of the monitoring subsystem suspends each CPU of that service subsystem; when the abnormal CPU is located in the core subsystem, the monitoring subsystem suspends each CPU of the multi-core system.
When the resident CPU of the monitoring subsystem monitors that one or more CPUs in the core subsystem or the service subsystem are abnormal through the two monitoring modes, corresponding restarting processing is carried out according to the severity of the abnormality.
In the embodiment of the present application, the severity of the anomaly is mainly determined according to whether the subsystem in which the anomaly occurs belongs to the core subsystem or the service subsystem, and specifically,
when the service subsystem is abnormal, according to the configured restart task, restarting each CPU of the service subsystem through the resident CPU of the monitoring subsystem;
and when the core subsystem is abnormal, restarting each CPU of the multi-core system through the resident CPU of the monitoring subsystem according to the configured restart task.
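The choice of restart scope described above may be illustrated by the following minimal sketch. The subsystem layout, the CPU names and the function name `cpus_to_restart` are hypothetical and serve only to show the rule that a service-subsystem abnormality restarts that subsystem while a core-subsystem abnormality restarts every CPU.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: subsystem name -> (kind, CPUs). Names are illustrative only.
SUBSYSTEMS: Dict[str, Tuple[str, List[str]]] = {
    "core": ("core", ["A0", "A1"]),
    "service0": ("service", ["S0", "S0_DSP"]),
    "service1": ("service", ["S1", "S1_DSP"]),
}


def cpus_to_restart(abnormal_cpu: str) -> List[str]:
    """Return the CPUs the resident CPU would restart for the given abnormal CPU."""
    for kind, cpus in SUBSYSTEMS.values():
        if abnormal_cpu in cpus:
            if kind == "service":
                # Service abnormality: only the affected service subsystem restarts.
                return list(cpus)
            # Core abnormality: every CPU of the multi-core system restarts.
            return [c for _, group in SUBSYSTEMS.values() for c in group]
    raise ValueError(f"unknown CPU: {abnormal_cpu}")


service_scope = cpus_to_restart("S0_DSP")
core_scope = cpus_to_restart("A1")
```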
In the method for handling abnormalities of a multi-core system provided by the embodiment of the application, the multi-core system is partitioned by function into a monitoring subsystem, a core subsystem and a service subsystem, and the monitoring subsystem monitors each subsystem. When an abnormality occurs, the abnormal CPU reports the abnormality, the main CPU of the abnormal subsystem outputs the abnormal field data, and the abnormality is then handled according to the function of that subsystem: when the abnormal subsystem is the service subsystem, each CPU of the service subsystem is suspended synchronously; when the abnormal subsystem is a core subsystem, each CPU of the multi-core system is suspended synchronously. Furthermore, after the output and suspension are finished, either the service subsystem or the whole multi-core system can be restarted, so that the multi-core system returns to its working state as quickly as possible.
The technical solution of the present application is further described below according to an embedded multi-core system provided by the present application.
In the embodiment of the application, the embedded multi-core system comprises a plurality of CPUs, specifically one Cortex-M4 CPU (hereinafter referred to as CM4), one Cortex-A7 dual-core CPU (hereinafter referred to as AP), two Cortex-A7 single-core CPUs (hereinafter referred to as CP), two X1643 DSPs (hereinafter referred to as DSP0) and two X4500 DSPs (hereinafter referred to as DSP1).
According to the function and application scenarios, the multiple CPUs in the multi-core system are divided into four subsystems, namely a monitoring subsystem (including CM4), an AP subsystem (including AP), a first CP subsystem (including CP0, CP0_DSP0 and CP0_DSP1) and a second CP subsystem (including CP1, CP1_DSP0 and CP1_DSP1). The monitoring subsystem and the AP subsystem are core subsystems of the multi-core system, and the first CP subsystem and the second CP subsystem are service subsystems of the multi-core system. The exception handling flow of each subsystem is described in detail below with reference to Figs. 2-4.
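For illustration only, the four-subsystem partition described above may be written down as a small data structure. The type name `Subsystem`, the subsystem labels and the helper `subsystem_of` are assumptions of this sketch; the dual-core AP is treated as a single entry, and the resident CPU of the monitoring subsystem is stored in the `master` field for uniformity.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Subsystem:
    name: str                      # illustrative label, not taken from the embodiment
    kind: str                      # "core" or "service"
    master: str                    # main CPU (resident CPU for the monitoring subsystem)
    slaves: Tuple[str, ...] = ()   # slave CPUs, possibly none

    @property
    def cpus(self) -> Tuple[str, ...]:
        return (self.master,) + self.slaves


TOPOLOGY = (
    Subsystem("monitoring", "core", "CM4"),
    Subsystem("AP", "core", "AP"),
    Subsystem("first CP", "service", "CP0", ("CP0_DSP0", "CP0_DSP1")),
    Subsystem("second CP", "service", "CP1", ("CP1_DSP0", "CP1_DSP1")),
)


def subsystem_of(cpu: str) -> Subsystem:
    """Look up the subsystem that contains the given CPU."""
    for sub in TOPOLOGY:
        if cpu in sub.cpus:
            return sub
    raise KeyError(cpu)


all_cpus = [cpu for sub in TOPOLOGY for cpu in sub.cpus]
```

With such a table, the main CPU responsible for outputting field data for any slave CPU can be found with a single lookup, for example `subsystem_of("CP1_DSP0").master`.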
As shown in Fig. 2, in the exception handling flow of the AP subsystem provided in an embodiment of the present application, the monitoring subsystem monitors the working state of each subsystem in the multi-core system in real time. When the AP subsystem becomes abnormal, the CPU (AP) of the AP subsystem sends the captured exception to the monitoring subsystem (by sending a CPU_ASSERT(AP) message) to notify it that an exception has occurred. At the same time, the CPU (AP) notifies the CPUs of the other subsystems in the multi-core system (the first CP subsystem and the second CP subsystem) to enter standby mode (by sending a CPU_IDLE message), and it also outputs the exception field data to a printing system or a file saving system.
Because the AP subsystem is a core subsystem of the multi-core system, an exception in it may affect the operating state of the entire multi-core system, so further handling is performed according to the configured exception handling mode. For example, when the exception restart mode is configured, the CPU (AP) of the AP subsystem sends a restart request (a SUBSYS_REBOOT message) to the monitoring subsystem, and the monitoring subsystem restarts the entire multi-core system according to the received request; when the exception restart mode is not configured, the entire multi-core system remains in the standby state.
As shown in Fig. 3, in the main CPU exception handling flow of the CP subsystem provided in the embodiment of the present application, the monitoring subsystem monitors the working state of each subsystem in the multi-core system in real time. When one of the CPUs of one of the CP subsystems becomes abnormal, for example the main CPU (CP0) of the first CP subsystem, that main CPU (CP0) sends the captured exception to the monitoring subsystem (by sending a CPU_ASSERT(CP0) message) to notify it that an exception has occurred. At the same time, the main CPU (CP0) notifies the slave CPUs of the first CP subsystem (CP0_DSP0 and CP0_DSP1) to enter standby mode (by sending a CPU_IDLE message), and it also outputs the exception field data to a printing system or a file saving system.
Because the first CP subsystem is a service subsystem of the multi-core system, an exception in it affects only its own working state, and further handling is performed according to the configured exception handling mode. For example, when the exception restart mode is configured, the main CPU (CP0) of the first CP subsystem sends a restart request (a SUBSYS_REBOOT message) to the monitoring subsystem, and the monitoring subsystem restarts the first CP subsystem according to the received request; when the exception restart mode is not configured, the first CP subsystem remains in the standby state, and the working states of the other subsystems are not affected.
As shown in Fig. 4, in the slave CPU exception handling flow of the CP subsystem provided in this embodiment of the present application, when the abnormal CPU in the first CP subsystem is a slave CPU (CP0_DSP0 or CP0_DSP1), the abnormal slave CPU sends the captured exception to the monitoring subsystem (by sending a CPU_ASSERT(DSP0 or DSP1) message). The main CPU (CP0) of the first CP subsystem then notifies the other slave CPUs of the first CP subsystem to enter standby mode (by sending a CPU_IDLE message). Meanwhile, the abnormal slave CPU sends the exception field data to the main CPU (CP0) of the first CP subsystem, and the main CPU (CP0) forwards the exception field data to the printing system or the file saving system.
Further, when the configured exception handling mode is the exception restart mode, the main CPU (CP0) of the first CP subsystem sends a restart request (a SUBSYS_REBOOT message) to the monitoring subsystem, and the monitoring subsystem restarts the first CP subsystem according to the received request. When the exception restart mode is not configured, the first CP subsystem remains in the standby state, while the other subsystems, in which no exception has occurred, are not affected and continue to work normally.
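The message sequences of Figs. 2 to 4 share one pattern, which may be sketched as below. This is only an illustration under stated assumptions: the function name `exception_flow`, the `"FIELD_DATA"` label and the `"LOG"` endpoint (standing in for the printing system or file saving system) are hypothetical, and the full CPU name is used as the argument of CPU_ASSERT.

```python
from typing import List, Tuple

Message = Tuple[str, str, str]  # (sender, receiver, message)

MONITOR = "CM4"  # resident CPU of the monitoring subsystem


def exception_flow(faulty: str, master: str, idle_targets: List[str],
                   restart_mode: bool) -> List[Message]:
    """List the messages exchanged when `faulty` captures an exception."""
    flow: List[Message] = [(faulty, MONITOR, f"CPU_ASSERT({faulty})")]
    # The main CPU asks the remaining CPUs in scope to enter standby.
    for cpu in idle_targets:
        if cpu != faulty:
            flow.append((master, cpu, "CPU_IDLE"))
    if faulty != master:
        # A slave CPU first hands its field data to the main CPU.
        flow.append((faulty, master, "FIELD_DATA"))
    # The main CPU outputs the field data (hypothetical "LOG" endpoint).
    flow.append((master, "LOG", "FIELD_DATA"))
    if restart_mode:
        flow.append((master, MONITOR, "SUBSYS_REBOOT"))
    return flow


# Slave CPU CP0_DSP0 of the first CP subsystem fails, restart mode configured.
slave_case = exception_flow("CP0_DSP0", "CP0", ["CP0_DSP0", "CP0_DSP1"], True)
# The AP fails, restart mode not configured; other subsystems are idled.
ap_case = exception_flow("AP", "AP", ["CP0", "CP1"], False)
```

The only difference between the core and service cases in this sketch is the list of CPUs placed in standby; what the monitoring subsystem restarts on receiving SUBSYS_REBOOT is decided separately by the kind of subsystem involved.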
As shown in Fig. 5, in the heartbeat detection exception handling flow of a CP subsystem provided in an embodiment of the present application, each CPU of the multi-core system periodically updates its heartbeat data in a shared memory area, and the heartbeat data differs from one period to the next. The monitoring subsystem periodically (every one or two periods) reads the heartbeat data of each CPU from the shared memory area and stores it locally, then compares the newly read heartbeat data with the local copy, that is, compares the heartbeat data of two adjacent periods of each CPU. If the heartbeat data of two adjacent periods are identical, the CPU that writes the heartbeat data is determined to be stuck in an endless loop or to have run away.
Taking the first CP subsystem as an example, the CPUs of the first CP subsystem (CP0, CP0_DSP0 and CP0_DSP1) update their heartbeat data with periods T0, T1 and T2, respectively, and store the heartbeat data in the shared memory area. The monitoring subsystem reads the heartbeat data of each CPU from the shared memory area once every two cycles and compares it with the heartbeat data it read last time and stored locally. If the two are identical, the corresponding CPU is considered to have failed to update its heartbeat data in the shared memory area on schedule and is determined to be stuck in an endless loop or to have run away. The monitoring subsystem then notifies each CPU of the first CP subsystem to enter standby mode (by sending a SUBSYSTEM_IDLE message) and restarts each CPU of the first CP subsystem according to the configured restart task.
In the embodiment of the present application, if the CPU that is stuck in an endless loop or has run away belongs to a core subsystem, each CPU of the multi-core system is notified to enter the standby state, and each CPU of the multi-core system is then restarted according to the configured restart task.
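The relationship between the heartbeat periods and the monitoring interval may be illustrated by a small discrete-time simulation. This is a sketch under assumptions not stated in the embodiment: time advances in integer ticks, "two cycles" is taken to mean twice the longest heartbeat period, and the function name `simulate` and the parameter `hang_at` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def simulate(periods: Dict[str, int], poll_interval: int, ticks: int,
             hang_at: Optional[Dict[str, int]] = None) -> List[Tuple[int, str]]:
    """Return (tick, cpu) pairs at which the monitor flags a CPU as stalled."""
    hang_at = hang_at or {}
    counters = {cpu: 0 for cpu in periods}   # shared memory area
    last_seen: Dict[str, int] = {}           # monitor's local copy
    flagged = []
    for t in range(1, ticks + 1):
        for cpu, period in periods.items():
            alive = t < hang_at.get(cpu, ticks + 1)
            if alive and t % period == 0:
                counters[cpu] += 1           # healthy CPU updates its heartbeat
        if t % poll_interval == 0:
            for cpu, value in counters.items():
                if last_seen.get(cpu) == value:
                    flagged.append((t, cpu))  # unchanged between two adjacent polls
                last_seen[cpu] = value
    return flagged


periods = {"CP0": 2, "CP0_DSP0": 3, "CP0_DSP1": 4}   # stand-ins for T0, T1, T2
healthy = simulate(periods, poll_interval=8, ticks=32)
hung = simulate(periods, poll_interval=8, ticks=24, hang_at={"CP0_DSP0": 10})
too_fast = simulate({"X": 4}, poll_interval=2, ticks=8)
```

In this model no healthy CPU is flagged when the monitoring interval is at least as long as the longest heartbeat period, a CPU that hangs at tick 10 is flagged at tick 24, and a monitor that polls faster than a CPU updates its heartbeat flags that CPU even though it is healthy.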
Fig. 6 shows an exception handling apparatus of a multi-core system according to an embodiment of the present application, where the apparatus includes:
the configuration unit 601 is configured to configure a multi-core system into a monitoring subsystem and a plurality of processing subsystems, wherein the monitoring subsystem includes a resident CPU, each processing subsystem includes at least one CPU, one of the CPUs is a master CPU, and the others are slave CPUs;
and the processing unit 602 is configured so that, when any CPU of one of the processing subsystems captures an exception, that CPU reports the captured exception to the resident CPU of the monitoring subsystem, and the exception field data is output through the main CPU of that processing subsystem.
Through the exception handling apparatus, the exception handling method of the embodiments shown in fig. 1 to fig. 5 can be implemented, and details are not repeated here.
Fig. 7 is a schematic structural diagram of a computing device 1500 provided by an embodiment of the present application. The computing device 1500 includes: processor 1510, memory 1520, communications interface 1530, and bus 1540.
It is to be appreciated that the communication interface 1530 in the computing device 1500 illustrated in FIG. 7 can be utilized to communicate with other devices.
The processor 1510 may be connected to the memory 1520. The memory 1520 may be used to store the program code and data. Accordingly, the memory 1520 may be a storage unit inside the processor 1510, an external storage unit independent of the processor 1510, or a component including both a storage unit inside the processor 1510 and an external storage unit independent of the processor 1510.
Optionally, computing device 1500 may also include a bus 1540. The memory 1520 and the communication interface 1530 may be connected to the processor 1510 via a bus 1540. Bus 1540 can be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus 1540 may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one line is shown in FIG. 7, but it is not intended that there be only one bus or one type of bus.
It should be understood that, in the embodiment of the present application, the processor 1510 may adopt a Central Processing Unit (CPU). The processor may also be other general purpose processors, Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, or the like. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like. Or the processor 1510 uses one or more integrated circuits for executing related programs to implement the technical solutions provided in the embodiments of the present application.
The memory 1520, which may include both read-only memory and random access memory, provides instructions and data to the processor 1510. A portion of the memory 1520 may also include non-volatile random access memory. For example, the memory 1520 may also store information of the device type.
When the computing device 1500 is run, the processor 1510 executes the computer-executable instructions in the memory 1520 to perform the operational steps of the above-described method.
It should be understood that the computing device 1500 according to the embodiment of the present application may correspond to a corresponding main body for executing the method according to the embodiments of the present application, and the above and other operations and/or functions of each module in the computing device 1500 are respectively for implementing corresponding flows of each method of the embodiment, and are not described herein again for brevity.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The present embodiments also provide a computer-readable storage medium on which a computer program is stored, the program, when executed by a processor, being used to execute the exception handling method of the multi-core system, the method including at least one of the solutions described in the above embodiments.
The computer storage media of the embodiments of the present application may take any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present application may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
It should be noted that the foregoing is only illustrative of the preferred embodiments of the present application and the technical principles employed. It will be understood by those skilled in the art that the present application is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the application. Therefore, although the present application has been described in more detail with reference to the above embodiments, the present application is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present application.
Claims (10)
1. An exception handling method for a multi-core system, comprising:
configuring a multi-core system into a monitoring subsystem and a plurality of processing subsystems, wherein the monitoring subsystem comprises a resident CPU, each processing subsystem comprises at least one CPU, one of the CPUs is a main CPU, and the other CPUs are slave CPUs;
when any CPU of one processing subsystem captures the abnormity, the any CPU reports the captured abnormity to the resident CPU of the monitoring subsystem, and the main CPU of the processing subsystem outputs abnormal field data.
2. The method of claim 1, wherein the processing subsystem is divided into a core subsystem and a service subsystem according to a predetermined standard.
3. The method of claim 2, further comprising:
when the processing subsystem to which the CPU that captured the abnormality belongs is a core subsystem, suspending other CPUs of the multi-core system through a main CPU of the core subsystem; performing, by a resident CPU of a monitoring subsystem, a reboot of the multi-core system;
when the processing subsystem to which the captured abnormal CPU belongs is a service subsystem, suspending other CPUs of the service subsystem through a main CPU of the service subsystem; the restart of the service subsystem is performed by the resident CPU of the monitoring subsystem.
4. The method of claim 1, further comprising:
when the CPU that captured the abnormality is a slave CPU, transmitting, by that slave CPU, the abnormal field data to the main CPU of the processing subsystem to which the slave CPU belongs, and outputting the abnormal field data by the main CPU.
5. The method of claim 1, further comprising:
when the resident CPU of the monitoring subsystem captures an abnormality, suspending other CPUs of the multi-core system through the resident CPU of the monitoring subsystem and outputting abnormal field data;
and executing the restart of the multi-core system through the resident CPU of the monitoring subsystem.
6. The method of claim 2, further comprising:
detecting heartbeat data of two adjacent cycles of each CPU of each processing subsystem through a resident CPU of the monitoring subsystem, and judging whether each CPU is abnormal or not;
when a CPU is judged to be abnormal:
if the processing subsystem to which the CPU belongs is a core subsystem, suspending other CPUs of the multi-core system through the resident CPU, and executing the restart of the multi-core system;
if the processing subsystem to which the CPU belongs is a service subsystem, suspending other CPUs of the service subsystem through the resident CPU, and executing the restart of the service subsystem.
7. The method according to any one of claims 1 to 6, wherein the resident CPU is the lowest-performance or lowest-power CPU in the multi-core system.
8. An exception handling apparatus for a multi-core system, comprising:
a configuration unit, configured to configure the multi-core system into a monitoring subsystem and a plurality of processing subsystems, wherein the monitoring subsystem comprises a resident CPU, each processing subsystem comprises at least one CPU, one of the CPUs is a main CPU, and the other CPUs are slave CPUs;
and the processing unit is used for reporting the captured abnormality to the resident CPU of the monitoring subsystem by any CPU when the abnormality is captured by the CPU of the processing subsystem, and outputting abnormal field data through the main CPU of the processing subsystem.
9. A computer device, characterized in that the computer device comprises:
one or more processors;
a memory for storing one or more programs;
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1 to 7.
10. A computer-readable storage medium, on which a computer program is stored which, when executed by a computer, implements the method of any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011603786.8A CN112631872B (en) | 2020-12-30 | 2020-12-30 | Exception handling method and device for multi-core system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112631872A true CN112631872A (en) | 2021-04-09 |
CN112631872B CN112631872B (en) | 2024-02-23 |
Family
ID=75287564
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011603786.8A Active CN112631872B (en) | 2020-12-30 | 2020-12-30 | Exception handling method and device for multi-core system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112631872B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114327828A (en) * | 2021-12-29 | 2022-04-12 | 科东(广州)软件科技有限公司 | Method, device, equipment and medium for concurrent access of shared data |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101256519A (en) * | 2008-03-27 | 2008-09-03 | 中兴通讯股份有限公司 | Isomerization multicore system as well as serial port control automatic switch method based on said system |
CN101324855A (en) * | 2008-08-12 | 2008-12-17 | 杭州华三通信技术有限公司 | Method, system, component and multi-CPU equipment for detecting auxiliary CPU operating status |
CN101635652A (en) * | 2009-09-07 | 2010-01-27 | 杭州华三通信技术有限公司 | Method and equipment for recovering fault of multi-core system |
CN102073572A (en) * | 2009-11-24 | 2011-05-25 | 中兴通讯股份有限公司 | Monitoring method for multi-core processor and system thereof |
US20120089861A1 (en) * | 2010-10-12 | 2012-04-12 | International Business Machines Corporation | Inter-processor failure detection and recovery |
CN103544092A (en) * | 2013-11-05 | 2014-01-29 | 中国航空工业集团公司西安飞机设计研究所 | Health monitoring system of avionic electronic equipment based on ARINC653 standard |
Also Published As
Publication number | Publication date |
---|---|
CN112631872B (en) | 2024-02-23 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||