CN109947596A - Method, apparatus, and related components for handling a system crash caused by a PCIe device failure
- Publication number: CN109947596A (application CN201910209284.8A)
- Authority: CN (China)
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Abstract
This application discloses a method for handling a system crash caused by a PCIe device failure, relating to the field of electronic technology. When a server system crash is detected, internal register data carrying fault log information is written to a reserved fault log storage space; this data can later be used to determine the faulty PCIe device. When writing of the register data is complete, a system restart is triggered. After the restart is triggered, the faulty PCIe device is determined and set to an unavailable state, so that the faulty device is isolated automatically. A faulty device that has been set to the unavailable state can be replaced at any convenient time, which avoids the service interruption caused by requiring manual repair at the time of the crash, and customer services can continue to run on the available PCIe devices after the restart completes. This application also discloses an apparatus, equipment, and a computer-readable storage medium for handling a system crash caused by a PCIe device failure, which have the same beneficial effects.
Description
Technical field
This application relates to the field of electronic technology, and in particular to a method, apparatus, and equipment for handling a system crash caused by a PCIe device failure, and to a computer-readable storage medium.
Background art
Servers are the core of network systems and computing platforms. With the rapid development of cloud computing and big data technology, more and more data centers are being built and the number of server systems is growing exponentially. Service quality is the attribute of a server system that users care about most.
The availability of a server system determines its service quality. PCIe devices are major components of a server system. When a PCIe device suffers an unrecoverable failure, it may cause a processor error or an operating system crash, so the entire server system generally goes down.
When an unrecoverable error in a PCIe device crashes the system, the conventional approach is to locate the faulty PCIe device after the crash is discovered and then replace it. This takes time: the fault has to be located, the system has to be powered off and on again, and the server administrator has to carry out a series of actions such as replacing the PCIe device, so the server system can stay offline for a long time. While offline, the server system cannot provide service, and customer services are interrupted.
Therefore, how to shorten server system downtime and improve the stability of customer services is a technical problem that those skilled in the art need to solve.
Summary of the invention
The purpose of this application is to provide a method for handling a system crash caused by a PCIe device failure. The method greatly increases the time during which the server system provides service, reduces the time during which customer services are interrupted, and improves the stability of customer services. Another purpose of this application is to provide a corresponding apparatus, equipment, and computer-readable storage medium, which have the same beneficial effects.
To solve the above technical problem, this application provides a method for handling a system crash caused by a PCIe device failure, comprising:
when a server system crash is detected, writing internal register data carrying fault log information to a reserved fault log storage space;
when writing of the register data is complete, triggering a system restart;
when the system restart is triggered, determining the faulty PCIe device, wherein the faulty PCIe device is obtained by parsing the internal register data;
setting the faulty PCIe device to an unavailable state;
after the system restart completes, running customer services on the available PCIe devices.
Optionally, writing the processor internal data to the reserved storage space comprises:
locating the error registers inside the CPU;
writing the data stored in the error registers to the reserved storage space.
Optionally, the method further comprises:
after the system restart completes, outputting prompt information indicating that the faulty PCIe device is abnormal.
Optionally, the method further comprises:
when the faulty PCIe device is detected to be in the available state, deleting the data in the reserved storage space.
Optionally, determining the faulty PCIe device when the system restart is triggered comprises:
when the system restart is triggered, locating the faulty PCIe device according to the internal register data to obtain the faulty PCIe device.
This application discloses an apparatus for handling a system crash caused by a PCIe device failure, comprising:
a data writing module, configured to write internal register data carrying fault log information to a reserved fault log storage space when a server system crash is detected;
a restart triggering module, configured to trigger a system restart when writing of the register data is complete;
a fault determination module, configured to determine the faulty PCIe device when the system restart is triggered, wherein the faulty PCIe device is obtained by parsing the internal register data;
a fault marking module, configured to set the faulty PCIe device to an unavailable state;
a service execution module, configured to run customer services on the available PCIe devices after the system restart completes.
Optionally, the data writing module comprises:
a locating submodule, configured to locate the error registers inside the CPU;
a writing submodule, configured to write the data stored in the error registers to the reserved storage space.
Optionally, the fault determination module is specifically a fault locating module, configured to: when the system restart is triggered, locate the faulty PCIe device according to the internal register data to obtain the faulty PCIe device.
This application discloses equipment for handling a system crash caused by a PCIe device failure, comprising:
a memory, configured to store a program;
a processor, configured to implement the steps of the crash handling method when executing the program.
This application discloses a readable storage medium on which a program is stored; when the program is executed by a processor, the steps of the crash handling method are implemented.
In the crash handling method provided by this application, when a server system crash is detected, internal register data carrying fault log information is written to a reserved fault log storage space; this data can later be used to determine the faulty PCIe device. When writing of the register data is complete, a system restart is triggered. After the restart is triggered, the faulty PCIe device is determined and set to an unavailable state, so that it is isolated automatically, its impact on the server system is avoided, and the availability of the server system is improved. In addition, a faulty PCIe device that has been set to the unavailable state can be replaced at any convenient time, which avoids the service interruption caused by requiring manual repair at the time of the crash and greatly increases the time during which the server system provides service. After the restart completes, customer services can continue to run on the available PCIe devices, which improves the stability of customer services.
This application also discloses an apparatus, equipment, and a computer-readable storage medium for handling a system crash caused by a PCIe device failure, which have the same beneficial effects; details are not repeated here.
Brief description of the drawings
To explain the technical solutions of the embodiments of this application or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are merely embodiments of this application; those of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of a method for handling a system crash caused by a PCIe device failure, provided by an embodiment of this application;
Fig. 2 is a structural block diagram of an apparatus for handling a system crash caused by a PCIe device failure, provided by an embodiment of this application;
Fig. 3 is a schematic structural diagram of equipment for handling a system crash caused by a PCIe device failure, provided by an embodiment of this application.
Detailed description of the embodiments
The core of this application is to provide a method for handling a system crash caused by a PCIe device failure. The method greatly increases the time during which the server system provides service, reduces the time during which customer services are interrupted, and improves the stability of customer services. Another core of this application is to provide a corresponding apparatus, equipment, and computer-readable storage medium, which have the same beneficial effects.
To make the purposes, technical solutions, and advantages of the embodiments of this application clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are some, not all, of the embodiments of this application. All other embodiments obtained by those of ordinary skill in the art from these embodiments without creative effort fall within the protection scope of this application.
The service quality of a server system is always the attribute users care about most. The quality of a server system is generally measured in three respects: reliability, availability, and serviceability. Reliability generally refers to the degree to which the server system runs without problems; availability refers to the ability of the server system to remain usable after an error occurs; serviceability refers to the ability to locate and solve a problem quickly after a hardware error occurs in the server.
This application provides a method for handling a system crash caused by a PCIe device failure. When a server crashes, the method isolates the faulty PCIe device automatically and thereby improves the availability of the server.
Referring to Fig. 1, Fig. 1 is a flowchart of the method for handling a system crash caused by a PCIe device failure provided by this embodiment. The method mainly comprises:
Step s110: when a server system crash is detected, write internal register data carrying fault log information to a reserved fault log storage space.
A certain amount of storage space is reserved for saving the crash-scene log. The reserved storage space should be as large as practical, to ensure that all the data to be stored can be written.
The register data written to the reserved space can be all of the register data at the crash scene. Alternatively, only the contents of registers in which an error has occurred can be saved, which reduces the amount of data to be written, shortens the downtime of the server system's services, and minimizes the space occupied by log data. Preferably, only the contents of all error registers inside the processor are stored.
To make the log data easier to read and analyze, a unified data format can be defined, and the data can be written to the reserved space in that preset format. It is also possible not to define a data format; this is not limited here.
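As a minimal sketch of what such a unified data format could look like, the following Python fragment packs error-register snapshots into fixed-size records behind a small header. The magic value, the field layout, and the function names are assumptions made for illustration; the application does not specify a format.

```python
import struct

# Hypothetical layout (not specified by the application):
#   header = magic (4 bytes) + record count (uint16), little-endian
#   record = register id (uint32) + register value (uint64)
MAGIC = b"CRSH"
HEADER = struct.Struct("<4sH")
RECORD = struct.Struct("<IQ")


def pack_crash_log(registers):
    """Serialize {register_id: value} into the reserved-space byte format."""
    body = b"".join(RECORD.pack(reg_id, value)
                    for reg_id, value in sorted(registers.items()))
    return HEADER.pack(MAGIC, len(registers)) + body


def unpack_crash_log(blob):
    """Return {register_id: value}, or None if the blob holds no valid log."""
    if len(blob) < HEADER.size:
        return None
    magic, count = HEADER.unpack_from(blob, 0)
    if magic != MAGIC or len(blob) < HEADER.size + count * RECORD.size:
        return None
    registers = {}
    for i in range(count):
        reg_id, value = RECORD.unpack_from(blob, HEADER.size + i * RECORD.size)
        registers[reg_id] = value
    return registers
```

A fixed header with a magic value is one simple way for the code that later reads the reserved space to tell a valid log from leftover or erased contents.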
It should be noted that, in this embodiment, the condition that triggers writing register data to the reserved space can be that a server system crash is detected, that is, without yet knowing what caused the crash. Alternatively, the condition can be that a crash caused by a PCIe device failure is detected, that is, once it has been determined that the crash was caused by a PCIe device failure. This is not limited here.
Step s120: when writing of the register data is complete, trigger a system restart.
After the fault information has been written, the faulty PCIe device is not replaced manually at this point. This avoids the need for manual intervention before server system service can be restored after a PCIe device fails, greatly increases the time during which the server system provides service, and reduces the time during which customer services are interrupted.
Step s130: when the system restart is triggered, determine the faulty PCIe device.
The faulty PCIe device is determined as the system restarts, and it is obtained by parsing the internal register data. It should be noted that locating the faulty PCIe device from the processor internal data can be completed in step s120: the faulty PCIe device is determined from the written data immediately after the data is written to the reserved space, and when the restart is later triggered, step s130 simply obtains the device that was already determined. The locating can also be completed in step s130: after the restart has been triggered, the data in the reserved space is read and the faulty PCIe device is determined from it. This embodiment does not limit the choice. In the latter case, parsing the data after the restart has been triggered can shorten the time during which the server system is out of service and improve its availability. Preferably, therefore, step s130 may specifically be: when the system restart is triggered, locate the faulty PCIe device according to the internal register data to obtain the faulty PCIe device.
The specific steps for determining the faulty PCIe device from the data in the reserved space can follow the related art and are not described further in this embodiment.
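Because the application leaves the parsing details to the related art, the following Python fragment is only a hypothetical illustration of the idea: saved register values are checked for an error flag and mapped to a slot. The register ids, the flag bit, and the slot table are all invented for the example and do not describe any real processor.

```python
# Hypothetical mapping from saved error-register id to PCIe slot name.
REGISTER_TO_SLOT = {0x100: "Slot1", 0x101: "Slot2", 0x102: "Slot3"}
UNCORRECTABLE_BIT = 1 << 0  # assumed flag position, for illustration only


def locate_faulty_devices(registers):
    """Return the slots whose saved register flags an unrecoverable error."""
    return sorted(
        REGISTER_TO_SLOT[reg_id]
        for reg_id, value in registers.items()
        if reg_id in REGISTER_TO_SLOT and value & UNCORRECTABLE_BIT
    )
```

Registers that do not correspond to a known slot are ignored here, so a crash with another cause yields an empty result and no device is disabled.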
Step s140: set the faulty PCIe device to an unavailable state.
Once the faulty PCIe device has been set to the unavailable state, it no longer provides service to users. Its impact on the server system is isolated automatically, which improves OpenPOWER server availability.
Step s150: after the system restart completes, run customer services on the available PCIe devices.
Although doing so loses certain functions (those provided by the faulty PCIe device), the server system as a whole can keep working. A server system is generally configured with multiple PCIe devices of different types, such as SAS cards, network cards, and GPU cards, and different PCIe devices carry different functions: for example, a SAS card mainly supports the user's data storage, and a network card provides network services. After the faulty module is set to unavailable, the other PCIe devices can continue to work. This avoids the need for manual intervention before server system service can be restored after a PCIe device fails, greatly increases the time during which the server system provides service, and reduces the time during which customer services are interrupted.
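Steps s140 and s150 can be sketched as follows, under the assumption that each PCIe device is tracked as a small record with an availability flag. The `PcieDevice` type, its fields, and the function names are hypothetical; a real system would disable the device in firmware rather than in a Python object.

```python
from dataclasses import dataclass


@dataclass
class PcieDevice:
    slot: str
    function: str          # e.g. "storage", "network", "compute"
    available: bool = True


def disable_faulty(devices, faulty_slots):
    """Step s140: mark every device in a faulty slot as unavailable."""
    for dev in devices:
        if dev.slot in faulty_slots:
            dev.available = False


def available_functions(devices):
    """Step s150: functions the server can still offer after the restart."""
    return sorted({dev.function for dev in devices if dev.available})
```

With a SAS card, a network card, and a GPU card installed, disabling the network card's slot leaves the storage and compute functions available, which matches the partial-service behavior described above.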
It should be noted that the crash handling method provided by this application is applicable to existing x86 server systems using Intel or AMD processors and to OpenPOWER server systems using IBM POWER processors, and equally to server systems that follow the same PCIe specification as those server systems.
Based on the above technical solution, in the crash handling method provided by the embodiments of this application, when a server system crash is detected, internal register data carrying fault log information is written to a reserved fault log storage space; this data can later be used to determine the faulty PCIe device. When writing of the register data is complete, a system restart is triggered. After the restart is triggered, the faulty PCIe device is determined and set to an unavailable state, so that it is isolated automatically, its impact on the server system is avoided, and the availability of the server system is improved. In addition, a faulty PCIe device that has been set to the unavailable state can be replaced at any convenient time, which avoids the service interruption caused by requiring manual repair at the time of the crash and greatly increases the time during which the server system provides service. After the restart completes, customer services can continue to run on the available PCIe devices, which improves the stability of customer services.
Normal customer services are largely restored once the system restart completes, but until the faulty PCIe device has been dealt with it remains in the unavailable state, and the server system is without the functions that device provides. To restore the faulty device's service as soon as possible while the customer's overall services are recovered quickly, preferably, after the system restart completes, prompt information indicating that the faulty PCIe device is abnormal can be output, so that the relevant technical staff can repair the faulty PCIe device at the earliest suitable time.
After the faulty PCIe device has been repaired, its preset unavailable state is changed to the available state, indicating that the device has formally resumed service. At this point, to minimize the system memory occupied by temporary data that is no longer needed, the data in the reserved storage space can be deleted when the faulty PCIe device is detected to be in the available state. The data can also be moved to other free space; this is not limited here.
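The cleanup rule can be sketched as a small state check. In the fragment below a `bytearray` stands in for the reserved storage space and zero-filling stands in for deletion; both are assumptions for illustration, and the function name is hypothetical.

```python
def on_device_state_change(device_available, reserved_space):
    """Clear the saved crash log once the repaired device is available again.

    `reserved_space` is a bytearray standing in for the reserved region.
    Returns True if the log was cleared, False if it was left untouched.
    """
    if not device_available:
        return False
    reserved_space[:] = bytes(len(reserved_space))  # zero-fill in place
    return True
```

Keeping the log until the device is confirmed available means the fault record is still there to consult if the repair has to be repeated.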
To deepen understanding of the crash handling method provided by this application, this embodiment takes an OpenPOWER server (a server system using IBM POWER architecture processors) as an example. Other server systems can refer to the description in this embodiment, which is not repeated here. In an OpenPOWER server, the crash handling process mainly involves the following functional modules: the OCC module (On-Chip Controller, a functional module built into the OpenPOWER processor), the BMC, and the BIOS parsing module (BIOS: Basic Input Output System).
The overall crash handling process carried out by these modules mainly comprises the following steps:
When designing the BIOS flash storage layout, reserve a storage space with a fixed address and sufficient capacity for saving the crash-scene log.
When the server system crashes and the OCC module built into the POWER processor detects that the processor has encountered an unrecoverable problem, the DUMP code of the OCC module saves the contents of all error registers inside the POWER processor and writes the data, in the designed format, to the storage space reserved in the BIOS flash chip.
The OCC notifies the BMC that the server system has crashed and that the log has been written. After receiving the message that the OCC has completed the DUMP work, the BMC starts a system restart.
When the BIOS parsing module detects that the system is starting, it first checks whether valid error register data exists in the reserved space. If so, it parses the data to locate the exact faulty component, for example that the PCIe device in PCIe Slot2 has suffered an unrecoverable error. It then sets that faulty PCIe device to the Disable state and lets the system restart proceed. Because the faulty PCIe device has been disabled, it is no longer used after the restart, so the server system can boot into the OS normally and run the user's services.
If there is no valid error register data in the reserved space, the system crash may not have been caused by a PCIe failure, and the system restart can continue.
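The hand-off between the OCC, the BMC, and the BIOS parsing module described above can be simulated in a few lines of Python. This is a sketch under stated assumptions: a dictionary stands in for the BIOS flash, a string stands in for the OCC-to-BMC notification, a non-zero saved value is treated as an error, and every function name is hypothetical rather than part of any real firmware interface.

```python
def occ_dump(error_registers, flash):
    """OCC: save the error registers into the reserved flash region."""
    flash["reserved"] = dict(error_registers)
    return "dump_done"  # stand-in for the notification sent to the BMC


def bmc_handle(message):
    """BMC: decide to start a system restart once the dump is reported done."""
    return message == "dump_done"


def bios_boot(flash, slot_of_register):
    """BIOS: parse any saved log and return the slots to disable before boot."""
    saved = flash.get("reserved")
    if not saved:
        return set()  # no valid log: continue the restart unchanged
    return {
        slot_of_register[reg_id]
        for reg_id, value in saved.items()
        if value != 0 and reg_id in slot_of_register
    }
```

Running `occ_dump`, then `bmc_handle`, then `bios_boot` in order mirrors the sequence in the text: the log is written first, the restart is triggered only after the write is reported complete, and the slot is disabled during the following boot.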
The OpenPOWER server crash handling method provided by this embodiment collects and parses the crash information and automatically isolates the faulty PCIe device from the system. This improves OpenPOWER server availability, avoids the need for manual intervention before server system service can be restored after a PCIe device fails, greatly increases the time during which the server system provides service, and reduces the time during which customer services are interrupted.
Referring to Fig. 2, Fig. 2 is a structural block diagram of an apparatus for handling a system crash caused by a PCIe device failure, provided by an embodiment of this application. The apparatus mainly comprises: a data writing module 210, a restart triggering module 220, a fault determination module 230, a fault marking module 240, and a service execution module 250.
The data writing module 210 is mainly configured to write internal register data carrying fault log information to a reserved fault log storage space when a server system crash is detected;
the restart triggering module 220 is mainly configured to trigger a system restart when writing of the register data is complete;
the fault determination module 230 is mainly configured to determine the faulty PCIe device when the system restart is triggered, wherein the faulty PCIe device is obtained by parsing the internal register data;
the fault marking module 240 is mainly configured to set the faulty PCIe device to an unavailable state;
the service execution module 250 is mainly configured to run customer services on the available PCIe devices after the system restart completes.
The data writing module may further comprise:
a locating submodule, configured to locate the error registers inside the CPU;
a writing submodule, configured to write the data stored in the error registers to the reserved storage space.
The fault determination module may specifically be a fault locating module, configured to: when the system restart is triggered, locate the faulty PCIe device according to the internal register data to obtain the faulty PCIe device.
In addition, the apparatus provided by this embodiment may further comprise an abnormality prompt module, configured to output prompt information indicating that the faulty PCIe device is abnormal after the system restart completes.
In addition, the apparatus provided by this embodiment may further comprise a data deletion module, configured to delete the data in the reserved storage space when the faulty PCIe device is detected to be in the available state.
The apparatus provided by this embodiment can increase the time during which the server system provides service, reduce the time during which customer services are interrupted, and improve the stability of customer services.
This embodiment provides equipment for handling a system crash caused by a PCIe device failure. The equipment mainly comprises a memory and a processor. For details of the equipment, refer to the description of the crash handling method above.
The memory is mainly configured to store a program;
the processor is mainly configured to implement the steps of the above crash handling method when executing the program.
Referring to Fig. 3, which is a schematic structural diagram of the equipment provided by this embodiment: the equipment may differ considerably depending on configuration or performance. It may include one or more processors (central processing units, CPU) 322, a memory 332, and one or more storage media 330 (for example, one or more mass storage devices) storing application programs 342 or data 344. The memory 332 and the storage medium 330 may provide transient or persistent storage. The program stored in the storage medium 330 may include one or more modules (not marked in the figure), and each module may include a series of instruction operations on the data processing equipment. Further, the central processing unit 322 may be configured to communicate with the storage medium 330 and to execute, on the equipment 301, the series of instruction operations in the storage medium 330.
The equipment 301 may also include one or more power supplies 326, one or more wired or wireless network interfaces 350, one or more input/output interfaces 358, and/or one or more operating systems 341, such as Windows Server™, Mac OS X™, Unix™, Linux™, or FreeBSD™.
The steps of the crash handling method described above with reference to Fig. 1 can be implemented by the structure of this equipment.
This embodiment discloses a readable storage medium on which a program is stored. When the program is executed by a processor, the steps of the crash handling method are implemented; for the method itself, refer to the embodiments above, which are not repeated here.
The readable storage medium may specifically be any readable storage medium capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, or an optical disc.
The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the others, and the same or similar parts of the embodiments can be cross-referenced. Because the apparatus disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief; for the relevant details, refer to the description of the method.
Those skilled in the art will further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To illustrate the interchangeability of hardware and software clearly, the composition and steps of each example have been described above in general terms of their function. Whether these functions are implemented in hardware or software depends on the specific application and the design constraints of the technical solution. Those skilled in the art may use different methods to implement the described functions for each specific application, but such implementations should not be considered to go beyond the scope of this application.
The steps of the method or algorithm described in connection with the embodiments disclosed herein can be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module can reside in random access memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the technical field.
The method, apparatus, equipment, and readable storage medium for handling a system crash caused by a PCIe device failure provided by this application have been described in detail above. Specific examples are used herein to explain the principles and implementations of this application, and the description of the above embodiments is intended only to help in understanding the method of this application and its core idea. It should be pointed out that those of ordinary skill in the art can make several improvements and modifications to this application without departing from its principles, and these improvements and modifications also fall within the protection scope of the claims of this application.
Claims (10)
1. A PCIE device failure system downtime processing method, characterized by comprising:
when downtime of the server system is detected, writing internal register data carrying fault log information into a reserved fault log storage space;
when writing of the register data is complete, triggering a system reboot;
when the system reboot is triggered, determining the faulty PCIE device, wherein the faulty PCIE device is obtained by parsing the internal register data;
setting the faulty PCIE device to an unavailable state;
after the system reboot is complete, executing user services according to the available PCIE devices.
2. The PCIE device failure system downtime processing method as claimed in claim 1, characterized in that writing the processor internal data into the reserved storage space comprises:
locating the error registers inside the CPU;
writing the data stored in the error registers into the reserved storage space.
3. The PCIE device failure system downtime processing method as claimed in claim 1, characterized by further comprising:
after the system reboot is complete, outputting prompt information indicating that the faulty PCIE device is abnormal.
4. The PCIE device failure system downtime processing method as claimed in claim 1, characterized by further comprising:
when the faulty PCIE device is detected to be in an available state, deleting the data in the reserved storage space.
5. The PCIE device failure system downtime processing method as claimed in any one of claims 1 to 4, characterized in that determining the faulty PCIE device when the system reboot is triggered comprises:
when the system reboot is triggered, locating the faulty PCIE device according to the internal register data to obtain the faulty PCIE device.
6. A PCIE device failure system downtime processing apparatus, characterized by comprising:
a data writing module, configured to write internal register data carrying fault log information into a reserved fault log storage space when downtime of the server system is detected;
a reboot trigger module, configured to trigger a system reboot when writing of the register data is complete;
a fault determination module, configured to determine the faulty PCIE device when the system reboot is triggered, wherein the faulty PCIE device is obtained by parsing the internal register data;
a fault marking module, configured to set the faulty PCIE device to an unavailable state;
a service execution module, configured to execute user services according to the available PCIE devices after the system reboot is complete.
7. The PCIE device failure system downtime processing apparatus as claimed in claim 6, characterized in that the data writing module comprises:
a locating submodule, configured to locate the error registers inside the CPU;
a writing submodule, configured to write the data stored in the error registers into the reserved storage space.
8. The PCIE device failure system downtime processing apparatus as claimed in claim 6 or 7, characterized in that the fault determination module is specifically a fault location module, and the fault location module is configured to: when the system reboot is triggered, locate the faulty PCIE device according to the internal register data to obtain the faulty PCIE device.
9. A PCIE device failure system downtime processing equipment, characterized by comprising:
a memory, configured to store a program;
a processor, configured to implement, when executing the program, the steps of the PCIE device failure system downtime processing method as claimed in any one of claims 1 to 5.
10. A readable storage medium, characterized in that a program is stored on the readable storage medium, and the program, when executed by a processor, implements the steps of the PCIE device failure system downtime processing method as claimed in any one of claims 1 to 5.
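The flow recited in method claims 1 to 5 can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the `Server` class, the function names, and in particular the register layout (bit 0 as a "valid" flag, bits 8 to 15 as the bus number of the reporting device) are hypothetical and chosen only to make the steps concrete. Real CPU error registers have vendor-specific formats that the claims do not specify.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """Toy model of a server; every field is an illustrative assumption."""
    devices: dict            # bus number -> device name
    error_registers: dict    # register name -> raw value (stand-in for CPU error registers)
    reserved_log: list = field(default_factory=list)   # reserved fault-log storage space
    unavailable: set = field(default_factory=set)      # bus numbers marked unavailable


def locate_faulty(server):
    """Parse the saved register data to find faulty devices (claims 1 and 5).

    Hypothetical layout: bit 0 = record valid, bits 8-15 = bus number.
    """
    faulty = set()
    for _name, raw in server.reserved_log:
        if raw & 0x1:
            bus = (raw >> 8) & 0xFF
            if bus in server.devices:
                faulty.add(bus)
    return faulty


def available_devices(server):
    """Names of the devices that user services may still use."""
    return [name for bus, name in sorted(server.devices.items())
            if bus not in server.unavailable]


def reboot(server):
    """Reboot path: locate faulty devices, mark them unavailable, warn, continue."""
    faulty = locate_faulty(server)
    server.unavailable |= faulty
    server.error_registers = {}  # volatile register state is lost across the reboot
    for bus in sorted(faulty):
        # Claim 3: output prompt information about the abnormal device.
        print(f"warning: PCIE device {server.devices[bus]} (bus {bus:#04x}) is disabled")
    return available_devices(server)


def on_crash(server):
    """Crash path (claims 1 and 2): save the registers first, then trigger the reboot."""
    server.reserved_log = list(server.error_registers.items())
    return reboot(server)


def on_device_repaired(server, bus):
    """Claim 4: once the device is available again, delete the saved data."""
    server.unavailable.discard(bus)
    server.reserved_log.clear()
```

In this sketch the saved data survives the simulated reboot because it lives in `reserved_log` rather than in `error_registers`, which mirrors the reason the claims write the register contents to a reserved storage space before the reboot is triggered.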
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910209284.8A CN109947596A (en) | 2019-03-19 | 2019-03-19 | PCIE device failure system delay machine processing method, device and associated component |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109947596A true CN109947596A (en) | 2019-06-28 |
Family
ID=67010253
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910209284.8A Pending CN109947596A (en) | 2019-03-19 | 2019-03-19 | PCIE device failure system delay machine processing method, device and associated component |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109947596A (en) |
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110609778A (en) * | 2019-08-16 | 2019-12-24 | 苏州浪潮智能科技有限公司 | Method and system for storing server downtime log |
CN111400076A (en) * | 2020-02-28 | 2020-07-10 | 苏州浪潮智能科技有限公司 | Downtime restoration method, device, equipment and storage medium |
CN111404725A (en) * | 2020-02-27 | 2020-07-10 | 苏州浪潮智能科技有限公司 | Method and system for isolating failure PCIE (peripheral component interface express) equipment |
CN111414268A (en) * | 2020-02-26 | 2020-07-14 | 华为技术有限公司 | Fault processing method and device and server |
CN112699073A (en) * | 2021-01-06 | 2021-04-23 | 同方计算机有限公司 | PCIE card on-line replacement method and system with controllable BMC system |
CN113127243A (en) * | 2019-12-30 | 2021-07-16 | 美光科技公司 | Real-time trigger to dump an error log |
CN113722156A (en) * | 2021-11-02 | 2021-11-30 | 四川华鲲振宇智能科技有限责任公司 | N +1 redundancy backup method and system for PCIe equipment |
CN114356644A (en) * | 2022-03-18 | 2022-04-15 | 阿里巴巴(中国)有限公司 | PCIE equipment fault processing method and device |
CN115426244A (en) * | 2022-08-09 | 2022-12-02 | 武汉虹信技术服务有限责任公司 | Network equipment fault detection method based on big data |
WO2022267349A1 (en) * | 2021-06-22 | 2022-12-29 | 苏州浪潮智能科技有限公司 | Register reading method and apparatus, device, and medium |
CN116382968A (en) * | 2023-06-05 | 2023-07-04 | 苏州浪潮智能科技有限公司 | Fault detection method and device for external equipment |
CN116737396A (en) * | 2023-08-14 | 2023-09-12 | 苏州浪潮智能科技有限公司 | Method, device, electronic equipment and storage medium for configuring maintainability of server |
US11971776B2 (en) | 2019-12-30 | 2024-04-30 | Micron Technology, Inc. | Real-time trigger to dump an error log |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102664702A (en) * | 2012-04-05 | 2012-09-12 | 烽火通信科技股份有限公司 | Protection mode of cross disc of M to N |
US20160004608A1 (en) * | 2014-07-01 | 2016-01-07 | Bull Sas | Method and device for synchronously running an application in a high availability environment |
CN105893171A (en) * | 2015-01-04 | 2016-08-24 | 伊姆西公司 | Method and device for fault recovery in storage equipment |
US20180039548A1 (en) * | 2016-08-08 | 2018-02-08 | International Business Machines Corporation | Smart virtual machine snapshotting |
CN108287775A (en) * | 2018-03-01 | 2018-07-17 | 郑州云海信息技术有限公司 | A kind of method, apparatus, equipment and the storage medium of server failure detection |
CN108984332A (en) * | 2018-06-22 | 2018-12-11 | 郑州云海信息技术有限公司 | A kind of device and method of location-server delay machine failure |
CN108984332A (en) * | 2018-06-22 | 2018-12-11 | 郑州云海信息技术有限公司 | A kind of device and method of location-server delay machine failure |
Cited By (21)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110609778A (en) * | 2019-08-16 | 2019-12-24 | 苏州浪潮智能科技有限公司 | Method and system for storing server downtime log |
CN113127243A (en) * | 2019-12-30 | 2021-07-16 | 美光科技公司 | Real-time trigger to dump an error log |
US11829232B2 (en) | 2019-12-30 | 2023-11-28 | Micron Technology, Inc. | Real-time trigger to dump an error log |
US11971776B2 (en) | 2019-12-30 | 2024-04-30 | Micron Technology, Inc. | Real-time trigger to dump an error log |
CN111414268A (en) * | 2020-02-26 | 2020-07-14 | 华为技术有限公司 | Fault processing method and device and server |
CN111404725B (en) * | 2020-02-27 | 2022-06-07 | 苏州浪潮智能科技有限公司 | Method and system for isolating failure PCIE (peripheral component interface express) equipment |
CN111404725A (en) * | 2020-02-27 | 2020-07-10 | 苏州浪潮智能科技有限公司 | Method and system for isolating failure PCIE (peripheral component interface express) equipment |
CN111400076A (en) * | 2020-02-28 | 2020-07-10 | 苏州浪潮智能科技有限公司 | Downtime restoration method, device, equipment and storage medium |
CN112699073A (en) * | 2021-01-06 | 2021-04-23 | 同方计算机有限公司 | PCIE card on-line replacement method and system with controllable BMC system |
US11860718B2 (en) | 2021-06-22 | 2024-01-02 | Inspur Suzhou Intelligent Technology Co., Ltd. | Register reading method and apparatus, device, and medium |
WO2022267349A1 (en) * | 2021-06-22 | 2022-12-29 | 苏州浪潮智能科技有限公司 | Register reading method and apparatus, device, and medium |
CN113722156A (en) * | 2021-11-02 | 2021-11-30 | 四川华鲲振宇智能科技有限责任公司 | N +1 redundancy backup method and system for PCIe equipment |
CN113722156B (en) * | 2021-11-02 | 2022-02-18 | 四川华鲲振宇智能科技有限责任公司 | N +1 redundancy backup method and system for PCIe equipment |
CN114356644B (en) * | 2022-03-18 | 2022-06-14 | 阿里巴巴(中国)有限公司 | PCIE equipment fault processing method and device |
CN114356644A (en) * | 2022-03-18 | 2022-04-15 | 阿里巴巴(中国)有限公司 | PCIE equipment fault processing method and device |
CN115426244A (en) * | 2022-08-09 | 2022-12-02 | 武汉虹信技术服务有限责任公司 | Network equipment fault detection method based on big data |
CN115426244B (en) * | 2022-08-09 | 2024-03-15 | 武汉虹信技术服务有限责任公司 | Network equipment fault detection method based on big data |
CN116382968B (en) * | 2023-06-05 | 2023-08-18 | 苏州浪潮智能科技有限公司 | Fault detection method and device for external equipment |
CN116382968A (en) * | 2023-06-05 | 2023-07-04 | 苏州浪潮智能科技有限公司 | Fault detection method and device for external equipment |
CN116737396A (en) * | 2023-08-14 | 2023-09-12 | 苏州浪潮智能科技有限公司 | Method, device, electronic equipment and storage medium for configuring maintainability of server |
CN116737396B (en) * | 2023-08-14 | 2023-11-03 | 苏州浪潮智能科技有限公司 | Method, device, electronic equipment and storage medium for configuring maintainability of server |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109947596A (en) | PCIE device failure system delay machine processing method, device and associated component | |
KR101574451B1 (en) | Imparting durability to a transactional memory system | |
CN104685474B (en) | For the method for handling not repairable EMS memory error and non-transient processor readable medium | |
WO2019196199A1 (en) | Method and device for processing bad tracks of disk and computer storage medium | |
CN110008129B (en) | Reliability test method, device and equipment for storage timing snapshot | |
CN103198122B (en) | Restart the method and apparatus of memory database | |
CN106682162A (en) | Log management method and device | |
CN106603279A (en) | Disaster tolerance method and disaster tolerance system | |
CN109582502A (en) | Storage system fault handling method, device, equipment and readable storage medium storing program for executing | |
CN104216771B (en) | The method for restarting and device of software program | |
CN108768793A (en) | A kind of storage dual-active link failure test method and device | |
CN109753378A (en) | A kind of partition method of memory failure, device, system and readable storage medium storing program for executing | |
CN108776579A (en) | A kind of distributed storage cluster expansion method, device, equipment and storage medium | |
CN107391307A (en) | The method of testing and device of storage area network storage device snapshot functions | |
CN108958965A (en) | A kind of BMC monitoring can restore the method, device and equipment of ECC error | |
CN111475335A (en) | Method, system, terminal and storage medium for fast recovery of database | |
CN103678608A (en) | Log management method and device | |
CN104407806A (en) | Method and device for revising hard disk information of redundant array group of independent disk (RAID) | |
WO2024077863A1 (en) | Recovery method for all-flash storage system, and related apparatus | |
CN106407385A (en) | Data management method and system, and equipment | |
CN111292796B (en) | RAID damage detailed information acquisition method, system, terminal and storage medium | |
CN115391106A (en) | Method, system and device for pooling backup resources | |
CN109189615A (en) | A kind of delay machine treating method and apparatus | |
CN110968456A (en) | Method and device for processing fault disk in distributed storage system | |
CN114816806A (en) | Container availability verification method and device, computer equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20190628 |