CN106411643B - BMC detection method and device - Google Patents

Info

Publication number
CN106411643B
Authority
CN
China
Prior art keywords
bmc
state
server
detecting
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610841255.XA
Other languages
Chinese (zh)
Other versions
CN106411643A (en)
Inventor
于延宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
New H3C Information Technologies Co Ltd
Original Assignee
New H3C Information Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by New H3C Information Technologies Co Ltd
Priority to CN201610841255.XA
Publication of CN106411643A
Application granted
Publication of CN106411643B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/08 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0805 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability
    • H04L43/0817 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability by checking functioning
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/08 Configuration management of networks or network elements
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/08 Configuration management of networks or network elements
    • H04L41/085 Retrieval of network configuration; Tracking network configuration history
    • H04L41/0853 Retrieval of network configuration; Tracking network configuration history by actively collecting configuration information or by backing up configuration information
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/50 Testing arrangements

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Environmental & Geological Engineering (AREA)
  • Debugging And Monitoring (AREA)
  • Computer And Data Communications (AREA)
  • Stored Programmes (AREA)

Abstract

The application provides a BMC detection method and a BMC detection device. The method comprises the following steps: detecting the state of the BMC through a communication connection established between an operating system and the BMC; when a state change of the BMC is detected, determining state change information of the BMC; and notifying the state change information of the BMC. The invention can reduce the consumption of human resources.

Description

BMC detection method and device
Technical Field
The present application relates to the field of communications technologies, and in particular, to a BMC detection method and apparatus.
Background
A BMC (Baseboard Management Controller) is the management unit of a server and is used for monitoring and managing the server. Because a server with a BMC may be placed in a separate machine room, if the BMC of the server becomes abnormal, only professional technicians can find the abnormal BMC, working through the serial server configured in the machine room, whose environment is complex. Other personnel cannot perceive whether the BMC is abnormal and therefore cannot repair it in time, so the server cannot be monitored and managed, and its performance is affected.
Disclosure of Invention
To address the shortcomings of the prior art, the invention provides a BMC detection method and a BMC detection device.
The application provides a BMC detection method applied to a network device, wherein the network device establishes a communication connection with a BMC through a running operating system, and the method comprises the following steps:
detecting the state of the BMC through the communication connection established between the operating system and the BMC;
when a state change of the BMC is detected, determining state change information of the BMC; and
notifying the state change information of the BMC.
The present application further provides a BMC detection apparatus applied to a network device, wherein the network device establishes a communication connection with a BMC through a running operating system, and the apparatus includes:
a detection unit, configured to detect the state of the BMC through the communication connection established between the operating system and the BMC;
a determining unit, configured to determine state change information of the BMC when a state change of the BMC is detected; and
a notification unit, configured to notify the state change information of the BMC.
With the BMC detection method and device, no dedicated administrator needs to enter a complex machine room environment to detect the BMC state of a server. The BMC state is detected through the communication connection established between the operating system and the BMC, and when a change in the BMC state is detected, the state change information is notified. This reduces the cost and complexity of resolving BMC failures and reduces the consumption of human resources.
Drawings
FIG. 1 is a schematic diagram of an SFC network to which a BMC detection method is applied in an embodiment of the present application;
FIG. 2 is a schematic flow chart of a BMC detection method in an embodiment of the present application;
FIG. 3 is a schematic diagram of a logic structure of a BMC detection device in an embodiment of the present application;
FIG. 4 is a schematic hardware architecture diagram of a network device where the BMC detection apparatus is located in an embodiment of the present application.
Detailed Description
To make the technical solutions and advantages of the present application clearer, they are described in further detail below with reference to the accompanying drawings.
To solve the problems in the prior art, the application provides a BMC detection method and a BMC detection device.
FIG. 1 shows a schematic diagram of a network to which the BMC detection method provided by the present application is applied. The network includes a plurality of servers S101, S102, and S103, each of which has a BMC. In one example, the servers S101 and S102 may be located in a machine room environment, while S103, located outside the machine room environment, has a BMC management module. S103 may establish a communication connection with the BMCs of the servers S101 and S102 through an operating system (OS) run by the BMC management module.
FIG. 2 shows a schematic processing flow of the BMC detection method provided in the present application. The BMC detection method is applicable to a network device, which may be a server or another device. The network device establishes a communication connection with a BMC through a running operating system, and the method includes the following steps:
Step 201, detecting the state of the BMC through the communication connection established between the operating system and the BMC;
In this embodiment, a connection may be established between the operating system running in the server and the BMCs of the servers to perform local communication. In one embodiment, the operating system running in the server may have an agent, and the communication connection with the BMC of each server may be established via the agent of the operating system. The connections established with the BMCs of the servers include a connection with the BMC of the server itself and connections with the BMCs of other servers, and the server may communicate with the BMC of each server over the established connections.
After the server is connected with the BMC through the operating system, when an operation instruction for detecting the BMC state is received, state detection of the corresponding BMC can be started in response to the operation instruction. In this embodiment of the present application, the operation instruction may be triggered under various conditions. For example, when it is monitored that a BMC has been restarted or upgraded, the operation instruction can be triggered so that state detection of the restarted or upgraded BMC starts automatically. Alternatively, when an administrator needs to monitor the state of a BMC, the administrator may manually trigger an operation instruction for detecting the BMC state, so as to start state detection of the BMC to be detected.
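The two trigger paths described above (automatic on restart or upgrade, and manual by an administrator) could be modeled roughly as follows. This is only an illustrative sketch: the names `Trigger`, `DetectInstruction`, `instruction_for_event`, and `manual_instruction`, as well as the event strings, are assumptions and do not appear in the application.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Trigger(Enum):
    """Why a state-detection instruction was issued (hypothetical labels)."""
    BMC_RESTARTED = "bmc_restarted"
    BMC_UPGRADED = "bmc_upgraded"
    MANUAL = "manual"


@dataclass(frozen=True)
class DetectInstruction:
    """An operation instruction asking for state detection of one BMC."""
    bmc_id: str
    trigger: Trigger


# Monitored events that automatically trigger detection (assumed event names).
_EVENT_TO_TRIGGER = {
    "restart": Trigger.BMC_RESTARTED,
    "upgrade": Trigger.BMC_UPGRADED,
}


def instruction_for_event(bmc_id: str, event: str) -> Optional[DetectInstruction]:
    """Return an instruction if the monitored event should start detection."""
    trigger = _EVENT_TO_TRIGGER.get(event)
    if trigger is None:
        return None  # other events do not trigger detection in this sketch
    return DetectInstruction(bmc_id, trigger)


def manual_instruction(bmc_id: str) -> DetectInstruction:
    """Instruction issued when an administrator requests detection by hand."""
    return DetectInstruction(bmc_id, Trigger.MANUAL)
```

Keeping the trigger on the instruction lets later stages record why detection started without changing how detection itself runs.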
In this embodiment, detecting the BMC state includes:
detecting a change in the firmware version run by the BMC;
detecting a change in the port used to access the BMC;
detecting a change in the IP address of the BMC;
detecting an abnormal change in the IPMI (Intelligent Platform Management Interface) process state of the BMC;
detecting an abnormal change in the network state of the BMC; and the like.
Specifically, when the state of the BMC is detected, if the firmware version run by the BMC is found to have been upgraded from a lower version to a higher version, for example from version V2 to version V3, and the upgrade succeeded, it may be determined that the firmware version run by the BMC has changed (from V2 to V3). If the firmware version after the upgrade is found to be lower than the version before the upgrade, it may be determined that the firmware upgrade failed. For example, if the currently running firmware version is V2 and, after the upgrade is performed, the BMC is found to be running version V1, the firmware has undergone a version rollback, and the BMC firmware upgrade is determined to have failed. Alternatively, when an upgrade error such as an upgrade interruption occurs during the upgrade, it may also be determined that the BMC firmware upgrade failed. When the firmware upgrade fails, the configuration file of the firmware version that failed to upgrade may have changed, so it may also be determined that the firmware version run by the BMC has changed.
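The firmware cases above (successful upgrade, version rollback, interrupted upgrade) can be illustrated with a short classification sketch. The `V<number>` label format, the function names, and the result strings are assumptions made for illustration; the application does not specify how versions are encoded, and the "unchanged" case is added here only to make the function total.

```python
def parse_version(label: str) -> int:
    """Parse a label such as 'V2' into 2 (the 'V<number>' format is assumed)."""
    return int(label.lstrip("Vv"))


def classify_upgrade(before: str, after: str, interrupted: bool = False) -> str:
    """Classify the outcome of a firmware upgrade attempt.

    before/after are the firmware versions reported by the BMC before and
    after the upgrade; interrupted marks an error during the upgrade itself.
    """
    if interrupted:
        return "upgrade_failed_interrupted"
    old, new = parse_version(before), parse_version(after)
    if new > old:
        return "upgraded"
    if new < old:
        return "upgrade_failed_rollback"
    return "unchanged"  # not covered by the description; assumed here
```

With this sketch, the V2 to V3 example maps to `"upgraded"` and the V2 to V1 example maps to `"upgrade_failed_rollback"`.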
If a change in the port used to access the BMC is detected, for example from Port1 to Port2, it may be determined that the port used to access the BMC has changed (the access port becomes Port2).
If a change in the IP address of the BMC is detected, for example from 1.1.1.1 to 1.1.1.2, it may be determined that the IP address of the BMC has changed (the IP address becomes 1.1.1.2).
If problems such as the IPMI process of the BMC restarting in a loop or rejecting commands are detected, this may indicate an abnormal change in the BMC. For example, if a loop restart of the IPMI process caused by a missing BMC configuration file is detected, it may be determined that the IPMI process state has changed abnormally.
When detecting the network state of the BMC, it may be checked whether the IP address of the BMC has become a default IP address. For example, if the BMC requests an IP address from a DHCP (Dynamic Host Configuration Protocol) server and the DHCP server allocates one, the BMC may not notify the allocated IP address but instead restore its IP address to a default IP address. In that case, it may be determined that the network state of the BMC has changed.
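One simple way to realize the checks described in step 201 is to compare two snapshots of the BMC state taken at different times. The sketch below assumes a hypothetical `BmcSnapshot` record and a hypothetical default address `DEFAULT_IP`; how the snapshot is actually read from the BMC (for example over IPMI) is outside the sketch and is not specified here.

```python
from dataclasses import dataclass, fields
from typing import Dict, Tuple

DEFAULT_IP = "192.168.1.100"  # hypothetical factory-default BMC address


@dataclass(frozen=True)
class BmcSnapshot:
    """State of one BMC at a point in time (field names are assumptions)."""
    firmware_version: str
    access_port: str
    ip_address: str
    ipmi_process_ok: bool


def diff_snapshots(old: BmcSnapshot, new: BmcSnapshot) -> Dict[str, Tuple[object, object]]:
    """Return {field: (old_value, new_value)} for every field that changed."""
    changes = {}
    for f in fields(BmcSnapshot):
        before, after = getattr(old, f.name), getattr(new, f.name)
        if before != after:
            changes[f.name] = (before, after)
    return changes


def network_state_abnormal(snapshot: BmcSnapshot, default_ip: str = DEFAULT_IP) -> bool:
    """Treat a BMC that fell back to the default address as abnormal."""
    return snapshot.ip_address == default_ip
```

For the examples in the text, comparing a snapshot with `Port1` and `1.1.1.1` against one with `Port2` and `1.1.1.2` reports exactly the `access_port` and `ip_address` fields as changed.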
Step 202, when a state change of the BMC is detected, determining the state change information of the BMC;
The state change information of the BMC may include the reason for the state change, for example:
If the change is that the firmware version run by the BMC has been upgraded from a lower version to a higher version, for example from version V2 to version V3, and the upgrade succeeded, the state change information of the BMC may include: information that the firmware version run by the BMC has changed (upgraded from V2 to V3).
If the change is that the firmware version after the upgrade is lower than the version before the upgrade, it may be determined that the firmware upgrade failed. For example, if the currently running firmware version is V2 and, after the upgrade is performed, the BMC is found to be running version V1, the state change information of the BMC may include: information that the upgrade of the firmware version run by the BMC failed (version rollback: from V2 to V1).
If the change is an upgrade error caused by an interruption during the upgrade, the state change information of the BMC may include: information that the upgrade of the firmware version run by the BMC failed (the firmware upgrade was interrupted).
If the change is that the port used to access the BMC has changed, for example from Port1 to Port2, the state change information of the BMC may include: information that the port used to access the BMC has changed (the access port becomes Port2).
If the change is that the IP address of the BMC has changed, for example from 1.1.1.1 to 1.1.1.2, the state change information of the BMC may include: information that the IP address of the BMC has changed (the IP address becomes 1.1.1.2).
If the change is that the IPMI process restarts in a loop because the configuration file is missing, the state change information of the BMC may include: information that the IPMI process state of the BMC has changed abnormally (configuration file missing).
If the change is that the network state of the BMC is abnormal because the IP address of the BMC has changed to the default IP address, the state change information of the BMC may include: information that the network state of the BMC has changed abnormally (the IP address is a default IP address).
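The cases listed for step 202 amount to mapping each kind of change to a message that carries the reason in parentheses. A minimal sketch of that mapping follows; the kind keys, the `StateChangeInfo` type, and the exact message wording are assumptions chosen to mirror the examples above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StateChangeInfo:
    """State change information: the kind of change plus a readable message."""
    kind: str
    message: str


# Message templates keyed by change kind (keys and wording are assumptions).
_TEMPLATES = {
    "firmware_upgraded": "firmware version changed (upgraded from {old} to {new})",
    "firmware_rollback": "firmware upgrade failed (version rollback: from {old} to {new})",
    "firmware_interrupted": "firmware upgrade failed (upgrade interrupted)",
    "port_changed": "access port changed (access port becomes {new})",
    "ip_changed": "IP address changed (IP address becomes {new})",
    "ipmi_abnormal": "IPMI process state changed abnormally ({reason})",
    "network_abnormal": "network state changed abnormally (the IP address is a default IP address)",
}


def build_change_info(kind: str, **details: str) -> StateChangeInfo:
    """Fill in the template for `kind`; raises KeyError for an unknown kind."""
    return StateChangeInfo(kind, _TEMPLATES[kind].format(**details))
```

For example, `build_change_info("ipmi_abnormal", reason="configuration file missing")` yields a message naming the missing configuration file as the reason.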
Further, if the detected change in the BMC state is an abnormal change in the IPMI process state, a failure to upgrade the firmware version run by the BMC, or an abnormal change in the network state of the BMC, it may be determined that the BMC has failed. To make the method more useful, when the BMC fails, a solution may also be generated according to the reason for the state change of the failed BMC, in which case the state change information may include the solution, for example:
when the reason for the BMC failure is that the running firmware version failed to upgrade because of a version rollback or other reasons, the generated solution may be: re-download a higher or stable firmware version and upgrade again;
when the reason for the BMC failure is that the configuration file is missing, the generated solution may be: download the corresponding configuration file;
when the reason for the BMC failure is that a parameter configuration error causes the IPMI process to reject commands, the generated solution may be: report the incorrectly configured parameters and the correct parameters;
when the reason for the BMC failure is an IP address error, the generated solution may be: notify the BMC of the correct IP address, or re-apply for allocation of an IP address.
Furthermore, the state change information of the BMC may include both the reason for the state change and the corresponding solution, so that after learning the reason, an administrator can choose, according to actual needs, how to repair the failed BMC in time based on the solution. This further reduces the administrator's workload and makes the repair result more accurate.
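The cause-to-solution pairs above lend themselves to a simple lookup table. The following is a sketch under the assumption that failure causes are represented by short string keys; the keys and the `attach_solution` helper are hypothetical, while the solution texts paraphrase the four cases in the description.

```python
from typing import Dict

# Failure cause -> suggested solution, following the four cases described above.
SOLUTIONS: Dict[str, str] = {
    "firmware_upgrade_failed": "Re-download a higher or stable firmware version and upgrade again.",
    "config_file_missing": "Download the corresponding configuration file.",
    "parameter_config_error": "Report the incorrectly configured parameters and the correct parameters.",
    "ip_address_error": "Notify the BMC of the correct IP address, or re-apply for IP address allocation.",
}


def attach_solution(cause: str) -> Dict[str, str]:
    """Build state change information holding the cause and, if known, a solution."""
    info = {"cause": cause}
    solution = SOLUTIONS.get(cause)
    if solution is not None:
        info["solution"] = solution
    return info
```

A cause without a known solution still produces state change information, which matches the "reason and/or solution" wording used later in the claims.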
Step 203, notifying the state change information of the BMC.
After the state change information of the BMC is determined, it may be notified to an administrator. For example, the state change information is reported to the BMC management module of the server to notify the administrator that the state of the BMC has changed or that the BMC has failed, so that the administrator can learn of the BMC state change event in time and handle it accordingly, for example by consulting an FAQ (Frequently Asked Questions).
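Step 203 could be sketched as formatting the state change information and handing it to the BMC management module. In the sketch below, `BmcManagementModule` is a hypothetical in-memory stand-in that only stores reports; the real module's interface is not described in the application.

```python
from typing import Dict, List


def format_notification(bmc_id: str, info: Dict[str, str]) -> str:
    """Render state change information as one line for the administrator."""
    parts = [f"BMC {bmc_id}: {info['cause']}"]
    if "solution" in info:
        parts.append(f"suggested fix: {info['solution']}")
    return "; ".join(parts)


class BmcManagementModule:
    """Hypothetical stand-in for the BMC management module; it only stores reports."""

    def __init__(self) -> None:
        self.reports: List[str] = []

    def report(self, message: str) -> None:
        self.reports.append(message)


def notify(bmc_id: str, info: Dict[str, str], module: BmcManagementModule) -> None:
    """Report the formatted state change information to the management module."""
    module.report(format_notification(bmc_id, info))
```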
Therefore, with the method provided by the application, no dedicated administrator needs to enter a complex machine room environment to detect the BMC state of a server, which reduces the cost and complexity of resolving BMC failures and reduces the consumption of human resources. Moreover, after a BMC failure is determined, a solution can be generated according to the detected reason for the failure, so that an administrator can repair the failed BMC in time according to the solution. This further reduces the administrator's workload and makes the repair result more accurate.
The present application further provides a BMC detection device. FIG. 3 is a schematic structural diagram of the BMC detection device. The device can be applied to a network device that establishes a communication connection with a BMC through a running operating system, and the BMC detection device may include:
a detection unit 301, configured to detect the state of the BMC through the communication connection established between the operating system and the BMC;
a determining unit 302, configured to determine state change information of the BMC when a state change of the BMC is detected; and
a notification unit 303, configured to notify the state change information of the BMC.
Further, the state change information of the BMC determined by the determining unit 302 may include one or more of the following items:
information that the firmware version run by the BMC has changed;
information that the port used to access the BMC has changed;
information that the IP address of the BMC has changed;
information that the upgrade of the firmware version run by the BMC has failed;
information that the Intelligent Platform Management Interface (IPMI) process state of the BMC has changed abnormally;
information that the network state of the BMC has changed abnormally.
Further, the detection unit 301 may be further configured to:
when an operation instruction for detecting the BMC state is monitored, start detecting the BMC state in response to the operation instruction.
Further, the detection unit 301 may be further configured to:
when it is monitored that the BMC has been restarted or upgraded, trigger the operation instruction for detecting the BMC state.
Further, when the determining unit 302 determines, according to the detected state change of the BMC, that the BMC has failed, the state change information of the BMC includes the reason for the BMC failure and/or a solution generated according to that reason.
The specific processing flow of the BMC detection apparatus applied to the network device in the present application may be consistent with the processing flow of the BMC detection method applied to the network device, and is not described again here.
The device can be implemented in software or hardware. A schematic diagram of the hardware architecture of the network device where the BMC detection device is located is shown in FIG. 4. The basic hardware environment includes a central processing unit (CPU) 401, a forwarding chip 402, a memory 403, and other hardware 404, where the memory 403 stores machine-readable instructions, and the CPU 401 reads and executes the machine-readable instructions to perform the functions of the units in FIG. 3.
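For a software implementation, the three units of FIG. 3 could be composed as small classes, as in the sketch below. This is an assumption about one possible structure, not the applicant's implementation: the class names, the `read_state` callable that returns the current BMC state as a dictionary, and the `sink` callable that receives notifications are all hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

State = Dict[str, str]
Changes = Dict[str, Tuple[Optional[str], str]]


class DetectionUnit:
    """Reads the BMC state and reports fields that changed since the last read."""

    def __init__(self, read_state: Callable[[], State]) -> None:
        self._read_state = read_state
        self._last: Optional[State] = None

    def detect(self) -> Changes:
        current = self._read_state()
        previous, self._last = self._last, current
        if previous is None:
            return {}  # first read only establishes a baseline
        return {k: (previous.get(k), v) for k, v in current.items() if previous.get(k) != v}


class DeterminingUnit:
    """Turns detected changes into state change information strings."""

    def determine(self, changes: Changes) -> List[str]:
        return [f"{k} changed from {old} to {new}" for k, (old, new) in sorted(changes.items())]


class NotificationUnit:
    """Delivers state change information to a sink such as a management module."""

    def __init__(self, sink: Callable[[str], None]) -> None:
        self._sink = sink

    def notify(self, infos: List[str]) -> None:
        for info in infos:
            self._sink(info)


class BmcDetectionDevice:
    """Wires the three units together; run_once performs one detection pass."""

    def __init__(self, read_state: Callable[[], State], sink: Callable[[str], None]) -> None:
        self.detection = DetectionUnit(read_state)
        self.determining = DeterminingUnit()
        self.notification = NotificationUnit(sink)

    def run_once(self) -> int:
        infos = self.determining.determine(self.detection.detect())
        self.notification.notify(infos)
        return len(infos)
```

Passing the state reader and the notification sink in as callables keeps the units testable without a real BMC connection.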
The above description only illustrates the preferred embodiments of the present invention and is not to be construed as limiting the invention. Any modifications, equivalents, improvements, and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (10)

1. A baseboard management controller (BMC) detection method, applied to a server, wherein the server establishes a communication connection with a BMC in the server through a running operating system, the method comprising:
detecting the state of the BMC through the communication connection established between the operating system and the BMC;
when a state change of the BMC is detected, determining state change information of the BMC; and
notifying a BMC management module of the server of the state change information of the BMC.
2. The method of claim 1, wherein the state change information of the BMC comprises one or more of:
information that the firmware version run by the BMC has changed;
information that the port used to access the BMC has changed;
information that the IP address of the BMC has changed;
information that the upgrade of the firmware version run by the BMC has failed;
information that the Intelligent Platform Management Interface (IPMI) process state of the BMC has changed abnormally;
information that the network state of the BMC has changed abnormally.
3. The method of claim 1, wherein detecting the state of the BMC through the communication connection established between the operating system and the BMC comprises:
when an operation instruction for detecting the BMC state is monitored, starting detection of the BMC state in response to the operation instruction.
4. The method of claim 3, wherein the operation instruction for detecting the BMC state is triggered when a restart or upgrade of the BMC is monitored.
5. The method of claim 1, wherein when it is determined, based on the detected state change of the BMC, that the BMC has failed, the state change information of the BMC comprises a cause of the BMC failure and/or a solution generated based on the cause of the BMC failure.
6. A BMC detection apparatus, applied to a server, wherein the server establishes a communication connection with a BMC in the server through a running operating system, the apparatus comprising:
a detection unit, configured to detect the state of the BMC through the communication connection established between the operating system and the BMC;
a determining unit, configured to determine state change information of the BMC when a state change of the BMC is detected; and
a notification unit, configured to notify a BMC management module of the server of the state change information of the BMC.
7. The apparatus of claim 6, wherein the state change information of the BMC determined by the determining unit comprises one or more of:
information that the firmware version run by the BMC has changed;
information that the port used to access the BMC has changed;
information that the IP address of the BMC has changed;
information that the upgrade of the firmware version run by the BMC has failed;
information that the Intelligent Platform Management Interface (IPMI) process state of the BMC has changed abnormally;
information that the network state of the BMC has changed abnormally.
8. The apparatus of claim 6, wherein the detection unit is further configured to:
when an operation instruction for detecting the BMC state is monitored, start detecting the BMC state in response to the operation instruction.
9. The apparatus of claim 8, wherein the detection unit is further configured to:
when it is monitored that the BMC has been restarted or upgraded, trigger the operation instruction for detecting the BMC state.
10. The apparatus of claim 6, wherein when the determining unit determines, according to the detected state change of the BMC, that the BMC has failed, the state change information of the BMC comprises a cause of the BMC failure and/or a solution generated according to the cause of the BMC failure.
CN201610841255.XA 2016-09-22 2016-09-22 BMC detection method and device Active CN106411643B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610841255.XA CN106411643B (en) 2016-09-22 2016-09-22 BMC detection method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610841255.XA CN106411643B (en) 2016-09-22 2016-09-22 BMC detection method and device

Publications (2)

Publication Number Publication Date
CN106411643A (en) 2017-02-15
CN106411643B (en) 2019-12-06

Family

ID=57997467

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610841255.XA Active CN106411643B (en) 2016-09-22 2016-09-22 BMC detection method and device

Country Status (1)

Country Link
CN (1) CN106411643B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109240851A (en) * 2018-08-24 2019-01-18 郑州云海信息技术有限公司 A kind of autonomous type realization self-healing method and system of batch BMC
CN109766110B (en) * 2018-12-27 2022-05-31 联想(北京)有限公司 Control method, substrate management controller and control system
CN111124509B (en) * 2019-11-29 2021-07-06 苏州浪潮智能科技有限公司 Server starting method and device

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104375930A (en) * 2013-08-13 2015-02-25 鸿富锦精密工业(深圳)有限公司 Firmware detection system and method
CN104809044A (en) * 2014-01-24 2015-07-29 鸿富锦精密工业(深圳)有限公司 Method and system for detecting starting state of baseplate management controller

Also Published As

Publication number Publication date
CN106411643A (en) 2017-02-15

Similar Documents

Publication Publication Date Title
US10491671B2 (en) Method and apparatus for switching between servers in server cluster
CN105808394B (en) Server self-healing method and device
CN105933407B (en) method and system for realizing high availability of Redis cluster
WO2015169199A1 (en) Anomaly recovery method for virtual machine in distributed environment
CN105589712B (en) BMC module update method and device
EP2518627B1 (en) Partial fault processing method in computer system
CN107480014A (en) A kind of High Availabitity equipment switching method and device
CN108108255A (en) The detection of virtual-machine fail and restoration methods and device
CN107729213B (en) Background task monitoring method and device
US9210059B2 (en) Cluster system
CN106060859A (en) AP (Access Point) fault detection and restoration method and device
CN106411643B (en) BMC detection method and device
CN114840495B (en) Method, storage medium and equipment for preventing brain fracture of database cluster
CN107491344B (en) Method and device for realizing high availability of virtual machine
US8677323B2 (en) Recording medium storing monitoring program, monitoring method, and monitoring system
KR102262942B1 (en) Gateway self recovery method by the wireless bridge of wireless network system system
CN104268026A (en) Monitoring and management method and device for embedded system
Cisco Operational Traps

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 310052 11th Floor, 466 Changhe Road, Binjiang District, Hangzhou City, Zhejiang Province

Applicant after: Huashan Information Technology Co., Ltd.

Address before: 310052 11th Floor, 466 Changhe Road, Binjiang District, Hangzhou City, Zhejiang Province

Applicant before: Hangzhou Kun Hai Information Technology Co., Ltd

CB02 Change of applicant information

Address after: 310052 11th Floor, 466 Changhe Road, Binjiang District, Hangzhou City, Zhejiang Province

Applicant after: Xinhua Sanxin Information Technology Co., Ltd.

Address before: 310052 11th Floor, 466 Changhe Road, Binjiang District, Hangzhou City, Zhejiang Province

Applicant before: Huashan Information Technology Co., Ltd.

GR01 Patent grant