Detailed Description
To make the objectives, technical solutions, and advantages of the present application clearer, the application is described in further detail below with reference to the accompanying drawings.
To solve the problems in the prior art, the present application provides a BMC (Baseboard Management Controller) detection method and a BMC detection device.
Fig. 1 is a schematic diagram of a network to which the BMC detection method provided by the present application is applied. The network includes a plurality of servers S101, S102, and S103, each of which has a BMC. In one example, servers S101 and S102 are located in a machine room environment, while server S103, located outside the machine room environment, has a BMC management module. Server S103 may establish communication connections with the BMCs of servers S101 and S102 through an operating system (OS) run by the BMC management module.
Fig. 2 is a schematic flowchart of the BMC detection method provided by the present application. The method is applicable to a network device, which may be a server or another device and which establishes a communication connection with a BMC through the operating system it runs. The method includes the following steps:
Step 201: detect the state of the BMC through the communication connection established between the operating system and the BMC.
In this embodiment, the operating system running on the server may establish connections with the BMCs of the servers for local communication. In one embodiment, the operating system running on the server may have an agent, and the communication connection with the BMC of each server may be established via that agent. The connections established with the BMCs of the servers include a connection with the BMC of the server itself and connections with the BMCs of other servers, and the server may communicate with the BMC of each server over the established connections.
After the server has connected to the BMC through the operating system, when an operation instruction for detecting the BMC state is received, state detection of the corresponding BMC may be started in response to that instruction. In this embodiment, the operation instruction may be triggered under various conditions. For example, when it is detected that a BMC has been restarted or upgraded, the operation instruction may be triggered so that state detection of the restarted or upgraded BMC starts automatically; alternatively, when an administrator needs to monitor the state of a BMC, the administrator may manually trigger the operation instruction to start state detection of that BMC.
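The two trigger paths described above (automatic on restart or upgrade, manual on administrator request) can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the application: the names `TriggerSource`, `OperationInstruction`, `on_bmc_event`, `on_admin_request`, and `handle`, as well as the event strings, are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class TriggerSource(Enum):
    BMC_RESTARTED = "bmc_restarted"
    BMC_UPGRADED = "bmc_upgraded"
    ADMIN_REQUEST = "admin_request"


@dataclass(frozen=True)
class OperationInstruction:
    """Hypothetical instruction asking for state detection of one BMC."""
    bmc_id: str
    source: TriggerSource


def on_bmc_event(bmc_id: str, event: str) -> Optional[OperationInstruction]:
    """Automatic path: a monitored restart or upgrade triggers an instruction."""
    automatic = {
        "restart": TriggerSource.BMC_RESTARTED,
        "upgrade": TriggerSource.BMC_UPGRADED,
    }
    source = automatic.get(event)
    if source is None:
        return None  # other events do not trigger detection in this sketch
    return OperationInstruction(bmc_id, source)


def on_admin_request(bmc_id: str) -> OperationInstruction:
    """Manual path: the administrator asks for detection of a given BMC."""
    return OperationInstruction(bmc_id, TriggerSource.ADMIN_REQUEST)


def handle(instruction: OperationInstruction,
           start_detection: Callable[[str], None]) -> None:
    """Respond to an instruction by starting detection for the named BMC."""
    start_detection(instruction.bmc_id)
```

Both paths produce the same instruction type, so the detection logic does not need to know whether it was started automatically or by an administrator.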
In this embodiment, detecting the state of the BMC includes:
detecting a change in the firmware version run by the BMC;
detecting a change in the port used to access the BMC;
detecting a change in the IP address of the BMC;
detecting an abnormal change in the Intelligent Platform Management Interface (IPMI) process state of the BMC; and
detecting an abnormal change in the network state of the BMC, among others.
Specifically, when the state of the BMC is detected, if the firmware version run by the BMC is found to have been upgraded from a lower version to a higher version, for example from V2 to V3, and the upgrade succeeded, it may be determined that the firmware version run by the BMC has changed (from V2 to V3). If the firmware version detected after the upgrade is lower than the version before the upgrade, it may be determined that the firmware upgrade failed. For example, if the currently running firmware version is V2 and, after the upgrade is performed, the BMC reports version V1, the firmware has rolled back, and the BMC firmware upgrade is determined to have failed. Likewise, if an upgrade error such as an interruption occurs during the upgrade, the BMC firmware upgrade may also be determined to have failed. Because the configuration file of a BMC whose firmware upgrade failed may have changed, a failed upgrade may also be treated as a change in the firmware version run by the BMC.
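The firmware cases above can be expressed as a small classification function. The sketch below assumes version labels of the form "V2" or "V3", as in the examples; the names `FirmwareResult`, `parse_version`, `classify_firmware`, and `firmware_changed` are hypothetical and are not taken from the application.

```python
from enum import Enum


class FirmwareResult(Enum):
    UNCHANGED = "unchanged"
    UPGRADED = "upgraded"
    ROLLED_BACK = "upgrade failed: version rollback"
    INTERRUPTED = "upgrade failed: upgrade interrupted"


def parse_version(label: str) -> int:
    """Parse a label such as 'V2' (assumed format) into an integer."""
    return int(label.lstrip("Vv"))


def classify_firmware(before: str, after: str,
                      interrupted: bool = False) -> FirmwareResult:
    """Classify the outcome of an upgrade from the versions seen before and after."""
    if interrupted:
        return FirmwareResult.INTERRUPTED
    old, new = parse_version(before), parse_version(after)
    if new > old:
        return FirmwareResult.UPGRADED
    if new < old:
        return FirmwareResult.ROLLED_BACK
    return FirmwareResult.UNCHANGED


def firmware_changed(result: FirmwareResult) -> bool:
    """A failed upgrade also counts as a change, as the text explains."""
    return result is not FirmwareResult.UNCHANGED
```

For instance, `classify_firmware("V2", "V1")` yields the rollback case, and `firmware_changed` reports it as a change even though the upgrade failed.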
If a change in the port used to access the BMC is detected, for example from Port1 to Port2, it may be determined that the access port of the BMC has changed (the access port is now Port2).
If a change in the IP address of the BMC is detected, for example from 1.1.1.1 to 1.1.1.2, it may be determined that the IP address of the BMC has changed (the IP address is now 1.1.1.2).
If problems such as the IPMI process of the BMC restarting in a loop or rejecting commands are detected, this indicates an abnormal change in the IPMI process state of the BMC. For example, if the IPMI process is detected to be restarting in a loop because the configuration file of the BMC is missing, it may be determined that the IPMI process state has changed abnormally.
When the network state of the BMC is detected, it may be checked whether the IP address of the BMC has become a default IP address. For example, if the BMC requests an IP address from a Dynamic Host Configuration Protocol (DHCP) server and the DHCP server allocates one, the BMC may not be notified of the allocated address and may instead revert to a default IP address; in that case, it may be determined that the network state of the BMC has changed.
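One way to picture the port, IP address, IPMI process, and network checks above is as a comparison between two snapshots of the BMC state. The sketch below is illustrative only: the `BmcSnapshot` fields, the restart-loop threshold, and the default address `192.0.2.1` (taken from a documentation address range) are all assumptions, since the application does not specify how these values are obtained.

```python
from dataclasses import dataclass
from typing import List

DEFAULT_BMC_IP = "192.0.2.1"   # assumed placeholder for the BMC's default address
RESTART_LOOP_THRESHOLD = 3     # assumed number of restarts that counts as a loop


@dataclass(frozen=True)
class BmcSnapshot:
    access_port: str
    ip_address: str
    ipmi_restart_count: int      # IPMI process restarts seen in the sampling window
    ipmi_rejects_commands: bool


def diff_snapshots(old: BmcSnapshot, new: BmcSnapshot,
                   default_ip: str = DEFAULT_BMC_IP) -> List[str]:
    """Return a description of each state change between two snapshots."""
    changes = []
    if new.access_port != old.access_port:
        changes.append(f"access port changed to {new.access_port}")
    if new.ip_address != old.ip_address:
        changes.append(f"IP address changed to {new.ip_address}")
    if (new.ipmi_restart_count >= RESTART_LOOP_THRESHOLD
            or new.ipmi_rejects_commands):
        changes.append("IPMI process state changed abnormally")
    if new.ip_address == default_ip and old.ip_address != default_ip:
        changes.append("network state changed abnormally (default IP address)")
    return changes
```

Note that a revert to the default address is reported twice in this sketch, once as an IP address change and once as an abnormal network state, which mirrors the way the text treats them as separate detection items.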
Step 202: when a state change of the BMC is detected, determine the state change information of the BMC.
The state change information of the BMC may include the reason for the state change, for example:
If the change is that the firmware version run by the BMC was upgraded from a lower version to a higher version, for example from V2 to V3, and the upgrade succeeded, the state change information of the BMC may include: information that the firmware version run by the BMC has changed (upgraded from V2 to V3).
If the change is that the firmware version after the upgrade is lower than the version before the upgrade, the firmware upgrade may be determined to have failed. For example, if the currently running firmware version is V2 and, after the upgrade is performed, the BMC reports version V1, the state change information of the BMC may include: information that the upgrade of the firmware version run by the BMC failed (version rollback from V2 to V1).
If the change is an upgrade error caused by an interruption during the upgrade, the state change information of the BMC may include: information that the upgrade of the firmware version run by the BMC failed (firmware upgrade interrupted).
If the change is that the port used to access the BMC changed, for example from Port1 to Port2, the state change information of the BMC may include: information that the access port of the BMC has changed (the access port is now Port2).
If the change is that the IP address of the BMC changed, for example from 1.1.1.1 to 1.1.1.2, the state change information of the BMC may include: information that the IP address of the BMC has changed (the IP address is now 1.1.1.2).
If the change is that the IPMI process restarts in a loop because the configuration file is missing, the state change information of the BMC may include: information that the IPMI process state of the BMC has changed abnormally (configuration file missing).
If the change is that the network state of the BMC is abnormal because the IP address of the BMC reverted to the default IP address, the state change information of the BMC may include: information that the network state of the BMC has changed abnormally (the IP address is the default IP address).
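The state change information listed above pairs a kind of change with a reason or detail. A minimal sketch of such a record follows; the type names `ChangeKind` and `StateChangeInfo`, the message wording, and the helper `example_records` are hypothetical and serve only to restate the examples from the text in structured form.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class ChangeKind(Enum):
    FIRMWARE_CHANGED = "firmware version run by the BMC changed"
    FIRMWARE_UPGRADE_FAILED = "firmware upgrade of the BMC failed"
    ACCESS_PORT_CHANGED = "port used to access the BMC changed"
    IP_ADDRESS_CHANGED = "IP address of the BMC changed"
    IPMI_ABNORMAL = "IPMI process state of the BMC changed abnormally"
    NETWORK_ABNORMAL = "network state of the BMC changed abnormally"


@dataclass(frozen=True)
class StateChangeInfo:
    kind: ChangeKind
    detail: str  # reason or specifics, as in the parenthetical notes of the text

    def describe(self) -> str:
        return f"{self.kind.value} ({self.detail})"


def example_records() -> List[StateChangeInfo]:
    """The example cases from the text, expressed as records."""
    return [
        StateChangeInfo(ChangeKind.FIRMWARE_CHANGED, "upgraded from V2 to V3"),
        StateChangeInfo(ChangeKind.FIRMWARE_UPGRADE_FAILED,
                        "version rollback from V2 to V1"),
        StateChangeInfo(ChangeKind.FIRMWARE_UPGRADE_FAILED,
                        "firmware upgrade interrupted"),
        StateChangeInfo(ChangeKind.ACCESS_PORT_CHANGED, "access port is now Port2"),
        StateChangeInfo(ChangeKind.IP_ADDRESS_CHANGED, "IP address is now 1.1.1.2"),
        StateChangeInfo(ChangeKind.IPMI_ABNORMAL, "configuration file missing"),
        StateChangeInfo(ChangeKind.NETWORK_ABNORMAL,
                        "IP address is the default IP address"),
    ]
```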
Further, if the detected change in the BMC state is an abnormal change in the IPMI process state, a failed upgrade of the firmware version run by the BMC, or an abnormal change in the network state of the BMC, it may be determined that the BMC has failed. When the BMC has failed, a solution may also be generated according to the reason for the state change of the failed BMC, in which case the state change information may include the solution, for example:
when the reason for the BMC failure is that the firmware upgrade failed because of a version rollback or another cause, the generated solution may be: download a higher or stable firmware version again and repeat the upgrade;
when the reason for the BMC failure is that the configuration file is missing, the generated solution may be: download the corresponding configuration file;
when the reason for the BMC failure is that a parameter configuration error caused the IPMI process to reject commands, the generated solution may be: report the incorrectly configured parameters together with the correct parameters;
when the reason for the BMC failure is an IP address error, the generated solution may be: notify the BMC of the correct IP address, or apply again for an IP address to be allocated.
Furthermore, the state change information of the BMC may include both the reason for the state change and the corresponding solution. After learning the reason for the state change, an administrator can then decide, according to actual needs, how to repair the failed BMC in time using the corresponding solution, which reduces the administrator's workload and makes the repair more accurate.
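The pairing of failure reasons with solutions described above amounts to a lookup from reason to suggested action. The following sketch assumes a fixed table; the names `FaultReason`, `SOLUTIONS`, `FaultReport`, and `build_report` are hypothetical, and a real system might generate solutions in some other way.

```python
from dataclasses import dataclass
from enum import Enum


class FaultReason(Enum):
    FIRMWARE_UPGRADE_FAILED = "firmware upgrade failed"
    CONFIG_FILE_MISSING = "configuration file missing"
    PARAMETER_MISCONFIGURED = "parameter configuration error"
    IP_ADDRESS_ERROR = "IP address error"


# Solutions paraphrased from the text; one entry per failure reason.
SOLUTIONS = {
    FaultReason.FIRMWARE_UPGRADE_FAILED:
        "download a higher or stable firmware version again and repeat the upgrade",
    FaultReason.CONFIG_FILE_MISSING:
        "download the corresponding configuration file",
    FaultReason.PARAMETER_MISCONFIGURED:
        "report the incorrectly configured parameters and the correct parameters",
    FaultReason.IP_ADDRESS_ERROR:
        "notify the BMC of the correct IP address or reapply for an IP address",
}


@dataclass(frozen=True)
class FaultReport:
    """State change information carrying both the reason and the solution."""
    reason: FaultReason
    solution: str


def build_report(reason: FaultReason) -> FaultReport:
    return FaultReport(reason, SOLUTIONS[reason])
```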
Step 203: notify the state change information of the BMC.
After the state change information of the BMC is determined, it may be notified to an administrator. For example, the state change information is reported to the BMC management module of the server to inform the administrator that the state of the BMC has changed or that the BMC has failed, so that the administrator learns of the state change event in time and can handle it accordingly, for example by consulting an FAQ (Frequently Asked Questions).
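The notification step can be sketched as formatting the state change information and handing it to whatever channel reaches the BMC management module. The application does not define a message format or transport, so the JSON payload, its field names, and the `send` callback below are all assumptions made for illustration.

```python
import json
from typing import Callable, Optional


def format_notification(bmc_id: str, change: str,
                        reason: Optional[str] = None,
                        solution: Optional[str] = None) -> str:
    """Serialize state change information as JSON (assumed format)."""
    payload = {"bmc": bmc_id, "change": change}
    if reason is not None:
        payload["reason"] = reason
    if solution is not None:
        payload["solution"] = solution
    return json.dumps(payload, sort_keys=True)


def notify(send: Callable[[str], None], bmc_id: str, change: str,
           reason: Optional[str] = None,
           solution: Optional[str] = None) -> None:
    """Report the information through `send`, a stand-in for the real channel."""
    send(format_notification(bmc_id, change, reason, solution))
```

Passing the channel in as a callback keeps the sketch independent of how the management module is actually reached.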
Therefore, with the method provided by the present application, no dedicated administrator needs to enter a complex machine room environment to check the BMC state of a server, which reduces the cost and complexity of resolving BMC faults and the human resources consumed. Moreover, after a BMC fault is determined, a solution can be generated from the detected reason for the fault, so that an administrator can repair the faulty BMC in time according to the solution, which further reduces the administrator's workload and makes the repair more accurate.
The present application further provides a BMC detection device. Fig. 3 is a schematic structural diagram of the BMC detection device. The device may be applied to a network device that establishes a communication connection with a BMC through the operating system it runs, and may include:
a detection unit 301, configured to detect the state of the BMC through the communication connection established between the operating system and the BMC;
a determining unit 302, configured to determine the state change information of the BMC when a state change of the BMC is detected;
a notification unit 303, configured to notify the state change information of the BMC.
Further, the state change information of the BMC determined by the determining unit 302 may include one or more of the following:
information that the firmware version run by the BMC has changed;
information that the port used to access the BMC has changed;
information that the IP address of the BMC has changed;
information that the upgrade of the firmware version run by the BMC has failed;
information that the Intelligent Platform Management Interface (IPMI) process state of the BMC has changed abnormally;
information that the network state of the BMC has changed abnormally.
Further, the detection unit 301 may be further configured to:
when an operation instruction for detecting the BMC state is detected, start detecting the state of the BMC in response to the operation instruction.
Further, the detection unit 301 may be further configured to:
when it is detected that the BMC has been restarted or upgraded, trigger the operation instruction for detecting the BMC state.
Further, when the determining unit 302 determines from the detected state change that the BMC has failed, the state change information of the BMC includes the reason for the BMC failure and/or a solution generated according to that reason.
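The division of the device into detection, determining, and notification units can be sketched as three small classes wired together. This is one possible software arrangement under stated assumptions, not the structure claimed by the application: the class names, the dictionary representation of BMC state, and the `read_state` and `send` callbacks are hypothetical.

```python
from typing import Callable, Dict, List, Optional


class DetectionUnit:
    """Reads the BMC state and reports which fields changed since the last read."""

    def __init__(self, read_state: Callable[[], Dict[str, str]]):
        self._read_state = read_state
        self._last: Optional[Dict[str, str]] = None

    def detect(self) -> Dict[str, str]:
        current = self._read_state()
        previous, self._last = self._last, current
        if previous is None:
            return {}  # first reading only establishes a baseline
        return {k: v for k, v in current.items() if previous.get(k) != v}


class DeterminingUnit:
    """Turns detected changes into state change information."""

    def determine(self, changes: Dict[str, str]) -> List[str]:
        return [f"{field} changed to {value}"
                for field, value in sorted(changes.items())]


class NotificationUnit:
    """Passes state change information to a notification channel."""

    def __init__(self, send: Callable[[str], None]):
        self._send = send

    def notify(self, infos: List[str]) -> None:
        for info in infos:
            self._send(info)


class BmcDetectionDevice:
    def __init__(self, detection: DetectionUnit, determining: DeterminingUnit,
                 notification: NotificationUnit):
        self.detection = detection
        self.determining = determining
        self.notification = notification

    def run_once(self) -> int:
        """Run one detect, determine, notify cycle; return the number of notices."""
        changes = self.detection.detect()
        if not changes:
            return 0
        infos = self.determining.determine(changes)
        self.notification.notify(infos)
        return len(infos)
```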
The specific processing flow of the BMC detection device applied to the network device is consistent with that of the BMC detection method applied to the network device, and is not described again here.
The device may be implemented in software or hardware. A schematic diagram of the hardware architecture of the network device in which the BMC detection device is located is shown in Fig. 4. The basic hardware environment includes a central processing unit (CPU) 401, a forwarding chip 402, a memory 403, and other hardware 404. The memory 403 stores machine-readable instructions, and the CPU 401 reads and executes them to perform the functions of the units in Fig. 3.
The above description covers only preferred embodiments of the present invention and is not to be construed as limiting the invention; any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the scope of protection of the present invention.