Disclosure of Invention
The invention aims to overcome the deficiencies of the prior art by providing a power supply method and a power supply system for a redundant power supply of a server CPU, which give stronger assurance of reliable operation of the server mainboard and effectively reduce the high maintenance cost caused by failure of the server CPU power supply.
To achieve this purpose, the invention adopts the following technical scheme: a power supply method for a redundant power supply of a server CPU, the method comprising:
establishing a communication relation between the CPU and the power supply unit;
the power supply unit serves as a power supply end to supply power to the CPU.
The further technical scheme is as follows: the power supply unit comprises two redundant modules which adopt the same main chip and are respectively a first redundant module and a second redundant module.
The further technical scheme is as follows: the step of establishing the communication relationship between the CPU and the power supply unit comprises the following specific steps:
receiving SVID information sent by a CPU through an MCU;
the MCU processes the received SVID information;
and simultaneously sending the processed information to the first redundant module and the second redundant module.
The further technical scheme is as follows: the MCU processes the received SVID information, and comprises the following specific steps:
the MCU processes the Vendor ID, Product Revision, Product Date Code, Lot Code and Protocol ID information;
the MCU processes Iccmax and Vboot information;
the MCU processes Vout and Temperature information;
and the MCU processes the Iout and Pout information.
The further technical scheme is as follows: the step in which the MCU processes the Vendor ID, Product Revision, Product Date Code, Lot Code and Protocol ID information comprises the following specific steps:
the first redundancy module and the second redundancy module set different addresses through hardware;
the MCU translates the address information sent by the CPU;
and the MCU simultaneously sends the translated information to the first redundant module and the second redundant module.
The further technical scheme is as follows: the step of processing Vout and Temperature information by the MCU comprises the following specific steps:
the MCU separately processes the Vout and Temperature data of the first redundancy module and of the second redundancy module;
and the results of the two operations are averaged and fed back to the CPU.
The further technical scheme is as follows: the MCU processes Iout and Pout information, which comprises the following steps:
the MCU receives Iout and Pout data sent by the first redundancy module and the second redundancy module;
the MCU carries out superposition operation on the received Iout and Pout data;
sending the data after the superposition operation to a CPU;
and the CPU receives the superposed data and adjusts its working state in real time.
The further technical scheme is as follows: the power supply unit is used as a power supply end to supply power to the CPU, and comprises the following specific steps:
the MCU acquires current information fed back to the CPU by the first redundancy module and the second redundancy module in real time through the SVID bus;
the MCU performs average value operation on the acquired current information and sends an operation result to the first redundancy module and the second redundancy module through the current-sharing control bus;
the first redundant module and the second redundant module each compare the averaged current information they receive with the current they detect in real time; if the current detected in real time is smaller than the average, the first redundant module and the second redundant module feed back the result to the CPU, and the CPU sends a control instruction for increasing the output voltage to the first redundant module and the second redundant module so as to increase the output current; if the current detected in real time is larger than the average, the first redundant module and the second redundant module feed back the result to the CPU, and the CPU sends a control instruction for reducing the output voltage to the first redundant module and the second redundant module so as to reduce the output current.
A power supply system of a redundant power supply of a CPU of a server comprises a first redundant module, a second redundant module, an MCU and the CPU, wherein,
the first redundancy module and the second redundancy module are used for supplying power to the CPU;
the CPU is used for controlling and managing the first redundant module and the second redundant module;
and the MCU is used for processing communication data information between the CPU and the first redundant module and between the CPU and the second redundant module.
The further technical scheme is as follows: the first redundant module and the second redundant module respectively comprise a power supply conversion module, a switch switching module, a health state indicating module and a current control module, wherein,
the power supply conversion module is used for the first redundancy module and the second redundancy module to communicate with the CPU and respond to the instruction sent by the CPU;
the switch switching module is used for cutting off the connection between the redundancy module and the CPU when one redundancy module of the first redundancy module and the second redundancy module fails;
the health state indicating module is used for indicating the working states of the first redundant module and the second redundant module;
and the current control module is used for controlling the load balance of the first redundancy module and the second redundancy module.
Compared with the prior art, the invention has the following beneficial effects. The invention provides a power supply method for a redundant power supply of a server CPU (central processing unit), which ensures that the power supply unit supplies power to the CPU stably and reliably by establishing a communication relation between the CPU and the first and second redundant modules. Two identical redundant modules serve as the power supply ends of the CPU, so that when one of them fails, the other can still supply power to the CPU normally, and the failed module does not affect the server. In addition, the invention provides a power supply system for a redundant power supply of a server CPU, in which the current control modules of the first redundant module and the second redundant module ensure load balance between the redundant modules, so that the server can run normally and stably.
The foregoing is only an overview of the technical solutions of the present invention. So that the technical means of the invention can be understood more clearly and implemented according to the description, and so that the above and other objects, features and advantages of the invention become more apparent, preferred embodiments are described in detail below.
Detailed Description
In order to more fully understand the technical content of the present invention, the technical solution of the present invention will be further described and illustrated with reference to the following specific embodiments, but not limited thereto.
Referring to FIGS. 1 to 7, the present invention provides a power supply method for a redundant power supply of a server CPU, including:
S10, establishing a communication relation between the CPU and the power supply unit;
S20, the power supply unit serves as a power supply end to supply power to the CPU.
Specifically, the power supply unit includes two redundant modules, namely a first redundant module and a second redundant module, which use the same main chip.
In certain embodiments, step S10 includes the following specific steps:
S101, the MCU receives the SVID information sent by the CPU;
S102, the MCU processes the received SVID information;
S103, the processed information is sent simultaneously to the first redundancy module and the second redundancy module.
In certain embodiments, step S102 includes the following specific steps:
S1021, the MCU processes the Vendor ID, Product Revision, Product Date Code, Lot Code and Protocol ID information;
S1022, the MCU processes the Iccmax and Vboot information;
S1023, the MCU processes the Vout and Temperature information;
S1024, the MCU processes the Iout and Pout information.
For steps S1021 to S1024, because the first redundant module and the second redundant module use the same main chip, the Vendor ID, Product Revision, Product Date Code, Lot Code and Protocol ID information that the two modules report in SVID communication is completely identical. After receiving the SVID information sent by the CPU, the MCU translates it into a form that the MCU can identify and process.
In some embodiments, step S1021 includes the following specific steps:
S10211, the first redundant module and the second redundant module are set to different addresses through hardware;
S10212, the MCU translates the address information sent by the CPU;
S10213, the MCU sends the translated information to the first redundancy module and the second redundancy module at the same time.
For steps S10211 to S10213, the MCU responds only to SVID information addressed to 0000, because SVID address 0000 is the default address of the CPU's power conversion chip. When there are two identical CPU power conversion chips, the power conversion chips of the first redundant module and the second redundant module must be given different addresses by hardware. The MCU then distributes the translated information simultaneously to these two addresses, so that the first redundant module and the second redundant module receive the SVID information sent by the CPU at the same time.
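The address handling in steps S10211 to S10213 can be pictured with a minimal Python sketch. Only the default address 0000 comes from the description; the module addresses 0x1 and 0x2, the frame layout and the command label are hypothetical placeholders.

```python
# Only address 0x0 is taken from the description; the rest is assumed.
CPU_FACING_ADDR = 0x0
MODULE_ADDRS = (0x1, 0x2)  # hypothetical hardware-strapped addresses of VR01, VR02


def fan_out(frame):
    """Re-address one CPU frame once per redundant module.

    `frame` is a dict with 'addr', 'cmd' and 'payload' keys (assumed layout).
    Frames not addressed to 0x0 are ignored, mirroring the statement that
    only address 0000 is answered.
    """
    if frame["addr"] != CPU_FACING_ADDR:
        return []
    # Same command and payload, one copy per module address.
    return [dict(frame, addr=addr) for addr in MODULE_ADDRS]


for copy in fan_out({"addr": 0x0, "cmd": "SET_VOLTAGE", "payload": 0x97}):
    print(copy)
```

Both copies carry the same command and payload, which is what lets the two modules act on a CPU request at the same time.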
In step S1022, the Iccmax and Vboot information is set to the same configuration in both modules by the peripheral hardware, so the MCU can send the Vboot and Iccmax information fed back by the first redundant module and the second redundant module directly to the CPU.
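A minimal sketch of the pass-through in step S1022 follows. The equality check is an added assumption (the description only states that the hardware configures both modules identically), and the function name and example value are illustrative.

```python
def forward_static_register(name, value_mod1, value_mod2):
    """Return the Iccmax or Vboot value to report to the CPU.

    Both modules are strapped to the same configuration, so either value can
    be forwarded unchanged. The mismatch check is an assumption added here to
    make a mis-strapped board visible; it is not part of the description.
    """
    if value_mod1 != value_mod2:
        raise ValueError(
            f"{name} differs between modules: {value_mod1!r} vs {value_mod2!r}"
        )
    return value_mod1


print(forward_static_register("Iccmax", 200, 200))
```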
In certain embodiments, step S1023 includes the following specific steps:
S10231, the MCU separately processes the Vout and Temperature data of the first redundancy module and of the second redundancy module;
S10232, the results of the two operations are averaged and fed back to the CPU.
In certain embodiments, step S1024 includes the following specific steps:
S10241, the MCU receives the Iout and Pout data sent by the first redundancy module and the second redundancy module;
S10242, the MCU carries out a superposition operation on the received Iout and Pout data;
S10243, the superposed data is sent to the CPU;
S10244, the CPU receives the superposed data and adjusts its working state in real time.
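Steps S1023 and S1024 treat the telemetry differently: Vout and Temperature are averaged, while Iout and Pout are superposed, which fits two modules supplying the same CPU in parallel. A minimal Python sketch, in which the dictionary keys and the sample readings are assumptions made for illustration:

```python
def aggregate_telemetry(mod1, mod2):
    """Combine per-module readings into the single view reported to the CPU."""
    report = {}
    for key in ("vout", "temperature"):   # S1023: average the two readings
        report[key] = (mod1[key] + mod2[key]) / 2
    for key in ("iout", "pout"):          # S1024: superpose (sum) the readings
        report[key] = mod1[key] + mod2[key]
    return report


# Sample readings (volts, degrees Celsius, amperes, watts); values are made up.
vr01 = {"vout": 1.80, "temperature": 52.0, "iout": 74.0, "pout": 133.2}
vr02 = {"vout": 1.80, "temperature": 56.0, "iout": 76.0, "pout": 136.8}
print(aggregate_telemetry(vr01, vr02))
```

Seen from the CPU, the result looks like the telemetry of a single voltage regulator delivering the combined current.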
In certain embodiments, step S20 includes the following specific steps:
S201, the MCU acquires in real time, through the SVID bus, the current information fed back to the CPU by the first redundancy module and the second redundancy module;
S202, the MCU performs an average value operation on the acquired current information and sends the result to the first redundancy module and the second redundancy module through the current-sharing control bus;
S203, the first redundancy module and the second redundancy module each compare the averaged current information they receive with the current they detect in real time; if the current detected in real time is smaller than the average, the first redundancy module and the second redundancy module feed back the result to the CPU, and the CPU sends a control instruction for increasing the output voltage to the first redundancy module and the second redundancy module so as to increase the output current; if the current detected in real time is larger than the average, the first redundant module and the second redundant module feed back the result to the CPU, and the CPU sends a control instruction for reducing the output voltage to the first redundant module and the second redundant module so as to reduce the output current.
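One pass of steps S201 to S203 can be sketched as below. The dead band that suppresses corrections for very small differences is an assumption added for the example, as are the function name and the action labels.

```python
def share_step(i_mod1, i_mod2, deadband=0.5):
    """Compute the shared current reference and the action for each module.

    Currents are in amperes. `deadband` (assumed, not from the description)
    is the tolerance around the average within which no correction is made.
    """
    target = (i_mod1 + i_mod2) / 2        # S202: average value operation

    def action(measured):                 # S203: compare with the reference
        if measured < target - deadband:
            return "raise_vout"           # too little current: raise voltage
        if measured > target + deadband:
            return "lower_vout"           # too much current: lower voltage
        return "hold"

    return target, action(i_mod1), action(i_mod2)


print(share_step(60.0, 80.0))
```

Because both modules are compared against the same average, a correction on one side is always mirrored by the opposite correction on the other.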
In addition, to make maintenance convenient for the server administrator while keeping the server system running stably without interruption, the first redundancy module and the second redundancy module are both designed to support hot plugging. The first redundancy module is taken as an example below. The pin design of the 2x10 golden finger connector of its redundant board is divided into a power supply part and a signal part: the power supply part mainly consists of GND, VIN and VOUT, and the signal part mainly consists of PWRGD, EN, ACT, VR_HOT#, SVID_CLK, SVID_ALET, SVID_DATA, VCCSENSE, VSSSENSE, CTL_VIN, CTL_VOUT, GND and DTC_VCC. All pins of the 2x10 golden finger connector are gold-plated, which improves the oxidation and wear resistance of the contact surface, makes the hot-plug connection more reliable, and increases the current-carrying capacity of the power supply part. The design current of the power supply part is 40 A for VIN and 200 A for VOUT, and the gap between VIN and VOUT is 2 mm wide. In the signal part, the gap between pins is 1.2 mm wide and the signal line impedance meets the 50 ohm requirement. The whole 2x10 golden finger connector is 1.6 mm thick. A notch 2 mm wide is provided between pins A2 and A3; it avoids mutual interference between the signals and the power supply, and it also serves as a foolproof design that prevents the 2x10 golden finger connector from being inserted backwards into the 2x32 slot connector of the server mainboard.
The gold plating of the power pins is longer than that of the signal pins. This ensures that when the redundant board is inserted, the power pins contact the mainboard connector first, and when the redundant board is removed, the signal pins separate from the mainboard connector first, which guarantees the reliability of the hot-plug signals.
The pin design of the 2x32 slot connector of the server mainboard is likewise divided into a power supply part and a signal part: the power supply part consists of GND, VIN and VOUT, and the signal part mainly consists of PWRGD_A, EN_A, ACT_A, VR_HOT#, SVID_CLK_A, SVID_ALET_A, SVID_DATA_A, VCCSENSE_A, VSSSENSE_A, CTL_VIN_A, CTL_VOUT_A, GND and TS_A. Each pin of the 2x32 slot connector is a spring contact, and when the 2x10 golden finger connector is inserted into the slot, the contact point of each spring lies in the middle of the golden finger of the redundant board, which ensures a reliable connection. Each power pin of the 2x32 slot connector is designed to carry 8 A, and the gap between power pins is 2 mm wide. In the signal part, the gap between pins is 1.2 mm wide and the signal line impedance meets the 50 ohm requirement. The whole 2x32 slot connector is 1.8 mm wide, and a barrier 1.6 mm wide is provided between pins A24 and A25. The signal part of the 2x32 slot connector is provided with ESD protection.
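As a rough plausibility check on these ratings, the 8 A per-pin figure can be set against the 40 A (VIN) and 200 A (VOUT) design currents quoted for the golden finger connector. The sketch below only computes the minimum number of 8 A pins per rail; it ignores GND return pins and derating, and the description does not state the actual pin allocation.

```python
import math

PIN_RATING_A = 8                            # per-pin rating from the description
RAIL_CURRENT_A = {"VIN": 40, "VOUT": 200}   # design currents from the description


def min_pins(rail_current_a, pin_rating_a=PIN_RATING_A):
    """Smallest number of pins whose combined rating covers the rail current."""
    return math.ceil(rail_current_a / pin_rating_a)


for rail, amps in RAIL_CURRENT_A.items():
    print(rail, min_pins(amps))
```

Under these assumptions VIN needs at least 5 pins and VOUT at least 25, which gives a sense of why most of a 2x32 slot would be devoted to power.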
The working principle of hot plugging is as follows. The VIN and VOUT power rails of the first redundancy module are both isolated from the server mainboard by MOS transistors, and the remaining signals are connected only to the MCU for processing. When the first redundancy module is plugged into the server mainboard, pin B32 (TS_A) of the 2x32 slot connector is pulled low. On detecting this, the MCU sends the CTL_VIN_A signal to the first redundancy module to turn on the VIN MOS transistor, and an LDO chip then converts the VIN supply into the VCC supply that powers VR01. When the MCU detects through pin DTC_VCC_A that the VCC voltage is valid, it drives EN_A high to the redundant board. VR01 can then work normally: its output voltage reaches the Vboot value and its PWRGD signal goes high. After receiving the PWRGD_A signal, the MCU activates the SVID signals, the VR_HOT# signal and the remote-feedback VCCSENSE and VSSSENSE signals to the redundant board. Once the output voltage of the redundant board has been adjusted to the voltage required by the CPU, the MCU sends the CTL_VOUT_A signal to turn on the VOUT MOS transistor and at the same time sends the ACT_A signal to control the health indicator of the redundant board, which turns green to show that the first redundancy module is working normally. When the first redundant board is removed, the MCU detects that TS_A is pulled high and sets all signals on the corresponding 2x32 slot connector of the server mainboard to a high-impedance state. The whole hot-plug process is controlled by the MCU, which ensures stable operation of the system.
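The insertion sequence above can be summarised as an ordered list of MCU actions, each gated by the signal the MCU waits for. In the sketch below the function name, argument names and action strings are illustrative; timing, debouncing and removal of a board in the middle of the sequence are not modelled.

```python
def hot_plug_actions(ts_a_low, vcc_valid, pwrgd_high, vout_at_target):
    """Return the MCU actions reached so far, in order, for one slot.

    Each argument is a boolean snapshot of a condition named in the text:
    TS_A pulled low, VCC valid on DTC_VCC_A, PWRGD_A high, and the output
    voltage adjusted to the value required by the CPU.
    """
    if not ts_a_low:
        # No board seated (or board removed): release the slot.
        return ["set slot signals to high impedance"]
    actions = ["assert CTL_VIN_A (turn on VIN MOS)"]
    if not vcc_valid:
        return actions
    actions.append("drive EN_A high")
    if not pwrgd_high:
        return actions
    actions.append("activate SVID, VR_HOT#, VCCSENSE, VSSSENSE")
    if not vout_at_target:
        return actions
    actions.append("assert CTL_VOUT_A (turn on VOUT MOS)")
    actions.append("assert ACT_A (indicator green)")
    return actions


for step in hot_plug_actions(True, True, True, True):
    print(step)
```

The early returns make the ordering explicit: the VOUT switch is closed only after the converter is enabled, reports power good, and has reached the requested voltage.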
Referring to FIGS. 8 and 9, the present invention further provides a power supply system for a redundant power supply of a server CPU, including a first redundant module 1, a second redundant module 2, an MCU 3 and a CPU 4, wherein,
the first redundant module 1 and the second redundant module 2 are used for supplying power to the CPU 4;
the CPU 4 is used for controlling and managing the first redundant module 1 and the second redundant module 2;
and the MCU 3 is used for processing communication data information between the CPU 4 and the first redundant module 1 and the second redundant module 2.
Further, the first redundant module 1 and the second redundant module 2 each include a power conversion module 11, a switch switching module 12, a health status indication module 13, and a current control module 14, wherein,
the power conversion module 11 is used for the first redundant module 1 and the second redundant module 2 to communicate with the CPU 4 and respond to instructions sent by the CPU 4;
the switch switching module 12 is used for disconnecting the failed redundant module from the CPU 4 when one of the first redundant module 1 and the second redundant module 2 fails;
the health status indication module 13 is used for indicating the working states of the first redundant module 1 and the second redundant module 2;
and the current control module 14 is used for controlling the load balance of the first redundant module 1 and the second redundant module 2.
Specifically, the power conversion module 11 mainly consists of a power conversion chip VR01, a peripheral circuit, an input capacitor, power MOS transistors, an output inductor, an output capacitor and the like. Its main function is to communicate normally with the CPU and respond in real time to instructions sent by the CPU so as to generate the stable supply required by the CPU. The protection functions of the module include UVP (undervoltage protection), OVP (overvoltage protection), OCP (overcurrent protection), OTP (over-temperature protection), UVLO (undervoltage lockout) and SCP (short-circuit protection). The signals connecting the power conversion chip VR01 to the periphery include: the power state indication signal PWRGD, the power enable signal EN, the temperature alarm signal VR_HOT#, the output feedback signals VCCSENSE and VSSSENSE, and the SVID signals (SVID_DATA, SVID_CLK, SVID_ALET). When the input voltage VIN enters a redundancy module, an LDO chip generates the VCC voltage; once the chip enable signal EN is high and valid, VR01 starts working and its output voltage reaches the Vboot value; when the CPU sends a voltage regulation instruction, VR01 responds in real time and adjusts to the voltage required by the CPU. Stable operation of the whole power conversion module 11 is an important guarantee of the quality of the CPU power supply.
The switch switching module 12 is located between the power conversion module 11 and the CPU and mainly provides isolation, so that the power conversion module 11 can be quickly disconnected from the server mainboard when a fault occurs. The switch switching module 12 consists of two parts, a VIN switch switching module and a VOUT switch switching module, each made up of MOS transistors and a control circuit. When a redundancy module is inserted into the mainboard and normally connected, the detection signal TS on the mainboard is pulled low; the MCU then sends the CTL_VIN signal to turn on the MOS transistor controlling VIN, and the VIN supply enters the redundant board. After the output voltage of the power conversion module is stable and the MCU detects that the power state indication signal PWRGD has gone high, the MCU sends the CTL_VOUT signal to turn on the MOS transistor controlling VOUT. To avoid the inrush current that would otherwise occur when the MOS transistors controlling VIN and VOUT turn on, a soft-start design is added to the control circuit, which ensures smooth switching of the VIN and VOUT supplies. When the MCU detects that the power conversion module 11 is abnormal, it pulls the CTL_VIN and CTL_VOUT signals low so that the MOS transistors controlling VIN and VOUT turn off quickly and the abnormal power conversion module is completely isolated from the server mainboard; the system is therefore not affected when a redundant power supply fails.
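The control of the two switches can be condensed into a small truth-table style function. This is a minimal sketch: the function and argument names are assumptions, the levels are shown as 0/1, and the soft-start ramp of the MOS gates is not modelled.

```python
def switch_controls(ts_low, pwrgd_high, fault):
    """Return the CTL_VIN / CTL_VOUT levels the MCU drives for one module.

    ts_low:     True when detection signal TS is pulled low (board seated)
    pwrgd_high: True when PWRGD reports a stable converter output
    fault:      True when the MCU has detected an abnormal power conversion module
    """
    if fault or not ts_low:
        # Isolate both rails from the mainboard.
        return {"CTL_VIN": 0, "CTL_VOUT": 0}
    # VIN is connected as soon as the board is seated; VOUT only after PWRGD.
    return {"CTL_VIN": 1, "CTL_VOUT": 1 if pwrgd_high else 0}


print(switch_controls(ts_low=True, pwrgd_high=True, fault=False))
print(switch_controls(ts_low=True, pwrgd_high=True, fault=True))
```

Giving the fault input priority over every other condition is what keeps a failed module from disturbing the supply delivered by the healthy one.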
The health status indication module 13 mainly indicates the working status of the two redundant modules, which makes maintenance easier for the server administrator and allows a faulty redundant module to be replaced in time. Its status indicator is a two-color lamp: it shows green when the redundant module works normally and yellow when the redundant module works abnormally. Staff can judge the working state of a redundant module from the indicator at the corresponding position.
The current control module 14 is mainly used to ensure load balance between the redundant modules. A current-sharing control bus is arranged between the first redundant module and the MCU and between the second redundant module and the MCU, and it mainly carries the current reference for the current control module. The MCU reads in real time the current information that power conversion chip VR01 and power conversion chip VR02 feed back to the CPU, averages the two current values, and sends the result to each current control module 14 over the current-sharing control bus. On receiving this reference, the current control module 14 compares it with the current it detects in real time. If the detected current is smaller, the current control module feeds this back to the chip, which raises the output voltage and thereby increases the output current. Conversely, if the detected current is larger, the current control module feeds this back to the chip, which lowers the output voltage and thereby reduces the output current. Load balance between the redundant modules is achieved in this way.
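To illustrate why repeated average-and-correct steps balance the load, the following sketch iterates the loop on a deliberately simple electrical model. The model and all numbers are assumptions, not values from the description: each module is an ideal source behind a small output resistance, both feed one node that draws a constant load current, and each step shifts a module's set-point in proportion to its deviation from the average.

```python
def simulate_sharing(v1, v2, i_load, r_out=0.001, gain=0.0005, steps=20):
    """Iterate the current-sharing loop and return the final (i1, i2) in amperes.

    Assumed model: sources v1, v2 (volts) behind r_out (ohms) each, a common
    node drawing i_load (amperes), and a per-step set-point change of
    gain * (average current - own current) volts.
    """
    def currents(va, vb):
        # Node voltage from Kirchhoff's current law at the shared output node.
        v_node = (va + vb) / 2 - i_load * r_out / 2
        return (va - v_node) / r_out, (vb - v_node) / r_out

    for _ in range(steps):
        i1, i2 = currents(v1, v2)
        target = (i1 + i2) / 2            # average computed by the MCU
        v1 += gain * (target - i1)        # low current raises the voltage
        v2 += gain * (target - i2)        # high current lowers the voltage
    return currents(v1, v2)


print(simulate_sharing(1.805, 1.800, 150.0))
```

With these assumed values the voltage difference between the modules halves on every step, so the two currents approach half of the load each; a gain that is too large relative to the output resistance would make the same loop oscillate instead.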
In addition, to make maintenance convenient for the server administrator while keeping the server system running stably without interruption, the first redundancy module 1 and the second redundancy module 2 are both designed to support hot plugging. The first redundancy module 1 is taken as an example below. The pin design of the 2x10 golden finger connector of its redundant board is divided into a power supply part and a signal part: the power supply part mainly consists of GND, VIN and VOUT, and the signal part mainly consists of PWRGD, EN, ACT, VR_HOT#, SVID_CLK, SVID_ALET, SVID_DATA, VCCSENSE, VSSSENSE, CTL_VIN, CTL_VOUT, GND and DTC_VCC. All pins of the 2x10 golden finger connector are gold-plated, which improves the oxidation and wear resistance of the contact surface, makes the hot-plug connection more reliable, and increases the current-carrying capacity of the power supply part. The design current of the power supply part is 40 A for VIN and 200 A for VOUT, and the gap between VIN and VOUT is 2 mm wide. In the signal part, the gap between pins is 1.2 mm wide and the signal line impedance meets the 50 ohm requirement. The whole 2x10 golden finger connector is 1.6 mm thick. A notch 2 mm wide is provided between pins A2 and A3; it avoids mutual interference between the signals and the power supply, and it also serves as a foolproof design that prevents the 2x10 golden finger connector from being inserted backwards into the 2x32 slot connector of the server mainboard.
The gold plating of the power pins is longer than that of the signal pins. This ensures that when the redundant board is inserted, the power pins contact the mainboard connector first, and when the redundant board is removed, the signal pins separate from the mainboard connector first, which guarantees the reliability of the hot-plug signals.
The pin design of the 2x32 slot connector of the server mainboard is likewise divided into a power supply part and a signal part: the power supply part consists of GND, VIN and VOUT, and the signal part mainly consists of PWRGD_A, EN_A, ACT_A, VR_HOT#, SVID_CLK_A, SVID_ALET_A, SVID_DATA_A, VCCSENSE_A, VSSSENSE_A, CTL_VIN_A, CTL_VOUT_A, GND and TS_A. Each pin of the 2x32 slot connector is a spring contact, and when the 2x10 golden finger connector is inserted into the slot, the contact point of each spring lies in the middle of the golden finger of the redundant board, which ensures a reliable connection. Each power pin of the 2x32 slot connector is designed to carry 8 A, and the gap between power pins is 2 mm wide. In the signal part, the gap between pins is 1.2 mm wide and the signal line impedance meets the 50 ohm requirement. The whole 2x32 slot connector is 1.8 mm wide, and a barrier 1.6 mm wide is provided between pins A24 and A25. The signal part of the 2x32 slot connector is provided with ESD protection.
The working principle of hot plugging is as follows. The VIN and VOUT power rails of the first redundancy module 1 are both isolated from the server mainboard by MOS transistors, and the remaining signals are connected only to the MCU for processing, so that when the first redundancy module 1 is inserted, everything on the connector other than the VIN supply is in a high-impedance state. When the first redundancy module 1 is plugged into the server mainboard, pin B32 (TS_A) of the 2x32 slot connector is pulled low. On detecting this, the MCU sends the CTL_VIN_A signal to the first redundancy module 1 to turn on the VIN MOS transistor, and an LDO chip then converts the VIN supply into the VCC supply that powers VR01. When the MCU detects through pin DTC_VCC_A that the VCC voltage is valid, it drives EN_A high to the redundant board. VR01 can then work normally: its output voltage reaches the Vboot value and its PWRGD signal goes high. After receiving the PWRGD_A signal, the MCU activates the SVID signals, the VR_HOT# signal and the remote-feedback VCCSENSE and VSSSENSE signals to the redundant board. Once the output voltage of the redundant board has been adjusted to the voltage required by the CPU, the MCU sends the CTL_VOUT_A signal to turn on the VOUT MOS transistor and at the same time sends the ACT_A signal to control the health indicator of the redundant board, which turns green to show that the first redundancy module is working normally. When the first redundancy module 1 is removed, the MCU detects that TS_A is pulled high and sets all signals on the corresponding 2x32 slot connector of the server mainboard to a high-impedance state. The whole hot-plug process is controlled by the MCU, which ensures stable operation of the system.
The above examples merely further illustrate the technical content of the present invention for the reader's convenience; the embodiments of the invention are not limited thereto, and any technical extension or re-creation based on the present invention falls within its protection. The protection scope of the invention is defined by the claims.