CN110580523B - Error calibration method and device for analog neural network processor - Google Patents
- Publication number
- CN110580523B (application CN201810580960.8A)
- Authority
- CN
- China
- Prior art keywords
- processor
- weight parameters
- error
- phase
- trainable
- Prior art date
- Legal status
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/063—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/063—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
- G06N3/065—Analogue means
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- General Health & Medical Sciences (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Computational Linguistics (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Artificial Intelligence (AREA)
- Neurology (AREA)
- Stored Programmes (AREA)
Abstract
An embodiment of the invention provides an error calibration method and device for an analog neural network processor, wherein the method comprises the following steps: if an algorithm update and/or an error parameter adjustment is detected, parsing the NN network structure to obtain the trainable weight parameters of a fully connected layer in the network structure; training the trainable weight parameters with a stochastic gradient descent (SGD) algorithm, wherein the loss values and gradients in the learning process are logarithmically quantized and the learning process is performed in the digital domain; replacing the multiplication operations used for back propagation and trainable weight parameter updating in the learning process with shift operations; and storing the learned weight parameters so that the NN can calibrate the error of the processor according to these weight parameters. The device performs the above method. The error calibration method and device for an analog neural network processor provided by the embodiment of the invention reduce the energy and resource consumption of the analog NN processor and thereby improve its efficiency.
Description
Technical Field
Embodiments of the invention relate to the technical field of error calibration, and in particular to an error calibration method and device for an analog neural network processor.
Background
Compared with digital circuits, analog circuits are clock-free and therefore offer low power consumption and a small chip area, so analog neural network (NN) processors are widely used.
However, although the analog NN processor is highly energy efficient, the nonlinearity of the system and of its devices still limits its processing accuracy; the resulting errors come mainly from manufacturing-process variations, voltage or temperature drift, and various noise sources. Although the NN algorithm can tolerate these defects to some extent, the analog NN processor still suffers from problems such as non-negligible accuracy degradation caused by error accumulation.
Neural network algorithms have a certain error-tolerant learning capability: they can learn fixed patterns in the input data and even in the computational errors. Retraining the network is therefore an effective way to reduce the effect of errors such as process variations (PV) on the system without adding overhead to the inference path. However, retraining all the network layers on line is a complex task that requires high-precision computation and a large number of additional computing units to reach convergence, leading to excessive energy and resource consumption and greatly reducing the efficiency of the analog NN processor.
How to avoid these drawbacks, reduce the energy and resource consumption of the analog NN processor, and thereby improve its efficiency therefore becomes a problem to be solved.
Disclosure of Invention
To address the problems in the prior art, embodiments of the invention provide an error calibration method and device for an analog neural network processor.
In a first aspect, an embodiment of the present invention provides an error calibration method for an analog neural network processor, where the method includes:
if an algorithm update and/or an error parameter adjustment is detected, parsing the NN network structure to obtain the trainable weight parameters of a fully connected layer in the network structure;
training the trainable weight parameters with a stochastic gradient descent (SGD) algorithm, wherein the loss values and gradients in the learning process are logarithmically quantized, and the learning process is performed in the digital domain;
replacing the multiplication operations used for back propagation and trainable weight parameter updating in the learning process with shift operations;
and storing the learned weight parameters so that the NN can calibrate the error of the processor according to the weight parameters.
In a second aspect, an embodiment of the present invention provides an error calibration apparatus for an analog neural network processor, the apparatus including:
an obtaining unit, configured to, if an algorithm update and/or an error parameter adjustment is detected, parse the NN network structure to obtain the trainable weight parameters of a fully connected layer in the NN network structure;
a quantization unit, configured to train the trainable weight parameters with a stochastic gradient descent (SGD) algorithm, wherein the loss values and gradients in the learning process are logarithmically quantized, and the learning process is performed in the digital domain;
an operation unit, configured to replace the multiplication operations used for back propagation and trainable weight parameter updating in the learning process with shift operations;
and a storage unit, configured to store the learned weight parameters so that the NN can calibrate the error of the processor according to the weight parameters.
In a third aspect, an embodiment of the present invention provides an electronic device, including: a processor, a memory, and a bus, wherein,
the processor and the memory communicate with each other through the bus;
the memory stores program instructions executable by the processor, the processor invoking the program instructions to perform a method comprising:
if an algorithm update and/or an error parameter adjustment is detected, parsing the NN network structure to obtain the trainable weight parameters of a fully connected layer in the network structure;
training the trainable weight parameters with a stochastic gradient descent (SGD) algorithm, wherein the loss values and gradients in the learning process are logarithmically quantized, and the learning process is performed in the digital domain;
replacing the multiplication operations used for back propagation and trainable weight parameter updating in the learning process with shift operations;
and storing the trained weight parameters so that the NN can calibrate the error of the processor according to the weight parameters.
In a fourth aspect, an embodiment of the present invention provides a non-transitory computer-readable storage medium, including:
the non-transitory computer readable storage medium stores computer instructions that cause the computer to perform a method comprising:
if an algorithm update and/or an error parameter adjustment is detected, parsing the NN network structure to obtain the trainable weight parameters of a fully connected layer in the network structure;
training the trainable weight parameters with a stochastic gradient descent (SGD) algorithm, wherein the loss values and gradients in the learning process are logarithmically quantized, and the learning process is performed in the digital domain;
replacing the multiplication operations used for back propagation and trainable weight parameter updating in the learning process with shift operations;
and storing the learned weight parameters so that the NN can calibrate the error of the processor according to the weight parameters.
In the error calibration method and device for an analog neural network processor provided by the embodiments of the invention, the loss values and gradients are logarithmically quantized while the weight parameters of the fully connected layer are trained, and after logarithmic quantization the multiplication operations used for back propagation and trainable weight parameter updating are replaced with shift operations, which reduces the energy and resource consumption of the analog NN processor and thereby improves its efficiency.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
FIG. 1 is a schematic flow chart illustrating an error calibration method for an analog neural network processor according to an embodiment of the present invention;
FIG. 2(a) is a schematic diagram of a prior art backpropagation and trainable weight parameter update using multiplication;
FIG. 2(b) is a schematic diagram of an embodiment of the present invention employing a shift operation for back propagation and trainable weight parameter update;
FIG. 3 is a schematic diagram of an error calibration apparatus for an analog neural network processor according to an embodiment of the present invention;
fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Fig. 1 is a schematic flow chart of an error calibration method for an analog neural network processor according to an embodiment of the present invention. As shown in Fig. 1, the error calibration method for an analog neural network processor according to an embodiment of the present invention includes the following steps:
s101: and if algorithm updating and/or error parameter adjustment are detected, analyzing the NN network structure to obtain trainable weight parameters of a full connection layer in the network structure.
Specifically, if the device detects algorithm update and/or error parameter adjustment, the device analyzes the NN network structure to obtain trainable weight parameters of a full connection layer in the NN network structure. An analog neural network processor refers to a processor that computes in the analog domain for neural network algorithms. The algorithm update may be understood as an update of the neural network NN, and the error parameter may be a PV parameter, but is not particularly limited. The network structure may include a convolutional layer, a pooling layer, a full connection, and other connection relationships, and it should be noted that: the trainable weight parameters in the embodiment of the present invention may be understood as weight parameters of a full connection layer, that is: the embodiment of the invention only trains the weight parameters of the full connection layer, and the weight parameters of each layer such as the convolution layer do not need to be trained.
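As a minimal sketch of this selection step, assuming a PyTorch-style model (the patent does not name a framework, and the module types below are only illustrative), the fully connected layer's parameters are the only ones left trainable:

```python
import torch.nn as nn

def collect_trainable_fc_params(model: nn.Module):
    """Keep only fully connected (nn.Linear) weights trainable; freeze the rest."""
    trainable = []
    for module in model.modules():
        if isinstance(module, nn.Linear):          # fully connected layer: retrained for calibration
            for p in module.parameters():
                p.requires_grad = True
                trainable.append(p)
        elif isinstance(module, (nn.Conv2d, nn.BatchNorm2d)):
            for p in module.parameters():
                p.requires_grad = False            # never retrained on-line
    return trainable
```

Restricting retraining to the fully connected layer is what keeps the on-line learning cost small enough for an embedded terminal.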
S102: train the trainable weight parameters with a stochastic gradient descent (SGD) algorithm, where the loss values and gradients in the learning process are logarithmically quantized and the learning process is performed in the digital domain.
Specifically, the device trains the trainable weight parameters with a stochastic gradient descent (SGD) algorithm; the loss values and gradients in the learning process are logarithmically quantized, and the learning process is performed in the digital domain. The neuron values computed by the analog neural network processor can be converted from the analog domain to the digital domain by low-precision ADCs (i.e., ADCs below a preset precision threshold). For the same bit width, logarithmic quantization offers a wider data range than linear quantization: for example, 4 bits with base-2 logarithmic quantization cover [-128, 128], whereas integers need no fewer than 8 bits to cover [-128, 127]. Since the loss values and gradients always vary over a wide range and can be approximated in the stochastic gradient descent (SGD) algorithm, a logarithmic quantization method is used to quantize them. Training weight parameters with the SGD algorithm is a mature technique in the art and is not described in detail here.
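The following is a hedged sketch of base-2 logarithmic quantization matching the 4-bit example above (1 sign bit plus a 3-bit exponent); real loss and gradient values would typically also need negative exponents or a scale factor, which the sketch omits:

```python
import numpy as np

def log2_quantize(x, exp_bits=3):
    """Replace each value by sign(x) * 2**e, with e rounded and clipped to the
    exponent field. With 3 exponent bits the largest magnitude is 2**7 = 128."""
    x = np.asarray(x, dtype=float)
    sign = np.sign(x)
    mag = np.maximum(np.abs(x), 1e-12)                 # avoid log2(0); zeros stay zero via sign
    e = np.clip(np.round(np.log2(mag)), 0, 2**exp_bits - 1)
    return sign * np.exp2(e)

print(log2_quantize([0.9, -3.0, 100.0, 300.0]))        # [  1.  -4. 128. 128.]
```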
S103: replace the multiplication operations used for back propagation and trainable weight parameter updating in the learning process with shift operations.
Specifically, the device replaces the multiplication operations used for back propagation and trainable weight parameter updating in the learning process with shift operations. Fig. 2(a) is a schematic diagram of prior-art back propagation and trainable weight parameter updating using multiplication; Fig. 2(b) is a schematic diagram of back propagation and trainable weight parameter updating using shift operations according to an embodiment of the present invention. As shown in Figs. 2(a) and 2(b), S102 may use base-2 logarithmic quantization, and once base-2 logarithmic quantization is complete, the multiplication operations in the back propagation and trainable weight update stages can be replaced by shift operations. Logarithmic quantization itself also consumes little power, because the exponent can be obtained simply by finding where the highest high-level bit (for positive data) or the lowest high-level bit (for negative data) is located.
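A minimal sketch of this replacement (integer operands assumed; sign handling is omitted for brevity): the exponent of a log-quantized operand is read off from its highest set bit, and the multiplication collapses into a shift.

```python
def log2_exponent(x: int) -> int:
    """Base-2 exponent of a (positive, power-of-two) log-quantized value:
    simply the position of its highest set bit."""
    return abs(x).bit_length() - 1

def shift_multiply(value: int, exponent: int) -> int:
    """Replace value * 2**exponent by a shift: left for non-negative exponents,
    right otherwise."""
    return value << exponent if exponent >= 0 else value >> (-exponent)

# Example: a back-propagated product activation * gradient, with the gradient
# log-quantized to 2**5, becomes a 5-bit left shift.
activation = 13
grad_exp = log2_exponent(32)                                     # 5
assert shift_multiply(activation, grad_exp) == activation * 32   # 416
```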
S104: store the trained weight parameters so that the NN can calibrate the error of the processor according to these weight parameters.
Specifically, the device stores the trained weight parameters so that the NN can calibrate the error of the processor according to the weight parameters. It should be noted that after learning is completed the new parameters already contain the error information of the analog neural network processor, so no additional error compensation is needed.
During the learning phase, about 66.7% less memory is used on average compared with the prior art, as shown in Table 1 (percentage of memory saved). This relieves the storage pressure and reduces the data traffic of the embedded terminal.
TABLE 1

| | Neurons & loss | Gradient | Weight | Overall |
| --- | --- | --- | --- | --- |
| Prior art | 32 bit | 32 bit | 32 bit | 32 bit |
| Embodiment of the invention | 6 bit | 6 bit | 20 bit | 10.67 bit |
| Memory saved | 81.25% | 81.25% | 37.5% | 66.7% |
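The "Overall" column appears to be the simple average of the three bit widths (an assumption about how it was computed); the percentages in Table 1 can then be reproduced directly:

```python
prior_bits = 32
new_bits = {"neurons_and_loss": 6, "gradient": 6, "weight": 20}

overall = sum(new_bits.values()) / len(new_bits)   # 32 / 3 ≈ 10.67 bits
print(1 - 6 / prior_bits)                          # 0.8125  -> 81.25 % saved
print(1 - 20 / prior_bits)                         # 0.375   -> 37.5 %  saved
print(1 - overall / prior_bits)                    # ≈ 0.667 -> 66.7 %  saved overall
```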
After logarithmic quantization in the embodiment of the present invention, almost all multiplication operations in the learning process can be replaced by shift operations, and if the learning process is implemented with an existing FPGA or ASIC in the system, the power consumption drops significantly, as shown in Table 2 (energy consumption). Here PDP (power-delay product) is the product of power and delay. The combination of logarithmic quantization and shift operations achieves approximately a 50-fold energy-efficiency improvement over the prior art.
Table 2 (a 32-bit fixed-point number is used as the prior-art reference; the data in Table 2 were obtained by simulation in the SMIC 180 nm process)

| | Operation | Delay (ns) | Power (mW) | PDP (pJ) |
| --- | --- | --- | --- | --- |
| Prior art | 32-bit multiplication | 6.53 | 15.7 | 102.52 |
| Embodiment of the invention | 32-bit shift | 1.62 | 1.22 | 1.97 |
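As a quick check, each PDP entry is simply delay times power, and the ratio of the two rows gives the quoted efficiency gain:

```python
pdp_mul   = 6.53 * 15.7      # ≈ 102.5 pJ, 32-bit multiplication
pdp_shift = 1.62 * 1.22      # ≈ 1.98 pJ, 32-bit shift
print(pdp_mul / pdp_shift)   # ≈ 52, i.e. roughly the 50-fold improvement quoted above
```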
According to the error calibration method for an analog neural network processor provided by the embodiment of the invention, the loss values and gradients are logarithmically quantized while the weight parameters of the fully connected layer are trained, and after logarithmic quantization the multiplication operations used for back propagation and trainable weight parameter updating are replaced with shift operations, so the energy and resource consumption of the analog NN processor are reduced and its efficiency is improved.
On the basis of the above embodiment, the method further includes:
storing the trainable weight parameters and the weight parameters at a resolution higher than a preset precision.
Specifically, the device stores the trainable weight parameters and the weight parameters at a resolution higher than a preset precision. The specific value of the preset precision can be set as needed according to the actual situation. Non-trainable weights (e.g., those of the convolutional layers) are saved at low resolution (i.e., below the preset precision), that is, with a narrow bit width, whereas trainable weights (e.g., those of the fully connected layer) are kept at high resolution (i.e., above the preset precision), that is, with a wide bit width. The high-resolution trainable weights are used only to update the weights during the learning phase, in order to ensure convergence.
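A minimal sketch of this mixed-resolution storage (the class name and bit widths are illustrative, not taken from the patent): a high-resolution shadow copy of the fully connected weights accumulates the SGD updates, while only a low-resolution rounded copy is deployed for inference.

```python
import numpy as np

class MixedResolutionFC:
    """High-resolution shadow weights for learning, low-resolution copy for inference."""

    def __init__(self, w, frac_bits_hi=16, frac_bits_lo=4):
        self.frac_hi = frac_bits_hi
        self.frac_lo = frac_bits_lo
        # fixed-point shadow weights, used only during the learning phase
        self.w_hi = np.round(np.asarray(w) * 2**frac_bits_hi).astype(np.int64)

    def update(self, delta):
        """Accumulate even tiny updates at high resolution so they are not lost
        to rounding; this is what ensures convergence."""
        self.w_hi += np.round(np.asarray(delta) * 2**self.frac_hi).astype(np.int64)

    def deploy(self):
        """Low-resolution weights handed to the analog NN processor."""
        return np.round(self.w_hi / 2**(self.frac_hi - self.frac_lo)) / 2**self.frac_lo
```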
According to the error calibration method for an analog neural network processor provided by the embodiment of the invention, storing the trainable weight parameters and the weight parameters at a resolution higher than the preset precision ensures the convergence of the algorithm.
On the basis of the above embodiment, the method further includes:
acquiring the weight parameters of the convolutional layers in the network structure, and storing the weight parameters of the convolutional layers at a resolution lower than the preset precision.
Specifically, the device acquires the weight parameters of the convolutional layers in the network structure and stores them at a resolution lower than the preset precision. Reference may be made to the above embodiments, which are not repeated here.
According to the error calibration method for an analog neural network processor provided by the embodiment of the invention, storing the weight parameters of the convolutional layers at a resolution lower than the preset precision reduces the energy and resource consumption of the analog NN processor and thereby improves its efficiency.
On the basis of the above embodiment, the use phase of the NN includes an inference phase and a learning phase; correspondingly, the method further includes:
after the step of storing the learned weight parameters, freezing the learning phase and activating the inference phase.
Specifically, after the step of storing the learned weight parameters, the device freezes the learning phase and activates the inference phase. That is, in most cases only the inference phase is active, so only a small amount of additional energy is consumed.
According to the error calibration method for an analog neural network processor provided by the embodiment of the invention, freezing the learning phase after the step of storing the learned weight parameters further reduces the energy and resource consumption of the analog NN processor and thereby improves its efficiency.
On the basis of the above embodiment, the method further includes:
if an algorithm update and/or an error parameter adjustment is detected, freezing the inference phase and performing the method described above to activate the learning phase.
Specifically, if the device detects an algorithm update and/or an error parameter adjustment, it freezes the inference phase and executes the method described above to activate the learning phase (a minimal sketch of this two-phase control is given below). That is, only in a few cases is the learning phase active, so only a small amount of additional energy is consumed.
According to the error calibration method for an analog neural network processor provided by the embodiment of the invention, freezing the inference phase when an algorithm update and/or an error parameter adjustment is detected further reduces the energy and resource consumption of the analog NN processor and thereby improves its efficiency.
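The sketch below (class and method names are illustrative, not from the patent) captures the switching rule described in the last two embodiments: inference is the default phase, and learning is activated only on an algorithm update or an error-parameter change and frozen again once the new weights are stored.

```python
class CalibrationController:
    """Two-phase control: 'inference' by default, 'learning' only on demand."""

    def __init__(self):
        self.phase = "inference"

    def on_event(self, algorithm_updated: bool, error_params_changed: bool):
        if algorithm_updated or error_params_changed:
            self.phase = "learning"      # freeze inference, activate learning

    def on_weights_stored(self):
        self.phase = "inference"         # freeze learning, reactivate inference
```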
On the basis of the above embodiment, the method further includes:
executing the backward phase of the NN on a processor of the terminal other than the analog NN processor, wherein the terminal carries both the analog NN processor and the other processor.
Specifically, the backward phase of the NN is executed on a processor of the terminal other than the analog NN processor, where the terminal carries both the analog NN processor and the other processor. The terminal may be, for example, a PC, and falls within the scope of the device. The other processor may be a CPU or an FPGA, among others; offloading steps such as the weight update to the other processor further reduces the chip area and energy consumption of the analog NN processor.
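A minimal sketch of this split, with all interfaces hypothetical: the analog NN processor only runs the forward pass, while gradients and fully-connected-layer weight updates are computed on the terminal's CPU or FPGA.

```python
def calibrate(analog_forward, host_backward, fc_weights, batches):
    """Forward on the analog NN processor, backward and weight update on the host."""
    for inputs, targets in batches:
        # forward pass in the analog domain, digitized by low-precision ADCs
        activations = analog_forward(inputs, fc_weights)
        # backward pass and log-quantized SGD update in the digital domain
        fc_weights = host_backward(activations, targets, fc_weights)
    return fc_weights
```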
According to the error calibration method for an analog neural network processor provided by the embodiment of the invention, offloading steps such as the weight update to another processor further reduces the energy and resource consumption of the analog NN processor and thereby improves its efficiency.
On the basis of the above embodiment, the other processors include a CPU or an FPGA.
In particular, the other processors in the device include a CPU or FPGA. Reference may be made to the above embodiments, which are not described in detail.
According to the error calibration method for an analog neural network processor provided by the embodiment of the invention, choosing a CPU or an FPGA as the other processor ensures that the other processor can properly handle steps such as the weight update.
Fig. 3 is a schematic structural diagram of an error calibration apparatus for an analog neural network processor according to an embodiment of the present invention. As shown in Fig. 3, the embodiment of the present invention provides an error calibration apparatus for an analog neural network processor, including an obtaining unit 301, a quantization unit 302, an operation unit 303, and a storage unit 304, where:
the obtaining unit 301 is configured to, if an algorithm update and/or an error parameter adjustment is detected, parse the network structure of the NN to obtain the trainable weight parameters of a fully connected layer in the network structure; the quantization unit 302 is configured to train the trainable weight parameters with a stochastic gradient descent (SGD) algorithm, wherein the loss values and gradients in the learning process are logarithmically quantized, and the learning process is performed in the digital domain; the operation unit 303 is configured to replace the multiplication operations used for back propagation and trainable weight parameter updating in the learning process with shift operations; and the storage unit 304 is configured to store the learned weight parameters so that the NN can calibrate the error of the processor according to the weight parameters.
According to the error calibration device for an analog neural network processor provided by the embodiment of the invention, the loss values and gradients are logarithmically quantized while the weight parameters of the fully connected layer are trained, and after logarithmic quantization the multiplication operations used for back propagation and trainable weight parameter updating are replaced with shift operations, which reduces the energy and resource consumption of the analog NN processor and thereby improves its efficiency.
The error calibration apparatus for an analog neural network processor provided in the embodiments of the present invention may be specifically configured to execute the processing procedures of the above method embodiments; its functions are not repeated here, and reference may be made to the detailed description of the above method embodiments.
Fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present invention. As shown in Fig. 4, the electronic device includes: a processor 401, a memory 402, and a bus 403;
the processor 401 and the memory 402 communicate with each other through the bus 403;
the processor 401 is configured to call the program instructions in the memory 402 to execute the method provided by the above method embodiments, for example including: if an algorithm update and/or an error parameter adjustment is detected, parsing the NN network structure to obtain the trainable weight parameters of a fully connected layer in the network structure; training the trainable weight parameters with a stochastic gradient descent (SGD) algorithm, wherein the loss values and gradients in the learning process are logarithmically quantized, and the learning process is performed in the digital domain; replacing the multiplication operations used for back propagation and trainable weight parameter updating in the learning process with shift operations; and storing the trained weight parameters so that the NN can calibrate the error of the processor according to the weight parameters.
This embodiment discloses a computer program product comprising a computer program stored on a non-transitory computer-readable storage medium, the computer program comprising program instructions which, when executed by a computer, enable the computer to perform the method provided by the above method embodiments, for example including: if an algorithm update and/or an error parameter adjustment is detected, parsing the NN network structure to obtain the trainable weight parameters of a fully connected layer in the network structure; training the trainable weight parameters with a stochastic gradient descent (SGD) algorithm, wherein the loss values and gradients in the learning process are logarithmically quantized, and the learning process is performed in the digital domain; replacing the multiplication operations used for back propagation and trainable weight parameter updating in the learning process with shift operations; and storing the learned weight parameters so that the NN can calibrate the error of the processor according to the weight parameters.
This embodiment provides a non-transitory computer-readable storage medium storing computer instructions that cause a computer to perform the method provided by the above method embodiments, for example including: if an algorithm update and/or an error parameter adjustment is detected, parsing the NN network structure to obtain the trainable weight parameters of a fully connected layer in the network structure; training the trainable weight parameters with a stochastic gradient descent (SGD) algorithm, wherein the loss values and gradients in the learning process are logarithmically quantized, and the learning process is performed in the digital domain; replacing the multiplication operations used for back propagation and trainable weight parameter updating in the learning process with shift operations; and storing the learned weight parameters so that the NN can calibrate the error of the processor according to the weight parameters.
Those of ordinary skill in the art will understand that: all or part of the steps of implementing the method embodiments may be implemented by hardware related to program instructions, and the program may be stored in a computer-readable storage medium, and when executed, executes the steps including the method embodiments; and the aforementioned storage medium includes: various media that can store program codes, such as ROM, RAM, magnetic or optical disks.
The above-described embodiments of the electronic device and the like are merely illustrative, where the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units, may be located in one place, or may also be distributed on multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only used for illustrating the technical solutions of the embodiments of the present invention, and are not limited thereto; although embodiments of the present invention have been described in detail with reference to the foregoing embodiments, those skilled in the art will understand that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.
Claims (6)
1. A method of error calibration for an analog neural network processor, the method being performed in an analog neural network (NN) processor, comprising:
if an algorithm update and/or an error parameter adjustment is detected, parsing the NN network structure to obtain the trainable weight parameters of a fully connected layer in the network structure;
training the trainable weight parameters with a stochastic gradient descent (SGD) algorithm, wherein the loss values and gradients in the learning process are logarithmically quantized, and the learning process is performed in the digital domain;
replacing the multiplication operations used for back propagation and trainable weight parameter updating in the learning process with shift operations;
storing the learned weight parameters for the NN to calibrate the error of the processor according to the weight parameters;
storing the trainable weight parameters and the weight parameters at a resolution higher than a preset precision;
acquiring the weight parameters of a convolutional layer in the network structure, and storing the weight parameters of the convolutional layer at a resolution lower than the preset precision;
wherein the use phase of the NN comprises an inference phase and a learning phase; correspondingly, the method further comprises:
after the step of storing the learned weight parameters, freezing the learning phase and activating the inference phase;
the method further comprising:
if an algorithm update and/or an error parameter adjustment is detected, freezing the inference phase and performing the above step of calibrating the error of the processor to activate the learning phase.
2. The method of claim 1, further comprising:
executing the backward phase of the NN on a processor of the terminal other than the processor, wherein the terminal carries the processor and the other processor.
3. The method of claim 2, wherein the other processor comprises a CPU or FPGA.
4. An error calibration apparatus for an analog neural network processor, the apparatus comprising an analog neural network (NN) processor and further comprising:
an obtaining unit, configured to, if an algorithm update and/or an error parameter adjustment is detected, parse the NN network structure to obtain the trainable weight parameters of a fully connected layer in the NN network structure;
a quantization unit, configured to train the trainable weight parameters with a stochastic gradient descent (SGD) algorithm, wherein the loss values and gradients in the learning process are logarithmically quantized, and the learning process is performed in the digital domain;
an operation unit, configured to replace the multiplication operations used for back propagation and trainable weight parameter updating in the learning process with shift operations;
a storage unit, configured to store the learned weight parameters so that the NN can calibrate the error of the processor according to the weight parameters, to store the trainable weight parameters and the weight parameters at a resolution higher than a preset precision, and to acquire the weight parameters of a convolutional layer in the network structure and store the weight parameters of the convolutional layer at a resolution lower than the preset precision;
wherein the use phase of the NN comprises an inference phase and a learning phase; correspondingly, the apparatus is further configured to:
freeze the learning phase and activate the inference phase after the step of storing the learned weight parameters; and, if an algorithm update and/or an error parameter adjustment is detected, freeze the inference phase and execute the above units to activate the learning phase.
5. An electronic device, comprising: a processor, a memory, and a bus, wherein,
the processor and the memory communicate with each other through the bus;
the memory stores program instructions executable by the processor, the processor invoking the program instructions to perform the method of any of claims 1 to 3.
6. A non-transitory computer-readable storage medium storing computer instructions that cause a computer to perform the method of any one of claims 1 to 3.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810580960.8A CN110580523B (en) | 2018-06-07 | 2018-06-07 | Error calibration method and device for analog neural network processor |
PCT/CN2018/104781 WO2019232965A1 (en) | 2018-06-07 | 2018-09-10 | Error calibration method and device for analog neural network processor |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810580960.8A CN110580523B (en) | 2018-06-07 | 2018-06-07 | Error calibration method and device for analog neural network processor |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110580523A CN110580523A (en) | 2019-12-17 |
CN110580523B true CN110580523B (en) | 2022-08-02 |
Family
ID=68769907
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810580960.8A Active CN110580523B (en) | 2018-06-07 | 2018-06-07 | Error calibration method and device for analog neural network processor |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN110580523B (en) |
WO (1) | WO2019232965A1 (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20220309331A1 (en) * | 2019-06-26 | 2022-09-29 | Ams International Ag | Error compensation in analog neural networks |
CN113705776B (en) * | 2021-08-06 | 2023-08-08 | 山东云海国创云计算装备产业创新中心有限公司 | Method, system, equipment and storage medium for realizing activation function based on ASIC |
Family Cites Families (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8352400B2 (en) * | 1991-12-23 | 2013-01-08 | Hoffberg Steven M | Adaptive pattern recognition based controller apparatus and method and human-factored interface therefore |
US7409372B2 (en) * | 2003-06-20 | 2008-08-05 | Hewlett-Packard Development Company, L.P. | Neural network trained with spatial errors |
CN105279554B (en) * | 2015-09-29 | 2017-12-15 | 东方网力科技股份有限公司 | The training method and device of deep neural network based on Hash coding layer |
EP3497624A1 (en) * | 2016-08-13 | 2019-06-19 | Intel Corporation | Apparatuses, methods, and systems for neural networks |
CN106355248A (en) * | 2016-08-26 | 2017-01-25 | 深圳先进技术研究院 | Deep convolution neural network training method and device |
US10255910B2 (en) * | 2016-09-16 | 2019-04-09 | Apptek, Inc. | Centered, left- and right-shifted deep neural networks and their combinations |
CN107092960A (en) * | 2017-04-17 | 2017-08-25 | 中国民航大学 | A kind of improved parallel channel convolutional neural networks training method |
CN107395211B (en) * | 2017-09-12 | 2020-12-01 | 苏州浪潮智能科技有限公司 | Data processing method and device based on convolutional neural network model |
CN107944458A (en) * | 2017-12-08 | 2018-04-20 | 北京维大成科技有限公司 | A kind of image-recognizing method and device based on convolutional neural networks |
CN107992842B (en) * | 2017-12-13 | 2020-08-11 | 深圳励飞科技有限公司 | Living body detection method, computer device, and computer-readable storage medium |
-
2018
- 2018-06-07 CN CN201810580960.8A patent/CN110580523B/en active Active
- 2018-09-10 WO PCT/CN2018/104781 patent/WO2019232965A1/en active Application Filing
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104899641A (en) * | 2015-05-25 | 2015-09-09 | 杭州朗和科技有限公司 | Deep neural network learning method, processor and deep neural network learning system |
CN107665364A (en) * | 2016-07-28 | 2018-02-06 | 三星电子株式会社 | Neural net method and equipment |
CN108009640A (en) * | 2017-12-25 | 2018-05-08 | 清华大学 | The training device and its training method of neutral net based on memristor |
Non-Patent Citations (1)
Title |
---|
Analog in-memory subthreshold deep neural network accelerator; L. Fick et al.; 2017 IEEE Custom Integrated Circuits Conference (CICC); 2017-07-27; pp. 1-4 *
Also Published As
Publication number | Publication date |
---|---|
WO2019232965A1 (en) | 2019-12-12 |
CN110580523A (en) | 2019-12-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
KR20220086694A (en) | Memristor-based neural network training method and training device therefor | |
CN111353579A (en) | Method and system for selecting quantization parameters for a deep neural network using back propagation | |
CN110580523B (en) | Error calibration method and device for analog neural network processor | |
US20240046086A1 (en) | Quantization method and quantization apparatus for weight of neural network, and storage medium | |
US20210081785A1 (en) | Information processing device and method, and recording medium storing information processing program | |
CN113642711B (en) | Processing method, device, equipment and storage medium of network model | |
CN107798332B (en) | User behavior prediction method and device | |
CN115392441A (en) | Method, apparatus, device and medium for on-chip adaptation of quantized neural network model | |
US20190377548A1 (en) | Arithmetic processing apparatus, control method, and recording medium | |
CN110503182A (en) | Network layer operation method and device in deep neural network | |
CN109102067B (en) | Method for increasing and decreasing neural network nodes, computer device and storage medium | |
CN114492794A (en) | Method, apparatus, device, medium and product for processing data | |
US20220036190A1 (en) | Neural network compression device | |
CN112016702A (en) | Medical data processing method, device, equipment and medium based on transfer learning | |
US20220405561A1 (en) | Electronic device and controlling method of electronic device | |
CN115577491A (en) | Parameter correction method and device, electronic equipment and storage medium | |
CN111476356B (en) | Memristive neural network training method, device, equipment and storage medium | |
EP3742283A1 (en) | Arithmetic processing device, method for controlling arithmetic processing device, and program for controlling arithmetic processing device | |
CN110929849B (en) | Video detection method and device based on neural network model compression | |
CN110852361B (en) | Image classification method and device based on improved deep neural network and electronic equipment | |
CN114243690A (en) | Power grid active safety correction method and device, electronic equipment and storage medium | |
GB2607832A (en) | Learning with moment estimation using different time constants | |
CN111092602A (en) | Modeling method and device of power amplifier, computer equipment and storage medium | |
US20230008014A1 (en) | Data processing device, data-processing method and recording media | |
EP4459509A1 (en) | Data processing method and apparatus and edge computing device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |