CN1422048A - Solution to local failure of memory - Google Patents
- Publication number
- CN1422048A
- Authority
- CN
- China
- Prior art keywords
- buffer
- memory
- board
- self-test
- logic
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Techniques For Improving Reliability Of Storages (AREA)
Abstract
A method for handling local memory failure: the logic circuit of the logic IC or ASIC chip itself self-tests the memory buffer by buffer. If every memory cell in a buffer is normal, the buffer's start address is written to the free buffer queue for later use. If a buffer contains a failed memory cell, its start address is not written to the queue, so the buffer is never accessed again. During the self-test, failed buffers are counted for subsequent handling.
Description
Technical field
The present invention relates to a method for handling partial memory failure so as to improve the operating reliability and fault tolerance of the whole system. The method is most useful where memory is used in blocks, for example in the logic circuit design of message store-and-forward or ATM cell segmentation/reassembly applications. The invention belongs to the technical field of logic IC or ASIC chip circuit design.
Background art
In the circuit design of logic ICs or ASIC chips for applications such as message store-and-forward or ATM cell segmentation/reassembly, a large-capacity memory is often needed to hold messages temporarily. The memory is generally divided into a number of buffers, each of which can hold one message.
Figure 1 shows a block diagram of the logic circuit currently used in applications such as message store-and-forward or ATM cell segmentation/reassembly. Its basic operating principle is as follows:
(1) After system reset, the memory self-test is performed first. The self-test can be carried out either by the logic circuit of the logic IC or ASIC chip itself, or by a CPU connected to that chip through the memory access channel the chip provides. Because the memory is large, a CPU-driven self-test is too slow, so the self-test is usually performed by the chip's own logic circuit. The usual method is to write data to a memory cell, read the cell back, and compare the two values; if they match, the cell is considered normal. If all cells of the memory pass, the memory self-test is judged normal. When the self-test finishes, a status flag indicating whether the self-test found an error is output so that the CPU can evaluate it and respond accordingly. For example, if the self-test finds a partial failure in the memory, the CPU must raise an alarm to notify maintenance personnel to take appropriate action, such as replacing the board. In the figure, thick solid lines represent the path of message data and thin solid lines represent the path of buffer start addresses.
(2) Once the memory self-test passes, the free buffer queue is initialized, that is, the start address of each buffer is written to the free buffer queue. The free buffer queue and the send buffer queue are both in fact first-in first-out memories (FIFOs). The free buffer queue holds the start addresses of free buffers; the send buffer queue holds the start addresses of buffers that contain a message waiting to be sent.
(3) On receiving a message, the receive state machine reads the start address of a free buffer from the free buffer queue and stores the received message in the corresponding buffer of the large-capacity memory. Once the message has been fully received, it writes the buffer's start address to the send buffer queue.
(4) When the send state machine detects data in the send buffer queue, it first reads from that queue the start address of the buffer holding the message, then reads the message from the large-capacity memory at that address, processes it as required, and sends it. After the message has been sent, the send state machine writes the buffer's start address back to the free buffer queue to release the buffer. This workflow completes the store-and-forward handling of messages.
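The queue handling in steps (2) to (4) can be sketched in software as follows. This is only an illustrative model of what the patent describes as hardware FIFOs and state machines: the names `BUF_SIZE`, `NUM_BUFS`, `receive` and `transmit` are hypothetical, and storing the message length next to the start address in the send queue is an assumption made to keep the sketch short.

```python
from collections import deque

BUF_SIZE = 4   # words per buffer (hypothetical value)
NUM_BUFS = 3   # number of buffers (hypothetical value)

memory = [None] * (BUF_SIZE * NUM_BUFS)

# Step (2): initialise the free buffer queue with every buffer's start address.
free_queue = deque(i * BUF_SIZE for i in range(NUM_BUFS))
send_queue = deque()


def receive(message):
    """Step (3): take a free buffer, store the message, queue it for sending."""
    if len(message) > BUF_SIZE:
        raise ValueError("message does not fit in one buffer")
    if not free_queue:
        return False                       # no free buffer available
    base = free_queue.popleft()
    memory[base:base + len(message)] = message
    send_queue.append((base, len(message)))
    return True


def transmit():
    """Step (4): read the queued message out, then release its buffer."""
    if not send_queue:
        return None
    base, length = send_queue.popleft()
    message = memory[base:base + length]
    free_queue.append(base)                # buffer returns to the free queue
    return message
```

Because buffers are only ever reached through addresses taken from the free buffer queue, a buffer whose start address is never placed in that queue is never used.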
With the rapid development of microelectronics, memory chip capacity keeps growing: a single chip can now integrate several hundred million transistors, and the scale of memory chips continues to increase rapidly. At the same time, the process technology used to manufacture integrated circuits is increasingly advanced and line widths keep shrinking, which inevitably makes failures of individual memory cells much more likely.
If some memory cells in a buffer fail, a message held temporarily in that buffer is likely to be corrupted when sent. In that case the board is generally considered faulty: the whole memory chip must be replaced and the board returned to the manufacturer for repair. The total cost of the repair is far higher than the cost of the memory chip itself, even though only a few memory cells of the chip have failed. More seriously, users may get the impression that the product's quality cannot be guaranteed, which seriously damages the product's image.
Summary of the invention
The object of the present invention is to provide a method for handling partial memory failure that improves the operating reliability and fault tolerance of the whole system. The method addresses both the message transmission errors caused by partial memory failure and the resulting high board failure rate. From the system's point of view, the memory chip behaves as if no partial failure had occurred; its capacity is merely slightly smaller. This greatly improves the reliability and fault tolerance of the whole system and reduces the board failure rate and repair rate.
The object of the invention is achieved as follows. A method for handling partial memory failure, characterized in that: the logic circuit of the logic IC or ASIC chip itself self-tests the memory buffer by buffer. The self-test writes data sequentially to each memory cell of the memory and then compares that data with the data read back from the buffer; if the two match, the tested memory cell of the buffer is considered normal. If all memory cells of a buffer pass, then once the buffer's self-test is complete its start address is written to the free buffer queue, and the buffer can be used in the subsequent operation of the logic IC or ASIC chip. If one or more memory cells of a buffer are found to have failed, the buffer's start address is not written to the free buffer queue, so that buffer is never accessed during normal operation of the logic IC or ASIC chip. In addition, the memory self-test circuit maintains a counter of failed buffers during the self-test. After the self-test, the CPU reads the counter value and acts on it: if the number of failed buffers is small and has little effect on the board's function and performance, the board can be considered normal and allowed to operate as usual; if the number of failed buffers is large enough that it may significantly affect the board's function and performance, an alarm signal should be raised to request repair or replacement of the board.
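As a rough software model of the buffer-by-buffer self-test described above, the sketch below tests each buffer, enqueues only the healthy ones, and counts the failed ones. The patent describes a hardware circuit; the function names, the 0x55/0xAA test patterns, and the simulated stuck cell are all assumptions introduced for illustration.

```python
from collections import deque


def buffer_ok(read, write, base, size, patterns=(0x55, 0xAA)):
    """Write each pattern to every cell of one buffer, then read it back."""
    for pattern in patterns:
        for addr in range(base, base + size):
            write(addr, pattern)
        for addr in range(base, base + size):
            if read(addr) != pattern:
                return False
    return True


def self_test_by_buffer(read, write, num_bufs, buf_size):
    """Return (free_queue, failed_count) after testing every buffer."""
    free_queue = deque()
    failed_count = 0
    for index in range(num_bufs):
        base = index * buf_size
        if buffer_ok(read, write, base, buf_size):
            free_queue.append(base)    # healthy: start address is enqueued
        else:
            failed_count += 1          # faulty: never enqueued, never used
    return free_queue, failed_count


# Simulated memory with one cell that ignores writes (hypothetical fault).
cells = [0] * 32
stuck = {9}                            # address 9 lies in the second buffer


def write(addr, value):
    if addr not in stuck:
        cells[addr] = value


def read(addr):
    return cells[addr]


free_queue, failed = self_test_by_buffer(read, write, num_bufs=4, buf_size=8)
```

In this simulated run the second buffer (start address 8) is left out of the free queue and the failure counter ends at 1, while the other three buffers remain usable.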
With the method of the invention, when memory cells of the memory are found to have failed, the buffers containing the failed cells are simply no longer used, while the remaining healthy buffers continue to work normally, so the whole memory chip does not need to be replaced. From the system's point of view, the memory chip behaves as if no partial failure had occurred; its capacity is merely slightly smaller, which is entirely acceptable in the great majority of systems. The invention can therefore greatly improve the reliability and fault tolerance of the whole system and reduce the board failure rate and repair rate, which is particularly valuable in applications that demand high operating reliability and continuous operation.
Description of drawings
Fig. 1 is a block diagram of the hardware logic circuit currently used in applications such as message store-and-forward or ATM cell segmentation/reassembly.
Embodiment
The present invention is a method for handling partial memory failure that improves the operating reliability and fault tolerance of the whole system. Specifically, the logic circuit of the logic IC or ASIC chip itself self-tests the memory buffer by buffer. The self-test writes data sequentially to each memory cell of each buffer of the memory and then compares that data with the data read back from the buffer; if the two match, the tested memory cell of the buffer is considered normal. If all memory cells of a buffer pass, then once the buffer's self-test is complete its start address is written to the free buffer queue, and the buffer can be used in the subsequent operation of the logic IC or ASIC chip. If one or more memory cells of a buffer are found to have failed, the buffer's start address is not written to the free buffer queue, so that buffer is never accessed again during normal operation of the logic IC or ASIC chip. In addition, the memory self-test circuit maintains a counter of failed buffers during the self-test. After the self-test, the CPU reads the counter value and acts on it: if the number of failed buffers is small and has little effect on the board's function and performance, the board can be considered normal and allowed to operate as usual; if the number of failed buffers is large enough that it may significantly affect the board's function and performance, an alarm signal should be raised to request repair or to prompt the user to replace this data storage board in time.
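The CPU-side decision can be sketched as a simple threshold check. The patent does not say how many failed buffers count as "few" or "many", so the 5% limit below, like the function name `board_status`, is purely an assumption for illustration.

```python
def board_status(failed_count, total_buffers, max_failed_fraction=0.05):
    """Classify the board from the failed-buffer counter read after self-test.

    The default 5% limit is an assumed value, not one given by the patent.
    """
    if total_buffers <= 0:
        raise ValueError("total_buffers must be positive")
    if failed_count / total_buffers <= max_failed_fraction:
        return "normal"    # few failures: keep the board in service
    return "alarm"         # many failures: request repair or replacement
```

In a real system the limit would presumably be chosen from how much buffer capacity the application can afford to lose.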
When self-testing each buffer of the memory, a memory test method can be chosen according to the implementation complexity and the system's requirement on the probability of detecting memory errors. Examples include the scan pattern method, the checkerboard pattern method, the MATS algorithm, the walking pattern method, the nine-step algorithm, the extended nine-step algorithm, the 13-step algorithm, and the March C algorithm. Details of these algorithms can be found in the relevant literature and are not repeated here.
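As one example of such a test, the sketch below applies MATS to a single buffer, using the sequence as it is commonly written in the memory-test literature: write 0 to all cells, then read 0 and write 1 in each cell, then read 1 from all cells. The patent does not spell out any of the algorithms, so the word-wide data backgrounds, the function name `mats`, and the simulated stuck-at-0 cell are assumptions.

```python
def mats(read, write, base, size, width=8):
    """Run MATS over one buffer: {w0}; {r0, w1}; {r1}. Return True if it passes."""
    zeros, ones = 0, (1 << width) - 1
    cells = range(base, base + size)
    for addr in cells:                 # element 1: write 0 everywhere
        write(addr, zeros)
    for addr in cells:                 # element 2: read 0, then write 1
        if read(addr) != zeros:
            return False
        write(addr, ones)
    for addr in cells:                 # element 3: read 1
        if read(addr) != ones:
            return False
    return True


# Simulated 16-word memory with one stuck-at-0 cell (hypothetical fault).
cells = [0] * 16
stuck_at_zero = {5}


def write(addr, value):
    cells[addr] = 0 if addr in stuck_at_zero else value


def read(addr):
    return cells[addr]


first_buffer_ok = mats(read, write, base=0, size=8)
second_buffer_ok = mats(read, write, base=8, size=8)
```

Here the first buffer fails because address 5 cannot hold a 1, and the second buffer passes. Longer march tests trade more memory accesses for coverage of more fault types, which is the trade-off between complexity and detection probability mentioned above.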
The applicant has simulated the method of the invention on computers and on several devices and systems, and has tested it in actual projects. These practical tests achieved the object of the invention and showed that the method is simple to implement, operates reliably, and has good application prospects.
Claims (1)
1. A method for handling partial memory failure, characterized in that: the logic circuit of the logic IC or ASIC chip itself self-tests the memory buffer by buffer; if all memory cells of a buffer pass the self-test, then once the buffer's self-test is complete its start address is written to the free buffer queue, and the buffer can be used in the subsequent operation of the logic IC or ASIC chip; if one or more memory cells of a buffer are found to have failed, the buffer's start address is not written to the free buffer queue, so that buffer is never accessed during normal operation of the logic IC or ASIC chip; in addition, the memory self-test circuit maintains a counter of failed buffers during the self-test, and after the self-test the CPU reads the counter value and acts on it: if the number of failed buffers is small and has little effect on the board's function and performance, the board can be considered normal and allowed to operate as usual; if the number of failed buffers is large enough that it may significantly affect the board's function and performance, an alarm signal should be raised to request repair or replacement of the board.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CNB011350881A CN1288882C (en) | 2001-11-27 | 2001-11-27 | Solution to local failure of memory |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CNB011350881A CN1288882C (en) | 2001-11-27 | 2001-11-27 | Solution to local failure of memory |
Publications (2)
Publication Number | Publication Date |
---|---|
CN1422048A (en) | 2003-06-04 |
CN1288882C (en) | 2006-12-06 |
Family
ID=4672943
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CNB011350881A Expired - Fee Related CN1288882C (en) | 2001-11-27 | 2001-11-27 | Solution to local failure of memory |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN1288882C (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101742038B (en) * | 2008-11-14 | 2012-08-22 | 夏普株式会社 | Image processing apparatus |
CN114315541A (en) * | 2022-01-17 | 2022-04-12 | 万华化学(四川)有限公司 | Cyclohexanone composition and application thereof |
CN115292114A (en) * | 2022-10-09 | 2022-11-04 | 中科声龙科技发展(北京)有限公司 | Data storage method, device, equipment and storage medium based on ETHASH algorithm |
-
2001
- 2001-11-27 CN CNB011350881A patent/CN1288882C/en not_active Expired - Fee Related
Also Published As
Publication number | Publication date |
---|---|
CN1288882C (en) | 2006-12-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN102084430B (en) | Method and apparatus for repairing high capacity/high bandwidth memory devices | |
CN101589370B (en) | A parallel computer system and fault recovery method therefor | |
CN102541756A (en) | Cache memory system | |
US20080071499A1 (en) | Run-time performance verification system | |
JP2001350651A (en) | Method for isolating failure state | |
US9141463B2 (en) | Error location specification method, error location specification apparatus and computer-readable recording medium in which error location specification program is recorded | |
CN102932444A (en) | Load balancing module in financial real-time trading system | |
US20040216003A1 (en) | Mechanism for FRU fault isolation in distributed nodal environment | |
US6950978B2 (en) | Method and apparatus for parity error recovery | |
CN101150458A (en) | Method and device for single board detection | |
CN105959235A (en) | Distributed data processing system and method | |
JPH07183898A (en) | Method for recovering predetermined order for cell style of asymmetric order in atm exchange technology | |
CN101299685B (en) | Method and system for testing switching network as well as test initiation module | |
CN107203335A (en) | Storage system and its operating method | |
CN1288882C (en) | Solution to local failure of memory | |
CN108228669A (en) | A kind of method for caching and processing and device | |
CN104780123B (en) | A kind of network pack receiving and transmitting processing unit and its design method | |
CN101458305A (en) | Embedded module test and maintenance bus system | |
CN101634939B (en) | Fast addressing device and method thereof | |
CN102135941B (en) | Method and device for writing data from cache to memory | |
JP3401160B2 (en) | Distributed shared memory network device | |
US6928588B2 (en) | System and method of improving memory yield in frame buffer memory using failing memory location | |
CN112613254B (en) | System and method for verifying fault injection of mirror image control module in processor | |
RU2383067C2 (en) | Method of storing data packets using pointer technique | |
US7788546B2 (en) | Method and system for identifying communication errors resulting from reset skew |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20061206 Termination date: 20161127 |