ECC circuit failure verifier

Application No.: EP87113099.3; Filing date: 1987-09-08; Publication No.: EP0265639B1; Publication date: 1994-03-23
Applicant: International Business Machines Corporation; Inventor: Aichelmann, Frederick John, Jr.
Claims

1. Circuit for detecting failures in an error detection syndrome generation path in an error checking and correction circuit including an Error Correcting Code (ECC) circuit for receiving data bits Di and check bits Cj of an ECC word stored in storage including a check bit generator (18) for generating check bits Gj in accordance with an error correcting code to be compared with the stored check bits Cj, a check bit comparator (20) for comparing each of the generated check bits Gj to its related memory check bit Cj and for generating a related syndrome bit Sj, a check syndrome block (24) for determining whether there is one correctable error, zero or multiple correctable error in the ECC word, an error bit locator (22) for determining the bit location of any single error occurring in the ECC word, a data bit modifier (16) receiving said data bits Di and the output of said error-bit locator (22) for correcting the single error, said check syndrome block (24) deriving from said syndrome bits Sj a single E bit by means of a logical Exclusive OR operation and characterized in that it further includes:

  • a parity-bit generator (40) for generating a parity bit Pk from each of K data fields in the ECC word,
  • a parity-bit combiner (44) receiving said parity bits Pk for producing a first Q1 bit derived from the first half ( P1-PK/2 ) of a set of K parity bits Pk received from said generator (40) by means of an exclusive OR operation, and a second Q2 bit derived from the second half ( P((K/2)+1)-PK ) of said set of K parity bits Pk received,
  • a check-bit combiner (46) receiving said check bits ( C1-Cj/2 ) for producing a first M1 bit from a logical exclusive OR combination of the first half of the check bits Cj, and a M2 bit derived by means of a logical exclusive OR combination of the second half of said check bits ( (Cj/2)+1-Cj ),
  • a comparator (48) for generating a first H1 bit derived from an exclusive OR operation of said M2 bit with said Q1 bit, and for generating a H2 bit derived from an exclusive OR operation of said M1 bit with said Q2 bit,
  • a H-bit combiner (54) for deriving from said H1 and H2 bits a D bit by means of a logical exclusive OR operation,
  • a comparator (52) for comparing said D bit with said E bit in order to detect the occurrence of an error in the syndrome generation path.

2. A circuit according to claim 1 wherein two and only two diagonal quadrants of the code matrix of the error correction code used each are composed entirely of columns which have an even number of ones, and the other two diagonal quadrants are composed entirely of columns which have an odd number of ones.
Description

Background of the Invention

The present invention relates generally to error checking and correction (ECC) circuits for computer memory systems, and more particularly to circuits for detecting failures in ECC circuitry.

Each new generation of computer systems substantially increases the number of high-bit-density chips utilized in the memory. This chip increase provides a corresponding increase in memory capacity. However, such large capacity memory systems utilizing high-density memory chips are much more susceptible to memory chip failure. The most common types of chip failures include single cell, word line, bit line, and chip fail faults. In addition to these hard faults, computer chip memories are susceptible to soft errors caused by alpha particle radiation.

However, it has long been recognized that the integrity of the data bits stored in and retrieved from such memories is critical to the accuracy of calculations performed in the data processing system. In this regard, the alteration of one bit in a data word could dramatically affect arithmetic calculations or could change the meaning of recorded data.

Accordingly, in order to minimize the consequences of hard and soft memory errors, error checking and correction (ECC) circuits are routinely included in computer systems. These ECC circuits typically utilize an error correcting code in the class of codes known as single-error correcting, double error detecting (SEC-DED) codes. Such SEC-DED codes are capable of correcting one error per data word and detecting two errors therein. Of particular advantage are the odd weight column error codes because of the speed, cost, and reliability of the attendant decoding logic. Examples of such codes are disclosed in Fig. 3 of Chen and Hsiao, IBM J. Res. Develop, Vol. 28, No. 2, March 1984.

The above described ECC circuits with their error correcting codes require the storage of a predetermined number of check bits, Cj, along with the data bits, Di, in the ECC word. For example, for 64 data bits, Di, typically 8 check bits, Cj, are generated by means of an error correcting code algorithm circuit which implements an algorithm of the type disclosed in the above article. These check bits are then stored along with the word data bits. Upon readout, the data bits, Di, read from an addressable memory location, are again run through an error correcting code algorithm circuit to generate a second set of check bits, Gj. This newly generated set of check bits is compared to the memory stored check bits, Cj, to obtain syndrome bits, Sj. If any of these syndrome bits is a one, indicating a difference in the compared check bits Gj and Cj, then it is known that the stored data word contains an error. If it is a single error, then the syndrome bits, Sj, can be decoded to determine the error location in the word, and the error corrected.

US-A-3 688 265 entitled "Error free decoding for failure-tolerant memories" discloses an ECC circuit including a syndrome detection path for providing the correction and for detection of errors. Such a circuit is also shown in US-A-3 891 969 entitled "Syndrome logic checker for an error correcting code decoder" by B.A. Christensen.

However, the above described ECC circuits for generating the syndrome bits, and the additional memory necessary to store the check bits, Cj, are both subject to failures. In this regard, errors can occur in the generation of the error correction code signals through circuit faults, through the erroneous recording or readback of the error correction code signals, or through read/write circuit failures. Such failures would lead to the indication of erroneous data, with the possibility of correct data being altered in the ECC circuit, when, in fact, the error occurred in the error checking and correction circuit.

The invention as claimed is intended to provide a fault detection capability for the ECC circuit itself.

The advantage offered by the present invention is that it provides the foregoing fault detection of the ECC circuit, but without replicating, and independent of, the ECC circuit. An additional advantage is that the present invention provides this fault detection capability without the need for inserting an additional bit in the ECC word. Finally, the present circuit may be used to quickly determine if all of the data bits in an ECC word are correct a number of cycles in advance of the completion of normal ECC operations.

Summary of the Invention

The present invention concerns a circuit for detecting failures in an error detection syndrome generation path in an error checking and correction (ECC) circuit, including an Error Correcting Code (ECC) circuit for receiving data bits Di and check bits Cj of an ECC word stored in storage including a check bit generator for generating check bits Gj in accordance with an error correcting code to be compared with the stored check bits Cj, a check bit comparator for comparing each of the generated check bits Gj to its related memory check bit Cj and for generating a related syndrome bit Sj, a check syndrome block for determining whether there is one correctable error, zero or multiple correctable error in the ECC word, an error bit locator for determining the bit location of any single error occurring in the ECC word, a data bit modifier receiving said data bits Di and the output of said error-bit locator for correcting the single error, said check syndrome block (24) deriving from said syndrome bits Sj a single E bit by means of a logical Exclusive OR operation, the circuit being characterized by:

  • a parity-bit generator for generating a parity bit Pk from each of K data fields in the ECC word,
  • a parity-bit combiner receiving said parity bits Pk for producing a first Q1 bit derived from the first half ( P1-PK/2 ) of a set of K parity bits Pk received from said generator by means of an exclusive OR operation, and a second Q2 bit derived from the second half ( P((K/2)+1)-PK ) of said set of K parity bits Pk received,
  • a check-bit combiner receiving said check bits ( C1-Cj/2 ) for producing a first M1 bit from a logical exclusive OR combination of the first half of the check bits Cj, and a M2 bit derived by means of a logical exclusive OR combination of the second half of said check bits ( (Cj/2)+1-Cj ),
  • a comparator for generating a first H1 bit derived from an exclusive OR operation of said M2 bit with said Q1 bit, and for generating a H2 bit derived from an exclusive OR operation of said M1 bit with said Q2 bit,
  • a H-bit combiner for deriving from said H1 and H2 bits a D bit by means of a logical exclusive OR operation,
  • a comparator for comparing said D bit with said E bit in order to detect the occurrence of an error in the syndrome generation path.

Brief Description of the Drawings

The figure is a schematic block diagram of one embodiment of the present invention.

Detailed Description Of The Preferred Embodiment

The present invention has wide applicability to error checking and correcting circuits which utilize error correction codes. However, in order to provide a detailed description of one embodiment of the present invention, the invention will be described with respect to the particular error checking and correction circuit 10 shown in the Figure.

Referring now to the Figure, ECC circuit 10 receives data bits, Di, from an ECC word in a memory chip via line 12, and check bits, Cj, from the same ECC word via line 14. By way of example, and not by way of limitation, an ECC word with 64 data bits and 8 check bits will be used to set the context for the present invention.

The data bits, Di, are applied directly to a data bit modifier 16 and to a check bit generator 18. The data bit modifier 16 operates to correct data in ECC words containing one and only one bit error therein. The check bit generator 18, in combination with a check bit comparator 20, an error bit locator 22 and a check syndrome block 24, operates to determine whether there is a single bit error, to determine its location in the data word, and to provide a signal to correct the data bit at that location in the data bit modifier 16. The check bit generator 18, the check bit comparator 20, the error bit locator 22, and the check syndrome block 24 may take a variety of different circuit configurations. However, it is preferred that these blocks be designed to implement an odd weight column SEC-DED code of the type described by M.Y. Hsiao in the article "A Class Of Optimal Minimum Odd Weight Column SEC-DED Codes", IBM J. Res. Develop. 14, pages 395-401 (July 1970). In particular, it is preferred that the odd weight column SEC-DED code utilized have a code matrix with two and only two diagonal quadrants composed entirely of columns which have an even number of ones therein, and with the other two quadrants composed entirely of columns which have an odd number of ones therein. A typical code of this type is shown in Table 1 for a 72/64 bit ECC word.

The code matrix of Table 1 is for forming the check bits, Cj, which can be used to locate an erroneous data bit in an ECC word. As noted previously, these check bits are typically disposed as the last set of bits in the ECC word. Table 1 is an example of a code matrix for generating 8 check bits, C₁ to C₈, for error checking a 64-bit data word.

In the code matrix, each row represents the particular data bits from the 64 data bits, Di, which are to be logically combined to form the check bit designated at the far right side adjacent to that row. For example, for generating check bit, C₁, the matrix indicates that the following data bits, Di, are to be logically combined, using, for example, five levels of exclusive OR operations:



C₁ = D₁ + D₂ + D₃ + D₄ + D₅ + D₆ + D₇ + D₈
   + D₂₅ + D₂₆ + D₂₇ + D₂₈ + D₂₉ + D₃₀ + D₃₁ + D₃₂
   + D₃₃ + D₃₇ + D₃₈ + D₃₉ + D₄₁ + D₄₅ + D₄₆ + D₄₇
   + D₄₉ + D₅₃ + D₅₄ + D₅₅ + D₅₇ + D₆₁ + D₆₂ + D₆₃,



where + is the exclusive OR function. Such a five level exclusive OR tree would require 31 exclusive OR 2-input gates.
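By way of illustration only, the C₁ equation above can be sketched in Python. The names `C1_TAPS`, `check_bit_c1` and `xor_tree_cost`, and the choice of a 0/1 list with D₁ at index 0, are assumptions made for this sketch and do not come from the patent. The second helper counts the gates and levels of a balanced tree of 2-input exclusive OR gates, which for 32 inputs gives the 31 gates and five levels stated in the text.

```python
from functools import reduce
from operator import xor

# 1-based indices of the data bits that feed C1, copied from the equation above.
C1_TAPS = [1, 2, 3, 4, 5, 6, 7, 8,
           25, 26, 27, 28, 29, 30, 31, 32,
           33, 37, 38, 39, 41, 45, 46, 47,
           49, 53, 54, 55, 57, 61, 62, 63]


def check_bit_c1(data_bits):
    """XOR the tapped bits of a 64-element list of 0/1 values (D1 is data_bits[0])."""
    return reduce(xor, (data_bits[i - 1] for i in C1_TAPS))


def xor_tree_cost(n_inputs):
    """Return (gates, levels) for a balanced tree of 2-input XOR gates."""
    gates, levels = 0, 0
    while n_inputs > 1:
        gates += n_inputs // 2            # one gate per pair at this level
        n_inputs = (n_inputs + 1) // 2    # an unpaired input passes through
        levels += 1
    return gates, levels
```

Calling `xor_tree_cost(len(C1_TAPS))` reproduces the gate and level count quoted for one check bit.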

The logical combination of each of the eight rows separately in the code in accordance with the matrix of Table 1 yields the eight check bits C₁ to C₈. Accordingly, if the exclusive OR function is utilized, then 8 x 31 exclusive OR gates are required in the check bit generator 18 to generate the 8 check bits.

When the check bits to be stored with the word in memory are generated from the code matrix prior to the word storage, they are referred to as memory check bits, Cj. When the stored word is read out and the check bits are generated again in order to be compared with the stored memory check bits, Cj, then they are referred to as generated check bits, Gj.

Continuing with the example of a 64 data bit/8 check bit ECC word, the eight generated check bits, Gj, are applied as one input to the check bit comparator 20, while the eight stored memory check bits, Cj, are applied as a second input thereto. The check bit comparator 20 compares each of the generated check bits, Gj, to its related memory check bit, Cj, to generate a syndrome bit, Sj. For example, G₃ is compared to C₃ (G₃ + C₃) to yield the syndrome bit S₃. If G₃ and C₃ are the same, then there is no error in this check bit and S₃ = 0. If G₃ and C₃ are not the same, then this check bit is in error, and S₃ = 1.

Typically, this comparison function to form the eight syndrome bits may be accomplished by means of one level of exclusive OR gates, i.e., one exclusive OR gate for generating the function Sj = Cj + Gj for each check bit comparison.
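A minimal sketch of the check bit comparison, assuming the stored and generated check bits are held as equal-length lists of 0/1 values (the function name `syndrome_bits` is hypothetical):

```python
def syndrome_bits(stored_check_bits, generated_check_bits):
    """Return Sj = Cj XOR Gj for each check-bit position (one XOR gate each)."""
    if len(stored_check_bits) != len(generated_check_bits):
        raise ValueError("check-bit vectors must have equal length")
    return [c ^ g for c, g in zip(stored_check_bits, generated_check_bits)]


# Example: only the third check bit differs, so only S3 is set.
stored = [1, 0, 1, 1, 0, 0, 1, 0]
generated = [1, 0, 0, 1, 0, 0, 1, 0]
s = syndrome_bits(stored, generated)
```

For eight check bits this corresponds to the single level of eight exclusive OR gates described above.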

Continuing with this 8 check bit example, the eight syndrome bits, Sj, resulting from the check bit comparison in the check bit comparator 20, are applied to the error bit locator 22, the check syndrome block 24, and an error/no error tester block 26. The error bit locator 22 operates to determine the bit location of any single error occurring in the 64 data bits in the ECC word. A number of circuit configurations may be utilized to accomplish this error location function. By way of example, sixty four 8-way AND gates, one for each data bit in the ECC word, may be used to determine the bit error location in the ECC word. The eight inputs to each of the AND gates are either the true or the complement of each of the eight syndrome bits, Sj, as is well known in the art. If any AND gate is activated (all eight inputs are high), then it will cause a data bit, Di, associated therewith to be inverted.

The check syndrome block 24 performs the function of determining whether there is one correctable error, or zero or multiple uncorrectable errors in the sixty four data bits in the ECC word. If there is one correctable error in the ECC word, the check syndrome block 24 generates a one output on line 25, which operates to activate the error bit locator 22, to transmit its correction information to the data bit modifier block 16. If this line 25 output is at a zero or even level, indicating that there are either no errors or multiple uncorrectable errors in the ECC word, then the error bit locator 22 is not activated to transmit its output. The foregoing function may be implemented simply by logically combining the syndrome bits into a single bit E. For example, this logical combination could take the form of a three level exclusive OR circuit (7 exclusive OR gates) that performs the function



E = S₁ + S₂ + S₃ + S₄ + S₅ + S₆ + S₇ + S₈



The error/no error tester block 26 performs the function of determining whether there are zero errors or one or more errors. This function can be simply implemented by means of an 8-way regular OR gate with the eight syndrome bits, Sj, as inputs. If there are zero errors, then the output on line 27 therefrom is zero or even. If there are one or more errors, then the output on line 27 is one or odd.
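The two reductions just described, the exclusive OR that forms the E bit and the regular OR of the error/no error tester, can be sketched as follows. The function names are assumptions for illustration, and the syndrome is taken to be a list of 0/1 values.

```python
from functools import reduce
from operator import xor


def e_bit(syndrome):
    """Check syndrome block 24: E is the XOR (parity) of all syndrome bits."""
    return reduce(xor, syndrome, 0)


def error_flag(syndrome):
    """Error/no error tester 26: regular OR of all syndrome bits."""
    return int(any(syndrome))
```

With an odd weight column code, a single-bit error produces a syndrome with an odd number of ones, so `e_bit` returns 1; a syndrome with an even, nonzero number of ones gives `e_bit` 0 while `error_flag` is still 1.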

The data bit modifier block 16 simply functions to invert any data bit, Di, in the sixty four bit data word from line 12 which the error bit locator 22 indicates is in error. This function may be simply implemented by one level of sixty four exclusive OR gates, one for each data bit, Di. Each of these exclusive OR gates receives a designated data bit, Di, at one input, and the output from the appropriate 8 way AND gate for that data bit from the error bit locator 22. If the output from the 8-way AND gate is one, then the data bit value is inverted by the exclusive OR operation. The output from the data bit modifier is applied on line 17 to a CPU.
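The AND-gate locator and the exclusive OR data bit modifier can be sketched together. Because Table 1 is not reproduced in this text, the sketch uses a hypothetical toy code with four data bits and four check bits; `TOY_COLUMNS`, `locate_error` and `modify_data` are illustrative names only, and the gating of the locator by line 25 is not modeled.

```python
# Hypothetical toy code-matrix columns, one per data bit, four check bits each.
# This is NOT the patent's Table 1, which is not reproduced in this text.
TOY_COLUMNS = [(1, 1, 1, 0), (1, 1, 0, 1), (1, 0, 1, 1), (0, 1, 1, 1)]


def locate_error(syndrome, columns=TOY_COLUMNS):
    """One AND gate per data bit: it fires when every syndrome bit matches the
    column (true input where the column holds 1, complement where it holds 0)."""
    return [int(all(s == h for s, h in zip(syndrome, col))) for col in columns]


def modify_data(data_bits, locator_outputs):
    """One XOR gate per data bit: a locator output of 1 inverts that bit."""
    return [d ^ hit for d, hit in zip(data_bits, locator_outputs)]


# Example: the syndrome equals the column of the second data bit.
hits = locate_error((1, 1, 0, 1))
corrected = modify_data([1, 0, 1, 1], hits)
```

A zero syndrome matches no column, so no AND gate fires and the data passes through unchanged.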

An additional block 28, labeled as an error classifier, is used to provide output signals indicating whether there are single errors (SEC), or multiple errors (MED) in the data word. The error classifier may be implemented simply by an AND gate which has the E bit on line 25 from the check syndrome block 24 as one input, and the error/no error output on line 27 from the error/no error tester block 26 as the other input. If both inputs are one, then the output on the SEC line 30 is a "one," indicating a single error. If both inputs are even, then the MED line 32 is a "one."

The present invention is designed to detect failures in the above described syndrome generation path. The invention performs this function by generating a bit D to be compared with the bit E from the check syndrome block 24. If these bits are identical, then there are no failures in the ECC circuit. This D bit is generated without generating a redundant set of check bits, and without requiring an extra bit in the ECC word.

The present invention is also designed to quickly determine if the data bits in an ECC word are correct a number of cycles before the completion of the normal ECC function.

Referring again to the figure, the invention comprises means 40 for generating a bit, Rk, from each of K data fields in the ECC word, by logically combining the data bits, Di, in the data field; means 42 for comparing logical combinations of the R bits, to logical combinations of the memory check bits, Cj, to generate H bits; in combination with means 52 for forming a logical combination of the H bits, and means for comparing that logical combination to a logical combination of the syndrome bits, Sj, to thereby independently determine whether there is a failure in the error detection syndrome generation path, without cross coupling with the check bit generator 18.

In a preferred embodiment of the present invention, the R bit generating means 40 comprises means for generating a parity bit, Pk, for each data field in the ECC word.

In a further embodiment of the present invention, the H bit generating means 42 comprises means 44 for combining the parity bits, Pk, from the parity bit generator means 40, into two logical combinations, to form bits Q₁ and Q₂. This H bit generating means 42 further comprises means 46 for combining the memory check bits, Cj, into two logical combinations, to form bits M₁ and M₂.

In one example embodiment, the parity bit generating means 40 may comprise circuitry for generating a parity bit for each N-bit data field in accordance with the function:



Pk = D₁ + D₂ ... + DN,



or for data fields of eight bits, as in the present example,



P₁ = D₁ + D₂ + D₃ + D₄ + D₅ + D₆ + D₇ + D₈,



where + is the exclusive OR function. It should be noted that the term data field refers to 8-bit bytes for the example of a 72/64 bit ECC word. For an 8-bit data field, this parity bit generation function can be accomplished with seven exclusive OR gates formed into three levels, as is well known in the art.
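As a sketch of the parity bit generator 40, assuming the data word is a list of 0/1 values split into consecutive fields of equal width (the name `field_parities` and the `field_width` parameter are hypothetical):

```python
from functools import reduce
from operator import xor


def field_parities(data_bits, field_width=8):
    """Return one parity bit Pk per data field (XOR of the bits in that field)."""
    if len(data_bits) % field_width:
        raise ValueError("data length must be a multiple of the field width")
    return [reduce(xor, data_bits[start:start + field_width])
            for start in range(0, len(data_bits), field_width)]


# Example: D1, D10, D11 and D64 set in an otherwise zero 64-bit word.
word = [0] * 64
for index in (1, 10, 11, 64):
    word[index - 1] = 1
parities = field_parities(word)
```

The two set bits in the second byte cancel, so only P₁ and P₈ are one in this example.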

The parity bit combining means 44, in one example embodiment, may include means for forming the bit Q₁ from a logical combination of half of the parity bits, Pk, and means for forming the bit Q₂ from the logical combination of the other half of the parity bits. By way of example, for K parity bits, P₁, P₂, ... PK, the parity bit combining means may include means for forming bit Q₁ from the logical combination of P₁, ... PK/2, and means for forming bit Q₂ from the logical combination of the parity bits PK/2+1 ... PK. In a preferred embodiment of the present invention, the parity bit combining means 44 includes means for forming bits Q₁ and Q₂ in accordance with the following logical combinations:

Q₁ = P₁ + P₂ ... + PK/2 , and

Q₂ = PK/2+1 + PK/2+2 ... + PK ,

where + is the exclusive OR function.

For the specific example, where there are eight data fields (bytes), so that K = 8, Q₁ and Q₂ would be formed in accordance with the following equations:

Q₁ = P₁ + P₂ + P₃ + P₄ , and

Q₂ = P₅ + P₆ + P₇ + P₈ .

As is well known in the art, the bit Q₁ can be formed by three 2-input exclusive OR gates formed in two levels. The same configuration can be used to form the bit Q₂.
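The parity bit combiner 44 can be sketched as a function that XORs each half of the parity bit list. The name `combine_parity_bits` is an assumption for this sketch.

```python
from functools import reduce
from operator import xor


def combine_parity_bits(parity_bits):
    """Return (Q1, Q2): XOR of the first half and XOR of the second half."""
    if len(parity_bits) % 2:
        raise ValueError("an even number of parity bits is required")
    half = len(parity_bits) // 2
    q1 = reduce(xor, parity_bits[:half], 0)
    q2 = reduce(xor, parity_bits[half:], 0)
    return q1, q2


# Example for K = 8: P1 and P8 are set, so each half has odd parity.
q1, q2 = combine_parity_bits([1, 0, 0, 0, 0, 0, 0, 1])
```

For K = 8 each half has four inputs, which matches the three-gate, two-level tree described above.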

Referring now to the memory check bit combining means 46, this block may be formed, in one embodiment, by means for forming bit M₁ from a logical combination of half of the check bits, Cj, and for forming bit M₂ from a logical combination of the other half of the check bits. By way of example, for J check bits C₁, C₂, ... CJ, the memory check bit combining means 46 may include means for forming bit M₁ from a logical combination of the check bits C₁ ... CJ/2, and means for forming M₂ from a logical combination of CJ/2+1, ... CJ.

In a preferred embodiment, the memory check bit combining means 46 may comprise means for forming bits M₁ and M₂ in accordance with the following logical combinations:



M₁ = C₁ + C₂ ... + CJ/2 , and

M₂ = CJ/2+1 + CJ/2+2 ... + CJ .

Again, for the specific example of an ECC word with 8 check bits, the bits M₁ and M₂ would be formed in accordance with the following equations



M₁ = C₁ + C₂ + C₃ + C₄ ,

M₂ = C₅ + C₆ + C₇ + C₈ .

The functions set forth in these equations can be realized by three exclusive OR gates formed in a two level exclusive OR tree.

In essence, M₁ is the result of the binary addition of the data bits, Di, in the top left and right quadrants in the code matrix of Table 1. However, in all cases where a given data bit is repeated an even number of times, it can be combined and cancelled. Accordingly, the top left quadrant in the code matrix completely cancels. However, every column in the top right quadrant of the code matrix has an odd number of ones so that every data bit D₃₃-D₆₄ is present. If the data bits D₃₃-D₆₄ are formed into parity bit combinations, then



M₁ = D₃₃ + D₃₄ + ... + D₆₄ = P₅ + P₆ + P₇ + P₈ = Q₂ .

The same manipulation can be used to show that



M₂ = D₁ + D₂ + D₃ ... + D₃₂ = P₁ + P₂ + P₃ + P₄ = Q₁ .
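The cancellation argument above depends only on the parity of the column weights in each quadrant, so it can be checked numerically without Table 1. The sketch below builds a random matrix with the stated quadrant property (even-weight columns top left and bottom right, odd-weight columns top right and bottom left) and confirms M₁ = Q₂ and M₂ = Q₁ on random data. The matrix is hypothetical, is not Table 1, and is not guaranteed to be a usable SEC-DED code; all function names are assumptions.

```python
import random
from functools import reduce
from operator import xor


def xor_all(bits):
    return reduce(xor, bits, 0)


def random_column(rows, odd, rng):
    """Random 0/1 column of the given height whose weight is odd (or even)."""
    col = [rng.randint(0, 1) for _ in range(rows - 1)]
    col.append(xor_all(col) ^ int(odd))   # last bit fixes the weight parity
    return col


def quadrant_matrix(n_data=64, n_check=8, seed=0):
    """Rows are check bits. Left-half columns: even on top, odd below.
    Right-half columns: odd on top, even below."""
    rng = random.Random(seed)
    half_rows = n_check // 2
    columns = []
    for i in range(n_data):
        left = i < n_data // 2
        top = random_column(half_rows, odd=not left, rng=rng)
        bottom = random_column(half_rows, odd=left, rng=rng)
        columns.append(top + bottom)
    return [[columns[i][j] for i in range(n_data)] for j in range(n_check)]


def check_bits(matrix, data):
    """Each check bit is the XOR of the data bits selected by its row."""
    return [xor_all(h & d for h, d in zip(row, data)) for row in matrix]


def identities_hold(matrix, data):
    c = check_bits(matrix, data)
    m1, m2 = xor_all(c[:len(c) // 2]), xor_all(c[len(c) // 2:])
    q1, q2 = xor_all(data[:len(data) // 2]), xor_all(data[len(data) // 2:])
    return m1 == q2 and m2 == q1
```

Here Q₁ and Q₂ are computed directly from the two halves of the data word, which is equivalent to combining the byte parity bits P₁ to P₄ and P₅ to P₈.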

Accordingly, the H bit generating means 42 may include means for generating bit H₁ from a logical combination of the bits Q₁ and M₂, and for generating bit H₂ from a logical combination of the bits Q₂ and M₁. In a preferred embodiment of the present invention, this H bit generating means 42 includes means 48 for generating the H₁ bit and the H₂ bit in accordance with the following functions:



H₁ = Q₁ + M₂, and

H₂ = Q₂ + M₁ .

Accordingly, the H₁ bit may be formed by a single exclusive OR gate, and the H₂ bit may likewise be formed by a single exclusive OR gate. The H₁ and H₂ bit output from means 48 is provided on line 49.

The comparing means 52 operates to compare a logical combination of the H₁ and H₂ bits to a logical combination of the syndrome bits, Sj. In order to implement this function, block means 54 is utilized to logically combine the H bits to form a single bit D on line 56. In one embodiment of the present invention, this block means 54 may be implemented simply by a single exclusive OR gate for comparing the H₁ and H₂ bits and generating a single bit D therefrom. This single bit D is applied to a comparator 58 along with the single bit E from the check syndrome block 24. In one embodiment of the present invention, the comparator 58 may be implemented simply by a single exclusive OR gate for performing the function D + E. If the result of this D + E function is zero or even, then there are no errors or failures in the error detection syndrome generation path. However, if the D + E function is a one or odd, then a failure has been detected in the ECC circuit. This error detecting output is applied on line 60 to other control circuitry (not shown) for either reinitializing and starting the check over again, or for performing some form of alert function.
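The complete D versus E comparison can be sketched end to end on a small example. Since Table 1 is not reproduced in this text, the sketch assumes a hypothetical 4-data-bit, 4-check-bit matrix that has the quadrant property; `TOY_MATRIX` and every function name below are illustrative. The `generator` parameter exists only so that a faulty check bit generator can be substituted to model a circuit failure.

```python
from functools import reduce
from operator import xor

# Hypothetical toy code matrix (rows = check bits, columns = data bits) with the
# quadrant property; it is NOT the patent's Table 1.
TOY_MATRIX = [
    [1, 1, 1, 0],
    [1, 1, 0, 1],
    [1, 0, 1, 1],
    [0, 1, 1, 1],
]


def xor_all(bits):
    return reduce(xor, bits, 0)


def generate_check_bits(data, matrix=TOY_MATRIX):
    """Check bit generator 18: one XOR tree per matrix row."""
    return [xor_all(h & d for h, d in zip(row, data)) for row in matrix]


def e_bit(data, stored_check, generator=generate_check_bits):
    """Syndrome path: Gj, then Sj = Cj XOR Gj, then E = XOR of all Sj."""
    generated = generator(data)
    return xor_all(c ^ g for c, g in zip(stored_check, generated))


def d_bit(data, stored_check):
    """Verifier path: uses only the data bits and the stored check bits."""
    half_d, half_c = len(data) // 2, len(stored_check) // 2
    # XOR of a half of the data equals XOR of that half's field parity bits.
    q1, q2 = xor_all(data[:half_d]), xor_all(data[half_d:])
    m1, m2 = xor_all(stored_check[:half_c]), xor_all(stored_check[half_c:])
    h1, h2 = q1 ^ m2, q2 ^ m1
    return h1 ^ h2


def path_failure(data, stored_check, generator=generate_check_bits):
    """Comparator 58: returns 1 when D and E disagree."""
    return d_bit(data, stored_check) ^ e_bit(data, stored_check, generator)


def faulty_generator(data):
    """Models a circuit fault: the first generated check bit is inverted."""
    generated = generate_check_bits(data)
    generated[0] ^= 1
    return generated
```

With a healthy generator, `path_failure` stays at 0 whether or not the stored word contains bit errors, because memory errors change D and E together. Substituting `faulty_generator` changes E alone, so the comparison yields 1.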

It should be noted that the comparator 58 may be checked for errors by applying the bit D on line 56 and the bit E on line 25 directly to an external testing circuit.

It should also be noted that the circuit of the present invention can be utilized to verify that the data bits in an ECC word are correct a full two exclusive OR cycles earlier than the standard ECC circuit. This ECC word verification circuit would operate on the bit D, by itself, and would be a viable option for speeding up the transfer of errorless ECC words if ECC circuit failure checking was not a priority.

It can be seen that the present invention provides an ECC circuit facility with improved data integrity by providing a method for detecting failures within the detection syndrome generation path of the ECC circuit. This inventive circuit does not require an additional bit in the basic ECC word to provide this detection capability. Additionally, the present circuit provides this ECC circuit error detection capability without replicating the original check bit generating circuit. In particular, for a conventional 72/64 bit ECC word, the syndrome generation circuitry would require 263 exclusive OR gates, i.e., 31 X 8 exclusive OR gates to generate the check bits Gj, 8 exclusive OR gates to generate the syndrome bits Sj, and 7 exclusive OR gates to combine the syndrome bits into the E bit. In contrast, the circuit of the present invention requires a total of only 71 exclusive OR gates, i.e., 7 X 8 gates for the parity generator, 3 exclusive OR gates for generating each of the bits Q₁, Q₂, M₁ and M₂, and 1 exclusive OR gate for generating each of the bits H₁, H₂, and the D bit.
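The gate totals quoted above can be reproduced from the per-block counts given earlier in the description. The constant names below are assumptions for this sketch, and it assumes every check bit row has 32 taps, as the 8 x 31 figure in the text implies.

```python
def xor_tree_gates(n_inputs):
    """A tree of 2-input XOR gates over n inputs uses n - 1 gates."""
    return n_inputs - 1


CHECK_BITS = 8       # check bits per ECC word
DATA_FIELDS = 8      # K, the number of byte-wide data fields
FIELD_WIDTH = 8      # data bits per field
TAPS_PER_ROW = 32    # data bits XORed per check bit (assumed equal for all rows)

syndrome_path_gates = (
    CHECK_BITS * xor_tree_gates(TAPS_PER_ROW)   # check bit generator 18
    + CHECK_BITS                                # check bit comparator 20
    + xor_tree_gates(CHECK_BITS)                # E bit in check syndrome block 24
)

verifier_gates = (
    DATA_FIELDS * xor_tree_gates(FIELD_WIDTH)   # parity bit generator 40
    + 2 * xor_tree_gates(DATA_FIELDS // 2)      # Q1 and Q2
    + 2 * xor_tree_gates(CHECK_BITS // 2)       # M1 and M2
    + 3                                         # H1, H2 and D
)
```

Under these assumptions the two totals come to 263 and 71 gates respectively, matching the figures in the text.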

By way of example, and not by way of limitation, two other error correcting codes that may be advantageously utilized to implement the present invention are shown in Table 2 and Table 3, along with the appropriate equations for M₁, M₂, Q₁ and Q₂. In Table 2, the error correcting code is for a 40/32 bit ECC word. In Table 3, the error correcting code is for a 22/16 bit ECC word.

It should be noted that the circuitry of the present invention does not use part of the results of the circuit which it is checking, and does not share circuitry therewith.
