(PDF) Properties and constructions of binary channel codes

For the entropy coding of independent and identically distributed (i.i.d.) binary sources, variable-to-variable length (V2V) codes are an interesting alternative to arithmetic coding. Such a V2V code translates variable length words of the source into variable length code words by employing two prefix-free codes. In this article, several properties of V2V codes are studied, and new concepts are developed. In particular, it is shown that the redundancy of a V2V code cannot be zero for a binary i.i.d. source {X} with 0 < p X (1) < 0.5. Furthermore, the concept of prime and composite V2V codes is proposed, and it is shown why composite V2V codes can be disregarded in the search for particular classes of minimum redundancy codes. Moreover, a canonical representation for V2V codes is proposed, which identifies V2V codes that have the same average code length function. It is shown how these concepts can be employed to greatly reduce the complexity of a search for minimum redundancy ...

Realization of modern telecommunication systems is inconceivable without use of different binary sequences. In this paper, an overview of random binary sequences used in different telecommunication systems is given. Basic principles of pseudorandom, chaotic, and true random sequence generation are presented, as well as their application in telecommunications in respect to advantages and drawbacks of the same. Moreover, particular scheme for true random binary sequence generation is given, as well as results of randomness assessment obtained by NIST statistical test suite. Finally, short insight into importance of random binary sequence in secure communications is given.

Properties and Constructions of Binary Channel Codes Proefschrift ter verkrijging van de graad van doctor in de technische wetenschappen aan de Technische Hogeschool Eindhoven, op gezag van de rector magnificus, prof. dL S, T. M. Ackermans, voor een commissie aangewezen door het college van dekanen in het openbaar te verdedigen op vrijdag 3 mei 1985 te 16.00 uur door Kornelis Antonie Schouhamer Immink geboren te Rotterdam This thesis was approved by tlle promotors Prof. Dr. Ir. J. P. M. Schalkwijk and Prof. Dr. K. W, CatterrnQle Contents CONTENTS Summary , . . . vii Acknowledgement. . . 0, Introductory chapter. VII! . . . . . . , _ . . . . I. Performance of simple binary DC-constrained codes . 10 2. Construction of binary DC-constrained codes 31 3. Spectrum shaping with bÎnary DC2-constrained codes 49 4. Some statistica] properties of maxentropic runlength-limited sequences . . . . . , . . . . . . . . . . . . . . " 63 5. A generalized method for encoding and decoding runlength-Bmited binary sequences _ . . . . . . . . . . _ . . . . . . . 75 Biography . . 83 Samenvatting 84 Chapters IA are reprinted from Philips Journa\ of Research. Chapter 5 is from G. F. M. Beenker and K. A. Schouhamer lmmink, IEEE Inform. Theory> IT-29, p. 751 (1983). Summary SUMMARY Channel codes, sometimes ealled transmission Or lines codes, are applied in storage syslems sneh as magnetic tape or disc and optical disco Applications are also found in transmission systems over libre Or met'allie eable. A ehannel code converts the digital source information to a form suitable for a specilic transmission medium. For example DC-free codes are designed in such a way lhat the encoded signal has suppressed frequency componenls in the region around zero frequeney. These codes are for example applied in transmîssion systems having insulficîent response in the low-frequency range. Another requirement împosed on a channel code orÎgînates from the fact lhat the maximum distance between transitions in the encodcd signa!, the maximum 'runlength' , should be limited to enable a simple syslem doek recovery in the receiver. Thîs thesis deals with systematic methods of desîgning nC-free and runlength-limited codes. Procedures are given for a simple enumerative encoding and decoding of the codewords. Also described are several propertîes of channel codes sueh as spectral and runlength distributions. Criteria derÎved from information theory are used lo compare the channel codes. Binal'Y ｾﾷｨＨｊＮｮ･｛ viii codes ACKNOWLEDGEMENT I am great.ly indebted 10 the management of the Philips Itesearch Laboratories, Eindhoven, The Netherlands, for the opportunity to carry out and 10 publish the work described here. Stîmulatîng discussions with Prof. J. P. M. Schalkwijk, Prof. K. W. Cattermok and with colleagues have greatly contributed t.O the eontents of this thesis. In particular I Wan! to thank G. F. M. BeenkcT of the mathematical depaTtment of Philips Research for his mathematical support and for adding many contributions to the papers. Introductory chapter 1 INTRODUCTORY CHAPTER O. General Channel codes are applied in digital transmission or storage systems. Early digital transmission systems have been widely used for telegraphy (morse code) and telex, We are witnessing a revolution in world-wide telecommunications, data and computer networks are now being built with enormous capacities. Not only are the telecommunication networks booming, so too are systems for the storage of digital information (transmission in time). For example the storage capacity per unit surface of magnetic tape Or disc bas been doubled evcry three years since the early 19605. The Compact Disc Digital Audio System, introduced in )983, based on the optical read-out principle, was the first high-density storage medium (one bit per square micron) to reach the homes of the consumer. Channel codes are the cornerstone of almost all digital transmission systems. Their main functions are: (i) matching the transmitted signals to the transmission channel, (ii) allowing reliable transmission fot reception by simple receivers. For transmission or storage of binaty dîgîtal înformation the simplest 'code' format seems to be not coding, Le. the soutce symbols '0' or 'I' are coded as the presence or absence of pulses, respectively. There arc, however. same engineering problems associated wÎth this simple format. In most transmission systems timing information must be extraeted from the transitions 0 -> I or I -> 0 of the received message, so that long sequences of like symbols should be avoided. An uncoded random binary souree signal has an average time between successive signal changes equal to two symbols. The maximum distance between transitions, the so-called maximum runlength, is infinite. Scramblers are often used in cable transmission practice to randomize (without adding redundancy) the source data 1). Unfortunately, however, a code based upon statistical considerations alone remains vulnerable to specific Worst-case or 'pathologicaI' channel sequences for which the probability of being errOlleously received is much larger than average, The Fibonaccî codes described by Kautz 2) and the runlength·1îmîted codes of Tang and Bahl $) add redundancy to the signal so that a finîte maximum runlength is absolutely guaranteed. lntuitively it should be dear that the smaller the maxÎmum runlength the more redundancy should be added to the encoded stream. The arnount of redundancy needed to guarantee a maJI:imum runlength can be calculated using an information theoretical approach $), 2 Binary channel codes Many channels cannot pass the low f(cquclIcics with sufficient signal-tonoise ratio. Shaping thc spectrum of the encoded stream by coding can cope with this problem. Most. of t.hc channc1 codes that are used in practice are 50called block codes. The source digits are grouped in source words of m digits. Using a code book tIJc source words are translated into blocks of n digits called eodewords. Constructions of digital codes having spectrum ?eros at arbitrary frequencies were given by Gorog 4). The designer of a digital transmission system will in general be confronted with the following problems: (I) the characterization of the transmissioll channcl lcading to some specific channel code requirements; (ii) the choÎce of set(s) of codewords, called code book pages (sometimes confusingly called alphabets), satisfying thc channel code requirements; (Hi) the translation of the souree words into thc codewords and vice versa e.g. using look-up tables; (iv) the evaluation of the newly designed code with respect 1.0 added redundancy and resulting shaping; (v) the testing of the new channel code in a practical environment to evaluate its performance in average and worst-case conditions. During the design and experimental phase, feedback in the mentioned itemS is incorporated to improve the total system performance. Thc channel characterization leading to specif1c code reqlJirernents is an important aspect of the syslem design. This aspect is beyond the scope of this thesis. The translation of source words into codewords and vice versa will be one of the main topics discussed here. Boolean flJnctions to lranslate source words into codewords and vice versa can easily be found by hand using a heuristic approach if the codewords arC rclatively smal!. Ir wil! be shown that generally speaking t.hc efficiency of a code improves with increasins codeword lengtlL A systematic; approach of thc mapping problem is needed if greater code effi. ciency is desired. Frankling and Pierce 5) pointed out that a simple sequential algorithm given by Schalkwijk 5) and Cover 7) allows the assignmcnt of unique numbers to codewords of nxed disparity, and Lyon 8) reported on the practical use of this simple algorithm. Application of this so-calied enumerative cncoding and decoding algorithm leads to a systematic approach of thc mappÎng problern wÎth the addîtional advantage that look-up tables of moderate size, even for large codeword lengtIJs. call bc used. The idea of sequential encoding and decoding goes further back in history than noticed in reference 5. For example Cattermole 9) reported on the enumerative encoding and decoding theory. An embodirnent example using anaiogue circuitry was pat.ent,ed in 1952 H'). 3 IntroduclOry cfuIpter We shall study the design and informatÎon theoretical performance of two groups of channel codes: - De-free codes and ｾ runlength-limited codes. Survey papers regarding these types of codes can be found in refs 11, 12 and 13. l. nC-free codes The field of application of digital channel codes with supprcssed low-frequency cOInponents is quite braad. We find applieations in transmission sys· tems over fibre Or metallic cable 14,15,16, I?) in storage media such as magor optical recording 20 •2 1). Though the restrictions on the channcl netie ＩＹＮｾＱ sequence are frequently put in frequeney-domain terms, wc are aften more interested in the tÎme-domain properties. The souree sequenee, assumed to consist of equiprobable and independent binary digits, is mapped onto a binary channel sequence. The received signal can be written down as ., I r(t) '" al g(l - iT) + nw(t), (I) Im_OO where the al E ( - I, 1) are two-valued parameters, that are generated each T seconds. g(t) is the impulse response of the channel (plus possibly a whitening filter) and nw(l) is additive white gaussian noise (see ref. 22, chapter 4_3). Assuming that the signal is matehed-filtered (projceted) and sampled at I = kT, then the equivalent channel vector is 00 h=OkgO + L oigk_l+nk"'o"gO+qk+nk, (2) i.-Q;1 ｉｾｫ where q. is inter symbol interference (ISl) at t "" kT. The statistics of qk are directly related to the channel sequence a and the impulse response g(t) of the channe!. Thc lSI can therefore be affected in two ways: I) pulse shaping at the transmitter and/or receiver using filters, and 2) manipulation of the code structure and hence of the correlation in the channel sequence. The usual approach to combat ISI has foeused on the shaping of g(t) for zero Înterference. Hard-limiting channels (for example optical recording) only accept two pulse shapes, a positive Or a negative full-T pulse, so that ISI ean only be affeeted by the code structure or the receiving filter. The shaping of the code structure with the aim 10 minimize the lSI is the domain of the transmission codes. 4 Bil1ary channei codes l.I. Model oj 'he A C-coupled chllnnel Man)' transmission channels arc for practical reaSonS AC-coupled, i.e. there will be .>ome low-frequenc)' cut·olf due to coupling components, isolating transformers, etc. Other contribu1.il)ns to the ISI mal' arise from the banclwiJth limitations of the channt;!. We assume in the folJowing that ISI is only caused by the AC-coupling. lf the AC-couphng is a first-order high-pass filter and detection is dOlle with an integrate-and-dump filter, th en the ISI can be approximated by: k :E al, qk = -h (3) i=-oo where h (h <: 1) is lhe ratio of pulse duration and time-constant of the ACcouplingWe now define the (running) digital surn of a sequenee a bl': k Zk = I lit- (4) i=o_aQ We concludc frorn eqs (3) and (4) that the 151 of the AC-coupled channel i, proportional to the digital sum of the sequence. We further concludc that long runs of similar syrnbols build up a large iSf. To limit the iSi codes are used with lhe property that the ehannel sequence as,umes a finite number of sum values_ Sum constrained sequences have the frequency-domain propertl' that the power vanishes at zero frequencl', which is the rtason for the name of these wdes: DC-free (or DC-constrained) codes. The probabilitl' of channel ,ymbol error P(e) can be approximated bl' (a5wming P(a = I) ｾ P(a.,,- -I) = I)]: P(e) = t P<I n - hz I ;> go), where the subscripts of n and z; were discarded for brcvitl'. Assuming thc noise n and ISI q oe -hz to be independent, the probabililY density funC!.Îon of the SlIm (n - hz) i, the convolution of the individual density functions of n and zIn ｲｾｨｴｯ words gi ven the density function of the noise, a de,igner can minimize P(e) bl' properll' choosing the density function of the digital sum z_ Chang et aL Z3) reported on the design of DC-free transrnission codes based on pee) as a çriteriQn. Their method is strajghtforward but is impractieal when long codewords are used. It has the additional disadvantage that no systematic rnethod (except a direct look· up) is available for encodîng and decoding thc codewords. From eqs (3) and (4) wc notiee that the maximum IS! (peak eye closure) is relaled tc the total numbcr of surn values the channel sequence assumes. 5 Introductory chapter Mean-square-ISI has achieved a great popularity because of the ease with which they can be handled mathematically. It may be c1ear that the meansquare-ISI is proportional to the variance of the running digital sum (in short sum varianee). The last two criteria, i.c. thc ma/dmum and mean-square ISI, play an important role in the design of DC-constrained transmission codes. Glave 24) (sec a150 ref. 25) derived an upper bound on the probability of error due to ISI and noise of coded sequences. He found that if QE(-A,A) and E[q2] <ut that P(e) can be upper" bounded by (assuming nk gaussian and uncorrelated): 2P(e) ",; ＲａＯｾｴｬ [Ql(go - A)/a.J + Ql(go + A)/lt.ll + 2(1 - uJIA 2) Q(go/o.), (5) where ｾｵ is the noise varianee and Q( .) is the Gaussian error-probability function (see e.g. ref. 24). Eq. (5) can be used to estÎmate PIel once a bound on the IS! has been found. If the channel sequence is designed in such a way that ît assumes N sum values (N is sometimes called the digital SUm variation) then h A", T(N - 1). A frequency-domain reason for using DC-constrained sequences is found when the transmission channel is a small part of a data storage system. For example, in optical disc recording the requirement of reduced low-frequency content is not imposed because this storage medium does not respond to the low frequencies, but because the servo systems needed to position the laser spot on the disc are sensitive to low-frequency interference induced by the code stream. Thc frequency region with suppressed components is charaeterized by the so-called cut-olf frequency. Justesen al) showed a close relationship between the cut-off frequency and the sum varianee, which makes the sum variance an important parameter in the frequency- and time-domain. 1.2. Description of DC-constrained codes Many examples of DC-constrained channel codes are known which are block codes with codewords having zero or low disparity, where the disparity of a codeword is defined as the difference of the number of ones anó ｾ･ｲｯｳ in the codeword 4,9,13,26,27,28,29). In the simplesr type of code the SOUrce words have two alternative translations with opposite sign of the disparity 28). These codes are called bi-mode codes. The choice of a particular codeword polarity is made in such a way that the digital sum of the sequence after transmission of the new codeword is as close to zero as possible. 6 Binary channel codes In chap, I the ,at.c and low-frequency properties of simpie, bi-mode, Decomtrained channe1 codes are analysed, Sec. 1.2 starls with a brief description of the pwperlies of sequences generalcd by a Markov source consnained in sueh a way th at the digital sum of the sequence assumes a limited number of values 3o ,31). The properties of these theoretical sequences are used to establish a new figure of merit of De-eonstrained codes. The p<lwer speelral density function (spectrum) provides useful information aboUl the response near zero frcquency. ]'he calculation of the spectrum is usually a laborious proccss. Several methods and results have been reported in The procedure given by Cariolaro et al. 33,35) the literature ＳｾＬＮＴｾ｛＾ＬＤｏＷ｡ＸＩＮ tlsing matrix analysis is very weU suited to machine computation. For large codeword lengths and many encoder slates the memory requircments of the procedure become prohibitivc, even for large main frame computers. In sec. 1.3 a new method is presented to compute the power density funct.ion of lowdisparity, DC-balanced ehannel codes. The method uses thc special structure of t.he code, lcading lo a simple result, for the simp!c bi-mode channel codes the sum variance is cakulated in sec. 1.4. It is cuswmary to define the efficiency of a channel code as the ratio of the code rate and the noiseless channel capacity given the channel constraints ,s, 4 °l. In sec. 1.5, using t.he theory developed in sec. 1,;2, we d;;mne a new figure of merit for DC-constrained channel codes. We study t.he performance of these simpk channel codes by considering the exchange between the added redundancy and the tesulting suppression of low-frequency components comparcd with the properties of sequences generated by a maxentropic Markov sOUrcc. The simple bi-mode codes as discussed in chap. I use the ruil sets of codeword" of a certain disparity, In chap. 2 thc code design is based on codewords having a constraint on the maximum number of assumed sum valucs, i.e. those codewords arC discarded that make a relatively large contribution 1.0 the SUm varianee. The actllal basis of code design is given in sec. 2.3. It provides a lllethod for the enumeration of codewOrds with a constraint on tiN maximum numbcr of assumed sum values. In general lhe efflcîency of a channel code can be improved if longel' codewords are allowed, Unfortunately thc amoLlnt of encoding and decodjng hardware incteases exponentially with inereasing codewotd length if a direct method using look-up tables of thc souree word" and theiL' channel rcprcscn" t.at.ions is used. Sec. 2.4 deals with algorithms for enumerative encoding and dccoding of constrained codewords with the nice property that the number of enlries of look-up tables to be used grows polynomially with thc eodeword length. Finally. in sec. 2.5 t.he problem is addressed of sekcting selS of code- lntroduetory chapter 7 words to design DC-suppressed channel codes. A worked example is given of a ncw Sb I Ob transmission code, attractive for magnetic recording 41.42.43) and libre transmission ｾｍＶＩＮ showing superior features compared to other codes with rate 8/10. In chap. 1 we show that lhe cut-olf frequency of sequences generated by a maxentropic Markov source is proportional lo lhe redundancy of the code. Examples of a practical embodiment of binary DC-balanced codes are discussed in chaps land 2. It is shown that the performance of these simple codes, in parlicular fOr smal! codeword length, is comparabie with that of maxentropic sum constrained sequences. The power spectra! density funetion of digital sum constrained codes is characterized by a parabolic shape in the low-frequency range from De to the cut-olf frequency. In some applications it is desirabie to achieve for a given redundancy a larger rejection of the low-frequency components than possible with DC·balanced codes. Chap. 3 presents a new class of De-free codes having a zero second-derivative of the code spectrum at zero frequency. This results in a substantial decrease of the power at low frequencies for a lixed code redundancy with respect to the classical designs based on a digital SUm criterion. Sec. 3.:'. introduces a time-damajn constraint which has to be irnposed on the channel sequence so that lhe resulting spectrum of the sequence has lhe DC 2 -balanced property, i.e. has both zero power and ;<;ero second·derivative at zero frequency. Sec. 3.3 gives an enumeration method for finding the number of codewords CO be used in a DC 2 -balanced code. In order to obtain practical rates for the new codes. it is necessary 10 use relatively long codewords, which makes a direct rnethod of encoding and decoding using look-up tables of the source words and their channel representations prohibitively complex. Sec. 3.4 deals with the enurneralÎve encodÎng and decoding of DC 2 "balanced codewords. The algorithm is not more complex than looking-up and adding. The look-up tables needed for the enumera· tive coding grow polynomially in complexity with increasing codeword length. Sec. 3.5 gives examples of codes using codewords that can be concatenated without a merging rule. The power density functions of the newly developed codes are compared with those of classica! DC·balanced codes. 2. Runlengtll-Umited codes Binary codes such as the Miller code 16), EFM 20), 3PM 46) and Zero-Moduiation ｾＷＩ are baseband transmîssion systems with applications in magnetic and optical recording. These codes belong to the class of so-called binary rUn" length-hmited sequences (RLL sequences). A string of bits is ctefined to be Binary ch(Innd codes 8 runJcngth-limited if the number of consecutive like symbols is bounded between a certain minimum and a maximum value. The maximum runlength constraint guaran!.ees a clock pulse within ｾｯｭ･ specified time, which is needed fOl" thc clock regeneration at thc rcceiver. Obviously a ｾ･ｱｵ｣ｮ with a digital sum constraint implics a maximum runlength consIraint, but not vice versa. Thc minimum runlength const.raint is imposed to control intersymbol interfcrcnee and consequemly has a bearîng on the distortion of the transmitted signal when the transmission channel is bandwidth-limited ia.18). Jn chap. 4 wc show thllt RLL sequences with maximum information content. dcfined as maxentropic RLL sequences, have their runlengths exponentially distributed. This leads to a simple derîvation of the power densiLy function of maxentropic RLL ｳ･ｱ｜ｊＨｾｮ｣ｳＮ In sec. 4.2 we give a formal definition of RLL sequences, followed by 1111 ana)ysis of Ihe runlength distribution and spectral properties of maxentropic RLL sequences. The study in chap. 4 was motivated by t.he fact that practîeal ernbodiments of RtL channel codes can with moderate hardware quite easily reach r3teS of 80-950/0 of the maximum information capacity (see chap. 5 and refs 39 and 50). From this observation we maya priorî conproperties of practical codes can be predlcted by jecture that the ｳｴ｡ｩｾＮ｣｡ｬ maxentropie RLL scquences. In this way t.he results obtained from the maxentropie RLL sequence theory could be a simpte tooi for an approxirnate calculatioo of thc power density function of practical code streamS. To test Ihis hyputhesis we compare in sec. 4.4 t.hc runlength distributions aod power densit.y Funetions of two weli"known charmel codes with the resul!.s predicted by t.he maxentropic RLL theory. Many procedures are available for the design of runlength-iimited codes 2 ,3.39.51). In particl,l\ar t/'le method presented by Tang and BahP) is attractive because it is based on codewords of fixed length. Chap, 5 gives a gencrali:.:ation of the concept of dk-limit.ed sequences of length n introdueed by Tang and Bahl by imposing eonstrainIs on the maximum number of conseeutive zcros allhe beginning and end of tne scquences. It is shown in sec. 5.2 thai. Ihe enumerative encoding and decoding procedures are similar to those of Tang and Bahl- The additional constraints aliow a more cfficient merging of the sequences. Two eonstruetions of runlength"limitcd codes with merging rules of increasing complexilY are given in sec. 5.3. The efficiency of the new conSl!'uctions is compareJ with t.hat of Tang and Bahl's method. REFERENCES ') ') ') ') .1.8. S."lge, Elell Sy,1. Techn. J. 45, 449-4ij7 (1966). W. Il. KatllL. IEEE Tram. Inform. 'Yhoory IT-ll. 284·292 (1965). D, T. Tang Md I... R. Bahl, lnform. CO"lr. 17,436-461 (1970). E. GOl'og, IBM J. Rç•. Oovdop. 47, 234-241 (1968), Introductory chopter ') •) ') ") ') lOl ") ") ") I.) ") I.) ") I.) I.) ,.) Ol) ") ") ") >0) '.) ") ,.) "') ") Ol) SO) ") 3<) SS) SS) ") SS) SO) '0) <I) •,) <S) •<) <0) ") ") ") •ｾＩ '0) >I) 9 J. N. Frankling and J. R. Pieree. IEEE Trans. Commun. COM·20, 1182·1184 (1972) . J. P. M. Sehalkwijk. IEEE Trans. lnrorm. Theory IT·lIl, 395·399 (1972). T. M. Ｌｲｾｶｯｃ IEEE TraJ;l>. InfoUIl. Thc;çry H-19, 73-77 (1973). R. P. Lyon, IEEE Tran•. Commun. COM-21, 1438-1441 (1973). K. W. CatlermOle, 'Principle. of digitaJ lino coding" lIiffo Boob Ltd .• London (1969). E. Labin and P, R. A.grain, U,K. Patent 713,614 (1952), H. Kobayashi, IEEE Trans. Commun. Teeh. COM·19. 1087-1100 (1971). N. Q. Due and B. M. Smith, Au>t. Te1ecornrnun. Res. 11, 14-27 (1977). K. W. Cattermole. Int. J. Electron. 55, 3-33 (1983). D. B. Waters, Int. J. Electron. 55, 159·169 (1983), R. M. Brooh and A. Jessop, Int. J. 81eetron. 55. 81·120 (1983). Y. Takasaki, K. Yama.hita and Y. Takaha.,hi, Int. J. 81eetron. SS, 12l-DI (1983). M. RoussealI, ElewOI'I. Lett. 12,478-479 (1976). J. C. Mallinson and J. W. Miller, Radio and Ek<. Eng. 47. 172-176 (l977). M. Davidson, S. f'. Haase. J. L. Maehamer and L. H. Wallman, IEEE Tran•. Magn. MAG-U. 584-586 (1976). j. 1'. J. Heemskerk and IC A. Sehouhamer Immink, Philips Tech. Rev, 40. 157·164 (982). H. Ogawa and K. A. Sehouhamer Immink, Proc. Premier AES Conf. Ryetown, 117-124 (982). J. M. W07.eneraft and I. M. Jaeobs, 'Prineiples of Communication Engineering'. Wiley and Sons ＮＩｾＶＹｉＨ R. W. S. Chang, T. M. Jakubov and A. 1.. Gareia. IEEE Trans. Commun. COM·30. 1668-167g (l9g2). F. E. Glave. IEEE Trans. Inform. Theory lT·18, 356-363 (1972). E. Bigliere. IEEE Trans, lnform. Theory IT·20, 115·118 (1974). R. O. Carter, Electron. Lett. 1,65-1)8 (1965). F. K. BOVJers. US patent No. 2.957.947 (1960). J. M. Od ffiths, electron. Let!. 5. 79-81 (1969). K. W. Cauerrnolo and J. J. O'Reilly, 'Problems of randomncss in communication Engi· neering', Pentech Pre.., London (1984). T. M. Chien, Bel! Syst. Tech. J. 49, 2267·2287 (1970). J. Justesen. IEEE Trans. lnform. Theory IT·28. 457·472 (1982), S. aosÎ k, aeH Syst. Teeh. J. 51. 921-933 (1972), G. L. ｯ｛｡ｬｩｲｾￇ and O. P. Tronea, lEEB Trans. Commull. COM·22. 1555·1563 (1974). O. Brugia, R. Piotroiusti and A. Rover!, Alta Freq. 45, 695-7J2 (1976) (in ltalian). O, L. Cariolaro, O, L. Pierobon and O. 1'. Tronea, Int. J. Electron. 55.35·79 (1983), L. J. Groenstdn, BeU Syst. Teeh_ J. $3, 1103-1126 (l'/74). D. A. Lindholm, IEEE Tran$. Magn. MA(:.-14, 321-323 (1978). G. S. Poe, lEE PrC)c. P, Commun .• Radar and Signal proeess. U8, 323-330 (1981). P. A. f'ranaHoK, BeU Sy,!. Tech. J. 47, 143·157 (1968). S. Yoshida and S. Yajima, Trans. IECE of Japan E 59,1-7 (1976). M. Morizono. H. Yoshida and Y. Hashimolo, SMPTE Journal 89, 658-662 (19g0) . $. Tazaki, F. Takeda, H. Osawa and Y. Yamada, Proc. 5th Internat. Conf. On Vldoo aml Data Rocording, Southampton, 79-84 (1984). M. A. Parku and F. A. Belli., Proc. 4th ｉｮｴｾｲ｡Ｎ ConL on Viow ano Data Recording, Southampton, 207·215 (1982). A. X. Widmor and P. A. Franaszek, Electron. Lelt. 19,202-203 (1983), A. X. Widmer and P. A. FranaHek, IBM J. Ro•. Ｎｰｯ､ｶｾｄ 27, 440-451 (1983). G. V. Jaeoby. IEEE Trans. Magn. MAG·U. 1202-1204 (1977). A. M. Patel, IBM J. Res. Develop. 19,366·378 (1975). M. O. Peleha! and J. M. Geist, IBEE Trans. Commlln. COM·23, 878-883 (1975). Correttion COM·24, p. 479, 1976. P. D. Shaft, IEEE Trans. Comml.ln. tOM·n, 687-695 (1973), K. A. Sehouhamer Immink. Electron. Let!. 19, 323·324 (1983), 1'. A. Franaszek. IBM J. Res, Develop, 14, 376·383 (1970). a. JO Binary cht'mnd codes ＭＭＭＭＭＧＭＭＭＭＭｾＭＭＭＭＭｾＬ PERFORMANCE OF SIMPLE BINARY DC-CONSTRAINED CODES ａ｢ｾｴｲｬ､ In digiLal transmissîon ｾｩ is ｳ･ｭｩｬｯｾ desi.t'able for thc: c.'hannel stream to hilve low power ncar 7-cro frcquoncy. Suppression of (ho low-freqllency "'lmponents is achicvcd by conslraining the unbalancc of thc lransmitlcd Rale and spectra! proport.ie, or lInbalancc p<lsilive a.\d nesativc ｰｾｬＬ＼ｳＮ constrained cooc., wilh biIlary symbols basco on simplo bi-mode cooiog scheme, arc oalculatod. I. Inlroductîon The field of application of digital channel codes with suppressed low frequency components is quite broad. We find applications in transmission sysor metallic cable 1.2) in storage media such as magnetic 3,4) or tems over ｦｩ｢ｲｾ optical recording ｾＩＭｔｲｵｮｳｭｩｾｯｮ systems designed la achîcvc DC-suppres8ion are mostly based on so-called block codes, where the ｾｏｕｲｃ･ digits are groupt:d in source words of m digits; the souree w()rds are trans!ated ming a code baak into blocks of rl digits called codcwords. Cattermole 0,7) and Griffiths 8) designcd binary block codes based on ｣ｯ､･ｷｯｲ､ｾ having zero Ol' low disparîty, where the disparity of a codeword is defincd as the difference of the number of ones and the number ql' zeros in the codeworc.l. In thc ｾｪｭｰｬ･ｳｴ code type the souree words haw twO alternative translations (modes) being of opposite disparity polarity. The choice of a particular codeword polarity is made in such a way thaI. the socalled running digital sum or tht sequence aftel' ｴｲ｡ｮｾｭｬｳｩｯ of the new codewhere the running digîtal sum is defined word is as close 10 zero as ｰｯｾｳｩ｢ｬ･Ｌ for a binary stream as the aceumulated sum of ones and 2eros (a zero countcd as - I). of the power spectral density function of zcro·disAn analytic ･ｸｰｲｾｳｩｯｮ parity codeword systems was derived by f;'ranklîng and Pierec ｾＩＮ Examples Performance of simpte binary DC·constraÎned codes 11 of the computation of the spectra of low-disparity codeword based systems were given by Lindholm IQ) and Poo 11). They applied the procedure given by Cariolaro et al. 12,18) using matrix analysis. The procedure is straightforward and very wel! suited to machine computation. For large codeword length and large number of encoder states the memory requirements of the procedure become prohibitive. In this paper simpte expressions are derived of the rate and power spectral density functions of simpte, bi-mode, OC-constrained channel codes. Section 2 starts with a brief description of the properties of maxentropîc unbalance constrained sequences. The properties of these sequences are used to establish a new figure of merit of DCconstrained codes. In sec. 3 the power density function of low-disparity based channel codes is computed. The variance of the running digital sum (in short sum variance) of the channel codes, adopted here as a criterion of the suppression of the energy near De, is calculated in sec. 4. In sec. 5, using the theory developed in Sec. 2 we intend to answer the question: How good are these simple channel codes for thc given redundancy and resulting suppression of low-frequency components? 2. Properties of z, sequences A designer will aften be confronted with the qucstion of how good his system is with respect to the redundancy of the code and the resu1ting suppression of low-frequency components. There is a need for a yardstick to measurC the performance of DC-suppressed channel codes in an absolute way. To that end asymptotic properties of sequences so constrained that the running digital sum (RDS) of the sequence will take a Iimited number of values are discussed in this sectîon. Thc results will be used to derive a figure of merit that takes into account both the redundancy of the code and the resulting frequency range of the sequence spectrum wîth suppressed componenK Consider binary sequences x = (XI, .. " Xi, •••), Xi € ( - 1, Ij. The sc-called running digital sum (ROS) of a sequence plays a significant role in the analysis and synthesis of codes of which the spectrum vanishes at the low.frequenc)' end. The RDS Zi Îs defined as: I Zi"" L Xj. }=l Chien 14) studied sequences X taking a finite number of RDS values, so-cal!ed z (-constrained) sequences. He calculated, using a Markov source model, the information capacity of z sequences as a function of the number of allowed RDS values. The maximum number of RDS values a sequence takes is aften called digital sum variatioll._ 12 Binary channel codes ａｾｯｲ､ｩｮｧ to Chien the maximum entropy of a Markov informatioD source with N allo wed RDS values is given by: C(N) + 21 0g cos (N: I = (l) I )" More proper ties of z sequenccs were derived by Justesen 10), who calculated lhc ｳｰｾ｣ｴｲ｡ｬ propertie., of a maxentropic z-constrained souree, i.e. of a Markov source wÎth t.hc transition probabilities chosen in such a way that the maximum entropy, souree ｡ｾｨｩ｣ｶｳ JusteSeD devcloped a useful time·domain measure of the low-frequem:y propertier; of DC-constrained sequences. He defined Wo as t.he (Iow-frequency) cut-off frequency according to: (sec abo ref. 9) 1 rH(WoT) I l' = where H(wT) is lhe power density function versus frequency and Tthe time duration of a channel syrnbol. Justesen found a remarkable and very useful relation: 1 w oT""2s 2 (2) ' wherc $2 is the sum variante of the seguence. For the examples of channel codes investigated by Justesen this empirical n::lat.ion was found to be vcry accurate. This motivated us to use tbc sum varianee as a criterion of the channel code's low.frequency propertÎcs (eq. (2». This is of practical importance because the sum varianee of a sequence is often casier to calculate than the complete spectrum. The sum variante of a maxentropie z sequençe Îs given by ref. 15: N aZeN) "" ｾＭｎＫｬｾｻｾＨｎＫ .. " 2 l) - k}Z ｳ￮ｮＲＨｾＩＮ N+l (3) k=l Table I lists thc capacity and sum variance versus the digit.al sum variation N (eqs (l) and (3)). The asyrnpLotic behaviour of the capacity and the sum varianee for large digital sum varial.ions N can be derived: )t2 C(N) - I - (2 In 2) (N + ｬＩｾ , (4) and (5) Performance of simpte binary DC-conslrained codes 13 TABLE I Capacity and sum variance of maxentropic z sequences with digital sum variation N N G(N) a 2 (N) 3 4 0.5 0.694 0.792 0.85 0.886 0.91 0.93 0.5 0.80 1.17 1.59 2.09 5 6 7 8 9 2.643.26 Wîth eqs (4) and (5) the following important bound between the redundancy 1 - G(N) and the sum varîance of maxentropic z sequences is derîved: 7<2 0.25 > (1 - C(N» ＨｦｾｎＩ --I > 6 In 2 4 '" 0.2326. (6) Actually the right-hand bound is within ) % accuracy for N> 9. Combining eqs (2) and (6) yields: woT"" 2.15(1 - G(N». This expressîon dearly shows the linear trade-olf between the redundancy and the cut-olf frequency of maxentropic Zo sequences. In sec. 5 this relation is used to establish a figure of merit of DC-constrained chanoel codes. 3. Simple coding schemes First some properties of codîng schemes based on codewords with an equal number of positivc and negative pulses, so-called zero-disparÎty codewords, are discussed. The number No of zero-disparity codewords with n binary channel symbols (n even) is given by the binomial coeffident The code rate R is defined accordîng 10 14 Binary channel codes The ｺ･ｲｯＭ､ｩｾｰ｡ｲｬｹ codewords are concatenaled without a merging rule. In other words the sequence is encoded without information about the history Practical and a fixed relationship exists bet ween codewords and source ｷｯｲ､ｾＮ demand the number of codewords to be a power of two, 50 coding ｾ｣ｨｭ･Ｌ that. a sub,et of the No availabk codewords should be ｵｾ･､Ｌ which effectively lowers the code rate R. Here only 'full set' cading schemes are considercd. A generalization of lhe coding principle using zero-disparity codewords leads to tht so-called alternate or low-dispadty coding 8). Besidcs the set of codeword, having zero-disparity sets of codewords with nomero-disparity are uscd. The simplest code type has two alternate representat.ions (modes) of the ｾｯｵｲ･ｃ words. The two alternate representations have opposite disparity, the choicc of the positive or negativc representation is determincd by the polarity ()f lhe ROS just before transmission of the new codeword. The cholee is made in such a way thal the absolute value of the RDS aftel' transmission of thc new codeword is rninimized, I.e. as close to zero as possible. Zew-di,parity codewords can in principle be uscd in both modes. For ease of implementation zero-disparity codeword, are sometimes divided into two sets to be uscd in both modes ＢＺｾＱＹＩＮ It is deal' if more subsets of codewords arc used that the number of codewords is larger than in lhe case of zero-disparity encoding (assuming equal codeword length). Consequently tbis allows a larger maxim urn code rate for a given codeword length. Unfortunately the power in the lowfrequency range wW also înerease if more subsets arc used sa that a trade-of[ be/ween code ｲ｡ｴｾ and low-frequency content has to be found. In the following some propertîes of low-disparity coding arc derived. Let a codeword with Icngth n (n even) consist of binary symbols Xi, I '-i.; i,;;; n, Xi f;; I - I, 11. 'I'he disparity d of a eodeword is defined by d= " IXI' ,=1 Assul1le funher that a ｾ｣ｴ of eodewords S+ is ､ｾｳｉ｜ with zero and positive disparity aml a ,sct S- with elements of zero and negative disparity_ Sct S+ consists of K + I .subSCls Su, 5" S2, ... , SK (K «: !n); Ihe elements of the subsets SJ are all codewords wÎt.h disparity 2) (0 <j.,,:; K)_ The codewords in Scan be found by inversÎon oF all n symbols of the codewords in set S., and vice versa. The cardinality NJ of l.he subset Sj is simply given by the binomina! coeflident Performance of simpte binary DC-conslrained codes i5 The total available number of codewords in S.. is K ｍ］ｉｾＬ j=O so that the code rate is I R"= n2logM. As the disparity of the cooeworos is chosen sueh that the RDS after transmission of the codeword is minimized, it is not difficult to see that during transmission the running dighal sum takes on a finite number of values. Without loss of generality it ean be assumed (by properly choosing the initial sum value at the beginning of the transmission) that the sum values are symmetrically centered around zero. The set of values (states) the RDS aSSumes at the end (or start) of a codeword, the so-called terminal or principal states, is a subset of the RDS values the sequence can take. Let the terminal digital sum of the k-th codeword be D(k). The sum after transmission of the (k + J)-th codeword is D(k+l' = D(k) ±d, where dis the disparity of the (k + lHh codeword. The sign of the disparity (iC d f- 0) of the codeword is chosen to minimize the aecumulated sum D(hl). A code with this property is said to be balanced. Wc find by inspection that D(k) can take On One of the 2K values ± 1, ± 3, ... , ± (2K - 1). It can easily be found that the total number of RDS values the sequence can take within codewords, I.e. the digital sum variation, is given by N = 2(2K - 1 + ｾｮＩ 4K + n - I. = +J (7) AS an illustration the RDS as a function of symbol time interval, the so-called unbalance trellis diagram, is shown in fig. J. The code has codeword length n = 6 ano it uses the maximum number K + 1 -- tn + 1 = 4 subsets. Note the 2K = 6 possibIe Sum values at the end of each codeword and also the N", 4K + n - 1 = 17 values that the RDS ean take within a codeword. In the computation of the power density function and the sum variance of the encoded stream we need the stationary probability of being in a certaÎn terminal state. Assume the source blocks to be generated by a random independent proce55 then the signal process D(k) Îs a simple stationary Markov process. Tbe value that D(k) can take is related to one of the 2K states of the Markov proccss. The state transition matrix P, witb entries P(i,}), where PU,}) is the probability that the next codewOrd will take it to terminal state i given that lhe 16 Binary channel codes 5 Q) 3 g 1 .Q -1 E -3 § -5 o 3 2 1 4 5 6 time ----...- fig. I. Uni)alançe trelli, diagram. ｔｨｾ in :-:n.ttt" 3. lhick Cll(v€ shows tlle path of lhe wdeword ' + - - - + -' ｾｬＮ｡ｲｴｩｮｧ ençodçr is currentty in state i, can easily be found. As an illustration we have written down the general matrix P for 2K = 6 terminal states: ··5 p= -3 -I 3 5 0 P3 0 0 Ps -5 0 0 po I 3 Po PI ｐｾ pa 0 0 Po 0 P, po P2 Pa P. PI 0 ps P2 Po PI P2 0 Pij 0 0 P'l. P, PI ｰｾ -3 -I 5 We find that P(i,f) is tht proportion of codewords in the mode llsed in state i having the appropriate disparity d for thc lransition to encoder state} = i + ｾ､Ｎ The transition probabjlity Pi equals lhe retative number of codewords in subset Si, or Ni 0 -:( i.":; K. Pi = M' Dlle to the special struchm: of the matrix P we can find the statÎonary probability vector 11: wilh entries (n(K), , .. , n(I),71(I), ... , n(K», where n(i) is the probabiHty being in the eneoder state i with corresponding sum vallIc (21 1). K () n(K - I) = I Ph j=-K-' 0 «i «K - I, (8) 17 Performance of simpte binary DC-consrrained codes where Q is determîned hom the normalization K L nU) = 0.5. 1=1 With eq. (8) K Q = 22: ipi. (9) jJlli The proof of eq. (8) is by direct verification of the identity nP""n. This amounts to: n(K)PJ + n(K - l)pJ-l'" + n(l)PK-J + n(2)PK_J+l . .. = n(j + I), j = 0, ... , K - J. After evaluatin,g we ardve at eq. (8). 3.1. Computation of the spectrum The codeword symbols are, in general, transmitted on some standard pulse shape g(t) at intervals of duration T. The signal can be expressed by: a> x(t) '" L Xi g(t - i T), lrz-w where XI is the codeword symbol value for the time slot i T < t < (i + I) T. The values that XI assume are determined by the code and the source words which are to be encoded. The auto-correlatîon fUfietion of the encoded sequence R(k T) '" E[XiXiH] is cyclo-stationary with period nT 20 ). On the assumption that the proeess is ergodic, the power spectral density funetion is 20) H(w T) = T1 I G(w T) Iｾ W(w T), where G(w T) is the Fourier transform of g(t). W(w T) is given by W(w T) "" R(O) ., +2L R(k T) cos (k (LJ r). <=1 In the sequel we assume I G(w T) I '" T. The determination of the power spectral density, lhen, requires the calculalion of thc auto-corre1ation function R(kT). Bosik 21) and Cariolaro et al. 12,13) have given a complete and systematic analysîs to fiud the auto·correlation funClion of fixed-Iength-codeword-based channel codes. Thîs analysis cM directly be used to compute the spectrum of alternate bÎ-mode codes (see for Binary channe! codes 18 examples refs 10 and 11). However. the structure of full set alternatc bi-mode codcs allows a more efficient cornputation to be presentcd here. The advantage of our approach is t.hat wc obtain a closed formul<l. for the spectrum of codes with two subsets and sirnple expressions for code spectra with a larger nurnber of subsets. Grl;:cnW:in iA) studied the spectra of a class of DC-suppressed channel codes. This class of channel codes uscd the 'polarity bit.' or Bowcrs principle 16). He observed an important feature of th is blocj(·codcd binary sequence that aha holds for the alt.crnate code principle. He found that the correlation between symbols Xit and Xj")' f h, depends on1y on the number of blocks scparating these twO syrnbols, i.e. on the number of codewords in the interval UI ,12). lf the .5vurcC syrnbols are generated by a random and independent process the consecutive codewords abo would be uncorrelated. Any correlations are therefore due only tv thc inversions of t.he codewords. II follows that. the expcctation E[xj, Xh] depends, at most, On the number of possible codeword alternations between x'il and XhThis nice property holdsif all possible codewords of the subsets are used. If these subsets are truncated, for exampIc for practical reasons to a power of two, thc following analysis can only be used as an approximation. The auto-correlation funetion RUT) can now be exprl;:ssed in terms of a set. of numbers rU), being the correlation between any two codewords having i codewords bet.ween thcm IA) R(U +j n} T) =ｾ {(n - i}rU) + i rU + I)], j? 0, I ,;;, i ｾ n, R(O) = L (In) In other words the auto"correlation fUllction R(i T) can be found by a Iinear interpolation of the set of numbers r(i). The spectrum of sueh a sequence is given by ｾ H(w T) = I - r(O) + n P(w T) \r(o) + 2 tI r(i) cos (w inT) } , wherc F(w T) = sin Ｈｾｮ w T) - (I T)' n sm '2w (11) The spectrum vanishes at w ,... 0, sa that I + (n - 1) 1'(0) + 2n L"" r(i) = ;=1 O. (12) Performance of simple binary nC.conslrained codes 19 As a direct consequence we derive for uncorrelated sequences with rU) = 0, f 0, for example codes based on fixed-disparity codewords; i r(O) = - 1 n-I --. Thc conesponding auto-correlation function R(kT) of the concatenated sequence is found with eq. (l0) R(kT) I = k 0, \k - n) . 0< ke:;:;; n, o k> n. n(n - 1) The power density funetion of ｾ･ｲｯﾷ､ｩｳｰ｡ｲｩｴｹ codeword based channel codes is n 1 T H(w T) = -n=-T II - = F 2(w T)J, which agrees with earlier results of Frankling and Pierce 9 ). The calculation Of the numbers r(i), if K> 0, is a tedious but straightforward evaluation of Cariolaro's results, therefore we merely state the results. The correlation function rei) is given by r(i + 1) = Cf pi c 2 , i;, 0, (13) where P is the state transition matrix and The 2K-vectors C1'and C 2 are given in the interval 1";;: i,;;;; K by: CI(i+K) =- ｾｻￎ n j=i+l (J-i)7r(j)Pi-i- I'u+J)ll(J+ I)P'+i} ｏｾｪ and (14) For symmetry reasons: and C 2(i) = - Cz(2K - i + 1), 1 < i -,{.,K. The correlation coeff\cienl r(O) is not found by eqs (13) and (14). The number r(O) equals the correlation of symbols in the same codeword or r(O) = E[xitxiol. where the symbol positionsj, andjz are in one symbol andj, +- 12. The coeffi- Binary channel ｣ｯ､ｦＺｾＧ 20 eienL r(O) ean with su!fiócnt aecuraey be eomputed with cq. (12). A dosed expression is given in the section on the computation of the sum variançe of , altemate ￧ｯ､｣ｾＬｾ･ eq. (21). Example 1 ln the case K = I the prceeding results for the spectrum and correlation function become manageable. We find the stationary probabilities n(1) = n(2) = i. The cardînalitics of the two subsets are given by so that n NI pj = No + ｎｾＧ + 1) 2(n = . Substitution in eq. (14) yields: C,(1) "'" - Cl(2) '""' - !?l. = _ n 2(n 1 + l) and The 2 x 2 transition matrix P is given by p = ｛ｾＺｾ｝ = [n : 2 n ｾ 2] /2(n + 1). A funher evaluation of cq. (13) gives: rU) = - (n '11IY+1' i> O. Substirutjng in cq. (ll) gives the spectrum of thc alternate code with two subsets { U Z n ZｆｾＨｷｔＩ } 1 (15) T H(w T) = (I - a) I·· I + a" + 2a cos (n ü} T) , where I a=-n+1 Evaluating yields the second"derivalive of the spectrum at De: HJI(O) -----r= Hn + 2)(n + 11). Performance of simple binary DC-constralned codes 21 Lindholm 10) and Pao IJ) have given examples of the computation usÎng the matrix procedure of Cariolaro et al. One of their examples, the spectrum of the Sb 6b code, can be used to evaluate the accuracy of the preceding analysis, when the subsets are truncated. The Sb 6b code is basically an n = 6, K "" 1, bi· mode code with 6 of the possible 50 codewords deletcd. Poo suggested to delete toe codewords ' + + + + - - '. ' - + + + + - " ' - - + + + +' and their in>Ierses. A recalculation yields the power density function of the Sb 6b codc dc· pietcd in fig. 2. The power.density function of the 'full·set' n = 6, K = 1 chan· nel code, using eq. (15), is plotted as a comparison. We note a good agreemenl (a few dB difference) between the spectra. K-'.n-6 k=---- lil "'!;J ＰＳＮｾ -40 -SOIl....l.........l...l..J--........_ 0005 ......................................,J--_..............- ........ Ol 0.01 frod of channel fr@q. Fig. 2. Comparison of the spectra of Sb6b code and the bi-mode n = 6, K = I code. 4. Computation of the sum variance An important frequency·domain property of DC·balanced codes, the cUl· oft frequency can be estimated, using eq. (2), by the sum variance of the code stream. In this section we derive a simple closed-form expression of the sum variance of DC-balanced bi-mode channel codes. The process of encoding using the alternate code principle is cyclo-stationary with period n T21), 50 that the sum varianee of the sequence has to be found by averaging the running sum variance over all n symbol positions wilhin a codeword. iherefore the running sum variance at all symbol positions withill the codeword has to be determined. Define the value of the digital sum at the k-th position in a codeword 10 be Zk. Assume a codeword starts at Z(l, one of the K possible positive terminal 22 Binary channel codf's sum values (the statistical properties of codewords starting at ncgative sum values can be found by symmetry). The digilal sum at the k-th posilion is given by k Zk I Zo + = I";;: k 0:;; n. xm , m=l The running sum varianee at the k-th position given Zo is E[d I zo] = E[(Zo + k = tI ｝ｾＩｭｘ k k. 1(-1 E[Û + ,J;, x;, + 2z o ];1 X + 2);1 J,h;./},Xh] ' m wherc lhe operator E[ ] averages over all codewords. A nice property of ruil codeword subsets is lhat E ｛ｸｨｾｩＬｬ and E[x},]. jl +- h, are not a fundion of the symbol positions Jl and h. Dcfine the short-hand notation: E[xj,] =A and E[x},x},) = B; 1 "':;j.,jz ｾ n; Jt fh· Substitution yields thc running sum variance at the k-th symbol position I zo] E[;;; = z3 + k" + k(k 2kA Zo (16) - 1)B. The sum variance si of the sequence, if starting in zo. sk I Zo, is found by avcraging the running sum varîance over all n symbo! positions sk I Zo '= z$ + ｾＨｮ + 1 '= - n L Il E(z; I ZII) kcl 1) t· A(n + I)zo + ä(n' •. 1)B. TakÎng into account the probability starting in Zo and averaging yields for the sum variance si K si ｾ E[z3l + ｾＨＱ t 1) + Hn' - I) B + 2(n + I) AL (2i - 1) n(i). (17) ;=L We eliminate A by !loting the periodicity, i.e. E(zJ] (16) yields E(z,; I zo] = zg + n + 211 A Zo + n(n = ｅ｛ｺｾ｝Ｎ Evaluating eq. - I) B and averaging yiclds K Elz,:] = E[z$) 50 that with E[d] = + fI + 4n A L (2i ,=) + n(n ｅ｛ｺｾ｝Ｌ K ZA I) n(i) L (2i 1",,1 I) nU) = - Hl + (n - I) BJ. - 1) B. Performance of simpte binary DC-conslrained codes 23 Substitution in eq" (17) yields s1- = ｅＨｺｾｬ The varianee ｅ｛ｺｾ｝ - Hn 2 - I) B" (18) is given by :« E[z5] '"' 2 L (2i - 1)2 n:(i). (19) 1=1 4.1. Computation of the correlation B = E[XJlXh] We now calculate the correlation of the symbols at theh-th andh-th symbol posîtion in a codeword. It is obvious that for iJ == h: E[x),xj,] = 1. If i. 1= 12 same marc work is needed" In that case E [Xj, XjJ = Prob(xJ! = Xj,) - Prob(xh = 1 - 2 Prob(xJl "" Xi,), h -+ * x),) h" (20) Assume a codeword to be an element of subset SI in S•. The probabîlity that a symbol at position h in the codeword equals I is Prob(x}l = I I S = Si) .!.-n ｮｾＨ -- + i). The probability that another symbol at position )2 ",), within the codeword is -I is 1 Prob(xj, = - 1 I Xj, = I, S = Si) = ｾｮｾ . ; . So that and with eq, (20) 4F E[xhx» IS= SI] = - n ｾＭﾭ n-I lf we further take into account the probabîlity Pi that a codeword is an element of subset Si we tind for the correlation 1 B=E(x1J Xh] = _ _. -4 n ｾｬ f = n- I '2 Pi (21) Binary "hannei codes 24 Combining with eq. (18) yields s; = E[z31 + t(n + I) {I - ｾ t PI} , (22) /2 Define K um=:Li"'PIl (23) mE(I,2,3}. i=l After some algebra combining eqs (8), (9), (19), (22) and (23) yields the varianee of Lhe terminal sum values (24) and the sum varianee of thc complete sequence (25) The computation of the sum varÎance was til! now generally treated. Eq. (24) is only based on the aSSllmption of the DC-balanced binmode struclure of lhe transition matrix P. In cq. (25) we further assumed Lhat the expectations are invariant with respect to the position in a codeword. In the next examplcs we substitute values of the cardinalities of the subsets in various code embodÎmenis. EXil.lI1ple 2 The special case of zero-disparity eodeword bascd systems, i.e. K (E[z3] = 0) sJ = ｾＨｮ = 0, yiclds + 1). ThÎs result was earlier obtained by Justesen 16). Examph: 3 lf (wo subsets are used for eneoding a simpte result ean be obtained. We found (see exarnple I) n PI = 2(n + l) . Substitution and working out eqs (23) and (25) yields sf = A(n + 5). Performance of simpte binary DC-conslrained çodes 25 Example 4 Bowers 18) and Carter l7) proposed a construction of DC-balanced codes as being attractive because no look-up tables are needed for encoding and decoding. They proposed a code with (n - 1) source symbols being mapped without modification onto (n - 1) symbols of the codeword. The addiüonal n-th symbOI of the codeword, the so-called 'poJarity bit', is used to identify the polarity of thc transmltted codeword. Assume that the encoder is designed in such a way that the first (n'- 1) symbols equal the source symbols and the n-th symbol is one. If the sum at the start of the transmission of a new codeword and the disparity of the new codeword have the same sign then all symboJs in the codeword (including the po!arity bit) are inverted before transmission. If the disparity of the codeword is ｯｲ･ｾ then the po!arity of the codeword is randomly chosen. Accordingly, the nllmber of available zero-disparity codewords is reduced to half of those used in the bi-mode OC·balanced codes as described in sec. 3. At the receiver a codeword inversion call be Iloticed by the sign of the polarîty bit. The spectra! properties of the 'polarity bit' encoding principle were studied Greenstein used a computer simulation by Greenstein (3) and Brugia et al. ＮＩｾＱ to estimate the power spectra! density function and Brugia et al. applied Cario· !aro's numerical method 12,18). With the precedîng analysis to calculate the sum variance of K + ! subsets based channe! code a very simp!e expression for the sum varianee of the polarity bit encoding construction call be derived. The code rate of the polarity bit code is R = I _1.-. n The number of subsets is K + I '" ｮｾ + 1 (n even), sa that the number of terminal sum values is 2K = n. The effective number of zero·disparity codewords No is halved by the random choke of the 'polarity' of these words with respect to the maximum number used in the low·disparity coding principle, Le. No" +(i:)- The number of codewords having nonzero disparity is not changed The total number of codewords having zero or positive dîsparity is: M"" 2n - 1 • 26 Bin(1ry channel codes Using some propertjes of binomial coefficients a routine COlTIplltation yields: Uj = n ＨｾＺＩ U2 =' ｾｮ 2- C"+1), and Evaluation of eq. (25) yields sft where ｾ［ｸ｡ｭｰｬ･ c= H2n - 1), (26) st is the sum variance of the polarity bit encoded sequence. 5 lf all possiblc codewords are used, i.e. K"" !n, thc following results are derived: and 2 1 sl" = 1i(5 n - 1) - n + 1 12M 2". Other values of the number of subsets dÎd not yield sirnple rcsults. Using a computer s; l;an he found as a funetion of K and n, Thc rcsults of the computations are collected in table 11, where thc redundancy 1 . Rand thc digital sum variation N of the code arC a150 given (see eq. (7». TABLE II Sum variance, digital sum variation N and redundancy 1 - R of alternate codes n K N 2 2 4 4 6 6 6 8 8 8 8 0 1 1 2 1 2 3 5 3 1 2 3 4 7 11 9 13 17 II 1:'J 19 23 s1 .5 1.167 U 2.56 1.83 3.20 3.94 2.17 3.68 4,92 5.32 1-R .5 .208 .170 .135 .145 .107 .101 .U8 .092 .083 ,081 Performance of simpte binary ｄｃｾ｣ｯｮｳｴｲ｡ｩ･､ codes 27 After a study of tables land Il we arrive at the following conclusions: the simple code with n == 2 and K "" 0, the so-ealled 'bi-phase' code, achieves 1000/0 of the rate and the sum varianee of the maxentropic sequence with digital sum variation N = 3. This result was eartier derived by Justesen U). A new result is the simple alternate code with n "" 2 and K = I achieving 100070 of the rate and the sum variance of the maxcntropic sequence with N = 5. Fig. 3 shows for several codes the sum variante as a funetion of the redundancy I - R with K and nas parameters. As a reference the sum varianee is 5 k l> 0 v 1 x 2 a Polorlty bit 05 '-.O-S----,O.1...,1----------"'l'. 0 RedlJndancy (log) Fig. 3. Sum vacianoc and rcdundançy of various codes. plotted versus the redundancy 1 - C(N) of maxentropic z sequences (see eqs (1) and (3». Notice in the figure that the performance of zero-dîsparîty cncoding diverges with growing codeword length from the maxentropic bound. Going to more subsets, K = 1, Îs worth-while in a large (l - R) range. In order to obtain some insight into the accuracy of Justesen's relation, eg. (2), the cut-off freguency was ealculatcd using numerical methods (egs (ll), (13) and (14» and compared with the reciprocal of the sum variance of the code. In the range given in table 11 we found that the relation between sum variance and actilal cut-olf freguency is accurate within a few percent. S. Efficiency of simple alternate codes It is customary 22) to define the rate efficiency of a channel code as the ralio of the code rate and the noiseless channel capacity given the ehannel eonstraints, or 28 Binary channd codes R e"" --' C(N) > where eeN) is the capacity of the Chien channeJ (eq. (I» alld N is the digital sum variation of the channel code. As an example assume n = 4 and K = I. In table n we find in this case N = 7 arld R = 0.84, so that for this challnel code an efficiency e = 0.84;0.886 = 95070 (see table I) is concluded. The sum variance of the code is 1.5, which amouuts to 1.5/2.09 = 72% of the sum varianee of the maxentropic z sequence with N = 7. 11 is clear that the comparison of DC-balanced channel codet; with maxentropic z sequeucet; should take into account both the sum variance and the rate. We come to the following definition of encoder efficiem;y E = ll.=....f.(!V) I Ｈ｝ｾｎＩ (27) [1 - ｒｬｳｾ The efficiency E at; defined in eq. (27) compares the 'redundancy-t;um variance products' of the practical code aud the maxentropic sequence with the same digital sum variation as the practical code. Notc that for N> 9 the 'rcdundancy-sum variance product' of maxentropic z sequences is appWximately constant (see eq. (6» and equals 0.2336. The efficiency l::-' of various codes verSut; codeword length it; plotted in fig. 4. The polarity bit encoding principle has a simple Împlcmentation, but as wc can notice from figs 3 and 4 ît is far from optimum in the depicted range. We conc!ude from the figures that for a giveu rate a sum variance eau be expected 1.0 Q.8 w 0.6 >. u ｾ "0 ..... ..... l.IJ 0.4 k"'--t:l--o-,o--<,,"-->----<l-----i:......-<J I:> 0.2 0 " 1 x 2 " Polorîty bit OO'--_..l-.--=-----L-_-----l._ _l........-----.J o 4 8 12 16 20 Codeword length n Fig. 4. Efficiency of ,imple allernale channeJ ｣ｯＨｬｾＢ Performance of simpte binary DC-COflstrained codes 29 approximately 2.5 times largeT than that of maxentropic z or e.g. codes based on alternate codes wilh K = 1. Eqs (6), (26) and (27) show that for large codewords the efficiency E asymptotically decTeases to 350/0. ln the range 0.8 < R < 0.9 notiee for a desired low-frequency suppression the priee of this simple code: a loss of approximately 1507D in code rate. Greenstein (ref. 18, Pl'. 1125) concluded: 'If the quantity held fixed is the rate, the zero·dispaTity code must use a larger value of the coJeword length than 'polarity bit' for which its spectral superiority all but vanishes'. Quite a different conclusion can be drawn from figs 3 and 4, showing the superiority of zero-disparity codes with respect to 'polarity bit' encoding in thc most practical (l - R) intervaL A ealculation shows that for an unpractically large codeword length n> 160 the polarity bit encoding principle outperforms thc <:ero-disparity encoding. In the following wc study the efficiency of bi-mode codes when the codeword length is allowed to becorne very large. A rearrangement of the preeeding resuls is plotted in fig. 5. The figure shows the efficiency Eversus the norrnalized parameter K' "" K/V;; with the codeword length n as a parameter. 0.8 ＢＭｬｮｾＱＰＧ - .....- oＮＷＱＫＭｉＬＲＨＰｦＢＧｴｾｈＪ ,. L,IJ 1;; I ]I .... ""1\ 0,61t+++++-F,50.--l:::;rl'---boH:B::l-è ｾ J Ｑｯｾ［ＰｪＢＧＭＮ 05 c "/' 0 4 ' ;mo . 0.3 f-- '--I-.-f-.+-t-++-H 00 _.- _.+-+-+--1++-1 ｾＧ｜ＮＭｆＫｴＭｬ _..""C+,.-! ...I--+++-t+l ,/ kf' Tii 0.2 1 0.05 I 0.1 -kNn(log) 1 5 Fig. 5. Asymptotie efliciency of simple allernate channd cod<s. The curves are plotted as solid lines; note however that only a discrete set of points is achievable. The asymptotic curve with n --->- 00 was calculated using the euor-function as an approximation of the binomial coeffi.cients (see for example ref. 23). We notiee that the asymptotic curve achieves a maximum in efficiency whcn 1(' = 0.47. The maximum efficiency E is in that case approximately 0.56. Binary chann.e! codes 30 6. Conduliions We calculatcd the power dengity funetion and gum variance of simp!e bimode DC-8Uppresging block·coded channel codes, We eompared the rate and sum variance of the chamlel seq uence with maxentropic unbalancc eonstrained sequenceS. Wc found that codes with a smal! rate have a good efficiency. Two cOdeS with digital sum varîalion N = 3 and 5 werc shown to be optimal in Chicrl's sense. Large rale codes become asymptotically bad with growing eodeword length. l3y properly choosing the number of subsets of thc code we earl maximize the code efficiency to approximately 56070. Codes based on thc polarity bit principle have a simple implementatÎon, but the efficiency is bad in the practical range of codeword length. Acknowledgentent The author is indebted to R J. M. Jacabs, who calculated and plotted thc results shown iu fig. 5. REFERENCES ') D. Il. Wat.er>, lnt. J. E1eclron. 55, ＱｾＹ (19B3). ') R. M. Brooks and 1\. Jessop, Inl.. .I. Electron. 55,81 (1983), ') .I. C. Mallinson and J. W. Milier, Radio and Elec, Eng. 47,172 (1977). ') M. Davidson, S. F. Haa,e, J. L Machamer and L. H. Wallmall, IEEE Trans. Magn. MAG-U, 584 (1976). ') .J. P. J, Heemsk cr k and K. A. Sc ho u hamer 1mmi n k, Philip, Techn. Rev. 40, J57 (19B2). ') K. w. Cattnmok, Principles of pulse code modulation, lliffe Books Ltd., London, 1969. ') K. W, Canermolc, Int. J. Electron, 55,3 09S3). ") J, M, (,riffith" EleCtron. Let!. S, 79 (1969). ') J. N. I'rankling alld J. R, Pierec, I"'EE Iran•. COl'IllllUn, COM-20, 1182 (1972). 'ol D. A. LindholIll, IEEE Trans. Magn. MAG-14, 321 (1978). 'I) O, S. Po", IEl' Proc. F 128, 323 (1981). ") O. L. Cariolaro and G. P. Tronea, IEEE Trans, commllO· COM-U, 1555 (1974). "') (,- 1,. Cal'iolaro, G, L. ｮＧｾ｢ｯｲ｣ｩｐ and G. P. Ttonca, Inl.. J. Electron, 55. 35 (1983). ") "f. M. ChieIl, BeU SyS(.. Tech. J. 49,2267 (1970), >0) J. JusleSetl, IF.I':F' Tran,. Jnform. Theoty IT·28, ＴｾＷ (19B2). 1&) F, K. Bowps, US patent No. 2,957,947, 1960. 17) R, O, (:arter, Electron. Lett. I, ＶｾＨｬＹＶｾＩＮ ") L. .I. Grecn'lein, BeU Syst. Tech. J. 53. 1103 (1974). "') O. BrlLgia. R. Pielroi\lsti "ncl A. Roveri, AHa Freq. 45, (>95 (1976) (illltalian). tol W. R. Bcnnclt, Heli :';y,t. ieeh. J, 37, 1501 (19.18). ") B. S. HO,ik, Heli Sy,t, Tech. J. 51, 921 (1972). ") r. A. Franaszek. IBM J. Res. Develop. 14, ,176 (1970). ") W, Fellcr, An intmduction 10 prol:>al:>ililY theory and its applications, volume I, Wiley and 8011, lnc., New Vork, 1959. Construction of binary DC-constrained codes 31 CONSTRUCTION OF BINARY OC-CONSTRAINED CODES Abstra.ct The systemalic design of DC-conslrained codo, basod on eodewords of fixed lenglh is cOdsideted. Silllple recur.ion relalions for onumcraling lhc numher of eodewords salisfying a conslrainl On lhe maximum unbatanee of ones and zeros in a codewotd are derived. An onumeralivo schomo for eneodÎng and decoding maximum unhalance conslrained wdewords wilh binary symboll is clevcloped. Examples of COnslruclions of transmission systems basod on unbalaneo eonslrained eodewords are given, A worked example of an 8b lOb chann<;l ,00<; Î> given being of partieular interest hecause of ils practical simplicilY and relalive dfideney. 1. Introduction An important requîrement when designing a digital transmission system is the shaping of the power spectral density function of the encoded stream by adding redundancy 10 match the particular physical properties of the transmission channe!. Many practical cxamples of transmission systems can be mentioned that do not pass the low frequencies with sufficient signal·to-noise ratÎo. Filtering out the low frequencies can only he done if the cncoded signal itself is not serîously dîstorted by this filtering. Shaping the spectrum of the encodcd stream by coding can cope with this phenomenon. Shaping, however, can only be done if some kind of redundancy is added to the source sequence. The field of application of digital channel codes Wilh suppresscd low frequency components is quite broad, We find applications in transmission systems over and În storage media such as optica] recording 4) fibre or metallic cable Ｑ｟ｾＩ and magnetic recording 5). Transmission systems designed to achîeve De suppression are mostly based on so-called block codes, where the source digits are grouped in SOUrce blocks of m digits. The m digit blocks are translaled using a code book into blocks of n digilS called codewords. Cattermole 6,7) and Carter 8) gave examples of chan. 32 8inary channel coi!!!. nel codes based on block codes with codewords having zero or low disparity, where thc disparity of a codeword is dcfined as the difference of the number of alles and zeros in the eodcword. Justesen 9) showed a close relationship betweell the low-band cut-of!" frequency and t.he varianee of lhe running digital sum of a DC-suppressed encoded 'lream. The (running) digital surn is defined for a binary stream as the accurnulated sum of ones and ,.eros (a zero eounted as -I) hom the beginning of the transmission. Furthcrrnore Chang et al. IQ) found thal the sum variance plays all important role in thc bit error rate when the transmission channel is AC-coupled. The simpk codes as previously discussed have the disadvantage that codeword length, the surn variance and rate have a fixed relationship. In this paper the codc design is based on codewords having a constraint on the maximum Dllmber of assumed sum values, i.e. those codewords ,He delet.ed having a relative large contribution to the sum variance. We start in sec. 2 with a brid description of the properties of scquences thaI. assume a limited number of digital sum values. These properties are used to est.ablish a figurc of merit of DC-constrained codes. The actual basis of code dcsign is given in sec. 3. It provides a method for the enumerat-ion of codewords with an unbalance eonstraillt. In .general the effiçiency of a channel code can be improved if Jarger codewords are allowed. Unfortunately the complexity of the cneoding and decoding hardware increases exponentially with inereasing eodeword length if a direct rncthod using look-up ｴ｡｢ｬ･ｾ of the source words and their channel representations is used. Section 4 dcals with algorîthms for encoding and dccoding of constrained codewords growing polynomially with ｩｮ｣ｲ･｡ｾｧ codeword length. Finally, in sec. 5 the problem is addressed of sclçcting sets of codewords to design DC-suppressed channel codes. A worked cxarnple of a new 8blOb transmission code is given showing sup<;rior features with respect to other codes with rate 8/10. 2. Prollerties of z-constrained sequenees This seclion provides a description of some properties of OC-constrained generated by a Markov souree having maximum entrQPY- This study gives a relation bet.ween the redundaney and the sum variancc of codes. USÎng this theory we can derive a new figurc of merit taking into account both the redundancy and t.he resulting frequency range with suppresscd components, Consider binary sequences x = (x I , . . . , Xi, ... ), ,xi e I -I, 11. The so-called (running) digital sum of a sequence plays a significant role in the analysis and syruhesis of codes of which t.he spectrum vanishes at the low frequency end" The digital sum ZI is delined as: ｾ･ｱｵｮ￧･ｳ Construction of binary DC-constrained codes Zi = I 33 i (I) Xj, ja! Chien 11) studied sequences X assuming a finite number of sum values, Sequences having a constraint on the maximum number of Sum values are defined here as z (-constrained) sequences. The total number of sum values a sequence assumeS is often caJled thc digitaJ sum variation. Chien determined the information capacity of z sequences as a function of the number of allowed sum values. His results will be briefty discussed in the following. Taking z/ at any instant i as the state of the signal strcam x, then the bounds on Zj define a set of allowable states of a Markov SOurce. Each transmission of an additional symbol XI can be considered as a transition frortl OnC state tO another. This transition can be represented by a so-called coonection matrix. For the N state Markov source an NxN connection matrix D is defined by D(i,) = 1 if a transition from state i to state) is allowable and D(i,) = otherwise. The connection matrix D is for the z·constrained sequences given by D(i + 1, i) = D(i, i + 1) = I, J";:; i";:;N - I, (2) D(i,}) = 0, otherwise. ° Note that D is a symmetrie Toeplitz matrix. As an example We have written down the matrix D for N == 5. Ol D= l 000 1 0 0 1 0 1 0 0 1 1 1 0 0 ° o 00010 The above representation is related to the input-restricted noiseless channel studied by Shannon u). The Markov information source model enables us to compute the channel capacity, dcfined as the number of bits per channel symbol that can maximally be carried by the constrained channel sequence gen· erated by the Markov source. The channel capacity equals the maximum entropy of the Markov SOurce which is given by the base-two logarithm of the According to Chien the largest real eigenvalue of the connection matrix D ｉｾＩＮ maximum rea I eigenvalue of the Toephtz matrix is given by Je = 2 cos (_ft_) N+ 1 ' (3) so that the capacity is C(N) = 2JogÀ = I + ｾｬｯｧ｣ｳ (N: I)' (4) 34 Binary channel codes Mörc interesting properties of z sequenees were dcrived by Justesen 9 ). He calculated thc power density funetion of z sequeneC$, Thc transition matrix of a maxentropic z·constrained source was determined, i.e. a Markov souree with transition probabilities chosen in such a way th at the souree has maxÎmUm cnlropy. In order to determine the transition probabilities th at maximi7.c lhe entropy of the N state souree, one must determine an eigenvector p" = (pa(l), P.(2)", " Pa(N)) that satisfies (ref. 14, pp. ZW) p.D '" Ap". (5) The joint probability pU,)) of a transition from stale i to jol' the maxentropic Markov SOUfce is given by p(z', j') '-1 D(t', j') pa «(!.,:.)!.. , Pa [ = Il .• = I,.j I , 2" , " N. Using the fact that. D is a Toeplitl matrix we find for the stationary probabiiity p,(i) being in state i 9) p,(i) = Thc cut-aff frequency N (00 ｾ I ｳｩｮｾ (;/'; 1)' 1< i.,,;: N. (6) of a DC-constrained sequence is defincd by 9,15) H(wo T) --1"'-- 1 = 2' (7) wherc H(w T) is the power densilY function of the sequence versus frequency and T the time duration of a channel syrnbol. Justesen found a remarkable and very useful n;latÎon (8) where s! is t.he sum varianee of t.l1e sequenee. This simple relation is not only restricted ((J maxentropic sequcnees; exampies of practical channd codes investigated by Justesen show that the reeiprocai relation (eq. (8)) between cut-olf frequency and sum variance ｩｾ vçry accurate. Thi$ rnotivated us 10 use the sum variance as a criterion of the lowfreqlH;ncy propel'ties of a channe1 code. This is of practical imponance because tbc surn varianee of a sequencc îs often easier to calculate than the autocorl'clation function or specll'um. The sum varianee of a maxenlropic z sequem:e can bç found using eq. (6) h;'(N + Ij - k I 2 " 2 ( 11k ) sm Ntl' (9) Construction af binary DC-constrained codes 35 TABLE I Capacîty and sum varÎance of maxentropic z sequences with digital sum variation N N C(N) 3 4 5 6 7 8 0.5 0.694 ＨｦｾＨｎＩ 0.5 0.80 1.17 1.59 2.09 2.64 3.26 0.792 0.85 0.886 0.91 0.93 9 In table 1 the capacity C(N) and sum variance versus the digital sum variation Nare collected (usîng eqs (4) and (9». For large djgital sum variation N the sum variance and capacity can bc ap· proximated by (2 In 2) (N C(N) - 1 - q2(N) ｾＨＱｾＲｾＩ - + ＱＩｾ (N , + ｉＩｾＮ This leads to following relation bet ween the redundancy 1 - C(N) and the sum variance of a maxentropic Z sequence 0.25 ｾ (1 - C(N» q2(N) > ＨｾＭｾ I) In 2 "" 0.2326. (10) Actually the right·hand bound is within 10/0 aCcunl.Cy for N> 9. This relation shal! be used later 10 establish a figure of merit of DC-constrained codes. 3. Enumeration of (N, Tl) sequences Let x "" (XI,X2, ..• ,x.), Xp", 10, IJ be an (n) sequence. We deftne thc digital sum Zk of the (n) sequence k Zk = L: (lxi - I) + Zo, I s.; k <::<; n. lol The rescaling has been done in sueh a way that 'zeros' are counted as -I. The initial sum is denoted by Zo. Binary channel ("odes 36 Definitioll A binary (n) sequence ｩｾ çallcd an (NI, N 2 , n) sequence if Zk (I ,( k -,;:;, n) is bOllnded between N, and N 2 (N2 ) NI)' Without loss of generality choose NI '" J and N2 = N. We further use the shorthand notation (N, n) sequence la denote a (1 ,N, n) sequence. Let T= T(z",rI;N,zo) be the set. of(N,n) sequences fOr given length n, surn constraint N, inilial sum Zo and terminal sum Zn, Thc cardinality of tIJc Sel T can be füund by manipulatioo of the NxN connectîon matrix D. Let I TI =A(zn,n;N,zo) denotc t.he number of distinel sequenceS of length n starting in state Zo and terminating in state Zn, tIJen A(z",n;N,zo} = D"(zo, z,,), (11) where D"(i,j), ｬｾ［ i,j 0;:; N, denotes the i,j-th element of !.he n-th power of the matrix D. In order la avoid many rnulliplications required in the matrix powers, observe [he specific struet.ure of the conm:çtion matrix D, We I1nd the following recursivc rclalions for determinÎng [he number of sequences. A(zo, 0; N, zo) '" I, Let. AU, 0; N, zo) = 0; and A(O,j; N, zo) = A(N + a1l i f Zo I,j; N, zo) = 0; 0 ｾ［ｽ < rI. Let. furLher A(i,}; N, zo) " A(i - 1,j·- I; N, zo) + + AU" l,J - I; N, zo); 1 ｾ i ｾ N; I ｾｪＺＨ n. (12) We can observe that exccpt at the boundarics the recursive relations are simîlar the relations for the binomial coeffjçjcnt> in the Pasca) triangle. 10 4. Enumerlltive coding of (N, n) sequences In [his section a general enumerativc tcehnique for deeoding (N, n) se· quences is dev<:loped. Ta that end we r:stablish a 1 - 1 lTIapping fram a set Tof (N, 'I) sequenees anta a set of întegers 0, 1, ... ,1 TI - 1, where I T I is the cardinality 0 I' jhe set T. Let /(z",n;N,zo) be thc set of (N,n) sequences 1'01' the given constraints. Thc sct. T ean be ordered Iexicographkally as follows: lf x = (Xl, .. . ,x,,) and y = (Y1" .. ,y,,) are elemcrlls of the set. T [hen y is called Iess lhan x, in shorty <ox, if th ere exists an i, I ,s;:; i ＢＬＭｾ n, sueh that Yi <. Xi and Xi = Y)I 1 <j < i. Fol' cxample '00101' <0 '00110'. The position of x in the lcxicagl'aphieal ordering of T is defined to be the rank of x denoted by r(x), i.e. r(x) is the nurn ber of all y io T with y <ox. COl/Struction of binary DC-constrained codes 37 Theorem 1 The rank r(x) of the binary sequences x in T can be calculated according to r(x) = . L A(st + 1. fI t=1 k; N, ｺｾＩ . Xt. (13) where k-l I St = Z. - (2x1 - 1). i=l Praal According to Cover 16) the lexicographic index r(x) is given by n r(x) = I Xr n.(XI,X2 •... ,Xj_I,O), J=I where fI,(X"X2, ... ,Xk) denotes the numbCr of dements in Set T for which the first k coordinates are given by (XI,X2, ..• ,Xt). The first (j - 1) coordinates of x detenniue the state Sj J-l Sj=z, - L: (2x I - I), ｉｾｬ as n )-1 Zn = Zo + L: ja;:l (2x1 - 1) - 1 + L (2x i - 1). i.j ... t The number of elemenrs in set Twith prefix (X"X2'" .• Xj_hO) equals the number of (N, n - j) sequences starting with digital sum Zo and terminating with Sj + 1. In other words Example 1 o Consider the set T(z.,n;N,zo) = T(4, 10;7,4). It follows from Z.=Zo that all elements of the set have an equal number of zeros and ones. It can be verified that the sequence x = (l 1000101 10) is a member of the set and can be ordered according to theorem 1. The ranking procedure is visualized in fig. 1. The array in fig. 1 has N"" 7 columns and fI = 10 rows. The entries are given by the weighting coefficients A(i,);N,zo). We can easily verify the recursion relations eq. (12). Ranking proceeds as follows. Start at the bottom at the z. '" 4-th column. The Ieft-most symbol XI of the sequeoce x = (1100010110) is a I. Move ooe 38 " 0-. 0:B 3 6 @) @...... 10 14 ﾮｾ ..@ 'V 0 0 0 4 14 34 r(1100010110) = '190' 0 14 48 ... @ 1 ｾ｟Ｖ Fig. 1. y x 0 (2)-'8 1 2 4 . Binary channel codes ＡＨ･ｾｩｬ｡ｲ･ｮｇ＾Ｇ｜｡｣ｾＮｉＨｐ IriangJe. step in the X direction I.e., towards 68, and record the entTY at a single step in Ihe X diTectlon, Le. 116 in an accumulator. The next symbol is again a I, so mOvC agaln in Ihe X oirection towards 34 ano aod the entry 68 10 the current accumulator vatue, I.e. 116 + 68 = 184. We further proceed as depicted in the tlguTe. Eventually we arrÎve at r(1100010110) = 116 + 68 + 4 + 1 + 1 = 190. The procedure given heTc Îs a generalization of a procedure given by Schalkwijk 17) using Pascal's triangle with the binomial coefficients as enlTies for ranking codewords with a given disparity. The invtrse function, conversion hom a given integer I to an (N, n) sequencc with rank I is also analogous to Schalkwijk's algorithm and can be carried out as fol1ows. ｉｮｖｾｔ･ algorithJil Let an integer 1(/= 0,1, ... , I TI - I) ano the set T(zlI,n; N,zo) be given. The following algOTilhm finds x such that r(x) = / (see Cover J6». Step I Ij A(z" + 1, ti - 1; N, zo) ｾ n - I; N, zo) else set X, = O. I then sel Xl = 1 and set I = 1- A(zn + I, Step 2 SR For k=2, ... ,n, ifA(sk+1,tI-k;N,zo)<I, Whe:TC SJ=ZO and = SR-J-- (;2Xt-l - 1) then set XR = 1 and set J = 1- A(Sk + 1," - k; N, zo) else set Xk = O. 39 Construction of binary DC-constraÎned codes Till now the procedure was confined to the ranking of one set (N, n) Sequence8. 8ya straightforward generalization it is also possîble to rank sequenceS starting from a predefined set of states and all terminating in the same state. Assume L codeword subsets starting with sum valucs gîven by ZI, i= I, ,L, I.e. consider the set consisting of the subsets T(z.,n;N,z;), i = 1, , L. Ranking of tnc set can be accomplished with • r(x) where = L L L A(Sk + 1, n ｾ k; N, Zi) • Xk, (14) k=1 i=l kaj Sk = Z" - L (lx j ｾ 1). j=1 This follows from the fact that the subsets are disjoint. The inverse function, converting an integer I to an (N,n) sequence, can also be done in the same maDner. The ranking given in theorem 1 enables the design of channel encoders of moderate comp1exity. We need storage capacity for approximatcly N/2 x n non-zero weighting coefficients, a full adder and an accumu1aror ro store rhe intermediate and final results. This has to be compared with a look"up table of 2'" entries if a non-algebraic method of coding is used. The weighting weffi· cii;mts look.up table ean be used for encoding and decoding, which is attrac· live fOr application in recording. We do not need a multiplier because x Îs binary valued and so the multiplications are simple additions. Lyon 1$) gave an cxample of a practical embodiment for the coding of codewords with given disparity using Pascal's triangle. His circuitry can in principle be used with only a modification of the entrÎes in the look·up table. 5. Tl'ansmission codes based on (N,n) sequences WheD the state-independent encoding strategy is used rransmission codes that satisfy rhe digital sum constraint can readily be found. Enumerative en· coding and decoding can always be used in a straightforward way. A drawback of tnÎs simple codîng scheme is that generally speaking good coding efficiency wil! be reached for 'large' codewords. A much more powerful merhod is based On the state-dependent encoding principle. A problem will arise for the assîgnmenr of codewords ro source words in all encoder stares, when the algebraîc coding algorirhm and stare-independent decoding are to be combined. 5.1. State-independent encoding In the special case of state-independenr encoding all codewords start and terminate in the same sum state. Consequently the codewords have <:ero·dis· 40 Binary channel r.:odes parity. The number of distinct (N,n) sequences can bc found by the enumeration method given in the preceding section. Table Ir lists the number of zerodisparity (N,n) sequences as a function of N and n. The terminal sum state has been chosen to maximize the number of codewords. Transmission codes can readily be designed with this tabie. TABLE n Number üf zero-disparity (N, n) sequences rr 10 12 14 32 4 16 5 13 34 89 6 18 54 162 6 19 61 197 6 20 68 232 6 20 69 241 64 233 486 638 792 846 128 610 1458 2069 2704 2977 4 ｾ N= 3 4 5 6 7 8 6 1; 8 E;\;arnple 2 lf n = 8 ,md N", 7 the table shows that there are 68 distinct (N, n) sequences satisfying the given constraints. This cnables the mapping of m '" 6 souree bits onto the n = 8 channd symbols. Hence the code rate R = mln = 3/4. Wc only need 64 codewords, sa four codewords are deleted. lf the lexicographical largest or smallest codewords are deleted then the enumcratîve coding scheme can be applied without modification. Tlle rate efficiency is defined with respect to capacity of the constrained channel 13) R e = C(N) • where Nis the digital Sum variation of the channel code. The code of examplc 2 has a rate efficiency (see table I): 0.75/0.886 = .85. The Sum variance. adopted in sec. 2 as a low-frequency criterion, could be found by an exhaustive computation of the variance of all codewords. A computation growing polynomially in the codeword length n can be based on the enumeration method of the preceding section. The probabllity thar the digital sum equals i at symbol position j within a codeword is given by the number of paths starting in ZQ, passing state i at position j and eventually tcrrninating at Zn, dÎvided by the total number of paths from ｾｑ to ;<;n, ConSfructiOrl of binary DC-constrained codes 41 In the case of zero-disparity codewords we have Zo "" Zo, 50 that according of zero-disparity (N,n) sequences can be found by to the enumeration method the sum variance Ｄｾ s 2 "" ｾｾ /;:0 ｾｉＨＭ tS1 1 ｾＩＲａＨＭ 1- Z -'N• Zo )A(-I,}. -'N• Zo ) • I, ti - j. (15) where the average digital sum Z is given by ｚ］ｾ - ) f.'ｾ l ｾｬjf:ol-A('l,n-J;. N .zo)A(-' l,j;N.z o I and the cardinality ITI = A(zo. n; N, zo). The code of example 2 has a sum varianee Ｄｾ = 1.38, which amounts to 66070 of the sum variance of a maxentropic sequence with N = 7. It is clear that the comparison of DC-constrained codes with maxentropic sequences should take into account both the sum varianee and the rate. Therefore a new figure of meril, the efficiency E of DC-constrained codes is introduced. Eq. (10) showed that in the caSe of rnaxentropic z sequences the product of redundancy and sum varîance is approximately constant. The efficiency E is now defined as the ratio of the products of redundancies and sum varianees of the maxentropic z sequencc and the actual code, or E == (l - C(N)j ç2(N) (1 - R)S2 (16) Fig. 2 shows the efficiency E of zero-disparity codewords of word length n with N as a parameter. The efficiency grows asymptotîcally to unity with 10creasing codeword length_ 1.0 0.9 w :>. 0 c 0.8 .Q! u .... ;;:: UJ 0.7 6. "J 0.6 0. l( 10 0 20 30 Codeword length n Fig, 2. ｹ｣ｮｾｩｦｅ E of zcrO-ÓiSNrity based codes. 40 Binary channel codes 42 5.Z. State-depel'ldent encoding The case of state-independent encoding is straightforward: all ｾ､ｲｯｷ･￧ end with the same sum value. More efficient coding, i.e. achieving a targer rate with smaller codeword lengths, is possible if the ･ｯ､ｷｯｲ､ｾ are allowed to terminate in a predefined set of sum states. The problem is naw to establish this set of states and codewords for an optimal transmission. ｆｲ｡ｮｳｾ･ｫ 12,19) has developed an algorithm for determining the existenee of a set of sa-caJled prineipal states for the design of collStrained sequences sueb as run-Iength-Iimited sequences. A necessary and sufficient condition for the existence of a code is the existence of a set of principal states with the property that from each of these principal states there cxists a sufficient (with respect to the numher of source wards) number of distînct codewords la the ather principal states. The minimum number of paths from a principal st.ate t.a another detcrmines the information rate. The met.had autlined by Franaszek does not guarantee the important. feature of st.ate-independent decoding. A code is state-independent decadable 19) if every mapping of a souree word int 0 a codeword has a unique in verse independent ofthe encoder code pages, where a code (boak) page is t.he subset. of codewords starting in a princlpal state. This property guarantees that cvery (noiselcss) codeword he carrectly decoded even when the actual eneader state is unknown. Franaszek showed that ,ueh a unique inverse is always possible if only two principal state, are eha,en. We colleeted in table Hl some of the results of the computer search for codes using Franaszek's methad with the additianal constraint of two allowed TABLE III Codes with fixed length codewards, state-dependent erleoding N digit.al N R ｲ｡ｴｾｮｲｵｾ n In 5 4 3 5 5 14 11 24 19 R .95 II .99 19 .999 5 .94 .97 .98 .99 4 iii fi 5 12 e ｾ 7 6 7 7 14 n 7 24 19 21 8 16 14 ij 9 9 18 ZO 16 18 9 j(j ij 6 7 19 22 Ｑｾ fi 7 .96 ｾ .96 .97 variation, n códtwotd letlgth\ m number of souree: bits, = mln, e rale efficiency - RIC(N), 43 Construction of binary DC-constrained codes principal states. The number of codewords were truncated to the nearest power of two. For this reasoo the sum variance of the codes is not known and therdorc thc ratc cfficiency e= RIC(N) is used as a figurc ohncrit. Thc SOUrce word length m is here chosen to optimize the rate efficiency of the code. Note that in practical circumstances we generally do not have such a degree of freedom and consequently Some of the efficiency is lost. We conclude from this table that good rate efficiencies are possible with state-dependent encodîng. Most înterestîng are codes with sum variation N .. 5. In this particular caSe it can be proved by evaluation of the matrix D that for any codeword length the number of principal statcs for optimum transmission is two and that the number Mof sequences from a principal state is (n even) M"" 3"/2. In the n = 2 case there are three distinct codewords: the zero-disparity codewords '0 I' and '10' (i.e. the 'bi-phase' words) and the codewords '00' and '11 ' to be used alternately. The rate of the n = 2 code when using a ternary source 3. According to table I this simple code achieves lOOOJu of the alphabet is ｧＰＱＲｾ noiseless channel capacity. Also the sum varianee of this code equals the sUm variance of the maxentropic N "" 5 sequenee 26). For binary souree alphabets it is not possible to reach this efficiency, though according to table 111 wÎth a codeword length n "" 24 a rate R = ｾ ean be reached, beiog 0.1070 below the asymptotic bound. Fig. 3 shows the efficiency Eversus codeword length TI. wÎth lhe digital sum variatioo N as a parameter. The sum variance was calculated by an extension 1.0 ｾ 0.9 w >. IJ 0.8 --------- l:: .W IJ ........ w N 0.7 t. 7 v 9 0.6 0 x 0 10 20 11 30 Codeword length n fig. 3. Efficiency of 'tate-dependent coding. 40 44 Binary channel codes of eq. (15). The code efficiency was calculated taking into account all possible codewords (no truncation to a power oftwo)_ Tlle efficiency of codes with sum variation N = 5 is for arbitrary (even) codeword length equal to unity. We nate the improvement in efflcîency with respect ra zero-disparity codeword based eoding. Till now we did not diseuss how 10 apply Ihe enumerative coding method when state-dependent code book pages will be used. State-independent deeoding is not guarantecd îf the codewords in the code pages are lexicographically ordered. Iftlleencodcrhas two pages andNis odd then coding can bedonein Ihe following way. Let A o and A, be the code book pages. Codewords in A o have zero or positive disparity. We order one ofthe pages. say A o, aecording to sec. 2. A codeword in page A, is found from a codeword with the same rank in Ao by inverting all bits if the codeword has non zero·disparity. Souree words are represented by the same zero-disparity codeword in bath pages. The decoder observes the disparit)' of a received codeword and if needed the codeword can easily be mapped to the A o ranking. lf N is even this simple procedure cannot be applied. In the next section an example is given to solve the N is even case. 5_3. 8bl0b channel code, a worked example In certain applications of channel codes, sueh as digital recording on magnetic tape or transmission over libre-optie network it has been found t.hat a 'DC-balanced' channel code with rate R = 0.8 has attractive features 20-'8)_ In this section a design example of such a code is described based on the theory of the preceding sedion. The somee signal is assumcd to have eighl parallel bits that arC translaled by the channel encoder using a code book onto the channel symbols. The channel symbols are sequentially reeorded on tlle recording medium such as ror example magnetic tape. In this design examplc the code book has two ｰ｡ｧ･ｾＺａＨｓｾＩ and A(S,), where So and SI are the two encoder stales. Each page has 256 cntries, enabling a ane-ta-one state-depcndent encoding. Dependent on the particular transmitted codeword the eneoder state changes from state 8 0 to SI (or from S, to So or it remains in ,he same state. Bath pages are sa ehosen that: a) a simple encoding and decoding is possible, permitting a simple low-cost design; b) the concatcnated channel string takes on six digital sum values; c) (he maximum run-length is live channel symbols, i.e. at mast five canSeculive like symbols can occur; ct) state-independent decoding is possible. Upon dccoding no lise is made of the digital sum for decoding. In this way error propagation is limitcd to cighl decoded souree bits. Construction of binary DC-constrained codes 45 From table I ît cao be eoncluded that the capacity of a channel that occupies six sum states is C(6) = 0.85. In other words the design of a channel code with this constraint and rate R = 0.8 is feasible. The capacity of a channel occupying live sum states is C(5) = 0.792, so that a code with R = 0.8 satisfying the N = 5 constraint has not to be attempted. Fig. 4 shows the unbalance trellis diagram of a code that oceupies six sum states. The encoder has two princîpal states denoted by So and SI. The trellis 2 1 S1 ｾ 0 c .2 -1 .2c -2 :J So -3 o 2 3 4 5 6 7 8 9 10 time _ Fig, 4, Trellis diagram of Sb lOb code. diagram gives a plot of all possible sum values as a function of thc symbol posltion, Using the generalized Pascal triang!e delined in the preceding section it can be caJculated that the number of codewords satisfying the chanoel eonstraints starting in So and terminating in So and SI is 197 and 155, respective!y. From SI there are 155 and 131 codewords terminating in So and SI, respectively. We condude that the minimum number of codewords hom an initial state is I3I + 155 = 286, being sufficient for an 8 10 10 mapping. In order to enable simple coding the sets of codewords with zero-disparity are divided loto two subsets: 1) codewords that ean be stAte-independently encoded, We found that the total number of slieh codewords is 89. These codewords ean be found by inspection of tbe treIlis diagram. The zero-disparity state-independent codewords occupy four sum states so that the number of codewords and the codewords themsdves ean again be evaluated using the generalized Pascal triangle (see table IV); 2) other codewordS with Il:ero·disparity, We now formulate thc following rules for encoding and decoding. Binary charme' codes 46 TABLE IV Pascal triangles of 8b lOb code 0 0 1 0 0 2 2 3 8 10 0 13 0 21 21 9 0 9 28 33 47 55 3 19 14 8 34 6 4 0 3 3 0 3 5 61 108 Encoding nlles Most of the codewords of page A(So) can be gcnerat.ed by the enumerative coding method. The page A(Sl) can be found with some 'bît-shuffiing' of codewords of page A(So). We ddinc tbc following three subsets of A(So), where I (00:;;; I ＬＮｾＺ 255) is the rank (or decimal notation) of the source words Subset Tu: 0 < I,; 88. The codewords are state-independent encodable, have lero·disparity, and do not change the encoder state. Subset Tt: 89 <i:"" 1« 243. The codewords have non-lero disparity. By symrrtetry (see trelHs diagram) the codewords in page A(Sl) can be found from codewords in A(So) by inverting each symbol and reversing the direction of transmission of the codeword. Exampk assume a codeword wilh rank I in Tl of page A(So) to be: '0100JOIIII'. Inverting and revetsing results in codeword A(Sl,f)= '0000101101'. Jt can be verified in the t.rellis diagram that this codeword satisfies the constraînts. Whenevet a codeword in subset Tl is transmitted t.he encoder cbanges its stat.e. Subser T2: I> 243. The codewords in T 2 are chosen from the st.at.e-dependent, zero-disparity codeword set in such 4 way that. a codeword A(S 1,1:> 243) is found from A (So,l >·243) by reversing the direction of transmission of t.he codeword. We did not found an enumerative method for encoding Or decoding the twelve codewords in set T2 , so that a direct look-up table is needed. 47 Comtruction of binary DC-constrained codes Oecoding rnles At the receiver the cOdewords in A(Sl) can with simple hardware be mapped to codewords in A(80 ) with the same source representation. It thus suffices to have a decoding algorithm for codewords in state 8 0 • Due to the partitioning into two subsets two Pascal triangles are needed listed in table IV. We conclude from this table that the look-up table has 21 + 14 = 35 nonzero entries. We further necd for decoding an 8 bits full adder and additional hardware for distinguishing subsets TI) and TI and coding of codewords in T2. Results Using a computer the following statistical properties of the 8b1Ob code were calculated (see table V). TABLE V StatÎstical properties of 8b1Ob code sum prob. run-Iength prab. 3 2 I .066 1 2 3 4 5 .540 .303 .119 .033 0 -1 -2 .199 .286 .264 .148 .036 average sum "" 0.662 SUm varianee = 1.534 .005 average run-Iength '" 1.66 The run-Iength of a code is defined as the number of consecutive like syrnbols. Thc efficîency E of the 8blOb code is according to eq. (16) and table I: 0.2398/(0.2 x 1.534) '" .78. Table VI lists the main parameters of DC-constrained 8blOb and 16b20b codes th at were reported in the lîterature (see also Tazaki 22». TABLE VI Main parameters of R = 8/10 DC-constrained codes reference Morizono 23) Shirota 24) Widmer 2O•21 ) Parker ＲｾＩ new ＩｾｯＱｉ N 10 T m ",,*) note 10 **) 7 6 7 6 6 5 5 ***) 5 0) maximum rUlI-lellglh. 00) simple zero·disparily code wilh 252 Ｌｳ､ｲｯｷ･ｾ block code wÎth codeword length 11 = 20. 8inary channel codes 48 Note thai Ihe new code has improved features with respect 10 the other codes with codeword length n '" 10. The Parker code has the same main parameters, but due to the doubled codeword length it needs considerably for encoding and decoding. more ｨ｡ｲ､ｷｾｈ･ 6. Coltclusions We have presented a systematic approach for designing binary DC-balanced channel codes. DC-balance was achieved by imposing a conSIraint on the maximUlIl unbalance in the number of positivc and negalive pulses within codewords. We derived recursion relations for determining the number of codewords satisfying a maximum unbalance constraint. We showed that the efficiency of the new codes is asymptotically optimal, i.c., for long codewords the information rale and sum variance approach the capacity and sum variançc of maxentropic z-constrained sequences. Aigorithms for cncoding and decoding of maximum unbalance codewords were derîved. The hardware needed for these algorithms grow polynomiaJly with thc codeword length, 50 that they are very useful when large codewords are applied. An example of a new 8blOb DC-balanced code assuming only six digital sum values was giveo. REFERENCES ') D. B. Watcrs, Int. J. Electron. 55, 159 (1983). ') re M. Brooks arld A. Jessop, Int. J. Electron. 55, 8J (1983). ') Y. Takasaki, K. Yamasnita ano Y. Tak.ha,hi, tnt. J. Electron. 55, 121 (1983). ') J. P. J. Heem sk er k and K. A. Sch Cl uh amcr I m min k, Philips Teclm. Rcv. 40. 157 (1'182). ') J. C. Mallin,on and J. W. Millet, Radio and Elee. Eng. 47,172 (1977). ') K. W. Callermole, Principle; of pulse eooc moou1atÎon, lliffe Books Ltd, London, 1969. ') K. W. Cattermole, Int. .I. Electron. 55, 3 (1983). ') R. O, ç.rter, Electron. Lelt. 1,65 (1965). ') .1. Juste,en, IEEE Trans. Inform. Theory ｬｔﾷｾＸＬ 457 (982). >0) R. W. S. Chang, T. M. Jakuoov and A. 1.. Garcia, IEEE Trans. Commun. COM·30, 1668 (1982). ") T. M. Chien, Bell Ｎｴｳｹｾ Teeh. J. 49, 2267 (1970). ") C. E. Shannon, Bell Syst. Teen. J. 27, 31'1 (1948). ") P, A. Franaszek, IBM J. Res. Devclop. 14, 376 (1970). ") R. I\sh, In format ion Theory, Interse1enee Pllbli.hers, New Yo,'k, ＱＹＶｾＮ JO) J. N. Franklin8 and J. R. Pierçç, IEEE Trans. COmmllII. COM·;W, 118z (972). ") T. M. Cover, IEEE Trans. Inform. 'fheory lT-19, 73 (1973), ") LP. M, Schalkwijk, IEBE Trans. Inform. Theory IT-18, 395 (1972). ") R. F. Lyon, IEEE Trans. Commlln. COM-Zl, 1438 (1973). '0) P, 1\, Franaszek, ｂｾＱｬ SYSI.. Tçch. J. 47, 143 (1968). '0) A. X. Widmer and P. A. Franaszek, Eleçtron. Lclt. 19,202 (1983). ") 1\. X, Widmer and P. A. Fr.n.szek, IBM J. Res, Devolop. 27, 440 (983). ") S. Talaki, F. T.kcd., H. O,awa and Y. Yamada, Proc. of5th lntcrn.t. Conf. On Video and Dala Rcwrding, Soulhampwn, p. 79, 1984. ") M. MOrizono, H. Yoshida anti Y. Ha;himQto, SMPTE Journal89, 658 (19MI). ｩｾＢ N. Shirota, U.S. Patent 4,387,364,1983. ") M. A. Parker anu F. 1\, Bellis, Proc. of 4th Internat. çont". on Video ,md Dala recol'ding, S"llthampton, p. 207, 1982, '") K. A. Schouhamer Immink, PhiIips J Rcs. 4(l, 1 (1985). 49 Spectrnm shaping with binary DC 2 -constrained codes SPECTRUM SHAPING WITH BINARY DC2-CONSTRAINED CHANNEL CODES ａＮ｢ｾｴｲ｡｣ A methad is presented for designing binary channel codes in sueh a way that both the power spectral density function and its second-dcrivalive vanish at zero freQl,lency- ｮｯｩｾｲｵ￧ｒｾｮｯｩｴ｡ｬ･ｲｾｲ｡ çlerived to ･ｮｩｭｲｾｴ･､ the number of ｣ｯ［ｬ｣ｷｯｲ､ｾ that can be used in this coding scheme. A simple algorithm fOr encoding and decoding codewords is developed. The per· formance of lhe new codes is compared with that of classica! channel codes designed with a constraint on the unbalance of the number of transmincd positive and negative pulses. 1. Introduction Many channel codes have been designed with the aim of suppressing the power of the encoded stream near the zero frequency. Most of these channel codes are designed on a digital sum constraint, where the (running) digital sum is defined for a binary sequence as the accumulated sum of ones and zeros (a zerO is coonled as -1) from lhe start of the transmission. The frequency region with suppressed components is characterized by the so-cal!ed cut-of!' frequency. Justesen I) found that the cut-olf frequeney of these sequenees is proportional to the redundancy of the code. An example of a practical embodimenl of a code rejecting the low-frequency componenls, a De-free or DC-balanced code, is the so-cal!ed zero-disparity code using codewords with an equal number of positive and negative pulses 2 ). A generalization of this concept leads to low-disparity codes using two alternative source represenraHons of opposite disparity polarity S). Immink 4) studied the cut-of!' frequency versus lhe redundancy of these simple binary low-disparity codes. He conduded that the performance of the codes, in particular for smal! codeword lengths, is comparable with that of maxenlropîc sum constrained sequences. Thc power spectral density funetion of digital sum constrained codes is characterized by a parabolic shape in the low-frequency range from De to the cut-off 50 Binary channef codes frequency. In some app!icaüons it is desirabie to achiew; for a gÎven redundartcy a larger rejeetiort of the low-frequency components than is possible with DC-balanced codes. In this paper a new class of DC-free codes is presented having as addjtional property that the zero second-derivative of the code spectrum also vanishcs al zero frequency. (Note that the odd derivatives of the speclrum at zcro frequency are zero becausc the spectrum is an even functîon of the frequeney.) This results in a substantial decrease of the power at low frequenties for a fixed code rcdundancy as comparcd wÎlh the classica! designs based on a digilal sum critedon. Seetlon 2 introduees a time domaln constraint to be imposed on the ehannel sequenee 50 that the resulting spectrum of the sequence has the- DÇ'balanced property, i.e. has both zçro power and zçro sccond-derivative at zero frcqucncy. Section 3 ｰｲ･ｳｾｭｴ an enumeration method for finding the number of eodewords to be used in a DC2·balaneed code. In order for the new codes 10 have a practical rate relatively long codewords have to be used which makes a direct method of mcoding and dccoding using look.up tables of the souree words and their chanod rf;pn:sentations prohibitively complex. Seetion 4 deals wÎlh the enumerative encoding and decoding of DC 2-baIaneed codewords, The algorithm is not more complex than looking-up and adding. The look-up tables needed for the enumerative coding grow polynomially in complexÎty with increasing codeword length. Seetion 5 gives examples of codes using codewords that can be concatenated without a merging ruk, .o·calkd staleindependent encoding. The power spectral density fllnctions of the newly dçvdoped codes wiJl be compared with those of classical DC-balaoced ｣ｯ､ｾｳﾭ Examples of codes with state-dependent eneoding that operatc wilt! two modes of the souree represent.atÎon are given in sec. 6. 2. DC 2·balanced codes Lel x denote a stationary channel sequence, having roean zero, wilh variables .. " X-i, Xo, Xi, , .. ; x;e 1-1,11- The power spectra! densily function of the process is given Ｉｾｹ｢ 00 H(w) = R(O) + 2 L RU) COS Uw), (I) f=l where RU) = E(xjxj.;), i '" 0, ±!, ± 2, ... is the auto·correlation function of the sequence. The spectrum H(w) is an even function of w, so that in the neighbourhood of w = 0 il can be approximated wilh the following Taylor cxpansion: H(w) = H(O) + H"(O) -Z-w 2 + H (iv) (0) 24 ｾ w .. - (2) Spectrum shaping with binary DC 2 .cons!rained codes 51 From a straightforward Fourier analysis it can be shown, if the running digital sum (RDS) Zi defined by i 2:, = L xj, j= ... ｾ Îs bQunded, i.e. has finite unbalance, the resulting power spectral density function vanishes at (j) = 0, i.e. H(O) = 0 6). We now define the running digital sum sum(RDSS) Yi by lf both the RDS and RDSS assume a limited number of values it can be proved in the same way that then H(O) '" H"(O) '" O. In the following a method is presented to design codes wÎth bounded RDS and RDSS, here called 'DC 2·balanced' codes. The codes are based on codewords with fixed length. The codewords are chosen in such a way that if concatenated using a certain encoder rule the channel sequence will assurne a finite nurnber of Y and z values. The first problem to solve is to enurnerate the nurnber of codewords with length n satisfying a y and z constraint. 3. Enumeration of (z,y) sequences In the preceding sectlon the word unbalançe refeued to a pröperty of a sequence of arbitrary length. Now we shall use this concept for codewords with a predefined length. First simple reeursion relations are derived to find the number of codewords with length 11 having a cOnstraint On thç zand y dis· (z,y) sequences. Later we show how these codewords can be parity, ｳｯｾ｣｡ｬ･､ concatenatçd in such a way that the eode stream will assume a limited number of RDS and RDSS values. In other words, find the numbcr of sequenccs x = (XI ••• _. x.), Xi e (-I, 11, satisfying the following conditions L:" XI = d. and (3) ;=1 where d z and dy arc the zand y disparîtîes of x. We follow thc approach of7). By changing the order of summation we find '1 d1 = L (n + 1 1=1 J't i) x, = - I 1=1 i Xi + (n + I) d z • (4) Binary channel codes 52 Because Xi E 1·--1, Ij we rewrite eq. (4) and find tV,.d.i dy L '"'- tl.-d,) L Pi+ 1= ｾ (5) ni+(n+l)d" ,I"" 1 whCrC PrE p, "" n} and n, E 11., .. ,n} correspond to the positions of the + l's and -I 's, respectivdy. Obviously 1('I-d.z:) !("+d,) L ,=1 L Pi+ Il ni = L i = ｾｮＨ '=1 + 1), bl su that with eq. (5) !(H<f,l L Pi = Hn(n + 1) + Z(n + l)d, - Zdy ]. (6) 1=1 Rernark In the particular case that both ｾ alld y disparities are zero it follows in a straightforward way from eqs (3) and (6) that the codeword length n has to be a multiple of four, The number of codewords satisfying the ､ｾ and d, constraints can be found using the theory of partitions (sec ref. 8). From the above it follows that this number equals the number of distînct subsets lpl •... , Pil.+d,)) of size Ï(n + d,) of the set p, 2, ... , nl satisfyillg eq. (6). The number of codewords, delloted by A.(d" dy ) , is given by the coefficient CRU,) of u't j , of the polYllomial gn(u, t) defincd by g.(u, t) "" (l + ut) (l + u 2t) _.. (I + Uni), where i = Hn(n + I) + 2(n + I) d, - Ｒ､ｾ｝Ｌ j = ｾＨｮ + d,). Both i artdj are assumed to be integers, otherwise the number of codewords is zero. Wc cart derîve a useful relation wieh the following observalion. The sequence x "" (XI, X2, . __ , x n ) satisfies the d, and dJ constraint If the sequence w obtained by reversing the order of symbols in x, Le. w = (x"' ... , xd. satisfies the constraints . / I.=1 1'111 = d" The number of sequellces KI f ,=t j=l 1'11, = -dJ + (n 'j- I) d, . 1 satisfying the two constraints equals by detinjtion Spectrum shaping with binary ｄￇｾＭ｣ｯｮｳｴｲ｡ｩ･､ codes 55 + l)d, - dy). As the number of sequences w equals the number of sequences x with the given constraints we simply obtain A"(d,, (11 (7) As g,,(u, t) = g,,-I(Il, 1)( I + unt) the following recursion relation of the comcients can be written down. For m = 1,2, ... , n; j = 0, ... , rn aml i;;;; 0, __ "' !m(m + 1) (8) with initial conditions Co(O,O) = I, otherwise em(i, j) '" 0. 4. Enumerative encoding and decoding (z,y) sequences Codes with rates of practical interest are only possible if the codeword length is relativdy large (see later). The hardware requirements for a direct encoding and decoding using look-up tables for storing the codewords and their souree representations grow exponentially with the codeword length. For say n > 24 this direct methad is beyond current technology. In this section a sequential algorithm is presented for tncoding and decoding (z,y) sequences which Îs not more complex than adding and looking-up, where the nurnber of entries of the look-up table grows polynomially with the codeword length. To that end we establish a 1 - 1 mapping from the set Sof (z,y) sequences onlo lhe set of integers 0, I, ... , M - 1, where Mis the cardinality of S. Let S be the set of (:z,y) sequences. The sel Scan be ordered lexicographically as follows: lf w == (Wl, ... , w,,) and v = (VI, .. " v,,) are elements of tha Set S then v is called less than w, in short 1'<0 w, if therc cxists an i, I ,;;:; i';;:; ti, sueh lhal vi<w/and wJ=vjo ls;;,j<i. The position of W in the lexicographical ordering of S is defined to be the rank of w, denoted by r(w), i.e. r(w) is the number of all v in S with v <0 w. The following algorithm for ranking (z,y) sequences can be used for decoding. Decoding algorithm The rank r(x) of the binary (z,y) sequence x in Scan he calculated as follows rex) = I" k=1 A"_k(d, - Zk_l + 1, -dy + (n - k + I) d, + Yk-l) . xl" Binary channel codes 54 wherc kol Zk-I = I x, i=l and k-I L: ZJ. Yk-I = i=1 Proof According to Cover 9) the lexicographic index r(x) of x is given by n r(x) = I x/· n,(xj,x:l, . . . ,X)_I, -1), )=1 where n.,(x, ,X2, ••• ,Xk) denotes the number of elements in S for which the first k coordinates are given by (X"X2, ... ,Xk). The first (j - 1) coordinates of x determine the states Z) and y) )-1 I = Zj_1 XI 1=1 and )-1 Y)-I I = Zi • • =1 From eq. (3) we find that n j-I d, = I Xi + x) + .=1 L Xi = ;=)+1 II-j = Zj_1 + Xj + I Xj+!. i=l In the same way )-1 11 dl = I Zj = 1=1 I ti z/ + Zj 1'2 Zi I + z; + ;=1 Zi ;=1+1 I I Xm m=l }-I 1'1 LZI+(n-j+l)z;+ L I n-) Zi + (n - j {=l + 1) Zj + I k=1 i Xm ｭ］ｪＫｾ (=1+1 i=1 )-1 I = J=j+l 1=1 J-I I I -i" Ic I Xj+"" m=I Prom these observations it fo)lows thllt thc numbCT of 'lectors x given by x = (Uh, •• , Uj-lo -1, X)+lo •• "xn ), Spectrum shaping with binary DC2 ·cOMtrained codes 55 (where the u/s are given) satisfying n " I .1=1 112=1 I I LXI = d. and illl'l equals the number of veetors v with length 1'1 - == d, Xm j, V = (Vh'""' v._j) satisfying .-J L Vi = d. - Z1-1 + 1 i'=l and O-J k j-l I I k.I Vnt = dy - m ... t L z/ - (1'1 - j + 1) Zj_t + (1'1 - j + 1). i=l Whence n,(xt, X2" A._j(d. - Zj-l + 1, dJ' - •• ,Xjd, Yj-I + (1'1 -1) - j = + I) (I - ZJ-l»' With eq. (7) we find. Arl-Ad. - Z}_I = ａｮｾｪＨ､Ｎ - + 1, dy - YJ-l + (1'1 - j + 1)(1 - Zj-I» = Zj-l + 1, -dy + (1'1 - j + 1) d. + Yj-l). Cl The inverse algorithm to encode (z,y) sequences, i.e. to fiod tne scquences given tneir rank, is given in tne following. Encodlng algl;lrithm Let an integer I, lEID, 1, ... , M - Ij and the set S be given. The following algorithm finds x sueh that r(x) -- J. Jnitialize Xo -- Yo = Zo = 0; for where ｚｾ = ｚｾ｟ｉ + Xk-I and Yk = Yk-l + ｚｾＭｬ then set Xk = 1 and set else set Xk -- -1. The procedure. as presented here, is confined to the eneoding and deeoding of one set of (z,y) sequences. The procedure can readily be gencraliled to inc1ude sets with different values of the disparities (see ref. 9). Binary chatlrle( codes 56 The enumerative coding enables the design of channc! cncoders and de· cuders of moderate complexity. For decoding we need storage cal'acity for the non-zero weighting coefficients, a full adder and an accumulator to store tmermediate and final results, The encoder needs an additionallogical comparator. The weighting coefficients look-up table can be used for encoding and decoding, which is attraetive for application in recording, where the t.ransmitter and receiver are usually housed in one machine. Note that wc do not need a multiplier because x is binary valued. Lyon 10) gave an example of a practical embodiment. based on the enumerative coding of codewords with given z disl'arity using Pascal's trianglc. In order to encode and decode (z,y) sequences his circuitry has to be extended with an extra COunter to dctcrmine the y State. 5. Coding with zero.disparily codewords The simp!cSl way to find a code generating long sequences satisfying the z and y eonstraînt is to use codewords of fixed length n that can be concatcnated without a coding rule, i,e. to usc a stateless encodeL In this simplc case the codewords start and terminate with zerO ROS and R,OSS. Consequently the codewords have zero zand y disparities. In the sequel we use the short-hand notation zero-disparity if both the zand y disparities are zero. From eqs (3) and (4) we find in this particular case that the codeword moments Uk satisfy the following conditions: n uk=Likxl=O, keIO,IJ. (9) .j=l Using thc recursion relatiuns eq, (8) we calculated the numbcr M=A.(O,O), 4,8, ... ,36 of zero·disparity codewords (we halle already remarked that An(O, O) = 0, if n is not a multiple of four) , The results are listed in table L ti = TABLE I Number of codewords M with lero-dispaTity n 4 8 12 16 20 24 28 32 36 M R 2 8 0.250 0.375 58 OA88 526 5448 61108 723354 8908546 113093022 0.565 0,621 0.662 0.695 0.722 0.743 57 Spectrum shaping wifh bÎnary DC 2 -constraÎned codes Table I also presents the code rate R, defined by R = 2l og M n (10) Note in table 1 that even for large codeword length the code rate is still poor. Duting transmission the symbols of the (z,y) sequences are serially transmitted. Assume that the codewords are equiprobable and independent, then thc auto-correlation function of the concatenated channel stream is given by 11) 1 R(k) = - ,,_k ')' n f;'l I';;; k"; n - 1, E(XIX'+k); R(O) :: 1, Rek) = 0; k;:. n, whete (11) where X(k) is the k-th element of the set of codewords with zero y and z disI'arity. Example 1 The simplest example is the code with n '" 4 having two codewords so that the rate R ］ｾＮ We can verify that the two codewords are '0110' and '1001' (a '0' is used to denote a symbol with value -I). The auto-correlation function of this channel code can be found by hand: R(O) = 1, R(l) = ＭｾＬ R(l) = ＭｾＬ R(3) = ｾ and for i ｾ 4: R(i) = O. The resulting power spectral density function H(w) of this channel code is found with eq. (1) H(w) = I + ｬＭｾ｣ｯｳｷ - ｾ｣ｯｳＲｷ + ｾ｣ｯｳＳｷｬ = ＴｳｩｮＲｾｷＮ We can easily vedfy that indeed R(O) = R"(O) = O. Due to the exponential growth of the number of codewords the coml'utational elfort becomes prohibitive for large codeword lengths when eq. (11) is used. The preceding cnumeratîon method (sec. 3) can be extended to find the auto-correlation function with a polynomially growîng computatîonal effort. From eq. (ll) we derive for fixed io,i l ; (x/el-I, IJ): where N( ) is the nurnber of codewords satisfying ( ). Blnary channel codes 58 For symmetry ｲ･｡ｾｯｮＺＧｩ find wC E(x"xJ,) = 2N(xr, = XI,) M 4N(XI, XI, = -1 = 1) Assuml;: Xi. = Xi, = 1, then (12) - 1. M from eq. (9) we find I" and I" XI = -2 ｩｾｏＢＱ i=lj lXI = -(i o + i,). (13) ':z1,JI;I!i2 vl l l The numbl;:r of sequences in S satisfying these constraints ean again he found using the muml;:ration method of sec. 3. The number of scqucnccs N(Xi o = Xi, = I) in S satÎsfying l;:q. (13) is givl;:n by + 1) - iQ-i l , j=Hn- 4) of the polynomial the eoefficient of uivJ, ｩ］ｾｮＨ h,,(u, t) h ( nU, t) - (l g,,(u, f) + uiot)(l + (14) Ui,!) Wjth a small moditication of the reeursion relations given by eq. (8) we can obtain N(Xi,=X" = I). Using l;:qs (I), (11), (12) and relation (14) we calculated the power spectral density functions of zero.disparity codcword bascd codes. Thc results of the computatlons are plotted in fig. I. 11 o .,-" -10 ｾ ril -2 0 o <Il 0.. . ＮｾｖＧ -3 0 ｾ k'1/ 1/ Ｎｾ .. -. ｾｾＡＯ I ｾｖ -".. ,,-_. , ,I ......... -"'- --. 'V' 0 it'. 1t. ｾ V - f·· .... - - f-.. . _.._- -,. -- - .. . .. "" -50 0.005 0.01 0.1 " .. 0.5 trod ot Ó1annel freq. (Iogl rig. I. P"wer 'peelral den,ity funclion, of DC'-balanccd codo, ba,ed on zero-dbparity codewords witn the codeword ｬｾｮｧｴｨ n a!i a parameter. Spectrum shaping with binary ｄｃｾＭ｣ｯｮｳｴｲ｡ￎ･､ 59 codes We define the (low-frequency) cut-oft' frequency Wo accarding ta U) H(wo) "" Ｎｾ Using numerical methods we computed the cut·olf frequency of the various codes. Fig. 2 shows the cut-olf frequency versus the redundancy I - R . As a comparison we plotted the cut-ofT frequency, according tO the same definition, of DC-balanced zero-disparity codeword based codes verSuS the redundancy 12). We observe in the figure that for a fixed rate the cut-olf frequency of DC-balanced codes are approximately a factor of 2.5 larger than those of DC 2 -balanced codes. Below the cut-olf frequency the spectra of DC 2 -balanced codes decreases faster witl1 decreasing frequency than those of DC-balanced codes. We nodced when comparing the spectra of DC 2 - and DC-balanced codes that for fixed redundancy a cross-over is found at approximately - 20 dB. We conclude therefore that DC 2 -balanced codes are favourable if rejectÎon of lowfrequency camponents is needed better than - 20 dR 0.1 --_ ...... ---,," Ol .. 'W- .,.,", ,,, ......-- 2 ." ... _- 17 V. 1= '" ----1" 16 Ｘｾ 0.0 1 0.1 fr-- p ｮｾＱＲ Ij· ./16 Ir 20 10 _ ·_· _ 44 . ｾ ... _- 32 18 20 ｮｾＶｾｴｯ 36 52 .. DC - boloneil'd 2 DC - boloneil'd I I I I I I Redundoney (1-Rl (log) Fig. 2. The CUl-olf frequency of De- and DC'-balanccd, zcro-disparity codc$ dancy 1 ... R. The codeword length i$ denoted by 11. ＤｕｾｲＬｖ the r,dun- 6. State-dependent encodlng In the case of state-independent encoding the sequence is encoded without information of the past and we have a unique one-to-one mapping of the source words and their channel representations. A larger rate wÎth given code· word length is possiblc if codewords are allowed to start (and end) in a set of predefined states, called principal states. Franaszek U,U) has developed a procedure for determining the existence of such a set of principal states. We found 60 Binary channel codes during experiments with this procedure that the following choice of the codeword subsets is satisfaetory. The eneoder operates on the basic principle ofthe bi-mode alternate coding ､･ｾ｣ｲｩ｢･､ by Cattermolc 2). All codewol'ds are chosen in sueh a way th at. d. = O. Codewords with positive and negative ､ｾ are used in twO modes, eodcwords with zero clv are used without modification in both modes. The choice of the polarity of lhe codeword is chosen by the encoder in such a way that the RDSS at the end of the new codeword is as close to zero as possible. We can easily verify that the number of eneoder states, Le. the nUillber of ROSS values at the end of a codeword, is twiee the number of code· word subsets with different positive d1 values. By properly ｣ｨｯｾｩｮｧ t.he inÎtial RDSS value the RDSS at the end of the codewords wil! assume the values ± I, ± 3, ... , ± (r - I). In practief; th is is aehieved by initializing the ROSS counter with the disparîty l. Using the reeursion relations given by eq. (8) we caleulated the rate of the bi-mode code versus the number of subsets and the length of the codeword. In table Il we collected the results using ,. encoder states. TABLE 11 Code rate R of DC 2 ·balanced codes wi!h codeword Iength n and r encoder states r = n= 4 8 U 16 20 2 4 6 .396 .488 .568 .627 .670 .5 .557 .616 .663 .700 .594 .648 .688 - .710 8 10 .625 .672 .707 .736 .641 .689 .722 .748 - - Note the improvcmenl, for fixed codeword length, in fate with respect to stateindependent encoding (tabie I). Example 2 An atlractÎve and simpie code found in the table has the following parameten;: n = 4, r = 4 and R "'- Ｎｾ Table lil shows the eodebook of this simple code. 'I'he codewords should be chosen from lhe table in such a way that the ROSS aftel' transmission of the new codeword is as close to zero as possible. The RDSS assumes the values ± land ± 3; table IU shows the corresponding eodeword for the given RDSS value. Fig. 3 shows the power spectral density function of th is DC 2-balanced code. The spectrum of this code has been calAs a reference culated using the matrix method of Cariolaro et al. ＱｾＬＩＮ we plotted the spectrum of the well-known 'bi"phase' code (zero-disparity, Spectrum shaping with binary nC 2 -constrained codes 61 TABLE III Çoàebook of R = ｾｄￇｾＭ｢｡ｬｮ｣･￠ code, the O's correspond to the -l's of the analysis codeword RDSS = 1,3 source codeword RDSS = -3, -1 00 Ol LOOI 0110 0 1001 0 0110 0 0 10 1010 11 1100 2 4 0101 0011 -2 -4 dy dy o -10 !g -20 0.01 0.1 frad. of chonnel freq. Fig. 3. Power spectral density functions of a DC'-balanccd code n phasc' (b). = 4 and R = i (a) and 'bi· DC-balanced code with n = 2) having the same rate. Note the improvement in the low-frequency range of thc new code. The memory requirements of Cariolaro's metbod become prohibitive for a large number of encoder states and large codeword lengths, so that we were not able to compute the spectra of the codes with rate R > 0.6. 7. ConclusÎons A new class of DC-constrained channel codes, called DC 2 -balanced codes. have been presented. Tbe power spectral density function H(w) of DC2- Bim1ry "hunnel çodes 62 balanced codes has besides zero power at zero frequency a zero second-deriva· tive at zero frequency, I.e. N(O) = H"(O) = O. It was shown that if the redundancy is fixed the cut-oft' frequency of DC 2 balanced codes is approximately a factor of 2.5 smaller than of classical DCbalanced codes. Below the cut-oft' frequencies the spectra of DC 2 -balanced codes decreases faster with decreasing frequency than those of DC-balanced codes. We noticed when comparing the spectra of DC 2 - and DC-balanced codes thai for fixed redundancy a cross-over is found at approximately - 20 dB. We conclude that DC 2-balanced codes are superior wilh respect to DC·balanced code, if rejeetion of low.frequcncy cOmponent, is needed better thart - 20 dB. REFERENCES ') ') ') ') 0) 0) ') ') .) '0) ") •,) ") "j lOl ") J. Justesen, IEEB Trans. Infol'm. Theol'Y lT·;llj, 457 (In2). K, w. CattHmole, Int. J. Electron. 55, 3 (\983). J. M. üriffiths, Electron. Lett, 5, 79 (1969). K, A, t'eho\'hamer Immink, Philips J. Res. 40,1 (1985). W, R. Bennett, BeU Syst. Teehn. J. 37,150\ 0%8). H. Yasuda and H. lnose, JournallElÇa Japan 54a, 506 (1971). ü. F. M. Beenker, Philips Techn. Note T.N. 105/84, 1984. J. R.iordan, An introd\lction 10 combinatorial an.lysis, Princc!<)n Univcr<ity Pres,. 1980, T. M, Cover. IEEE Trans. Inform. 1'heory rf-19, 73 (1973). R, F. Lyon, IEEE Trans. Comnl\ln. COM-2J, 1438 (1973). H. S. Hosik, lIell Syst. Teeh. ,I. SI, 921 (1972) . .I. N. frankling and J. R. Pier"e, IEEE Trans, Commun. COM·W, 1182 (972). P. A. FranasLek, BeU Syst, Te.h. J, 47. 143 (1968), P. A. fl'anaszek, IBM J. Res. Develop, 14,376 (1<)70). G. L. Carjolaro and ü. P. Tronca, IEEE Trans. Commun. COM-22, 1555 (1974), (1. l.. Cariolaro, G. L. Picrohon and G. P. Tronca, Int. J. ElectrOn, 55. 35 (1983). Some statistical properties of maxentropic nmlength-limited sequences 63 SQME STATISTICAL PROPERTIES OF MAXENTROPIC RUNLENGTH-LIMITED SEQUENCES Abs.ratl Wc prove that two-Ievel runlength-Iimittd ,equtnccs with maxÎmum informarion content', maxcntrop'c runlcngth-Iimitcd sequenCCS, have an cxponenlial runlength disrtibution. General equations are derhled fOr the pOWer spectra! density function of maxentropic runlength·limited sequences. Runlcnght dislrihutions ancl power dcn,ity functions of practical modulation ｳｹｓｴｾｭＺｩ are tompan:d with those derived from maxentropîc runiength- Iimited sequences. 1. hltwduçtion 8inary codes such as the Miller code 1), 3PM 2) and EFM 3) are baseband rnodulation systems with applications in magnetic and optical recording. These codes belong to the class of so"called binary value runlength-limited sequences (RLL sequences). A string of bits is defined to be runlength-limited if the number of consecutive zeros (or ones) is bounded between a certain minimum and a maximum value. The maximum runlength constraint guarantees a dock pulse within the specified time, which is needed for dock regeneradon at the receiver. The minimum runlength constraint is imposed to control intersymbol interference and consequentiy has a bearing on the spectral propertÎes of the code .,0). In th is paper we prove that RLL sequences with maximum information contents, defined as maxentropic RLL sequences, have their runlengths exponentially distributed. This leads to a simple derivation of the power spectral density function of maxentropic RLL sequences. The power spectral density function of a (random) signal defines the distribution of the average signal power versus frequency. This information is useful in system design because it indicates the most important frequency band and the De content (power at the low frequcncy end) of the applied code. It does not teil uS in genera! how much distortion the signal suffers when the transmission charme! is bandwidth-Jimited. Binary channel codes 64 We start in sec. 2 with a formal definition of RLL sequences, followed in sec. 3 by an analysis of thc runlength distriblltion and spectra! properties of maxentropic RLL sequences. The study described in this paper was motivated by the fact that practical modulation systems can with moderate hardware quite easily reach !'ates of up to 85-950/0 of the maximum information capacity. From this observatioll we maya priori conjecture that the statistical properties of practical modulation systems can be predicted by maxentropic RLL sequences. In this way the results obtained from the maxentropic RLL sequence theory could be a simpie, generally applicable tooi for an approximate calculatlon of the power density functions of bit streams of practical modulator systems. To test this hypothesis we compare ill sec. 4 the runlength diHriblltion and power density funclions of two well-known rnodulation systems with results predicted by the maxentropic RLL seqllences. 2. Definition and asymptotic properties of rnnlength.limited binary sequences The theory of binary scqucnccs with restrictions on minimum and maxi" mum rulllellgth goes back to Kautz ｾＩＬ Tang aod Bahl ｾＩＮ For an exhaustive treatment of this subject the reader is referred to ref. 7. Their most important results are summarized here. A dk-limited sequence simultaneously satisfies the following conditions: a. d constraint - any two logical ones are separated by a run of conseculive logica! zeros of at least d. b. k constraint - thc Iength of any run of consecutive logica! zen)S is at most k. A sequence satisfying the d and the k constraint is called a dk sequence. Sequences only satisfying the d constraint are called d sequences. Wc obt.ain a runlength-limited binary sequence with at least (d + 1) and at most (k + 1) consecutive zeros or ones by integrating modulo 2 a dk-limited sequence. In this way the ones of a dk sequence indicate the position of a transition of a runlength-limited sequence. We note that for long sequences the inforrnation contents of runlength and dk sequences are the same. The only extra bit of inforrnation is the initial polarity of the runlength-limited sequence when inte" gratjng a dk ｾ･ｱｵｮｲＮＺ･ In ref. 7 recursion equations arc derîved for the number of distinct dk-lirnited sequences of ｬｾｮｧｴｨ n as a function of d and k. lf for convenience we restriet ourselves here to d-lirnited seqllenccs, thc numbr.:r of ､ｩｾｬｮ｣ｴ binary sequenr.:es N(n) of length n is given by n+l N(n) ""' { (I) N(n - I) + N(n - d - I) n>d+ 1. Some statistical propertjes of maxentropic runfength·/imited sequences 65 The process of modulation maps the input data stream onto the runlengthlimited output stream. In general m consecutive data bits arc mapped Ooto n consecutive channel bits. For a Ooe-to-one mapping we must have: rn.;:; log2N(n)_ The ratio mln is defined as the rate R of the modulator. We find that for given d and kthere exists a maximum rate, defined as the (information) capacity C. The capacity is determined by the specified d and k constraints and is given by I C= lim -log2N(n). n .. n (2) 00 For large n the number of distinct d sequences of length n, N(n) behaves as where A is the largest real root of /XÀ n, (3) For dk sequences À. is the largest real root of 7) Zk+2 _ Zk+l _ Zk-dTl +1 = O. (4) The capacity Cis then given by Coe log2À.. (5) Lf one wishes to transmit a certain fixed amount of information per time unit, say '/T bit/s. over a dk-limited noiseless channel, then the channel e10ck should run at least '/ C times faster than the data elock to compensate for the d and k constraints. In other words the channel bit time Tc is shorter by a factor of at least C than the time needed for a data bit, i.e. Tc < C· T. 3. Runlength distribution and spectral properties of maxentropic runlengthlimited sequences In this section we investigate the runlength distributioo and Power Spectral Dcnsity functions (PSD) of maxentropic runlength-limited sequeocesShaft C) determined the spectra! propertjes of d-constrained RLL sequences by computer simulations. He generatcd a random binary sequence with a random number generator routine. Next, bloeks of random consecutive bits werC transformed with a time-independent eode book loto d-constrained sequenccs. The lengths of the sequences were rather restricted so that the published PSD functions are not very usefu!. Pelchat et al. Ｉｾ considered d-constrained RLL scquences as a type of generalized pulsewidth modulation. This leads to a straightforward algebraic expression for the PSD function once the runlength dîstribution is given. We follow his approach and first derive the runlength distribution of maxentropic RLL sequences. 66 Binary channel codes 3" I, Runlitngth rjistributiorl 0/ maxentropic RLL sequences For an RLL sequence with a çonstraint on thc minimum and maximum runlength the runlengths Ti are given by i = d + I, d + 2, ... , k + 1. (6) Theorem I Assuming that the lime intervals between the transitions are statistically independent, then the information rate of an RLL sequencc is maximal if the probability of OCCurrence of length TI is given by i = d PUi) "" A-' + 1, d + 2, ... , k + L (7) PROOF Thc entropy of a binary RLL signa) where the time inlervals between transi· lÎons is an independent, discrete stochastÎe process is given by ref. 5 (8) where T is thc average runlength given by kOl T/Tc= I iP(Ti )- (9) I=d ... i Substituting (7) yields With (5) we find that under assumption (7) lhe RLL sequence takes on its asymptotic information capacity C. We now further prove that (7) is a consistent stochastic expression, i.e. I PtT,) "" I. We find from (7) because À satisfies (4) and (5). o 3.2. Power Spectral Density jimcliorl For a two-level signalof unity amplitude wilh independent runlengths and given runlength distribution the PSD funetion can be calculatcd as follows (c.f. ref. 5). Define G(2rr.f) as the discrete Fourier transform of the runlength distribution, i.e, Some statistical properties of maxentropic runlength-Iimited sequences 67 (10) withfis the frequency. The PSD function is then given by 1 1 - IG(21lf)1 2 S(21lf)", n 2f2T II +G(21lf)12' (11) Substjtuting the runlength distribution (7) provides a straighlforward method of determining the PSD function of maxentropic RLL sequences. In fig. I we have plotted the PSD function versus frequency of some d sequences. We normalîzed the pulse lengths of the RLL sequences in such a way that the information rate for all these sequences is fixed at 1/ T bit/s. frequency fT(lin 1 fig. I. Two $ldcd powçr dçn.ity function. versus froquençy of maxentropic d-con.trained runlength-Hmited seqllence•. We can observe the foilowing characteristics. A maximum occurs at zero frequency in the uncoded case d = O. for other d's maxima occur at non·zero frequency. For non-zero d maxima occur at lower frequencies and have a larger peak wilh increasing d. The eoergy at the low frequeocy end decreases with increasing d. In fig. 2 we have plotted as an example the PSD for d = 2 and some different kvalues. For convenience we normalized the frequency axis per 68 Binary channel codes 10 ".". ,:, ,: I; 0 ëii !! ' t- 0 ,,: .. -10 <Il Q. ....ｾ ,," ':\ 1 \\ :\ -20 ＧＢｾ ',\ ｾＢ｜ -30 ".\ '.1 ｉｾ ":\ -..00 0.5 _ frequency fT, Fig. 2. Two-sided pOWer density function versus frcqucncy of maxentn>pk runlenglh-limiled with d .- 2 and some k vahJÇ:s. Thç frçquc:ncy axis has been ､ｾｬＮＺｩ｡ｭｲｯｮｾ･ｱｵｭＮＺ･ｳｾｭｬｴ per chaür'Iel bil T,.. channel bit time T,; the vertical axis is normalized per data bit time T, so that we compare the PSD functions for a fixed information rate of the modulation strcarn. 3.3. Average runlength In self c10eking systems the k eonstraint guarantees doek information within a eertain maximum time interval. Another parameter determining the quality of the doek regeneration is the average time between eaeh doek infor. mation interval. In most doeking systems docking informa.tion is derived în the receiver !"rom the transitions of the regenerated RLL sequence. In othcr words the average runlength is a sound criterion of the doek cOntent of a eertain RLL sequence. The average runlength is given by (9) and (7) TI T, = Ｌｾ L i À -I = d + 1 + ----,----;----,-1 Je d-k-1 1 - (k - d + 1),1"-> + (k À - d»)"d-H 1 ( 12) In fig. 3 we have depicted T as a function of the d and k constraints. Same statistical properties of maxentropic runlength-limired sequences 1--" 5 11- i 4 3 d",4 ｾ 5 69 d=3 d=2 ; : : : : d"' 2 2 4 6 8 10 ｫｾＱ Fig. 3. Average runlenglh of rnaxenlropic runlength-limite<;l sequences versus d and k ｣＼ＩｮＧｬｲ｡ｩｾＮ 3.4. DG-content In system design for magnetic and optical recording the low-frequency properties of a code aften play an important role. In particular the power at low frequencies or the so-calJed "low-frequency content" of the modulated output stream should narmally be as smal! as possible. This requirement is imposed in magnetic recording because this medium cannot reproduce the low frequencies with suflkient signal-to-noise ratio. Optical recordÎng does not suffer from this phenomenon, but unfortunately the servo systems that position the laser spot on the disc are sensitive to low-frequency interference 8). By evaluatÎng (11) we flnd for the power density of maxentropic RLL sequences at zero frequcncy 8(0) = Ｇｴｾ with 1'2 = ｔ＼ｾ { IT Ï:l i 2P(T1l} - 1'7.. (13) l=d+1 In other words the power density at zero frequency is the variance of the runlengths divided by the average runlength. Substituting the runlength distribution (7) gives us the results plotted in fig. 4. We notiee for fixed k that the De content decreases with incrcasîng minimum runlength constraint d. On the Binary channel codes 70 Q d.l ｾ､］ｾ o <;1.3 " do' v 5 r'ig. 4. De ｾｯｮＨｾｴ of ｭ｡ｸｾｮｴｲｯｰｩ｣ 10 d.S '5 runlength·limited sequenCe! ver.!u.! d and k contraint•. other hand we notiee for fixed d that thc De content increases with increasing maximum runlength eonstraînt k, S. Cmnpal'lsOIl of I'esulls Praclical modulatiOIl systems are often based on block codes or variabIe bloek eneoding sJ. Due to the finite hardware dimcnsions of the modulator the aetual rate, defincd as thc ratio of channel and data bit time, is smaller than the information eapacity C for thc gîven d artd k cOllstraints. The ratio of the practical rate and the asymptotic informatiOn capacily C, for the given d and k comtraints, is defined as thc modulator efficiency E. Franaszek 8) has cal· culated the modulator efficiency of many runlength-\imitcd codes. From his Some statistica/ properties of maxentropic runJength-limited sequences 71 work we conclude that, for moderate hardware, modulators can quile easily reach efficiencies of up to 85-950/0. This fact motÎvated us to compare the runlength distribution and power densily functions of practical modulator streams wÎth results obtained from the preceding theory. Power spectral density functîons of many practical modulation streams have been published. For example PSD functions of Zero Modulation, 3PM code and Miller-squared are described by Lindholm D), whose method is based on Cariolaro and Tronca's study of spectra of block-coded digital signals 10). They calculated the PSD function of the codes by Fourier transforming the autocorrelation function of the codes. The autocorrc1ation function can be found by manipulation of the transition matrix of the finîte-state model description of the code '). We calculated with a finite-state description ') thc PSD function of MFM (or Miller modulation) and 3PM. Thc rnain parameters of these modulation systems are listed in table 1. For the details of the modulation mies the interested reader is referred to e.g. D). TABLE I Main parameters of the modulation systems MFM 3PM d k R c E .. R/C 1 2 3 11 ! ! 0.551 0.545 0.91 0.92 In figs 5a/b we have plotted the runlength distribution (derived from the finite-state model) of 3PM and MFM. It can be ,een from fig. 5a that the probability of occurrence of the langer runlengths of 3PM are overestimated (and consequently the smaller runlengths are underestimated) by the theory. The PSD functions of 3PM and MFM were calculated by Lindholm's method and are plotted in tigs 6afb. In these figures we have also plotted the PSD functions derived from the maxentropic RLL sequences. The frequency axes are for practical modulation systems and maxentropic RLL sequences normalized per channel bit time. We note a surprisingly good agreement (a few dB difference) between the spectra of the modulation systems and the maxentropic RLL seguences. This is surprising because the runlength distributions, as plotted in figs 5afb, show a significant discrepancy wilh the theory. We conclude that the PSD functiOI1 is less sel1sîtîve to the typical statistÎCs of a modulation stream than the run, length distributiOn8_ Binary cha/'lnei codes 72 Q - PIT, I t :> PM ,. Theory RLLcode do2.koll x \.x \0 -, x 10 \"x \ \ )( .\. " 10 \x I 10 5 \x 1') runlength I Tc. _ 1- fl 0 \, FIT, I - MFM Theory RLLc-ode d=1,kd \ " 10 1 5 10 15 ru nlerH;lt h I I, Pigs 5alb. Runlcngth ｯｪｭｩ｢ｵｴｯｮｾ of 3PM (fig. Sa), MPM (fig. Sb) ano tbeir ｭＢｸｾｮｴｲｯｰｩ｣ RLL gequenee equivalent, for tbc Ｇｰｾ｣ｩｦｾ､ (d.k) - (2,11) and (d,k) = (J ,3) congtralntg, respeclively, as ,olid lino.; nOle however that only The di,tributions of the lllaxemropic gequcnccs arc ｰｬｯｴｾ､ a dis<:rete sel of point! is achievablc_ Some statistical properties of maxentropic runlength-limited sequences 73 '0 -3PM The or )! do:? kot' Ｍｾ ai ｾ lQ lil IJ. t ·10 -20 fT< _ '0 Miller Th@Oryid,k).i1.3l m ｾ 0 lQ ::c t -20 ......---! ＭＭＺｳＢ｡ＭｾＭＭＢＰＳ _ fT< Fip 6(1/ b. Two-.jided power density funotiOns of3PM. MFM alld their ｭ｡ｾ･ｬｲｏｰｩ｣ RLL seqllellees equivalents versus frequency. The frequeney aXes have been normalized per ehanllel bit time. 6. ConclusÎons We have proved that runlength-Iimited sequences with a minimum and maximum runlength constraint achieve their maximum information capacity if the probability of occurrence of a certain runlength is exponentially distrÎ. Binary channel codes 74 buted. Once the runlength distribution is givcn the power spectral density function of maxentropic ruolength-limited sequences is given by a straightfor· ward calculatioo. We have calculated the runlength distributions and power spectral density functions of some practical modulation systems and have compared them with results obtained from maxentropic RLL sequences. We found a significant discr<lpancy between the runlength distribution of the practical modulator systems and the maxentropic RLL sequences, whcrcas the power spectral densities showed good agreement (a few dB difference). REFERENCES ') J. C. Mallin,on and J. W. Miller, Radio and Elee. Eng. 47,172 (1977). ') ') ') ') 0) ') •) ') JO) G. V. ./acoby, IEEE Trans. Magn, MAG·H. no. 5, ｾｬＨＲＱ (\977). J. P. J. H eemsker k and K. A. Scho u hamer I mmin k, Philip, Techn. Rev. 40, 157 (1982). P. D. Shaf!. IEEE Trans. Commun. ÇOM·ll, 687 ＮＩｾＷＹｬＨ M. G. Pelchal and J. M. Geist, IEEE Trans. Commun. COM-23, no. 9.878 (1975). W, H. Kaulz. LEEE Trans. Inform. Theory lT-ll. 284 (1965). D. T. Tang and I.. R. Bahl, lnf'>lm. COllirol. 17.436 (1970) . P. A. hana"ek, IBM J. Re;. Develop.• 376 (1970). D. A. Lindholm, IEEE Trans. MaJ>l1. ＬＴＱﾷ［ｲｾｍ no. 5, 321 (1978). G. L. Cariolaro and G. P. Tronoa, lEEE Tran•. Commun. COM·:U, 1555 (1974). Method for encoding and decoding runlength-limited binary sequences 75 A GENERALlZED METHOD FOR ENCODING AND DECODING RUNLENGTH-LlMITED BINARY SEQUENCES Abstl1\cl Many moelulation ,yStem5 used in magnetic and optical rccording arc based on binary runlength-limited code5_ We generalb,e thc concept of dk·lirnited sequenees of length n introduce<l by Tang and Bahl by imposing constraints On thc maximum number C>f consecUlive zeros at the beginning and the enel of thc sequences. lt is shown that the encoeling anel elccoeling procedure, ate similar to those of Tang anel Bahl- The aelelitional ＮｴｮｩＢｲｾｯ｣ allow a more efficient rncrgin,g of ｴｨｾＮＡｪｾｑｵ･ｮ｣ｳＬｗｾ d-emonstrate two conStrue;tiöI1S of runlcngth-limlted codes with merging rules of incrcasing complcxity and efficiency and COmpare them to Tarlg and Bahl's method. 1. lntroduction Many baseband modulation systems applied in magnetic and optical re-cording are based on binary runlength-limited codes 1,2,3). A string of bits is said to be runlength-limited if the number of consecutive <:eros between adjacent OneS is bounded between a certain minimum and a certain maximum vaJue. The upper runlength constraint ｩｵ｡ｲｮｴｾｳ a transition within a specified time interval needed for dock regeneration at the receiver. The lower runlength constraint is imposed to reduce the intersymbol interference, The latter constraint appears to have a bearing on the spectral properties of the sequences '). In this correspondence we consider an encoding and decoding procedure for a special dass of runlength·limited codes. The constraints we cOllSider can be defined as followed. Let n, d, k, land r be given, where I";; k and r";; k. 76 Binary channrd codes A binary sequence of 1cogt.h n is called a dklr-sequence if it satisfics thc following conslraints: a) d-constrainl: any two ones are separated by a run af al least d canseculive zerOS; b) k-conslraint: any run of consecutive zeros has a maximum Icngth k; c) I-constraint: the number of conseculive [eading zeros of lhe seqnence is at most I; d) r·constraint: lhe number of cansecutive zeros allhe end of the sequence is at mosl r. A sequence satisfying the d- and k-coostraiots is called a dk-scquencc. Given certain desired runlength conslraints, it is nol trivial how la map uniquely the input data stream onlo the cncoded output. dat.a strcam. A systematic design procedure on lhe basis of fixed length sequences has been given by Tang and Dahl l ). Their method is bascd on mapping dk-sequences of lenglh n anta conseculive integers and vice versa. In lhis correspondence we inlend to generalize their resulls to dklr-sequences. We are also gaing to show that the application of dklr·sequenees eoables the sequences of length n to be merged efficiently without violation of the d- and k-constrainls at lhe bOllodaries. Wc assume r # d- Theorems land 2 of this correspondence remaio vahd for r < d and theorem 3 cao bc gcncralizcd accordingly. Thc proofs, howcver, arC less elegant, the reaSOn being that for r < d there are nO dklr-sequences (XIl_l, ... , xû) having a one at positions ) where r <.i >.( d. 2. Encodinft llod decoding In this section we consider a way of mapping lhe set of all dklr-sequences onto a set of conseculive inlegers and vice versa. Ou!' results are similar to those obtained by Taog ilnd Bahl for dk-sequeoces" Lel A" be the set of all ､ｫｬｲﾷｳ･ｱｵｮ｣ｾ of length n. A" ean be embedded in a larger set A. cansîsling of lhe all-zero sequence of lenglh n and of all binary ｾ･ｱｵｮ｣･ｳ of length n satisfying the d-, k- and ｲﾷ｣ｯｮｳｴｲ｡ｩｮｾ where thc numbe::r of consecutive IeadÎng zeros is allowed to be grcatcr than k. Thc set 04" can be ordered Iexicographically as follows: if x = (x"_,, ... ,xû) and Y = (Y,I-I, ... , Yo) are elements of d" then y is called less than x, y <0 x, if there CXÎsts an i, 0.:( i <: n, such that YI < XI and xJ = YJ for i <) < n. The posÎtÎon of x in the:: lcxicographical ordcring of .4" is denoled by r(x); I.e., r(x) is the number of all y in A" with y <ox. Consequently r(O) = O. For thc sake of convenience we întroduce the rcsidual of a vector. Let y = (Y.-I," .,yo)E.4., y f 0, and let (be such thaty, = 1 andYJ = 0 if l <) < n. Then the resîdual of Y. res(y), is defined as fol1ows: Merhod for encoding and decoding runlength-limired binary sequences res (y) : '" y - ó. i , where <la J. {O. = 77 jf i = 1; elsewhere, res (0) : = O. lt can easily be seen that y r:.4rt implies res (Y) E .4•. The following ob· servation is basic to the proof of theorem 1. Let x, u e 4. and assume that = /1.j = 0 (1 <j < n) and Xi == U, .. 1 (0 «; 1 < nl. Then it is not difficult to show that r(u) ｾ r(x) == r(res (u» - r(res (x». Let N,(I) , i> 0, be the number of dklr-sequences with 1= k of length i and let N,(O) == I. Xj Theorem J Let x = (X'-l' ..• , Xi) e A •. Then .-1 r(x) "" I Xj N,(j). ｊｾｩＩ Praal Let the nonzerO coordinares of X be indexed by i, < 12 < ... < i", i.e., Xi = I if and only if ie (i ... , iql. Let u be the smallest element of ui. with the property that Ui. == 1. "Then ir is not difficult to see tbat the second 1 of u occurs at position iq - k - I, îf i" - k - I )0 0 (otherwise u has only one 1, if r;;;' i", or also IlO = 1 if r < tg). Here we recall that the all-zero sequence of length n is a dklr-sequence if and only if n < min 11, rl. Let 8(i", r) = 1 if r;;" iq, and e(iq, r) = 0 if r < i", Then we obtain r(u) ｾ r(res(u» = the number of dklr-sequences·wÎth I .. k of length I" with their leftmost 1 at position j where roaxlO, i g - k - IJ <J < Iq + t(iq, r) = N,(i"). Furtherroore, on the basis of the above mentioned observation It holds that r(x) '" r(u) = r(u) + r(x) - r(u) + r(res (x» = r(res (x» ｾ r(res (u» + Nr(iq). The theorem then follows by induction. Q.E.D. We have found a sîmple method for mapping the elements of .;drt onto consecutive integerst For practical applications we need a mapping of the elements of A. onto the set 10,1, ... , IA.I - Ij, where IA.I is the cardinality of A•. Obviously the set A. consists of the IA"j largest elements of •.4". In addition it is dear that the number a of elements of 4" that are smaller than all the elements of A" is equal to r(a); I.e., a = r(a), where a is the smallest element of A •. In this way we have proved the following theorem. 78 Binary channe/ codes Theorem 2 Thc:: LransformalÎon I: A,,---- N u (Ol defined by t(x) = r(x) "a for all xeA" is a one-Lo-one mapping from A n onto the set (0,1, ... , IA"I - IJ which preserves the ordering of A", i.e" x <0 y if and only if t(x) < t(y). The number a can also be expressed in another way. In order to do so we define N,°(j), j > 0, to be the number of dk-sequences of length j with their leftmost element equal to land satisfying the r-constraint, N,(J(O) : = I. lt is nol difficulL to show Lhat 11"1"'/ a = L ]1 ... N,°(j) j=1 + = I 1-/ N,o(j). )=0 Thc numbcrs N,(j) and N,°(j) are easily eomputed. They ean be found by a slraighlforward computer "eareh. A more sophisticated approach to tinding these numbers ean be based on the finite state transition matrix eonesponding to the dklr-sequenees 5). The eonversion from integers to dklr-sequences of Iength ft is aho analogous to Tang and Bahl's method and çan be carrîed out as follows. Let To, .. " T.-l he integers defined by Ti = the number of elements y in ,yI" smaller than Lhe smallest element u in ,!d" with Ui 71 and Uj = 0 for j> i, i.e., 1i = I i J j.l Nr(J(j) + 1 = L N,(J(j). j:(J From lhe definîLion and our assumption r;? d it immediately follows that Tu < T, <: ... < Tn .,. These integers are used for mapping consecutive integers onto dklr-sequenees of length n as shown in the following theorem. Theorem 3 Let x '" ＨＮｾＧ｟ｉＬＢ " "' xo) be a dklr-sequence of length n. Then a) Xr = land Xj = 0 1'01' t <) < n <;. Tt < r(x) < 7;'1. Furthermore, jf x, = land Xj = 0 1'01' t <) .".:: n. then b) ｉＭｾｲｔｾ r(x) - N,{l) < 7;-0' Proof a) This statement follows from the definition of T,. b) Let x be a dklr-sequence of length n with XI =0 land Xj = 0 1'01' t <) < n. Then ｔＬＭｾｬ ,,;; r(res (x») < Tt_d. Henee b) follows, sincc r(x) - r(res (x» = N,(t). Q.E.D. The conversion from integers to dk-sequenees ean therefore also be gener· alized to the conversion from integers to dklr-sequenees. The following simpte Methad for encoding and decoding runlength·Umited binary sequences 79 encoding algorithm for dklr-sequenees ean be derived from this theorem. Given an integer Jin the set R = (r(x) IXEA.1 (I = 0 if x = 0), we first locate the largest possible 1, 0".;;; t < n, sueh that T, ｾ 1< 1;+1 alld we make x, = I. Subtracting the contribution of Xl in J, we get a new integer 1- N,(t). Theorem 3 can be used again to find the next nonzero component of x. The second part of theorem 3 assures us that x, win be followed by at least d, but no more than k zeros. 3. The efficiency of dkJr-sequences In modulation systcms the dklr-sequences of length n cannot in general be caseaded without violating the dk-collstraint at the boundaries. lnserting a number p of merging bits between adjacent n-sequenccs makes it possibIe to preserve the d· and k-constraints for the cascaded output sequenee. Thc dksequences need f3 = d + 2 merging bits '), whereas only IJ '" d merging bits are required for dklr·sequcnces, provided that the parameters land rare suitably chosen. Henee this method is more efficîent, especially for small values of n. We shall now demonstrate two eonstfuctions of codes with merging rules of increasing complexity and efficiency. Construction 1 Choose d, k, rand n sueh that r + d,;;;: k. Let {= k - d - rand IJ = d. Then the dklr-scquences of length n can be cascaded without violating the d- and k-constraints if the merging bits are all set to zero. Construction 2 Choose d, k and n such that Zd - 1 < k. Let r = 1= k - d and f3 = d. Then the dklr-scqucnccs of length n ean be cascaded without violating the d- and k-consttaints if the merging bits are determined by the following rules. Let an (s,,;;; r) while the next n-sequenee start n-sequenee end with a run of s ｾ･ｲｯｳ wilh t (t,;;; I) leading zeros. Table I shows the merging rule for the f3 = d merging bits. TABLE I Merging of dklr-sequences where (}i stands for j eonsecutive ｾ･ｲｯｳ s, t s+t+d<k+ s+t+d";?;k+ if s< d ifs>d merging bits Binary channel codes 80 The numbeT m of data bilS that can be represented uniquely by a dklr· sequence of length n is given simply by where lx J is the greatest integer not greater than x. The ratio R of thc numbeT of data bits and the number of needed channel bits is called the information rate of the code. FOT exampte, the information rate of the codes based on the two above-mentioncd constructions equals R = m/(n + dl. The asymptotic information rate is the capacity C of Shannon's discrete noiseless runlengthlimited channeI 6,7), C= IÎm ｬｯｧｾＱＮｌ n The efficiency rI can be defined as the ratio of the information ratc Rand the capacity C of the noiseless runlength-limited channel, IJ = RIC. In order to get some insight into the efficiency of thc codes based on constructions 1 and 2 we have considered same cxamples. For m = 8 and for d = 1,2, 3.4 and k = 2d, . .. ,20 we have determined n in such a way that the infmmalÎon rale R was maximized. In order to compare our two eOnSlrUCtions to Tang and BahI's method we have calculaled the corresponding capacities C and elficiencies '7. Thc capacity of the noiseless runlength-limited channels waS ealculatcd hy a method given in ref. 1. OUr results cal") be summarized as follows. For small values of k, i.e., 2d.:;; k < 3d, construction 2 is only slightly better than Tang and Bahl's mClhod (approximalely 5 percent), while the efficiency of construclion 1 was worse (5 to JO percent). For larger values of k, howeveT, constructions 1 and 2 are clearly better. For those values of k tne gain of construction 2 compared to Tang and BahI' S method is most significant for d = I, 2 (12 to 15 pereerlt), while for d = 3,4 the gain is equal to 9 percent. For large values of k, constructionS 1 and 2 have the same etftciency; for the other values of k, conｾｴｲｵ｣ｩｯｬ 2 has a better efficiency than construction I. Tablcs 1I, III and IV give the results for m '" 8 and d = 1, 2, 3 and 4; in oTder to limil the !ength of [he tables, we have restricted k and n 10 those valucs whieh maximize the information rate R. We note that rates up to 95 percent of the channel capacity can be achieved. On average we observe a slight difference in the rates obtained by constructions 1 and 2, approxÎmately 5 percent in favor of construction 2. Methad jor encoding and decoding runlength·/imited binary sequences 81 TABLE II Block codes based on construction 1 d k 1 2 3 4 17 14 18 7 n R C 12 14 8/13 8/16 8/20 8/23 0.68 0.55 0.46 0.40 17 19 Yf = RfC 0.91 0.91 0.87 0.87 TABLE UI Block codes based On construction 2 d k n R C ,,= R/C 1 2 3 4 5 10 lO 12 12 14 17 8/13 8/16 19 8/23 0.65 0.54 0.45 0.39 0.95 0.92 0.90 0.90 8/20 TABLE IV Block codes based on Tang and Bahl's construction d k n R C 1'/ = RIC I 2 3 4 5 9 8 10 12 14 17 19 8/15 8/18 8/22 8/25 0.65 0.54 0.43 0.38 0.82 0.83 0.86 0.85 4. ConcIusion Methods are described for the construction of runlength-limited codes on the basis of sequences of fixed length. Additional constraints On the maximum numbcr of l;CWS at tbe beginning and end of a sequence, a generallzation of Tang and Bahl's work, allow a more efficient merging of the sequences. For short lengths in particular, our method yields better efficiencies than those of Tang and Bah!. 82 8inary channel codes REFERENCES ') D. T, Tang and L. R. Bahl. Inform, Contr. 17,436-461 (1970). ') H. Kobayashi, IEF:E Trans. Comm. Teeh. COM-l?, 1087-1 J(lO (1971). Ｂｾｩ K. A. Schouhamer Immink, PrOC. IEEE int. Con1. on ａ｣ｯｵｾｴｩＢ Speech, anu Signal Pr<\cessing, 587-589 (1981). ') M. G. Pelchal alld J, M. Geis!, IEEE Trans, Comm. COM·23. 878-883 (1975). ') P, A. Frananck, IBM J. ｒ･ｾＮ Ocv. 14,376-,83 (1970). 0) C. E. Shannon, !Jell $yst. Teeh . .I. 27, 379-423 (1948). ') C. V. l"reiman anJ A. D. Wyner, InforOl. Conlr. 7, 398-415 (1964).

RELATED PAPERS

RELATED TOPICS

Log In

Properties and constructions of binary channel codes

Properties and constructions of binary channel codes

Related Papers

RELATED PAPERS

RELATED TOPICS