Properties and Constructions
of Binary Channel Codes
Proefschrift
ter verkrijging van de graad van doctor in de
technische wetenschappen aan de Technische
Hogeschool Eindhoven, op gezag van de rector
magnificus, prof. dL S, T. M. Ackermans, voor
een commissie aangewezen door het college
van dekanen in het openbaar te verdedigen op
vrijdag 3 mei 1985 te 16.00 uur
door
Kornelis Antonie Schouhamer Immink
geboren te Rotterdam
This thesis was approved by tlle promotors
Prof. Dr. Ir. J. P. M. Schalkwijk and Prof. Dr. K. W, CatterrnQle
Contents
CONTENTS
Summary , . . .
vii
Acknowledgement. . .
0, Introductory chapter.
VII!
. . . . . . , _ . . . .
I. Performance of simple binary DC-constrained codes .
10
2. Construction of binary DC-constrained codes
31
3. Spectrum shaping with bÎnary DC2-constrained codes
49
4. Some statistica] properties of maxentropic runlength-limited sequences . . . . . , . . . . . . . . . . . . . . "
63
5. A generalized method for encoding and decoding runlength-Bmited
binary sequences _ . . . . . . . . . . _ . . . . . . .
75
Biography . .
83
Samenvatting
84
Chapters IA are reprinted from Philips Journa\ of Research.
Chapter 5 is from G. F. M. Beenker and K. A. Schouhamer lmmink, IEEE
Inform. Theory> IT-29, p. 751 (1983).
Summary
SUMMARY
Channel codes, sometimes ealled transmission Or lines codes, are applied in
storage syslems sneh as magnetic tape or disc and optical disco Applications
are also found in transmission systems over libre Or met'allie eable.
A ehannel code converts the digital source information to a form suitable
for a specilic transmission medium. For example DC-free codes are designed
in such a way lhat the encoded signal has suppressed frequency componenls
in the region around zero frequeney. These codes are for example applied in
transmîssion systems having insulficîent response in the low-frequency range.
Another requirement împosed on a channel code orÎgînates from the fact
lhat the maximum distance between transitions in the encodcd signa!, the
maximum 'runlength' , should be limited to enable a simple syslem doek
recovery in the receiver.
Thîs thesis deals with systematic methods of desîgning nC-free and runlength-limited codes. Procedures are given for a simple enumerative encoding
and decoding of the codewords. Also described are several propertîes of channel codes sueh as spectral and runlength distributions. Criteria derÎved from
information theory are used lo compare the channel codes.
Binal'Y セᄋィHjNョ・{
viii
codes
ACKNOWLEDGEMENT
I am great.ly indebted 10 the management of the Philips Itesearch Laboratories, Eindhoven, The Netherlands, for the opportunity to carry out and 10
publish the work described here. Stîmulatîng discussions with Prof. J. P. M.
Schalkwijk, Prof. K. W. Cattermok and with colleagues have greatly contributed t.O the eontents of this thesis. In particular I Wan! to thank G. F. M.
BeenkcT of the mathematical depaTtment of Philips Research for his mathematical support and for adding many contributions to the papers.
Introductory chapter
1
INTRODUCTORY CHAPTER
O. General
Channel codes are applied in digital transmission or storage systems. Early
digital transmission systems have been widely used for telegraphy (morse code)
and telex, We are witnessing a revolution in world-wide telecommunications,
data and computer networks are now being built with enormous capacities.
Not only are the telecommunication networks booming, so too are systems for
the storage of digital information (transmission in time). For example the
storage capacity per unit surface of magnetic tape Or disc bas been doubled
evcry three years since the early 19605.
The Compact Disc Digital Audio System, introduced in )983, based on the
optical read-out principle, was the first high-density storage medium (one bit
per square micron) to reach the homes of the consumer.
Channel codes are the cornerstone of almost all digital transmission systems. Their main functions are:
(i) matching the transmitted signals to the transmission channel,
(ii) allowing reliable transmission fot reception by simple receivers.
For transmission or storage of binaty dîgîtal înformation the simplest 'code'
format seems to be not coding, Le. the soutce symbols '0' or 'I' are coded as
the presence or absence of pulses, respectively. There arc, however. same
engineering problems associated wÎth this simple format.
In most transmission systems timing information must be extraeted from
the transitions 0 -> I or I -> 0 of the received message, so that long
sequences of like symbols should be avoided. An uncoded random binary
souree signal has an average time between successive signal changes equal
to two symbols. The maximum distance between transitions, the so-called
maximum runlength, is infinite. Scramblers are often used in cable transmission practice to randomize (without adding redundancy) the source data 1).
Unfortunately, however, a code based upon statistical considerations alone
remains vulnerable to specific Worst-case or 'pathologicaI' channel sequences
for which the probability of being errOlleously received is much larger than
average,
The Fibonaccî codes described by Kautz 2) and the runlength·1îmîted codes
of Tang and Bahl $) add redundancy to the signal so that a finîte maximum
runlength is absolutely guaranteed.
lntuitively it should be dear that the smaller the maxÎmum runlength the
more redundancy should be added to the encoded stream. The arnount of
redundancy needed to guarantee a maJI:imum runlength can be calculated
using an information theoretical approach $),
2
Binary channel codes
Many channels cannot pass the low f(cquclIcics with sufficient signal-tonoise ratio. Shaping thc spectrum of the encoded stream by coding can cope
with this problem. Most. of t.hc channc1 codes that are used in practice are 50called block codes. The source digits are grouped in source words of m digits.
Using a code book tIJc source words are translated into blocks of n digits
called eodewords. Constructions of digital codes having spectrum ?eros at
arbitrary frequencies were given by Gorog 4).
The designer of a digital transmission system will in general be confronted
with the following problems:
(I) the characterization of the transmissioll channcl lcading to some specific
channel code requirements;
(ii) the choÎce of set(s) of codewords, called code book pages (sometimes confusingly called alphabets), satisfying thc channel code requirements;
(Hi) the translation of the souree words into thc codewords and vice versa e.g.
using look-up tables;
(iv) the evaluation of the newly designed code with respect 1.0 added redundancy and resulting shaping;
(v) the testing of the new channel code in a practical environment to evaluate
its performance in average and worst-case conditions.
During the design and experimental phase, feedback in the mentioned itemS
is incorporated to improve the total system performance. Thc channel characterization leading to specif1c code reqlJirernents is an important aspect of the
syslem design. This aspect is beyond the scope of this thesis.
The translation of source words into codewords and vice versa will be one
of the main topics discussed here. Boolean flJnctions to lranslate source words
into codewords and vice versa can easily be found by hand using a heuristic
approach if the codewords arC rclatively smal!. Ir wil! be shown that generally
speaking t.hc efficiency of a code improves with increasins codeword lengtlL A
systematic; approach of thc mapping problem is needed if greater code effi.
ciency is desired. Frankling and Pierce 5) pointed out that a simple sequential
algorithm given by Schalkwijk 5) and Cover 7) allows the assignmcnt of unique
numbers to codewords of nxed disparity, and Lyon 8) reported on the practical use of this simple algorithm. Application of this so-calied enumerative
cncoding and decoding algorithm leads to a systematic approach of thc
mappÎng problern wÎth the addîtional advantage that look-up tables of
moderate size, even for large codeword lengtIJs. call bc used. The idea of
sequential encoding and decoding goes further back in history than noticed in
reference 5. For example Cattermole 9) reported on the enumerative encoding
and decoding theory. An embodirnent example using anaiogue circuitry was
pat.ent,ed in 1952 H').
3
IntroduclOry cfuIpter
We shall study the design and informatÎon theoretical performance of two
groups of channel codes:
- De-free codes and
セ
runlength-limited codes.
Survey papers regarding these types of codes can be found in refs 11, 12 and 13.
l. nC-free codes
The field of application of digital channel codes with supprcssed low-frequency cOInponents is quite braad. We find applieations in transmission sys·
tems over fibre Or metallic cable 14,15,16, I?) in storage media such as magor optical recording 20 •2 1). Though the restrictions on the channcl
netie IYNセQ
sequence are frequently put in frequeney-domain terms, wc are aften more
interested in the tÎme-domain properties.
The souree sequenee, assumed to consist of equiprobable and independent
binary digits, is mapped onto a binary channel sequence. The received signal
can be written down as
.,
I
r(t) '"
al g(l - iT)
+ nw(t),
(I)
Im_OO
where the al E ( - I, 1) are two-valued parameters, that are generated each T
seconds. g(t) is the impulse response of the channel (plus possibly a whitening
filter) and nw(l) is additive white gaussian noise (see ref. 22, chapter 4_3). Assuming that the signal is matehed-filtered (projceted) and sampled at I = kT,
then the equivalent channel vector is
00
h=OkgO
+
L oigk_l+nk"'o"gO+qk+nk,
(2)
i.-Q;1
iセォ
where q. is inter symbol interference (ISl) at t "" kT.
The statistics of qk are directly related to the channel sequence a and the
impulse response g(t) of the channe!. Thc lSI can therefore be affected in two
ways:
I) pulse shaping at the transmitter and/or receiver using filters, and
2) manipulation of the code structure and hence of the correlation in the
channel sequence.
The usual approach to combat ISI has foeused on the shaping of g(t) for
zero Înterference. Hard-limiting channels (for example optical recording) only
accept two pulse shapes, a positive Or a negative full-T pulse, so that ISI ean
only be affeeted by the code structure or the receiving filter. The shaping of the
code structure with the aim 10 minimize the lSI is the domain of the transmission codes.
4
Bil1ary channei codes
l.I. Model oj 'he A C-coupled chllnnel
Man)' transmission channels arc for practical reaSonS AC-coupled, i.e. there
will be .>ome low-frequenc)' cut·olf due to coupling components, isolating
transformers, etc. Other contribu1.il)ns to the ISI mal' arise from the banclwiJth limitations of the channt;!. We assume in the folJowing that ISI is only
caused by the AC-coupling. lf the AC-couphng is a first-order high-pass filter
and detection is dOlle with an integrate-and-dump filter, th en the ISI can be
approximated by:
k
:E al,
qk = -h
(3)
i=-oo
where h (h <: 1) is lhe ratio of pulse duration and time-constant of the ACcouplingWe now define the (running) digital surn of a sequenee a bl':
k
Zk =
I
lit-
(4)
i=o_aQ
We concludc frorn eqs (3) and (4) that the 151 of the AC-coupled channel i,
proportional to the digital sum of the sequence. We further concludc that long
runs of similar syrnbols build up a large iSf. To limit the iSi codes are used
with lhe property that the ehannel sequence as,umes a finite number of sum
values_ Sum constrained sequences have the frequency-domain propertl' that
the power vanishes at zero frequencl', which is the rtason for the name of
these wdes: DC-free (or DC-constrained) codes.
The probabilitl' of channel ,ymbol error P(e) can be approximated bl' (a5wming P(a = I) セ P(a.,,- -I) = I)]:
P(e) =
t P<I n -
hz I ;> go),
where the subscripts of n and z; were discarded for brcvitl'. Assuming thc noise
n and ISI q oe -hz to be independent, the probabililY density funC!.Îon of the
SlIm (n - hz) i, the convolution of the individual density functions of n and zIn イセィエッ
words gi ven the density function of the noise, a de,igner can minimize
P(e) bl' properll' choosing the density function of the digital sum z_ Chang et
aL Z3) reported on the design of DC-free transrnission codes based on pee) as a
çriteriQn. Their method is strajghtforward but is impractieal when long codewords are used. It has the additional disadvantage that no systematic rnethod
(except a direct look· up) is available for encodîng and decoding thc codewords.
From eqs (3) and (4) wc notiee that the maximum IS! (peak eye closure) is
relaled tc the total numbcr of surn values the channel sequence assumes.
5
Introductory chapter
Mean-square-ISI has achieved a great popularity because of the ease with
which they can be handled mathematically. It may be c1ear that the meansquare-ISI is proportional to the variance of the running digital sum (in short
sum varianee). The last two criteria, i.c. thc ma/dmum and mean-square ISI,
play an important role in the design of DC-constrained transmission codes.
Glave 24) (sec a150 ref. 25) derived an upper bound on the probability of error
due to ISI and noise of coded sequences.
He found that if QE(-A,A) and E[q2] <ut that P(e) can be upper"
bounded by (assuming nk gaussian and uncorrelated):
2P(e) ",; RaOセエャ
[Ql(go - A)/a.J
+ Ql(go + A)/lt.ll + 2(1
- uJIA 2) Q(go/o.),
(5)
where セオ is the noise varianee and Q( .) is the Gaussian error-probability function (see e.g. ref. 24).
Eq. (5) can be used to estÎmate PIel once a bound on the IS! has been
found. If the channel sequence is designed in such a way that ît assumes N sum
values (N is sometimes called the digital SUm variation) then
h
A", T(N - 1).
A frequency-domain reason for using DC-constrained sequences is found
when the transmission channel is a small part of a data storage system. For
example, in optical disc recording the requirement of reduced low-frequency
content is not imposed because this storage medium does not respond to the
low frequencies, but because the servo systems needed to position the laser
spot on the disc are sensitive to low-frequency interference induced by the
code stream. Thc frequency region with suppressed components is charaeterized by the so-called cut-olf frequency. Justesen al) showed a close relationship between the cut-off frequency and the sum varianee, which makes the
sum variance an important parameter in the frequency- and time-domain.
1.2. Description of DC-constrained codes
Many examples of DC-constrained channel codes are known which are
block codes with codewords having zero or low disparity, where the disparity
of a codeword is defined as the difference of the number of ones anó セ・イッウ
in
the codeword 4,9,13,26,27,28,29). In the simplesr type of code the SOUrce words
have two alternative translations with opposite sign of the disparity 28). These
codes are called bi-mode codes. The choice of a particular codeword polarity
is made in such a way that the digital sum of the sequence after transmission
of the new codeword is as close to zero as possible.
6
Binary channel codes
In chap, I the ,at.c and low-frequency properties of simpie, bi-mode, Decomtrained channe1 codes are analysed,
Sec. 1.2 starls with a brief description of the pwperlies of sequences generalcd by a Markov source consnained in sueh a way th at the digital sum of the
sequence assumes a limited number of values 3o ,31). The properties of these
theoretical sequences are used to establish a new figure of merit of De-eonstrained codes.
The p<lwer speelral density function (spectrum) provides useful information
aboUl the response near zero frcquency. ]'he calculation of the spectrum is
usually a laborious proccss. Several methods and results have been reported in
The procedure given by Cariolaro et al. 33,35)
the literature SセLNTセ{^LDoW。XIN
tlsing matrix analysis is very weU suited to machine computation. For large
codeword lengths and many encoder slates the memory requircments of the
procedure become prohibitivc, even for large main frame computers. In sec.
1.3 a new method is presented to compute the power density funct.ion of lowdisparity, DC-balanced ehannel codes. The method uses thc special structure
of t.he code, lcading lo a simple result, for the simp!c bi-mode channel codes
the sum variance is cakulated in sec. 1.4.
It is cuswmary to define the efficiency of a channel code as the ratio of the
code rate and the noiseless channel capacity given the channel constraints ,s, 4 °l.
In sec. 1.5, using t.he theory developed in sec. 1,;2, we d;;mne a new figure of
merit for DC-constrained channel codes. We study t.he performance of these
simpk channel codes by considering the exchange between the added redundancy and the tesulting suppression of low-frequency components comparcd
with the properties of sequences generated by a maxentropic Markov sOUrcc.
The simple bi-mode codes as discussed in chap. I use the ruil sets of codeword" of a certain disparity, In chap. 2 thc code design is based on codewords
having a constraint on the maximum number of assumed sum valucs, i.e.
those codewords arC discarded that make a relatively large contribution 1.0 the
SUm varianee. The actllal basis of code design is given in sec. 2.3. It provides a
lllethod for the enumeration of codewOrds with a constraint on tiN maximum
numbcr of assumed sum values.
In general lhe efflcîency of a channel code can be improved if longel' codewords are allowed, Unfortunately thc amoLlnt of encoding and decodjng hardware incteases exponentially with inereasing codewotd length if a direct
method using look-up tables of thc souree word" and theiL' channel rcprcscn"
t.at.ions is used. Sec. 2.4 deals with algorithms for enumerative encoding and
dccoding of constrained codewords with the nice property that the number of
enlries of look-up tables to be used grows polynomially with thc eodeword
length. Finally. in sec. 2.5 t.he problem is addressed of sekcting selS of code-
lntroduetory chapter
7
words to design DC-suppressed channel codes. A worked example is given of
a ncw Sb I Ob transmission code, attractive for magnetic recording 41.42.43) and
libre transmission セmVIN
showing superior features compared to other codes
with rate 8/10.
In chap. 1 we show that lhe cut-olf frequency of sequences generated by a
maxentropic Markov source is proportional lo lhe redundancy of the code.
Examples of a practical embodiment of binary DC-balanced codes are discussed in chaps land 2. It is shown that the performance of these simple
codes, in parlicular fOr smal! codeword length, is comparabie with that of
maxentropic sum constrained sequences.
The power spectra! density funetion of digital sum constrained codes is
characterized by a parabolic shape in the low-frequency range from De to the
cut-olf frequency. In some applications it is desirabie to achieve for a given
redundancy a larger rejection of the low-frequency components than possible
with DC·balanced codes.
Chap. 3 presents a new class of De-free codes having a zero second-derivative of the code spectrum at zero frequency. This results in a substantial
decrease of the power at low frequencies for a lixed code redundancy with
respect to the classical designs based on a digital SUm criterion.
Sec. 3.:'. introduces a time-damajn constraint which has to be irnposed on
the channel sequence so that lhe resulting spectrum of the sequence has lhe
DC 2 -balanced property, i.e. has both zero power and ;<;ero second·derivative
at zero frequency. Sec. 3.3 gives an enumeration method for finding the number of codewords CO be used in a DC 2 -balanced code.
In order to obtain practical rates for the new codes. it is necessary 10 use
relatively long codewords, which makes a direct rnethod of encoding and decoding using look-up tables of the source words and their channel representations prohibitively complex. Sec. 3.4 deals with the enurneralÎve encodÎng
and decoding of DC 2 "balanced codewords. The algorithm is not more complex than looking-up and adding. The look-up tables needed for the enumera·
tive coding grow polynomially in complexity with increasing codeword length.
Sec. 3.5 gives examples of codes using codewords that can be concatenated
without a merging rule. The power density functions of the newly developed
codes are compared with those of classica! DC·balanced codes.
2. Runlengtll-Umited codes
Binary codes such as the Miller code 16), EFM 20), 3PM 46) and Zero-Moduiation セWI are baseband transmîssion systems with applications in magnetic and
optical recording. These codes belong to the class of so-called binary rUn"
length-hmited sequences (RLL sequences). A string of bits is ctefined to be
Binary ch(Innd codes
8
runJcngth-limited if the number of consecutive like symbols is bounded between a certain minimum and a maximum value. The maximum runlength
constraint guaran!.ees a clock pulse within セッュ・
specified time, which is needed
fOl" thc clock regeneration at thc rcceiver. Obviously a セ・アオ」ョ
with a digital
sum constraint implics a maximum runlength consIraint, but not vice versa.
Thc minimum runlength const.raint is imposed to control intersymbol interfcrcnee and consequemly has a bearîng on the distortion of the transmitted
signal when the transmission channel is bandwidth-limited ia.18). Jn chap. 4 wc
show thllt RLL sequences with maximum information content. dcfined as
maxentropic RLL sequences, have their runlengths exponentially distributed.
This leads to a simple derîvation of the power densiLy function of maxentropic
RLL ウ・ア|jHセョ」 ウN
In sec. 4.2 we give a formal definition of RLL sequences, followed by 1111 ana)ysis of Ihe runlength distribution and spectral properties of
maxentropic RLL sequences. The study in chap. 4 was motivated by t.he fact
that practîeal ernbodiments of RtL channel codes can with moderate hardware quite easily reach r3teS of 80-950/0 of the maximum information capacity
(see chap. 5 and refs 39 and 50). From this observation we maya priorî conproperties of practical codes can be predlcted by
jecture that the ウエ。ゥセN」。ャ
maxentropie RLL scquences. In this way t.he results obtained from the maxentropie RLL sequence theory could be a simpte tooi for an approxirnate calculatioo of thc power density function of practical code streamS. To test Ihis
hyputhesis we compare in sec. 4.4 t.hc runlength distributions aod power densit.y Funetions of two weli"known charmel codes with the resul!.s predicted by
t.he maxentropic RLL theory.
Many procedures are available for the design of runlength-iimited
codes 2 ,3.39.51). In particl,l\ar t/'le method presented by Tang and BahP) is
attractive because it is based on codewords of fixed length. Chap, 5 gives a
gencrali:.:ation of the concept of dk-limit.ed sequences of length n introdueed
by Tang and Bahl by imposing eonstrainIs on the maximum number of conseeutive zcros allhe beginning and end of tne scquences. It is shown in sec. 5.2
thai. Ihe enumerative encoding and decoding procedures are similar to those of
Tang and Bahl- The additional constraints aliow a more cfficient merging of
the sequences. Two eonstruetions of runlength"limitcd codes with merging
rules of increasing complexilY are given in sec. 5.3. The efficiency of the new
conSl!'uctions is compareJ with t.hat of Tang and Bahl's method.
REFERENCES
')
')
')
')
.1.8. S."lge, Elell Sy,1. Techn. J. 45, 449-4ij7 (1966).
W. Il. KatllL. IEEE Tram. Inform. 'Yhoory IT-ll. 284·292 (1965).
D, T. Tang Md I... R. Bahl, lnform. CO"lr. 17,436-461 (1970).
E. GOl'og, IBM J. Rç•. Oovdop. 47, 234-241 (1968),
Introductory chopter
')
•)
')
")
')
lOl
")
")
")
I.)
")
I.)
")
I.)
I.)
,.)
Ol)
")
")
")
>0)
'.)
")
,.)
"')
")
Ol)
SO)
")
3<)
SS)
SS)
")
SS)
SO)
'0)
<I)
•,)
<S)
•<)
<0)
")
")
")
•セI
'0)
>I)
9
J. N. Frankling and J. R. Pieree. IEEE Trans. Commun. COM·20, 1182·1184 (1972) .
J. P. M. Sehalkwijk. IEEE Trans. lnrorm. Theory IT·lIl, 395·399 (1972).
T. M. Lイセカッc
IEEE TraJ;l>. InfoUIl. Thc;çry H-19, 73-77 (1973).
R. P. Lyon, IEEE Tran•. Commun. COM-21, 1438-1441 (1973).
K. W. CatlermOle, 'Principle. of digitaJ lino coding" lIiffo Boob Ltd .• London (1969).
E. Labin and P, R. A.grain, U,K. Patent 713,614 (1952),
H. Kobayashi, IEEE Trans. Commun. Teeh. COM·19. 1087-1100 (1971).
N. Q. Due and B. M. Smith, Au>t. Te1ecornrnun. Res. 11, 14-27 (1977).
K. W. Cattermole. Int. J. Electron. 55, 3-33 (1983).
D. B. Waters, Int. J. Electron. 55, 159·169 (1983),
R. M. Brooh and A. Jessop, Int. J. 81eetron. 55. 81·120 (1983).
Y. Takasaki, K. Yama.hita and Y. Takaha.,hi, Int. J. 81eetron. SS, 12l-DI (1983).
M. RoussealI, ElewOI'I. Lett. 12,478-479 (1976).
J. C. Mallinson and J. W. Miller, Radio and Ek<. Eng. 47. 172-176 (l977).
M. Davidson, S. f'. Haase. J. L. Maehamer and L. H. Wallman, IEEE Tran•. Magn.
MAG-U. 584-586 (1976).
j. 1'. J. Heemskerk and IC A. Sehouhamer Immink, Philips Tech. Rev, 40. 157·164
(982).
H. Ogawa and K. A. Sehouhamer Immink, Proc. Premier AES Conf. Ryetown, 117-124
(982).
J. M. W07.eneraft and I. M. Jaeobs, 'Prineiples of Communication Engineering'. Wiley
and Sons NIセVYiH
R. W. S. Chang, T. M. Jakubov and A. 1.. Gareia. IEEE Trans. Commun. COM·30.
1668-167g (l9g2).
F. E. Glave. IEEE Trans. Inform. Theory lT·18, 356-363 (1972).
E. Bigliere. IEEE Trans, lnform. Theory IT·20, 115·118 (1974).
R. O. Carter, Electron. Lett. 1,65-1)8 (1965).
F. K. BOVJers. US patent No. 2.957.947 (1960).
J. M. Od ffiths, electron. Let!. 5. 79-81 (1969).
K. W. Cauerrnolo and J. J. O'Reilly, 'Problems of randomncss in communication Engi·
neering', Pentech Pre.., London (1984).
T. M. Chien, Bel! Syst. Tech. J. 49, 2267·2287 (1970).
J. Justesen. IEEE Trans. lnform. Theory IT·28. 457·472 (1982),
S. aosÎ k, aeH Syst. Teeh. J. 51. 921-933 (1972),
G. L. ッ{。ャ ゥイセᅦ
and O. P. Tronea, lEEB Trans. Commull. COM·22. 1555·1563 (1974).
O. Brugia, R. Piotroiusti and A. Rover!, Alta Freq. 45, 695-7J2 (1976) (in ltalian).
O, L. Cariolaro, O, L. Pierobon and O. 1'. Tronea, Int. J. Electron. 55.35·79 (1983),
L. J. Groenstdn, BeU Syst. Teeh_ J. $3, 1103-1126 (l'/74).
D. A. Lindholm, IEEE Tran$. Magn. MA(:.-14, 321-323 (1978).
G. S. Poe, lEE PrC)c. P, Commun .• Radar and Signal proeess. U8, 323-330 (1981).
P. A. f'ranaHoK, BeU Sy,!. Tech. J. 47, 143·157 (1968).
S. Yoshida and S. Yajima, Trans. IECE of Japan E 59,1-7 (1976).
M. Morizono. H. Yoshida and Y. Hashimolo, SMPTE Journal 89, 658-662 (19g0) .
$. Tazaki, F. Takeda, H. Osawa and Y. Yamada, Proc. 5th Internat. Conf. On Vldoo aml
Data Rocording, Southampton, 79-84 (1984).
M. A. Parku and F. A. Belli., Proc. 4th iョエセイ。N
ConL on Viow ano Data Recording,
Southampton, 207·215 (1982).
A. X. Widmor and P. A. Franaszek, Electron. Lelt. 19,202-203 (1983),
A. X. Widmer and P. A. FranaHek, IBM J. Ro•. Nーッ、カセd
27, 440-451 (1983).
G. V. Jaeoby. IEEE Trans. Magn. MAG·U. 1202-1204 (1977).
A. M. Patel, IBM J. Res. Develop. 19,366·378 (1975).
M. O. Peleha! and J. M. Geist, IBEE Trans. Commlln. COM·23, 878-883 (1975). Correttion COM·24, p. 479, 1976.
P. D. Shaft, IEEE Trans. Comml.ln. tOM·n, 687-695 (1973),
K. A. Sehouhamer Immink. Electron. Let!. 19, 323·324 (1983),
1'. A. Franaszek. IBM J. Res, Develop, 14, 376·383 (1970).
a.
JO
Binary cht'mnd codes
M M M M MG M M M M M セM M M M M セL
PERFORMANCE OF SIMPLE BINARY
DC-CONSTRAINED CODES
a「セエイャ 、
In digiLal transmissîon セゥ is ウ・ュゥャ ッセ
desi.t'able for thc: c.'hannel stream to
hilve low power ncar 7-cro frcquoncy. Suppression of (ho low-freqllency
"'lmponents is achicvcd by conslraining the unbalancc of thc lransmitlcd
Rale and spectra! proport.ie, or lInbalancc
p<lsilive a.\d nesativc ーセャL\ウN
constrained cooc., wilh biIlary symbols basco on simplo bi-mode cooiog
scheme, arc oalculatod.
I. Inlroductîon
The field of application of digital channel codes with suppressed low frequency components is quite broad. We find applications in transmission sysor metallic cable 1.2) in storage media such as magnetic 3,4) or
tems over ヲゥ「イセ
optical recording セIM
tイオョウュゥセ ッョ
systems designed la achîcvc DC-suppres8ion are mostly based
on so-called block codes, where the セouイc・
digits are groupt:d in source words
of m digits; the souree w()rds are trans!ated ming a code baak into blocks of rl
digits called codcwords. Cattermole 0,7) and Griffiths 8) designcd binary block
codes based on 」ッ、・キッイ、セ
having zero Ol' low disparîty, where the disparity of
a codeword is defincd as the difference of the number of ones and the number
ql' zeros in the codeworc.l. In thc セェューャ・ウエ
code type the souree words haw twO
alternative translations (modes) being of opposite disparity polarity. The
choice of a particular codeword polarity is made in such a way thaI. the socalled running digital sum or tht sequence aftel' エイ。ョセュャウゥッ
of the new codewhere the running digîtal sum is defined
word is as close 10 zero as ーッセウゥ「ャ・L
for a binary stream as the aceumulated sum of ones and 2eros (a zero countcd
as - I).
of the power spectral density function of zcro·disAn analytic ・クーイセウゥッョ
parity codeword systems was derived by f;'ranklîng and Pierec セIN Examples
Performance of simpte binary DC·constraÎned codes
11
of the computation of the spectra of low-disparity codeword based systems
were given by Lindholm IQ) and Poo 11). They applied the procedure given by
Cariolaro et al. 12,18) using matrix analysis. The procedure is straightforward
and very wel! suited to machine computation. For large codeword length and
large number of encoder states the memory requirements of the procedure
become prohibitive. In this paper simpte expressions are derived of the rate
and power spectral density functions of simpte, bi-mode, OC-constrained
channel codes.
Section 2 starts with a brief description of the properties of maxentropîc
unbalance constrained sequences. The properties of these sequences are used
to establish a new figure of merit of DCconstrained codes. In sec. 3 the power
density function of low-disparity based channel codes is computed. The
variance of the running digital sum (in short sum variance) of the channel
codes, adopted here as a criterion of the suppression of the energy near De, is
calculated in sec. 4. In sec. 5, using the theory developed in Sec. 2 we intend to
answer the question: How good are these simple channel codes for thc given
redundancy and resulting suppression of low-frequency components?
2. Properties of z, sequences
A designer will aften be confronted with the qucstion of how good his system is with respect to the redundancy of the code and the resu1ting suppression
of low-frequency components. There is a need for a yardstick to measurC the
performance of DC-suppressed channel codes in an absolute way. To that end
asymptotic properties of sequences so constrained that the running digital sum
(RDS) of the sequence will take a Iimited number of values are discussed in
this sectîon. Thc results will be used to derive a figure of merit that takes into
account both the redundancy of the code and the resulting frequency range of
the sequence spectrum wîth suppressed componenK
Consider binary sequences x = (XI, .. " Xi, •••), Xi € ( - 1, Ij. The sc-called
running digital sum (ROS) of a sequence plays a significant role in the analysis
and synthesis of codes of which the spectrum vanishes at the low.frequenc)'
end. The RDS Zi Îs defined as:
I
Zi""
L
Xj.
}=l
Chien 14) studied sequences X taking a finite number of RDS values, so-cal!ed z
(-constrained) sequences.
He calculated, using a Markov source model, the information capacity of z
sequences as a function of the number of allowed RDS values. The maximum
number of RDS values a sequence takes is aften called digital sum variatioll._
12
Binary channel codes
aセ ッイ、ゥョァ
to Chien the maximum entropy of a Markov informatioD source
with N allo wed RDS values is given by:
C(N)
+ 21 0g cos (N:
I
=
(l)
I )"
More proper ties of z sequenccs were derived by Justesen 10), who calculated
lhc ウーセ」エイ。ャ
propertie., of a maxentropic z-constrained souree, i.e. of a
Markov source wÎth t.hc transition probabilities chosen in such a way that the
maximum entropy,
souree 。セィゥ」カウ
JusteSeD devcloped a useful time·domain measure of the low-frequem:y
propertier; of DC-constrained sequences. He defined Wo as t.he (Iow-frequency)
cut-off frequency according to: (sec abo ref. 9)
1
rH(WoT)
I
l'
=
where H(wT) is lhe power density function versus frequency and Tthe time
duration of a channel syrnbol. Justesen found a remarkable and very useful
relation:
1
w oT""2s 2
(2)
'
wherc $2 is the sum variante of the seguence.
For the examples of channel codes investigated by Justesen this empirical
n::lat.ion was found to be vcry accurate. This motivated us to use tbc sum
varianee as a criterion of the channel code's low.frequency propertÎcs (eq. (2».
This is of practical importance because the sum varianee of a sequence is often
casier to calculate than the complete spectrum. The sum variante of a maxentropie z sequençe Îs given by ref. 15:
N
aZeN) ""
セ M nKャセ
サセHnK
.. "
2
l) -
k}Z
ウ○ョRHセIN
N+l
(3)
k=l
Table I lists thc capacity and sum variance versus the digit.al sum variation N
(eqs (l) and (3)).
The asyrnpLotic behaviour of the capacity and the sum varianee for large
digital sum varial.ions N can be derived:
)t2
C(N) - I -
(2 In 2) (N
+ ャIセ
,
(4)
and
(5)
Performance of simpte binary DC-conslrained codes
13
TABLE I
Capacity and sum variance of maxentropic z sequences
with digital sum variation N
N
G(N)
a 2 (N)
3
4
0.5
0.694
0.792
0.85
0.886
0.91
0.93
0.5
0.80
1.17
1.59
2.09
5
6
7
8
9
2.643.26
Wîth eqs (4) and (5) the following important bound between the redundancy
1 - G(N) and the sum varîance of maxentropic z sequences is derîved:
7<2
0.25
> (1
- C(N»
Hヲセ nI
--I
> 6 In 2
4
'" 0.2326.
(6)
Actually the right-hand bound is within ) % accuracy for N> 9. Combining
eqs (2) and (6) yields:
woT"" 2.15(1 - G(N».
This expressîon dearly shows the linear trade-olf between the redundancy and
the cut-olf frequency of maxentropic Zo sequences. In sec. 5 this relation is used
to establish a figure of merit of DC-constrained chanoel codes.
3. Simple coding schemes
First some properties of codîng schemes based on codewords with an equal
number of positivc and negative pulses, so-called zero-disparÎty codewords,
are discussed.
The number No of zero-disparity codewords with n binary channel symbols
(n even) is given by the binomial coeffident
The code rate R is defined accordîng 10
14
Binary channel codes
The コ・イッM、ゥセー。イャケ
codewords are concatenaled without a merging rule. In
other words the sequence is encoded without information about the history
Practical
and a fixed relationship exists bet ween codewords and source キッイ、セN
demand the number of codewords to be a power of two, 50
coding セ」ィュ・L
that. a sub,et of the No availabk codewords should be オセ・、L
which effectively
lowers the code rate R. Here only 'full set' cading schemes are considercd.
A generalization of lhe coding principle using zero-disparity codewords
leads to tht so-called alternate or low-dispadty coding 8). Besidcs the set of
codeword, having zero-disparity sets of codewords with nomero-disparity are
uscd. The simplest code type has two alternate representat.ions (modes) of the
セッオイ・c
words. The two alternate representations have opposite disparity, the
choicc of the positive or negativc representation is determincd by the polarity
()f lhe ROS just before transmission of the new codeword. The cholee is made
in such a way thal the absolute value of the RDS aftel' transmission of thc new
codeword is rninimized, I.e. as close to zero as possible. Zew-di,parity codewords can in principle be uscd in both modes. For ease of implementation
zero-disparity codeword, are sometimes divided into two sets to be uscd in
both modes BZセQYIN
It is deal' if more subsets of codewords arc used that the
number of codewords is larger than in lhe case of zero-disparity encoding (assuming equal codeword length). Consequently tbis allows a larger maxim urn
code rate for a given codeword length. Unfortunately the power in the lowfrequency range wW also înerease if more subsets arc used sa that a trade-of[
be/ween code イ。エセ
and low-frequency content has to be found. In the following some propertîes of low-disparity coding arc derived.
Let a codeword with Icngth n (n even) consist of binary symbols Xi, I '-i.; i,;;; n,
Xi f;; I - I, 11. 'I'he disparity d of a eodeword is defined by
d=
"
IXI'
,=1
Assul1le funher that a セ」エ
of eodewords S+ is 、セウi|
with zero and positive
disparity aml a ,sct S- with elements of zero and negative disparity_ Sct S+
consists of K + I .subSCls Su, 5" S2, ... , SK (K «: !n); Ihe elements of the subsets SJ are all codewords wÎt.h disparity 2) (0 <j.,,:; K)_ The codewords in Scan be found by inversÎon oF all n symbols of the codewords in set S., and
vice versa.
The cardinality NJ of l.he subset Sj is simply given by the binomina! coeflident
Performance of simpte binary DC-conslrained codes
i5
The total available number of codewords in S.. is
K
m]iセL
j=O
so that the code rate is
I
R"= n2logM.
As the disparity of the cooeworos is chosen sueh that the RDS after transmission of the codeword is minimized, it is not difficult to see that during transmission the running dighal sum takes on a finite number of values. Without
loss of generality it ean be assumed (by properly choosing the initial sum value
at the beginning of the transmission) that the sum values are symmetrically
centered around zero. The set of values (states) the RDS aSSumes at the end
(or start) of a codeword, the so-called terminal or principal states, is a subset
of the RDS values the sequence can take.
Let the terminal digital sum of the k-th codeword be D(k). The sum after
transmission of the (k + J)-th codeword is D(k+l' = D(k) ±d, where dis the
disparity of the (k + lHh codeword. The sign of the disparity (iC d f- 0) of the
codeword is chosen to minimize the aecumulated sum D(hl). A code with this
property is said to be balanced. Wc find by inspection that D(k) can take On
One of the 2K values ± 1, ± 3, ... , ± (2K - 1). It can easily be found that the
total number of RDS values the sequence can take within codewords, I.e. the
digital sum variation, is given by
N
= 2(2K - 1 + セョI
4K + n - I.
=
+J
(7)
AS an illustration the RDS as a function of symbol time interval, the so-called
unbalance trellis diagram, is shown in fig. J. The code has codeword length
n = 6 ano it uses the maximum number K + 1 -- tn + 1 = 4 subsets. Note
the 2K = 6 possibIe Sum values at the end of each codeword and also the
N", 4K + n - 1 = 17 values that the RDS ean take within a codeword.
In the computation of the power density function and the sum variance of
the encoded stream we need the stationary probability of being in a certaÎn
terminal state.
Assume the source blocks to be generated by a random independent proce55
then the signal process D(k) Îs a simple stationary Markov process. Tbe value
that D(k) can take is related to one of the 2K states of the Markov proccss.
The state transition matrix P, witb entries P(i,}), where PU,}) is the probability that the next codewOrd will take it to terminal state i given that lhe
16
Binary channel codes
5
Q)
3
g 1
.Q -1
E -3
§ -5
o
3
2
1
4
5
6
time ----...-
fig. I. Uni)alançe trelli, diagram. tィセ
in :-:n.ttt" 3.
lhick Cll(v€ shows tlle path of lhe wdeword '
+ - - - + -'
セャN。イエゥョァ
ençodçr is currentty in state i, can easily be found. As an illustration we have
written down the general matrix P for 2K = 6 terminal states:
··5
p=
-3
-I
3
5
0
P3
0
0
Ps
-5
0
0
po
I
3
Po
PI
pセ
pa
0
0
Po
0
P,
po
P2
Pa
P.
PI
0
ps
P2
Po
PI
P2
0
Pij
0
0
P'l.
P,
PI
ーセ
-3
-I
5
We find that P(i,f) is tht proportion of codewords in the mode llsed in state i
having the appropriate disparity d for thc lransition to encoder state} = i + セ、N
The transition probabjlity Pi equals lhe retative number of codewords in subset Si, or
Ni
0 -:( i.":; K.
Pi = M'
Dlle to the special struchm: of the matrix P we can find the statÎonary
probability vector 11: wilh entries (n(K), , .. , n(I),71(I), ... , n(K», where n(i)
is the probabiHty being in the eneoder state i with corresponding sum vallIc
(21 1).
K
() n(K - I)
=
I
Ph
j=-K-'
0
«i «K
- I,
(8)
17
Performance of simpte binary DC-consrrained codes
where
Q
is determîned hom the normalization
K
L nU) =
0.5.
1=1
With eq. (8)
K
Q =
22: ipi.
(9)
jJlli
The proof of eq. (8) is by direct verification of the identity
nP""n.
This amounts to:
n(K)PJ
+ n(K -
l)pJ-l'"
+ n(l)PK-J + n(2)PK_J+l . ..
= n(j
+ I),
j
=
0, ... , K - J.
After evaluatin,g we ardve at eq. (8).
3.1. Computation of the spectrum
The codeword symbols are, in general, transmitted on some standard pulse
shape g(t) at intervals of duration T. The signal can be expressed by:
a>
x(t) '"
L
Xi
g(t - i T),
lrz-w
where XI is the codeword symbol value for the time slot i T < t < (i + I) T. The
values that XI assume are determined by the code and the source words which
are to be encoded. The auto-correlatîon fUfietion of the encoded sequence
R(k T) '" E[XiXiH] is cyclo-stationary with period nT 20 ). On the assumption
that the proeess is ergodic, the power spectral density funetion is 20)
H(w T)
=
T1 I G(w T) Iセ
W(w T),
where G(w T) is the Fourier transform of g(t). W(w T) is given by
W(w T) "" R(O)
.,
+2L
R(k T) cos (k (LJ r).
<=1
In the sequel we assume I G(w T) I '" T.
The determination of the power spectral density, lhen, requires the calculalion of thc auto-corre1ation function R(kT). Bosik 21) and Cariolaro et
al. 12,13) have given a complete and systematic analysîs to fiud the auto·correlation funClion of fixed-Iength-codeword-based channel codes. Thîs analysis cM
directly be used to compute the spectrum of alternate bÎ-mode codes (see for
Binary channe! codes
18
examples refs 10 and 11). However. the structure of full set alternatc bi-mode
codcs allows a more efficient cornputation to be presentcd here. The advantage
of our approach is t.hat wc obtain a closed formul<l. for the spectrum of codes
with two subsets and sirnple expressions for code spectra with a larger nurnber
of subsets.
Grl;:cnW:in iA) studied the spectra of a class of DC-suppressed channel codes.
This class of channel codes uscd the 'polarity bit.' or Bowcrs principle 16). He
observed an important feature of th is blocj(·codcd binary sequence that aha
holds for the alt.crnate code principle. He found that the correlation between
symbols Xit and Xj")' f h, depends on1y on the number of blocks scparating
these twO syrnbols, i.e. on the number of codewords in the interval UI ,12). lf
the .5vurcC syrnbols are generated by a random and independent process the
consecutive codewords abo would be uncorrelated. Any correlations are therefore due only tv thc inversions of t.he codewords. II follows that. the expcctation E[xj, Xh] depends, at most, On the number of possible codeword alternations between x'il and XhThis nice property holdsif all possible codewords of the subsets are used. If
these subsets are truncated, for exampIc for practical reasons to a power of
two, thc following analysis can only be used as an approximation.
The auto-correlation funetion RUT) can now be exprl;:ssed in terms of a set.
of numbers rU), being the correlation between any two codewords having i
codewords bet.ween thcm IA)
R(U
+j
n} T)
=セ
{(n - i}rU)
+ i rU
+ I)],
j? 0, I ,;;, i
セ
n,
R(O) = L
(In)
In other words the auto"correlation fUllction R(i T) can be found by a Iinear
interpolation of the set of numbers r(i). The spectrum of sueh a sequence is
given by
セ
H(w T) = I - r(O)
+ n P(w T)
\r(o) + 2
tI
r(i) cos (w inT) } ,
wherc
F(w T)
=
sin Hセョ w T)
- (I
T)'
n sm
'2w
(11)
The spectrum vanishes at w ,... 0, sa that
I + (n - 1) 1'(0) + 2n
L"" r(i) =
;=1
O.
(12)
Performance of simple binary nC.conslrained codes
19
As a direct consequence we derive for uncorrelated sequences with rU) = 0,
f 0, for example codes based on fixed-disparity codewords;
i
r(O) = -
1
n-I
--.
Thc conesponding auto-correlation function R(kT) of the concatenated sequence is found with eq. (l0)
R(kT)
I
=
k
0,
\k - n) .
0< ke:;:;; n,
o
k> n.
n(n - 1)
The power density funetion of セ・イッᄋ、ゥウー。イゥエケ
codeword based channel codes is
n
1
T H(w T)
=
-n=-T II -
=
F 2(w T)J,
which agrees with earlier results of Frankling and Pierce 9 ).
The calculation Of the numbers r(i), if K> 0, is a tedious but straightforward evaluation of Cariolaro's results, therefore we merely state the results.
The correlation function rei) is given by
r(i
+
1)
=
Cf pi c 2 , i;, 0,
(13)
where P is the state transition matrix and
The 2K-vectors C1'and C 2 are given in the interval 1";;: i,;;;; K by:
CI(i+K)
=-
セサᅫ n
j=i+l
(J-i)7r(j)Pi-i- I'u+J)ll(J+ I)P'+i}
oセェ
and
(14)
For symmetry reasons:
and
C 2(i)
=
-
Cz(2K - i
+ 1),
1
< i -,{.,K.
The correlation coeff\cienl r(O) is not found by eqs (13) and (14). The number
r(O) equals the correlation of symbols in the same codeword or r(O) = E[xitxiol.
where the symbol positionsj, andjz are in one symbol andj,
+- 12. The coeffi-
Binary channel 」ッ、ヲZセG
20
eienL r(O) ean with su!fiócnt aecuraey be eomputed with cq. (12). A dosed
expression is given in the section on the computation of the sum variançe
of
,
altemate ッ、」セL
セ・
eq. (21).
Example 1
ln the case K = I the prceeding results for the spectrum and correlation function become manageable. We find the stationary probabilities n(1) = n(2) = i.
The cardînalitics of the two subsets are given by
so that
n
NI
pj =
No
+ nセG
+ 1)
2(n
=
.
Substitution in eq. (14) yields:
C,(1) "'" - Cl(2) '""' - !?l. = _
n
2(n
1
+
l)
and
The 2 x 2 transition matrix P is given by
p
=
{セ
Zセ}
=
[n : 2 n
セ
2] /2(n + 1).
A funher evaluation of cq. (13) gives:
rU)
=
-
(n '11IY+1'
i> O.
Substirutjng in cq. (ll) gives the spectrum of thc alternate code with two subsets
{ U Z n ZfセHキtI
}
1
(15)
T H(w T) = (I - a) I·· I + a" + 2a cos (n ü} T) ,
where
I
a=-n+1
Evaluating yields the second"derivalive of the spectrum at De:
HJI(O)
-----r= Hn
+ 2)(n + 11).
Performance of simple binary DC-constralned codes
21
Lindholm 10) and Pao IJ) have given examples of the computation usÎng the
matrix procedure of Cariolaro et al. One of their examples, the spectrum of
the Sb 6b code, can be used to evaluate the accuracy of the preceding analysis,
when the subsets are truncated. The Sb 6b code is basically an n = 6, K "" 1, bi·
mode code with 6 of the possible 50 codewords deletcd. Poo suggested to delete
toe codewords ' + + + + - - '. ' - + + + + - " ' - - + + + +' and their in>Ierses. A recalculation yields the power density function of the Sb 6b codc dc·
pietcd in fig. 2. The power.density function of the 'full·set' n = 6, K = 1 chan·
nel code, using eq. (15), is plotted as a comparison. We note a good agreemenl
(a few dB difference) between the spectra.
K-'.n-6
k=----
lil
"'!;J
PSNセ
-40
-SOIl....l.........l...l..J--........_
0005
......................................,J--_..............-
........
Ol
0.01
frod of channel fr@q.
Fig. 2. Comparison of the spectra of Sb6b code and the bi-mode n = 6, K = I code.
4. Computation of the sum variance
An important frequency·domain property of DC·balanced codes, the cUl·
oft frequency can be estimated, using eq. (2), by the sum variance of the code
stream. In this section we derive a simple closed-form expression of the sum
variance of DC-balanced bi-mode channel codes.
The process of encoding using the alternate code principle is cyclo-stationary
with period n T21), 50 that the sum varianee of the sequence has to be found
by averaging the running sum variance over all n symbol positions wilhin a
codeword. iherefore the running sum variance at all symbol positions withill
the codeword has to be determined.
Define the value of the digital sum at the k-th position in a codeword 10 be
Zk. Assume a codeword starts at Z(l, one of the K possible positive terminal
22
Binary channel codf's
sum values (the statistical properties of codewords starting at ncgative sum
values can be found by symmetry).
The digilal sum at the k-th posilion is given by
k
Zk
I
Zo +
=
I";;: k 0:;; n.
xm ,
m=l
The running sum varianee at the k-th position given Zo is
E[d
I zo]
=
E[(Zo +
k
=
tI }セIュx
k
k.
1(-1
E[Û + ,J;, x;, + 2z o ];1 X + 2);1 J,h;./},Xh] '
m
wherc lhe operator E[ ] averages over all codewords.
A nice property of ruil codeword subsets is lhat E {クィセゥLャ
and E[x},]. jl +- h,
are not a fundion of the symbol positions Jl and h.
Dcfine the short-hand notation: E[xj,] =A and E[x},x},) = B; 1 "':;j.,jz セ n;
Jt fh·
Substitution yields thc running sum variance at the k-th symbol position
I zo]
E[;;;
=
z3 + k"
+ k(k
2kA Zo
(16)
- 1)B.
The sum variance si of the sequence, if starting in zo. sk I Zo, is found by
avcraging the running sum varîance over all n symbo! positions
sk I Zo
'=
z$
+ セHョ
+
1
'= -
n
L
Il
E(z;
I ZII)
kcl
1) t· A(n
+ I)zo + ä(n'
•. 1)B.
TakÎng into account the probability starting in Zo and averaging yields for the
sum variance
si
K
si セ
E[z3l + セHQ
t 1)
+ Hn' - I) B + 2(n
+ I) AL (2i
- 1) n(i).
(17)
;=L
We eliminate A by !loting the periodicity, i.e. E(zJ]
(16) yields
E(z,; I zo]
=
zg
+ n + 211 A Zo + n(n
=
e{コセ}N
Evaluating eq.
- I) B
and averaging yiclds
K
Elz,:] = E[z$)
50
that with E[d]
=
+ fI + 4n A
L
(2i ,=)
+ n(n
e{コセ}L
K
ZA
I) n(i)
L (2i 1",,1
I) nU)
=
-
Hl
+
(n - I)
BJ.
- 1) B.
Performance of simpte binary DC-conslrained codes
23
Substitution in eq" (17) yields
s1- = eHコセャ
The varianee e{コセ}
-
Hn 2 -
I) B"
(18)
is given by
:«
E[z5] '"' 2
L (2i -
1)2 n:(i).
(19)
1=1
4.1. Computation of the correlation B = E[XJlXh]
We now calculate the correlation of the symbols at theh-th andh-th symbol posîtion in a codeword. It is obvious that for iJ == h: E[x),xj,] = 1. If
i. 1= 12 same marc work is needed" In that case
E [Xj, XjJ
= Prob(xJ! = Xj,)
- Prob(xh
= 1 - 2 Prob(xJl "" Xi,), h -+
*
x),)
h"
(20)
Assume a codeword to be an element of subset SI in S•. The probabîlity that
a symbol at position h in the codeword equals I is
Prob(x}l
= I I S = Si)
.!.-n ョセH
--
+ i).
The probability that another symbol at position )2 ",), within the codeword
is -I is
1
Prob(xj, = - 1 I Xj, = I, S = Si) = セョ セ
.
; .
So that
and with eq, (20)
4F
E[xhx»
IS=
SI]
=
-
n
セMᆳ
n-I
lf we further take into account the probabîlity Pi that a codeword is an element
of subset Si we tind for the correlation
1
B=E(x1J Xh] = _
_. -4
n
セャ
f
=
n- I
'2
Pi
(21)
Binary "hannei codes
24
Combining with eq. (18) yields
s; = E[z31 + t(n +
I)
{I - セ
t PI} ,
(22)
/2
Define
K
um=:Li"'PIl
(23)
mE(I,2,3}.
i=l
After some algebra combining eqs (8), (9), (19), (22) and (23) yields the
varianee of Lhe terminal sum values
(24)
and the sum varianee of thc complete sequence
(25)
The computation of the sum varÎance was til! now generally treated. Eq. (24)
is only based on the aSSllmption of the DC-balanced binmode struclure of lhe
transition matrix P. In cq. (25) we further assumed Lhat the expectations are
invariant with respect to the position in a codeword. In the next examplcs we
substitute values of the cardinalities of the subsets in various code embodÎmenis.
EXil.lI1ple 2
The special case of zero-disparity eodeword bascd systems, i.e. K
(E[z3] = 0)
sJ
=
セHョ
=
0, yiclds
+ 1).
ThÎs result was earlier obtained by Justesen 16).
Examph: 3
lf (wo subsets are used for eneoding a simpte result ean be obtained. We
found (see exarnple I)
n
PI
=
2(n
+
l) .
Substitution and working out eqs (23) and (25) yields
sf
=
A(n + 5).
Performance of simpte binary DC-conslrained çodes
25
Example 4
Bowers 18) and Carter l7) proposed a construction of DC-balanced codes as
being attractive because no look-up tables are needed for encoding and decoding. They proposed a code with (n - 1) source symbols being mapped without modification onto (n - 1) symbols of the codeword. The addiüonal n-th
symbOI of the codeword, the so-called 'poJarity bit', is used to identify the
polarity of thc transmltted codeword. Assume that the encoder is designed in
such a way that the first (n'- 1) symbols equal the source symbols and the n-th
symbol is one. If the sum at the start of the transmission of a new codeword
and the disparity of the new codeword have the same sign then all symboJs
in the codeword (including the po!arity bit) are inverted before transmission.
If the disparity of the codeword is ッイ・セ
then the po!arity of the codeword is
randomly chosen. Accordingly, the nllmber of available zero-disparity codewords is reduced to half of those used in the bi-mode OC·balanced codes as
described in sec. 3. At the receiver a codeword inversion call be Iloticed by the
sign of the polarîty bit.
The spectra! properties of the 'polarity bit' encoding principle were studied
Greenstein used a computer simulation
by Greenstein (3) and Brugia et al. NIセQ
to estimate the power spectra! density function and Brugia et al. applied Cario·
!aro's numerical method 12,18). With the precedîng analysis to calculate the
sum variance of K + ! subsets based channe! code a very simp!e expression for
the sum varianee of the polarity bit encoding construction call be derived.
The code rate of the polarity bit code is
R
=
I
_1.-.
n
The number of subsets is K + I '" ョセ + 1 (n even), sa that the number of terminal sum values is 2K = n. The effective number of zero·disparity codewords
No is halved by the random choke of the 'polarity' of these words with respect
to the maximum number used in the low·disparity coding principle, Le.
No"
+(i:)-
The number of codewords having nonzero disparity is not changed
The total number of codewords having zero or positive dîsparity is:
M"" 2n - 1 •
26
Bin(1ry channel codes
Using some propertjes of binomial coefficients a routine COlTIplltation yields:
Uj =
n HセZI
U2 ='
セョ
2- C"+1),
and
Evaluation of eq. (25) yields
sft
where
セ[ク。ューャ・
c=
H2n -
1),
(26)
st is the sum variance of the polarity bit encoded sequence.
5
lf all possiblc codewords are used, i.e. K"" !n, thc following results are
derived:
and
2
1
sl" = 1i(5 n - 1) -
n + 1
12M 2".
Other values of the number of subsets dÎd not yield sirnple rcsults. Using a
computer s; l;an he found as a funetion of K and n,
Thc rcsults of the computations are collected in table 11, where thc redundancy 1 . Rand thc digital sum variation N of the code arC a150 given (see
eq. (7».
TABLE II
Sum variance, digital sum variation N and
redundancy 1 - R of alternate codes
n
K
N
2
2
4
4
6
6
6
8
8
8
8
0
1
1
2
1
2
3
5
3
1
2
3
4
7
11
9
13
17
II
1:'J
19
23
s1
.5
1.167
U
2.56
1.83
3.20
3.94
2.17
3.68
4,92
5.32
1-R
.5
.208
.170
.135
.145
.107
.101
.U8
.092
.083
,081
Performance of simpte binary dcセ」ッョウエイ。ゥ・、
codes
27
After a study of tables land Il we arrive at the following conclusions: the
simple code with n == 2 and K "" 0, the so-ealled 'bi-phase' code, achieves 1000/0
of the rate and the sum varianee of the maxentropic sequence with digital sum
variation N = 3. This result was eartier derived by Justesen U).
A new result is the simple alternate code with n "" 2 and K = I achieving 100070
of the rate and the sum variance of the maxcntropic sequence with N = 5.
Fig. 3 shows for several codes the sum variante as a funetion of the redundancy I - R with K and nas parameters. As a reference the sum varianee is
5
k
l>
0
v 1
x 2
a Polorlty bit
05 '-.O-S----,O.1...,1----------"'l'.
0
RedlJndancy (log)
Fig. 3. Sum vacianoc and rcdundançy of various codes.
plotted versus the redundancy 1 - C(N) of maxentropic z sequences (see eqs
(1) and (3». Notice in the figure that the performance of zero-dîsparîty cncoding diverges with growing codeword length from the maxentropic bound.
Going to more subsets, K = 1, Îs worth-while in a large (l - R) range.
In order to obtain some insight into the accuracy of Justesen's relation, eg.
(2), the cut-off freguency was ealculatcd using numerical methods (egs (ll),
(13) and (14» and compared with the reciprocal of the sum variance of the
code. In the range given in table 11 we found that the relation between sum
variance and actilal cut-olf freguency is accurate within a few percent.
S. Efficiency of simple alternate codes
It is customary 22) to define the rate efficiency of a channel code as the ralio
of the code rate and the noiseless channel capacity given the ehannel eonstraints, or
28
Binary channd codes
R
e""
--'
C(N)
>
where eeN) is the capacity of the Chien channeJ (eq. (I» alld N is the digital
sum variation of the channel code.
As an example assume n = 4 and K = I. In table n we find in this case N = 7
arld R = 0.84, so that for this challnel code an efficiency e = 0.84;0.886 = 95070
(see table I) is concluded. The sum variance of the code is 1.5, which amouuts
to 1.5/2.09 = 72% of the sum varianee of the maxentropic z sequence with
N = 7. 11 is clear that the comparison of DC-balanced channel codet; with
maxentropic z sequeucet; should take into account both the sum variance and
the rate. We come to the following definition of encoder efficiem;y
E = ll.=....f.(!V) I H}セnI
(27)
[1 - rャウセ
The efficiency E at; defined in eq. (27) compares the 'redundancy-t;um variance
products' of the practical code aud the maxentropic sequence with the same
digital sum variation as the practical code. Notc that for N> 9 the 'rcdundancy-sum variance product' of maxentropic z sequences is appWximately
constant (see eq. (6» and equals 0.2336.
The efficiency l::-' of various codes verSut; codeword length it; plotted in fig. 4.
The polarity bit encoding principle has a simple Împlcmentation, but as wc
can notice from figs 3 and 4 ît is far from optimum in the depicted range. We
conc!ude from the figures that for a giveu rate a sum variance eau be expected
1.0
Q.8
w 0.6
>.
u
セ
"0
.....
.....
l.IJ
0.4
k"'--t:l--o-,o--<,,"-->----<l-----i:......-<J
I:>
0.2
0
" 1
x
2
" Polorîty bit
OO'--_..l-.--=-----L-_-----l._
_l........-----.J
o 4
8
12
16
20
Codeword length n
Fig. 4. Efficiency of ,imple allernale channeJ
」ッHャ セB
Performance of simpte binary DC-COflstrained codes
29
approximately 2.5 times largeT than that of maxentropic z or e.g. codes based
on alternate codes wilh K = 1. Eqs (6), (26) and (27) show that for large
codewords the efficiency E asymptotically decTeases to 350/0. ln the range
0.8 < R < 0.9 notiee for a desired low-frequency suppression the priee of this
simple code: a loss of approximately 1507D in code rate. Greenstein (ref. 18,
Pl'. 1125) concluded: 'If the quantity held fixed is the rate, the zero·dispaTity
code must use a larger value of the coJeword length than 'polarity bit' for
which its spectral superiority all but vanishes'. Quite a different conclusion
can be drawn from figs 3 and 4, showing the superiority of zero-disparity
codes with respect to 'polarity bit' encoding in thc most practical (l - R)
intervaL A ealculation shows that for an unpractically large codeword length
n> 160 the polarity bit encoding principle outperforms thc <:ero-disparity
encoding.
In the following wc study the efficiency of bi-mode codes when the codeword length is allowed to becorne very large. A rearrangement of the preeeding resuls is plotted in fig. 5. The figure shows the efficiency Eversus the
norrnalized parameter K' "" K/V;; with the codeword length n as a parameter.
0.8
BMャョセQPG
-
.....-
oNWQKMiLRHPヲBGエセhJ
,.
L,IJ
1;;
I
]I
.... ""1\
0,61t+++++-F,50.--l:::;rl'---boH:B::l-è セ
J
Qッ セ[PェBGMN
05
c "/'
0 4 ' ;mo
.
0.3
f--
'--I-.-f-.+-t-++-H
00
_.- _.+-+-+--1++-1
セG|N M f K エM ャ
_..""C+,.-!
...I--+++-t+l
,/
kf'
Tii
0.2
1
0.05
I
0.1
-kNn(log)
1
5
Fig. 5. Asymptotie efliciency of simple allernate channd cod<s.
The curves are plotted as solid lines; note however that only a discrete set of
points is achievable. The asymptotic curve with n --->- 00 was calculated using
the euor-function as an approximation of the binomial coeffi.cients (see for
example ref. 23). We notiee that the asymptotic curve achieves a maximum in
efficiency whcn 1(' = 0.47. The maximum efficiency E is in that case approximately 0.56.
Binary chann.e! codes
30
6. Conduliions
We calculatcd the power dengity funetion and gum variance of simp!e bimode DC-8Uppresging block·coded channel codes, We eompared the rate and
sum variance of the chamlel seq uence with maxentropic unbalancc eonstrained
sequenceS. Wc found that codes with a smal! rate have a good efficiency. Two
cOdeS with digital sum varîalion N = 3 and 5 werc shown to be optimal in
Chicrl's sense. Large rale codes become asymptotically bad with growing
eodeword length. l3y properly choosing the number of subsets of thc code we
earl maximize the code efficiency to approximately 56070.
Codes based on thc polarity bit principle have a simple implementatÎon, but
the efficiency is bad in the practical range of codeword length.
Acknowledgentent
The author is indebted to R J. M. Jacabs, who calculated and plotted thc
results shown iu fig. 5.
REFERENCES
') D. Il. Wat.er>, lnt. J. E1eclron. 55, QセY
(19B3).
') R. M. Brooks and 1\. Jessop, Inl.. .I. Electron. 55,81 (1983),
') .I. C. Mallinson and J. W. Milier, Radio and Elec, Eng. 47,172 (1977).
') M. Davidson, S. F. Haa,e, J. L Machamer and L. H. Wallmall, IEEE Trans. Magn.
MAG-U, 584 (1976).
') .J. P. J, Heemsk cr k and K. A. Sc ho u hamer 1mmi n k, Philip, Techn. Rev. 40, J57 (19B2).
') K. w. Cattnmok, Principles of pulse code modulation, lliffe Books Ltd., London, 1969.
') K. W, Canermolc, Int. J. Electron, 55,3 09S3).
") J, M, (,riffith" EleCtron. Let!. S, 79 (1969).
') J. N. I'rankling alld J. R, Pierec, I"'EE Iran•. COl'IllllUn, COM-20, 1182 (1972).
'ol D. A. LindholIll, IEEE Trans. Magn. MAG-14, 321 (1978).
'I) O, S. Po", IEl' Proc. F 128, 323 (1981).
") O. L. Cariolaro and G. P. Tronea, IEEE Trans, commllO· COM-U, 1555 (1974).
"') (,- 1,. Cal'iolaro, G, L. ョGセ「ッイ」ゥp
and G. P. Ttonca, Inl.. J. Electron, 55. 35 (1983).
") "f. M. ChieIl, BeU SyS(.. Tech. J. 49,2267 (1970),
>0) J. JusleSetl, IF.I':F' Tran,. Jnform. Theoty IT·28, TセW
(19B2).
1&) F, K. Bowps, US patent No. 2,957,947, 1960.
17) R, O, (:arter, Electron. Lett. I, Vセ HャYVセIN
") L. .I. Grecn'lein, BeU Syst. Tech. J. 53. 1103 (1974).
"') O. BrlLgia. R. Pielroi\lsti "ncl A. Roveri, AHa Freq. 45, (>95 (1976) (illltalian).
tol W. R. Bcnnclt, Heli :';y,t. ieeh. J, 37, 1501 (19.18).
") B. S. HO,ik, Heli Sy,t, Tech. J. 51, 921 (1972).
") r. A. Franaszek. IBM J. Res. Develop. 14, ,176 (1970).
") W, Fellcr, An intmduction 10 prol:>al:>ililY theory and its applications, volume I, Wiley and
8011, lnc., New Vork, 1959.
Construction of binary DC-constrained codes
31
CONSTRUCTION OF BINARY
OC-CONSTRAINED CODES
Abstra.ct
The systemalic design of DC-conslrained codo, basod on eodewords of
fixed lenglh is cOdsideted. Silllple recur.ion relalions for onumcraling lhc
numher of eodewords salisfying a conslrainl On lhe maximum unbatanee
of ones and zeros in a codewotd are derived. An onumeralivo schomo for
eneodÎng and decoding maximum unhalance conslrained wdewords wilh
binary symboll is clevcloped. Examples of COnslruclions of transmission
systems basod on unbalaneo eonslrained eodewords are given, A worked
example of an 8b lOb chann<;l ,00<; Î> given being of partieular interest
hecause of ils practical simplicilY and relalive dfideney.
1. Introduction
An important requîrement when designing a digital transmission system is
the shaping of the power spectral density function of the encoded stream by
adding redundancy 10 match the particular physical properties of the transmission channe!. Many practical cxamples of transmission systems can be
mentioned that do not pass the low frequencies with sufficient signal·to-noise
ratÎo. Filtering out the low frequencies can only he done if the cncoded signal
itself is not serîously dîstorted by this filtering. Shaping the spectrum of the
encodcd stream by coding can cope with this phenomenon. Shaping, however,
can only be done if some kind of redundancy is added to the source sequence.
The field of application of digital channel codes Wilh suppresscd low frequency
components is quite broad, We find applications in transmission systems over
and În storage media such as optica] recording 4)
fibre or metallic cable Q⦅セI
and magnetic recording 5).
Transmission systems designed to achîeve De suppression are mostly based
on so-called block codes, where the source digits are grouped in SOUrce blocks
of m digits. The m digit blocks are translaled using a code book into blocks of
n digilS called codewords. Cattermole 6,7) and Carter 8) gave examples of chan.
32
8inary channel coi!!!.
nel codes based on block codes with codewords having zero or low disparity,
where thc disparity of a codeword is dcfined as the difference of the number of
alles and zeros in the eodcword.
Justesen 9) showed a close relationship betweell the low-band cut-of!" frequency and t.he varianee of lhe running digital sum of a DC-suppressed encoded 'lream. The (running) digital surn is defined for a binary stream as the
accurnulated sum of ones and ,.eros (a zero eounted as -I) hom the beginning
of the transmission. Furthcrrnore Chang et al. IQ) found thal the sum variance
plays all important role in thc bit error rate when the transmission channel is
AC-coupled. The simpk codes as previously discussed have the disadvantage
that codeword length, the surn variance and rate have a fixed relationship.
In this paper the codc design is based on codewords having a constraint on
the maximum Dllmber of assumed sum values, i.e. those codewords ,He
delet.ed having a relative large contribution to the sum variance.
We start in sec. 2 with a brid description of the properties of scquences thaI.
assume a limited number of digital sum values. These properties are used to
est.ablish a figurc of merit of DC-constrained codes. The actual basis of code
dcsign is given in sec. 3. It provides a method for the enumerat-ion of codewords with an unbalance eonstraillt. In .general the effiçiency of a channel
code can be improved if Jarger codewords are allowed. Unfortunately the
complexity of the cneoding and decoding hardware increases exponentially
with inereasing eodeword length if a direct rncthod using look-up エ。「ャ・セ
of the
source words and their channel representations is used. Section 4 dcals with
algorîthms for encoding and dccoding of constrained codewords growing
polynomially with ゥョ」イ・。セァ
codeword length. Finally, in sec. 5 the problem
is addressed of sclçcting sets of codewords to design DC-suppressed channel
codes. A worked cxarnple of a new 8blOb transmission code is given showing
sup<;rior features with respect to other codes with rate 8/10.
2. Prollerties of z-constrained sequenees
This seclion provides a description of some properties of OC-constrained
generated by a Markov souree having maximum entrQPY- This
study gives a relation bet.ween the redundaney and the sum variancc of codes.
USÎng this theory we can derive a new figurc of merit taking into account both
the redundancy and t.he resulting frequency range with suppresscd components,
Consider binary sequences x = (x I , . . . , Xi, ... ), ,xi e I -I, 11. The so-called
(running) digital sum of a sequence plays a significant role in the analysis and
syruhesis of codes of which t.he spectrum vanishes at the low frequency end"
The digital sum ZI is delined as:
セ・アオ ョ・ウ
Construction of binary DC-constrained codes
Zi
=
I
33
i
(I)
Xj,
ja!
Chien 11) studied sequences X assuming a finite number of sum values, Sequences having a constraint on the maximum number of Sum values are
defined here as z (-constrained) sequences. The total number of sum values a
sequence assumeS is often caJled thc digitaJ sum variation. Chien determined
the information capacity of z sequences as a function of the number of
allowed sum values. His results will be briefty discussed in the following.
Taking z/ at any instant i as the state of the signal strcam x, then the bounds
on Zj define a set of allowable states of a Markov SOurce. Each transmission of
an additional symbol XI can be considered as a transition frortl OnC state tO
another. This transition can be represented by a so-called coonection matrix.
For the N state Markov source an NxN connection matrix D is defined by
D(i,) = 1 if a transition from state i to state) is allowable and D(i,) =
otherwise. The connection matrix D is for the z·constrained sequences given
by
D(i + 1, i) = D(i, i + 1) = I, J";:; i";:;N - I,
(2)
D(i,}) = 0, otherwise.
°
Note that D is a symmetrie Toeplitz matrix. As an example We have written
down the matrix D for N == 5.
Ol
D=
l
000
1
0
0
1 0 1 0
0 1
1
1
0 0
°
o
00010
The above representation is related to the input-restricted noiseless channel
studied by Shannon u). The Markov information source model enables us to
compute the channel capacity, dcfined as the number of bits per channel symbol that can maximally be carried by the constrained channel sequence gen·
erated by the Markov source. The channel capacity equals the maximum entropy of the Markov SOurce which is given by the base-two logarithm of the
According to Chien the
largest real eigenvalue of the connection matrix D iセIN
maximum rea I eigenvalue of the Toephtz matrix is given by
Je
=
2 cos
(_ft_)
N+ 1 '
(3)
so that the capacity is
C(N)
=
2JogÀ
=
I
+ セャッァ」ウ
(N: I)'
(4)
34
Binary channel codes
Mörc interesting properties of z sequenees were dcrived by Justesen 9 ). He
calculated thc power density funetion of z sequeneC$, Thc transition matrix of
a maxentropic z·constrained source was determined, i.e. a Markov souree
with transition probabilities chosen in such a way th at the souree has maxÎmUm cnlropy. In order to determine the transition probabilities th at maximi7.c lhe entropy of the N state souree, one must determine an eigenvector
p" = (pa(l), P.(2)", " Pa(N)) that satisfies (ref. 14, pp. ZW)
p.D '" Ap".
(5)
The joint probability pU,)) of a transition from stale i to jol' the maxentropic Markov SOUfce is given by
p(z', j')
'-1 D(t', j') pa «(!.,:.)!.. ,
Pa [
=
Il
.• =
I,.j
I , 2" ,
"
N.
Using the fact that. D is a Toeplitl matrix we find for the stationary probabiiity
p,(i) being in state i 9)
p,(i) =
Thc cut-aff frequency
N
(00
セ
I
ウゥョセ
(;/';
1)' 1< i.,,;: N.
(6)
of a DC-constrained sequence is defincd by 9,15)
H(wo T)
--1"'--
1
=
2'
(7)
wherc H(w T) is the power densilY function of the sequence versus frequency
and T the time duration of a channel syrnbol. Justesen found a remarkable
and very useful n;latÎon
(8)
where s! is t.he sum varianee of t.l1e sequenee.
This simple relation is not only restricted ((J maxentropic sequcnees; exampies of practical channd codes investigated by Justesen show that the reeiprocai relation (eq. (8)) between cut-olf frequency and sum variance ゥセ vçry
accurate. Thi$ rnotivated us 10 use the sum variance as a criterion of the lowfreqlH;ncy propel'ties of a channe1 code. This is of practical imponance because tbc surn varianee of a sequencc îs often easier to calculate than the autocorl'clation function or specll'um. The sum varianee of a maxenlropic z
sequem:e can bç found using eq. (6)
h;'(N +
Ij - k I 2
" 2 (
11k )
sm
Ntl'
(9)
Construction af binary DC-constrained codes
35
TABLE I
Capacîty and sum varÎance of maxentropic z sequences
with digital sum variation N
N
C(N)
3
4
5
6
7
8
0.5
0.694
HヲセHnI
0.5
0.80
1.17
1.59
2.09
2.64
3.26
0.792
0.85
0.886
0.91
0.93
9
In table 1 the capacity C(N) and sum variance versus the digital sum variation Nare collected (usîng eqs (4) and (9».
For large djgital sum variation N the sum variance and capacity can bc ap·
proximated by
(2 In 2) (N
C(N) - 1 -
q2(N)
セ HQセ
RセI
-
+ QIセ
(N
,
+ iIセN
This leads to following relation bet ween the redundancy 1 - C(N) and the
sum variance of a maxentropic Z sequence
0.25
セ
(1 - C(N» q2(N)
>
HセM
セ
I)
In 2
"" 0.2326.
(10)
Actually the right·hand bound is within 10/0 aCcunl.Cy for N> 9. This relation shal! be used later 10 establish a figure of merit of DC-constrained codes.
3. Enumeration of (N, Tl) sequences
Let x "" (XI,X2, ..• ,x.), Xp", 10, IJ be an (n) sequence. We deftne thc digital
sum Zk of the (n) sequence
k
Zk =
L: (lxi -
I)
+
Zo,
I s.; k <::<; n.
lol
The rescaling has been done in sueh a way that 'zeros' are counted as -I. The
initial sum is denoted by Zo.
Binary channel ("odes
36
Definitioll
A binary (n) sequence ゥセ çallcd an (NI, N 2 , n) sequence if Zk (I ,( k -,;:;, n) is
bOllnded between N, and N 2 (N2 ) NI)'
Without loss of generality choose NI '" J and N2 = N. We further use the
shorthand notation (N, n) sequence la denote a (1 ,N, n) sequence.
Let T= T(z",rI;N,zo) be the set. of(N,n) sequences fOr given length n, surn
constraint N, inilial sum Zo and terminal sum Zn, Thc cardinality of tIJc Sel T
can be füund by manipulatioo of the NxN connectîon matrix D.
Let I TI =A(zn,n;N,zo) denotc t.he number of distinel sequenceS of length
n starting in state Zo and terminating in state Zn, tIJen
A(z",n;N,zo} = D"(zo, z,,),
(11)
where D"(i,j), ャセ[
i,j 0;:; N, denotes the i,j-th element of !.he n-th power of
the matrix D.
In order la avoid many rnulliplications required in the matrix powers, observe [he specific struet.ure of the conm:çtion matrix D, We I1nd the following
recursivc rclalions for determinÎng [he number of sequences.
A(zo, 0; N, zo) '" I,
Let.
AU, 0; N, zo) = 0;
and
A(O,j; N, zo)
= A(N
+
a1l i f Zo
I,j; N, zo)
= 0; 0 セ[ス
< rI.
Let. furLher
A(i,}; N, zo) " A(i - 1,j·- I; N, zo) +
+ AU" l,J - I; N, zo); 1 セ
i
セ
N;
I セェZH
n. (12)
We can observe that exccpt at the boundarics the recursive relations are simîlar
the relations for the binomial coeffjçjcnt> in the Pasca) triangle.
10
4. Enumerlltive coding of (N, n) sequences
In [his section a general enumerativc tcehnique for deeoding (N, n) se·
quences is dev<:loped. Ta that end we r:stablish a 1 - 1 lTIapping fram a set Tof
(N, 'I) sequenees anta a set of întegers 0, 1, ... ,1 TI - 1, where I T I is the cardinality 0 I' jhe set T.
Let /(z",n;N,zo) be thc set of (N,n) sequences 1'01' the given constraints.
Thc sct. T ean be ordered Iexicographkally as follows:
lf x = (Xl, .. . ,x,,) and y = (Y1" .. ,y,,) are elemcrlls of the set. T [hen y is
called Iess lhan x, in shorty <ox, if th ere exists an i, I ,s;:; i BLMセ n, sueh that Yi <. Xi
and Xi = Y)I 1 <j < i. Fol' cxample '00101' <0 '00110'. The position of x in the
lcxicagl'aphieal ordering of T is defined to be the rank of x denoted by r(x),
i.e. r(x) is the nurn ber of all y io T with y <ox.
COl/Struction of binary DC-constrained codes
37
Theorem 1
The rank r(x) of the binary sequences x in T can be calculated according to
r(x) =
.
L A(st + 1. fI t=1
k; N, コセI
. Xt.
(13)
where
k-l
I
St = Z. -
(2x1 -
1).
i=l
Praal
According to Cover 16) the lexicographic index r(x) is given by
n
r(x) =
I
Xr n.(XI,X2 •... ,Xj_I,O),
J=I
where fI,(X"X2, ... ,Xk) denotes the numbCr of dements in Set T for which the
first k coordinates are given by (XI,X2, ..• ,Xt). The first (j - 1) coordinates of
x detenniue the state Sj
J-l
Sj=z, -
L: (2x
I -
I),
iセャ
as
n
)-1
Zn = Zo
+
L:
ja;:l
(2x1 - 1) - 1
+
L (2x
i -
1).
i.j ... t
The number of elemenrs in set Twith prefix (X"X2'" .• Xj_hO) equals the number of (N, n - j) sequences starting with digital sum Zo and terminating with
Sj + 1.
In other words
Example 1
o
Consider the set T(z.,n;N,zo) = T(4, 10;7,4). It follows from Z.=Zo that
all elements of the set have an equal number of zeros and ones. It can be verified that the sequence x = (l 1000101 10) is a member of the set and can be
ordered according to theorem 1. The ranking procedure is visualized in fig. 1.
The array in fig. 1 has N"" 7 columns and fI = 10 rows. The entries are given
by the weighting coefficients A(i,);N,zo). We can easily verify the recursion
relations eq. (12).
Ranking proceeds as follows. Start at the bottom at the z. '" 4-th column.
The Ieft-most symbol XI of the sequeoce x = (1100010110) is a I. Move ooe
38
"
0-.
0:B
3
6
@)
@......
10
14
ᆴセ
..@
'V
0
0
0
4
14
34
r(1100010110) = '190'
0
14
48
... @
1 セ⦅V
Fig. 1.
y
x
0
(2)-'8
1
2
4
.
Binary channel codes
AH・セゥャ。イ・ョ g
^G|。」セN iHp
IriangJe.
step in the X direction I.e., towards 68, and record the entTY at a single step in
Ihe X diTectlon, Le. 116 in an accumulator. The next symbol is again a I, so
mOvC agaln in Ihe X oirection towards 34 ano aod the entry 68 10 the current
accumulator vatue, I.e. 116 + 68 = 184. We further proceed as depicted in the
tlguTe. Eventually we arrÎve at r(1100010110) = 116 + 68 + 4 + 1 + 1 = 190.
The procedure given heTc Îs a generalization of a procedure given by Schalkwijk 17) using Pascal's triangle with the binomial coefficients as enlTies for
ranking codewords with a given disparity.
The invtrse function, conversion hom a given integer I to an (N, n) sequencc with rank I is also analogous to Schalkwijk's algorithm and can be
carried out as fol1ows.
iョvセt ・
algorithJil
Let an integer 1(/= 0,1, ... , I TI - I) ano the set T(zlI,n; N,zo) be given.
The following algOTilhm finds x such that r(x) = / (see Cover J6».
Step I
Ij A(z" + 1, ti - 1; N, zo) セ
n - I; N, zo) else set X, = O.
I then sel Xl
=
1 and set I
=
1- A(zn
+ I,
Step 2
SR
For k=2, ... ,n, ifA(sk+1,tI-k;N,zo)<I, Whe:TC SJ=ZO and
= SR-J-- (;2Xt-l - 1) then set XR = 1 and set J = 1- A(Sk + 1," - k; N, zo)
else set Xk
=
O.
39
Construction of binary DC-constraÎned codes
Till now the procedure was confined to the ranking of one set (N, n) Sequence8. 8ya straightforward generalization it is also possîble to rank sequenceS starting from a predefined set of states and all terminating in the same
state. Assume L codeword subsets starting with sum valucs gîven by ZI,
i= I,
,L, I.e. consider the set consisting of the subsets T(z.,n;N,z;),
i = 1,
, L. Ranking of tnc set can be accomplished with
•
r(x)
where
=
L
L L A(Sk + 1, n セ
k; N,
Zi) • Xk,
(14)
k=1 i=l
kaj
Sk = Z" -
L (lx
j
セ
1).
j=1
This follows from the fact that the subsets are disjoint. The inverse function,
converting an integer I to an (N,n) sequence, can also be done in the same
maDner.
The ranking given in theorem 1 enables the design of channel encoders of
moderate comp1exity. We need storage capacity for approximatcly N/2 x n
non-zero weighting coefficients, a full adder and an accumu1aror ro store rhe
intermediate and final results. This has to be compared with a look"up table of
2'" entries if a non-algebraic method of coding is used. The weighting weffi·
cii;mts look.up table ean be used for encoding and decoding, which is attrac·
live fOr application in recording. We do not need a multiplier because x Îs
binary valued and so the multiplications are simple additions. Lyon 1$) gave an
cxample of a practical embodiment for the coding of codewords with given
disparity using Pascal's triangle. His circuitry can in principle be used with
only a modification of the entrÎes in the look·up table.
5. Tl'ansmission codes based on (N,n) sequences
WheD the state-independent encoding strategy is used rransmission codes
that satisfy rhe digital sum constraint can readily be found. Enumerative en·
coding and decoding can always be used in a straightforward way. A drawback of tnÎs simple codîng scheme is that generally speaking good coding efficiency wil! be reached for 'large' codewords. A much more powerful merhod
is based On the state-dependent encoding principle. A problem will arise for
the assîgnmenr of codewords ro source words in all encoder stares, when the
algebraîc coding algorirhm and stare-independent decoding are to be combined.
5.1. State-independent encoding
In the special case of state-independenr encoding all codewords start and
terminate in the same sum state. Consequently the codewords have <:ero·dis·
40
Binary channel r.:odes
parity. The number of distinct (N,n) sequences can bc found by the enumeration method given in the preceding section. Table Ir lists the number of zerodisparity (N,n) sequences as a function of N and n. The terminal sum state
has been chosen to maximize the number of codewords. Transmission codes
can readily be designed with this tabie.
TABLE n
Number üf zero-disparity (N, n) sequences
rr
10
12
14
32
4
16
5 13 34
89
6 18 54 162
6 19 61 197
6 20 68 232
6 20 69 241
64
233
486
638
792
846
128
610
1458
2069
2704
2977
4
セ
N= 3
4
5
6
7
8
6
1;
8
E;\;arnple 2
lf n = 8 ,md N", 7 the table shows that there are 68 distinct (N, n) sequences
satisfying the given constraints. This cnables the mapping of m '" 6 souree bits
onto the n = 8 channd symbols. Hence the code rate R = mln = 3/4. Wc only
need 64 codewords, sa four codewords are deleted. lf the lexicographical
largest or smallest codewords are deleted then the enumcratîve coding scheme
can be applied without modification.
Tlle rate efficiency is defined with respect to capacity of the constrained
channel 13)
R
e = C(N) •
where Nis the digital Sum variation of the channel code.
The code of examplc 2 has a rate efficiency (see table I): 0.75/0.886 = .85.
The Sum variance. adopted in sec. 2 as a low-frequency criterion, could be
found by an exhaustive computation of the variance of all codewords. A computation growing polynomially in the codeword length n can be based on the
enumeration method of the preceding section. The probabllity thar the digital
sum equals i at symbol position j within a codeword is given by the number of
paths starting in ZQ, passing state i at position j and eventually tcrrninating at
Zn, dÎvided by the total number of paths from セq
to ;<;n,
ConSfructiOrl of binary DC-constrained codes
41
In the case of zero-disparity codewords we have Zo "" Zo, 50 that according
of zero-disparity (N,n) sequences can be found by
to the enumeration method the sum variance Dセ
s 2 ""
セ
セ /;:0
セiHM
tS1
1
セIRaHM
1- Z
-'N• Zo )A(-I,}.
-'N• Zo ) •
I, ti - j.
(15)
where the average digital sum Z is given by
z]セ
-
)
f.'セ l セャjf:ol-A('l,n-J;. N .zo)A(-'
l,j;N.z o
I
and the cardinality ITI = A(zo. n; N, zo).
The code of example 2 has a sum varianee Dセ = 1.38, which amounts to 66070
of the sum variance of a maxentropic sequence with N = 7. It is clear that the
comparison of DC-constrained codes with maxentropic sequences should take
into account both the sum varianee and the rate. Therefore a new figure of
meril, the efficiency E of DC-constrained codes is introduced. Eq. (10) showed
that in the caSe of rnaxentropic z sequences the product of redundancy and
sum varîance is approximately constant. The efficiency E is now defined as the
ratio of the products of redundancies and sum varianees of the maxentropic
z sequencc and the actual code, or
E == (l - C(N)j ç2(N)
(1 - R)S2
(16)
Fig. 2 shows the efficiency E of zero-disparity codewords of word length n
with N as a parameter. The efficiency grows asymptotîcally to unity with 10creasing codeword length_
1.0
0.9
w
:>.
0
c
0.8
.Q!
u
....
;;::
UJ
0.7
6.
"J
0.6
0.
l(
10
0
20
30
Codeword length n
Fig, 2.
ケ」ョセゥ ヲ e
E of zcrO-ÓiSNrity based codes.
40
Binary channel codes
42
5.Z. State-depel'ldent encoding
The case of state-independent encoding is straightforward: all セ、イッキ・
end
with the same sum value. More efficient coding, i.e. achieving a targer rate with
smaller codeword lengths, is possible if the ・ッ、 キッイ、セ
are allowed to terminate
in a predefined set of sum states. The problem is naw to establish this set of states
and codewords for an optimal transmission. fイ。ョ ウセ・ォ
12,19) has developed an
algorithm for determining the existenee of a set of sa-caJled prineipal states for the
design of collStrained sequences sueb as run-Iength-Iimited sequences. A necessary and sufficient condition for the existence of a code is the existence of a set of
principal states with the property that from each of these principal states there
cxists a sufficient (with respect to the numher of source wards) number of distînct
codewords la the ather principal states. The minimum number of paths from a
principal st.ate t.a another detcrmines the information rate. The met.had autlined
by Franaszek does not guarantee the important. feature of st.ate-independent
decoding. A code is state-independent decadable 19) if every mapping of a souree
word int 0 a codeword has a unique in verse independent ofthe encoder code pages,
where a code (boak) page is t.he subset. of codewords starting in a princlpal state.
This property guarantees that cvery (noiselcss) codeword he carrectly decoded
even when the actual eneader state is unknown. Franaszek showed that ,ueh
a unique inverse is always possible if only two principal state, are eha,en.
We colleeted in table Hl some of the results of the computer search for
codes using Franaszek's methad with the additianal constraint of two allowed
TABLE III
Codes with fixed length codewards, state-dependent erleoding
N
digit.al
N
R
イ。エセ
ョイオセ
n
In
5
4
3
5
5
14
11
24
19
R
.95
II
.99
19
.999
5
.94
.97
.98
.99
4
iii
fi
5
12
e
セ
7
6
7
7
14
n
7
24
19
21
8
16
14
ij
9
9
18
ZO
16
18
9
j(j
ij
6
7
19
22
Qセ
fi
7
.96
セ
.96
.97
variation, n códtwotd letlgth\ m number of souree: bits,
= mln, e rale efficiency - RIC(N),
43
Construction of binary DC-constrained codes
principal states. The number of codewords were truncated to the nearest
power of two. For this reasoo the sum variance of the codes is not known and
therdorc thc ratc cfficiency e= RIC(N) is used as a figurc ohncrit. Thc SOUrce
word length m is here chosen to optimize the rate efficiency of the code. Note
that in practical circumstances we generally do not have such a degree of freedom and consequently Some of the efficiency is lost.
We conclude from this table that good rate efficiencies are possible with
state-dependent encodîng. Most înterestîng are codes with sum variation N .. 5.
In this particular caSe it can be proved by evaluation of the matrix D that for
any codeword length the number of principal statcs for optimum transmission
is two and that the number Mof sequences from a principal state is (n even)
M"" 3"/2.
In the n = 2 case there are three distinct codewords: the zero-disparity codewords '0 I' and '10' (i.e. the 'bi-phase' words) and the codewords '00' and '11 '
to be used alternately. The rate of the n = 2 code when using a ternary source
3. According to table I this simple code achieves lOOOJu of the
alphabet is ァPQRセ
noiseless channel capacity. Also the sum varianee of this code equals the sUm
variance of the maxentropic N "" 5 sequenee 26). For binary souree alphabets it
is not possible to reach this efficiency, though according to table 111 wÎth a
codeword length n "" 24 a rate R = セ ean be reached, beiog 0.1070 below the
asymptotic bound.
Fig. 3 shows the efficiency Eversus codeword length TI. wÎth lhe digital sum
variatioo N as a parameter. The sum variance was calculated by an extension
1.0
セ
0.9
w
>.
IJ
0.8
---------
l::
.W
IJ
........
w
N
0.7
t. 7
v 9
0.6
0
x
0
10
20
11
30
Codeword length n
fig. 3. Efficiency of 'tate-dependent coding.
40
44
Binary channel codes
of eq. (15). The code efficiency was calculated taking into account all possible
codewords (no truncation to a power oftwo)_ Tlle efficiency of codes with sum
variation N = 5 is for arbitrary (even) codeword length equal to unity. We
nate the improvement in efflcîency with respect ra zero-disparity codeword
based eoding.
Till now we did not diseuss how 10 apply Ihe enumerative coding method
when state-dependent code book pages will be used. State-independent deeoding is not guarantecd îf the codewords in the code pages are lexicographically
ordered. Iftlleencodcrhas two pages andNis odd then coding can bedonein Ihe
following way. Let A o and A, be the code book pages. Codewords in A o have
zero or positive disparity. We order one ofthe pages. say A o, aecording to sec. 2.
A codeword in page A, is found from a codeword with the same rank in Ao by
inverting all bits if the codeword has non zero·disparity. Souree words are
represented by the same zero-disparity codeword in bath pages. The decoder
observes the disparit)' of a received codeword and if needed the codeword can
easily be mapped to the A o ranking. lf N is even this simple procedure cannot
be applied. In the next section an example is given to solve the N is even case.
5_3. 8bl0b channel code, a worked example
In certain applications of channel codes, sueh as digital recording on magnetic tape or transmission over libre-optie network it has been found t.hat a
'DC-balanced' channel code with rate R = 0.8 has attractive features 20-'8)_ In
this section a design example of such a code is described based on the theory
of the preceding sedion.
The somee signal is assumcd to have eighl parallel bits that arC translaled by
the channel encoder using a code book onto the channel symbols. The channel
symbols are sequentially reeorded on tlle recording medium such as ror example
magnetic tape. In this design examplc the code book has two ー。ァ・セZ
aHsセI
and
A(S,), where So and SI are the two encoder stales. Each page has 256 cntries,
enabling a ane-ta-one state-depcndent encoding. Dependent on the particular
transmitted codeword the eneoder state changes from state 8 0 to SI (or from
S, to So or it remains in ,he same state. Bath pages are sa ehosen that:
a) a simple encoding and decoding is possible, permitting a simple low-cost
design;
b) the concatcnated channel string takes on six digital sum values;
c) (he maximum run-length is live channel symbols, i.e. at mast five canSeculive like symbols can occur;
ct) state-independent decoding is possible. Upon dccoding no lise is made of
the digital sum for decoding. In this way error propagation is limitcd to
cighl decoded souree bits.
Construction of binary DC-constrained codes
45
From table I ît cao be eoncluded that the capacity of a channel that occupies
six sum states is C(6) = 0.85. In other words the design of a channel code with
this constraint and rate R = 0.8 is feasible. The capacity of a channel occupying live sum states is C(5) = 0.792, so that a code with R = 0.8 satisfying the
N = 5 constraint has not to be attempted.
Fig. 4 shows the unbalance trellis diagram of a code that oceupies six sum
states. The encoder has two princîpal states denoted by So and SI. The trellis
2
1 S1
セ 0
c
.2 -1
.2c -2
:J
So
-3
o
2 3 4 5 6 7 8 9 10
time
_
Fig, 4, Trellis diagram of Sb lOb code.
diagram gives a plot of all possible sum values as a function of thc symbol
posltion, Using the generalized Pascal triang!e delined in the preceding section
it can be caJculated that the number of codewords satisfying the chanoel eonstraints starting in So and terminating in So and SI is 197 and 155, respective!y.
From SI there are 155 and 131 codewords terminating in So and SI, respectively. We condude that the minimum number of codewords hom an initial
state is I3I + 155 = 286, being sufficient for an 8 10 10 mapping.
In order to enable simple coding the sets of codewords with zero-disparity
are divided loto two subsets:
1) codewords that ean be stAte-independently encoded, We found that the
total number of slieh codewords is 89. These codewords ean be found by
inspection of tbe treIlis diagram. The zero-disparity state-independent
codewords occupy four sum states so that the number of codewords and
the codewords themsdves ean again be evaluated using the generalized
Pascal triangle (see table IV);
2) other codewordS with Il:ero·disparity,
We now formulate thc following rules for encoding and decoding.
Binary charme' codes
46
TABLE IV
Pascal triangles of 8b lOb code
0
0
1
0
0
2
2
3
8
10
0
13
0
21
21
9
0
9
28
33
47
55
3
19
14
8
34
6
4
0
3
3
0
3
5
61
108
Encoding nlles
Most of the codewords of page A(So) can be gcnerat.ed by the enumerative
coding method. The page A(Sl) can be found with some 'bît-shuffiing' of
codewords of page A(So). We ddinc tbc following three subsets of A(So),
where I (00:;;; I LNセZ 255) is the rank (or decimal notation) of the source words
Subset Tu: 0
< I,;
88.
The codewords are state-independent encodable, have lero·disparity, and
do not change the encoder state.
Subset Tt: 89 <i:"" 1« 243.
The codewords have non-lero disparity. By symrrtetry (see trelHs diagram)
the codewords in page A(Sl) can be found from codewords in A(So) by inverting each symbol and reversing the direction of transmission of the codeword. Exampk assume a codeword wilh rank I in Tl of page A(So) to be:
'0100JOIIII'. Inverting and revetsing results in codeword A(Sl,f)=
'0000101101'. Jt can be verified in the t.rellis diagram that this codeword satisfies the constraînts. Whenevet a codeword in subset Tl is transmitted t.he encoder cbanges its stat.e.
Subser T2: I> 243.
The codewords in T 2 are chosen from the st.at.e-dependent, zero-disparity
codeword set in such 4 way that. a codeword A(S 1,1:> 243) is found from
A (So,l >·243) by reversing the direction of transmission of t.he codeword. We
did not found an enumerative method for encoding Or decoding the twelve
codewords in set T2 , so that a direct look-up table is needed.
47
Comtruction of binary DC-constrained codes
Oecoding rnles
At the receiver the cOdewords in A(Sl) can with simple hardware be mapped
to codewords in A(80 ) with the same source representation. It thus suffices to
have a decoding algorithm for codewords in state 8 0 • Due to the partitioning
into two subsets two Pascal triangles are needed listed in table IV.
We conclude from this table that the look-up table has 21 + 14 = 35 nonzero entries. We further necd for decoding an 8 bits full adder and additional
hardware for distinguishing subsets TI) and TI and coding of codewords in T2.
Results
Using a computer the following statistical properties of the 8b1Ob code were
calculated (see table V).
TABLE V
StatÎstical properties of 8b1Ob code
sum
prob.
run-Iength
prab.
3
2
I
.066
1
2
3
4
5
.540
.303
.119
.033
0
-1
-2
.199
.286
.264
.148
.036
average sum "" 0.662
SUm varianee = 1.534
.005
average run-Iength '" 1.66
The run-Iength of a code is defined as the number of consecutive like syrnbols. Thc efficîency E of the 8blOb code is according to eq. (16) and table I:
0.2398/(0.2 x 1.534) '" .78.
Table VI lists the main parameters of DC-constrained 8blOb and 16b20b
codes th at were reported in the lîterature (see also Tazaki 22».
TABLE VI
Main parameters of R = 8/10 DC-constrained codes
reference
Morizono 23)
Shirota 24)
Widmer 2O•21 )
Parker RセI
new
IセッQ i
N
10
T m ",,*)
note
10
**)
7
6
7
6
6
5
5
***)
5
0) maximum rUlI-lellglh. 00) simple zero·disparily code wilh 252 Lウ、イッキ・ セ
block code wÎth codeword length 11 = 20.
8inary channel codes
48
Note thai Ihe new code has improved features with respect 10 the other
codes with codeword length n '" 10. The Parker code has the same main
parameters, but due to the doubled codeword length it needs considerably
for encoding and decoding.
more ィ。イ、キセh・
6. Coltclusions
We have presented a systematic approach for designing binary DC-balanced
channel codes. DC-balance was achieved by imposing a conSIraint on the
maximUlIl unbalance in the number of positivc and negalive pulses within
codewords. We derived recursion relations for determining the number of
codewords satisfying a maximum unbalance constraint. We showed that the
efficiency of the new codes is asymptotically optimal, i.c., for long codewords
the information rale and sum variance approach the capacity and sum
variançc of maxentropic z-constrained sequences. Aigorithms for cncoding
and decoding of maximum unbalance codewords were derîved. The hardware
needed for these algorithms grow polynomiaJly with thc codeword length, 50
that they are very useful when large codewords are applied. An example of a
new 8blOb DC-balanced code assuming only six digital sum values was giveo.
REFERENCES
') D. B. Watcrs, Int. J. Electron. 55, 159 (1983).
') re M. Brooks arld A. Jessop, Int. J. Electron. 55, 8J (1983).
') Y. Takasaki, K. Yamasnita ano Y. Tak.ha,hi, tnt. J. Electron. 55, 121 (1983).
') J. P. J. Heem sk er k and K. A. Sch Cl uh amcr I m min k, Philips Teclm. Rcv. 40. 157 (1'182).
') J. C. Mallin,on and J. W. Millet, Radio and Elee. Eng. 47,172 (1977).
') K. W. Callermole, Principle; of pulse eooc moou1atÎon, lliffe Books Ltd, London, 1969.
') K. W. Cattermole, Int. .I. Electron. 55, 3 (1983).
') R. O, ç.rter, Electron. Lelt. 1,65 (1965).
') .1. Juste,en, IEEE Trans. Inform. Theory ャtᄋセXL
457 (982).
>0) R. W. S. Chang, T. M. Jakuoov and A. 1.. Garcia, IEEE Trans. Commun. COM·30, 1668
(1982).
") T. M. Chien, Bell Nエウケセ
Teeh. J. 49, 2267 (1970).
") C. E. Shannon, Bell Syst. Teen. J. 27, 31'1 (1948).
") P, A. Franaszek, IBM J. Res. Devclop. 14, 376 (1970).
") R. I\sh, In format ion Theory, Interse1enee Pllbli.hers, New Yo,'k, QYVセN
JO) J. N. Franklin8 and J. R. Pierçç, IEEE Trans. COmmllII. COM·;W, 118z (972).
") T. M. Cover, IEEE Trans. Inform. 'fheory lT-19, 73 (1973),
") LP. M, Schalkwijk, IEBE Trans. Inform. Theory IT-18, 395 (1972).
") R. F. Lyon, IEEE Trans. Commlln. COM-Zl, 1438 (1973).
'0) P, 1\, Franaszek, bセQャ
SYSI.. Tçch. J. 47, 143 (1968).
'0) A. X. Widmer and P. A. Franaszek, Eleçtron. Lclt. 19,202 (1983).
") 1\. X, Widmer and P. A. Fr.n.szek, IBM J. Res, Devolop. 27, 440 (983).
") S. Talaki, F. T.kcd., H. O,awa and Y. Yamada, Proc. of5th lntcrn.t. Conf. On Video
and Dala Rcwrding, Soulhampwn, p. 79, 1984.
") M. MOrizono, H. Yoshida anti Y. Ha;himQto, SMPTE Journal89, 658 (19MI).
ゥセB
N. Shirota, U.S. Patent 4,387,364,1983.
") M. A. Parker anu F. 1\, Bellis, Proc. of 4th Internat. çont". on Video ,md Dala recol'ding,
S"llthampton, p. 207, 1982,
'") K. A. Schouhamer Immink, PhiIips J Rcs. 4(l, 1 (1985).
49
Spectrnm shaping with binary DC 2 -constrained codes
SPECTRUM SHAPING WITH BINARY
DC2-CONSTRAINED CHANNEL CODES
aN「セエイ。」
A methad is presented for designing binary channel codes in sueh a way
that both the power spectral density function and its second-dcrivalive
vanish at zero freQl,lency-
ョッゥセイオ r
セョッゥエ。ャ・イ
セイ。
çlerived to ・ョゥュイセエ・、
the
number of 」ッ [ャ」キッイ、セ
that can be used in this coding scheme. A simple
algorithm fOr encoding and decoding codewords is developed. The per·
formance of lhe new codes is compared with that of classica! channel codes
designed with a constraint on the unbalance of the number of transmincd
positive and negative pulses.
1. Introduction
Many channel codes have been designed with the aim of suppressing the
power of the encoded stream near the zero frequency. Most of these channel
codes are designed on a digital sum constraint, where the (running) digital sum
is defined for a binary sequence as the accumulated sum of ones and zeros
(a zerO is coonled as -1) from lhe start of the transmission. The frequency
region with suppressed components is characterized by the so-cal!ed cut-of!'
frequency. Justesen I) found that the cut-olf frequeney of these sequenees is
proportional to the redundancy of the code. An example of a practical
embodimenl of a code rejecting the low-frequency componenls, a De-free or
DC-balanced code, is the so-cal!ed zero-disparity code using codewords with
an equal number of positive and negative pulses 2 ). A generalization of this
concept leads to low-disparity codes using two alternative source represenraHons of opposite disparity polarity S). Immink 4) studied the cut-of!' frequency
versus lhe redundancy of these simple binary low-disparity codes. He conduded that the performance of the codes, in particular for smal! codeword
lengths, is comparable with that of maxenlropîc sum constrained sequences.
Thc power spectral density funetion of digital sum constrained codes is characterized by a parabolic shape in the low-frequency range from De to the cut-off
50
Binary channef codes
frequency. In some app!icaüons it is desirabie to achiew; for a gÎven redundartcy a larger rejeetiort of the low-frequency components than is possible
with DC-balanced codes.
In this paper a new class of DC-free codes is presented having as addjtional
property that the zero second-derivative of the code spectrum also vanishcs al
zero frequency. (Note that the odd derivatives of the speclrum at zcro frequency are zero becausc the spectrum is an even functîon of the frequeney.)
This results in a substantial decrease of the power at low frequenties for a fixed
code rcdundancy as comparcd wÎlh the classica! designs based on a digilal sum
critedon. Seetlon 2 introduees a time domaln constraint to be imposed on the
ehannel sequenee 50 that the resulting spectrum of the sequence has the- DÇ'balanced property, i.e. has both zçro power and zçro sccond-derivative at zero
frcqucncy. Section 3 ーイ・ウセュエ
an enumeration method for finding the number
of eodewords to be used in a DC2·balaneed code. In order for the new codes
10 have a practical rate relatively long codewords have to be used which makes
a direct method of mcoding and dccoding using look.up tables of the souree
words and their chanod rf;pn:sentations prohibitively complex. Seetion 4 deals
wÎlh the enumerative encoding and decoding of DC 2-baIaneed codewords,
The algorithm is not more complex than looking-up and adding. The look-up
tables needed for the enumerative coding grow polynomially in complexÎty
with increasing codeword length. Seetion 5 gives examples of codes using
codewords that can be concatenated without a merging ruk, .o·calkd staleindependent encoding. The power spectral density fllnctions of the newly
dçvdoped codes wiJl be compared with those of classical DC-balaoced 」ッ、セウᆳ
Examples of codes with state-dependent eneoding that operatc wilt! two
modes of the souree represent.atÎon are given in sec. 6.
2. DC 2·balanced codes
Lel x denote a stationary channel sequence, having roean zero, wilh variables .. " X-i, Xo, Xi, , .. ; x;e 1-1,11- The power spectra! densily function of
the process is given Iセケ「
00
H(w) = R(O)
+ 2 L RU) COS Uw),
(I)
f=l
where RU) = E(xjxj.;), i '" 0, ±!, ± 2, ... is the auto·correlation function
of the sequence.
The spectrum H(w) is an even function of w, so that in the neighbourhood
of w = 0 il can be approximated wilh the following Taylor cxpansion:
H(w)
=
H(O)
+
H"(O)
-Z-w 2
+
H
(iv) (0)
24
セ
w .. -
(2)
Spectrum shaping with binary DC 2 .cons!rained codes
51
From a straightforward Fourier analysis it can be shown, if the running digital
sum (RDS) Zi defined by
i
2:,
=
L xj,
j= ... セ
Îs bQunded, i.e. has finite unbalance, the resulting power spectral density function vanishes at (j) = 0, i.e. H(O) = 0 6).
We now define the running digital sum sum(RDSS) Yi by
lf both the RDS and RDSS assume a limited number of values it can be
proved in the same way that then H(O) '" H"(O) '" O. In the following a
method is presented to design codes wÎth bounded RDS and RDSS, here called
'DC 2·balanced' codes.
The codes are based on codewords with fixed length. The codewords are
chosen in such a way that if concatenated using a certain encoder rule the channel sequence will assurne a finite nurnber of Y and z values. The first problem
to solve is to enurnerate the nurnber of codewords with length n satisfying a y
and z constraint.
3. Enumeration of (z,y) sequences
In the preceding sectlon the word unbalançe refeued to a pröperty of a
sequence of arbitrary length. Now we shall use this concept for codewords
with a predefined length. First simple reeursion relations are derived to find
the number of codewords with length 11 having a cOnstraint On thç zand y dis·
(z,y) sequences. Later we show how these codewords can be
parity, ウッセ」。ャ・、
concatenatçd in such a way that the eode stream will assume a limited number
of RDS and RDSS values.
In other words, find the numbcr of sequenccs x = (XI ••• _. x.), Xi e (-I, 11,
satisfying the following conditions
L:"
XI =
d.
and
(3)
;=1
where d z and dy arc the zand y disparîtîes of x.
We follow thc approach of7). By changing the order of summation we find
'1
d1
=
L (n + 1 1=1
J't
i) x, = -
I
1=1
i Xi
+ (n +
I) d z •
(4)
Binary channel codes
52
Because
Xi E 1·--1,
Ij we rewrite eq. (4) and find
tV,.d.i
dy
L
'"'-
tl.-d,)
L
Pi+
1= セ
(5)
ni+(n+l)d"
,I"" 1
whCrC PrE p, "" n} and n, E 11., .. ,n} correspond to the positions of the + l's
and -I 's, respectivdy.
Obviously
1('I-d.z:)
!("+d,)
L
,=1
L
Pi+
Il
ni
= L i = セョH
'=1
+
1),
bl
su that with eq. (5)
!(H<f,l
L
Pi = Hn(n
+ 1) + Z(n + l)d,
- Zdy ].
(6)
1=1
Rernark
In the particular case that both セ alld y disparities are zero it follows in a
straightforward way from eqs (3) and (6) that the codeword length n has to be
a multiple of four,
The number of codewords satisfying the 、セ and d, constraints can be found
using the theory of partitions (sec ref. 8). From the above it follows that this
number equals the number of distînct subsets lpl •... , Pil.+d,)) of size
Ï(n + d,) of the set p, 2, ... , nl satisfyillg eq. (6). The number of codewords,
delloted by A.(d" dy ) , is given by the coefficient CRU,) of u't j , of the
polYllomial gn(u, t) defincd by
g.(u, t) "" (l
+
ut) (l
+ u 2t)
_.. (I
+ Uni),
where
i = Hn(n
+ I) + 2(n + I) d,
- R、セ}L
j = セHョ
+ d,).
Both i artdj are assumed to be integers, otherwise the number of codewords is
zero.
Wc cart derîve a useful relation wieh the following observalion. The sequence x "" (XI, X2, . __ , x n ) satisfies the d, and dJ constraint If the sequence w
obtained by reversing the order of symbols in x, Le. w = (x"' ... , xd. satisfies
the constraints
.
/
I.=1 1'111 =
d"
The number of sequellces
KI
f ,=t
j=l
1'11, =
-dJ
+
(n
'j-
I) d, .
1
satisfying the two constraints equals by detinjtion
Spectrum shaping with binary dᅦセM」ッョウエイ。ゥ・、
codes
55
+ l)d, - dy). As the number of sequences w equals the number of
sequences x with the given constraints we simply obtain
A"(d,, (11
(7)
As g,,(u, t) = g,,-I(Il, 1)( I + unt) the following recursion relation of the comcients can be written down.
For m = 1,2, ... , n; j = 0, ... , rn aml i;;;; 0, __ "' !m(m + 1)
(8)
with initial conditions
Co(O,O)
=
I,
otherwise
em(i, j) '" 0.
4. Enumerative encoding and decoding (z,y) sequences
Codes with rates of practical interest are only possible if the codeword
length is relativdy large (see later). The hardware requirements for a direct
encoding and decoding using look-up tables for storing the codewords and
their souree representations grow exponentially with the codeword length. For
say n > 24 this direct methad is beyond current technology. In this section a
sequential algorithm is presented for tncoding and decoding (z,y) sequences
which Îs not more complex than adding and looking-up, where the nurnber of
entries of the look-up table grows polynomially with the codeword length.
To that end we establish a 1 - 1 mapping from the set Sof (z,y) sequences
onlo lhe set of integers 0, I, ... , M - 1, where Mis the cardinality of S.
Let S be the set of (:z,y) sequences. The sel Scan be ordered lexicographically as follows:
lf w == (Wl, ... , w,,) and v = (VI, .. " v,,) are elements of tha Set S then v is
called less than w, in short 1'<0 w, if therc cxists an i, I ,;;:; i';;:; ti, sueh lhal
vi<w/and wJ=vjo ls;;,j<i.
The position of W in the lexicographical ordering of S is defined to be the
rank of w, denoted by r(w), i.e. r(w) is the number of all v in S with v <0 w.
The following algorithm for ranking (z,y) sequences can be used for decoding.
Decoding algorithm
The rank r(x) of the binary (z,y) sequence x in Scan he calculated as
follows
rex) =
I"
k=1
A"_k(d, -
Zk_l
+ 1,
-dy
+ (n
- k
+ I) d, + Yk-l) . xl"
Binary channel codes
54
wherc
kol
Zk-I
= I x,
i=l
and
k-I
L: ZJ.
Yk-I =
i=1
Proof
According to Cover 9) the lexicographic index r(x) of x is given by
n
r(x)
=
I
x/· n,(xj,x:l, . . . ,X)_I, -1),
)=1
where n.,(x, ,X2, ••• ,Xk) denotes the number of elements in S for which the first
k coordinates are given by (X"X2, ... ,Xk). The first (j - 1) coordinates of x
determine the states Z) and y)
)-1
I
=
Zj_1
XI
1=1
and
)-1
Y)-I
I
=
Zi •
• =1
From eq. (3) we find that
n
j-I
d, =
I
Xi
+
x)
+
.=1
L
Xi
=
;=)+1
II-j
= Zj_1
+ Xj + I
Xj+!.
i=l
In the same way
)-1
11
dl =
I
Zj =
1=1
I
ti
z/
+ Zj
1'2
Zi
I
+ z; +
;=1
Zi
;=1+1
I
I
Xm
m=l
}-I
1'1
LZI+(n-j+l)z;+
L I
n-)
Zi
+
(n - j
{=l
+ 1) Zj + I
k=1
i
Xm
ュ]ェKセ
(=1+1
i=1
)-1
I
=
J=j+l
1=1
J-I
I
I
-i"
Ic
I
Xj+""
m=I
Prom these observations it fo)lows thllt thc numbCT of 'lectors x given by
x
= (Uh, •• , Uj-lo
-1,
X)+lo ••
"xn ),
Spectrum shaping with binary DC2 ·cOMtrained codes
55
(where the u/s are given) satisfying
n
"
I
.1=1
112=1
I I
LXI = d. and
illl'l
equals the number of veetors v with length
1'1 -
== d,
Xm
j,
V
=
(Vh'""' v._j)
satisfying
.-J
L Vi = d. -
Z1-1
+
1
i'=l
and
O-J
k
j-l
I I
k.I
Vnt
= dy
-
m ... t
L z/ -
(1'1 - j
+ 1) Zj_t + (1'1
- j
+
1).
i=l
Whence
n,(xt, X2"
A._j(d. -
Zj-l
+ 1, dJ' -
•• ,Xjd,
Yj-I
+ (1'1
-1)
-
j
=
+ I) (I
-
ZJ-l»'
With eq. (7) we find.
Arl-Ad. -
Z}_I
= aョセェH、N
-
+ 1, dy - YJ-l + (1'1 - j + 1)(1 - Zj-I» =
Zj-l + 1, -dy + (1'1 - j + 1) d. + Yj-l).
Cl
The inverse algorithm to encode (z,y) sequences, i.e. to fiod tne scquences
given tneir rank, is given in tne following.
Encodlng algl;lrithm
Let an integer I, lEID, 1, ... , M - Ij and the set S be given. The following
algorithm finds x sueh that r(x) -- J.
Jnitialize
Xo --
Yo =
Zo
= 0;
for
where zセ
= zセ⦅i
+ Xk-I
and
Yk = Yk-l
+ zセMャ
then set Xk = 1 and set
else set Xk
-- -1.
The procedure. as presented here, is confined to the eneoding and deeoding
of one set of (z,y) sequences. The procedure can readily be gencraliled to
inc1ude sets with different values of the disparities (see ref. 9).
Binary chatlrle( codes
56
The enumerative coding enables the design of channc! cncoders and de·
cuders of moderate complexity. For decoding we need storage cal'acity for the
non-zero weighting coefficients, a full adder and an accumulator to store tmermediate and final results, The encoder needs an additionallogical comparator.
The weighting coefficients look-up table can be used for encoding and decoding, which is attraetive for application in recording, where the t.ransmitter
and receiver are usually housed in one machine. Note that wc do not need a
multiplier because x is binary valued. Lyon 10) gave an example of a practical
embodiment. based on the enumerative coding of codewords with given z disl'arity using Pascal's trianglc. In order to encode and decode (z,y) sequences
his circuitry has to be extended with an extra COunter to dctcrmine the y State.
5. Coding with zero.disparily codewords
The simp!cSl way to find a code generating long sequences satisfying the z
and y eonstraînt is to use codewords of fixed length n that can be concatcnated
without a coding rule, i,e. to usc a stateless encodeL In this simplc case the
codewords start and terminate with zerO ROS and R,OSS. Consequently the
codewords have zero zand y disparities. In the sequel we use the short-hand
notation zero-disparity if both the zand y disparities are zero. From eqs (3)
and (4) we find in this particular case that the codeword moments Uk satisfy the
following conditions:
n
uk=Likxl=O,
keIO,IJ.
(9)
.j=l
Using thc recursion relatiuns eq, (8) we calculated the numbcr M=A.(O,O),
4,8, ... ,36 of zero·disparity codewords (we halle already remarked that
An(O, O) = 0, if n is not a multiple of four) , The results are listed in table L
ti =
TABLE I
Number of codewords M with lero-dispaTity
n
4
8
12
16
20
24
28
32
36
M
R
2
8
0.250
0.375
58
OA88
526
5448
61108
723354
8908546
113093022
0.565
0,621
0.662
0.695
0.722
0.743
57
Spectrum shaping wifh bÎnary DC 2 -constraÎned codes
Table I also presents the code rate R, defined by
R = 2l og M
n
(10)
Note in table 1 that even for large codeword length the code rate is still poor.
Duting transmission the symbols of the (z,y) sequences are serially transmitted. Assume that the codewords are equiprobable and independent, then
thc auto-correlation function of the concatenated channel stream is given by 11)
1
R(k)
= -
,,_k
')'
n f;'l
I';;; k"; n - 1,
E(XIX'+k);
R(O) :: 1,
Rek)
= 0;
k;:.
n,
whete
(11)
where X(k) is the k-th element of the set of codewords with zero y and z disI'arity.
Example 1
The simplest example is the code with n '" 4 having two codewords so that
the rate R ]セN
We can verify that the two codewords are '0110' and '1001' (a
'0' is used to denote a symbol with value -I). The auto-correlation function
of this channel code can be found by hand: R(O) = 1, R(l) = MセL
R(l) = MセL
R(3) = セ and for i セ 4: R(i) = O. The resulting power spectral density function
H(w) of this channel code is found with eq. (1)
H(w) = I
+ ャMセ」ッウキ
- セ」ッウRキ
+ セ」ッウSキャ
= TウゥョRセキN
We can easily vedfy that indeed R(O) = R"(O) = O.
Due to the exponential growth of the number of codewords the coml'utational elfort becomes prohibitive for large codeword lengths when eq. (11) is
used. The preceding cnumeratîon method (sec. 3) can be extended to find the
auto-correlation function with a polynomially growîng computatîonal effort.
From eq. (ll) we derive for fixed io,i l ; (x/el-I, IJ):
where N( ) is the nurnber of codewords satisfying ( ).
Blnary channel codes
58
For symmetry
イ・。セッョZGゥ
find
wC
E(x"xJ,) =
2N(xr, = XI,)
M
4N(XI,
XI,
=
-1
=
1)
Assuml;: Xi. = Xi,
= 1, then
(12)
- 1.
M
from eq. (9) we find
I"
and
I"
XI = -2
ゥセ oBQ
i=lj
lXI =
-(i o
+ i,).
(13)
':z1,JI;I!i2 vl l l
The numbl;:r of sequences in S satisfying these constraints ean again he found
using the muml;:ration method of sec. 3.
The number of scqucnccs N(Xi o = Xi, = I) in S satÎsfying l;:q. (13) is givl;:n by
+ 1) - iQ-i l , j=Hn- 4) of the polynomial
the eoefficient of uivJ, ゥ]セョH
h,,(u, t)
h (
nU,
t) -
(l
g,,(u, f)
+ uiot)(l +
(14)
Ui,!)
Wjth a small moditication of the reeursion relations given by eq. (8) we can
obtain N(Xi,=X" = I).
Using l;:qs (I), (11), (12) and relation (14) we calculated the power spectral
density functions of zero.disparity codcword bascd codes. Thc results of the
computatlons are plotted in fig. I.
11
o
.,-"
-10
セ
ril
-2 0
o
<Il
0..
.
NセvG
-3 0
セ
k'1/
1/
Nセ
..
-.
セ
セAO
I
セv
-".. ,,-_.
,
,I
......... -"'- --.
'V'
0 it'.
1t.
セ
V
-
f·· ....
- - f-..
.
_.._-
-,.
--
- .. . ..
""
-50
0.005
0.01
0.1
"
..
0.5
trod ot Ó1annel freq. (Iogl
rig. I. P"wer 'peelral den,ity funclion, of DC'-balanccd codo, ba,ed on zero-dbparity codewords witn the codeword ャセョァエィ
n a!i a parameter.
Spectrum shaping with binary dcセM」ッョウエイ。ᅫ・、
59
codes
We define the (low-frequency) cut-oft' frequency
Wo
accarding ta U)
H(wo) "" Nセ
Using numerical methods we computed the cut·olf frequency of the various
codes. Fig. 2 shows the cut-olf frequency versus the redundancy I - R . As
a comparison we plotted the cut-ofT frequency, according tO the same definition, of DC-balanced zero-disparity codeword based codes verSuS the redundancy 12). We observe in the figure that for a fixed rate the cut-olf frequency of
DC-balanced codes are approximately a factor of 2.5 larger than those of
DC 2 -balanced codes. Below the cut-olf frequency the spectra of DC 2 -balanced
codes decreases faster witl1 decreasing frequency than those of DC-balanced
codes. We nodced when comparing the spectra of DC 2 - and DC-balanced codes
that for fixed redundancy a cross-over is found at approximately - 20 dB. We
conclude therefore that DC 2 -balanced codes are favourable if rejectÎon of lowfrequency camponents is needed better than - 20 dR
0.1
--_ ......
---,,"
Ol
..
'W-
.,.,", ,,, ......--
2
."
...
_-
17
V.
1=
'"
----1"
16
Xセ
0.0 1
0.1
fr-- p ョセQR
Ij·
./16
Ir
20
10
_ ·_· _ 44 .
セ
... _-
32
18
20
ョセV
セ エッ
36
52
.. DC - boloneil'd
2
DC - boloneil'd
I
I I I I I
Redundoney (1-Rl (log)
Fig. 2. The CUl-olf frequency of De- and DC'-balanccd, zcro-disparity codc$
dancy 1 ... R. The codeword length i$ denoted by 11.
DuセイLv
the r,dun-
6. State-dependent encodlng
In the case of state-independent encoding the sequence is encoded without
information of the past and we have a unique one-to-one mapping of the
source words and their channel representations. A larger rate wÎth given code·
word length is possiblc if codewords are allowed to start (and end) in a set of
predefined states, called principal states. Franaszek U,U) has developed a procedure for determining the existence of such a set of principal states. We found
60
Binary channel codes
during experiments with this procedure that the following choice of the codeword subsets is satisfaetory. The eneoder operates on the basic principle ofthe
bi-mode alternate coding 、・セ」イゥ「・、
by Cattermolc 2). All codewol'ds are chosen
in sueh a way th at. d. = O. Codewords with positive and negative 、セ are used in
twO modes, eodcwords with zero clv are used without modification in both
modes. The choice of the polarity of lhe codeword is chosen by the encoder in
such a way that the RDSS at the end of the new codeword is as close to zero as
possible. We can easily verify that the number of eneoder states, Le. the nUillber of ROSS values at the end of a codeword, is twiee the number of code·
word subsets with different positive d1 values. By properly 」ィッ セゥョァ
t.he inÎtial
RDSS value the RDSS at the end of the codewords wil! assume the values
± I, ± 3, ... , ± (r - I). In practief; th is is aehieved by initializing the ROSS
counter with the disparîty l. Using the reeursion relations given by eq. (8) we
caleulated the rate of the bi-mode code versus the number of subsets and the
length of the codeword. In table Il we collected the results using ,. encoder
states.
TABLE 11
Code rate R of DC 2 ·balanced codes wi!h
codeword Iength n and r encoder states
r =
n= 4
8
U
16
20
2
4
6
.396
.488
.568
.627
.670
.5
.557
.616
.663
.700
.594
.648
.688
-
.710
8
10
.625
.672
.707
.736
.641
.689
.722
.748
-
-
Note the improvcmenl, for fixed codeword length, in fate with respect to stateindependent encoding (tabie I).
Example 2
An atlractÎve and simpie code found in the table has the following parameten;: n = 4, r = 4 and R "'- Nセ Table lil shows the eodebook of this simple code.
'I'he codewords should be chosen from lhe table in such a way that the
ROSS aftel' transmission of the new codeword is as close to zero as possible.
The RDSS assumes the values ± land ± 3; table IU shows the corresponding
eodeword for the given RDSS value. Fig. 3 shows the power spectral density
function of th is DC 2-balanced code. The spectrum of this code has been calAs a reference
culated using the matrix method of Cariolaro et al. QセLIN
we plotted the spectrum of the well-known 'bi"phase' code (zero-disparity,
Spectrum shaping with binary nC 2 -constrained codes
61
TABLE III
Çoàebook of R = セ dᅦセM「。ャョ」・¢
code,
the O's correspond to the -l's of the analysis
codeword
RDSS = 1,3
source
codeword
RDSS = -3, -1
00
Ol
LOOI
0110
0
1001
0
0110
0
0
10
1010
11
1100
2
4
0101
0011
-2
-4
dy
dy
o
-10
!g -20
0.01
0.1
frad. of chonnel freq.
Fig. 3. Power spectral density functions of a DC'-balanccd code n
phasc' (b).
=
4 and R
=
i (a) and
'bi·
DC-balanced code with n = 2) having the same rate. Note the improvement in
the low-frequency range of thc new code.
The memory requirements of Cariolaro's metbod become prohibitive for a
large number of encoder states and large codeword lengths, so that we were
not able to compute the spectra of the codes with rate R > 0.6.
7. ConclusÎons
A new class of DC-constrained channel codes, called DC 2 -balanced codes.
have been presented. Tbe power spectral density function H(w) of DC2-
Bim1ry "hunnel çodes
62
balanced codes has besides zero power at zero frequency a zero second-deriva·
tive at zero frequency, I.e. N(O) = H"(O) = O.
It was shown that if the redundancy is fixed the cut-oft' frequency of DC 2 balanced codes is approximately a factor of 2.5 smaller than of classical DCbalanced codes.
Below the cut-oft' frequencies the spectra of DC 2 -balanced codes decreases
faster with decreasing frequency than those of DC-balanced codes. We noticed
when comparing the spectra of DC 2 - and DC-balanced codes thai for fixed
redundancy a cross-over is found at approximately - 20 dB. We conclude that
DC 2-balanced codes are superior wilh respect to DC·balanced code, if rejeetion of low.frequcncy cOmponent, is needed better thart - 20 dB.
REFERENCES
')
')
')
')
0)
0)
')
')
.)
'0)
")
•,)
")
"j
lOl
")
J. Justesen, IEEB Trans. Infol'm. Theol'Y lT·;llj, 457 (In2).
K, w. CattHmole, Int. J. Electron. 55, 3 (\983).
J. M. üriffiths, Electron. Lett, 5, 79 (1969).
K, A, t'eho\'hamer Immink, Philips J. Res. 40,1 (1985).
W, R. Bennett, BeU Syst. Teehn. J. 37,150\ 0%8).
H. Yasuda and H. lnose, JournallElÇa Japan 54a, 506 (1971).
ü. F. M. Beenker, Philips Techn. Note T.N. 105/84, 1984.
J. R.iordan, An introd\lction 10 combinatorial an.lysis, Princc!<)n Univcr<ity Pres,. 1980,
T. M, Cover. IEEE Trans. Inform. 1'heory rf-19, 73 (1973).
R, F. Lyon, IEEE Trans. Comnl\ln. COM-2J, 1438 (1973).
H. S. Hosik, lIell Syst. Teeh. ,I. SI, 921 (1972) .
.I. N. frankling and J. R. Pier"e, IEEE Trans, Commun. COM·W, 1182 (972).
P. A. FranasLek, BeU Syst, Te.h. J, 47. 143 (1968),
P. A. fl'anaszek, IBM J. Res. Develop, 14,376 (1<)70).
G. L. Carjolaro and ü. P. Tronca, IEEE Trans. Commun. COM-22, 1555 (1974),
(1. l.. Cariolaro, G. L. Picrohon and G. P. Tronca, Int. J. ElectrOn, 55. 35 (1983).
Some statistical properties of maxentropic nmlength-limited sequences
63
SQME STATISTICAL PROPERTIES OF MAXENTROPIC
RUNLENGTH-LIMITED SEQUENCES
Abs.ratl
Wc prove that two-Ievel runlength-Iimittd ,equtnccs with maxÎmum informarion content', maxcntrop'c runlcngth-Iimitcd sequenCCS, have an cxponenlial runlength disrtibution. General equations are derhled fOr the pOWer
spectra! density function of maxentropic runlength·limited sequences.
Runlcnght dislrihutions ancl power dcn,ity functions of practical modulation
ウケsエセュZ ゥ
are tompan:d with those derived from maxentropîc runiength-
Iimited sequences.
1. hltwduçtion
8inary codes such as the Miller code 1), 3PM 2) and EFM 3) are baseband
rnodulation systems with applications in magnetic and optical recording.
These codes belong to the class of so"called binary value runlength-limited
sequences (RLL sequences). A string of bits is defined to be runlength-limited
if the number of consecutive zeros (or ones) is bounded between a certain
minimum and a maximum value. The maximum runlength constraint guarantees a dock pulse within the specified time, which is needed for dock regeneradon at the receiver. The minimum runlength constraint is imposed to control
intersymbol interference and consequentiy has a bearing on the spectral propertÎes of the code .,0).
In th is paper we prove that RLL sequences with maximum information contents, defined as maxentropic RLL sequences, have their runlengths exponentially distributed. This leads to a simple derivation of the power spectral
density function of maxentropic RLL sequences.
The power spectral density function of a (random) signal defines the distribution of the average signal power versus frequency. This information is useful in system design because it indicates the most important frequency band
and the De content (power at the low frequcncy end) of the applied code. It
does not teil uS in genera! how much distortion the signal suffers when the
transmission charme! is bandwidth-Jimited.
Binary channel codes
64
We start in sec. 2 with a formal definition of RLL sequences, followed in
sec. 3 by an analysis of thc runlength distriblltion and spectra! properties of
maxentropic RLL sequences.
The study described in this paper was motivated by the fact that practical
modulation systems can with moderate hardware quite easily reach !'ates of up
to 85-950/0 of the maximum information capacity. From this observatioll we
maya priori conjecture that the statistical properties of practical modulation
systems can be predicted by maxentropic RLL sequences. In this way the
results obtained from the maxentropic RLL sequence theory could be a
simpie, generally applicable tooi for an approximate calculatlon of the power
density functions of bit streams of practical modulator systems. To test this
hypothesis we compare ill sec. 4 the runlength diHriblltion and power density
funclions of two well-known rnodulation systems with results predicted by the
maxentropic RLL seqllences.
2. Definition and asymptotic properties of rnnlength.limited binary sequences
The theory of binary scqucnccs with restrictions on minimum and maxi"
mum rulllellgth goes back to Kautz セIL
Tang aod Bahl セIN
For an exhaustive
treatment of this subject the reader is referred to ref. 7. Their most important
results are summarized here.
A dk-limited sequence simultaneously satisfies the following conditions:
a. d constraint - any two logical ones are separated by a run of conseculive logica! zeros of at least d.
b. k constraint - thc Iength of any run of consecutive logica! zen)S is at
most k.
A sequence satisfying the d and the k constraint is called a dk sequence. Sequences only satisfying the d constraint are called d sequences. Wc obt.ain a
runlength-limited binary sequence with at least (d + 1) and at most (k + 1)
consecutive zeros or ones by integrating modulo 2 a dk-limited sequence. In
this way the ones of a dk sequence indicate the position of a transition of a
runlength-limited sequence. We note that for long sequences the inforrnation
contents of runlength and dk sequences are the same. The only extra bit of
inforrnation is the initial polarity of the runlength-limited sequence when inte"
gratjng a dk セ・アオ ョイNZ・
In ref. 7 recursion equations arc derîved for the number of distinct dk-lirnited sequences of ャセョァエィ
n as a function of d and k.
lf for convenience we restriet ourselves here to d-lirnited seqllenccs, thc
numbr.:r of 、ゥセャ ョ」エ
binary sequenr.:es N(n) of length n is given by
n+l
N(n) ""'
{
(I)
N(n - I)
+ N(n
- d - I)
n>d+ 1.
Some statistical propertjes of maxentropic runfength·/imited sequences
65
The process of modulation maps the input data stream onto the runlengthlimited output stream. In general m consecutive data bits arc mapped Ooto n
consecutive channel bits.
For a Ooe-to-one mapping we must have: rn.;:; log2N(n)_ The ratio mln is
defined as the rate R of the modulator. We find that for given d and kthere
exists a maximum rate, defined as the (information) capacity C. The capacity
is determined by the specified d and k constraints and is given by
I
C= lim -log2N(n).
n ..
n
(2)
00
For large n the number of distinct d sequences of length n, N(n) behaves as
where A is the largest real root of
/XÀ n,
(3)
For dk sequences
À.
is the largest real root of 7)
Zk+2 _
Zk+l _
Zk-dTl
+1
=
O.
(4)
The capacity Cis then given by
Coe log2À..
(5)
Lf one wishes to transmit a certain fixed amount of information per time unit,
say '/T bit/s. over a dk-limited noiseless channel, then the channel e10ck
should run at least '/ C times faster than the data elock to compensate for the
d and k constraints. In other words the channel bit time Tc is shorter by a
factor of at least C than the time needed for a data bit, i.e. Tc < C· T.
3. Runlength distribution and spectral properties of maxentropic runlengthlimited sequences
In this section we investigate the runlength distributioo and Power Spectral
Dcnsity functions (PSD) of maxentropic runlength-limited sequeocesShaft C) determined the spectra! propertjes of d-constrained RLL sequences
by computer simulations. He generatcd a random binary sequence with a
random number generator routine. Next, bloeks of random consecutive bits
werC transformed with a time-independent eode book loto d-constrained
sequenccs. The lengths of the sequences were rather restricted so that the
published PSD functions are not very usefu!.
Pelchat et al. Iセ considered d-constrained RLL scquences as a type of generalized pulsewidth modulation. This leads to a straightforward algebraic expression for the PSD function once the runlength dîstribution is given. We
follow his approach and first derive the runlength distribution of maxentropic
RLL sequences.
66
Binary channel codes
3" I, Runlitngth rjistributiorl 0/ maxentropic RLL sequences
For an RLL sequence with a çonstraint on thc minimum and maximum runlength the runlengths Ti are given by
i = d
+ I, d + 2, ... , k + 1.
(6)
Theorem I
Assuming that the lime intervals between the transitions are statistically
independent, then the information rate of an RLL sequencc is maximal if the
probability of OCCurrence of length TI is given by
i = d
PUi) "" A-'
+ 1, d + 2, ... , k + L
(7)
PROOF
Thc entropy of a binary RLL signa) where the time inlervals between transi·
lÎons is an independent, discrete stochastÎe process is given by ref. 5
(8)
where
T is
thc average runlength given by
kOl
T/Tc=
I
iP(Ti )-
(9)
I=d ... i
Substituting (7) yields
With (5) we find that under assumption (7) lhe RLL sequence takes on its
asymptotic information capacity C. We now further prove that (7) is a consistent stochastic expression, i.e.
I
PtT,) "" I.
We find from (7)
because À satisfies (4) and (5).
o
3.2. Power Spectral Density jimcliorl
For a two-level signalof unity amplitude wilh independent runlengths and
given runlength distribution the PSD funetion can be calculatcd as follows
(c.f. ref. 5). Define G(2rr.f) as the discrete Fourier transform of the runlength
distribution, i.e,
Some statistical properties of maxentropic runlength-Iimited sequences
67
(10)
withfis the frequency. The PSD function is then given by
1
1 - IG(21lf)1 2
S(21lf)", n 2f2T II +G(21lf)12'
(11)
Substjtuting the runlength distribution (7) provides a straighlforward method
of determining the PSD function of maxentropic RLL sequences.
In fig. I we have plotted the PSD function versus frequency of some d
sequences. We normalîzed the pulse lengths of the RLL sequences in such a
way that the information rate for all these sequences is fixed at 1/ T bit/s.
frequency fT(lin 1 fig. I. Two $ldcd powçr dçn.ity function. versus froquençy of maxentropic d-con.trained runlength-Hmited seqllence•.
We can observe the foilowing characteristics.
A maximum occurs at zero frequency in the uncoded case d = O. for other
d's maxima occur at non·zero frequency. For non-zero d maxima occur at
lower frequencies and have a larger peak wilh increasing d.
The eoergy at the low frequeocy end decreases with increasing d.
In fig. 2 we have plotted as an example the PSD for d = 2 and some
different kvalues. For convenience we normalized the frequency axis per
68
Binary channel codes
10
".".
,:,
,:
I;
0
ëii
!!
'
t-
0
,,:
..
-10
<Il
Q.
....セ
,,"
':\
1
\\
:\
-20
GBセ ',\
セB|
-30
".\
'.1
iセ
":\
-..00
0.5
_
frequency
fT,
Fig. 2. Two-sided pOWer density function versus frcqucncy of maxentn>pk runlenglh-limiled
with d .- 2 and some k vahJÇ:s. Thç frçquc:ncy axis has been 、セャNZゥ。ュイッョ
セ・アオ ュNZ ・ウ
セュャエ
per chaür'Iel bil
T,..
channel bit time T,; the vertical axis is normalized per data bit time T, so that
we compare the PSD functions for a fixed information rate of the modulation
strcarn.
3.3. Average runlength
In self c10eking systems the k eonstraint guarantees doek information within a eertain maximum time interval. Another parameter determining the
quality of the doek regeneration is the average time between eaeh doek infor.
mation interval. In most doeking systems docking informa.tion is derived în
the receiver !"rom the transitions of the regenerated RLL sequence. In othcr
words the average runlength is a sound criterion of the doek cOntent of a eertain RLL sequence. The average runlength is given by (9) and (7)
TI T,
= Lセ
L
i À -I = d
+ 1 + ----,----;----,-1 Je d-k-1
1 - (k - d
+ 1),1"-> + (k À -
d»)"d-H
1
( 12)
In fig. 3 we have depicted
T as a function of the d and k constraints.
Same statistical properties of maxentropic runlength-limired sequences
1--"
5
11-
i
4
3
d",4
セ
5
69
d=3
d=2
; : : : : d"'
2
2
4
6
8
10
ォセQ
Fig. 3. Average runlenglh of rnaxenlropic runlength-limite<;l sequences versus d and k 」\IョGャイ。ゥセN
3.4. DG-content
In system design for magnetic and optical recording the low-frequency properties of a code aften play an important role. In particular the power at low
frequencies or the so-calJed "low-frequency content" of the modulated output
stream should narmally be as smal! as possible. This requirement is imposed
in magnetic recording because this medium cannot reproduce the low frequencies with suflkient signal-to-noise ratio. Optical recordÎng does not suffer
from this phenomenon, but unfortunately the servo systems that position the
laser spot on the disc are sensitive to low-frequency interference 8).
By evaluatÎng (11) we flnd for the power density of maxentropic RLL
sequences at zero frequcncy
8(0) = Gエセ
with
1'2 =
t\セ
{
IT
Ï:l i 2P(T1l}
-
1'7..
(13)
l=d+1
In other words the power density at zero frequency is the variance of the runlengths divided by the average runlength. Substituting the runlength distribution (7) gives us the results plotted in fig. 4. We notiee for fixed k that the De
content decreases with incrcasîng minimum runlength constraint d. On the
Binary channel codes
70
Q
d.l
セ
、]セ
o <;1.3
" do'
v
5
r'ig. 4. De
セッョHセ エ
of
ュ。クセョエイッーゥ」
10
d.S
'5
runlength·limited sequenCe! ver.!u.! d and k contraint•.
other hand we notiee for fixed d that thc De content increases with increasing
maximum runlength eonstraînt k,
S. Cmnpal'lsOIl of I'esulls
Praclical modulatiOIl systems are often based on block codes or variabIe
bloek eneoding sJ. Due to the finite hardware dimcnsions of the modulator the
aetual rate, defincd as thc ratio of channel and data bit time, is smaller than
the information eapacity C for thc gîven d artd k cOllstraints. The ratio of the
practical rate and the asymptotic informatiOn capacily C, for the given d and
k comtraints, is defined as thc modulator efficiency E. Franaszek 8) has cal·
culated the modulator efficiency of many runlength-\imitcd codes. From his
Some statistica/ properties of maxentropic runJength-limited sequences
71
work we conclude that, for moderate hardware, modulators can quile easily
reach efficiencies of up to 85-950/0. This fact motÎvated us to compare the runlength distribution and power densily functions of practical modulator
streams wÎth results obtained from the preceding theory.
Power spectral density functîons of many practical modulation streams
have been published. For example PSD functions of Zero Modulation, 3PM
code and Miller-squared are described by Lindholm D), whose method is based
on Cariolaro and Tronca's study of spectra of block-coded digital signals 10).
They calculated the PSD function of the codes by Fourier transforming the
autocorrelation function of the codes. The autocorrc1ation function can be
found by manipulation of the transition matrix of the finîte-state model
description of the code ').
We calculated with a finite-state description ') thc PSD function of MFM
(or Miller modulation) and 3PM.
Thc rnain parameters of these modulation systems are listed in table 1. For
the details of the modulation mies the interested reader is referred to e.g. D).
TABLE I
Main parameters of the modulation systems
MFM
3PM
d
k
R
c
E .. R/C
1
2
3
11
!
!
0.551
0.545
0.91
0.92
In figs 5a/b we have plotted the runlength distribution (derived from the
finite-state model) of 3PM and MFM. It can be ,een from fig. 5a that the probability of occurrence of the langer runlengths of 3PM are overestimated (and
consequently the smaller runlengths are underestimated) by the theory. The
PSD functions of 3PM and MFM were calculated by Lindholm's method and
are plotted in tigs 6afb. In these figures we have also plotted the PSD functions derived from the maxentropic RLL sequences. The frequency axes are
for practical modulation systems and maxentropic RLL sequences normalized
per channel bit time.
We note a surprisingly good agreement (a few dB difference) between the
spectra of the modulation systems and the maxentropic RLL seguences. This
is surprising because the runlength distributions, as plotted in figs 5afb, show
a significant discrepancy wilh the theory. We conclude that the PSD functiOI1
is less sel1sîtîve to the typical statistÎCs of a modulation stream than the run,
length distributiOn8_
Binary cha/'lnei codes
72
Q
-
PIT, I
t
:> PM
,.
Theory RLLcode do2.koll
x
\.x
\0
-,
x
10
\"x
\
\
)(
.\.
"
10
\x
I
10
5
\x
1')
runlength I Tc.
_
1-
fl
0
\,
FIT, I
-
MFM
Theory RLLc-ode d=1,kd
\
"
10
1
5
10
15
ru nlerH;lt h I I,
Pigs 5alb. Runlcngth ッェュゥ「オエッョセ
of 3PM (fig. Sa), MPM (fig. Sb) ano tbeir ュBクセョエイッーゥ」
RLL
gequenee equivalent, for tbc Gーセ」ゥヲ セ、
(d.k) - (2,11) and (d,k) = (J ,3) congtralntg, respeclively,
as ,olid lino.; nOle however that only
The di,tributions of the lllaxemropic gequcnccs arc ーャッエ セ、
a dis<:rete sel of point! is achievablc_
Some statistical properties of maxentropic runlength-limited sequences
73
'0
-3PM
The or )! do:? kot'
Mセ
ai
セ
lQ
lil
IJ.
t
·10
-20
fT<
_
'0
Miller
Th@Oryid,k).i1.3l
m
セ
0
lQ
::c
t
-20
......---!
M MZウ B 。 M セ M MBPS
_
fT<
Fip 6(1/ b. Two-.jided power density funotiOns of3PM. MFM alld their ュ。セ・ャイoーゥ」
RLL seqllellees
equivalents versus frequency. The frequeney aXes have been normalized per ehanllel bit time.
6. ConclusÎons
We have proved that runlength-Iimited sequences with a minimum and
maximum runlength constraint achieve their maximum information capacity
if the probability of occurrence of a certain runlength is exponentially distrÎ.
Binary channel codes
74
buted. Once the runlength distribution is givcn the power spectral density
function of maxentropic ruolength-limited sequences is given by a straightfor·
ward calculatioo.
We have calculated the runlength distributions and power spectral density
functions of some practical modulation systems and have compared them
with results obtained from maxentropic RLL sequences. We found a significant discr<lpancy between the runlength distribution of the practical modulator systems and the maxentropic RLL sequences, whcrcas the power spectral
densities showed good agreement (a few dB difference).
REFERENCES
') J. C. Mallin,on and J. W. Miller, Radio and Elee. Eng. 47,172 (1977).
')
')
')
')
0)
')
•)
')
JO)
G. V. ./acoby, IEEE Trans. Magn, MAG·H. no. 5, セャHRQ
(\977).
J. P. J. H eemsker k and K. A. Scho u hamer I mmin k, Philip, Techn. Rev. 40, 157 (1982).
P. D. Shaf!. IEEE Trans. Commun. ÇOM·ll, 687 NIセWYャH
M. G. Pelchal and J. M. Geist, IEEE Trans. Commun. COM-23, no. 9.878 (1975).
W, H. Kaulz. LEEE Trans. Inform. Theory lT-ll. 284 (1965).
D. T. Tang and I.. R. Bahl, lnf'>lm. COllirol. 17.436 (1970) .
P. A. hana"ek, IBM J. Re;. Develop.• 376 (1970).
D. A. Lindholm, IEEE Trans. MaJ>l1. LTQᄋ[イセm
no. 5, 321 (1978).
G. L. Cariolaro and G. P. Tronoa, lEEE Tran•. Commun. COM·:U, 1555 (1974).
Method for encoding and decoding runlength-limited binary sequences
75
A GENERALlZED METHOD FOR ENCODING AND
DECODING RUNLENGTH-LlMITED
BINARY SEQUENCES
Abstl1\cl
Many moelulation ,yStem5 used in magnetic and optical rccording arc based
on binary runlength-limited code5_ We generalb,e thc concept of dk·lirnited
sequenees of length n introduce<l by Tang and Bahl by imposing constraints
On thc maximum number C>f consecUlive zeros at the beginning and the enel
of thc sequences. lt is shown that the encoeling anel elccoeling procedure,
ate similar to those of Tang anel Bahl- The aelelitional NエョゥBイセッ」
allow a
more efficient rncrgin,g of エィセ
NAェセqオ・ョ」 ウL
wセ
d-emonstrate two
conStrue;tiöI1S
of runlcngth-limlted codes with merging rules of incrcasing complcxity and
efficiency and COmpare them to Tarlg and Bahl's method.
1. lntroduction
Many baseband modulation systems applied in magnetic and optical re-cording are based on binary runlength-limited codes 1,2,3). A string of bits
is said to be runlength-limited if the number of consecutive <:eros between
adjacent OneS is bounded between a certain minimum and a certain maximum
vaJue. The upper runlength constraint ゥオ。イョエセウ
a transition within a
specified time interval needed for dock regeneration at the receiver. The lower
runlength constraint is imposed to reduce the intersymbol interference, The
latter constraint appears to have a bearing on the spectral properties of the
sequences ').
In this correspondence we consider an encoding and decoding procedure for
a special dass of runlength·limited codes. The constraints we cOllSider can be
defined as followed. Let n, d, k, land r be given, where I";; k and r";; k.
76
Binary channrd codes
A binary sequence of 1cogt.h n is called a dklr-sequence if it satisfics thc following conslraints:
a) d-constrainl: any two ones are separated by a run af al least d canseculive
zerOS;
b) k-conslraint: any run of consecutive zeros has a maximum Icngth k;
c) I-constraint: the number of conseculive [eading zeros of lhe seqnence is at
most I;
d) r·constraint: lhe number of cansecutive zeros allhe end of the sequence is
at mosl r.
A sequence satisfying the d- and k-coostraiots is called a dk-scquencc.
Given certain desired runlength conslraints, it is nol trivial how la map
uniquely the input data stream onlo the cncoded output. dat.a strcam. A systematic design procedure on lhe basis of fixed length sequences has been given
by Tang and Dahl l ). Their method is bascd on mapping dk-sequences of
lenglh n anta conseculive integers and vice versa. In lhis correspondence we
inlend to generalize their resulls to dklr-sequences. We are also gaing to show
that the application of dklr·sequenees eoables the sequences of length n to
be merged efficiently without violation of the d- and k-constrainls at lhe
bOllodaries.
Wc assume r # d- Theorems land 2 of this correspondence remaio vahd for
r < d and theorem 3 cao bc gcncralizcd accordingly. Thc proofs, howcver, arC
less elegant, the reaSOn being that for r < d there are nO dklr-sequences (XIl_l,
... , xû) having a one at positions ) where r <.i >.( d.
2. Encodinft llod decoding
In this section we consider a way of mapping lhe set of all dklr-sequences
onto a set of conseculive inlegers and vice versa. Ou!' results are similar to
those obtained by Taog ilnd Bahl for dk-sequeoces"
Lel A" be the set of all 、ォャイᄋウ・アオョ」セ
of length n. A" ean be embedded in a
larger set A. cansîsling of lhe all-zero sequence of lenglh n and of all binary
セ・アオ ョ」・ウ
of length n satisfying the d-, k- and イᄋ」ッョウエイ。ゥョセ
where thc numbe::r
of consecutive IeadÎng zeros is allowed to be grcatcr than k.
Thc set 04" can be ordered Iexicographically as follows: if x = (x"_,, ... ,xû)
and Y = (Y,I-I, ... , Yo) are elements of d" then y is called less than x, y <0 x,
if there CXÎsts an i, 0.:( i <: n, such that YI < XI and xJ = YJ for i <) < n. The
posÎtÎon of x in the:: lcxicographical ordcring of .4" is denoled by r(x); I.e., r(x)
is the number of all y in A" with y <ox. Consequently r(O) = O.
For thc sake of convenience we întroduce the rcsidual of a vector. Let y =
(Y.-I," .,yo)E.4., y f 0, and let (be such thaty, = 1 andYJ = 0 if l <) < n.
Then the resîdual of Y. res(y), is defined as fol1ows:
Merhod for encoding and decoding runlength-limired binary sequences
res (y) : '" y - ó. i ,
where <la
J.
{O.
=
77
jf i = 1;
elsewhere,
res (0) : = O.
lt can easily be seen that y r:.4rt implies res (Y) E .4•. The following ob·
servation is basic to the proof of theorem 1. Let x, u e 4. and assume that
= /1.j = 0 (1 <j < n) and Xi == U, .. 1 (0 «; 1 < nl. Then it is not difficult to
show that r(u) セ r(x) == r(res (u» - r(res (x».
Let N,(I) , i> 0, be the number of dklr-sequences with 1= k of length i and
let N,(O) == I.
Xj
Theorem J
Let x = (X'-l'
..• , Xi)
e A •. Then
.-1
r(x) ""
I
Xj
N,(j).
jセゥI
Praal
Let the nonzerO coordinares of X be indexed by i, < 12 < ... < i", i.e.,
Xi = I if and only if ie (i
... , iql. Let u be the smallest element of ui. with
the property that Ui. == 1. "Then ir is not difficult to see tbat the second 1 of u
occurs at position iq - k - I, îf i" - k - I )0 0 (otherwise u has only one 1, if
r;;;' i", or also IlO = 1 if r < tg). Here we recall that the all-zero sequence of
length n is a dklr-sequence if and only if n < min 11, rl. Let 8(i", r) = 1 if
r;;" iq, and e(iq, r) = 0 if r < i", Then we obtain
r(u)
セ
r(res(u» = the number of dklr-sequences·wÎth I .. k of length I"
with their leftmost 1 at position j where roaxlO, i g - k - IJ <J < Iq +
t(iq, r) = N,(i").
Furtherroore, on the basis of the above mentioned observation It holds that
r(x) '" r(u)
=
r(u)
+ r(x)
- r(u)
+ r(res (x»
= r(res (x»
セ
r(res (u»
+ Nr(iq).
The theorem then follows by induction.
Q.E.D.
We have found a sîmple method for mapping the elements of .;drt onto consecutive integerst For practical applications we need a mapping of the elements
of A. onto the set 10,1, ... , IA.I - Ij, where IA.I is the cardinality of A•.
Obviously the set A. consists of the IA"j largest elements of •.4". In addition
it is dear that the number a of elements of 4" that are smaller than all the
elements of A" is equal to r(a); I.e., a = r(a), where a is the smallest element
of A •. In this way we have proved the following theorem.
78
Binary channe/ codes
Theorem 2
Thc:: LransformalÎon I: A,,---- N u (Ol defined by t(x) = r(x) "a for all
xeA" is a one-Lo-one mapping from A n onto the set (0,1, ... , IA"I - IJ
which preserves the ordering of A", i.e" x <0 y if and only if t(x) < t(y).
The number a can also be expressed in another way. In order to do so we
define N,°(j), j > 0, to be the number of dk-sequences of length j with their
leftmost element equal to land satisfying the r-constraint, N,(J(O) : = I. lt is
nol difficulL to show Lhat
11"1"'/
a
=
L
]1 ...
N,°(j)
j=1
+
=
I
1-/
N,o(j).
)=0
Thc numbcrs N,(j) and N,°(j) are easily eomputed. They ean be found by a
slraighlforward computer "eareh. A more sophisticated approach to tinding
these numbers ean be based on the finite state transition matrix eonesponding
to the dklr-sequenees 5).
The eonversion from integers to dklr-sequences of Iength ft is aho analogous
to Tang and Bahl's method and çan be carrîed out as follows. Let To, .. " T.-l
he integers defined by
Ti = the number of elements y in ,yI" smaller than Lhe smallest element u in
,!d" with Ui 71 and Uj = 0 for j> i,
i.e.,
1i =
I
i
J
j.l
Nr(J(j)
+ 1 = L N,(J(j).
j:(J
From lhe definîLion and our assumption r;? d it immediately follows that
Tu < T, <: ... < Tn .,. These integers are used for mapping consecutive integers onto dklr-sequenees of length n as shown in the following theorem.
Theorem 3
Let x '" HNセG⦅iLB
" "' xo) be a dklr-sequence of length n. Then a) Xr = land
Xj = 0 1'01' t <) < n <;. Tt < r(x) < 7;'1. Furthermore, jf x, = land Xj = 0 1'01'
t <) .".:: n. then b) iMセイt
セ r(x) - N,{l) < 7;-0'
Proof
a) This statement follows from the definition of T,.
b) Let x be a dklr-sequence of length n with XI =0 land Xj = 0 1'01' t <) < n.
Then tLMセャ
,,;; r(res (x») < Tt_d. Henee b) follows, sincc r(x) - r(res (x» =
N,(t).
Q.E.D.
The conversion from integers to dk-sequenees ean therefore also be gener·
alized to the conversion from integers to dklr-sequenees. The following simpte
Methad for encoding and decoding runlength·Umited binary sequences
79
encoding algorithm for dklr-sequenees ean be derived from this theorem.
Given an integer Jin the set R = (r(x) IXEA.1 (I = 0 if x = 0), we first locate
the largest possible 1, 0".;;; t < n, sueh that T, セ 1< 1;+1 alld we make x, = I.
Subtracting the contribution of Xl in J, we get a new integer 1- N,(t).
Theorem 3 can be used again to find the next nonzero component of x. The
second part of theorem 3 assures us that x, win be followed by at least d, but
no more than k zeros.
3. The efficiency of dkJr-sequences
In modulation systcms the dklr-sequences of length n cannot in general be
caseaded without violating the dk-collstraint at the boundaries. lnserting a
number p of merging bits between adjacent n-sequenccs makes it possibIe to
preserve the d· and k-constraints for the cascaded output sequenee. Thc dksequences need f3 = d + 2 merging bits '), whereas only IJ '" d merging bits are
required for dklr·sequcnces, provided that the parameters land rare suitably
chosen. Henee this method is more efficîent, especially for small values of n.
We shall now demonstrate two eonstfuctions of codes with merging rules of
increasing complexity and efficiency.
Construction 1
Choose d, k, rand n sueh that r + d,;;;: k. Let {= k - d - rand IJ = d. Then
the dklr-scquences of length n can be cascaded without violating the d- and
k-constraints if the merging bits are all set to zero.
Construction 2
Choose d, k and n such that Zd - 1 < k. Let r = 1= k - d and f3 = d. Then
the dklr-scqucnccs of length n ean be cascaded without violating the d- and
k-consttaints if the merging bits are determined by the following rules. Let an
(s,,;;; r) while the next n-sequenee start
n-sequenee end with a run of s セ・イッウ
wilh t (t,;;; I) leading zeros. Table I shows the merging rule for the f3 = d
merging bits.
TABLE I
Merging of dklr-sequences where (}i stands
for j eonsecutive セ・イッウ
s, t
s+t+d<k+
s+t+d";?;k+
if s< d
ifs>d
merging bits
Binary channel codes
80
The numbeT m of data bilS that can be represented uniquely by a dklr·
sequence of length n is given simply by
where lx J is the greatest integer not greater than x. The ratio R of thc numbeT
of data bits and the number of needed channel bits is called the information
rate of the code. FOT exampte, the information rate of the codes based on the
two above-mentioncd constructions equals R = m/(n + dl. The asymptotic
information rate is the capacity C of Shannon's discrete noiseless runlengthlimited channeI 6,7),
C=
IÎm
ャッァセQ Nl
n
The efficiency rI can be defined as the ratio of the information ratc Rand the
capacity C of the noiseless runlength-limited channel,
IJ = RIC.
In order to get some insight into the efficiency of thc codes based on constructions 1 and 2 we have considered same cxamples. For m = 8 and for
d = 1,2, 3.4 and k = 2d, . .. ,20 we have determined n in such a way that the
infmmalÎon rale R was maximized. In order to compare our two eOnSlrUCtions to Tang and BahI's method we have calculaled the corresponding capacities C and elficiencies '7. Thc capacity of the noiseless runlength-limited channels waS ealculatcd hy a method given in ref. 1.
OUr results cal") be summarized as follows. For small values of k, i.e.,
2d.:;; k < 3d, construction 2 is only slightly better than Tang and Bahl's
mClhod (approximalely 5 percent), while the efficiency of construclion 1 was
worse (5 to JO percent). For larger values of k, howeveT, constructions 1 and 2
are clearly better. For those values of k tne gain of construction 2 compared
to Tang and BahI' S method is most significant for d = I, 2 (12 to 15 pereerlt), while for d = 3,4 the gain is equal to 9 percent. For large values of k,
constructionS 1 and 2 have the same etftciency; for the other values of k, conセエイオ」 ゥッ ャ
2 has a better efficiency than construction I. Tablcs 1I, III and IV
give the results for m '" 8 and d = 1, 2, 3 and 4; in oTder to limil the !ength of
[he tables, we have restricted k and n 10 those valucs whieh maximize the
information rate R. We note that rates up to 95 percent of the channel
capacity can be achieved. On average we observe a slight difference in the rates
obtained by constructions 1 and 2, approxÎmately 5 percent in favor of construction 2.
Methad jor encoding and decoding runlength·/imited binary sequences
81
TABLE II
Block codes based on construction 1
d
k
1
2
3
4
17
14
18
7
n
R
C
12
14
8/13
8/16
8/20
8/23
0.68
0.55
0.46
0.40
17
19
Yf
=
RfC
0.91
0.91
0.87
0.87
TABLE UI
Block codes based
On
construction 2
d
k
n
R
C
,,= R/C
1
2
3
4
5
10
lO
12
12
14
17
8/13
8/16
19
8/23
0.65
0.54
0.45
0.39
0.95
0.92
0.90
0.90
8/20
TABLE IV
Block codes based on Tang and Bahl's construction
d
k
n
R
C
1'/ = RIC
I
2
3
4
5
9
8
10
12
14
17
19
8/15
8/18
8/22
8/25
0.65
0.54
0.43
0.38
0.82
0.83
0.86
0.85
4. ConcIusion
Methods are described for the construction of runlength-limited codes on
the basis of sequences of fixed length. Additional constraints On the maximum
numbcr of l;CWS at tbe beginning and end of a sequence, a generallzation of
Tang and Bahl's work, allow a more efficient merging of the sequences. For
short lengths in particular, our method yields better efficiencies than those of
Tang and Bah!.
82
8inary channel codes
REFERENCES
') D. T, Tang and L. R. Bahl. Inform, Contr. 17,436-461 (1970).
') H. Kobayashi, IEF:E Trans. Comm. Teeh. COM-l?, 1087-1 J(lO (1971).
Bセゥ
K. A. Schouhamer Immink, PrOC. IEEE int. Con1. on a」ッオセエゥB
Speech, anu Signal Pr<\cessing, 587-589 (1981).
') M. G. Pelchal alld J, M. Geis!, IEEE Trans, Comm. COM·23. 878-883 (1975).
') P, A. Frananck, IBM J. r・セN
Ocv. 14,376-,83 (1970).
0) C. E. Shannon, !Jell $yst. Teeh . .I. 27, 379-423 (1948).
') C. V. l"reiman anJ A. D. Wyner, InforOl. Conlr. 7, 398-415 (1964).