I Introduction
The work was supported in part by the European Research Council under the European Union's Seventh Framework Programme (FP7/2007-2013)/ERC grant agreement no. 337752, and by ISF research grant 818/17. Part of this work was presented at the 2017 IEEE Int. Symposium on Information Theory (ISIT 2017), Aachen, Germany [Peled_ISIT]. O. Peled, O. Sabag and H. H. Permuter are with the Department of Electrical and Computer Engineering, Ben-Gurion University of the Negev, Beer-Sheva, Israel (oripe@post.bgu.ac.il, oronsa@post.bgu.ac.il, haimp@bgu.ac.il).

The physical limitations of the hardware used in recording and communication systems cause some digital sequences to be more prone to errors than others. This elicits the need to ensure that such sequences are not recorded or transmitted. Constrained coding is a method that enables such systems to encode arbitrary data sequences into sequences that abide by the imposed restrictions [Marcus98]. In the classical constrained coding setting, it is assumed that the transmission is noiseless as long as the transmitted sequence satisfies the imposed constraint. In this paper, however, we consider the transmission of constrained sequences over a noisy channel, the binary erasure channel (BEC) (Fig. 1).
Run-length limited (RLL) constraints are common in magnetic and optical recording standards, where the run length of consecutive '0's should be limited between d and k. A (d, k)-RLL constrained binary sequence must satisfy two restrictions:

at least d '0's must follow each '1'.

no more than k consecutive '0's are allowed.
The first restriction ensures that the frequency of transitions, i.e., changes from '0' to '1' or vice versa, will not be too high. This is necessary in systems where the sequence is conveyed over band-limited channels. Timing is commonly recovered with a phase-locked loop (PLL) that adjusts the phase of the detection instant according to the observed transitions of the received waveform. The second restriction guarantees that the PLL does not fall out of synchronization with the waveform [Immink90runlengthlimitedsequences, Marcus98].
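As a concrete illustration of the two restrictions, a binary sequence can be checked against them directly. The sketch below is our own (the function name and the convention that only zero-runs between two '1's are subject to the lower limit d are assumptions, since leading and trailing runs are handled differently across standards):

```python
def is_dk_rll(bits, d, k):
    """Check the (d, k)-RLL restrictions on a binary sequence:
    every run of '0's between two '1's has length at least d,
    and no run of '0's anywhere exceeds k."""
    run = 0           # length of the current run of '0's
    seen_one = False  # whether a '1' has occurred yet
    for b in bits:
        if b == 0:
            run += 1
            if run > k:            # restriction 2: at most k consecutive '0's
                return False
        else:
            # restriction 1: at least d '0's must separate consecutive '1's
            if seen_one and run < d:
                return False
            seen_one = True
            run = 0
    return True
```

For example, `is_dk_rll([1, 0, 0, 1, 0, 0, 0, 1], 2, 3)` holds, while inserting a fourth '0' into either run, or placing two '1's adjacently, violates the constraint.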
Two important families of the RLL constraint are the (0, k)-RLL and the (d, ∞)-RLL. These constraints might seem symmetric in some sense but may, in fact, greatly differ in their behavior; see, e.g., [dk_1, dk_2]. Therefore, when dealing with RLL constraints, it is common to tackle each of these families separately before approaching the general (d, k)-RLL constraint. In this paper, we adopt this approach and show that the (0, k)-RLL problem is solvable, while the same problem with a (d, ∞)-RLL constraint is a great deal more challenging.
The model studied in this paper is a BEC (Fig. 1) in which the input sequences must satisfy the (0, k)-RLL constraint. Two cases of this model are investigated, based on the information that is available to the encoder. In the first case, described in Fig. 2, the encoder has access to all past outputs via a noiseless feedback link. In the second case, described in Fig. 3, the encoder has noncausal access to the erasures that are about to occur; that is, the encoder knows in advance whether the BEC behaves like a clean channel or not. From an operational point of view, the capacity of the noncausal case is at least as large as that of the feedback case, due to the additional information available to the encoder.
We show that the feedback capacity of the (0, k)-RLL input-constrained BEC is:
(1) 
for all k and all erasure probabilities, where the maximizing parameters are simple functions given in Eq. (6) below. Surprisingly, we are also able to show that noncausal knowledge of the channel erasures does not increase the feedback capacity, so that (1) is the noncausal capacity as well.
This work generalizes the results in [Sabag_BEC], where the feedback capacity of the (1, ∞)-RLL input-constrained BEC was calculated (the (1, ∞)-RLL constraint is equivalent to the (0, 1)-RLL constraint by swapping '0's and '1's). In [Sabag_BEC] and other works [TatikondaMitter_IT09, Chen05, Yang05, PermuterCuffVanRoyWeissman08, trapdoor_generalized, Ising_channel, generalized_Ising, sabag_BIBO], the capacity was derived by formulating it as a dynamic programming (DP) problem and then solving the corresponding Bellman equation. In all of these past works, the DP state was an element of the one-dimensional simplex, an essential property in the solution of the Bellman equation. The DP state in our case, however, is an element of the k-dimensional simplex. This makes the approach of solving the Bellman equation intractable, and different methods are required.
To circumvent this difficulty, we use alternative techniques to derive the capacity of our problems. The upper bound follows from standard converse techniques for the noncausal model, where the encoder knows the erasures ahead of time. This is trivially an upper bound for the feedback model as well, since noncausal knowledge can only increase the capacity. Then, we construct a simple coding scheme for the feedback setting, inspired by the posterior matching principle [horstein_original, shayevitz_posterior_mathcing, Li_elgamal_matching, shayevitz_simple]. The coding scheme enables both the encoder and the decoder to systematically reduce the set of possible messages to a single message, which is then declared by the decoder as the correct message. An analysis of the achieved rate reveals an expression that is similar to the upper bound. The equivalence of these bounds is finally derived, which establishes both the feedback capacity and the noncausal capacity for our setting.
The remainder of the paper is organized as follows: Section II includes the notations we use and the problem definition. Section III contains the main results of this paper. In Section IV we present the coding scheme and its rate analysis. Section V includes an upper bound on the capacity. In Section VI we discuss an RLL input constraint for which the noncausal capacity is strictly greater than the feedback capacity. Section VII presents the feedback capacity of a related RLL-constrained BEC, as an example of possible future avenues of research. Finally, the appendices contain proofs of several lemmas used throughout the paper.
II Notations and Problem Definition
II-A Notations
Random variables are denoted by capital letters, e.g., X, and lowercase letters denote their realizations. Calligraphic letters denote sets. The notation X^n stands for the tuple (X_1, X_2, ..., X_n), and x^n is a realization of such a vector of random variables. For a real number a in [0, 1], we define a-bar = 1 - a. The binary entropy function is denoted by H_b(a) = -a log_2(a) - (1 - a) log_2(1 - a) for a in [0, 1].

II-B Problem Definition
The BEC (Fig. 1) is memoryless, that is, the current output depends on past inputs and outputs only through the current input, and it can be represented by an output that equals the input when no erasure occurs and equals the erasure symbol '?' otherwise,
where the erasure indicators form an i.i.d. process with erasure probability ε. A message is drawn uniformly from the message set and is available to the encoder. We define two models, based on the additional information that is available to the encoder: in the first model, at each time, the encoder has knowledge of all past channel outputs via a noiseless feedback link (Fig. 2); in the second model, the encoder has noncausal access to the entire erasure sequence (Fig. 3). In both cases, the transmission is over a BEC.
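The erasure model described above amounts to a few lines of code; the sketch below is illustrative only (the erasure symbol '?' and the function name are our choices):

```python
import random

def bec(x, eps, rng):
    """Memoryless BEC: each input symbol is erased (replaced by '?')
    independently with probability eps."""
    return ['?' if rng.random() < eps else b for b in x]

y = bec([0, 1, 1, 0, 1], eps=0.5, rng=random.Random(0))
# every non-erased output equals the corresponding input
```

Setting `eps=0` reproduces a noiseless channel, and `eps=1` erases everything, matching the two capacity endpoints discussed later.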
The encoder must produce sequences that comply with the (0, k)-RLL input constraint. This constraint can be described graphically (Fig. 4), where no walk along the directed edges of the graph contains the forbidden string. Note that the last node has only one outgoing edge, labeled '1', which implies that after k consecutive '0's, the next bit must be a '1'. The constrained encoder and decoder operations are made precise by the following code definitions.
Definition 1.
A code for an input-constrained BEC is composed of encoding and decoding functions. The encoding functions for the first model (with feedback) are:
(2) 
satisfying the (0, k)-RLL constraint (whenever the previous k channel inputs are all '0's, the next input must be '1') for all times. For the noncausal model, the encoding functions are defined by:
(3) 
satisfying the same (0, k)-RLL constraint for all times. The decoding function for both models is defined by:
Without loss of generality, we assume that the channel inputs preceding the transmission are '1's, so that the initial state is the node 0 of the constraint graph.
The average probability of error for a code is defined as the probability that the decoded message differs from the transmitted one. A rate is said to be achievable if there exists a sequence of codes whose error probability vanishes as the block length grows. The capacity is defined as the supremum over all achievable rates and is a function of k and the erasure probability ε. Since each channel output is computable from the corresponding input and erasure, the feedback capacity is upper bounded by the noncausal capacity for all k and ε.
III Main Results
In this section we present the main results, namely the feedback capacity and the noncausal capacity of the BEC with (0, k)-RLL input constraint, and explain the methodology used to prove them. The following theorem states our main result regarding the feedback capacity and the capacity-achieving coding scheme. Define the function:
(4) 
where takes values in for .
Theorem 1.
The feedback capacity of the (0, k)-RLL input-constrained BEC is:
(5) 
where the remaining parameters are functions of the single optimization variable and can be calculated recursively using:
(6) 
with . In addition, there exists a simple coding scheme that achieves the capacity given in (5).
Fig. 5 presents graphs of the feedback capacity as a function of ε for several values of k. The capacity is a decreasing function of ε and an increasing function of k. For ε = 0, the channel is noiseless, and so the capacity is that of the corresponding constraint. For example, for k = 1 the capacity is log_2 of the golden ratio, which is known to be the capacity of sequences that do not contain two consecutive '0's. For ε = 1, the output is constant, so the capacity is zero for all k. As k increases, the constraint becomes more lenient and the capacity approaches 1 - ε, which is the capacity of the unconstrained BEC.
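The ε = 0 endpoint quoted above can be reproduced numerically: the noiseless capacity of a constraint equals log_2 of the spectral radius of the adjacency matrix of its graph (Fig. 4). The sketch below builds a (k+1)-state graph in which state i counts the current run of '0's, and estimates the spectral radius by power iteration; the function names are ours.

```python
import math

def rll_adjacency(k):
    """Adjacency matrix of a (0, k)-RLL graph: state i is the current
    run of '0's; emitting '1' returns to state 0, emitting '0' moves
    to state i + 1 and is forbidden from state k."""
    n = k + 1
    A = [[0] * n for _ in range(n)]
    for i in range(n):
        A[i][0] = 1          # edge labeled '1'
        if i < k:
            A[i][i + 1] = 1  # edge labeled '0'
    return A

def spectral_radius(A, iters=200):
    """Largest eigenvalue of a nonnegative matrix by power iteration."""
    v = [1.0] * len(A)
    lam = 1.0
    for _ in range(iters):
        w = [sum(row[j] * v[j] for j in range(len(v))) for row in A]
        lam = max(w)
        v = [x / lam for x in w]
    return lam

# k = 1: binary sequences with no two consecutive '0's
cap = math.log2(spectral_radius(rll_adjacency(1)))  # log2 of the golden ratio, about 0.694
```

Increasing k makes the constraint more lenient, and `cap` climbs toward 1, the noiseless unconstrained value, in agreement with the discussion above.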
Theorem 1 guarantees that even though the function we aim to maximize depends on multiple variables, calculating the capacity requires a maximization over a single variable only. Given its value, the values of all the other variables can be calculated by utilizing the set of equations given in (6).
Our proposed coding scheme has k degrees of freedom, represented by a k-tuple of parameters. For this reason, it is rather surprising that the feedback capacity is a simple optimization problem of one variable for all k. Indeed, a relaxation of the optimization domain shows that optimizing over the k-tuple and the optimization in (5) and (6) are equivalent. In addition, we also prove the following:
Theorem 2.
Noncausal knowledge of the erasures does not increase the feedback capacity, that is:
It is tempting to conjecture that this property holds for the general (d, k)-RLL constrained BEC, but we will provide a counterexample in Section VI. Theorems 1 and 2 both generalize parallel results shown in [Sabag_BEC], where the special case k = 1 was solved using different techniques. The following chain of inequalities comprises the main steps required to prove Theorems 1 and 2:
(7) 
where

Inequality (a) follows from the coding scheme that is presented in Algorithm 1. Specifically, it is shown that this rate is achievable for any choice of the scheme parameters.

Inequality (b) follows from the operational definitions of the code in Section II.

Inequality (c) follows from standard converse techniques for the noncausal setting.
The next lemma shows that the maximal value remains the same whether the maximization is performed over the original domain or over the relaxed one. Thus, the chain of inequalities is actually a chain of equalities.
Lemma 1.
IV Optimal Coding Scheme for the Input-Constrained BEC with Feedback
In this section, we introduce the idea behind the optimal coding scheme, as well as the scheme itself, which is presented in Algorithm 1. We then prove that the scheme is feasible, meaning that the generated input sequence does not violate the input constraint, and, finally, calculate its rate.
IV-A Coding Scheme
Before presenting the coding scheme itself, we discuss the basic ideas in accordance with which it operates. The coding scheme is a mechanism that allows the encoder to transmit a message to the decoder without violating the channel's input constraint. The main feature of the scheme is a dynamic set of possible messages that is known both to the encoder and to the decoder at all times. Both parties systematically reduce the size of this set, from the full message set at the beginning of the transmission to a single message, which is then announced as the correct message.
Initially, the messages are mapped uniformly to message points in the unit interval. As the transmission proceeds, the set of possible messages continues to be represented by uniform points on the unit interval, with proper scaling.
Channel inputs are determined by labeling functions, which map the unit interval into {0, 1}. Given a labeling with its corresponding parameter, the encoder assigns the label '0' to a subsection of the unit interval whose length is given by the parameter, and the label '1' to the rest of the interval. Fig. 6 depicts the various labelings. Define the following function:
(8) 
The channel input is the label that the labeling being used at the current time assigns to the correct message point.
The labeling at each time is a function of all past channel outputs and can be computed recursively from the previous channel input and the previous labeling. Therefore, the instantaneous labeling is available both to the encoder and to the decoder. Transitions between the various labelings are controlled by a finite-state machine (FSM), which is illustrated in Fig. 7. Define the following function:
(9) 
and thus, .
A transmission at a given time is said to be successful if the output is not erased. Due to the nature of the BEC, whenever a transmission is successful the decoder knows with certainty the value of the channel input. Denote the subsets of messages which are labeled at that time with '0' and with '1', respectively, and define:
(10) 
Thus, a successful transmission reduces the size of the set of possible messages. Following a successful transmission, the remaining messages are uniformly mapped again to the unit interval. Fig. 8 depicts a successful transmission and the subsequent reduction of the number of possible messages. The process continues until the set of possible messages contains only one message, at which point the decoder declares it to be the decoded message.
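Ignoring the input constraint and the specific labelings, the set-reduction mechanism itself can be sketched in a few lines: messages sit as uniform points in [0, 1), each channel use sends the label of the true message, and every successfully received bit lets both parties discard the messages with the other label. The fixed threshold 0.5 and all names below are our simplifications, not the scheme of Algorithm 1.

```python
import random

def transmit(messages, true_msg, eps, rng, threshold=0.5):
    """Reduce the set of possible messages to a single one over a BEC.
    After every successful channel use, both parties keep only the
    messages whose label matches the received bit and re-map the
    survivors to uniform points in [0, 1)."""
    possible = list(messages)
    uses = 0
    while len(possible) > 1:
        n = len(possible)
        # uniform message points j / n; label '0' below the threshold
        label = {m: 0 if j / n < threshold else 1
                 for j, m in enumerate(possible)}
        x = label[true_msg]                   # channel input
        y = '?' if rng.random() < eps else x  # BEC output
        if y != '?':                          # successful use
            possible = [m for m in possible if label[m] == y]
        uses += 1
    return possible[0], uses

decoded, uses = transmit(range(16), true_msg=11, eps=0.3, rng=random.Random(1))
# decoded == 11: the true message always survives the filtering
```

Erasures only delay the process; correctness is an invariant, since the label of the true message always matches any successfully received bit.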
IV-B Feasibility of the Proposed Scheme
First, we show that the coding scheme satisfies the (0, k)-RLL constraint, that is, no message is mapped by the scheme into a sequence with more than k consecutive '0's. The following lemma shows that the constraint is satisfied under a suitable restriction of the scheme parameters.
Lemma 2.
If the scheme parameters satisfy the restriction above, then any channel input sequence generated by the proposed coding scheme satisfies the (0, k)-RLL constraint.
Proof.
We show that, under this restriction, no message is labeled '0' more than k times in a row. From Eqs. (8) and (9), the channel input is a function of the current labeling and the message point. Therefore, we divide the proof into three disjoint cases, based on the last k outputs. For each case, we show that the subsequent sequence of channel inputs cannot contain more than k consecutive '0's. Assume that transmission begins at the node associated with the first labeling.

Any output sequence (of length k) that contains a '1' cannot cause a violation.

Lastly, consider a sequence of outputs that contains both '0's and erasures. Assume the first erasure occurred at some time instance, meaning that the erasure took place while the encoder was using the labeling in force at that time. This means that all messages in the corresponding subinterval had been labeled with '0' several times in a row. Under the restriction on the parameters, all messages that were labeled '0' the maximal number of times in a row are labeled '1' in the next labeling to be used. This analysis holds for any first-erasure time.
In summary, the restriction on the parameters ensures that the scheme does not violate the (0, k)-RLL constraint. ∎
IV-C Rate Analysis
The achieved rate is measured in bits per channel use. Define the information gain as the number of information bits gained in a single channel use, i.e., the logarithm of the ratio between the sizes of the set of possible messages before and after the transmission. Additionally, let the labeling in use at each time be represented by a random variable taking values in the set of labelings.
In the following lemma, we calculate the expectation of the information gain.
Lemma 3.
For all times, the expected information gain given the labeling in use equals the stated expression, where the parameter is the one corresponding to that labeling.
Proof.
Consider:
(11) 
where (a) holds since, if the transmission is not successful, the transmitted symbol is erased by the channel and the set of possible messages is unchanged.
In the proposed coding scheme, each labeling assigns a portion of the unit interval, whose size is given by the corresponding parameter, to the label '0' and the rest to the label '1'; the final labeling likewise assigns its portion of the unit interval to a single label. If a given labeling is employed, then the channel input is distributed accordingly. (Footnote: the encoder transmits a given bit if the correct message falls within the corresponding subinterval. Note that the messages are discrete points in the unit interval, and it is possible for the partition to occur between two messages. This implies that the transmitted bit is distributed according to the nominal probability up to a correction factor. By the continuity of the entropy function, the contribution of this correction factor can be bounded by an arbitrarily small constant by taking the block length to be large enough. The precise details are omitted and follow a parallel argument to [Sabag_BEC, Appendix C].)
Assume that the transmission is successful. If the received bit was '0', then the new set of possible messages has the corresponding size, and if it was '1', then it has the complementary size. The expected number of bits required to describe the new set of possible messages follows. Thus, following a successful transmission, the decoder gains the corresponding number of bits of information. Substituting into (11), we get:
∎
The next lemma calculates the rate achieved by the proposed coding scheme.
Lemma 4.
For any admissible choice of the scheme parameters, the proposed coding scheme achieves the following rate:
(12) 
Proof.
Consider the average gain of information divided by the transmission time:
where

Follows from the law of total expectation.

Follows from Lemma 3 and from exchanging the order of the finite sums.

Follows from the definition of stationary probability; here each weight is the stationary probability of the corresponding labeling. The stationary probability exists because the random process of labelings is a positive-recurrent, irreducible and aperiodic Markov chain, as can be seen from Fig. 7.

Follows from the calculation of the stationary probabilities of the Markov chain described in Fig. 7, which are parameterized by the scheme parameters. Calculating the conditional probability of each edge is simple, using the law of total probability; for example, the conditional probability of the edge beginning at one node and culminating in the next follows directly.
From Lemma 2, we conclude that the rate in (12) is achievable. ∎
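Step (c) above relies on the stationary distribution of the labeling chain. For any finite, irreducible, aperiodic chain this distribution can be computed numerically; the sketch below uses an illustrative three-state transition matrix of our own choosing, not the actual chain of Fig. 7.

```python
def stationary(P, iters=500):
    """Stationary distribution of a finite Markov chain by iterating
    pi <- pi P (valid for irreducible, aperiodic chains)."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

# Illustrative 3-state chain (NOT the chain of Fig. 7): from each
# state the walk either returns to state 0 or advances one state.
P = [[0.6, 0.4, 0.0],
     [0.7, 0.0, 0.3],
     [1.0, 0.0, 0.0]]
pi = stationary(P)  # satisfies the balance equations pi = pi P
```

For a chain with this "return to the start or advance" structure, the balance equations can also be solved by hand, which is how the closed-form weights in step (d) would typically be obtained.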
V Noncausal Upper Bound
In this section, we present an upper bound on the noncausal capacity of the (0, k)-RLL input-constrained BEC (given in Eq. (7)). We begin with an observation: it is sufficient to consider a smaller family of codes, called restricted codes. We then proceed to calculate an upper bound on the achievable rate of such codes, using standard converse arguments, the method of types, and results from Markov chain theory.
A code is said to be restricted if
(13) 
Condition (13) states that if an erasure is about to occur, the encoder transmits '1'. The following lemma formalizes the fact that restricted codes can achieve the capacity.
Lemma 5.
For the (0, k)-RLL constrained noncausal BEC, if a rate is achievable, then it can be achieved using a sequence of restricted codes.
Proof.
Assume the rate is achieved using a sequence of codes. For each code in the sequence, define a new code that is exactly the same, except that whenever an erasure is about to occur, the encoder transmits '1'.
The new code does not violate the input constraint, since the original code did not violate the constraint and transmitting '1' is always permitted by the (0, k)-RLL input constraint. In addition, the channel outputs remain the same whether the original or the new code is used. This means that the probability of error is unchanged, and so the rate is also achieved by the sequence of restricted codes. ∎
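The key observation in the proof, namely that the channel outputs are unchanged when the input at erased positions is replaced by '1', can be checked mechanically. The sketch below fixes the erasure pattern explicitly; all names are our own.

```python
def bec_out(x, erasures):
    """BEC output for a known erasure pattern (1 means 'erase here')."""
    return ['?' if e else b for b, e in zip(x, erasures)]

x        = [0, 1, 0, 0, 1, 0]
erasures = [0, 1, 0, 1, 0, 0]
# restricted code: transmit '1' wherever an erasure is about to occur
x_restricted = [1 if e else b for b, e in zip(x, erasures)]
same = bec_out(x, erasures) == bec_out(x_restricted, erasures)
# same is True: the decoder's view is identical under both codes
```

Since the erased positions never reach the decoder, the two codes induce the same output distribution, and hence the same error probability.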
V-A Upper Bound Calculation
The following technical lemma is needed for the converse proof:
Lemma 6.
For any n-tuple constrained distribution, there exists a time-invariant constrained Markov distribution such that:
(14) 
where .
The proof is available in Appendix B. This result can readily be generalized to any RLL constraint imposed on the n-tuple distribution.
In the constraint graph, Fig. 4, a node can be calculated from a tuple of any length by walking along the edges labeled by the tuple, since we assume that the initial state is the node 0. A corresponding notation will be used in various forms for constrained distributions, to signify the probability assigned to a constrained word.
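The node-from-tuple computation described above has a particularly simple form for a (0, k)-type graph in which state i counts the current run of '0's: the node reached by a valid tuple is the length of its trailing zero-run. A minimal sketch (the function name is ours, and the state-counting convention is our assumption about Fig. 4):

```python
def graph_state(bits):
    """Node of the constraint graph reached after walking the edges
    labeled by `bits` from the initial node 0: for a (0, k)-type graph
    this is the length of the trailing run of '0's."""
    state = 0
    for b in bits:
        state = state + 1 if b == 0 else 0
    return state
```

For example, the tuple (1, 0, 0) ends at node 2, while any tuple ending in '1' returns to node 0.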
Proof of the upper bound.
Let be an achievable rate using a restricted code, and consider the following chain of inequalities:
(15) 
where

Follows from Fano’s inequality.

Follows from the chain rule.

Follows from the fact that conditioning reduces entropy, so: .

Follows from the fact that conditioning reduces entropy.

The maximization domain is the set of all n-tuple distributions that induce the corresponding marginal and conditional distributions.

We want to show that it is possible to maximize over a smaller domain and maintain equality. It suffices to prove by induction that if two distributions induce the same marginal distributions, then they coincide. The base case is trivial. Assume by induction that the claim holds for shorter tuples; we now prove it for the full length. Indeed, we have:
since is the same for both distributions by assumption, and the