Computability of the Channel Reliability Function and Related Bounds
Holger Boche
Chair of Theoretical Information Technology, Technical University of Munich, D-80333 Munich, Germany
and Munich Center for Quantum Science and Technology (MCQST), D-80799 Munich, Germany
Email: [email protected]
Christian Deppe
Institute for Communications Engineering, Technical University of Munich, D-80333 Munich, Germany
Email: [email protected]
Abstract
The channel reliability function is an important tool that characterizes the reliable transmission of messages over communication channels. For many channels, only upper and lower bounds on this function are known. In this paper we analyze the computability of the reliability function and of related functions. We show that the reliability function is not a Turing computable performance function. The same applies to the functions of the sphere packing bound and the expurgation bound. Furthermore, we consider the R_inf function and the zero-error feedback capacity, both of which play an important role for the reliability function. Neither the R_inf function nor the zero-error feedback capacity is Banach Mazur computable. We show that the R_inf function is additive. The zero-error feedback capacity is super-additive, and we characterize its behavior.

Keywords: computability, channel reliability function, zero-error feedback capacity
Preprint submitted to Journal of LaTeX Templates, January 26, 2021

1. Introduction
The foundations of information theory were laid by Shannon in his 1948 paper [1]. In order to characterize important properties of communication channels, he introduced the reliability function E(R) as the optimal exponent of the exponential decrease exp(-nE(R)) of the decoding error probability as the code length n increases, for a given transmission rate R below the capacity C of the channel.

It is of course a high-priority goal for information theory to find a closed form expression for the channel reliability function. Such a formula should allow a simple computation of the channel reliability function from the parameters of the communication task. Of course, we must specify what a closed form expression is. In [2] by Chow and in [3] by Borwein and Crandall, examples of such specifications are given. In all the approaches in [2] and [3], a closed form representation is always coupled with the requirement that the corresponding functions can be computed algorithmically on a digital computer in a very precise manner, depending on the inputs from their domain of definition.

Shannon's characterization of the capacity for the transmission of messages over the discrete memoryless channel (DMC) in [1], Ahlswede's characterization of the capacity for the transmission of messages over the multiple access channel in [4], and Ahlswede and Dueck's characterization of the identification capacity for DMCs in [5] are important examples of closed form solutions through elementary functions in the sense of Chow and Borwein/Crandall, and are examples of computability of the corresponding performance functions in the above sense. The precise meaning of computability as defined by Turing is given later, in Section 2. Also, Lovász's characterization of the zero-error capacity of the pentagon is an example of a closed form number in the sense of Chow and can be computed algorithmically, which is desirable as well.
However, for the heptagon, the characterization of the zero-error capacity is still pending. It is also unclear whether the zero-error capacities of DMCs even take computable values for computable channels. Furthermore, it is unclear whether the broadcast capacity region can be algorithmically computed.

In this paper, we give a negative answer to the question of whether the channel reliability function and several related bounds are algorithmically computable.

Since Shannon introduced the reliability function in [6], there has been a lot of work on this topic. Many problems regarding the behavior of the channel reliability function are still open (see the surveys [7] and [8]). Even today, a characterization of the channel reliability function is not known even for binary-input binary-output channels. That is why a large number of papers have tried to find computable lower and upper bounds for the channel reliability function (see [9, 10, 11]).

It is difficult to determine the behavior of the channel reliability function over the entire interval (0, C). There are approaches that have tried to compute the reliability function algorithmically, that is, by considering sequences of upper and lower bounds. The first paper in this direction was [12] by Shannon, Gallager and Berlekamp. We ask here whether it is possible to compute the reliability function in this way. To formulate this question, we use the theory of Turing computability [13]. In general, a function is computable if an algorithm can be formulated for it. The strongest model of computability is computability by a Turing machine. Turing machines make the concepts of algorithm and computability mathematically precise, that is, they formalize these concepts. In contrast to a physical computer, a Turing machine is a mathematical object and can be examined using mathematical methods.
It is important to note that the Turing machine describes the ultimate performance limit of today's digital computers, including supercomputers. A Turing machine represents an algorithm or a program. A computation consists of the step-by-step manipulation of symbols or characters, which are written to a memory tape according to certain rules and are also read from there. Strings of these symbols can be interpreted in different ways, including as numbers. A Turing machine thus describes a function that maps the character strings initially on the tape to the character strings on the tape after they have been "processed" by the machine. Often, however, the function of interest is not Turing computable. Then one can ask the weaker question of whether it is possible to approximate the function in a computable way. For this we need computable sequences of computable upper and lower bounds. This analysis is also necessary for the reliability function, and we have carried it out. Unfortunately, we have to give a negative answer: the reliability function is not a Turing computable performance function as a function of the channel. Furthermore, we consider the R_inf function, the function of the sphere packing bound, the function of the expurgation bound, and the zero-error feedback capacity, all of which are closely related to the reliability function. We consider all of these as functions of the channel.

The structure of the paper is as follows. We start in Section 2 with the basic definitions and known statements that we need. In Section 3 we first consider the R_inf function. We consider the decidability of sets connected with the R_inf function and show that only an approximation from below is possible. This has consequences for the sphere packing bound, and we show that it is not a Turing computable performance function. In Section 4 we then consider the reliability function and show that it, too, is not a Turing computable performance function.
We show the same for the expurgation bound. In Section 5 we consider the zero-error feedback capacity. It is closely related to the R_inf function. First we answer, for the zero-error capacity with feedback, a question that Alon asked in [14] for the case without feedback (we examined that case in [15]). Then we show that the zero-error feedback capacity is not Banach Mazur computable. Furthermore, the zero-error feedback capacity cannot be approximated by computable increasing sequences of computable functions. We characterize the super-additivity of the zero-error feedback capacity and show that the R_inf function is additive. In Section 6 we analyze the behavior of the expurgation-bound rates. In the conclusion we summarize what our results mean for the channel reliability function. In particular, our results show that in general there can be no simple recursive closed form formula for the channel reliability function on a very precise interval.

2. Definitions and Basic Results from Computability Theory and Information Theory

In this section we give the basic definitions and results that we need for this work. We start with the most important definitions from computability theory. To define computability, we use the concept of a Turing machine [13]. This is a mathematical model of an abstract machine that processes symbols on a tape according to certain rules. Any given algorithm can be simulated by a Turing machine, which yields a simple but very powerful model of computability. Turing machines are free of restrictions regarding complexity, computing capacity and memory, and the computation is assumed to be executed in an error-free manner. The computational capabilities of Turing machines can be characterized through the use of recursive functions [16]. For simplicity, we directly use the concept of Turing machines as a definition for the latter. We first define when a real number is computable. For this we need the following two definitions.
We use the concepts of recursive functions (see [16]) and computable numbers (see [17]).
Definition 1.
A sequence of rational numbers {r_n}_{n in N} is called a computable sequence if there exist recursive functions a, b, s : N -> N with b(n) != 0 for all n in N and

r_n = (-1)^{s(n)} a(n)/b(n),  n in N.

Definition 2.
We say that a computable sequence {r_n}_{n in N} of rational numbers converges effectively, i.e., computably, to a number x if a recursive function a : N -> N exists such that |x - r_n| < 2^{-N} for all N in N and all n in N with n >= a(N). We can now introduce computable numbers.
Definition 3.
A real number x is said to be computable if there exists a computable sequence of rational numbers {r_n}_{n in N} such that |x - r_n| < 2^{-n} for all n in N. We denote the set of computable real numbers by R_c.

Definition 4.
A set A ⊂ N is called recursive if there exists a computable function f such that f(x) = 1 if x in A and f(x) = 0 if x in A^c.

Definition 5.
A set A ⊂ N is recursively enumerable if there exists a recursive function whose domain is exactly A.

Definition 6.
Let A ⊂ N be recursively enumerable. A function f : A -> N is called partial recursive if there exists a Turing machine TM_f which computes f.

2.2. Basic Concepts of Information Theory

To define the reliability function and its related functions, we first need the definition of a discrete memoryless channel. In the theory of transmission, the receiver must be in a position to successfully decode all the messages transmitted by the sender.

Let X be a finite alphabet. We denote the set of probability distributions on X by P(X). We define the set of computable probability distributions P_c(X) as the set of all probability distributions P in P(X) such that P(x) in R_c for all x in X. Furthermore, for finite alphabets X and Y, let CH(X,Y) be the set of all conditional probability distributions (or channels) P_{Y|X} : X -> P(Y). CH_c(X,Y) denotes the set of all computable conditional probability distributions, i.e., those with P_{Y|X}(.|x) in P_c(Y) for every x in X.

Let M ⊂ CH_c(X,Y). We call M semi-decidable if and only if there is a Turing machine TM_M that either stops or computes forever, depending on whether W in M holds. That means TM_M accepts exactly the elements of M and computes forever for an input W in M^c = CH_c(X,Y) \ M.

Definition 7.
A discrete memoryless channel (DMC) is a triple (X, Y, W), where X is the finite input alphabet, Y is the finite output alphabet, and W(y|x) in CH(X,Y) with x in X, y in Y. The probability for a sequence y^n in Y^n to be received if x^n in X^n was sent is defined by

W^n(y^n | x^n) = prod_{j=1}^{n} W(y_j | x_j).

Definition 8.
A (deterministic) block code C(n) with rate R and block length n consists of

- a message set M = {1, 2, ..., M} with M = 2^{nR} in N,
- an encoding function e : M -> X^n,
- a decoding function d : Y^n -> M.

We call such a code an (R, n)-code.

Definition 9.
Let (X, Y, W) be a DMC, let C(n) denote a code with block length n and message set M, and let m in M. The individual message probability of error is defined by the conditional probability of error given that message m is transmitted:

P_e(C(n), W, m) = Pr{ d(Y^n) != m | X^n = e(m) }.

We define the average probability of error by

P_{e,av}(C(n), W) = (1/|M|) sum_{m in M} P_e(C(n), W, m).

P_{e,av}(W, R, n) denotes the minimum error probability P_{e,av}(C(n), W) over all codes C(n) of block length n and with message set size |M| = 2^{nR}. We define the maximal probability of error by

P_{e,max}(C(n), W) = max_{m in M} P_e(C(n), W, m).

P_{e,max}(W, R, n) denotes the minimum error probability P_{e,max}(C(n), W) over all codes C(n) of block length n and with message set size |M| = 2^{nR}.

The Shannon capacity of a channel W in CH(X,Y) is defined by

C(W) := sup{ R : lim_{n -> inf} P_{e,max}(W, R, n) = 0 }.

The zero-error capacity of a channel W in CH(X,Y) is defined by

C_0(W) := sup{ R : P_{e,max}(W, R, n) = 0 for some n }.

For R with C_0(W) < R < C(W) there exist A(W,R), B(W,R) in R_+ such that

2^{-n(A(W,R)+o(1))} <= P_{e,max}(W, R, n) <= 2^{-n(B(W,R)+o(1))}.

We also define the discrete memoryless channel with noiseless feedback (DMCF). By this we mean that, in addition to the DMC, there exists a return channel which sends back from the receiving point to the transmitting point the element of Y actually received. It is assumed that this information is received at the transmitting point before the next letter is sent, and can therefore be used for choosing the next letter to be sent. We assume that this feedback is noiseless. We denote the feedback capacity of a channel W by C_{FB}(W) and the zero-error feedback capacity by C_{0,FB}(W). Shannon proved in [18] that C(W) = C_{FB}(W). This is not true in general for the zero-error capacity.
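As a point of contrast to the negative results developed below, the Shannon capacity C(W) itself is numerically accessible for a given computable channel. The following sketch is our illustration and not part of the paper; it approximates C(W) for a DMC by the classical Blahut-Arimoto iteration, with an arbitrarily chosen example channel and iteration count.

```python
# Sketch (illustration, not from the paper): Blahut-Arimoto iteration
# approximating the Shannon capacity C(W) = max_P I(P, W) of a DMC
# given as a row-stochastic matrix W[x][y].
import math

def blahut_arimoto(W, iters=2000):
    nx, ny = len(W), len(W[0])
    p = [1.0 / nx] * nx                      # start from the uniform input
    for _ in range(iters):
        # q[y] is the output distribution induced by p
        q = [sum(p[x] * W[x][y] for x in range(nx)) for y in range(ny)]
        # unnormalized update: p[x] * exp(D(W(.|x) || q))
        r = []
        for x in range(nx):
            d = sum(W[x][y] * math.log(W[x][y] / q[y])
                    for y in range(ny) if W[x][y] > 0)
            r.append(p[x] * math.exp(d))
        s = sum(r)
        p = [v / s for v in r]
    # mutual information of the final input distribution, in bits
    q = [sum(p[x] * W[x][y] for x in range(nx)) for y in range(ny)]
    return sum(p[x] * W[x][y] * math.log2(W[x][y] / q[y])
               for x in range(nx) for y in range(ny) if W[x][y] > 0)

# Binary symmetric channel with crossover 0.1: C = 1 - h(0.1) bits.
bsc = [[0.9, 0.1], [0.1, 0.9]]
cap = blahut_arimoto(bsc)
```

The point of the sketch is that C(W), unlike the reliability function studied here, is given by a single-letter optimization that a simple iterative algorithm can approximate to any desired precision.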
We will see that the zero-error (feedback) capacity is related to the reliability function, which we analyze in this paper. It is defined as follows.

Definition 10.
The channel reliability function (error exponent) is defined by

E(W, R) = limsup_{n -> inf} -(1/n) log P_{e,max}(W, R, n).   (1)

Remark 11.
We make use of the common convention that log 0 = -inf.

Remark 12.
We need the lim sup in (1) because it is not known whether the limit, i.e., the limit on the right-hand side of (1), exists.

The first simple observation is that for
R > C(W) we have E(W, R) = 0, and if C_0(W) > 0, then for 0 <= R < C_0(W) we have E(W, R) = +inf. One well-known upper bound is the sphere packing bound, which can be defined as follows (see [11]).

Definition 13.
Let X, Y be finite alphabets and (X, Y, W) be a DMC. Then for all R in (0, C(W)) we define the sphere packing bound function

E_sp(W, R) = sup_{rho > 0} [ max_{P in P(X)} ( -log sum_y ( sum_x P(x) W(y|x)^{1/(1+rho)} )^{1+rho} ) - rho R ].   (2)

Theorem 14 (Fano 1961; Shannon, Gallager, Berlekamp 1967). For any DMC W and for all R in (0, C(W)) it holds that E(W, R) <= E_sp(W, R).

The sphere packing upper bound is a central upper bound. The following two lower bounds on the reliability function are also very important. In [19] the random coding bound was defined as follows.
Definition 15.
Let X, Y be finite alphabets and (X, Y, W) be a DMC. Then for all R in (0, C(W)) we define the random coding bound function

E_r(W, R) = max_{0 <= rho <= 1} [ E_0(rho) - rho R ],   (3)

where

E_0(rho) = max_{P in P(X)} -log sum_y ( sum_x P(x) W(y|x)^{1/(1+rho)} )^{1+rho}.   (4)

Theorem 16.
Let X, Y be finite alphabets and (X, Y, W) be a DMC. Then

E(W, R) >= E_r(W, R).

Gallager also defined in [19] the k-letter expurgation bound as follows.

Definition 17.
Let X, Y be finite alphabets and (X, Y, W) be a DMC. Then for all R in (0, C(W)) we define the k-letter expurgation bound function

E_ex(W, R, k) = sup_{rho >= 1} [ E_x(rho, k) - rho R ],   (5)

E_x(rho, k) = -(rho/k) log min_{P_{X^k} in P(X^k)} Q_k(rho, P_{X^k}),   (6)

Q_k(rho, P_{X^k}) = sum_{x^k, x'^k} P_{X^k}(x^k) P_{X^k}(x'^k) g_k(x^k, x'^k)^{1/rho},   (7)

g_k(x^k, x'^k) = sum_{y^k} sqrt( W^k(y^k | x^k) W^k(y^k | x'^k) ).   (8)

Theorem 18. Let X, Y be finite alphabets and (X, Y, W) be a DMC. Then for all R in (0, C(W)) we have

E(W, R) >= lim_{k -> inf} E_ex(W, R, k).   (9)

The inequality in (9) follows from Fekete's lemma. The following theorem is also well known.

Theorem 19.
If the capacity C(W) of a channel is positive, then for sufficiently small values of R we have E_ex(W, R) > E_r(W, R).

E_ex(W, R, k) is the expurgation bound for k-letter channel use. R_k^{ex}(W) is the infimum of all rates R such that the function E_ex(W, ., k) is finite on the open interval (R, C(W)). Therefore, E_ex(W, R, k) is defined on the open interval (R_k^{ex}(W), C(W)). It holds [8] that

R_k^{ex}(W) <= R_{k+1}^{ex}(W), W in CH(X,Y), and lim_{k -> inf} R_k^{ex}(W) = C_0(W).

The smallest value of R at which the convex curve E_sp(W, R) meets its supporting line of slope -1 is called the critical rate and is denoted by R_crit [8]. On the interval [R_crit, C], the random coding lower bound coincides with the sphere packing upper bound; the channel reliability function is therefore known on this interval. The channel reliability function is generally not known on the interval [0, R_crit]. For the interval [0, R_crit] there are also better lower bounds than the random coding lower bound.

R_inf(W) is the infimum of all rates R such that the function E_sp(W, .) is finite on the open interval (R, C(W)). If C(W) > 0, then C_0(W) <= R_inf(W). The following representation of R_inf exists:

R_inf(W) = min_{Q in P(Y)} max_x log ( 1 / sum_{y: W(y|x) > 0} Q(y) ).   (10)

There are alphabets X, Y and channels W in CH(X,Y) with C_0(W) = 0 and R_inf(W) > 0. Furthermore, for the zero-error feedback capacity C_{0,FB} we have C_{0,FB}(W) = R_inf(W) if C_0(W) > 0. If C_0(W) = 0, then there is a channel W with C_{0,FB}(W) = 0 and R_inf(W) > 0. For the zero-error feedback capacity the following is shown.

Theorem 20 (Shannon 1956, [18]). Let W in CH(X,Y). Then

C_{0,FB}(W) = 0 if C_0(W) = 0, and otherwise

C_{0,FB}(W) = max_{P in P(X)} min_y log ( 1 / sum_{x: W(y|x) > 0} P(x) ).   (11)

As mentioned before, Shannon, Gallager and Berlekamp conjectured in [12] that the expurgation bound is tight.
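For a fixed computable channel, the Gallager-type expressions behind Definitions 13 and 15 are finite optimizations and can be evaluated numerically. The following sketch is our illustration, not part of the paper: a coarse grid search (grid sizes are arbitrary choices and only approximate the true maxima) evaluates E_0 and the random coding bound E_r(W, R) for a binary symmetric channel, using base-2 logarithms.

```python
# Sketch (illustration, not from the paper): grid evaluation of the Gallager
# function E_0(rho) and the random coding bound
# E_r(W, R) = max_{0 <= rho <= 1} (E_0(rho) - rho*R) for a BSC, in bits.
import math

def E0(rho, W, grid=200):
    """Max over input distributions P (on a grid) of the Gallager function."""
    best = -float("inf")
    for i in range(grid + 1):
        P = [i / grid, 1 - i / grid]
        s = 0.0
        for y in range(len(W[0])):
            inner = sum(P[x] * W[x][y] ** (1 / (1 + rho)) for x in range(len(W)))
            s += inner ** (1 + rho)
        best = max(best, -math.log2(s))
    return best

def Er(R, W, rho_grid=100):
    """Random coding exponent: max over rho in [0, 1] of E_0(rho) - rho*R."""
    return max(E0(r / rho_grid, W) - (r / rho_grid) * R
               for r in range(rho_grid + 1))

bsc = [[0.9, 0.1], [0.1, 0.9]]
e = Er(0.1, bsc)   # positive for rates below capacity
```

Note that E_0(0) = 0 for every input distribution, so E_r vanishes for rates at or above capacity, while for R below capacity the maximizing rho is strictly positive and the exponent is strictly positive.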
Katsman, Tsfasman and Vladut gave in [20] a counterexample for the symmetric q-ary channel for sufficiently large q. Dalai and Polyanskiy found in [21] a simpler counterexample; they showed that the conjecture already fails for the q-ary typewriter channel for sufficiently large q. We would like to briefly present their results here.

Definition 21.
Let X = Y = Z_q and 0 <= epsilon <= 1. The typewriter channel W_epsilon is defined by

W_epsilon(y|x) = 1 - epsilon if y = x, and W_epsilon(y|x) = epsilon if y = x + 1 mod q.   (12)

The n-fold extension W^n_epsilon of the channel is defined by

W^n_epsilon(y^n | x^n) = prod_{k=1}^{n} W_epsilon(y_k | x_k).   (13)

For the reliability function of this channel, the interval (C_0(W_epsilon), C(W_epsilon)) is of interest. The capacity of a typewriter channel W_epsilon is given by the formula C(W_epsilon) = log(q) - h(epsilon), where h is the binary entropy function. Shannon showed in [18] that C_0(W_epsilon) is positive if q >= 4. He showed that for even q it holds that C_0(W_epsilon) = log(q/2). It is a hard problem to obtain a formula for odd q. Lovász proved in [22] that Shannon's lower bound for q = 5, namely C_0(W_epsilon) = log sqrt(5), is tight. For general odd q, Lovász proved

C_0(W_epsilon) <= log ( q cos(pi/q) / (1 + cos(pi/q)) ).

Only for q = 5 is it known that this bound is tight; in general it is not. For special values of q there are further results in [22, 23, 24].

Dalai and Polyanskiy give upper and lower bounds on the reliability function in [21]. They observed that the zero-error capacity of the pentagon can be determined by a careful study of the expurgated bound. They present an improved lower bound for the cases of even and odd q, showing that it is also a precisely shifted version of the expurgated bound for the BSC. Their result also provides a new elementary disproof of the conjecture suggested in [12] that the expurgated bound is asymptotically tight when computed on arbitrarily large blocks. Furthermore, Dalai and Polyanskiy present in [21] a new upper bound for the case of odd q based on the minimum distance of codes, using Delsarte's linear programming method [25] (see also [26]) and combining the construction used by Lovász [22] for bounding the graph capacity with the construction used by McEliece-Rodemich-Rumsey-Welch [27] for bounding the minimum distance of codes in Hamming spaces.
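The closed form expressions quoted above are easy to evaluate numerically. The following sketch is our illustration, not part of the paper; it computes, in bits, the capacity C(W_epsilon) = log q - h(epsilon), the even-q zero-error capacity log(q/2), and the Lovász bound for odd q.

```python
# Sketch (illustration of the formulas quoted above, in bits).
import math

def h(eps):
    """Binary entropy function."""
    if eps in (0.0, 1.0):
        return 0.0
    return -(eps * math.log2(eps) + (1 - eps) * math.log2(1 - eps))

def capacity(q, eps):
    # C(W_eps) = log q - h(eps)
    return math.log2(q) - h(eps)

def zero_error_even(q):
    # Shannon: for even q, C_0(W_eps) = log(q/2)
    return math.log2(q / 2)

def lovasz_bound(q):
    # Lovász: for odd q, C_0(W_eps) <= log( q*cos(pi/q) / (1 + cos(pi/q)) )
    c = math.cos(math.pi / q)
    return math.log2(q * c / (1 + c))
```

For q = 5 the Lovász bound evaluates to log sqrt(5), confirming numerically that it coincides with Shannon's lower bound for the pentagon.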
In the special case epsilon = 1/2, they give another improved upper bound for the case of odd q, following the ideas of Litsyn [28] and Barg-McGregor [29], which in turn are based on estimates for the spectra of codes originated by Kalai-Linial [30].

We need further basic concepts for computability. We want to investigate the function E(W, R) and upper bounds like E_sp(W, R) and E_ex(W, R, k) for k in N as functions of W and R. These functions are generally well defined, for a fixed channel W, only on sub-intervals of [0, C(W)] as functions of R. For example, for W in CH(X,Y) with C_0(W) > 0, E(W, R) is infinite for R < C_0(W). Hence, E(W, R) must be examined and computed as a function of R on the interval (C_0(W), C(W)]. Similar statements also apply to the other functions that have already been introduced. We now fix non-trivial alphabets X, Y, the corresponding set CH_c(X,Y) of computable channels, and R in R_c.

Definition 22 (Turing computable channel function). We call a function f : CH_c(X,Y) -> R_c a Turing computable channel function if there is a Turing machine that converts any program for the representation of W in CH_c(X,Y), W arbitrary, into a program for the computation of f(W), that is, f(W) = TM_f(W), W in CH_c(X,Y).

We want to determine whether there is a closed form for the channel reliability function. For this we need the following definition, which we discuss in more detail in the remark below.
Definition 23 (Turing computable performance function). Let ⊥ be a symbol. We call a function F : CH_c(X,Y) x R_c^+ -> R_c ∪ {⊥} a Turing computable performance function if there are two Turing computable channel functions f_1 and f_2 with f_1(W) <= f_2(W) for W in CH_c(X,Y) and a Turing machine TM_F which is defined for inputs R in R_c^+ and W in CH_c(X,Y). The Turing machine stops if and only if R in (f_1(W), f_2(W)), and in this case TM_F delivers F(W, R) = TM_F(W, R). If R is not in (f_1(W), f_2(W)), then TM_F does not stop.

Remark 24.
The requirement for a function F : CH_c(X,Y) x R_c^+ -> R_c ∪ {⊥} to be a Turing computable performance function is relatively weak. For example, take W and R as inputs. Then the interval (f_1(W), f_2(W)) is computed first. If R lies in the interval (f_1(W), f_2(W)), then it is required that the Turing machine TM_F stops for the input (W, R) and delivers the result F(W, R). Nothing is required about the behavior of the Turing machine for inputs W and R not in (f_1(W), f_2(W)). In particular, the Turing machine TM_F does not have to stop for the input (W, R) in this case.

Take, for example, any Turing computable function G : CH_c(X,Y) x R_c^+ -> R_c ∪ {⊥} with the corresponding Turing machine TM_G. Furthermore, let TM_1 : CH_c(X,Y) -> R_c and TM_2 : CH_c(X,Y) -> R_c be any two Turing machines such that TM_1(W) <= TM_2(W) holds for all W in CH_c(X,Y). Then the following Turing machine TM : CH_c(X,Y) x R_c -> R_c ∪ {⊥} defines a Turing computable performance function. For any input W in CH_c(X,Y) and R in R_c, first compute f_1(W) = TM_1(W) and f_2(W) = TM_2(W). Then run the following two tests in parallel:

(a) use the Turing machine TM_{>f_1(W)} to test whether R > f_1(W) for the input R in R_c;

(b) use the Turing machine TM_{<f_2(W)} to test whether R < f_2(W) for the input R in R_c.

TM stops for the input (W, R) if and only if R in (f_1(W), f_2(W)), and in this case it gives the value G(W, R) as output. This follows from the fact that the Turing machine TM_{>f_1(W)} stops for input R in R_c if and only if R > f_1(W), and the Turing machine TM_{<f_2(W)} stops for input R in R_c if and only if R < f_2(W).

Remark 25. With the above approach we can try, for example, to find upper and lower bounds for the channel reliability function by allowing general Turing computable functions G : CH_c(X,Y) x R_c^+ -> R_c ∪ {⊥} and determining algorithmically the interval in R_c^+ on which the function G(W, .) delivers lower or upper bounds for the channel reliability function.

Definition 26 (Banach Mazur computable channel function). We call f : CH_c(X,Y) -> R_c a Banach Mazur computable channel function if every computable sequence {W_r}_{r in N} from CH_c(X,Y) is mapped by f into a computable sequence from R_c.

For practical applications, it is necessary to have performance functions which satisfy Turing computability. Depending on W, the channel reliability function or the bounds on this function should be computed. This computation is carried out by an algorithm that also receives W as input. This means that the algorithm should also depend recursively on W, because otherwise a special algorithm would have to be developed for each W (depending on W but not recursively so) in order to compute the channel reliability function for this channel, or a bound on this function.

It is now clear that when defining the Turing computable performance function, the Turing computable channel functions f_1, f_2 cannot be dispensed with, because the channel reliability function depends on the specific channel and on the admissible rate region for which the function can be computed. For f_2, one often has the representation f_2(W) = C(W) with W in CH_c(X,Y).
For f_1, the choice f_1(W) = C_0(W) with W in CH_c(X,Y) is a natural choice for the channel reliability function, because the channel reliability function is only useful on this interval. (We note that we showed in [15] that C_0(W) is not Turing computable in general.) For the Turing computability of the channel reliability function, or of corresponding upper and lower bounds, it is therefore a necessary condition that the dependency of the relevant rate intervals on W is Turing computable, that is, recursive.

3. Results for the rate function R_inf and applications to the sphere packing bound

In this section we consider the R_inf function and the consequences for the sphere packing bound. We show that the latter is not a Turing computable performance function. We have already seen that for R_inf we have the representation

R_inf(W) = min_{Q in P(Y)} max_{x in X} log ( 1 / sum_{y: W(y|x) > 0} Q(y) ).   (14)

Therefore we have

R_inf(W) = min_{Q in P(Y)} max_{x in X} log ( 1 / sum_{y: W(y|x) > 0} Q(y) )
         = min_{Q in P(Y)} log ( 1 / min_{x in X} sum_{y: W(y|x) > 0} Q(y) )
         = log ( 1 / max_{Q in P(Y)} min_{x in X} sum_{y: W(y|x) > 0} Q(y) )
         = log ( 1 / Psi_inf(W) ),

where

Psi_inf(W) = max_{Q in P(Y)} min_{x in X} sum_{y: W(y|x) > 0} Q(y).

In summary, the following holds true: let X, Y be arbitrary non-trivial finite alphabets; then for W in CH_c(X,Y)

R_inf(W) = log ( 1 / Psi_inf(W) ).   (15)

Lemma 27. It holds that R_inf : CH_c(X,Y) -> R_c.

Proof. Let W be fixed. We consider the vectors (Q(1), ..., Q(|Y|))^T of the convex set

M_Prob = { u in R^{|Y|} : u = (u_1, ..., u_{|Y|})^T, u_l >= 0 for l = 1, ..., |Y|, sum_l u_l = 1 }.

G(u) := min_x sum_{y: W(y|x) > 0} u_y is a computable continuous function on M_Prob. Thus for Psi_inf(W) = max_{u in M_Prob} G(u) we always have Psi_inf(W) in R_c with Psi_inf(W) > 0, and thus R_inf(W) in R_c. □

Remark 28. We do not know whether C_0 : CH_c(X,Y) -> R_c holds for all finite X, Y. This statement holds for max{|X|, |Y|} <= 4, but the general case is open.
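For a concrete computable channel, Psi_inf(W), and hence R_inf(W) via (15), can be approximated numerically: the max-min defining Psi_inf is a finite linear program over P(Y). The following sketch is our illustration, not part of the paper; a coarse grid search over P(Y) stands in for an exact LP solver, the two-output restriction and base-2 logarithm are arbitrary simplifications.

```python
# Sketch (illustration, not from the paper): approximate
# Psi_inf(W) = max_{Q in P(Y)} min_x sum_{y: W(y|x)>0} Q(y)
# and R_inf(W) = log2(1 / Psi_inf(W)) for a channel with two outputs.
import math

def psi_inf_2outputs(W, grid=100):
    # support set of W(.|x) for each input letter x
    supports = [[y for y in range(2) if W[x][y] > 0] for x in range(len(W))]
    best = 0.0
    for i in range(grid + 1):          # grid over the simplex P(Y), |Y| = 2
        Q = [i / grid, 1 - i / grid]
        val = min(sum(Q[y] for y in supp) for supp in supports)
        best = max(best, val)
    return best

# Noiseless binary channel: the supports are disjoint singletons, so the
# optimal Q is uniform, Psi_inf = 1/2 and R_inf = 1 bit.
W = [[1.0, 0.0], [0.0, 1.0]]
psi = psi_inf_2outputs(W)
r_inf = math.log2(1 / psi)
```

The sketch illustrates the point of Lemma 27: for a fixed computable W the value Psi_inf(W) is the maximum of a computable continuous function over a simplex, hence a computable real; the negative results concern the dependence on W, not the evaluation for a single fixed channel.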
For finite alphabets X, Y and lambda in R_c with lambda > 0 we want to analyze the set

{ W in CH_c(X,Y) : R_inf(W) > lambda }.

To do so, we refer to the proof of Theorem 23 in [15]. Along the same lines, one can show that the following holds true:

Theorem 29. Let X, Y be non-trivial finite alphabets. For all lambda in R_c with 0 < lambda < log(min{|X|, |Y|}), the set { W in CH_c(X,Y) : R_inf(W) > lambda } is not semi-decidable.

The following theorem can be derived from a combination of the proof of Theorem 29 and Theorem 24 in [15]. The proof is carried out in the same way as the proof of Theorem 24 in [15].

Theorem 30. Let X, Y be non-trivial finite alphabets. The function R_inf : CH_c(X,Y) -> R is not Banach Mazur computable.

We now show a stronger result than what we could show for C_0 in [15] so far. We show that the analogue of Noga Alon's question for C_0 can be answered positively for the R_inf function.

We need a concept of distance for W_1, W_2 in CH(X,Y). Therefore, for fixed finite alphabets X, Y we define the distance between W_1 and W_2 based on the total variation distance:

d_C(W_1, W_2) = max_{x in X} sum_{y in Y} |W_1(y|x) - W_2(y|x)|.   (16)

Definition 31. A function f : CH(X,Y) -> R is called computable continuous if:

1. f is sequentially computable, i.e., f maps every computable sequence {W_n}_{n in N} with W_n in CH_c(X,Y) into a computable sequence {f(W_n)}_{n in N} of computable numbers,

2. f is effectively uniformly continuous, i.e., there is a recursive function d : N -> N such that for all W_1, W_2 in CH_c(X,Y) and all N in N with d_C(W_1, W_2) <= 1/d(N) it holds that |f(W_1) - f(W_2)| <= 2^{-N}.

Theorem 32. Let X, Y be finite alphabets with |X| >= 2 and |Y| >= 2. There exists a computable sequence of computable continuous functions {F_N}_{N in N} on CH_c(X,Y) with

1. F_N(W) >= F_{N+1}(W) for W in CH(X,Y) and N in N,

2. lim_{N -> inf} F_N(W) = R_inf(W) for all W in CH(X,Y).

Proof.
We consider the function

Phi_N(W) = max_{Q in P(Y)} min_{x in X} sum_{y in Y} [ N W(y|x) / (1 + N W(y|x)) ] Q(y)

for N in N. For all x in X and all Q in P(Y) we have

sum_{y in Y} [ N W(y|x) / (1 + N W(y|x)) ] Q(y) <= sum_{y in Y: W(y|x) > 0} Q(y),   (17)

and for all N in N, all x in X and all Q in P(Y) we have

sum_{y in Y} [ N W(y|x) / (1 + N W(y|x)) ] Q(y) <= sum_{y in Y} [ (N+1) W(y|x) / (1 + (N+1) W(y|x)) ] Q(y).   (18)

Phi_N is a computable continuous function and {Phi_N}_{N in N} is a computable sequence of computable continuous functions. So let

F_N(W) = log ( 1 / Phi_N(W) )

for N in N and W in CH(X,Y). F_N satisfies all the properties of the theorem, and point 1 is shown. It holds that

| sum_{y in Y: W(y|x) > 0} Q(y) - sum_{y in Y} [ N W(y|x) / (1 + N W(y|x)) ] Q(y) |
= | sum_{y in Y: W(y|x) > 0} [ 1 / (1 + N W(y|x)) ] Q(y) |
<= 1 / ( 1 + N min_{y in Y: W(y|x) > 0} W(y|x) ).

Therefore, we have

sum_{y in Y: W(y|x) > 0} Q(y) <= 1 / ( 1 + N min_{y in Y: W(y|x) > 0} W(y|x) ) + sum_{y in Y} [ N W(y|x) / (1 + N W(y|x)) ] Q(y).   (19)

Phi_N(W) <= Psi_inf(W) for all W in CH_c(X,Y). (19) yields

sum_{y in Y: W(y|x) > 0} Q(y) <= 1 / ( 1 + N min_{x in X} ( min_{y in Y: W(y|x) > 0} W(y|x) ) ) + sum_{y in Y} [ N W(y|x) / (1 + N W(y|x)) ] Q(y).

So

min_{x in X} sum_{y in Y: W(y|x) > 0} Q(y) <= 1 / ( 1 + N min_{x in X} ( min_{y in Y: W(y|x) > 0} W(y|x) ) ) + min_{x in X} sum_{y in Y} [ N W(y|x) / (1 + N W(y|x)) ] Q(y),

and

Psi_inf(W) <= 1 / ( 1 + N min_{x in X} ( min_{y in Y: W(y|x) > 0} W(y|x) ) ) + Phi_N(W)

holds. So we have

0 <= Psi_inf(W) - Phi_N(W) <= 1 / ( 1 + N min_{x in X} ( min_{y in Y: W(y|x) > 0} W(y|x) ) ). □

We now want to prove that Alon's corresponding question can be answered positively for R_inf.

Theorem 33. Let X, Y be finite alphabets with |X| >= 2 and |Y| >= 2.
For all lambda in R_c with 0 < lambda < log(min{|X|, |Y|}), the set { W in CH_c(X,Y) : R_inf(W) < lambda } is semi-decidable.

Proof. We use the computable sequence of computable continuous functions F_N from Theorem 32. It holds that W in { W in CH_c(X,Y) : R_inf(W) < lambda } if and only if there is an N such that F_N(W) < lambda holds. As in the proof of Theorem 28 from [15], we now use the construction of a Turing machine TM_{R_inf,<lambda} which accepts exactly the set { W in CH_c(X,Y) : R_inf(W) < lambda }. □

We now consider the approximability "from below" (this could be seen as a kind of reachability). We have shown that R_inf(.) can always be represented as the limit of a monotonically decreasing computable sequence of computable continuous functions. From this it can be concluded that this sequence is then also a computable sequence of Banach Mazur computable functions. We now have:

Theorem 34. Let X, Y be finite alphabets with |X| >= 2 and |Y| >= 2. There does not exist a sequence of Banach Mazur computable functions {F_N}_{N in N} with

1. F_N(W) <= F_{N+1}(W) for W in CH_c(X,Y) and N in N,

2. lim_{N -> inf} F_N(W) = R_inf(W) for all W in CH(X,Y).

Proof. We assume that such a sequence {F_N}_{N in N} does exist. Then from Theorem 32 and the assumptions of this theorem it can be concluded that R_inf is a Banach Mazur computable function. This creates a contradiction. □

With this we immediately get the following:

Corollary 35. Let X, Y be finite alphabets with |X| >= 2 and |Y| >= 2. If {F_N}_{N in N} with

1. F_N(W) <= F_{N+1}(W) for W in CH_c(X,Y) and N in N,

2. lim_{N -> inf} F_N(W) = R_inf(W) for all W in CH(X,Y)

is a sequence of Banach Mazur computable functions, then there exists W^ in CH_c(X,Y) with lim_{N -> inf} F_N(W^) < R_inf(W^).

We now want to apply the results for R_inf to the sphere packing bound. With the results on the rate function we immediately get:

Theorem 36. Let X, Y be finite alphabets with |X| >= 2 and |Y| >= 2.
The sphere packing bound E_sp(·,·) is not a Turing computable performance function for CH_c(X,Y) × ℝ⁺_c.

Proof. Assume that the statement of the theorem is incorrect; then E_sp is a Turing computable performance function on CH_c(X,Y) × ℝ⁺_c. But then the channel functions f_1(W) = R_∞(W) for W ∈ CH_c(X,Y) and f_2(W) = C(W) for W ∈ CH_c(X,Y) must be Turing computable channel functions. As already shown, however, R_∞ is not Banach-Mazur computable. We have thus created a contradiction. □

4. Computability of the channel reliability function and the sequence of expurgation bound functions

In this section we consider the reliability function and the expurgation bound and show that these functions are not Turing computable performance functions.

With the help of the results from [15] for the zero-error capacity C_0 of noisy channels, we immediately get the following theorem.

Theorem 37. Let X, Y be finite alphabets with |X| ≥ 2 and |Y| ≥ 2. The channel reliability function E(·,·) is not a Turing computable performance function for CH_c(X,Y) × ℝ_c.

Proof. Here f_2(W) = C(W) for W ∈ CH_c(X,Y) is a Turing computable function according to Definition 22, while the rate below which E(W,·) is infinite is f_1(W) = C_0(W). We already know that C_0 is not Banach-Mazur computable on CH_c(X,Y). This gives the proof in the same way as for the sphere packing bound, i.e. as in the proof of Theorem 36. □

Now we consider the rate function for the expurgation bound. The k-letter expurgation bound E_ex(W, R, k) as a function of W and R is a lower bound for the channel reliability function. We have the problem that the expurgation bound function is finite only on certain intervals (R^ex_k(W), C(W)); thus we want to compute the function there. In their famous paper [12] on the channel reliability function, Shannon, Gallager and Berlekamp examined the sequence of functions {E_ex(·,·,k)}_{k∈ℕ} and analyzed its relationship to the channel reliability function.
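It is instructive to see where computability does and does not fail here: for a fixed channel matrix with rational entries, Ψ_∞(W) and every member Φ_N(W) of the approximating sequence above are values of finite linear programs, and are therefore computable to any desired precision. What fails is the uniform computability of the limit R_∞ = log(1/Ψ_∞) as a function of the channel. The following is a minimal numerical sketch only (it assumes NumPy and SciPy are available; the function names are ours, not the paper's):

```python
import numpy as np
from scipy.optimize import linprog


def _maximin_lp(A):
    """Solve max_{Q in simplex} min_x sum_y A[x, y] Q(y) as a linear program.

    Variables are (Q_1, ..., Q_|Y|, t); we maximize t subject to
    t <= A[x] @ Q for every input x, with Q in the probability simplex.
    """
    n_x, n_y = A.shape
    c = np.zeros(n_y + 1)
    c[-1] = -1.0                               # linprog minimizes, so minimize -t
    A_ub = np.hstack([-A, np.ones((n_x, 1))])  # t - A[x] @ Q <= 0 for each x
    b_ub = np.zeros(n_x)
    A_eq = np.hstack([np.ones((1, n_y)), np.zeros((1, 1))])
    b_eq = np.array([1.0])                     # sum_y Q(y) = 1
    bounds = [(0, None)] * n_y + [(None, None)]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return -res.fun


def psi_inf(W):
    """Psi_inf(W) = max_Q min_x sum_{y: W(y|x)>0} Q(y); rows of W index inputs x."""
    return _maximin_lp((W > 0).astype(float))


def phi_N(W, N):
    """Phi_N(W): the indicator 1{W(y|x)>0} is replaced by N W / (1 + N W)."""
    return _maximin_lp(N * W / (1.0 + N * W))
```

For the 3-ary typewriter channel with crossover ε = 1/4, for instance, this yields Ψ_∞ = 2/3 (so R_∞ = log(3/2)); Φ_N increases monotonically in N, and the gap Ψ_∞ − Φ_N stays below the error bound 1/(1 + N min_x min_{y: W(y|x)>0} W(y|x)) derived above.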
Shannon, Gallager and Berlekamp assumed that for all W ∈ CH(X,Y) and all R with E(W,R) < +∞ (so that one has convergence and also E_ex(W,R,k) < +∞), the relation

lim_{k→∞} E_ex(W,R,k) = E(W,R)

holds. This assumption was refuted by Dalai and Polyanskiy in [21].

It was already clear with the introduction of the channel reliability function that it has a complicated behavior. A closed form formula for the channel reliability function is not yet known, and the results of this paper show that such a recursive formula for the channel reliability function cannot exist. Shannon, Gallager and Berlekamp tried in [12] in 1967 to find sequences of seemingly simple formulas for the approximation of the channel reliability function. It seems that they considered the sequence of the k-letter expurgation bounds to be very good channel data for the approximation of the channel reliability function. It was hoped that these sequences could be computed more easily with the use of new powerful digital computers.

Let us now examine the sequence {E_ex(·,·,k)}_{k∈ℕ}. We have already introduced the concept of a computable sequence of computable continuous channel functions. We now introduce the concept of a computable sequence of Turing computable performance functions.

Definition 38. A sequence {F_k}_{k∈ℕ} of Turing computable performance functions is called a computable sequence if there is a Turing machine that, for input k, generates the description of F_k according to the definition of the function F_k, for the values for which the function is defined.

We now want to prove in the following theorem that the sequence of the k-letter expurgation bounds is not a computable sequence of Turing computable performance functions. So the hope mentioned above cannot be fulfilled.

Theorem 39. Let X, Y be finite alphabets with |X| ≥ 2 and |Y| ≥ 2. The sequence of the expurgation lower bounds {E_ex(·,·,k)}_{k∈ℕ} is not a computable sequence of Turing computable performance functions.

Proof.
We prove the theorem indirectly and assume that there is a Turing machine TM* that, for input k, generates the description of the function E_ex(·,·,k). Then {R^ex_k}_{k∈ℕ} is a computable sequence of Turing computable functions, because we have an algorithm to generate this sequence: for input k, TM* generates the description of the function E_ex(·,·,k), and from this we can immediately generate R^ex_k by projection (in the sense of primitive recursive functions). Note that f_k(·) = R^ex_k(·). Now, according to Shannon, Gallager and Berlekamp [12], for all W ∈ CH(X,Y)

lim_{k→∞} R^ex_k(W) = C_0(W),

and for all k ∈ ℕ we have R^ex_k(W) ≤ R^ex_{k+1}(W) for all W ∈ CH(X,Y). Let us consider the set

{W ∈ CH_c(X,Y): C_0(W) > λ}

for λ ∈ ℝ_c with 0 < λ < log(min{|X|, |Y|}). We now construct a Turing machine TM** with only one halting state "stop", which means that it either stops or computes forever. TM** should stop for input W ∈ CH_c(X,Y) if and only if C_0(W) > λ holds, that is, TM** stops if and only if W is in the above set. According to the assumption, {R^ex_k(·)}_{k∈ℕ} is a computable sequence of Turing computable channel functions. For the input W we can generate the computable sequence {R^ex_k(W)}_{k∈ℕ} of computable numbers. We now use the Turing machine TM_λ, which receives an arbitrary computable number x as input and stops if and only if x > λ, i.e. TM_λ also has only one halting state and accepts exactly those computable numbers x for which x > λ holds. We now use this program for the following algorithm:

1. We start with l = 1 and let TM_λ compute one step for input R^ex_1(W). If TM_λ(R^ex_1(W)) stops, then we stop the algorithm.
2. If TM_λ(R^ex_1(W)) does not stop, we set l = l + 1 and compute l + 1 steps of TM_λ(R^ex_r(W)) for 1 ≤ r ≤ l + 1.
If one of these Turing machines stops, then the algorithm stops; if not, we set l = l + 1 and repeat the second step.

The above algorithm stops if and only if there is a k̂ ∈ ℕ such that R^ex_k̂(W) > λ. Because of the monotonicity of the sequence {R^ex_k(W)}_{k∈ℕ}, this is the case if and only if C_0(W) > λ. But with this, the set {W ∈ CH_c(X,Y): C_0(W) > λ} is semi-decidable. In [15] we have shown that this is not the case. We have thus created a contradiction. □

5. Computability of the zero-error capacity of noisy channels with feedback

In this section we consider the zero-error capacity for noisy channels with feedback. In our paper [15] we examined the properties of the zero-error capacity without feedback. Let W ∈ CH(X,Y). We already noted that Shannon showed in [18] that

C_FB(W) = 0, if C_0(W) = 0,
C_FB(W) = max_{P∈P(X)} min_{y∈Y} log (1 / Σ_{x: W(y|x)>0} P(x)), otherwise.  (20)

If we set

Ψ_FB(W) = min_{P∈P(X)} max_{y∈Y} Σ_{x: W(y|x)>0} P(x),  (21)

then we have for W with C_0(W) > 0

C_FB(W) = log (1 / Ψ_FB(W)).

We know that C_FB(W) = R_∞(W) if C_0(W) > 0. If C_0(W) = 0, the two quantities can differ: there is a channel W with C_FB(W) = 0 and R_∞(W) > 0. Like in Lemma 27, we can show the following:

Lemma 40. Let X, Y be finite non-trivial alphabets. It holds that C_FB: CH_c(X,Y) → ℝ_c.

Because of this relationship between C_0 and C_FB, we get the following results for C_FB, which we have already proved for C_0 in [15].

Theorem 41. Let X, Y be finite alphabets with |X| ≥ 2 and |Y| ≥ 2. For all λ ∈ ℝ_c with 0 ≤ λ < log min{|X|, |Y|}, the sets {W ∈ CH_c(X,Y): C_FB(W) > λ} are not semi-decidable.

Theorem 42. Let X, Y be finite alphabets with |X| ≥ 2 and |Y| ≥ 2. Then C_FB: CH_c(X,Y) → ℝ is not Banach-Mazur computable.

Now we will prove the following:

Theorem 43. Let X, Y be finite alphabets with |X| ≥ 2 and |Y| ≥ 2.
There is a computable sequence of computable continuous functions {G_N}_{N∈ℕ} with

1. G_N(W) ≥ G_{N+1}(W) for W ∈ CH(X,Y) and N ∈ ℕ;
2. lim_{N→∞} G_N(W) = C_FB(W) for W ∈ CH(X,Y).

Proof. We use for N ∈ ℕ, y ∈ Y and P ∈ P(X) the function

Σ_{x∈X} [N W(y|x) / (1 + N W(y|x))] P(x).

Then for

Φ_N(W) = min_{P∈P(X)} max_{y∈Y} Σ_{x∈X} [N W(y|x) / (1 + N W(y|x))] P(x)

we have the same properties as in Theorem 32, and

U_N(W) = log (1 / Φ_N(W))

is an upper bound for C_FB which is monotonically decreasing. Now the relation C_FB(W) > 0 holds for W ∈ CH(X,Y) if and only if there are two inputs x_1, x_2 ∈ X such that

Σ_{y∈Y} W(y|x_1) W(y|x_2) = 0

holds. We now set g(x̂, x) = Σ_{y∈Y} W(y|x̂) W(y|x) = g(W, x̂, x) and have 0 ≤ g(x̂, x) ≤ 1 for x, x̂ ∈ X. g is a computable continuous function with respect to W ∈ CH(X,Y). Now we set

V_N(W) = (1 − Π_{x̂≠x} g(W, x̂, x))^N U_N(W)

for N ∈ ℕ. {V_N}_{N∈ℕ} is thus a computable sequence of computable continuous functions. Obviously, V_N(W) ≥ V_{N+1}(W) for W ∈ CH(X,Y) and N ∈ ℕ is satisfied. Furthermore, (1 − Π_{x̂≠x} g(W, x̂, x))^N = 1 if and only if C_FB(W) > 0. So for C_FB(W) = 0 we always have lim_{N→∞} V_N(W) = 0. For W with C_FB(W) > 0,

lim_{N→∞} V_N(W) = lim_{N→∞} U_N(W) = C_FB(W).

This is shown as in the proof of Theorem 32, and we can take G_N = V_N. □

This immediately gives us the following theorem.

Theorem 44. Let X, Y be finite alphabets with |X| ≥ 2 and |Y| ≥ 2. For all λ ∈ ℝ_c with 0 ≤ λ < log min{|X|, |Y|}, the sets {W ∈ CH_c(X,Y): C_FB(W) < λ} are semi-decidable.

Now we want to look at the consequences of the above results for C_FB. The same statements apply here as in Section 3 for R_∞ with regard to the approximation from below: C_FB cannot be approximated by monotonically increasing computable sequences.

There is an elementary relationship between R_∞ and C_FB, which we use in the following. We again assume that X, Y are finite non-trivial alphabets.
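Before turning to that relationship, the monotone approximation from the proof of Theorem 43 can be made concrete for small channels. The sketch below is an illustration only (it assumes NumPy and SciPy; the function names are ours): it evaluates the min-max smoothed function Φ_N by linear programming, forms U_N = log(1/Φ_N) and the damping factor (1 − Π_{x̂≠x} g(W, x̂, x))^N, and returns V_N:

```python
import numpy as np
from itertools import permutations
from scipy.optimize import linprog


def phi_minmax_N(W, N):
    """min_{P in P(X)} max_y sum_x [N W(y|x) / (1 + N W(y|x))] P(x), via an LP."""
    n_x, n_y = W.shape                         # rows index inputs x, columns outputs y
    B = (N * W / (1.0 + N * W)).T              # n_y x n_x, rows indexed by y
    c = np.zeros(n_x + 1)
    c[-1] = 1.0                                # minimize t
    A_ub = np.hstack([B, -np.ones((n_y, 1))])  # B[y] @ P - t <= 0 for each y
    b_ub = np.zeros(n_y)
    A_eq = np.hstack([np.ones((1, n_x)), np.zeros((1, 1))])
    b_eq = np.array([1.0])                     # P is a probability vector
    bounds = [(0, None)] * n_x + [(None, None)]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.fun


def V_N(W, N):
    """V_N(W) = (1 - prod_{x^ != x} g(W, x^, x))^N * U_N(W), with U_N = log2(1/Phi_N)."""
    U = -np.log2(phi_minmax_N(W, N))
    prod = 1.0
    for xh, x in permutations(range(W.shape[0]), 2):
        prod *= float(np.dot(W[xh], W[x]))     # g(W, x^, x) = sum_y W(y|x^) W(y|x)
    return (1.0 - prod) ** N * U
```

For the noiseless binary channel (the 2 × 2 identity) the damping factor is identically 1 and V_N decreases to C_FB = 1, while for a 3-ary typewriter channel every pair of rows overlaps, so the damping factor, and with it V_N, tends to 0 = C_FB.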
We recall the following functions:

R_∞(W) = log (1 / Ψ_∞(W)),  (22)

where Ψ_∞(W) = max_{Q∈P(Y)} min_{x∈X} Σ_{y: W(y|x)>0} Q(y), and

C_FB(W) = 0, if C_0(W) = 0,
C_FB(W) = G(W), if C_0(W) > 0,  (23)

where G(W) = log (1 / Ψ_FB(W)) and

Ψ_FB(W) = min_{P∈P(X)} max_{y∈Y} Σ_{x: W(y|x)>0} P(x).  (24)

Let A(W) be the |Y| × |X| matrix with (A(W))_{kl} ∈ {0, 1} for 1 ≤ k ≤ |Y| and 1 ≤ l ≤ |X| such that (A(W))_{kl} = 1 if and only if W(k|l) > 0. Furthermore, let

M_X = {u ∈ ℝ^{|X|}: u = (u_1, …, u_{|X|})^T, u_l ≥ 0, Σ_{l=1}^{|X|} u_l = 1}  (25)

and

M_Y = {v ∈ ℝ^{|Y|}: v = (v_1, …, v_{|Y|})^T, v_k ≥ 0, Σ_{k=1}^{|Y|} v_k = 1}.  (26)

For v ∈ ℝ^{|Y|} and u ∈ ℝ^{|X|} we consider the function F(v,u) = v^T A(W) u. The function F is concave in v ∈ M_Y and convex in u ∈ M_X. M_Y and M_X are closed, convex and compact sets, and F(v,u) is continuous in both variables. So

max_{v∈M_Y} min_{u∈M_X} F(v,u) = min_{u∈M_X} max_{v∈M_Y} F(v,u).  (27)

Let v ∈ M_Y be fixed. Then

F(v,u) = Σ_{l=1}^{|X|} Σ_{k=1}^{|Y|} v_k A_{kl}(W) u_l  (28)
       = Σ_{l=1}^{|X|} d_l(v) u_l,  (29)

with d_l(v) = Σ_{k=1}^{|Y|} v_k A_{kl}(W). It holds that d_l(v) ≥ 0 for 1 ≤ l ≤ |X|. Hence

min_{u∈M_X} F(v,u) = min_{1≤l≤|X|} d_l(v) = min_{1≤l≤|X|} Σ_{k: A_{kl}(W)>0} v_k = min_{x∈X} Σ_{y: W(y|x)>0} Q_v(y)

with Q_v(y) = v_y for y ∈ {1, …, |Y|}. So

max_{v∈M_Y} min_{u∈M_X} F(v,u) = max_{Q∈P(Y)} min_{x∈X} Σ_{y: W(y|x)>0} Q(y) = Ψ_∞(W).

Furthermore, for u ∈ M_X fixed,

F(v,u) = Σ_{k=1}^{|Y|} Σ_{l=1}^{|X|} u_l A_{kl}(W) v_k = Σ_{k=1}^{|Y|} β_k(u) v_k,

with β_k(u) = Σ_{l=1}^{|X|} u_l A_{kl}(W) ≥ 0 for 1 ≤ k ≤ |Y|. Therefore,

max_{v∈M_Y} F(v,u) = max_{1≤k≤|Y|} β_k(u) = max_{1≤k≤|Y|} Σ_{l: A_{kl}(W)>0} u_l = max_{y∈Y} Σ_{x: W(y|x)>0} p_u(x)

with p_u(x) = u_x for 1 ≤ x ≤ |X|. It follows that

min_{u∈M_X} max_{v∈M_Y} F(v,u) = min_{P∈P(X)} max_{y∈Y} Σ_{x: W(y|x)>0} P(x) = Ψ_FB(W).

Because of (27) we have for all W ∈ CH(X,Y)

Ψ_∞(W) = Ψ_FB(W).

We get the following lemma.

Lemma 45.
Let W ∈ CH(X,Y). Then R_∞(W) = G(W).

We want to investigate the behavior of E(·,R) for the input W_1 ⊗ W_2, where W_1 ⊗ W_2 denotes the Kronecker product of the matrices W_1 and W_2, compared to E(W_1,R) and E(W_2,R). For this purpose, let X_1, Y_1, X_2, Y_2 be arbitrary finite non-trivial alphabets and consider W_l ∈ CH(X_l,Y_l) for l = 1, 2.

Theorem 46. Let X_1, Y_1, X_2, Y_2 be arbitrary finite non-trivial alphabets and W_l ∈ CH(X_l,Y_l) for l = 1, 2. Then we have

R_∞(W_1 ⊗ W_2) = R_∞(W_1) + R_∞(W_2).

Proof. We use the Ψ_∞ function. For Q = Q_1 · Q_2 with Q_1 ∈ P(Y_1) and Q_2 ∈ P(Y_2) it holds that

min_{x_1∈X_1, x_2∈X_2} Σ_{y_1: W_1(y_1|x_1)>0} Σ_{y_2: W_2(y_2|x_2)>0} Q_1(y_1) Q_2(y_2)
 = (min_{x_1∈X_1} Σ_{y_1: W_1(y_1|x_1)>0} Q_1(y_1)) (min_{x_2∈X_2} Σ_{y_2: W_2(y_2|x_2)>0} Q_2(y_2)).

This applies to arbitrary Q_1 ∈ P(Y_1) and Q_2 ∈ P(Y_2). So

Ψ_∞(W_1 ⊗ W_2) ≥ Ψ_∞(W_1) · Ψ_∞(W_2).

Also, using Ψ_∞ = Ψ_FB and restricting the minimum to product distributions P = P_1 · P_2, we have

Ψ_∞(W_1 ⊗ W_2) = min_{P∈P(X_1×X_2)} max_{(y_1,y_2)∈Y_1×Y_2} Σ_{x_1: W_1(y_1|x_1)>0} Σ_{x_2: W_2(y_2|x_2)>0} P(x_1,x_2) ≤ Ψ_∞(W_1) · Ψ_∞(W_2)

as well. So we have Ψ_∞(W_1 ⊗ W_2) = Ψ_∞(W_1) · Ψ_∞(W_2) and the theorem is proven. □

We want to investigate the behavior of C_FB for the input W_1 ⊗ W_2 compared to C_FB(W_1) and C_FB(W_2). For this purpose, let X_1, Y_1, X_2, Y_2 be arbitrary finite non-trivial alphabets and consider W_l ∈ CH(X_l,Y_l) for l = 1, 2.

Theorem 47. Let X_1, Y_1, X_2, Y_2 be arbitrary finite non-trivial alphabets and W_l ∈ CH(X_l,Y_l) for l = 1, 2. Then we have

1. C_FB(W_1 ⊗ W_2) ≥ C_FB(W_1) + C_FB(W_2).  (30)
2. C_FB(W_1 ⊗ W_2) > C_FB(W_1) + C_FB(W_2)  (31)

if and only if

min_{1≤l≤2} C_FB(W_l) = 0 and max_{1≤l≤2} C_FB(W_l) > 0 and min_{1≤l≤2} R_∞(W_l) > 0.  (32)

Remark 48. Condition (32) is equivalent to

min_{1≤l≤2} C_0(W_l) = 0 and max_{1≤l≤2} C_0(W_l) > 0 and min_{1≤l≤2} R_∞(W_l) > 0.  (33)

Proof. (30) follows directly from the operational definition of C_FB. Let (32) now be fulfilled. Then C_FB(W_1 ⊗ W_2) > 0 must be fulfilled.
Without loss of generality we assume C_FB(W_1) = 0, C_FB(W_2) > 0 and R_∞(W_1) > 0, R_∞(W_2) > 0. Since C_FB(W_1 ⊗ W_2) > 0,

C_FB(W_1 ⊗ W_2) = R_∞(W_1 ⊗ W_2)
 = R_∞(W_1) + R_∞(W_2)
 = R_∞(W_1) + C_FB(W_2)
 > C_FB(W_2)
 = C_FB(W_1) + C_FB(W_2).

If (31) is fulfilled, then C_FB(W_1 ⊗ W_2) > 0. Then max_{1≤l≤2} C_FB(W_l) > 0 must hold, because if max_{1≤l≤2} C_FB(W_l) = 0, then max_{1≤l≤2} C_0(W_l) = 0 and thus also C_0(W_1 ⊗ W_2) = 0 (since the zero-error capacity C_0 has no super-activation). This means that C_FB(W_1 ⊗ W_2) = 0, which would be a contradiction.

If min_{1≤l≤2} C_FB(W_l) > 0, then

C_FB(W_1 ⊗ W_2) = R_∞(W_1 ⊗ W_2) = R_∞(W_1) + R_∞(W_2) = C_FB(W_1) + C_FB(W_2).

This is a contradiction, and thus min_{1≤l≤2} C_FB(W_l) = 0.

Furthermore, min_{1≤l≤2} R_∞(W_l) > 0 must apply, because if min_{1≤l≤2} R_∞(W_l) = 0, then without loss of generality R_∞(W_1) = 0. Then

C_FB(W_1 ⊗ W_2) = R_∞(W_1 ⊗ W_2) = R_∞(W_1) + R_∞(W_2) = 0 + R_∞(W_2) = 0 + C_FB(W_2) = C_FB(W_1) + C_FB(W_2),

because C_FB(W_1) = 0 when R_∞(W_1) = 0. This is again a contradiction. With that we have proven the theorem. □

We still want to show for which alphabet sizes the behavior according to Theorem 47 can occur.

Theorem 49.

1. If |X_1| = |X_2| = |Y_1| = |Y_2| = 2, then for all W_l ∈ CH(X_l,Y_l) with l = 1, 2 we have

C_FB(W_1 ⊗ W_2) = C_FB(W_1) + C_FB(W_2).  (34)

2. If X_1, X_2, Y_1, Y_2 are non-trivial alphabets with

max{min{|X_1|, |Y_1|}, min{|X_2|, |Y_2|}} ≥ 3,

then there exist Ŵ_l ∈ CH(X_l,Y_l) with l = 1, 2 such that

C_FB(Ŵ_1 ⊗ Ŵ_2) > C_FB(Ŵ_1) + C_FB(Ŵ_2).  (35)

Proof. 1. If C_0(W_1) = C_0(W_2) = 0, then (34) holds because C_0(W_1 ⊗ W_2) = 0. If max{C_0(W_1), C_0(W_2)} > 0, then without loss of generality C_0(W_2) > 0; for binary alphabets this forces W_2 to be the identity matrix or the antidiagonal permutation matrix, and therefore C(W_2) = 1. This means that C_FB(W_2) = 1. Furthermore, if R_∞(W_1) > 0, then likewise W_1 is the identity matrix or the antidiagonal permutation matrix, so (34) is fulfilled. If R_∞(W_1) = 0, then C_FB(W_1) = 0 and (34) is also fulfilled due to Theorem 46.
2. We now prove (35) under the assumption |X_1| = |Y_1| = 2 and |X_2| = |Y_2| = 3. If we have found channels Ŵ_1, Ŵ_2 for this case such that (35) holds, then it is also clear how the general case 2 can be proved. We take for Ŵ_1 the binary identity channel, which means C_0(Ŵ_1) = C_FB(Ŵ_1) = R_∞(Ŵ_1) = 1. For Ŵ_2 we take the 3-ary typewriter channel Ŵ_2(ε) with X_2 = Y_2 = {0, 1, 2} (see [21]):

Ŵ_2(ε)(y|x) = 1 − ε, if y = x,
Ŵ_2(ε)(y|x) = ε, if y = x + 1 mod 3,
Ŵ_2(ε)(y|x) = 0, otherwise.

Let ε ∈ (0, 1/2) be arbitrary; then C(Ŵ_2(ε)) = log(3) − H(ε). It holds that R_∞(Ŵ_2(ε)) = log(3/2) and C_0(Ŵ_2(ε)) = 0. This means that C_FB(Ŵ_2(ε)) = 0. Thus, because C_0(Ŵ_1 ⊗ Ŵ_2(ε)) ≥ C_0(Ŵ_1) = 1,

C_FB(Ŵ_1 ⊗ Ŵ_2(ε)) = R_∞(Ŵ_1 ⊗ Ŵ_2(ε)) = R_∞(Ŵ_1) + R_∞(Ŵ_2(ε)) = 1 + log(3/2) > 1 = C_FB(Ŵ_1) + C_FB(Ŵ_2(ε)),

and we have proven case 2. □

6. Behavior of the expurgation-bound rates

In this section we consider the behavior of the expurgation-bound rate. R^ex_k occurs in the k-letter expurgation bound, which is a lower bound for the channel reliability function; here k is the parameter of the k-letter description. Let X_1, Y_1, X_2, Y_2 be arbitrary finite non-trivial alphabets and W_l ∈ CH(X_l,Y_l) for l = 1, 2. We want to examine R^ex_k.

Theorem 50. There are non-trivial alphabets X_1, Y_1, X_2, Y_2 and channels W_l ∈ CH(X_l,Y_l) for l = 1, 2 such that for all k̂ there exists a k ≥ k̂ with

R^ex_k(W_1 ⊗ W_2) ≠ R^ex_k(W_1) + R^ex_k(W_2).

Proof. Assume that the statement is false. Then for all X_1, Y_1, X_2, Y_2 and all W_l ∈ CH(X_l,Y_l) with l = 1, 2 there is a k̂ such that for all k ≥ k̂

R^ex_k(W_1 ⊗ W_2) = R^ex_k(W_1) + R^ex_k(W_2).

We now take X'_1, Y'_1, X'_2, Y'_2 such that C_0 is super-additive for these alphabets. Then we have for certain W'_1, W'_2 with W'_l ∈ CH(X'_l, Y'_l)

C_0(W'_1 ⊗ W'_2) > C_0(W'_1) + C_0(W'_2).  (36)

Then

C_0(W'_1 ⊗ W'_2) = lim_{k→∞} R^ex_k(W'_1 ⊗ W'_2) = lim_{k→∞} (R^ex_k(W'_1) + R^ex_k(W'_2)) = C_0(W'_1) + C_0(W'_2).

This is a contradiction, and thus the theorem is proven. □

We improve the statement of Theorem 50 with the following theorem.

Theorem 51.
There are non-trivial alphabets X_1, Y_1, X_2, Y_2 and channels W_l ∈ CH(X_l,Y_l) for l = 1, 2 such that there is a k̂ with

R^ex_k(W_1 ⊗ W_2) > R^ex_k(W_1) + R^ex_k(W_2) for all k ≥ k̂.

Proof. Assume that the statement of the theorem is false. That means that for all channels W_l ∈ CH(X_l,Y_l) with l = 1, 2 the following applies: there exists a sequence {k_j}_{j∈ℕ} ⊂ ℕ with lim_{j→∞} k_j = +∞ such that

R^ex_{k_j}(W_1 ⊗ W_2) ≤ R^ex_{k_j}(W_1) + R^ex_{k_j}(W_2) for all j ∈ ℕ.

We now take X̂_1, Ŷ_1, X̂_2, Ŷ_2 such that C_0 is super-additive for these alphabets. Then we have for certain Ŵ_1, Ŵ_2 with Ŵ_l ∈ CH(X̂_l, Ŷ_l) for l = 1, 2

C_0(Ŵ_1 ⊗ Ŵ_2) > C_0(Ŵ_1) + C_0(Ŵ_2).  (37)

Then

C_0(Ŵ_1 ⊗ Ŵ_2) = lim_{j→∞} R^ex_{k_j}(Ŵ_1 ⊗ Ŵ_2) ≤ lim_{j→∞} (R^ex_{k_j}(Ŵ_1) + R^ex_{k_j}(Ŵ_2)) = C_0(Ŵ_1) + C_0(Ŵ_2).

This is a contradiction to (37), and thus the theorem is proven. □

We have already seen that for certain rate ranges [R, R̂] the function E(W,·) has completely different behavior. We have already examined the influence of W_1 ⊗ W_2 for the intervals (R_∞(W_1 ⊗ W_2), C(W_1 ⊗ W_2)) and (R^ex_k(W_1 ⊗ W_2), C(W_1 ⊗ W_2)) for k ∈ ℕ. For the first interval we have

(R_∞(W_1 ⊗ W_2), C(W_1 ⊗ W_2)) = (R_∞(W_1) + R_∞(W_2), C(W_1) + C(W_2)).

For the sequence of the second intervals we have seen that such behavior cannot apply: for certain alphabets and channels W'_1, W'_2 as in the proof of Theorem 51,

R^ex_k(W'_1 ⊗ W'_2) > R^ex_k(W'_1) + R^ex_k(W'_2)

must hold for k ≥ k̂.

It is also interesting to understand on which interval [0, R̂) the function E(W,·) must be infinite. Such a non-trivial interval exists if and only if C_0(W) > 0, and then this interval is given by [0, C_0(W)). We now have that, for certain alphabets and certain channels W'_1, W'_2, this interval for E(W'_1 ⊗ W'_2, ·) can be strictly larger than [0, C_0(W'_1) + C_0(W'_2)), because C_0 is in general super-additive.

7.
Conclusions

It was already clear with the introduction of the channel reliability function that it has a complicated behavior. A closed form formula for the channel reliability function in the sense of [2] and [3] is not yet known. We analyzed the computability of the reliability function and its related functions. We showed that the reliability function is not a Turing computable performance function. The same also applies to the functions of the sphere packing bound and the expurgation bound.

It is interesting to note that we have imposed a much weaker requirement on a Turing computable performance function than the requirement that is usually imposed for the Turing computability of a function. We do not require that the Turing machine stops in the computation of the performance function for all inputs (W,R) ∈ CH_c(X,Y) × ℝ⁺_c. This means that we also allow the corresponding Turing machine to compute forever for certain inputs, i.e. it never stops for those inputs. Consequently, we can allow functions as performance functions that are not defined for all (W,R) ∈ CH_c(X,Y) × ℝ⁺_c. However, we require that the Turing machine computing a performance function F stops for an input (W,R) ∈ CH_c(X,Y) × ℝ⁺_c for which F is defined, and that it then computes the computable number F(W,R) as output. This means that an algorithm is generated at the output that represents the number F(W,R) according to Definition 23.

Furthermore, we considered the R_∞ function and the zero-error feedback capacity; both of them play an important role for the reliability function. Both the R_∞ function and the zero-error feedback capacity are not Banach-Mazur computable. We showed that the R_∞ function is additive. The zero-error feedback capacity is super-additive and we characterized its behavior.

We showed that for all finite alphabets X, Y with |X| ≥ 2 and |Y| ≥ 2 the channel reliability function itself is not a Turing computable performance function.
We also showed that the usual bounds, which have been extensively examined in the literature so far, are not Turing computable performance functions. It is unclear whether one can find non-trivial upper bounds for the channel reliability function at all that are Turing computable performance functions.

Shannon, Gallager and Berlekamp considered in [12] the sequence of the k-letter expurgation bounds to be very good channel data for approximating the channel reliability function. It was hoped that these sequences could be computed more easily with the use of new powerful digital computers. We showed that, unfortunately, this is not possible.

Acknowledgments

We thank the German Research Foundation (DFG) within the Gottfried Wilhelm Leibniz Prize under Grant BO 1734/20-1 for their support of H. Boche.

Further, we thank the German Research Foundation (DFG) within Germany's Excellence Strategy EXC-2111-390814868 for their support of H. Boche.

Thanks also go to the German Federal Ministry of Education and Research (BMBF) within the national initiative for "Post Shannon Communication (NewCom)" with the project "Basics, simulation and demonstration for new communication models" under Grant 16KIS1003K for their support of H. Boche, and with the project "Coding theory and coding methods for new communication models" under Grant 16KIS1005 for their support of C. Deppe.

Finally, we thank Yannik Böck for his helpful and insightful comments.