A computability challenge: asymptotic bounds and isolated error-correcting codes
Yuri I. Manin
Max–Planck–Institut für Mathematik, Bonn, Germany
Dedicated to Professor C. S. Calude, on his 60th birthday
ABSTRACT. Consider the set of all error–correcting block codes over a fixed alphabet with $q$ letters. It determines a recursively enumerable set of points in the unit square with coordinates $(R, \delta)$ := (relative transmission rate, relative minimal distance).
Limit points of this set form a closed subset, defined by $R \le \alpha_q(\delta)$, where $\alpha_q(\delta)$ is a continuous decreasing function called the asymptotic bound. Its existence was proved by the author in 1981, but all attempts to find an explicit formula for it have so far failed.
In this note I consider the question whether this function is computable in the sense of constructive mathematics, and discuss some arguments suggesting that the answer might be negative.
1. Introduction.

1.1. Notation. This paper is a short survey focusing on an unsolved problem of the theory of error–correcting codes (cf. the monograph [VlaNoTsfa]).

Briefly, we choose and fix an integer $q \ge 2$ and a finite set, the alphabet $A$, of cardinality $q$. An (unstructured) code $C$ is defined as a non–empty subset $C \subset A^n$ of words of length $n \ge 1$. Such $C$ determines its code point $P_C = (R(C), \delta(C))$ in the $(R, \delta)$–plane, where $R(C)$ is called the transmission rate and $\delta(C)$ is the relative minimal distance of the code. They are defined by the formulas
$$\delta(C) := \frac{d(C)}{n(C)}, \quad d(C) := \min\{d(a,b) \mid a, b \in C,\ a \ne b\}, \quad n(C) := n,$$
$$R(C) := \frac{k(C)}{n(C)}, \quad k(C) := \log_q \mathrm{card}(C), \tag{1.1}$$
where $d(a,b)$ is the Hamming distance
$$d((a_i), (b_i)) := \mathrm{card}\{i \in (1, \dots, n) \mid a_i \ne b_i\}.$$
In the degenerate case $\mathrm{card}\, C = 1$ we put $d(C) = 0$. We will call the numbers $k = k(C)$, $n = n(C)$, $d = d(C)$ code parameters and refer to $C$ as an $[n, k, d]_q$–code.

A considerable bulk of research in this domain is dedicated either to the construction of (families of) “good” codes (e.g. algebraic–geometric ones), or to the proof that “too good” codes do not exist. A code is good if, in a sense, it simultaneously maximizes the transmission rate and the minimal distance. To be useful in applications, a good code must also come with feasible algorithms of encoding and decoding. The latter task includes the problem of finding a closest (in Hamming's metric) word in $C$, given an arbitrary word in $A^n$ that can be an output of a noisy transmission channel (error correction). Feasible algorithms exist for certain classes of structured codes. The simplest and most popular example is that of linear codes: $A$ is endowed with a structure of a finite field $\mathbb{F}_q$, $A^n$ becomes a linear space over $\mathbb{F}_q$, and $C$ is required to be a linear subspace.

Since the demands on good codes are mutually conflicting, it is natural to look for the bounds of the possible. A precise formulation of the notion of good codes can be given in terms of two notions: asymptotic bounds and isolated codes.
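To make the definitions (1.1) concrete, here is a minimal Python sketch that computes the parameters $[n, k, d]_q$ and the code point of a small code given as a list of words. The function names (`hamming`, `code_parameters`, `code_point`) and the two sample codes are illustrative choices, not taken from the paper; the sketch uses a brute-force search over pairs of words and does not check that the words really use at most $q$ letters.

```python
from itertools import combinations
from math import log


def hamming(a, b):
    """Hamming distance: number of positions where two equal-length words differ."""
    return sum(x != y for x, y in zip(a, b))


def code_parameters(code, q):
    """Return (n, k, d) as in (1.1) for a non-empty code given as an iterable of words."""
    words = sorted(set(code))
    n = len(words[0])
    if n < 1 or any(len(w) != n for w in words):
        raise ValueError("all words must have the same length n >= 1")
    k = log(len(words), q)  # k = log_q card(C), a real number in general
    # Degenerate case card(C) = 1: put d = 0, as in the text.
    d = 0 if len(words) == 1 else min(hamming(a, b) for a, b in combinations(words, 2))
    return n, k, d


def code_point(code, q):
    """Return the code point (R, delta) = (k/n, d/n)."""
    n, k, d = code_parameters(code, q)
    return k / n, d / n


# Two small binary examples (illustrative only).
repetition = ["000", "111"]                 # a [3, 1, 3]_2 code
even_weight = ["000", "011", "101", "110"]  # a [3, 2, 2]_2 code
print(code_parameters(repetition, 2))
print(code_point(even_weight, 2))
```

The pairwise search costs on the order of $\mathrm{card}(C)^2$ distance computations, so this is only meant for toy examples.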
Fix $q$ and denote by $V_q$ the set of all points $P_C$ corresponding to all $[n, k, d]_q$–codes. Define the code domain $U_q$ as the set of limit points of $V_q$. It was proved in [Man1] that $U_q$ consists of all points in $[0,1]^2$ lying below the graph of a certain continuous decreasing function $\alpha_q$:
$$U_q = \{(R, \delta) \mid R \le \alpha_q(\delta)\}. \tag{1.2}$$
Moreover, $\alpha_q(0) = 1$, $\alpha_q(\delta) = 0$ for $1 - q^{-1} \le \delta \le 1$, and the graph of $\alpha_q$ is tangent to the $R$–axis at $(1, 0)$ and to the $\delta$–axis at $(0, 1 - q^{-1})$. This curve is called the asymptotic bound. (In fact, [Man1] considered only linear codes, and the respective objects are now called $V_q^{\mathrm{lin}}$, $U_q^{\mathrm{lin}}$, $\alpha_q^{\mathrm{lin}}$; the unstructured case can be treated in the same way with minimal changes: cf. [ManVla] and [ManMar].)

Now, a code can be considered a good one if its point either lies in $U_q$ and is close to the asymptotic bound, or is isolated, that is, lies above the asymptotic bound.

There is an abundant literature establishing upper and lower estimates for asymptotic bounds, and providing many isolated codes. However, not only are “exact formulas” for asymptotic bounds unknown, but even the question whether $\alpha_q(\delta)$ is differentiable remains open (of course, since this function is monotone and continuous, it is differentiable almost everywhere). Similarly, the structure of the set of isolated code points is a mystery: for example, are there points on $R = \alpha_q(\delta)$, $0 < R < 1 - q^{-1}$, that are limit points of isolated codes?

The principal goal of this report is to discuss weaker versions of these problems, replacing “exact formulas” by “computability”. In particular, we try to elucidate the following
QUESTION. Is the function $\alpha_q(\delta)$ computable?

As our basic model of computability we adopt the one described in [BratWe] and further developed in [BratPre], [Brat], [BratMiNi]. In its simplest concrete version, it involves approximations of closed subsets of $\mathbb{R}^2$, such as $U_q$ or the graph of $\alpha_q$, by unions of computable sets of rational coordinate squares, “pixels” of varying size.

The following mental experiment suggests that the answer to this computability problem may not be obvious, and that $\alpha_q$ might even be uncomputable and by implication not expressible by any reasonable “explicit formula”.

Imagine that a computer is drawing finite approximations $V_q^{(N)}$ to the set of code points $V_q$ by plotting all points with $n \le N$ for a large $N$ (appropriately matching a chosen pixel size). What will we see on the screen?

Conjecturally, we will not see a dark domain approximating $U_q$ with a cloud of isolated points above it, but rather an eroded version of the Varshamov–Gilbert curve lying (at least partially) strictly below $R = \alpha_q(\delta)$:
$$R = \frac{1}{2}\bigl(1 - H_q(\delta)\bigr), \qquad H_q(\delta) := \delta \log_q(q-1) - \delta \log_q \delta - (1-\delta)\log_q(1-\delta) \tag{1.3}$$
(cf. [BaFo] for $q = 2$).

By contrast, a statistical meaning of the asymptotic bound does not seem to be known, and this appears as the intrinsic difficulty for a complete realization of the project started in [ManMar]: interpreting the asymptotic bound as a “phase transition” curve. Hopefully, a solution might be found if we imagine plotting code points in the order of their growing Kolmogorov complexity, as was suggested and used in [Man3] for renormalization of the halting problem. For the context of constructive mathematics, cf. [CaHeKhWa] and references therein.

In any case, it is clear that code domains represent an interesting testing ground for various versions of computability of subsets of $\mathbb{R}^n$, complementing the more popular Julia and Mandelbrot fractal sets (cf. [BravC] and [BravYa]).
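For orientation, the curve mentioned in the mental experiment can be evaluated numerically. The sketch below assumes the curve has the form $R = \frac{1}{2}(1 - H_q(\delta))$, where $H_q(\delta) = \delta\log_q(q-1) - \delta\log_q\delta - (1-\delta)\log_q(1-\delta)$ is the standard $q$-ary entropy function; the function names `entropy_q` and `half_vg_curve` are hypothetical labels introduced here.

```python
from math import log


def entropy_q(delta, q):
    """q-ary entropy H_q(delta) for 0 <= delta <= 1 (with the convention 0*log 0 = 0)."""
    if not 0 <= delta <= 1:
        raise ValueError("delta must lie in [0, 1]")
    if delta == 0:
        return 0.0
    if delta == 1:
        return log(q - 1, q)
    return (delta * log(q - 1, q)
            - delta * log(delta, q)
            - (1 - delta) * log(1 - delta, q))


def half_vg_curve(delta, q):
    """Assumed form of the curve: R = (1/2) * (1 - H_q(delta))."""
    return 0.5 * (1 - entropy_q(delta, q))


# Sample the curve for q = 2 on a coarse grid of delta values in [0, 1/2].
for i in range(6):
    delta = i / 10
    print(f"delta = {delta:.1f}   R = {half_vg_curve(delta, 2):.4f}")
```

Under this assumption the curve starts at $R = 1/2$ for $\delta = 0$ and reaches $R = 0$ at $\delta = 1 - q^{-1}$, the same endpoint at which $\alpha_q$ vanishes.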
2. Code parameters and code points: a summary

2.1. Constructive worlds of code parameters. Denote the set of all triples $[n, q^k, d] \in \mathbb{N}^3$ corresponding to all (resp. linear) $[n, k, d]_q$–codes by $P_q$ (resp. $P_q^{\mathrm{lin}}$). Clearly, $P_q$ and $P_q^{\mathrm{lin}}$ are infinite decidable subsets of $\mathbb{N}^3$. Therefore they admit natural recursive and recursively invertible bijections with $\mathbb{N}$ (“admissible numberings”), defined up to composition with any recursive permutation $\mathbb{N} \to \mathbb{N}$. Hence $P_q$ and $P_q^{\mathrm{lin}}$ are infinite constructive worlds in the sense of [Man3], Definition 1.2.1.

If $X$, $Y$ are two constructive worlds, we can unambiguously define the notions of (partial) recursive maps $X \to Y$, enumerable and decidable subsets of $X$, $Y$, $X \times Y$ etc., simply pulling them back to the numberings. For a more developed categorical formalism, cf. [Man3].

2.2. Constructive world $S = [0,1]^2 \cap \mathbb{Q}^2$. The set of all rational points of the unit square in the $(R, \delta)$–plane also has a canonical structure of a constructive world.

2.3. Code points. Code points (1.1) of linear codes all lie in $S$. To achieve this for unstructured codes, we will slightly amend (1.1) and define the map $cp : P_q \to S$ ($cp$ stands for “code point”) by
$$cp([n, q^k, d]) := \left(\frac{[k]}{n}, \frac{d}{n}\right) \tag{2.1}$$
where $[k]$ denotes the integer part of the (generally real) number $k$. On $P_q^{\mathrm{lin}} \subset P_q$ it coincides with (1.1).

The motivation for choosing (2.1) is this: in the eventual study of computability properties of the graph $R = \alpha_q(\delta)$, it is more transparent to approximate it by points with rational coordinates, rather than logarithms.

Let $V_q$ (resp. $V_q^{\mathrm{lin}}$) be the image $cp(P_q)$ (resp. $cp(P_q^{\mathrm{lin}})$), i.e. the respective set of code points in $S$. Since $cp$ is a total recursive function both on $P_q$ and $P_q^{\mathrm{lin}}$, $V_q$ and $V_q^{\mathrm{lin}}$ are recursively enumerable subsets of $S$.

Let $U_q$ (resp. $U_q^{\mathrm{lin}}$) be the closed set of limit points of $V_q$ (resp. $V_q^{\mathrm{lin}}$). We will call limit code points the elements of $V_q \cap U_q$ (resp. $V_q^{\mathrm{lin}} \cap U_q^{\mathrm{lin}}$). The remaining subset of isolated code points is defined as $V_q \setminus (V_q \cap U_q)$, and similarly for linear codes.

Notice that we get one and the same set $U_q$, using transmission rates (1.1) or (2.1). In fact, for any infinite sequence of pairwise distinct code parameters $[n_i, q^{k_i}, d_i]$, $i = 1, 2, \dots$ we have $n_i \to \infty$, hence the convergence of the sequence of code points (1.1) is equivalent to that of (2.1), and they have a common limit. The resulting sets of isolated code points differ depending on the adopted definition (1.1) or (2.1); however, the set of isolated codes, those whose code points are isolated, remains the same.

Our main result in this section is the following characterization of limit and isolated code points in terms of the recursive map $cp$ rather than the topology of the unit square.

We will say that a code point $x \in V_q$ has infinite (resp. finite) multiplicity, if $cp^{-1}(x) \subset P_q$ is infinite (resp. finite). The same definition applies to $V_q^{\mathrm{lin}}$ and $P_q^{\mathrm{lin}}$.
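The arithmetic of the map (2.1) can be sketched in a few lines of Python with exact rationals. The helper names `integer_part_log` and `cp` are illustrative, the triple is passed as $(n, M, d)$ with $M = q^k = \mathrm{card}(C)$, and the function only evaluates the formula: it does not check that a code with the given parameters actually exists, i.e. that the triple belongs to $P_q$.

```python
from fractions import Fraction


def integer_part_log(M, q):
    """Largest integer j with q**j <= M, i.e. [log_q M], using integer arithmetic only."""
    if M < 1 or q < 2:
        raise ValueError("need M >= 1 and q >= 2")
    j = 0
    while q ** (j + 1) <= M:
        j += 1
    return j


def cp(n, M, d, q):
    """Code point ([k]/n, d/n) of the triple [n, M, d], where M = q^k is the code cardinality."""
    return Fraction(integer_part_log(M, q), n), Fraction(d, n)


# For a power of q the integer part changes nothing: M = 4 = 2^2 gives [k] = k = 2.
print(cp(3, 4, 2, 2))
# For cardinalities 5, 6, 7 the integer part of log_2 M is 2 in each case,
# so the formula assigns the same rational point to all three triples.
print(cp(5, 5, 2, 2), cp(5, 6, 2, 2), cp(5, 7, 2, 2))
```

Working with integers and `Fraction` avoids floating-point rounding when $M$ is an exact power of $q$, which matters because the integer part is discontinuous exactly there.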
2.5. Theorem. (a) Code points of infinite multiplicity are limit points. Therefore isolated code points have finite multiplicity.

(b) Conversely, any point $(R_0, \delta_0)$ with rational coordinates satisfying the inequality $0 < R_0 < \alpha_q(\delta_0)$ (resp. $0 < R_0 < \alpha_q^{\mathrm{lin}}(\delta_0)$) is a code point (resp. linear code point) of infinite multiplicity.

This (actually, a slightly weaker) statement was, seemingly, first stated and proved in [ManMar]. It makes me suspect that distinguishing between limit and isolated code points might be algorithmically undecidable, since in general it is algorithmically impossible to decide whether a given recursive function takes one of its values at finitely or infinitely many points.

Similarly, one cannot expect a priori that limit and isolated code points form two recursively enumerable sets, but this must be true if $\alpha_q$ is computable: see Theorem 3.3.1 below.

For completeness, I will reproduce the proof of Theorem 2.5 here. It is based on the same “Spoiling Lemma” that underlies the only known proof of existence of the asymptotic bounds $\alpha_q$ and $\alpha_q^{\mathrm{lin}}$.

2.6. Proposition (Numerical spoiling). If there exists a linear $[n, k, d]_q$–code, then there exist also linear codes with the following parameters:

(i) $[n+1, k, d]_q$ (always).
(ii) $[n-1, k, d-1]_q$ (if $n > 1$, $k > 0$).
(iii) $[n-1, k-1, d]_q$ (if $n > 1$, $k > 1$).

In the domain of unstructured codes statements (i) and (ii) remain true, whereas in (iii) one should replace $[n-1, k-1, d]_q$ by $[n-1, k', d]_q$ for some $k - 1 \le k' < k$.

For a proof of Proposition 2.6, see e.g. [VlaNoTsfa] (linear codes) and [ManMar] (unstructured codes).

Proof of Theorem 2.5. (a) We first check that if a code point $(R_0, \delta_0) \in \mathbb{Q}^2$ is of infinite multiplicity, then it is a limit point. In fact, let $[n_i, q^{k_i}, d_i]$ be an infinite sequence of pairwise distinct code parameters, $i \ge 1$, such that $[k_i]/n_i = R_0$, $d_i/n_i = \delta_0$ for all $i$. Then codes with parameters $[n_i + 1, q^{k_i}, d_i]$ (cf. 2.6 (i)) produce infinitely many pairwise distinct code points converging to $(R_0, \delta_0)$.

(b) Now consider a rational point $(R_0, \delta_0) \in \mathbb{Q}^2 \cap (0,1)^2$ lying strictly below the respective asymptotic bound (unstructured or linear). Then there exists a code point $(R', \delta')$ also lying strictly below the asymptotic bound, with $R' > R_0$ and $\delta' > \delta_0$, because the functions $\alpha_q$ and $\alpha_q^{\mathrm{lin}}$ decrease. Hence in the part of $U_q$ (resp. $U_q^{\mathrm{lin}}$) where $R \ge R'$, $\delta \ge \delta'$, there exists an infinite family of pairwise distinct code points $(R_i, \delta_i)$, $i \ge 1$, coming from a family of unstructured (resp. linear) $[N_i, K_i, D_i]_q$–codes.

Let $(R_0, \delta_0) = (k/n, d/n)$. Divide $N_i$ by $n$ with a remainder term, i.e. put $N_i = (a_i - 1) n + r_i$, $a_i \ge 1$, $0 \le r_i < n$. Using 2.6 (i) repeatedly, spoil the respective $[N_i, K_i, D_i]_q$–code, replacing it by some $[a_i n, K_i, D_i]_q$–code. Its code point will have slightly smaller coordinates than the initial $(R_i, \delta_i)$; however, for $N_i$ large enough, it will remain in the domain $R > R_0$, $\delta > \delta_0$. Hence we may and will assume from the start that in our sequence of $[N_i, K_i, D_i]_q$–codes all $N_i$'s are divisible by $n$:
$$N_i = a_i n. \tag{2.2}$$
In order to realize $(R_0, \delta_0) = (k/n, d/n)$ as a code point of these spoiled codes, we will first consider the case of linear codes where the procedure is neater, because $[K_i] = K_i$. Since we have $K_i/N_i > k/n$, $D_i/N_i > d/n$, we get
$$K_i > a_i k, \qquad D_i > a_i d.$$
To complete the proof, it remains to reduce the parameters $K_i$, $D_i$ to $a_i k$, $a_i d$ respectively, without reducing $N_i = a_i n$. In the linear case, this is achieved by application of several steps 2.6 (ii), 2.6 (iii), followed by steps 2.6 (i).

In the unstructured case reducing $D_i$ can be done in the same way. It remains to reduce $[K_i]$ to $a_i k$. One application of the step 2.6 (iii) produces $K_i'$ such that either $[K_i'] = [K_i] - 1$, or $[K_i'] = [K_i]$. In the latter case, after restoring $N_i$ to its former value, one must apply 2.6 (iii) again. After a finite number of such substeps, we will finally get $[K_i] - 1$. Iterating, we reach $a_i k$. Since $N_i = a_i n \to \infty$, this yields infinitely many pairwise distinct code parameters with code point $(R_0, \delta_0)$.

Question. Can one find a recursive function $b(n, k, d, q)$ such that if an $[n, k, d]_q$–code is isolated, and $a > b(n, k, d, q)$, there is no code with parameters $[an, ak, ad]_q$?
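The bookkeeping in the linear case of the proof of Theorem 2.5 (b) can be traced on parameters alone. The following Python sketch is only that: it manipulates triples $(n, k, d)$, assuming the spoiling steps act as $(n,k,d) \mapsto (n+1,k,d)$, $(n-1,k,d-1)$, $(n-1,k-1,d)$ with the side conditions $n > 1$, $k > 0$ for (ii) and $n > 1$, $k > 1$ for (iii). The function name `spoil_linear` and the sample input are illustrative, and no actual codes are constructed.

```python
def spoil_linear(N, K, D, n, k, d):
    """Trace parameter triples from [N, K, D] down to [a*n, a*k, a*d], where N = a*n.

    Assumes K > a*k and D > a*d, as in the proof; returns the list of triples visited.
    """
    if N % n != 0:
        raise ValueError("N must be divisible by n (cf. (2.2))")
    a = N // n
    if not (K > a * k and D > a * d):
        raise ValueError("need K > a*k and D > a*d")
    cur_n, cur_k, cur_d = N, K, D
    trace = [(cur_n, cur_k, cur_d)]
    # Step (ii): lower the distance to a*d; each step also shortens the code by one.
    while cur_d > a * d:
        assert cur_n > 1 and cur_k > 0, "side condition of step (ii) violated"
        cur_n, cur_d = cur_n - 1, cur_d - 1
        trace.append((cur_n, cur_k, cur_d))
    # Step (iii): lower the dimension to a*k; each step also shortens the code by one.
    while cur_k > a * k:
        assert cur_n > 1 and cur_k > 1, "side condition of step (iii) violated"
        cur_n, cur_k = cur_n - 1, cur_k - 1
        trace.append((cur_n, cur_k, cur_d))
    # Step (i): restore the length to N = a*n without changing k and d.
    while cur_n < N:
        cur_n += 1
        trace.append((cur_n, cur_k, cur_d))
    return trace


# Starting from the parameters [7, 4, 3] and aiming at (k/n, d/n) = (2/7, 2/7), so a = 1.
for triple in spoil_linear(7, 4, 3, 7, 2, 2):
    print(triple)
```

The order of the loops mirrors the sentence in the proof: steps (ii) and (iii) first, then steps (i) to bring the length back to $a_i n$.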
3. Codes and computability

In this section, I will discuss computability of two types of closed sets in $[0,1]^2$: $U_q$ and $\Gamma_q$ := the graph of $\alpha_q$, as well as their versions for linear codes. I will start with a brief summary of basic definitions of [BratWe] in our context.

3.1. Effective closed sets. First, we will consider $[0,1]^2$, $U_q$ and $\Gamma_q$ as closed subsets in a larger square, say $X := [-1, 2]^2$, with its structure of compact metric space given by $d((a_i), (b_i)) := \max |a_i - b_i|$. The set of open balls $\mathcal{B}$ with rational centers and radii in this space has a natural structure of a constructive world (cf. 2.1). Hence we may speak about (recursively) enumerable and decidable subsets of $\mathcal{B}$.

Following [BratWe] and [La], we will consider three types of effectivity of closed subsets $Y \subset X$:

(i) $Y$ is called recursively enumerable, if the subset
$$\{I \in \mathcal{B} \mid I \cap Y \ne \emptyset\} \subset \mathcal{B} \tag{3.1}$$
is recursively enumerable in $\mathcal{B}$.
(ii) $Y$ is called co–recursively enumerable, if the subset
$$\{I \in \mathcal{B} \mid \overline{I} \cap Y = \emptyset\} \subset \mathcal{B} \tag{3.2}$$
is recursively enumerable in $\mathcal{B}$ (here $\overline{I}$ is the closure of $I$).
(iii) $Y$ is called recursive, if it is simultaneously recursively enumerable and co–recursively enumerable.

As a direct application of [BratWe] we find:

3.2. Proposition. The closures $\overline{V}_q$ and $\overline{V}_q^{\mathrm{lin}}$ are recursively enumerable.

Proof. In fact, the range of the function $cp$ (see 2.3) is dense in $\overline{V}_q$, resp. $\overline{V}_q^{\mathrm{lin}}$, and we can apply [BratWe], Corollary 3.13(1)(d).

3.3. Computability of asymptotic bounds. Referring to Corollary 7.3 of [Brat], we will call $\alpha_q$ (resp. $\alpha_q^{\mathrm{lin}}$) computable, if its graph $\Gamma_q$ (resp. $\Gamma_q^{\mathrm{lin}}$) is co–recursively enumerable.

3.3.1. Theorem. Assume that $\alpha_q$ is computable. Then each of the following sets is recursively enumerable:

(a) Code points lying strictly below the asymptotic bound.
(b) Isolated code points.

The same is true for linear codes, if $\alpha_q^{\mathrm{lin}}$ is computable.

Proof. We start with the following remark. Choose any integer $N \ge 1$ and consider the set $\Gamma_q^{(N)}$ which is the union of closed balls of the form
$$I = \left[\frac{p}{N}, \frac{p+1}{N}\right] \times \left[\frac{p'}{N}, \frac{p'+1}{N}\right] \subset X \tag{3.3}$$
with $p, p' \in \mathbb{N}$, $I \cap \Gamma_q \ne \emptyset$. Then we have:

(i) The boundary of $\Gamma_q^{(N)}$ consists of two vertical (parallel to the $R$–axis) segments at the ends and two piecewise linear connected closed curves: $\Gamma_{q+}^{(N)}$ lying above $\Gamma_{q-}^{(N)}$.
(ii) The distance of any point $x \in \Gamma_{q-}^{(N)}$ to $\Gamma_{q+}^{(N)}$ does not exceed $[\ldots]/N$, and similarly with $+$ and $-$ reversed.

Let us call an $N$–strip any connected closed set satisfying these conditions.

Now, assuming $\alpha_q$ (resp. $\alpha_q^{\mathrm{lin}}$) computable, that is, $\Gamma_q$ (resp. $\Gamma_q^{\mathrm{lin}}$) co–recursively enumerable, choose $N$ and run the algorithm generating in some order all rational closed balls $I$ such that $I \cap \Gamma_q = \emptyset$. Wait until their subset consisting of balls of the form (3.3) covers the whole square $[0,1]^2$ with the exception of a set whose closure is an $N$–strip. This strip will then be an approximation to $\Gamma_q$ (resp. $\Gamma_q^{\mathrm{lin}}$) containing the respective graph in the subset of its inner points.

Run in parallel an algorithm generating all code points and divide each partial list of code points into three parts depending on $N$: points lying below $\Gamma_q^{(N)}$, above $\Gamma_q^{(N)}$, and inside $\Gamma_q^{(N)}$.

When $N$ grows, the growing first and second parts respectively will recursively enumerate code points below and above the asymptotic bound.
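The three-way sorting of points relative to an $N$–strip can be illustrated in Python. Since $\alpha_q$ itself is unknown, the sketch substitutes a hypothetical stand-in, the line $\max(0, 1 - 2\delta)$, purely so that the strip can be written down explicitly; the names `toy_alpha`, `strip_columns` and `classify` are likewise illustrative. In the argument above the strip would instead emerge from an enumeration of balls disjoint from $\Gamma_q$, which this sketch does not model.

```python
from fractions import Fraction
from math import ceil, floor


def toy_alpha(delta):
    """Hypothetical decreasing stand-in for the unknown bound (NOT the true alpha_q)."""
    return max(Fraction(0), 1 - 2 * delta)


def strip_columns(alpha, N):
    """For each delta-column j, the range (lo, hi) of R-indices i such that the square
    [i/N, (i+1)/N] x [j/N, (j+1)/N] in the (R, delta)-plane meets the graph R = alpha(delta)."""
    cols = []
    for j in range(N):
        top = alpha(Fraction(j, N))         # largest value of alpha on the column
        bottom = alpha(Fraction(j + 1, N))  # smallest value of alpha on the column
        lo = max(0, ceil(bottom * N) - 1)   # smallest i with (i+1)/N >= bottom
        hi = min(N - 1, floor(top * N))     # largest i with i/N <= top
        cols.append((lo, hi))
    return cols


def classify(R, delta, cols, N):
    """Return 'below', 'above' or 'inside' for a rational point of the unit square."""
    js = [j for j in range(N) if Fraction(j, N) <= delta <= Fraction(j + 1, N)]
    if all(R < Fraction(cols[j][0], N) for j in js):
        return "below"
    if all(R > Fraction(cols[j][1] + 1, N) for j in js):
        return "above"
    return "inside"


point = (Fraction(9, 20), Fraction(1, 4))  # strictly below the toy graph, but close to it
for N in (10, 100):
    print(N, classify(*point, strip_columns(toy_alpha, N), N))
```

A point strictly below the graph may be reported as "inside" for a coarse grid and only moves to "below" once $N$ is large enough, which is why the enumeration in the proof lets $N$ grow.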
Note that this reasoning also shows, in accordance with [Brat], that if we assume $\Gamma_q$ only co–recursively enumerable, it will automatically be recursively enumerable and therefore recursive.

Theorem. Assume that $U_q$ is recursive in the sense of 3.1(iii). Then $\alpha_q$ is computable. A similar statement holds for linear codes.

Proof. Consider first a closed ball $I$ as in (3.3) that intersects $U_q$ whereas its interior $I^{\circ}$ does not intersect $U_q$. A moment's contemplation will convince the reader that the lower left boundary point of this “ball” (a square in the Euclidean metric) is precisely the intersection point $I \cap \Gamma_q$. Call such a ball an exceptional $N$–ball. Since $\alpha_q$ is decreasing, we have:

(a) Each horizontal strip $p/N \le R \le (p+1)/N$ and each vertical strip $p'/N \le \delta \le (p'+1)/N$ can contain no more than one exceptional $N$–ball.
(b) If one exceptional $N$–ball lies to the right of another one, then it also lies lower than that one.

Generally, call a set of $N$–balls $N$–admissible, if it satisfies (a) and (b).

Now, assuming $U_q$ recursive and having chosen $N$, we can run two algorithms in parallel: one generating closed balls (3.3) not intersecting $U_q$, and another generating open balls (3.3) intersecting $U_q$. Run them until all $N$–balls are generated, with a possible exception of an $N$–admissible subset $X_q^{(N)}$, then stop generation. Let $U_{q+}^{(N)}$ be the union of all balls generated by the first algorithm, and $U_{q-}^{(N)}$ the union of all balls generated by the second algorithm.

Look through all the balls in $X_q^{(N)}$ in turn. If there are elements in it whose closure does not intersect the closure of $U_{q-}^{(N)}$, delete them from $X_q^{(N)}$ and put them into $U_{q+}^{(N)}$. Similarly, if there are elements in it whose closure does not intersect the (initial) $U_{q+}^{(N)}$, delete them from $X_q^{(N)}$ and put them into $U_{q-}^{(N)}$. Keep the old notations $U_{q-}^{(N)}$, $U_{q+}^{(N)}$, $X_q^{(N)}$ for these amended sets.

Now, the union of the lower boundary of $U_{q+}^{(N)}$ and the upper boundary of $U_{q-}^{(N)}$ will approximate $\Gamma_q$ from two sides, with error not exceeding $N^{-1}$. (Here a “boundary” means the respective set of boundary squares.)

Clearly, this reasoning also shows computability of $\alpha_q$ in the sense of 3.3.

References

[BaFo] A. Barg, G. D. Forney.
Random codes: minimum distances and error exponents. IEEE Transactions on Information Theory, vol. 48, no. 9 (2002), 2568–2573.
[Brat] V. Brattka. Plottable real functions and the computable graph theorem. SIAM J. Comput., vol. 38, no. 1 (2008), 303–328.
[BratMiNi] V. Brattka, J. S. Miller, A. Nies. Randomness and differentiability. arXiv:1104.4456
[BratPre] V. Brattka, G. Presser. Computability on subsets of metric spaces. Theoretical Computer Science, 305 (2003), 43–76.
[BratWe] V. Brattka, K. Weihrauch. Computability on subsets of Euclidean space I: closed and compact subsets. Theoretical Computer Science, 219 (1999), 65–93.
[BravC] M. Braverman, St. Cook. Computing over the reals: foundations for scientific computing. Notices AMS, 53:3 (2006), 318–329.
[BravYa] M. Braverman, M. Yampolsky. Computability of Julia sets. Moscow Math. Journ., 8:2 (2008), 185–231.
[CaHeKhWa] C. S. Calude, P. Hertling, B. Khoussainov, Yongge Wang. Recursively enumerable reals and Chaitin Ω numbers. Theor. Comp. Sci., 255 (2001), 125–149.
[La] D. Lacombe. Extension de la notion de fonction récursive aux fonctions d'une ou plusieurs variables réelles, I–III. C. R. Ac. Sci. Paris, 240 (1955), 2478–2480; 241 (1955), 13–14, 151–153.
[Man1] Yu. I. Manin. What is the maximum number of points on a curve over $\mathbb{F}_2$? J. Fac. Sci. Tokyo, IA, Vol. 28 (1981), 715–720.
[Man2] Yu. I. Manin. Renormalization and computation I: motivation and background. Preprint math.QA/0904.4921
[Man3] Yu. I. Manin. Renormalization and Computation II: Time Cut-off and the Halting Problem. Preprint math.QA/0908.3430
[ManMar] Yu. I. Manin, M. Marcolli. Error–correcting codes and phase transitions. arXiv:0910.5135
[ManVla] Yu. I. Manin, S. G. Vladut. Linear codes and modular curves. J. Soviet Math., Vol. 30 (1985), 2611–2643.
[TsfaVla] M. A. Tsfasman, S. G. Vladut. Algebraic–geometric codes. Kluwer, 1991.
[VlaNoTsfa] S. G. Vladut, D. Yu. Nogin, M. A. Tsfasman. Algebraic geometric codes: basic notions. Mathematical Surveys and Monographs, 139. American Mathematical Society, Providence, RI, 2007.