Levels of Abstraction and the Apparent Contradictory Philosophical Legacy of Turing and Shannon
Hector Zenil
Institut d'Histoire et de Philosophie des Sciences (Paris 1 Panthéon-Sorbonne, CNRS, ENS Ulm), Paris, France; and
Unit of Computational Medicine, Karolinska Institute, Centre for Molecular Medicine, Stockholm, Sweden
[email protected]
Abstract
In a recent article, Luciano Floridi explains his view of Turing's legacy in connection to the philosophy of information. I will very briefly survey one of Turing's other contributions to the philosophy of information and computation, including similarities to Shannon's own methodological approach to information through communication, showing how crucial they are and have been as methodological strategies to understanding key aspects of these concepts. While Floridi's concept of Levels of Abstraction is related to the novel methodology of Turing's imitation game for tackling the question of machine intelligence, Turing's other main contribution to the philosophy of information runs contrary to it. Indeed, the seminal concept of computation universality strongly suggests the deletion of fundamental differences among seemingly different levels of description. How might we reconcile these apparently contradictory contributions? I will argue that Turing's contribution should prompt us to plot some directions for a philosophy of information and computation, one that closely parallels the most important developments in computer science, one that understands the profound implications of the works of Turing, Shannon and others.
Floridi's recent article [5] seems to leave little doubt that Turing's most important contribution to the philosophy of information was the imitation game that he put forward [12] as a strategy for inquiring into and evaluating the intelligence capabilities of computing machines (today we call it the Turing test):

"When one looks at Turing's philosophical legacy, there seems to be two risks. One is to reduce it to his famous test (Turing 1950). This has the advantage of being clear cut. Anybody can recognize the contribution in question and place it within the relevant debate on the philosophy of artificial intelligence. The other risk is to dilute it down into an all-embracing narrative, making Turing's ideas the seeds of anything we do and know today,"

reads the Introduction of [5]. (It is worth clarifying here that Turing's prediction as to when machines would pass the test was rectified by Turing himself in a 1951 BBC radio interview, broadcast a year later in 1952, with M.H.A. Newman, Sir Geoffrey Jefferson and R.B. Braithwaite, "Can Automatic Calculating Machines Be Said to Think?" [14], when he offered a prediction of "at least 100 years", at a 70% chance. So it should not be claimed that he was wrong, as is claimed in [5] and many other sources.)

One main contribution of Turing's imitation game is methodological in nature, constituting a powerful epistemological approach to under-defined concepts. As Floridi asserts, Turing finds it more appropriate to ask a specific question, at the right level of description, that can be quantified rather than discussed ad infinitum.

As is well known, at an international mathematics conference in 1928, David Hilbert and Wilhelm Ackermann suggested the possibility that a mechanical process could be devised that was capable of proving all mathematical assertions. This notion, referred to as the Entscheidungsproblem, or 'the decision problem', made it not difficult to imagine that arithmetic could be amenable to a sort of mechanisation.
The origin of the Entscheidungsproblem dates back to Gottfried Leibniz who, having succeeded (circa 1672) in building a machine based on the ideas of Blaise Pascal that was capable of performing arithmetical operations (the Staffelwalze, or Step Reckoner), imagined a machine of the same kind that would be capable of manipulating symbols to determine the truth value of mathematical principles. Leibniz devoted himself to conceiving a formal universal language, which he designated the 'characteristica universalis', a language which would encompass, among other things, binary language and the definition of binary arithmetic.

In 1931, Kurt Gödel arrived at the conclusion that Hilbert's intention (also referred to as 'Hilbert's programme') of proving all theorems by mechanising mathematics was not possible under certain reasonable assumptions. Gödel advanced a formula that codified an arithmetical truth in arithmetical terms and that could not be proved without arriving at a contradiction. Even worse, it implied that there was no set of axioms containing arithmetic that was free of true formulae that could not be proved.

Theorems in a mathematical theory are formal semantic objects. They have truth value, conveying information attesting to the truth of the facts encompassed, all the way from the axioms, which are facts taken to be true by definition, to the statement of the theorem itself. But Gödel did something remarkable: he encoded the meaning of theorems in the syntax of the theory itself. He did this by associating symbols with numbers in order to encode meaning in the form of arithmetical propositions (a toy sketch of such an encoding appears below). Using a clever construction that led to a contradiction, he proved that some of these constructions are undecidable, that is, they cannot be assigned a meaning within the theory unless a larger, more powerful theory is used, which in turn would have new undecidables itself, hence leading to questions of absolute undecidability.

This fundamental relativisation put an end to the discussion of the feasibility of Hilbert's programme, given that no matter how strong a theory might be, there would always be meaningful statements from outside it that the theory would be unable to encompass. The relationship between truth and the provable was broken.

Just a few years after Gödel, Turing arrived at very similar conclusions by very different means. His means were mechanical, so the theorems and truths from Gödel's work were now nothing but the manipulation of symbols, sequences of tasks as mundane as those which people, then as now, were used to dealing with on an everyday basis: these were computer programs. Turing also showed that no matter how powerful you think a digital computer may be, it will turn out to have serious limitations, notwithstanding its remarkable properties.
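As an illustration of the kind of encoding involved, here is a toy sketch in Python of my own devising (not Gödel's exact scheme, and with hypothetical symbol codes): every formula of a formal language is mapped to a unique natural number by raising successive primes to the codes of its symbols, so that statements about numbers become numbers themselves.

```python
# Toy Goedel numbering (illustrative sketch, not Goedel's exact scheme):
# encode a sequence of symbols as one natural number by raising the
# n-th prime to the code of the n-th symbol.

def primes():
    """Yield primes 2, 3, 5, ... by trial division (fine for short formulas)."""
    found = []
    n = 2
    while True:
        if all(n % p for p in found):
            found.append(n)
            yield n
        n += 1

# Hypothetical symbol codes for a toy formal language.
CODES = {"0": 1, "S": 2, "=": 3, "+": 4, "(": 5, ")": 6}

def godel_number(formula):
    """Map a formula (a string of symbols) to a unique natural number."""
    g = 1
    for p, symbol in zip(primes(), formula):
        g *= p ** CODES[symbol]
    return g

# The arithmetical statement 'S0+S0=SS0' (i.e. 1+1=2) becomes a number,
# so statements *about* arithmetic are themselves objects of arithmetic.
print(godel_number("S0+S0=SS0"))
```

By unique prime factorisation the encoding is reversible, which is what allows a theory of numbers to speak about its own formulas.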
Alan Turing tackled the decision problem in a different way from Gödel. His approach included the first abstract description of the digital general-purpose computer as we know it today. Turing defined what in his article he termed an 'a' machine (for 'automatic'), now known as a Turing machine. Turing also showed that certain computer programs can be decided by more powerful computing machines. Unfortunately for the scheme of Floridi's Levels of Abstraction (LoA) [5], the rich hierarchy Turing derived pertains to incomputable objects, and intermediate degrees of computation are anything but natural: the known examples are non-constructive [11], hence of little significance to LoA.

As is widely known, a Turing machine is an abstract device which reads or writes symbols on a tape one at a time and can change its operation according to what it reads, moving forwards or backwards through the tape. The machine stops when it reaches a certain configuration (a combination of what it reads and its internal state). A Turing machine is said to produce an output if it halts, with the contents of the tape locations the machine has visited representing the output produced.

The most remarkable idea advanced by Turing is his concept of universality: his proof that there is an 'a' machine that is able to read other 'a' machines and behave as they would for any input. In other words, Turing proved that it was not necessary to build a new machine for each different task; a single machine that could be reprogrammed sufficed for all. Not only does this erase the distinction between programs carried out by different machines (since one machine suffices), but also that between programs and data, as one can always codify data as a program to be executed by another Turing machine and vice versa, just as one can always build a universal machine to execute any program.

Turing also proved that there are Turing machines that never halt, and if a Turing machine is to be universal, and hence able to simulate any other Turing machine or computer program, it is actually expected that it will never halt for a(n infinite) number of inputs of a certain type (while also halting for an infinite number of inputs). This is something we are faced with in everyday life, for even the simplest and most mundane tasks, approached using devices as simple as Turing machines, already impose limits on our knowledge of these devices and what they are or are not able to compute.
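To make this concrete, here is a minimal sketch of such a device in Python. The machine encoded in `rules` (a unary-successor machine) and all names are illustrative choices of mine, not anything from Turing's paper; the `max_steps` budget reflects the point just made, namely that one cannot, in general, decide in advance whether a given machine will ever halt.

```python
# A minimal one-tape Turing machine simulator (illustrative sketch).
# A machine is a transition table: (state, symbol) -> (new_symbol, move, new_state).

def run(rules, tape, state="q0", max_steps=10_000):
    """Run a Turing machine; return the visited portion of the tape if it
    halts within max_steps, or None (halting is undecidable in general)."""
    tape = dict(enumerate(tape))  # sparse tape; '_' stands for a blank cell
    pos, visited = 0, {0}
    for _ in range(max_steps):
        key = (state, tape.get(pos, "_"))
        if key not in rules:  # no applicable rule: the machine halts
            lo, hi = min(visited), max(visited)
            return "".join(tape.get(i, "_") for i in range(lo, hi + 1))
        symbol, move, state = rules[key]
        tape[pos] = symbol
        pos += 1 if move == "R" else -1
        visited.add(pos)
    return None  # budget exhausted; we cannot tell whether it would ever halt

# Hypothetical example: append a '1' to a block of 1s (unary successor).
rules = {
    ("q0", "1"): ("1", "R", "q0"),  # scan right over the block of 1s
    ("q0", "_"): ("1", "R", "qH"),  # write one more 1, then halt (qH has no rules)
}
print(run(rules, "111"))  # -> '1111_' (four 1s plus the blank the head moved onto)
```

A universal machine is then nothing more than such a `run` function whose `rules` and `tape` arguments are themselves read off the tape, which is precisely the collapse of the program/data distinction described above.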
Having approached the problem of defining an algorithm via the concept of Turing computation, a question that remains to be considered concerns the nature of information. Shannon did for the concept of information something similar to what Turing did for the concept of the algorithm. Not so long ago the problem of communicating a message was believed to be related only to the type of message and how fast one could send letters through a communication medium. When Morse code was invented, it became clear that the number of symbols was irrelevant: only two different symbols were required to convey any letter, and therefore any possible word and any possible message. Shannon separated information from meaning when it came to measuring certain aspects of messages, because meaning seemed to be irrelevant to the question of communication. The same medium could be used for what were thought to be completely different kinds of information, such as images, sounds and text, which today's computers show are not essentially different, being exactly the same at the machine level.

The computer casts the information in a form that we recognise as an image or a sound, a password or an emoticon, but there is no essential difference among these at the level of Shannon's information theory. Shannon formally proved [9] that any language, no matter how sophisticated, can be reduced to a 2-symbol system of yes-no answers, and that it can be so reduced quite efficiently. This is why we can now store any sort of information in the same device. As Turing showed with respect to computation, information storage too does not require different media; a single medium suffices, indeed any medium (of the same capacity) would suffice, modulo noise and other relevant considerations that Shannon himself also studied in remarkable formal detail [9].

That information can be of very different kinds is significant, but what is more remarkable is that all information can fundamentally be treated as being of the same type, with only the way in which its elements are arranged producing so many disparate meanings, to which Shannon's measures are immune. And if one wished to use Shannon's entropy to distinguish between messages using the same alphabet, one would soon find it unsuitable for capturing this level of meaning, as has been widely recognised, starting with Shannon himself. But this is not to say that no formal low-level quantification theory can deal with information and meaning at given levels of abstraction.
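Before turning to such a theory, Shannon's indifference to meaning can be illustrated in a few lines of Python: a meaningful English phrase and a random shuffle of its characters have identical symbol frequencies, and hence exactly the same entropy. The phrase and function names below are illustrative choices of mine.

```python
import math
import random
from collections import Counter

def shannon_entropy(message):
    """Shannon entropy in bits per symbol: H = -sum(p_i * log2(p_i))."""
    counts = Counter(message)
    n = len(message)
    # Sort the counts so equal distributions are summed in the same order,
    # giving bit-identical floating-point results.
    return -sum((c / n) * math.log2(c / n) for c in sorted(counts.values()))

meaningful = "information is the resolution of uncertainty"
shuffled = "".join(random.sample(meaningful, len(meaningful)))

# Same symbol frequencies, hence exactly the same entropy, even though
# one string is meaningful English and the other is gibberish.
print(shannon_entropy(meaningful))                               # some value in bits/symbol
print(shannon_entropy(meaningful) == shannon_entropy(shuffled))  # True
```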
Algorithmic information theory (AIT) [7, 1, 10, 8], for example, is better at dealing with subtle differences between messages and thereby capturing certain aspects of meaning [16]. Its central measure, Kolmogorov complexity (K), takes into consideration not only the message itself but also its recipient and generator. It tells us that a message can be quantified by the length in bits of the shortest computer program that generates it. The computer program reproduces the message and is included with the message itself, so it is in some sense self-executing, regardless of the carrier. Barry Cooper points out [2] that

"..., if one [limited] oneself to the usual computability models, the notion of randomness of finite strings seems to provide a first step toward a much needed theory of the incomputability of finite objects."

The theory of algorithmic information promises to allow some hypothesis testing on the algorithmicity of the world [18, 17], and it even introduces the need for an observer [16], given that one cannot calculate K directly but only by indirect methods, making approximations subject to differences in the methods used (a rough compression-based illustration is sketched below). But once again, it is not this relativity that makes the theory incredibly powerful, but rather the objective properties of this quantification of messages and meaning: the theory provides an invariance theorem asserting that the quantification of information content asymptotically converges to the same values regardless of the production method, even though different observers may see different things in the same bit sequence, interpreting it differently.

In [4], for example, the only reference to AIT as a formal context for the discussion of information content and meaning is a negative one, appearing in van Benthem's contribution (p. 171 of [4]). It reads:

"To me, the idea that one can measure information one-dimensionally in terms of a number of bits, or some other measure, seems patently absurd..."

I think this position is misguided. When Descartes transformed the notion of space into an infinite set of ordered numbers (coordinates), he did not strip the discussion and study of space of any of its interest. On the contrary, he advanced and expanded the philosophical discussion to encompass concepts such as dimension and curvature, which would not have been possible without the Cartesian intervention. Perhaps this answers the question that van Benthem poses immediately after the above remark (p. 171 of [4]):

"But in reality, this quantitative approach is spectacularly more successful, often much more so than anything produced in my world of logic and semantics. Why?"

Accepting a formal framework such as algorithmic complexity for information content does not mean that the philosophical discussion of information will be reduced to a discussion of the numbers involved, just as it was not in the case of the philosophy of geometry or space-time after Descartes. Thanks to Descartes, however, Euclidean geometry eventually exhausted itself, and much of the philosophical discussion was considered complete and settled. Yet we still pursue the philosophy of Euclidean geometry, because we now have a modern perspective that keeps giving us new material with which to approach what was done, how and why, from a hermeneutical perspective. And we have extended the reach of the philosophy of geometry to the philosophy of modern physics. In the future the same will happen for information, if we embrace the most recent developments in theoretical computer science, with the help of theories such as algorithmic information theory.
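Since K is incomputable, it is approximated in practice from above. One common, rough stand-in (one method among several, and only an upper bound) is the length of a losslessly compressed encoding of the message. The sketch below, with illustrative names of my own and zlib chosen merely for convenience, shows how such an approximation separates a highly structured string from a typical-looking one of the same length and symbol frequencies, a difference to which symbol-wise entropy is largely blind.

```python
import random
import zlib

def description_length(message: bytes) -> int:
    """Length of a zlib-compressed encoding: an upper bound (up to an
    additive constant) on a short description of the message. K itself
    is incomputable; this is only a rough, method-dependent approximation."""
    return len(zlib.compress(message, 9))

repetitive = b"01" * 500                                        # highly structured
random_like = bytes(random.choice(b"01") for _ in range(1000))  # typical string

# Both strings have length 1000 over the alphabet {0, 1} with roughly equal
# symbol frequencies, so per-symbol entropy barely separates them.
print(description_length(repetitive))   # tens of bytes: a short description suffices
print(description_length(random_like))  # several times larger: little structure to exploit
```

That the numbers differ slightly from one compressor to another is exactly the method-dependence mentioned above; the invariance theorem guarantees that, asymptotically, such choices change the measure only by an additive constant.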
Some things are more remarkable not because they are different but because they are the same, even if they can be studied in different ways and at different scales and levels of abstraction. Levels of abstraction are necessary for practical reasons. For example, we are used to seeing things at the level at which our physics and biology predispose us to see them; we are finite beings that can store information in certain limited, though extraordinary, ways. One cannot expect to reconstruct information from the bottom up with limited storage capacity and limited understanding. We will never be able to read machine code and see that it is obviously the source code of a sophisticated word processor, even when the machine code is only a plain translation of the computer code in which the software was originally programmed. Yet it is Turing and Shannon who taught us that software in machine code is the same thing as the software we interact with on our screens.

It is not that a machine cannot be provided with the step-by-step proof of a mathematical theorem. It is that no one is able to follow such a detailed description without becoming completely lost. For information to be useful, it needs to be packaged for human understanding at the right level. Indeed, that is why mathematicians have been so good at creating a language for themselves.

Access to and the study of different levels of abstraction are thus key to understanding our world. Concerned as he is to return to basic questions of the kind considered by Alan Turing within the framework of computability theory, Barry Cooper argues that uncomputability arises at certain levels of causal explanation, at the point of interaction between local and global phenomena [2], while at another level a phenomenon may be computable [3]:

"Even in non-linear systems, such high order behaviour [emergent phenomena] is causal — one phenomenon triggers another. Levels of explanation, from the quantum to the macroscopic, can be applied. But modelling the evolution of the higher-order effects is difficult in anything other than a broad-brush way. Such problems infiltrate all our models of the natural world."

Unlike Cooper, I do not think this is an impediment in principle but a practical limitation. Everything else in Cooper's reasoning, however, applies.

It turns out, then, that Turing's main contribution to computation complements the LoA approach in the end, but for different and less fundamental reasons. If, in dealing with emergent phenomena, a common task is to identify useful descriptions and to extract enough computational content to enable predictions to be made, then it is clear that one cannot look at natural phenomena at some arbitrary level; one will be able to compute very little if one is trying to extract a biological discovery from a quantum effect. At some level of abstraction, where the epistemological limits are of a less fundamental nature, the need for LoAs is a pragmatic necessity. Turing's contribution is thus twofold: on the one hand, the novel strategy epitomised by the Turing test, suggesting different levels of description; on the other, the seminal concept of Turing universality, collapsing levels in a fundamental way. The two will appear contradictory only if their elegant complementarity, fundamental on one side and pragmatic on the other, is overlooked.
References

[1] G.J. Chaitin, On the length of programs for computing finite binary sequences, Journal of the ACM, 13(4):547–569, 1966.

[2] S.B. Cooper and P. Odifreddi, Incomputability in Nature. In Computability and Models, The University Series in Mathematics, pp. 137–160, 2003.

[3] S.B. Cooper, The incomputable reality, Nature, 482:465, 2012.

[4] L. Floridi (ed.), Philosophy of Computing and Information: 5 Questions, Automatic Press/VIP, 2008.

[5] L. Floridi, Turing's three philosophical lessons and the philosophy of information, Phil. Trans. R. Soc. A, 370, 2012. (doi:10.1098/rsta.2011.0325)

[6] K. Gödel, On formally undecidable propositions of Principia Mathematica and related systems I (1931). In S. Feferman (ed.), Kurt Gödel: Collected Works, vol. I, Oxford University Press, pp. 144–195, 1986.

[7] A.N. Kolmogorov, Three approaches to the quantitative definition of information, Problems of Information and Transmission, 1(1):1–7, 1965.

[8] L.A. Levin, Laws of information conservation (non-growth) and aspects of the foundation of probability theory, Problems of Information Transmission, 10(3):206–210, 1974.

[9] C.E. Shannon, A Mathematical Theory of Communication, Bell System Technical Journal, 27:379–423 and 623–656, 1948.

[10] R.J. Solomonoff, A formal theory of inductive inference: Parts 1 and 2, Information and Control, 7:1–22 and 224–254, 1964.

[11] K. Sutner, Computational Equivalence and Classical Recursion Theory. In H. Zenil (ed.), Irreducibility and Computational Equivalence, Emergence, Complexity and Computation, vol. 2, Springer-Verlag, 2013.

[12] A.M. Turing, Computing Machinery and Intelligence, Mind, LIX(236):433–460, 1950.

[13] A.M. Turing, On Computable Numbers, with an Application to the Entscheidungsproblem, Proc. London Math. Soc., ser. 2, 42:230–265, 1936.

[14] Can automatic calculating machines be said to think?, BBC radio broadcast, 1952: a discussion between A.M. Turing, M.H.A. Newman, G. Jefferson and R.B. Braithwaite. Included in B.J. Copeland (ed.), The Essential Turing, Oxford University Press, 2004, and (ed. B.J. Copeland) in K. Furukawa, D. Michie and S. Muggleton (eds), Machine Intelligence 15, Oxford University Press, 1999. Original documents available online (accessed 18 August 2013).

[15] H. Zenil, An Algorithmic Approach to Information and Meaning, APA Newsletter on Philosophy and Computers, 11(1), 2011.

[16] H. Zenil, What is Nature-like Computation? A Behavioural Approach and a Notion of Programmability, Philosophy & Technology (special issue on History and Philosophy of Computing), 2013. (doi:10.1007/s13347-012-0095-2)

[17] H. Zenil, The World is Either Algorithmic or Mostly Random, 3rd place in the FQXi essay contest "Is Reality Digital or Analog?", 2011.

[18] H. Zenil and J.-P. Delahaye, On the Algorithmic Nature of the World. In G. Dodig-Crnkovic and M. Burgin (eds), Information and Computation, World Scientific, 2011.