Towards Taming Java Wildcards and Extending Java with Interval Types
Moez A. AbdelGawad
Informatics Research Institute, SRTA-City, Alexandria, Egypt
Abstract
Of the complex features of generic nominally-typed OO type systems, wildcard types and variance annotations are probably the hardest to fully grasp. As demonstrated when adding closures (a.k.a., lambdas) and when extending type inference in Java, wildcard types and variance annotations make the development and progress of OO programming languages, and of their type systems in particular, a challenging and delicate task.

In this work we build on our concurrent work, in which we model Java subtyping using a partial graph product, to suggest how wildcard types in Java can be generalized, and simplified, to interval types. In particular, interval types correspond to endpoints of paths in the Java subtyping graph.

In addition to being a simple and more familiar notion that is easier to grasp than wildcard types, interval types are strictly more expressive than wildcard types. As such, we believe interval types, when developed and analyzed in full, will be a welcome addition to Java and other similar generic nominally-typed OO programming languages.
Keywords
Object-Oriented Programming (OOP), Interval Types, Nominal Typing, Subtyping, Type Inheritance, Generics, Type Polymorphism, Variance Annotations, Java, Java Wildcards, Wildcard Types, Partial Graph Product
1. Introduction
Java wildcard types [14, 15, 27, 28] (also known as ‘Java wildcards’) are wild. Wildcard types in Java are an instance of so-called usage-site variance annotations. Another form of variance annotations, called declaration-site variance annotations, is supported in other generic nominally-typed OO programming languages such as C
To illustrate how wild Java wildcard types, and variance annotations more generally, can be, consider the following Java code. The reader is invited to tell (before he or she checks the answer below) which of the variable assignments in the following code (all involving wildcard types) will typecheck and which ones the Java typechecker will complain about. class E
Answer: All class and variable declarations in the code above typecheck, but the second assignment in each of the five groups of variable assignments (i.e., elements of the middle column) does not typecheck, while the other six assignments will typecheck.

Now, did the reader figure this answer correctly? ... and quickly? ... and, most importantly, can the reader tell the why for his or her findings? ... (Unfortunately, the error messages that the Java compiler emits, which involve wildcard “capturing”, are largely unhelpful in this regard).

The difficulty in figuring out which of the assignments above (let alone assignments involving even more complex generic types) do typecheck and which ones do not, as well as the difficulty in figuring out the reason in each case, and the almost total unhelpfulness of error messages, is an example of how hard (a.k.a., wild) reasoning about wildcard types can be.

A main motivation for our research, and for the work presented in this paper in particular, is to face this situation head-on and try to significantly improve it. We believe the research presented here and in other closely-related publications is a significant step towards that goal. (Based on earlier research we did [3, 4] that largely complements the work we present in this paper, we believe Java generics error diagnostics, for example, can be significantly improved once wildcard types are fully tamed and generalized to interval types.)

This paper is structured as follows. In Section 2 we briefly discuss the background needed for reading this paper. In Section 3 we then formally define OO interval types and the construction method of the Java subtyping relation with interval types. In Section 4 we present examples that illustrate the definitions in Section 3 (in Appendix A we present the SageMath code we developed to help us in generating these examples). In Section 5 we briefly discuss research related to the research we present in this paper. In Section 6 we make some concluding remarks and discuss some possible future work that can build on work presented in this paper.
2. Background
The background necessary for reading this paper is basically the same as that of [5] (a summary is presented in Section 2 of [6]). In the following sections of this paper we largely adopt the same assumptions, vocabulary and notation of [5, 6].

In particular, the construction method we use to construct the Java subtyping relation with interval types (which is formalized in Section 3 of this paper), except for its use of interval types, is the same as the construction method used to construct the Java subtyping relation with wildcard types (as presented in [6]).
3. Interval Types and Constructing The Java Subtyping Relation
In this section we formally introduce interval types, then we formally define the construction method of the Java subtyping relation between generic reference types with interval type arguments.
Informally, similar to closed intervals over real numbers or over integers, which are sets of numbers between and including two bounding numbers, intervals over a directed graph are sets of vertices between and including two bounding vertices in the graph.

Formally, we define intervals of a directed graph as the quotient set of the set of paths of the graph (including trivial, zero-length paths from vertices of the graph to themselves) over the equivalence relation of paths of the graph with the same endpoints, as follows.

Let $G = (V, E)$ be a directed graph, and let $P(G)$ be the set of paths in $G$. For a path $p$ in $P(G)$ let $s(p) \in V$ and $e(p) \in V$ denote the (possibly equal) start and end vertices of $p$ (i.e., the endpoints of path $p$). For paths $p_1$ and $p_2$ in $G$ (i.e., $p_1, p_2 \in P(G)$), define the relation $\leftrightarrow$ over $P(G)$ as the equivalence relation

$$p_1 \leftrightarrow p_2 \iff s(p_1) = s(p_2) \wedge e(p_1) = e(p_2).$$

In agreement with intuition, using the properties of $=$ and conjunction ($\wedge$) it is easy to show that relation $\leftrightarrow$ is a reflexive ($\forall p \in P(G),\ p \leftrightarrow p$), symmetric ($\forall p_1, p_2 \in P(G),\ p_1 \leftrightarrow p_2 \implies p_2 \leftrightarrow p_1$) and transitive relation ($\forall p_1, p_2, p_3 \in P(G),\ p_1 \leftrightarrow p_2 \wedge p_2 \leftrightarrow p_3 \implies p_1 \leftrightarrow p_3$), i.e., that $\leftrightarrow$ is an equivalence relation. Relation $\leftrightarrow$ thus partitions $P(G)$.

The set $I(G)$ of intervals of graph $G$ is then defined as the quotient set

$$I(G) = P(G) / \leftrightarrow$$

(i.e., as the set of equivalence classes defined by $\leftrightarrow$).

Equivalently, intervals of $G$ can be defined using the reflexive transitive closure of $G$. That is because the intervals of $G$ are in one-to-one correspondence with the edges (including self-edges/loops) of $RTC(G)$, the reflexive transitive closure of $G$. In other words, if $RTC(G) = (V, E_{rtc})$ then we have $I(G) \simeq E_{rtc}$.
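The two equivalent definitions above can be checked on a small graph. The following is a minimal Python sketch, not code from this paper: the helper names (`all_paths`, `intervals_from_paths`, `rtc_edges`) and the diamond-shaped example graph are assumptions made for illustration, and the path enumeration assumes the graph is acyclic.

```python
from itertools import product

def all_paths(vertices, edges):
    """Enumerate every path of an acyclic digraph, including the
    trivial zero-length path at each vertex."""
    succ = {v: [w for (u, w) in edges if u == v] for v in vertices}
    paths = []

    def extend(path):
        paths.append(tuple(path))
        for w in succ[path[-1]]:
            extend(path + [w])

    for v in vertices:
        extend([v])
    return paths

def intervals_from_paths(paths):
    """Quotient the set of paths by 'same endpoints': each key
    (s(p), e(p)) names one equivalence class, i.e. one interval."""
    classes = {}
    for p in paths:
        classes.setdefault((p[0], p[-1]), []).append(p)
    return classes

def rtc_edges(vertices, edges):
    """Edges of the reflexive transitive closure (Warshall-style)."""
    reach = {(v, v) for v in vertices} | set(edges)
    # product() varies its first component slowest, so k is the outer loop.
    for k, i, j in product(vertices, repeat=3):
        if (i, k) in reach and (k, j) in reach:
            reach.add((i, j))
    return reach

# Hypothetical example graph: edges point from subtype to supertype.
V = ["Null", "C", "D", "Object"]
E = [("Null", "C"), ("Null", "D"), ("C", "Object"), ("D", "Object")]

paths = all_paths(V, E)
classes = intervals_from_paths(paths)
print(set(classes) == rtc_edges(V, E))
print(classes[("Null", "Object")])
```

In this example the interval from `Null` to `Object` has two paths in its equivalence class (one through `C`, one through `D`), which illustrates that one interval may correspond to several paths.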
Hence, based on the definition of graph intervals, one graph interval can correspond to multiple (one or more) paths of the graph, all having the same endpoints, but one path of the graph corresponds to exactly one graph interval, namely the interval defined by the endpoints of the path (i.e., its start and end vertices [18]). As such, using two standard notations for intervals, we denote a graph interval $I$ with endpoints $v_1$ and $v_2$ either by $I = [v_1 - v_2]$, or sometimes equivalently by $I = [v_1, v_2]$ (given that we define intervals over directed graphs, the order of the vertices in a graph interval expression matters).

(Footnote: The two bounds of a graph interval are usually called its lower bound and its upper bound, particularly when the graph is that of a partially-ordered set, as is the case for the OO subtyping relation and for the smaller-than-or-equals relation on numbers.)

Informally, an interval $I = [v_1 - v_2]$ over a graph $G$ can be viewed as a pair of vertices $(v_1, v_2)$ in $G$ where a path from $v_1$ to $v_2$ is guaranteed to exist in $G$ ($v_2$ is said to be reachable from $v_1$). Not all pairs of vertices of $G$ satisfy this condition. A pair of vertices that do satisfy this connectedness condition corresponds to an interval over $G$, while the vertices of a pair that does not satisfy the condition are called disconnected vertices (they are sometimes also called parallel vertices, particularly when $G$ is the graph of a partial order and when also the inverse pair $(v_2, v_1)$ does not form an interval, and are usually denoted by $v_1 \parallel v_2$ in order theory literature). In other words, a graph interval over a directed graph $G$ corresponds to a pair of connected or reachable vertices in $G$, where (in the context of directed graphs) ‘connectivity’ is understood to be to the second mentioned vertex ($v_2$) and ‘reachability’ is understood to be from the first mentioned vertex ($v_1$), i.e., we say $v_2$ is reachable from $v_1$ or, equivalently, that $v_1$ is connected to $v_2$.

We then simply define OO interval types in an OO type system as graph intervals over the graph of the OO subtyping relation (i.e., the graph whose vertices are OO types and whose edges correspond to the subtyping relation between OO types).
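To make the distinction between intervals and disconnected (parallel) pairs concrete, here is a small Python sketch over a hypothetical class hierarchy. The function names `reachable` and `classify`, the labels they return, and the example hierarchy are assumptions for illustration only.

```python
def reachable(edges, src, dst):
    """True if there is a (possibly zero-length) path from src to dst."""
    seen, stack = set(), [src]
    while stack:
        v = stack.pop()
        if v == dst:
            return True
        if v not in seen:
            seen.add(v)
            stack.extend(w for (u, w) in edges if u == v)
    return False

def classify(edges, v1, v2):
    """Classify the ordered pair (v1, v2) over the given graph."""
    if reachable(edges, v1, v2):
        return "interval"   # [v1 - v2] is an interval
    if reachable(edges, v2, v1):
        return "reversed"   # only the inverse pair (v2, v1) is an interval
    return "parallel"       # v1 || v2: neither order forms an interval

# Hypothetical subtyping graph; edges point from subtype to supertype.
TYPES = ["Null", "E", "C", "D", "Object"]
SUBTYPING = [("Null", "E"), ("E", "C"), ("C", "Object"),
             ("Null", "D"), ("D", "Object")]

# All interval types over this graph, as (lower bound, upper bound) pairs.
interval_types = [(a, b) for a in TYPES for b in TYPES
                  if reachable(SUBTYPING, a, b)]

print(classify(SUBTYPING, "E", "Object"))
print(classify(SUBTYPING, "Object", "E"))
print(classify(SUBTYPING, "C", "D"))
```

Because the order of the two vertices matters, the pair (`Object`, `E`) is not an interval even though (`E`, `Object`) is, while `C` and `D` are parallel.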
Similar to intervals of real numbers or intervals of integers, graph intervals and interval types can be (partially) ordered by a containment relation, where a graph interval $I_1$ is said to be contained in another interval $I_2$ if some path $P_1$ corresponding to $I_1$ is, in its entirety, a subpath [18] of some path $P_2$ corresponding to $I_2$ (i.e., for $I_1$ to be contained in $I_2$ the path $P_1$ must share its vertices with $P_2$, in the same order as the vertices occur in $P_2$). If an interval $I_1$ is contained in an interval $I_2$, sometimes we call $I_1$ a subinterval of $I_2$ and, equivalently, call $I_2$ a superinterval of $I_1$.

The definition of the containment relation between graph intervals may seem a bit convoluted, but it should be intuitively clear. (Visually speaking, i.e., when presented with the diagram of a graph, it is usually immediately obvious whether a graph interval is contained in another.)

In accordance with our definition of interval types, we define the containment relation between interval types as containment between their corresponding graph intervals over the graph of the OO subtyping relation. It should be intuitively clear also that the containment relation is a partial ordering between interval types, which itself (i.e., the containment ordering) can be modeled by a directed graph (which can be presented, for example, using the Hasse diagram of the partial ordering).

(Footnote: We call paths corresponding to a graph interval its ‘witnesses’. More precisely, similar to formal proofs of valid logical statements in formal logic, we call a path in a graph a witness to the graph interval defined by the endpoints of the path. In fact we found some resemblance between graph intervals in graph theory (as we define them in this paper) and valid logical statements in formal mathematical logic. In formal logic a logical statement must have a formal proof for the statement to be a valid logic statement. Similarly, a pair of graph vertices must have a path connecting its two vertices for the pair to define a graph interval. Further, in formal logic a valid logical statement can have multiple (one or more) witnessing proofs. Likewise, a graph interval can have multiple (one or more) witnessing paths. Also, the $\leftrightarrow$ relation between graph paths is analogous to the equivalence of proofs with the same premises and conclusions (‘proof irrelevance’), but, other than noting the analogy in this footnote, we do not explore or take the analogy any further in this paper.)

Another relation that can be defined on graph intervals (and interval types accordingly) is the precedence relation, where an interval $I_1 = [v_1, v_2]$ is said to precede an interval $I_2 = [u_1, u_2]$ if and only if there exists an interval $I = [v_2, u_1]$ (i.e., if and only if the pair $(v_2, u_1)$ actually defines an interval). In other words, interval $I_1$ precedes interval $I_2$ if and only if the end vertex of $I_1$ and the start vertex of $I_2$ are connected. If $I_1$ precedes $I_2$ it may be equivalently said that $I_2$ succeeds or follows $I_1$. Currently (i.e., in this paper), we do not have a particular use or application of the precedes relation in constructing and modeling the generic Java subtyping relation, but we do not preclude the possibility (indeed, we believe) that the precedes relation may be useful in some future work that builds on the work we present in this paper.

As we did for wildcard types (see [6]), to construct the Java subtyping relation with interval types we use the partial Cartesian graph product operator $\ltimes$ from [7].

First, however, we define a graph constructor $S^{\Updownarrow}$ (similar to our definition of the graph constructor $S^{\triangle}$ in [6]) that constructs the graph whose vertices are all the interval type arguments that can be defined for the graph of a subtyping relation $S$ and whose edges model the containment relation between these arguments.
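The containment and precedence relations, and the kind of graph the constructor described above produces, can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the paper's SageMath implementation: it uses an endpoint-based test for containment (for an acyclic graph, a witness of one interval is a subpath of a witness of another exactly when the outer start vertex reaches the inner one and the inner end vertex reaches the outer one), it records the trivial case of an interval containing itself, and the names `containment_graph` and `precedes` are hypothetical.

```python
def reflexive_transitive_closure(vertices, edges):
    """Set of pairs (a, b) such that b is reachable from a."""
    reach = {(v, v) for v in vertices} | set(edges)
    for k in vertices:
        for i in vertices:
            for j in vertices:
                if (i, k) in reach and (k, j) in reach:
                    reach.add((i, j))
    return reach

def containment_graph(vertices, edges):
    """Vertices: all intervals (lower, upper) over the given graph.
    Edges: pairs (I1, I2) such that I1 is contained in I2."""
    leq = reflexive_transitive_closure(vertices, edges)
    intervals = sorted(leq)
    contained = {(i1, i2) for i1 in intervals for i2 in intervals
                 if (i2[0], i1[0]) in leq and (i1[1], i2[1]) in leq}
    return intervals, contained

def precedes(leq, i1, i2):
    """I1 precedes I2 iff (end of I1, start of I2) is itself an interval."""
    return (i1[1], i2[0]) in leq

# Hypothetical three-type chain; edges point from subtype to supertype.
V = ["Null", "C", "Object"]
E = [("Null", "C"), ("C", "Object")]

leq = reflexive_transitive_closure(V, E)
intervals, contained = containment_graph(V, E)
print(len(intervals), len(contained))
print(precedes(leq, ("Null", "C"), ("C", "Object")))
```

Applying a transitive reduction to the non-trivial containment edges would give the Hasse diagram of the containment ordering mentioned above.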
In other words, operator $S^{\Updownarrow}$ constructs (the graph of) the containment relation between interval types over $S$ as described in Section 3.1.1.

With $S^{\Updownarrow}$ in hand, the Java subtyping relation with interval types can now be defined as the solution $S$ of the recursive graph equation

$$S = C \ltimes_{C_g} S^{\Updownarrow} \qquad (1)$$

where $S$ is the graph of the subtyping relation, $C$ is the graph of the subclassing relation, and $C_g$ is the set of generic classes (a subset of classes in $C$).

As is standard for recursive equations, Equation (1) can be solved iteratively (thereby formalizing our construction method) using the equation

$$S_{i+1} = C \ltimes_{C_g} S_i^{\Updownarrow} \qquad (2)$$

where the $S_i$ are finite successive better approximations to the infinite relation $S$, and $S_0^{\Updownarrow}$ is an appropriate initial graph of the containment relation between interval types (which we take as the graph with one vertex, having the default interval type argument ‘?’ as its only vertex, and having no containment relation edges). Equation (2) thus formally and concisely describes the construction method of the generic Java subtyping relation between ground Java reference types with interval types.

As a comparison of their defining equations reveals, the main difference between the construction of the graph of the Java subtyping relation with interval types (presented here) and the construction of the graph of the Java subtyping relation with wildcard types (presented in [6]) lies in using the operator $S^{\Updownarrow}$ in place of operator $S^{\triangle}$ to construct the type arguments of generic classes.

In [5, 6] we noted that the generic Java subtyping relation exhibits self-similarity. The equation for Java subtyping with interval types demonstrates that extending Java with interval types, as a generalization of wildcard types, preserves the self-similarity of the subtyping relation. As is the case for wildcard types, the self-similarity of the Java subtyping relation with interval types comes from the fact that the second factor (i.e., $S^{\Updownarrow}$, the graph of the containment relation between interval type arguments) of the partial product $\ltimes$ defining $S$ is derived iteratively (in all but the first iteration) from the first factor of the product (i.e., from $C$, the subclassing relation). The implications (see [6, Section 3]) of the dependency of the definition of the subtyping relation on the definition of the subclassing/class inheritance relation in nominally-typed OO programming languages, regarding the value of nominal typing and its influence and effects on the type system of Java and other similar nominally-typed OO languages, are thus also preserved.
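The iteration in Equation (2) can be illustrated with a deliberately simplified Python sketch. It does not implement the general partial Cartesian graph product of [7]. Instead it hard-codes one special case as an assumption: a single generic class `C` with one type parameter, together with the non-generic types `Object` and `Null`. In that special case, each step pairs `C` with every interval type argument over the previous approximation and orders the resulting types by containment of their arguments. The function names `step` and `types_of` and the string rendering of types are hypothetical.

```python
def step(leq):
    """Build an approximation S_{i+1} from S_i.

    leq is the subtyping relation of S_i as a reflexive, transitive set
    of (subtype, supertype) pairs over type names (strings)."""
    # Interval type arguments over S_i: every pair (lower, upper) in leq.
    args = sorted(leq)
    name = {a: "C<%s - %s>" % a for a in args}

    def contained(a, b):
        # a is contained in b: b's lower bound is below a's lower bound,
        # and a's upper bound is below b's upper bound.
        return (b[0], a[0]) in leq and (a[1], b[1]) in leq

    new_leq = {("Null", "Null"), ("Null", "Object"), ("Object", "Object")}
    for a in args:
        new_leq.add(("Null", name[a]))      # Null is below every type
        new_leq.add((name[a], "Object"))    # Object is above every type
        for b in args:
            if contained(a, b):             # C<a> <: C<b> iff a within b
                new_leq.add((name[a], name[b]))
    return new_leq

def types_of(leq):
    """The set of type names (vertices) of a subtyping relation."""
    return {s for (s, _) in leq}

# S_1: obtained from the initial containment graph whose only vertex is '?'.
s1 = {("Null", "Null"), ("Null", "C<?>"), ("Null", "Object"),
      ("C<?>", "C<?>"), ("C<?>", "Object"), ("Object", "Object")}
s2 = step(s1)
s3 = step(s2)
print(len(types_of(s1)), len(types_of(s2)), len(types_of(s3)))
```

Under these assumptions the number of types grows quickly from one approximation to the next, since every pair in the subtyping relation of one approximation becomes a type argument in the following one.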
4. Examples of Constructing The Java Subtyping Relation with Interval Types
In this section we present examples for how the generic Java subtyping relation between ground reference types with interval type arguments can be iteratively constructed. As we do in the examples section of [6], we use colored edges in the diagrams below to indicate the self-similarity of the Java subtyping relation.

Also, so as to have the lower bound always on the left of a type argument expression and the upper bound always on the right of a type argument expression (as is the customary mathematical notation for intervals), in the diagrams we use the notation ‘T <: ?’ (to mean ‘T extends ?’) instead of ‘? :> T’ (which means ‘? super T’) to express type arguments upper-bounded by O (i.e., Object).

Also, in the diagrams we use wildcards notation (‘?’, ‘? <: T’ and ‘T <: ?’) as much as possible, even though unnecessary, so as to indicate the equivalence of some types that have interval type arguments to types that have wildcard type arguments. As such, only interval type arguments that cannot be expressed as wildcard type arguments are expressed using the proper intervals notation (‘T1 - T2’).
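The notational convention just described can be summarized by a small rendering function. The following Python sketch is written in the spirit of the `ty_intvl` function in Appendix A. The function name `render_arg` and its parameters are assumptions for illustration, with `Null` and `Object` taken as the bottom and top types.

```python
def render_arg(lower, upper, bottom="Null", top="Object"):
    """Render the interval type argument [lower - upper], preferring
    wildcard notation whenever the interval is expressible as a wildcard."""
    if lower == bottom and upper == top:
        return "?"                      # unbounded wildcard
    if lower == upper:
        return lower                    # a single type
    if lower == bottom:
        return "? <: " + upper          # only an upper bound
    if upper == top:
        return lower + " <: ?"          # only a lower bound
    return lower + " - " + upper        # proper interval, not a wildcard

for bounds in [("Null", "Object"), ("Null", "C"), ("C", "Object"),
               ("C", "C"), ("E", "C")]:
    print(bounds, "->", render_arg(*bounds))
```

Only the last case, where both bounds differ from `Null` and `Object` and from each other, needs the proper intervals notation.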
Example 1.
Consider the Java class declaration class C
Example 2.
Consider the two Java class declarations class C
Consider the two Java class declarations class C
Consider the four Java class declarations class C {} class E extends C {} class D {} class F

(Footnote: Given that we do not present graphs of S x or higher in the remaining examples below (due to the very large size of these graphs), this difference in expressiveness between interval types and wildcard types is not as evident in later examples as it is in this example.)

[Figure 1: Constructing generic OO subtyping with interval types. Panels: (a) $C$, (b) $S_1 = C \ltimes_{\{C\}} S_0^{\Updownarrow}$, (c) $S_2 = C \ltimes_{\{C\}} S_1^{\Updownarrow}$, (d) $S_3 = C \ltimes_{\{C\}} S_2^{\Updownarrow}$]
5. Related Work
Although viewing wildcard types in Java as “some sort of intervals” does not seem to be an entirely new idea, it seems that, other than being presented as a vague intuition that may help in understanding wildcard types, the idea has not been researched and presented more thoroughly before.

[Figure 5: Constructing generic OO subtyping with interval types (automatic layout by yEd). Panels: (a) $C$, (b) $S_1 = C \ltimes_{\{F\}} S_0^{\Updownarrow}$, (c) $S_2 = C \ltimes_{\{F\}} S_1^{\Updownarrow}$]

As to other work that is related to the research we present here, we already mentioned our concurrent work presented in [6] and our earlier work presented in [5], both of which paved the way for the work presented here.

The addition of generics to Java has motivated much earlier research on generic OOP and also on the type safety of Java and similar languages. Much of this research was done before generics were added to Java. For example, the work in [8, 9, 11] was mostly focused on researching OO generics, while the work in [12, 13] was focused on type safety.

Some research on generics was also done after generics were added to Java, e.g., [3, 4, 17, 29]. However, Featherweight Java/Featherweight Generic Java (FJ/FGJ) [19] is probably the most prominent work done on the type safety of Java, including generics. Variance annotations and wildcard types, however, were not taken into consideration in the construction of the operational model of generic nominally-typed OOP presented in [19].

Separately, probably as the most complex feature of Java generics, the addition of “wildcards” (i.e., wildcard type arguments) to Java (in [28], which is based on the earlier research in [20]) also generated some research that is particularly focused on modeling wildcards and variance annotations [10, 16, 21, 24–27]. This substantial work, particularly the latest work (of [26], [25] and [16]), clearly points to the need for more research on wildcard types and generic OOP.
6. Discussion and Future Work
In this paper we presented how type arguments for ground generic OO types can be smoothly generalized from wildcard types to interval types, uniformly supporting types other than Object and Null as upper and lower bounds for generic type arguments and thereby subsuming and generalizing wildcard types. Also, our work can be made more comprehensive, covering more features of the Java generic subtyping relation, if it is extended to model subtyping between generic types that have type variables in them. We hinted earlier in this paper, in particular, at how bounded type variables can be modeled, but including type variables and bounded type variables in the construction of the Java subtyping relation remains to be done.

We also believe it may be useful if a notion of nominal intervals [3] (i.e., nominal type intervals) is supported in Java and other similar generic nominally-typed OO programming languages, where type variables (including bounded ones) can be viewed as names for interval types (as presented in this paper) and where nominal intervals with the same bounds but with different names are considered distinct (i.e., unequal) nominal intervals. We conjecture that the nominality of nominal intervals can be a notion that is simpler to reason about than the existentiality and “capturing” notions currently used in modeling wildcard types. The fine details of this work, however, also have yet to be decided and sorted out.
References
[1] C arXiv:1610.05114 [cs.PL], 2016.
[4] Moez A. AbdelGawad. Novel uses of category theory in modeling OOP (extended abstract). Accepted at The Nordic Workshop on Programming Theory (NWPT’17), Turku, Finland, November 1-3, 2017. (Full version available at arXiv.org: 1709.08056 [cs.PL]), 2017.
[5] Moez A. AbdelGawad. Towards a Java subtyping operad. Proceedings of FTfJP’17, Barcelona, Spain, June 18-23, 2017, 2017.
[6] Moez A. AbdelGawad. Java subtyping as an infinite self-similar partial graph product. Available as arXiv preprint at http://arxiv.org/abs/1805.06893, 2018.
[7] Moez A. AbdelGawad. Partial cartesian graph product (and its use in modeling Java subtyping). Available as arXiv preprint at http://arxiv.org/abs/1805.07155, 2018.
[8] Joseph A. Bank, Barbara Liskov, and Andrew C. Myers. Parameterized types and Java. Technical report, 1996.
[9] Gilad Bracha, Martin Odersky, David Stoutamire, and Philip Wadler. Making the future safe for the past: Adding genericity to the Java programming language. In Craig Chambers, editor, ACM Symposium on Object-Oriented Programming: Systems, Languages and Applications (OOPSLA), volume 33, pages 183–200, Vancouver, BC, October 1998. ACM, ACM SIGPLAN.
[10] Nicholas Cameron, Sophia Drossopoulou, and Erik Ernst. A model for Java with wildcards. In ECOOP’08, 2008.
[11] Robert Cartwright and Guy L. Steele Jr. Compatible genericity with run-time types for the Java programming language. In Craig Chambers, editor, ACM Symposium on Object-Oriented Programming: Systems, Languages and Applications (OOPSLA), volume 33, pages 201–215, Vancouver, BC, October 1998. ACM, ACM SIGPLAN.
[12] Sophia Drossopoulou, Susan Eisenbach, and Sarfraz Khurshid. Is the Java type system sound? TAPOS, 5(1):3–24, 1999.
[13] Matthew Flatt, Shriram Krishnamurthi, and Matthias Felleisen. A programmer’s reduction semantics for classes and mixins. In Formal syntax and semantics of Java, pages 241–269. Springer, 1999.
[14] James Gosling, Bill Joy, Guy Steele, and Gilad Bracha. The Java Language Specification. Addison-Wesley, 2005.
[15] James Gosling, Bill Joy, Guy Steele, Gilad Bracha, and Alex Buckley. The Java Language Specification. Addison-Wesley, 2014.
[16] Ben Greenman, Fabian Muehlboeck, and Ross Tate. Getting f-bounded polymorphism into shape. In PLDI ’14: Proceedings of the 2014 ACM SIGPLAN conference on Programming Language Design and Implementation, 2014.
[17] Radu Grigore. Java generics are Turing complete. In Proceedings of the 44th ACM SIGPLAN Symposium on Principles of Programming Languages, POPL 2017, pages 73–85, New York, NY, USA, 2017. ACM.
[18] Richard Hammack, Wilfried Imrich, and Sandi Klavzar. Handbook of Product Graphs. CRC Press, second edition, 2011.
[19] Atsushi Igarashi, Benjamin C. Pierce, and Philip Wadler. Featherweight Java: A minimal core calculus for Java and GJ. ACM Transactions on Programming Languages and Systems, 23(3):396–450, May 2001.
[20] Atsushi Igarashi and Mirko Viroli. On variance-based subtyping for parametric types. In ECOOP, pages 441–469. Springer-Verlag, 2002.
[21] Andrew J. Kennedy and Benjamin C. Pierce. On decidability of nominal subtyping with variance. In International Workshop on Foundations and Developments of Object-Oriented Languages (FOOL/WOOD), 2017.
[24] Alexander J. Summers, Nicholas Cameron, Mariangiola Dezani-Ciancaglini, and Sophia Drossopoulou. Towards a semantic model for Java wildcards. th Workshop on Formal Techniques for Java-like Programs, 2010.
[25] Ross Tate. Mixed-site variance. In FOOL ’13: Informal Proceedings of the 20th International Workshop on Foundations of Object-Oriented Languages, 2013.
[26] Ross Tate, Alan Leung, and Sorin Lerner. Taming wildcards in Java’s type system. PLDI’11, June 4–8, 2011, San Jose, California, USA, 2011.
[27] Mads Torgersen, Erik Ernst, and Christian Plesner Hansen. Wild FJ. In Foundations of Object-Oriented Languages, 2005.
[28] Mads Torgersen, Christian Plesner Hansen, Erik Ernst, Peter von der Ahé, Gilad Bracha, and Neal Gafter. Adding wildcards to the Java programming language. In SAC, 2004.
[29] Yizhou Zhang, Matthew C. Loring, Guido Salvaneschi, Barbara Liskov, and Andrew C. Myers. Lightweight, flexible object-oriented generics. In Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2015, pages 436–445, New York, NY, USA, 2015. ACM.
A. SageMath Code
In this appendix we present SageMath [23] code that we used to help produce some of the graph examples presented in this paper. The code presented here is not optimized for speed of execution but rather for clarity and simplicity of implementation, and it builds on and makes use of code presented in Appendix A of [6].
    # Names not defined here (ExtStr, W, WExt, BotCls, TopCls) are assumed
    # to come from the code this appendix builds on (Appendix A of [6]).
    IntvlFromTo = ' - '
    ExtW = ExtStr + W

    # True if list sl occurs as a contiguous sublist of list l
    # (starting at the first occurrence of sl[0] in l).
    def is_sublist(sl, l):
        if sl == []:
            return True
        if l.count(sl[0]) == 0:
            return False
        start = l.index(sl[0])
        return (sl == l[start:start + len(sl)])

    # Name of the interval type argument defined by the endpoints of a path.
    def ty_intvl(path):
        f = path[0]
        l = path[len(path) - 1]
        return W if f == BotCls and l == TopCls else (
            f if f == l else (
                WExt + l if f == BotCls else (
                    f + ExtW if l == TopCls else (
                        f + IntvlFromTo + l))))

    # Graph of the containment relation between interval type arguments over S.
    def ITAs(S):
        ITA = DiGraph()
        ntps = S.all_simple_paths()
        ntps.reverse()
        i = 0
        for ntp in ntps:
            i = i + 1
            lp = len(ntp)
            if lp > 2:
                shrtr_paths = S.all_simple_paths(max_length=lp - 2, trivial=True)
            else:
                shrtr_paths = map(lambda x: [x], S.vertices())
            for sp in shrtr_paths:
                if is_sublist(sp, ntp):
                    ITA.add_edge((ty_intvl(sp), ty_intvl(ntp)))
        ITA = ITA.transitive_reduction()
        return ITA

    ILP = '['
    IRP = ']'

    # Name of the generic type c applied to the interval type argument ita.
    def ity(c, ita):
        return c + ILP + ita + IRP

    def IntervalsSubtyping(subclassing, lngc, FN_Prfx, num_iter):