This page is meant to summarize Kim Bruce's paper "Typing in object-oriented languages: Achieving expressiveness and safety" http://www.cs.williams.edu/~kim/README.html#Static , without all the maths. Actually the paper itself is pretty readable, if this page interests you please read it. Anyone out there with a better understanding than me, feel free to correct this page, but remember to keep it concise. ''Is there another source for this paper? Doesnt seem to exist anymore.'' See http://citeseer.nj.nec.com/bruce96typing.html , and you can download the cached PDF copy of the original. The paper argues that static types are a good thing, catching real programmer errors. However, many modern o-o languages have type systems that are often type-unsafe in some way ''by design'' in order to allow some language uses which are perfectly valid but that the simple type checker would have disallowed (think casts, for example). At various places, the paper gives its desirable goals for any type system: * checks can be made at compile time, so problems are caught early. (ie static, not dynamic, typing) * checks don't rely on data flow analysis (look at casts for example. These don't just rely on local type declarations, but on what got passed in, and whose type was declared somewhere else in the program. A system which can rely on 'local' checks would be faster, easier to understand, and less brittle to change elsewhere in the program) * the type system should be expressive enough to handle typical programs * the type system should not be overly intrusive on the programmer (examples are given in the paper where the programmer would have to understand some fairly complex reasoning to see why a given program fails to pass the checker. The examples would most likely happen in real life) The notions of subclassing and subtyping are discussed. This has been done to death on the wiki; basically subclassing means 'I want to reuse the code from class X', subtyping means the LiskovSubstitutionPrinciple (B is a subtype of A iff B can be used anywhere A could, in the program). The paper goes on to show how the LSP is too restrictive to express some intents (like collections), in some type systems, which is where languages like Java introduce casting. At this point the notion of 'MyType' is introduced. This is the type of the current object, and can be used in place of types wherever they would appear; for example: Node { void addchild(MyType s); } This is a real construct available in some languages (e.g. Eiffel uses 'like Current'), which allows for better type safety in many situations where casting was previously the only option. However, there are still problems, to do with 'covariance' and 'contravariance' in method signatures. An example using some type definitions (RefactorMe if you know of a better explanation of co/contra-variance on wiki): Fruit { Fruit addToSalad(CaesarSalad cs); } Orange isa Fruit { Fruit addToSalad(CaesarSalad cs); } is the only subtype relation java allows. however: Banana isa Fruit { Banana addToSalad(CaesarSalad cs); } is a valid subtype, and so is Apple isa Fruit { Banana addToSalad(Salad cs); } ''Note that as of Java 5 the covariant return of Banana addToSalad(CaesarSalad cs); is allowed. Also, Java collections no longer require casting since the introduction of JavaGenerics.'' This goes back to the LiskovSubstitutionPrinciple: banana can be used wherever fruit would be because the return type of banana's addToSalad method is a fruit. Apple can be used wherever fruit would be because its addToSalad method would accept a CaesarSalad. The method return value becomes a subtype when we subtyped Fruit, so we call this 'covariant'. However the method ''parameters'' become supertypes as we subtype fruit, which we call 'contravariant'. How does this cause us problems with 'MyType'? Well, go back to the Node example above. The intent of this class was that node children can only be subtypes of their parent. But suppose we do: RedNode isa Node { void addchild(MyType s); } RedNode's addchild method only accepts RedNodes. However, that means they can't be used wherever we would have used a Node previously, since the method won't accept 'Node's. So we've managed to break the subtyping relation. The paper then describes various attempts to solve this problem, including introducing the concept of 'matching' into the type checker. I'm going to skip over most of that and cut to the chase - what Kim Bruce ends up recommending as a type system. The paper introduces the concept of 'match types'. They're a pretty simple idea. Wherever you need a specific type T, you say T; wherever you just need something that has all of the method signatures of T, you say #T. By all of the method signatures, I mean exact matches, just like in java; there's no co- or contra- variance worries with the method signatures, as we don't change them at all. For example, given: Fruit { Fruit addToSalad(CaesarSalad cs); } Orange { Fruit addToSalad(CaesarSalad cs); void peel(#Fruit x); } matches, however: Banana { Banana addToSalad(CaesarSalad cs); } doesn't. That means: myorange.peel(myfruit) works, as does myorange.peel(myorange), but myorange.peel(mybanana) fails at compile time. Notice how the 'isa' relation has been dropped, there is now no notion of a static type ''hierarchy''. It turns out that this notion of match types can be used to build type safe systems which can be checked at compile time without data flow analysis. There are some flaws (see the paper), but for the most part this does everything we want from a type checker. The main criticism would be that it requires programmers to think about whether they ''could'' use another type in the code; Bruce comments that writing reusable code requires a bit more thought anyway, so this is no bad thing. ----- The most important idea in Bruce's work (and there have been many others, most of them cited by Bruce) is that you have two completely separate goals. * subtyping as a faithful representation of the IS-A relation between conceptual models (ex: a circle IS-A ellipse, a File''''''InputStream IS-A Input''''''Stream) * the need for '''code reuse''' (and desirably type-safe code reuse). For example the implementation of a generic vector sorting algorithm should be reuse across the board for all the vectors who's component types X support a ''int compare(X, X)'' function. For a long time (and this shows in the design of all wide-spread OO languages that we have today), the two essentially different goals have been confounded, mixed and matched at will, with negative consequences. Inheritance mechanism as we know it, tries to accomplish both of these goals and usually fails on both. Bruce doesn't say that subtyping is broken or unuseful, he says that subtyping is not good for code reuse. He proposes a ''matching'' relation that is essentially what we need for type safe code reuse. --CostinCozianu ---- I couldn't help being struck when reading this that it offered an alternative to something that is a problem with java for many folk - String is '''final'''. For non java folk, this means it can't be subclassed, and since java doesn't make a distinction, you can't use a subtype of String either, anywhere. The reason for this is down to security - passing non-String Strings to some of the core java classes would allow you to subvert the security model. Notice how this requirement of ''some classes'' has been forced on to ''all classes'' by the use of the 'final' keyword. If match types had been used instead, we could do this: class ParanoidType { void doSomethingSecureWith(genuine String s); } class RelaxedType { void doSomethingInsecureWith(String s); } (in terms of the paper, I've used 'genuine S', 'S', instead of 'T', '#T' respectively). Notice how the requirement to use a genuine String is now on the class that needs it. The relaxed type can now take anything with String's signature, ''even if it does not subclass string''. ----- I think I'm missing something. Wouldn't one usually factor out a common interface in Java? Isn't #T the interface of T? If we are only interested in method level identity, rather than type identity, then an interface seems like the way to go. Much more explicit too. The whole notion of subtyping is broken, irretrievably IMO. It suffers from an essential ambiguity of when a sub-thing is still a thing. The answer is often "it depends [on usage context]". ''You're right, but this only works if you ''can'' create an interface. Right now you can write 'Foo extends Bar implements Baz', but if 'Bingo' is a second, existing class 'Bingo' whose interface Foo also implements, there is no way of expressing this, without altering the code of Bingo. In fact, in the presence of match typing there is no reason to say 'implements xxx' since this can ''always'' be inferred from the uses you put the class to. This is the point of having a 'sound' type system. As for subtypes being broken, I think this was Bruce's point - subtyping just doesnt capture the idea of what we want to do with a program, match typing is very close to being exactly what we want.'' ---- Interfaces in Java are not good enough for type safe code reuse (neither they are for subtyping, but that's another story). For example : interface Comparable { int compareTo(Object other); } which results in generic code for like sort(...) { ... Comparable o1= ... Object o2= ... if (o1.compareTo(o2)>0) // here we are type unsafe ... ... } What was really needed was something like type Comparable { int compareTo( MyType other); } with a generic code like sort(...) { MyType o1= ... MyType o2= ... if (o1.compareTo(o2)>0) // here we are type safe ... // fort every MyType that matches Comparable } And considering the definition of type Comparable and the generic function above, the only thing needed to reuse the sort routine is to have a concrete class X including the following definition class X matches Comparable { ... int compareTo(X other); // here the parameter is of the same type as the method's class ... } The relation between X and Comparable is not a subtype relation, but a matching relation. The difference is that while a subtype relation allows a parameter of type X to be used wherever a parameter of type Comparable is expected, the matching relation is less strict and only allows generic code written to implement methods working with Comparable objects, to be reused as an implementation for methods of the X class while the implementation remains statically type safe. Type matching doesn't limit of course to the use of My''''''Type, but this is a good illustration. --CostinCozianu ''I see. A templating mechanism could do this?'' Yes, and starting in Java 5 the Comparable interface is defined in a way where this problem no longer occurs: public interface Comparable { int compareTo(T other); } Also, String now implements an interface, CharSequence, which allows other implementations to exist so long as code is refactored to accept CharSequences rather than Strings, although String still cannot be extended and there are problems with mixing different implementations of CharSequences (they are not guaranteed to be mutually comparable with equals, and typically are not, for one thing). (Incidentally, I think the problem above with peeling oranges vs. bananas could be easily handled with a Peelable interface that Orange implemented but Banana did not.) I realize this page is about type systems rather than Java, but since Java is used in several of the examples I thought it would be useful to note how Java has addressed some of these concerns in more recent versions. --DavidConrad