Type safe Eiffel (4, chapters "Backward compatibility" and "Promiscuous generic conformance" included)
From the beginning of Eiffel the language has been characterized as strongly typed. But since its beginning there has been a hole in the type system which has not yet been fixed up to now (the so called catcall problem).
Contents
|
What is the hole?
The existence of the hole can be demonstrated with very simple examples. There is a class COMPARABLE defining the basic comparison operators "<", "<=", ">", ">=". The class COMPARABLE is an ancestor of types like INTEGER, REAL, STRING, ..., i.e. all types which would be characterized by a mathematician as having an order-relation.
The code snippet
local i: COMPARABLE s: COMPARABLE do i := 1 -- an INTEGER is COMPARABLE s := "Hello world." -- a STRING is COMPARABLE if i < s -- catcall !! then ... end end
is legal Eiffel, but produces a catcall with the expression "i < s", because it tries to compare an object of type INTEGER with an object of type STRING. Unfortunately the question "Is 1 less than "Hello world."" has no answer and the runtime has to react with an exception.
What happened here? The class COMPARABLE is defined like
deferred class COMPARABLE feature is_less alias "<" (other: like Current): BOOLEAN deferred end ... end
Another example:
local a: ARRAY[STRING] aa: ARRAY[ANY] do create a.make (1,2) a[1] := "Hello world." aa := a -- an ARRAY[STRING] conforms to ARRAY[ANY] aa[2] := 1 -- catcall !! end
The array attached to "aa" is still an ARRAY[STRING]. But since the static type of "aa" is ARRAY[ANY], the assignment "aa[2]:=1" is legal Eiffel. Putting an INTEGER into a place, where a STRING is expected can only result in a runtime exception.
ECMA Eiffel (like all its predecessors) says an ARRAY[STRING] "is a" ARRAY[ANY]. But it is **not**. Into an ARRAY[STRING] you can put only strings or something conformant to strings, into an ARRAY[ANY] you can put anything.
Why has it not yet been solved?
To the inventor of Eiffel the problem has been known for a long time. In the book "Object Oriented SW Construction" (OOSC2, 1992) Bertrand Meyer analyzed and explained the problem. Since then, some solutions have already been proposed, but none of them has been satisfying.
Surely some bright people have tried to solve the problem. If bright people have not resolved the problem for more than 15 years, does that mean that the problem has no solution?
But the problem has solutions. In OOSC2 Bertrand Meyer explained, that catcalls only happen when polymorphic attach (attach a descendant to an ancestor, i.e. the "is_a" relation) and covariant redefinition (e.g. arguments of type "like Current) come together.
So if you forbid polymorphic attachs or covariant redefinitions you have a solution which makes catcalls impossible.
Unfortunately this is not a satisfactory solution. The former would rob Eiffel of one of the most powerful mechanisms in object orientation (dynamic bind), the latter one would make classes like COMPARABLE (i.e. partial implementations) impossible.
Therefore a satisfactory solution has to retain polymorphy and covariance.
But neither polymorhy nor covariance are all-or-nothing constraints. The boil down to a lot of explicit or implicit constraints. These constraints define the space, where a solution has been searched for. Since no satisfactory solution (a solution which does not violate the constraints) has been found up to now, it seems that the solution space which the constraints define might be empty.
It might be like a student of mathematics who tries to solve a problem with 4 equations and 3 unknowns where each equation stands for a constraint. Using 3 of the constraints he comes up with a solution. Unfortunately the solution does not satisfy the 4th constraint.
Hypothesis: There are too many or too strong constraints (explicit or
implicit). Finding a solution implies modifying (i.e. weakening) or getting
rid of some constraints.
What can we do? I propose to make all the constraints explicit and discuss
them. Let us formulate the constraints as fine grained as possible in order
to find an acceptable modification to the constraints, i.e. let us
scrutinze the potential solution space with the goal to find a non empty
solution space.
The definition of the solution space will require some decisions. Once having a non empty solution space at least one or possibly more solutions will show up (hopefully).
From my analysis up to now the following constraints are used to find the
problem. They will be discussed in the following sections in order to find out
in which manner we can modify them.
Constraints:
- Universal conformance (UC)
- Feature rich ANY (FRA)
- Covariance (COV)
- Backward compatibility (BC)
- Promiscuous generic conformance (PGC)
- Generic constraints based on conformance (GCC)
- Validation with local analysis (VLA)
- ...
Maybe the list will become longer and/or more fine grained during the discussion.
Analysis of the constraints
Universal conformance (UC)
Universal conformance means that all types conform to ANY. The type graph has a unique top. Any object in your system can be attached to an entity of type ANY.
This can lead to some holes in the type system together with features of ANY which have implicit covariant signatures like is_equal and copy with an argument of type "like Current". We can demonstrate the problem with a slight modification of one of the above examples
One of the arising questions is: Is it really necessary that all things conform to ANY?
What is the advantage and disadvantage of having universal conformance?
Java has universal conformance (class Object) and C++ has no universal conformance. Smarteiffel has chosen liberal way. You can have conformance to ANY or not. It is up to the designer.
Discussion of UC:
UC gives us the possibility of having just an object. You can design classes which store any type of object. You can store it and move it around (i.e. its reference).
As soon as you retrieve it, you cannot do a lot with it unless you find out what type of object it is. With an object test you have to "downcast" the object to do something meaningful with it.
Having UC is some kind of convenience. If you don't have UC you can always give a group of types a common conformant ancestor type to store them and move them around. This has the advantage of being able to call some common features of this group of types and assuring that only conformant descendants of the base type can be stored in a container class.
The opponents of UC sometimes argue that using ANY is just lack of design. If want to treat a group of types in a common way, design some conformant base type for them.
Then there is the argument of beauty. Since its beginning Eiffel had a unique top and bottom of the type system: ANY and NONE. This is very pleasing for anybody which looks at Eiffel with a mathematical eye.
Another cause for UC is the possibility to implement system routines like print(a:ANY) which prints a representation of any object. Without UC a routine like print is not possible which would be some loss of convenience.
To summarize, neither the case for nor the case against universal conformance has very strong arguments. UC does not seem to be a principle at the heart of the Eiffel language. If it helps (which might be the more interesting question), UC can be thrown over board.
By the way: The ECMA committee has already given up universal
conformance. The type detachable T does not conform to ANY, just to detachable ANY. Only attached types conform to ANY.
Feature rich ANY (FRA)
All classes in Eiffel inherit the features of ANY. These are a buch of features: type, is_equal, copy, twin, once_manager, io, is_deep_equal, deep_win, do_nothing, default_rescue, ... just to mention a some of them (in EiffelStudio 6.3 the class ANY has 29 public features).
These features are imposed on all classes and therefore on all types, independently whether they need them or not. And if you study some real life classes they usually don't need all features of ANY or just one or two of them. So for somebody looking for some minimal kernel of the language, this is a little bit disturbing.
All who know C are familiar with the famous "Hello world"-example. The designers of C made the language core minimal, so that even for printing a simple string to standard output you have to request some library. Without "#include you cannot print anything. Fullstop. The basic philosophy behind: You only pay for what you ask. If you don't ask for stdio, you won't pay for it.
The philosophy behind Eiffels ANY seem to be just the opposite: You don't have to ask, you just have to pay. It is already included in the basic setup. Some might argue that unused features will be squeezed out finally by the optimizer and in the final binary you won't pay for unused features. However in the whole development process you have them in.
Let us smell the taste of the different views a little bit. A hello-world program with FRA reads like
class HELLO create make feature {NONE} make do print ("Hello world.%N") end end
Without FRA you would encounter something like
class HELLO inherit STANDARD_INPUT_OUTPUT create make feature {NONE} make do print ("Hello world.%N") end end
Too noisy? I don't know. But up to now it is a matter of taste.
You might ask, what has this to do with type safety. Admittedly the example with printing "Hello world." has nothing to do with type safety. It served to smell the taste about FRA and the opposite (feature poor ANY).
The thing gets more interesting if we look at features like is_equal and copy. Since they have an argument anchored to Current "is_equal ( other: like Current): BOOLEAN", they imply covariant redefinition in descendants. I.e. any class, which redefines is_equal will redefine the argument covariantly. INTEGER, REAL, STRING, ... all have a different definition of is_equal and are expecting an argument of the same type (or conformant).
And this together with universal conformance (UC) hurts. UC implies the possibility of polymorphic attachment of an object of any type to an entity of type ANY and FRA makes it possible, that is_equal can be called with an argument of any type as long as it is called on an entity of type ANY but is expecting a specific one. And these two facts open up a possible catcall with is_equal.
The current version of the ECMA Eiffel specification has tried to resolve the issue by giving is_equal another signature. The spec. defined
This is possible, but it has some ugly consequences. Look at the class INTEGER. Would you be pleased with the class text
and the code snippet
local i: INTEGER do i := 5 if i.is_equal ("Hello world") then ... end end
I would say that it is a possibility to change all arguments of type "like Current" in features of ANY to "ANY" and not redefining them covariantly in descendants. With that, the catcalls with is_equal can be avoided.
But what about copy? What would you say to "i.copy("Hello world")"?
Bertrand Meyer some time ago stated that the ECMA committee has oscillated about the signature of is_equal, but now the definition of is_equal (other: ANY) is firm. Some weeks later another member of the committee stated that is_equal(other: like Current) is back. Maybe in the next meeting they ... I don't know.
But wait a moment. We are discussing the constraint "feature rich ANY" here. Therefore the real question is: Is it necessary that the features is_equal, copy etc. are in ANY?
What would be the consequence of e.g. putting is_equal and copy into a class EQUALITY_FEATURES, is_deep_equal and deep_cloned into DEEP_FEATURES, io and print into STANDARD_INPUT_OUTPUT?
The most obvious consequence is that you cannot any longer compare any two objects by the object comparison operator "~", because "~" relies on is_equal.
Objects of many types need not necessarily the feature is_equal. But some surely need. All numbers need to be checkable for equality. E.g. the class INTEGER would read
Without inheriting from EQUALITY_FEATURES no assignment of expanded types would be possible. Expanded types have copy semantics. Therefore the semantic effect of assignment is based on the semantics of the copy feature. Any expanded class not inheriting the equality features would not be assignable (as opposed to reference type which are always assignable without having copy).
Removing FRA will give us "You only get what you ask for".
Besides the additional effort it gives me some new freedom of design. Without FRA I can design a classes which cannot be compared nor copied. Sometimes this might be my design intention. Maybe I want to say to you: Just create objects of this class, modify them with the features I have designed for it, but do not ask for equality nor for copy. The class has not been designed for that. Why not?
The constraint FRA can be lifted, if it helps. The combination of UC, FRA and the anchored signature of is_equal and copy cannot all be kept.
Covariance (COV)
Since catcalls can only happen when polymorphic attach meets covariant redefinition some might propose to get rid of covariance.
Bertrand and Emmanuel have written a little paper with the title "The world is covariant. Is it safe?".
The title already reveals that they don't consider removing covariance an option. They try to illustrate it with a CUSTOMER which can be served any type of BEVERAGE (i.e. CUSTOMER has the feature "serve(b:BEVERAGE)". A BEVERAGE is either a SOFT_DRINK or ALCOHOL. MINORs are CUSTOMERs, but they shall have the more restrictive feature "serve(b:SOFT_DRINK)".
They say implicitely that modelling the world usually involves covariance. A more specific type might require some more specific argument in its features.
The problem arises if a MINOR is considered as a CUSTOMER (polymorphic attach) and served an alcoholic beverage. That is the realworld correspondance of a catcall which at least parents won't like.
Despite the potential catcall, the example is a strong case for covariance: Its modelling power.
Another case for covariance is its practical implications.
Let us look into the class COMPARABLE.
deferred class COMPARABLE feature is_less alias "<" (other: like Current): BOOLEAN deferred end is_less_equal alias "<=" (other: like Current): BOOLEAN do Result := Current < other or Current ~ other end is_greater alias ">" (other: like Current): BOOLEAN do Result := not (Current <= other) end ... end
Note that in the class COMPARABLE only the feature "<" is deferred. The others "<=", ">", ">=" are effective and based on "<" and "~". The class COMPARABLE is therefore also called a partial implementation. A descendant only has to effect (i.e. to implement) the feature "<" and will get the other ones for free and consistent with a mathematical order relation.
Or look into the class NUMERIC.
deferred class NUMERIC feature plus alias "+" (other: like Current): like Current deferred end minus alias "-" (other: like Current): like Current deferred ensure Result + other ~ Current end ... end
NUMERIC does not provide a partial implementation like COMPARABLE. But it
specifies contracts which the descendant has to obey to. It specifies
e.g. that the operations "+" and "-" have to satisfy some consistency
relation. We can say that NUMERIC models a certain behaviour which INTEGER,
REAL, COMPLEX, MATRIX, etc. have to satisfy.
Now imagine the classes COMPARABLE and NUMERIC without its implicit covariance. Does it make sense to specify
plus alias "+" (other: ANY): like Current ?
What is the sum of an INTEGER with a MATRIX?
No! We definitely want
plus alias "+" (other: like Current): like Current
So the case for covariance can be made with modelling power, partial
implementations and behaviour classes. Covariance is a very powerful
mechanism. Therefore nobody wants to get rid of covariance.
But: Is it an all-or-nothing decision? Do we want to be able to redefine covariantly any feature in any class?
Maybe we can find a restricted form of covariance which gives us the advantages of covariance without its dangers.
Remember: Covariance itself is not the problem, only covariance combined with polymorphic attach and polymorphically calling the feature with covariantly redefined arguments.
So it could be possible to allow covariant redefinitions only when they cannot do any harm, i.e. instead of full covariance only a restricted form of covariance.
An attempt: Allow covariant redefinitions only on features which are not inherited conformantly.
As a consequence classes like COMPARABLE, NUMERIC can only be inherited non-conformantly.
What other restrictions of covariance can you imagine?
Backward compatibility (BC)
The changes of the Eiffel language have been made up to now in a very conservative and prudent manner. A compiler shall be able to support an old and a new version of the language at the same time. Changes in the language shall not break existing code.
This is in general a wise way to move forward. A code breaking change is inconvenient to the users.
Therefore the fix of the hole in the type system is expected not to break existing code. Is this possible?
A solution of the catcall issue has to disallow possible catcalls. Since in the current version of Eiffel it is possible to code catcalls, the new version of the language must reject some of the currently valid Eiffel code.
Therefore the requirement of complete backward compatibility is too strong. It cannot be fulfilled.
But we don't want to give up backward compatibility in general. Well written Eiffel code does not produce catcalls. But it might use Eiffel constructs which are catcall prone even if in the specific case it does not produce a catcall.
Is it at least possible to make a change in the language which does not break existing code which does not produce catcalls?
Even this requirement might be to strong. The compiler has to analyze the code statically. It cannot do all reasoning which humans can do. If we can prove that a specific code cannot produce catcalls it might be impossible for the compiler to prove that. In that case, the compiler has to reject the code, i.e. break the code even if no actual catcall happen.
Therefore I propose to lift the requirement for backward compatibiliy completely (at least to open up the solution space). A satisfactory solution has not been found for more than 15 years now. If we pose onto ourselves too many constraints, we might not be able to come up with a good solution. Once having a solution which is satisfactory without the backward compatibility constraint we can try to polish and improve it to make it as backward compatible as possible.
Promiscuous generic conformance (PGC)
Conformance is based on inheritance. A descendant conforms to its conformant parents, i.e. if "class D inherit B ..." is encountered, D conforms to B. Conformance is transitive. If A conforms to B and B conforms to C, then A conforms to C.
This is not a complete description of conformance, but it captures the basic idea.
But Eiffel also has generic classes. There is a rule of conformance which opens up another path. If we have a generic class CG[G] and a type D which conforms to B, then CG[D] conforms to CG[B] as well.
In simple terms ARRAY[STRING] conforms to ARRAY[ANY]. I.e. for generic types there is not only the conformant ancestry of the base class which determines conformance but also the ancestry of its actual generic parameters.
Does this make sense? We can answer diplomatically: Yes and no!
As long as I want to get an element of an ARRAY an ARRAY[STRING] returns me a string. Since STRING conforms to ANY I can attach an ARRAY[STRING] to ARRAY[ANY] and the returned value is fine.
If I put something into ARRAY[ANY] (which is in reality ARRAY[STRING]), the compiler won't complain. But the object attached is an ARRAY[STRING] which does not like putting INTEGERs or BOOLEANs into it.
So yes, it is fine for all features not having arguments of formal generics of the surrounding class. And no, every use of a feature having arguments of formal generics of the surrounding class might produce a catcall.
Promiscuous generic conformance is part of the problem. We have to find a less liberal form of generic conformance in order to make Eiffel type safe.
The most rigid solution would be to dissallow conformance based on conformance of actual generics, i.e. an ARRAY[STRING] cannot be attached to an entity of type ARRAY[ANY]. The question: Does this cut off useful patterns?
I personally have not yet encountered useful patterns which require attaching an ARRAY[STRING] to ARRAY[ANY]. But maybe you will protest and say "I use this pattern frequently to achieve XYZ". If you have a valid point, then we have to find something less rigorous.
A less rigorous solution has already been proposed in the above mentioned paper "The world is covariant. Is it safe?". It is based on the observation that only features having arguments of formal generic type are dangerous. E.g. in the class ARRAY
the feature "item" will never produce catcalls, but the feature "put" can do.
If we attach an object of type ARRAY[STRING] to an entity a:ARRAY[ANY] we don't want a.put(...) to be called, i.e. we want a type which has the features of ARRAY[ANY] without the dangerous ones. In other words we want ARRAY[ANY] to "forget" some of its features.
Syntactically it has been proposed to use ARRAY[variant ANY] to be that type. ARRAY[variant ANY] has all features of ARRAY[ANY] except those which have arguments of formal generic type. With that it is not any problem to attach an object of type ARRAY[STRING] to an entity a:ARRAY[variant ANY], because you cannot call the offending feature a.put(...). Attaching an object of type ARRAY[STRING] to an entity a:ARRAY[ANY] won't be allowed any longer.
What would you prefer? The simple more rigorous solution or the more complicated fine grained solution still allowing some generic conformance?
Is there a third (or forth, ... ) way?
Generic constraints based on conformance (GCC)
This and the next chapters will be updated soon.
Validation with local analysis (VLA)
The solution
- helmut.brandl's blog
- Login or register to post comments
-

Comments
Hi Helmut! You write: "By the
Hi Helmut! You write:
"By the way: The ECMA committee has already given up universal conformance. The type detachable T does not conform to ANY, just to detachable ANY. Only attached types conform to ANY."
Why does ECMA not have universal conformance? It just that the top element of the type system is now "detached ANY", and not just "ANY". Still, there exists a type that all types conform to, which is the definition of universal conformance.
I agree with your point that
I agree with your point that all types in ECMA Eiffel conform to "detachable ANY". But the standard does define universal conformance as
"Every type conforms to ANY" (see 8.6.5).
and not "Every type conforms to detachable ANY" (which would be consistent with your statement).
But don't take my point with the ECMA standard too serious. We all know that the standard is inconsistent in various parts and does not reflect the current status of the discussion. But it is the only written paper available which is supposed to be the language standard.