Download STATIC DETERMINATION OF DYNAMIC PROPERTIES OF
Transcript
STATIC DETERMINATION OF DYNAMIC PROPERTIES OF GENERALIZED TYPE UNIONS Patrick Couso~and Radhia Couso~* Laboratoire d'Informatique, U.S.M.G., BP. 53 38041 Grenoble Cedex, France Abstract. The classical programming languaqes such as PASCAL or ALGOL 88 do not provide {ull data type security. Run-time errors are not precluded on basic operations. Type safety necessitates a refinement of the data type notion which allows subtypes. The compiler must also be able to ensure that basic operations are applicable. This verification consists in determining a local subtype of g l o b a l l y declared variables or constants. This may be achieved by improved compiler capabilities to analyze the program properties or by language constructs which permit the expression of these properties. Both approaches are discussed and illustrated by the problems of access to records via pointers, records, access to variants of record structures, d e t e r m i n a t i o n of d i s j o i n t collections of linked and d e t e r m i n a t i o n of integer subrenge. Both approaches are complementary and a balance must be found between what must be specified by the programmer and what must be discovered by the compiler. Key words and phrases : Type safety, type unions, subtype, data type, system of equations, verification/discovery, error detection capabilities, abstract interpretation of programs, use of o o i n t e r s / v a r i a n t s of record;structures, domains/collections, 88, EUCLIO, type secure integer subrange type, ALGOL PASCAL. CR categories : 4.12, 4.13, 4.2, 5.4. 1. Introduction reference ignores the fact that a reference may be dummy, The type of an object defines how that object relates to other objects and which actions may be In all these languages the p r o b l e m of subscript ran- systems of ALGOL 801187S], PASCAL[1874], ALGOL 68 ~e is not safely treated by the type concept, LiKe- [1975] ... do not convey enough information to de- wise, termine staticly whether a given action applied For example, in AL- CAL, Attache de Recherche au C.N.R.S., L a b o r a t o i r e Associ~ N ° 7. ** This work was supported by IRIA-SESORI under grants 75-035 and 78-180. in PAS- a pointer to a record must De considered as potentially designating any record of a given type. in ALGOL 88 the type * the classical type systems define only loose relationships between objects. For example, GOL 80 the type procedure does not include the type of acceptable parameters, [variants of record of erring on the current a l t e r n a t i v e of the union. applied to it. U n f o r t u n a t e l y the classical tyoe to a value will be meaningful. in PASCAL type unions structures) are unsafe because of the possibility One cannot express the fact that two linked linear lists of the same type do not intermix. Finally, the rules of the language or the programming discipline accepted by the programmer are not statiely enfor- 77 cad by the compilers, so that run-time checks are the widely used remedy. However these expensive time checks are usually turned off before programming security run- linguistic the "least" In the interest of increased flexible reliability framework type properties), error has been discovered. ware products, offered by full typing [within a suitable to properly propagate and the simplicity Ibut incomplete) classical strong offered by the type systems. of soft- the language designer may reply upon : 2. Nil and Non-nil Pointers - The design of a refined necessitates and safe type system, linguistic constructs te strong type oropertles. which Among the objections which propaga- ters ere the faotsthat The rules of the lan- pe violations guage must then be checkable by a mere textual scan of programs [1976] provide language a secure design and baroque le.g. ALGOL 68[1975] approach may degenerate programming ranteeing This the type of the object pointed [1974] except for.variant to large that pointers languages. heap cells The design of a refined compiler wkich performs a static proved treatment of programs error-detection then remains simple and provides capabilities. im- and flexible, but security offered by compiler verifications [e.g. EUCLIO legality assertions which the compiler for the verifier]. may degenerate is generates and mysterious The comelier comparable the two approaches re- is not used). However have the nil value which points to no element at all ; this is a source of frequent summary of the meaningful operations as a static the operations prescribed on that value. by a syntactically techniques techniques, It is shown that the language value which happens Cousot since both need a refinement pe notion. They differ by the fact that one needs to be nil. - the subtype of nil pointers type (which happens covery may be equivalent only to pointers type enforcement or dis(e.g. the language of non-nil is not the case for infinite pointers). type systems by the programming checks is really ve- [e.g. protection facilities. However time checkable. needed syntactic languages genera- and ting code in master-mode by appropriate this hardware detec- constion is not always utilizable. Moreover, for a means by which languamore complicated can establish system. : Ithese checks are usually for system implementation ge designers this ry cheap for pointers when using the hardware memory are not compile Finally we propose Since nil references,type of non intermixing type discovery can be applied subtype. designer has three solutions Run-time can be facilitated to that record to have only one value) The rule is that dereferencing between rule must be enforceable In such a case to that re- the other uses a type dis- We show that strong which pointers type cord type type checking and discovery. integer ranges), to a record - the subtype of non-nil of the ty- but we show the close connexion collections so : - the type of pointers are strong- meaning- a pointer type notion must then be refined that one can distinguish ra- ly related a type checker whereas The pointer design ap- approach are not always dynamically ful. This is the case when dereferencing by means of examples; to program optimization proach and the compiler design tructs. which au- e degree of sophistication ther than program verification This cells) they are no longer accessible The type of a value may be viewed techniques we propose for the static ana- lysis of programshave unions, allocated program verifiers. We illustrate coverer, from variable when "dispose" valid construct [1976]. and ensuring a pointer may always However tomatic until by gua- at (PASCAL errors. This compiler design approach into futurustic (disjoint IPASCALE1874] The language of records), point only to explicitly main allocated ty- and that they may be left dang- ling. One can take care of these objections, end EUCLIO use of type unions). (PL/I) against the use of poin- they can lead to serious a balance between the scripting 78 these examples run-time such as array subchecks are very expensive. using a simole Safe of line (3). This reasoning is easily mechanized as follows : associate invarian~P1, P2, P3, P4 and P5 language a type tion design, system which prescribed truct strong that typing valid be dynamically pointer types, to ooints cons- (i.e. forbid rations (i.e. or non-nil pointer9 and check the correct use of opeauthorize dereferencing for non- nil pointers only). - Compile time checks, to recognize the safe language PASCAL ( H o a r e variants are of (1) P1 [2) P2 = (P1 = [at (11) respectively. of the orogramming and W i r t h [ 1 9 7 3 ] ) , as d e f i n e d by these the in- subsequent : = L) or (3) P3 = [P2 and P4 = P2 and (5) P5 = P3 o r -- [5) and P5) (4) (b = true) and ((pt <> n i l ) (pt+.value [ot+.value (~ p t ' = n)) and (b = false) <> n) I s°t'(p4) pt and p t = o t ' + . n e x t ) has b e e n d e l i b e r a t e l y see D e m b i n s k i and b) oversimplified, and S c h w a r t z [ 1 9 7 6 ] ) . now this last strategy. Since 2.1 related equations (Equation use o£ a type scheme which is too tolerant. We illustrate and semantics the type o£ an object the type"non-nil syntactically (9) the disallow tyoe vio- to be changed from the tyoe"nil pointer",to to between nil system lations [21,[4},(71, According algorithm from the test meaningful. This type scheme must distinguish and non-nil i.e. any opera- by a syntactically always will with ensures propagation in general systems it solution to consider simplifications is such undecidable as t h e to one a b o v e , find a we m u s t Static Correctness Check of Access to Records [to the prejudice of the via Pointers precision of our results), Consider the simple record with value ignore problem of searching for a "n" in a linked the existence of the boolean variable b, of" the fields "value" linear list L : For that purpose we will and thus focusing in records o£ the linear list, on pointers. Moreover, consider only the pointer variable value next lowing mredicates L we will pt, and the fol- ,F-Fq__Tq? .... respectively solution as follows : : pt = nil, pt <> nil, dicates The PASCAL on st is given by PASCALF1974] (p. 64) gram denoted by nil, form a complete .is (at = nil) or (pt<> nil) non-nil,T . These pre- lattice whose HASSE's die- : T [1) pt := L; b := true; (2) {P1} [3) while L (4) [pt <> n i l ) = nil / ~non-nil and b do ({P2} Where i is used to denote (5] i~ pt+.value Known about the variable (6) b (7) {P3} (8) the fact that nothing is = n then pt. := f a l s e Since we are only considering else subset Of the set of predicates, (9) {P4} (10) pt (11) {P5}); tions can be simplified an oversimplified our system of equa- accordingly : := p t + . n e x t ; (1') P1 = T (2') P2 = (PI or P5) and non-nil (3') P3 = P2 (4') P4 = P2 [5') P5 = P3 o r The above piece of program is correct with regard to accesses to records via pointers, pt is not nil when dereferenced at lines since T (5) and (In (10). This fact is established by the programmer 79 equation (1) we c o n s i d e r (pt = L) since L may be an e m o t y or n o n - e m p t y nil) or [pt <> nil] only consider Our F is complete has a PI, lattice [5 theorem least complete T, in e q u a t i o n (pt = Kleene's [5] we tions. 'next' a (nil or non-nil] oointer PS> = F [ < P 1 , P2, PI, application in Therefore, states that the : P4, [Tarski[1955]). from the L s in itself, this least = T, P2 = [P1 = Knas- P3 F fixpoint P4 oreviously established) ~ [2, can 5]] and non-nil and non-nil non-nil P5 = P3 o r T Kleene non-nil ± l ± l or T = [PI o r P5] or T ] [T or ±] end non-nil, l ,[± end non-nil > i ± l i T l i ,(l or T]> I ± T and > Kildall[1973] = <T, non-nil, non-nil the K l e e n e ' s > call ~3 = F[~2] this programs. or T ] <T, and non-nil,non-nil,non-ni],(l non-nil or ,non-nil,non nil, T]> T since technique Abstract nil,non-nil,[non-ni] T]> or = <T, non-nil ,non-nil,non-nil, T program > the a b s t r a c t interpretation of since = h3 cutions ber o{ steps, sequence converges which is o b v i o u s The s o l u t i o n us that P2 = P3 [7] and since means [10] are s t a t i c l y which according which imolies t h r o u g h ot at lines s h o w n to be c o r r e c t . to the v a l u e of PI and PS, 2.2 to the test on pt at line be i d e n t i c a l l y true. is g a t h e r e d about the pro- which executes to this new m e a n i n g . of some f a c e t s We the then get of the p o s s i b l e exe- A theoretic framework interpretation o£ p r o g r a m s together with examples are g i v e n o£ in C o u s o t [ 1 g 7 6 ] . A Safe Linguistic Framework to Handle Nil Poin- that A comolete and s a t i s f a c t o r y p r o b l e m of d e r e f e r e n c i n g (as in ref real is [nil] tens[1976] within [11], pointer [3] m a y not e reoresentation any name r e f e r s idea of z e n e r a l i z i n g to r e p r e s e n t f r o m the above non-nil 80 := 3,14] the f r a m e w o r k of n i l - n a m e s to a nil name is p r o p o s e d by M e e r of A L G O L to n o n - n i l 68. The values by (this is a c h i e v e d by not profor the nil symbol), to a v a l u e . nil-names. oointers s o l u t i o n of the or a s s i g n i n g t y p e s are r e s t r i c t e d exclusion viding p r o p a g a t i o n may be d e r i v e d to the p r o g r a m ters With regard [2] and in p a r t i c u l a r , The s i m p l e p r o g r a m m e r ' s and i n t e r p r e t a t i o n is g i v e n tells [5] and its i n t e r p r e t a t i o n that pt m a y be nil at p r o g r a m p o i n t s constant various the data L s is a f i n i t e that pt is not nil at lines [g] of o u r p r o g r a m , of r e c o r d s num- system of e q u a t i o n s = P4 = n o n - n i l , our i n t e r p r e t a t i o n the a c c e s s e s to o u r in a f i n i t e about of the p r o g r a m . abstract Kleene's some d e t a i l s o£ an f n t e r p r e t o r summary of we are f o r g o t t e n , according a static versions Sintzoff[1972] both a new m e a n i n g g r a m by m e a n s have Following text and the i n f o r m a t i o n = < T , ( T O F T] and n o n - n i l , n o n algorithms they are " e f f i c i e n t " sequence. o£ the o r o g r a m > stun, and W e g b r e i t [ 1 9 7 5 ] been recognized, = < T , ( T or T] a n d non-nil same as above, non-nil, OF T ] > 12 = F[Ii ] [4], gives : = T -<T, lattice. sequence = P2 I i = F[~ 0 ] Thus, to the non-nil P2 : P5] $] (where according = P2 the sequence, or or i computa- of the p r o g r a m are i n t e r p r e t e d computation = ±, useless = non-nil since from (P, l [T the Moreover, morphism P1 PS>] application as the limit of K l e e n e ' s ~O = < l , = <T,(T actions simolified equations : = <T,[± A symbolic execution the f o l l o w i n g is of the f o r m itself. sequence when eliminating elementary to pt]. order-preserving lattice he d e £ i n e d we get preserving fixpoint a complete [1952] P4, an order ter-Tarski F is delivers is a s s i g n e d P2, list, that the { u n c t i o n system of e q u a t i o n s <P1, where denoted the fact (when d e f i n e d ] value which linear The type void Finally the so that is used type o£ nil end is the u n i o n of the p r e v i o u s ones. For example we can write a c o n s t r u c t i o n like mode list = union mode cell adequate syntactic constructs] is correct. Since in this examole the type system is finite, [ref cell, void] : struct [integer value, both ap- proaches are equivalent as far as type verifications list next] are concerned. to represent linked linear lists. An emety list is represented by the value empty, the only void value. Our routine would have to be rewritten : 3. list pt :: L; Variants of Record Structures while case pt in 3.1 Ire# cell pt'] ~> else In ALGOL BeEIg75] a variable may assume values (ot := next of ot'~ out ~> Unsafe Type Unions in PASCAL if value of ot'=n then false true] aT different types. The type of this variable is fi, then said to be the union of the types of these va- false lues. In PASCAL[1974] the concept of type unions is embodied in the form of variants of record struc- eSaO do skip od; tures : a record type may be specified as consisting of several variants, optionally discriminated This program is safe, since in ALGOL 88 the by a tag field. non-safe coercion of pt from mode union [ref cell, void] Example to mode ref cell has to be made explicit by a conformity case construct. type mode The idea is therefore = [int, char]; type charint = to force the p r o g r a m m e r to explicitly perform the record run-time tests, which in this example is dictated anyway by the logic of the problem : case tag : mode of {the rewritten version admittedly looks a bit cumbersome, but more int : [i : integer]; convenient ways of expressing such a flow of con- char : (c : character] trol may be exhibited end; [Oijkstra[1875]]]. vat digit, 2.3 Remarks letter, a l p h a n u m : charint In a program containing these declarations the occurrence of a variable designator alohanum.c is It is remarkable that both approaches necessionly valid, if at this point that variable is of tate the same secure type system, yet they differ type character. It is so, [if and] only if alpha- in the choices of making it available or not to the num.tag = char. H o w e v e r this is not staticly veriprogrammer. fied by the PASCAL compilers for the following reaThe refined type system considers the pointer type as the onion of two sybtypes : pure sons : [non-nil) The tag field of a variant record definition pointers and dummy (nil) pointers. Type safety is is optional, and may exist only in the pro- guaranteed by requiring strong typing : the type grammer's mind. of a value determines which operations may be mea- ningfully applied to it. When present, the tag field may be assigned, thus allowing to realize implicit type transIn both cases the type correctness has to be fer functions. For instance, a variable of verified or established by the compiler. For that type character : purpose an [often implicit) system of equations is used. equations has to be found by the compiler, other alphanum.tag In one case the solution to that system of alphanum.c in the case the c o m p i l e r simply verifies that the ~= char; := 'H'; may be interpreted as being of type integer solution supplied by the programmer [by means of for the purpose of printing the internal 8] representation - No a s s i g n m e n t s : alphanum.tag := int~ once they to the have been tag f i e l d s are authorized initialized. writeln[alohanum.i]; - U n i t i n Z is a l l o w e d (Note that the tag is a p p r o p r i a t e l y about its value set, and alphanum care one can w r i t e as well :: : := letter~ : is alphanum.c safe but without legal, because the type of the right hand side 'H'; value eharint[char] may be c o e r c e d to the type writeln[alphanum.i]:] of the left hand side t~b e c h a r i n t [ a n y ] 3.2 Safe Type Unions in ALGOL 68/EUCLID variable permits charint[any] a l o h a n u m to h o l d v a l u e of t y p e c h a r i n t [ c h a r ] (the either a o r a v a l u e of t y p e charint [int]]. Suggestions structures i.e. have which compile-time assignments determine been ensure Such ta Z f i e l d s the c u r r e n t similar to p r o v i d e type-unions checkable. to the a statement made that to the features and tag v a l u e syntactic are let from "inspect the - forbid were using when", of S I M U L A 68 ~ 9 7 5 ] we w o u l d charint = union mode integer digit charint alohanum~ tag field which in original its is : (integer, ~ character character]~ letter safe case from the usin Z conformity and clauses. The antagonism type-safe, lows with which ALGOL PASCAL handles 68-1ike manner. parameterized-tyoes, a formal parameter of : [int, char] is m o r e variant This the the type obvious records Since al- usually declaration mode type charint [ta Z : mode] char end end When ~> vat c : character the that the :: x ~ end letter run-time := x check type transfer tested on it case : tag of int ; end char statement o£ w h i c h out carried by ensures variant a cam- of a record the to the compiler checks for which all can non-union Static Treatment of Type Unions char has been vide flexible rity [Wirth [1975]] be able type deliberately unions to d i s c e r n following abstract Record of the r e c o r d actual tag type parameter "charint" may is be a c o n s - designed at the expense : however, a wise the programs secure interpretation vat digit which vat : charint letter lint] : charint allows alphanum type (char] are to p r o - of secu- compiler by of t h e s e a single record with since tide [the numerous record tag is of e n u m e r a t e d should using the programs: which represents [any] at moments This the of p r o g r a m a program identified to n e s t e d T which set program represented is a f i n i t e is a u g m e n t e d non-initialized point, execution, but at with by by The set a null value. two d i f f e - two d i f f e r e n t dealin Z with : 82 may be assumed by a tag field a variants is s t r a i g h t f o r w a r d ] . type value same variants types Since the consider generalization and values type unions ta Z, : when abstractly We will values. type-safe be tag f i e l d s . rent 68 or E U C L I D will of d i s c r e t e unions : charint values by t h e i r single : vat ALGOL to r e t r i e v e case a variable or any, digit an : int ~ end and ta Z is e x p l i c i t l y d i s c r i m a t i n g case PASCAL ~ end ensures corresponding = i : integer united is in use, charint declared, tant var to r e t r i e v e tyoes~ tag of int ~ > type-checking be record case of way is by a d i s c r i m i n a t i n g be 3.3 type been only in in a EUCLIO tag w i l l The the ~> if case olete EUCLID[1978] principle type This int = > programmer, since discriminatin Z x : alphanum ; end has since char is h i d d e n checked write the object coercion, :: a l p h a n u m be v i o l a t e d . statement. In A L G O L m a y be allowed, would [I 974 ]. The is no d e - u n i t i n g letter compiler context There safe, of a r e c o r d variable, a s t a t i c s u m m a r y of the p o t e n t i a l gram executions must consider tag fields. variables (More g e n e r a l l y , o{ e n u m e r a t e d a set of values type]. over, is a f i n l t e Thus the a b s t r a c t in 2 T, the p o w e r - complete if the p r o g r a m c o n t a i n s enumerated lattice. to take c o u n t of t h e m in the p r o g r a m a b s t r a c t m simple Finally, (2 T x ... x 2 T] m times. programs {null} {null} {null} the a s s i g n m e n t {male} 5) Since this space can be p e r f o r m e d at c o m p i l e this g i v e s rise un- to two e x e c u t i o n paths [6] and [8] : I [6) {male} I {female} {null} i is [7] of time. {male} [ {female} [8) {male} {female} 9] {male} {female} 10] The two e x e c u t i o n Example : = {mole} = {male} {null} {female} paths are j o i n e d together: i female}u{female}i{male} ' {male}u{male} type o e r s o n I{null} the value of the test is s t a t l c l y Known, is a com- the a b s t r a c t e x e c u t i o n is i g n o r e d to m a r y . a g e is ignored. the a s s i g n m e n t interpreta- space to p a u l . a g e I {female} Since if the p r o g r a m c o n t a i n s with tag of type T, o u r a b s t r a c t d a t a lattice, {null} {male} ac- v a r i a b l e s of type T or record v a r i a b l e s plete f i n i t e {null} [2) [4) of senior mary [I) [3) More- simple v a r i a b l e s type T, it is c o n v e n i e n t tion process. i line i paul for this is the case for v a l u e s o{ the tag will be c h o s e n set of T, w h i c h pro- = {female} u {Female} = {mole, female} record case sex : (male, female] of Note that at line may have end ; vat paul, mary, senior don't appreciate : person; [10] tag v a l u e s senior.sex it is c l e a r that "senior" "male" or "female". the qact that = if p a u l . a g e paul,sex else := male; but n e i t h e r do A L G O L 68 n o r EUCLIO. paul.age := ...; mary.sex := female; mary.age := ...; guages (3} (4) union (S) if p a u l . a g e £ mary.age then (6) senior it is e v i d e n t that in some m e r k n o w s p e r f e c t l y well := paul; type is used, {7) female {i With cases these but is u n a b l e of a to e x p l o i t this statement. This same limitation t r e a t m e n t of p r o g r a m s , lan- the p r o g r a m - which alternative since he m u s t use a d i s c r i m i n a t i n g exist else then male Knowledge, static we : ~ mary.age (1) (2} However, case a r i s e s w i t h our more powerful schemes [Sintzoff[1975]]. (6} senior := mary; Finally, (9) useful end; [10] in the s t a t i c information tements, and if s t a t e m e n t s , mity The symbolic execution of that piece of used f r o m case sta- as A L G O L 68 c o n f o r - tests. Example program would be : t r e a t m e n t of p r o g r a m s will be g a t h e r e d {Paul : = {male} ; Mary = {Female} ; Senior = {Male, Female}} if S e n i o r . S e x ... [I) ... [2) ... else ... fi 83 = Paul.sex then The abstract i n t e r p r e t a t i o n of a test (A = B) in a 1.1 Static variants to describe classes of data context where A and B are variables which may as- which are different but yet closely related. sume set of values S A and S B delivers a context For examole, Men and W o m e n may be described w h e r e A and B may assume the set of values S A n S B as Persons d e p e n d i n g on their sex, thus on the true path. EUCLIO authorizes (Thus in {I) we get Paul = Senior = (Male} n {Male, Female} = {Male}).The context De- livered for the false path is type Person : (Sex = {Male, Female)) = ... type Man = Person(Male) type Woman A = i__f (ISA nSBI = 1) and not (SAc S8) then SA - SB = Person(Female) In PASCAL however, variables of abstract else S A f i (Thus in (2] we get Paul : type Man and Woman may be staticly recogni- : {male} and Senior = {Fe- zed when their tag values never change. male}). 1.2 Dynamic variants, to describe objects whose When this abstract interpretation of programs components depend on a possibly is terminated we can recognize secure programs by the following facts : stopped, There are no assignments to tag fields, other than changing state. For example a car may be m o v i n g or thus EUCLID authorizes type Car for i n i t i a l i z a t i o n ( w h i c h is recognized {State : {moving, stopped, des- troyed)) by the fact that the tag value is changed from : = ... vat mycar : Car{an__~) null to some value). We can also authorize useless tag assignments, Since the actual oarameter supolied for the i.e. those which assign tag is any, to a tag w i t h o u t c h a n g i n g its value. the variable can be changed from one variant to another d u r i n g execution, by The unsafe de-uniting coercions must be checked. assigning values of d i f f e r e n t variants to This cannot occur when a record variable is assigned to another, the variable. However, since all record variables are considered to be of union types. no r e f i n e m e n t is al- lowed, and no proper subset of the possible (Note that tag values can be used such an assignment may indirectly modify a : vat m y c a r : Car({moving, stopped}) tag value, but this is safe). Oe-uniting coerThis fact may be discovered by a static ana- cions only occur when accessing a field in a lysis of the program, and might be useful record. This is safe only if the tag has been in memory allocation. staticly established to be of correct value. Otherwise, a w a r n i n g is reported to the ueer, 2. Storage S h a r i n g (Overlays). This implies the use and a run-time check inserted in the program. of the same storage area (expressed in the language as "the same actual variable") for diffe- 3.4 Flexibility Versus Security rent purposes i.e. for r e p r e s e n t i n g different abstract variables whose lifetimes are d i s j o i n t This compiler aoproach has the advantage of fle- {block structure is not incorporated in PASCAL). xibility over the secure language approach. It This is a typical case of free union, where no is clear that all EUCLIO orograms translated in- tag will be carried along to indicate the cur- to PASCAL will be recognized to be safe by this rently valid variant. This tag may be staticly technique. simulated, Following Wirth[1975] there appear to be three fields of the variant. Unsafe a s s i g n m e n t s will d i f f e r e n t motivations behind the desire for variants of record structures provided that one ensures an appro- priate setting of the tag upon assignment to be identified and therefore the m u t a t i o n from : one abstract variable to another may be reported I. The need for heterogeneous structures, in two main cases to the user. : 84 3. R e a l i z a t i o n EUCLID of i m p l i c i t in r e c o g n i t i o n led b r e a c h e s necessary, by m e a n s of the Tact that c o n t r o l - provides unchecked oT type c o n v e r t e r s the o r o z r a m elements never refer : seen how a P A S C A L it is c l e a r : d o m a i n of elem; comoiler might 0S2 : d o m a i n of elem; that P A S C A L provides fle- soecify that DS1 dynamic variables, into d i f f e r e n t We h a v e shown : + DSI~ $2 : + OS2; The r e s u l t s of this s t a t i c z r a m s m i g h t also be useful we get a s o p h i s t i c a t e d It is o b v i o u s constructs can be simoly cessitate language the p r o g r a m m e r lan- to scan intentions is the case examole concerning dynamic two p o i n t e r s which may ne- be sets of d i s j o i n t i£ $I and $2 are p o i n t e r s records of the same between of the linked s t r u c t u r e , to d i f f e r e n t type. a pointer only in the p r o g r a m m e r ' s $I and $2 o o i n t of the program- c a u g h t by c o m o i l e r s This that ; the d e c l a r a t i o n s : the c o n f u s i o n the f i r s t e l e m e n t list is v a l i d has no Now, to d i f f e r e n t Unfortunately for e x p r e s s i n g his i n t e n - rich and not n e c e s s a ~ l y e a s y constructs. they o e i n t Thus that the p r o g r a m s will since H o w e v e r some simple mer which of pro- c o m p i l e r for a s i m p l e then, not be v e r y readable, treatment in c o d e g e n e r a t i o n . and OS2 will domains $1 tions. now n e c e s s a r y o n e can s p e c i f y 0SI tructs h a v e been used in e i t h e r s e c u r e or i n s e c u r e preestablished is to the same record that a c o m p i l e r m a y r e p o r t to the user w h i c h c o n s - guage. it code of the c h a r a c t e r x i b i l i t y at the e x p e n s e of security. ways. and to s t a t e the c o n t r a r y . In L I S [ 1 9 7 4 ] type c o n v e r s i o n s , this fact to the user. Finally, share all <<= c h a r a c t e r [ ' H ' ) to i the i n t e r n a l 'H'. We have reoort may of the t y p e s y s t e m are s o m e t i m e s i := u n s i g n e d - i n t assigns type t r a n s f e r f u n c t i o n s . records to and the intellect. of type elem, t h e m s e l v e s m a y p o i n t to the same record. the idea of d o m a i n s has to be r e c u r s i v e l y in o r d e r that e l e m e n t s to s o e c i f y Thus applied of d o m a i n 0SI to u n d e r s t a n d o o i n t only in o u r next a l l o c a t i o n of records. 0SI to e l e m e n t s : domain tyoe eleml of 0S1 : of eleml; = record next : + 0SI; val : R; 4. Disjoint Collections of Linked Records end; 4.1 Collections in EUCLID and that e l e m e n t s of D S 2 Suppose in PASCAL we h a v e to r e o r e s e n t [of t y p e R], we can u s e two a r r a y s : : d o m a i n of e l e m 2 With such a d e c l a r a t i o n , next comoiler that the sets $1 and $2 are disjoint, say any m o d i f i c a t i o n and v i c e - v e r s a . namically linked tyDe list elem knows linear lists we will S i n c e we w a n t cardina- different use dy- : different eleml = record val : list; to g u a r a n t e e domains riable we h a v e = + elem; next : R end; that is to that n, the m a x i m a l lity of the two sets is not known, : t OS2; val of $1 has no s i d e e f f e c t on $2 Suppose = record of R; the P A S C A L ; : tyoe e l e m 2 v a t $I, $2 = a r r a y E 1 . . n ] to e l e m e n t s two sets 0S2 of r e c o r d s of 0S2 can o o i n t only that two p o i n t e r s into can n e v e r r e f e r to the same va- to c o n s i d e r that +DSI tyoes of Dointers. and elem2 are d i f f e r e n t to w r i t e twice deletion ...) which the a l z o r i t h m s handle and The t r o u b l e types, +DS2 are is now that so that we have (insertion, search, the two s i m i l a r lists $I and $2. : R; end; EUCLIDE1978] vat $I, $2 tyoes This tffme, the r e a d e r s compilers] is m o r e flexible and a u t h o r i z e s : list; of the o r o g r a m have to s u p o o s e (e.g. P A S C A L to be o a r a m e t e r i z e d . Thus we will d e s c r i b e the tyoes of lists $I and $2 once, that the sets $I and $2 the d o m a i n 85 (called c o l l e c t i o n as d e o e n d i n g in EUCLIO] on to w h i c h they The collection Delong. type elem collection is p a r a m e t e r i z e d to w h i c h This collection type elem] type elements by the of t y p e C is a c o l l e c t i o n pointing elem(C to C name elem Since C of the of point. of r e c o r d s be p r o v i d e d [o ~ ve : : collection of e l e m ( C ) ] we h a v e entered listsuoport it by a reeursive in the d e f i n i t i o n by an a c t u a l some type = lanzuaze parameter] listsuooort(OS next v a t val record 0SI vat $1 : collection Now : collection $2 cribed once, a parameter on 0SI been insert later. Since we lists $I suffices or OS2 $1, record and $2 c a n to pass to w h i c h of parameter actual Now w e have the they type, be d e s name refer $I w h i c h to of indicate insert[OS actual but To m a k e have 0SI and 0S2 : collection the 0SI, different] conspicuous same : are as that Since the type linear pends on listsupport type its tion m u s t be o£ type the type lists just of records 0SI, the listsupoort[OS elem of vat 0SI val type : R] formal [or the s a m e type. that to 0S1. as port 0SI is i n c o m p l e t e "listsupport" type elem We g e t of that type point elem, to c o l - : (OS : l i s t s u p p o r t ( p a r a m e t e r ] ] next the : +C record = collection : listsupport(OS1]; $I : + 0SI : listsupoort[OS2]; $2 : + 0S2 is p r e c i s e with but somewhat the P A S C A L type sum- list = type of by t h a t : ?] the elem = record val var : = collection Apart o_~ : above since of elements OS of : list; : R and $2 from the the and are disjoint the E U C L I O precision. $2 are ; difficulty tage of : list disjoint especially for linked of c o p e i n g approach Since the lists, register linear with has it can a new the comniler lists}. lin- advan- Knows produce that better allocation. pointing is a c o l l e c t i o n Moreover of type restricted memory : listsupport[?]] $2 {$I notion, listsup- : listsupport(OS $I, guistic $I declaration when : end; de- collec- name overcomplicated declarations + elem; of e l e m ( ? ] as 0SI of e l e m ( O S ] name collections such = : R listsuooort : is a c o l l e c t i o n the of of var 0S2 compared = listsupport[OS1] However name v a t 0S1 which will code means type type to r e p l a c e CO : l i s t s u p p o r t ( o a r a m e t e r ] ] vat elem(OS] such is the now w a n t by the listsuoport. vat val collections = collection parameterized A declaration be p r o - record type of e l e m [ O S ] , to g i v e the of a c o l l e c t i o n name a shorthand will whet we type next type that in the d e £ i n i t i o n listsuooort : OS, different we now w a n t to the linked 0S2 S collections that type, "listsupport" porting OS, = of e l e m ( O S ] parameter lists of this of t y p e OS c o r r e s o o n d i n Z to the parameters that indicates in d e f i n i n g in p a r t i c u l a r lections belongs to d e c l a r e vat is c l e a r to sol- : listsuoport[perameter]) actual supporting end procedure have = forward r in list 0SI. It we : r) to c o l l e c t i o n the f o r m a l the succeeded collection elem(OS2] type the parameter used, vided : insert[OS1, possible keyword has the d e f i n i t i o n s it just the c o l l e c t i o n will of : + 0S2 the o p e r a t i o n s The use must of e l e m [ O S 1 ] : + 0SI var OS2 vat : +C : R (each listsuoport collection vat vat question of convention record end of e l e m ( O S ) cord = 86 the variants allocation type combination R with in r e c o r d s strategies. two v a r i a n t s of c o l l e c t i o n s may yield Suppose Ra, and efficient we have a re- Rb of d i f f e r e n t memory sizes Type say 1 and 3 w o r d s Rtyoe Type = (Re, R (tag : Although tion Rb] : Rtype] = record case t a g i n Ra = ... end Ra Rb ~ ,.. end Rb We have cation the following of R of c o l l e c t i o n compilers seems for memory We w i l l pointer 01: I I I I I I I I I v a t C2 : c o l l e c t i o n of R[Rb] I v e t C3 : c o l l e c t i o n of R{unknown}ll [the collection type of records (it m a y will vat be R[Ra] not c h a n g e C4 (The once : collection records variant values during Thus defect we c a n n o t lists and we may to in P A S C A L let collections. represent variables a collection which point by the within that set of collec- Examole : X Y Z C3 i s unknown of a r e c o r d I I ---I C4 can change execution, variants ber of c o l l e c t i o n s somewhat its u s e f u l n e s s t to the I I---1 from C1 ~ of collections is d e t e r m i n e d declare an array C2 one by a s s i g n i n g Collection records). will The m a i n no- tion. III type R[any] of c o l l e c t i o n of d i f f e r e n t the the allocated). of to a n o t h e r 1 of collection or R [ R b ] ] . T h e appear However undeniable discover may cower allo- : o f R[Ra] vat in E U C L I O to u n d e r s t a n d . compiler expressive 4.2 Compiler Discovery of Disjoint Collections alternatives of c o l l e c t i o n s limited difficult the end e a s e ; end r e c o r d of quite is that at c o m p i l e o~ d i s j o i n t the the num- 01 will be d e n o t e d pointer will linear not ones• : mute C1 try Thus the Z]. (V;W], We will in o o p o s i t i o n to rind global local program form collection try of a p r o g r a m However the at e a c h be of Y, variables collections. time. be d e n o t e d [×, into disjoint to EUCLID, collections we but local try to com- invariants we will 9oint be r e s t r i c t e d will C2 to p a r t i t i o n to : [V, W are pointers to the same collection] and TIll [X, Y, Z are oointers to the same collection] T[2] which we w i l l denote : {V, W / X , Y , Z } We n o w have TEn] to d e f i n e dicates (i.e. example : the the union conjunction of sets ~ of such { A , B , C / D,E} ~ { F , A , G / H} = { A , B , C , F , G I £ on one hand A may p o i n t It has not b e e n collection tions [which be u s e d tition is in f a c t which would criticism recognized the are v a l i d union to the the c o n c e p t recursively, a collection a globally declared of collec- at v a r i o u s be u s e f u l is that that that into smaller program compiler). disjoint record referenced A similar by F and G, i t par- sub-collections. 87 referenced clear that to a A, B, on the same r e c o r d . of formation• the i n s t r u c t i o n s After is for / D,E / H} hand A may p o i n t The i n s t r u c t i o n s cannot cannot to a record on the o t h e r C, F and G may p o i n t ooints of c o l l e c t i o n is to say one by B and C, o r , pre- of c o l l e c t i o n s ) the p r o g r a m p r o v i d e : useful in- X : = nil ; X := Y; { w h e r e if X = nil new it is Y is then known to be nil} ... The following the job Known var that or will be record. Thus X will the o n l y pointer we h a v e or c o n s i s t i n g of point to no r e c o r d to isolated a single the newly at all, With copy CI, C1 X 2 ..... X ..... Xn / Yi . . . . . C2, :: $1~ while = {X i, ($1 L : list; v a t $2 : list]; : list; $2 :: nil; L := nil: ~P1} input predicate P1 to do {P0} (empty an is s u p p o s e d bezin allocated a collection record). procedure : procedure [X] PASCAL Cl <> n i l do begin Yn } {P2} the above P2 instructions = extract(X, {X / lead to an o u t p u t predicate: new(C2)~ X1 ..... C2+.val := C l + . v a l ~ C2+.next := nil; {P3} PI) X / n Y More generally, with an input ter assignement such as : Y } "''" n i" predicate if L = nil'then {P4} P,I, a p o i n - $2 := C2 {PS} X+.next ... +,next := Y+.next optional may c a u s e record. The output +.next else optional X and Y t o Hence ... they indirectly are predicate put will point in the be P2 {P8} to L+.next a common {P7}; {P6} same c o l l e c t i o n . = P1 := C2 L u {X,Y}. := C2; C1 := C l . n e x t ; {P9} A sensible the remark right-hand in w h i c h broken plicity, obvious side case into of we this m a y ignore the v a l u e delivered the a s s i g n m e n t cause two d i s j o i n t case {P1} is that this may a collection other be e nd nil, {PlO} to be sub-collections. fact, by For than end~ sim- in the According : the X := Y+.next ... +.next {P2} basic establish ; [I] optional P1 to o u r constructs the abstract of the following interpretation lenzuage system we can o£ now of e q u a t i o n s : = extract[L,extract[S2,extract(C1,PO) U {C1,$1})] which will cause X to be d i s c o n n e c t e d collection and collection of Y. W h e n variable, ted be c o n n e c t e d the o u t p u t to the inout X end P2 = e x t r a c t ( X , to a r e c o r d Y are assertion assertion P1) from [2] of the the P2 will PI by ~ {X, not its be same rela- (3) : P2 = P1 u P9 (Since the [C1 <> n i l ) mation on c o l l e c t i o n s when assignment of deed modification Now, sen the we w i l l copying give an e x a m p l e . of a l i n k e d linear We h a v e list b y C2 a r e cho- : us no i n f o r - non-oointer in the values structure and pointed ignored) [4] P4 = e x t r a e t ( L , P 3 ] (5] P5 = e x t r a c t [ S 2 , (6] P6 = P3 (since 88 gives true] P3 = e x t r a c t [ C 2 , P 2 ] (The Y} test we i g n o r e [7) P7 = P6 ~ { L , C 2 } (8) P8 = P5 ~ P7 P4) ~ { $ 2 , the £act C2} that L <> n i l ) a to [9] P9 [I0] = extract(L,PB] [The statement same collection) PIO => P6 = P3 = { $ 1 , C I / $ 2 , L / 0 2 } *[8] u {L,C2} Cl :: C l . n e x t leaves C1 in the (7] => P7 (8] => P8 * = extract[CI,P1 = P6 ~ {L,C2} = {$1,C1/$2,L,02} P7 = P5 u P7 ~ P9] {$1,CI/$2,C2/L} Since that the the above are v e r i f i e d theoretical system conditions of e q u a t i o n s [Cousot[1076]) fixpoint using We with a #inite has we can Kleene's which * ensure a solution compute the Same v a l u e least only dicate PO, ($1,$2] where and [CI,C2,L] * => P1 on on the are PO [I) the m o s t other supposed = {$I,S2 hand hand the the initial stop on t h a t o£ the loop path. It remains : Pi PIO ore- = extract[el, P1 ~ P9) : extract(C1, {$1,CI/$2/C2/L} : {$1,C1/$2,C2,L}] parameters local to be in the / CI,C2,L} out sequence. disadvantageous the one as a b o v e , the p a t h (10]=> start ~ {$1,C1/$2,L,02} P8 = { $ 1 , C 1 / $ 2 , L , C 2 } {$1,Cl/$2,C2,L}] variables same = ±, collection: ¥i PIO * { [1,10] The = extract[L,extract(S2,extract[O1,PO] £inal main {cl,sl})l = {01/$1/$2,C2,L} results result records extract[el, are m a r k e d is that on e n t r y although of the by a star $I and procedure [*]. $2 may "copy" The share : : extract[L,extract(S2,{SI,S2/C1/C2,L} PO u {Cl,Sl})) * P1 = {$1,$2/C1,C2,L} = extract[L,extract(S2,{S1,S2,Cl/C2,L}]] it is g u a r a n t e e d that = extract[L,{Sl,Cl/S2/C.2,L}] of : = {S1,C1/$2/02/L} [2] => P2 = P1 u P9 = P1 [3] => P3 = e x t r a c t ( C 2 , P 2 ] (4] => P4 = e x t r a c t ( L , P 3 ] (5) => P5 the orocedure PIO this is not the case on e x i t = {C1/$1/$2,C2,L}. u ± = P1 {SI,G1/S2/G2/L} = {$1,C1/$2/C2/L} = = extract[S2,P4] 4.3 Remarks u {$2,C2} a. This {$I,CI/$2/C2/L} abstract interpretation of p r o g r a m s may be u {$2,C2} refined as in E U C L I O one associate : when records have variants P5 = { $ I , C 1 / $ 2 , C 2 / L } * (6 => P8 = P3 = { $ 1 , 0 1 / $ 2 / C 2 / L } (7 => P7 = P6 u { L . C 2 } {$1,C1/$2/C2/L} u {L,C2} can of tags of in f a c t will all velopments with each records be xib]e => P8 = P5 u P7 = {$1,C1/$2,C2/L} than authorize u {$1,C1/$2/C2,L} {A,B} application of o a r a g r a p h 3, We will the or "any" "one oT" collections amon Z three the in the c o l l e c t i o n . the m a i n { $ I , C 1 / $ 2 / C 2 , L} (8 collection with say possibilities set This o£ o u r d e - be m o r e of E U C L I D , #leand two v a r i a n t s {A,8,C}. Other- = {$1,C1/$2,C2,L} [g] => P9 = e x t r a c t ( L , P S ] * P9 : in variant PIO PO . . . . . => P2 = PI : * P2 = the while-loop have stabilized {S1,Cl/$2/C2/L} the reason on {A,B,E} in- the following = T : ~ {A,B} {A,C} {A} {B} {B,C} {$1,Cl/S2,C2,L} {$1,01/$2,C2,L} => P3 = e x t r a c t ( C 2 , P 2 ] *[4] => P4 = e x t r a c t [ L , P 3 ) pass, until we : u P9 *(3) We come b a c k stated rarchy {$1,01/$2,C2,L} We go on c y c l i n g [2] wise u {L,C2} = {$1,C1/$2,L/C2} {}=± {$1,CI/$2/L/C2} = for P4 w i t h the so we s t o p on t h a t path. value of the previous 89 {C}. type hie- whereas EUCLID uses a s i m p l i f i e d type inclusion fuse the automatic discovery of collections. The scheme advantage however must be c o u n t e r b a l a n c e d by the : {A,B,C} / (A} fact that p a r a m e t e r i z e d collections \ (B} (which are = T necessary with recursive data structures) may become inflexible and d i f f i c u l t to use. {C} \l/ We now come t o {}:a between tely b. Besides and in opposition with EUCLID the collections are defined as local invariants, Very pree the programmer necessary unions, an e x a m p l e that for is and the secure to say where a cooperation compiler and c h e a p a case has definite disadvantages 5. Integer 8ubrange Type when over use the the is absolu- of type compiler programmer. cise and detailed information can beJgathered whereas the EUCLID p r o g r a m m e r would have to g l o b a l l y specify the union of such information. This localization of c o l l e c t i o n s may have imporA subrange type such as tant c o n s e q u e n c e s type index = 0..9 - An optimizing c o m p i l e r will be able to limit the number of objects which are supmosed to is used to specify that variables of type index have been m o d i f i e d by side-effects when assigning to objects designated by pointers, will possess the properties of variables of the [use- base integer type, u n d e r the restriction that its ful in register allocation], value remains within the specified range. [19751). In Cousot[19751, we developed - R u n - t i m e tests may be inserted before a statement i n t e g e r variables. Let us take an obvious example : to verify that no variable in the c o l l e c t i o n i i= I; of X may access the record which X points to, {P1} while - The g a r b a g e collector may be called when all variables in a c o l l e c t i o n are "dead" The s i m p l e abstract we i l l u s t r a t e d stylized may u s e the that ways. fair the interpretation here recognize is _< 1 0 0 0 do i := i+1 {P3}; { P4} - etc... It i {P2} [i.e. are not used before being assigned to), c. [Wirth a techni- que to have the c o m p i l e r d i s c o v e r the subrange of : dispose[X}~ to : : may be f u r t h e r data structures Boom[1974], however to say provided a d v a n t a g e of EUCLID Let programs investigated are used that to by t h e ple in EUCLID c o m p i l e r s locally refine programmer. by [a,b] the equations predicate a _< i corresponding to _< b . our exam- : (1) P1 [2] P2 = [P1 [3] P3 = P2 + [ 1 , (4] P4 where and (or better part of = [1,11 = (P1 + is u and vals. intentions since the expressive power of collections is limited), is of u P3) n [-oo, 1000] 1] u P3] n [1001, +oo1 The i~ then t h a t w h e n the p r o g r a m m e r has declared his intentions us d e n o t e The s y s t e m Kerr[1975]. same t e c h n i q u e s collections of defined n are Suppose by [a, union b] and + [c, d] = [a+c,b+d], intersection we knew t h e solution of to inter- that system, iie. he is forced to c o n f o r m to his declarations. For example he will not P1 be able to use the same pointer variable to P4 = [ 1 0 0 1 , traverse two lists which are built in diffe- It rent collections. On the contrary this may con- this 90 is = [1,1], obgious solution P2 = [ 1 , 10001 , P3 = [ 2 , 10011, 10011 to let is a -Fixpoint the compiler o-F t h e verify system that : [1] -~-> P1 = [ 1 , 1] (2] ~> u P3] P2 = (P1 ([1, 1] [[1, 1001] = [1, [3] ~> ~> u [2, P3 = P2 + [ 1 , 10001 1001]) n [-oo, clear, n [-oo I000] [1, : [2, 1001] To make it instead of P1, Pi, PI, P4 the programmer is only able to declare vat i : 1000]] 1 ..1001 that is to say that PJ u P2 o P3 u P4 E [I, 10011 which adds an inequation to tiTe system o£ equations but does 1] : P4 = (P1 [-oo, 1000] 1000] [1 + 1, (4] compiler to discover local subranges. n not provide its solution. We then consider integer + [1, 1] suorange types as union types since the global de ~ 1000 + 1] claration must be the union of all local subranges. Thus, if U P3] n [1001, we declare : +~1 var i : 0..2; [[1, 1] u [2, [I, 1 0 0 1 ] : [1001, 1001]] n [1001, n [1001, +~1 we really want to say that the type of i at each +~] program point is one of the following a l t e r n a t i v e s : 1001] If on the contrary we want the c o m p i l e r to dis0..2 cover this fixpoint, we may try to solve the equations by algebraic m a n i p u l a t i o n s [Cheatham and TownleyE1g761] which may be quite inextricable. other way is to use Kleene's sequence, The but the trou- ble is that our abstract data space is an infinite lattice, and we may have infinite sequences. Since 0 . .0 2, .2 compilers must work even for programs which may turn out to loop, the only way to cope with the undecidable problem is to accept a p p r o x i m a t i v e ans- ]t wers. For example in the program : for i := I to 100 do We t h e n begin understand a criticism b y H a b e r m a n n [ 1 9 7 3 1 that sLbranges ere not types, n := i; while if write since a global sub- range type definition does not determine the set n <> 1 do even(n] of operators that are applicable to variablesof then n := n/2 that type. else n := 3 * n + I; For example, (i] let f be a {unction with one formal parameter o# type 2..I0 and i a variable globally end; declared of type d.,5. The variable i may be used at program point p in the expression f[i] provided C o u s o t [ 1 9 7 5 ] will d i s c o v e r an approximate range that i may be united to the subrange 2..10. Oynafor n which will be [1, +~].However, if the actual mically the local type of i at program point p is range of n were known by the programmer and if the i..i, wnich is simply derived from the value T of programmer could tell this to the compiler, then the variable i. in the expression f[i], a v e r i f i c a t i o n would be simpler i must be [in most cases but coerced from the type i..i to the type 2..10. This not on this difficult example]. is fies that the subrange of i at program point p must We can now state our main objection against subrange types in PASCAL safe when 2 ~ i and i s 10. Staticly this signi- be a subrange of 2..5. This subrange of 2..5 cannot : the fact that range asser- tions must be given globally in the d e c l a r a t i o n pre- be locally specified in PASCAL. vent the programmer from giving the solution of the This u n d e r s t a n d i n g of subranges leads us to the con- system of equations clusion that integer subranges should be specified to the compiler. The programmer can only give an approximation of locally. Moreover, the solution, vious examples we cannot exbeet the compiler to be which is usually insufficient for the 9] and in opposition with our pre- able to discover these local subrange properties. results will be obtained by type checking or type It is then essential that orogrammers provide them, discovery as long as finite type systems are consi- by means of assertions or as previously by means of dered. The main difference b e t w e e n these approaches conformity clauses so that we would write in the is the one between security (at the expense of fle- spirit of ALGOL 88 (MeertensL1975]) xibility) or simplicity : [at the expense of preci- sion, and of the possibility that c o m p i l e r warnings i := 1; be ignored). while case i in However when the type union system is infinite [I..I000)o:(i := O + I; true), out teger subrenge type), :false [in- it has been shown that static type checking necessitates language constructs which coPe eilow do subtypes to be locally derived. skip od; The argument was based upon the o b s e r v a t i o n These constructs give the solution of the system of equations which the compiler has to solve for that type v e r i f i c a t i o n consists in strong type checking. a solution to e system of type equations. The redundancies [equations establishing Global identically verified) can be eliminated. Moreov!~r tyae declarations give an a p p r o x i m a t i o n of the so- the PASCAL r e s t r i c t i o n that the bounds of ranges lutions to that system. must be m a n i f e s t e o n s t o n ~ i s lar solution from that a p p r o x i m a t i o n may involve o definite advantoge The discovery of o particu- since this v e r i f i c a t i o n will involve no symbolic infinite computations. On the contrary, formula manipulations. guage is d e s i g n e d to directly provide a solution Run-time tests will remain necessary in difficult cases, but their number will to the compiler, be decreased. straightforward verification. if the lan- type checking consists in a This reasoning might turn out to be useful to language designers who until now could not lo- 6. Conclusion gically prove the validity of their design of language constructs. M o r e o v e r this reasoning may ser- We illustrated the Tact that unsecure data types ve as a basis to define type safety in languages [which do not guarantee all operations on values of that type to be meaningful] sidered as the union of secure oSes of these were pointers, records in collections, and prove particular languages to be type reliable. can be con- (sub) types. Exam- varionts in records, 7. integer subranges. Acknowledgements P. Cousot benefitted from d i s c u s s i o n s on this topic A type-saTe programming system must staticly determine which sate subtyDe of the union is used when with colleagues in IFIP Working Group 2.q checking correct use of operations on union typed oriented hi,her-level languages). The o r g a n i z a t i o n objects. with J. Horning and K. C o r r e l l o f a debate on EUCLIO The language d e s i g n e r may achieve this goal by one of the followin Z alternatives: was especially helpful. - Incorporate rules and constructs in the We w e r e language so that any operation of the langua- for ge can be staticly shown to be o p e r a t i n Z on correctly typed arguments. - D e s i g n a compiler in order to verify that the security rules have not been transgressed, It compiling although not enforced by the language. was a r g u e d techniques that in must both cases, be u s e d , the [Machine- same and c o m p a r a b l e 92 us. very lucky to hove F. Blanc do t h e typing 8. Bibliography MOL-Bulletin [P. Cousot ed.], Nb 5, Sept. 1976, DO. 91-172. ALGOL 6 0 1 1 9 6 3 ] . Naur P. [Ed.] " R e v i s e d R e o o r t on t h e A l g o r i t h m i c GOL 6 0 " , CACM, 6, [Jan. 1983], Language AL- HabermannL 1973 ]. Do. 1 - 1 7 . "Critical Comments Pascal", ALGOL 6811875]. Van Wijngaarden " R e v i s e d R e p o r t on t h e A l g o r i t h m i c GOL 68", Aeta Informatica (Eds.]. A. et al. Language [1973), Sprin- Language AL- 5 [1975], pp. 1-236. Hoare and Wirth[1973]. "An Axiomatic Definition guage PASCAL", Analysis 3, 47-57 ger-Verlag . 90om[1974]. "Optimization on the Programming Acta Informa~ica of Programs in Languages of the Programming Acta Informatiea, Lan- 2, 1973, op. 3 3 5 - 3 5 5 . with Pointer Variables". Ph.D. Thesis, Dept. ce. University Appl. Anal. oT Waterloo, and Camp. Ontario Karr[1975]. Scien- [June]. "Gathering Come. Cheatham and Associates About Programs", Inc. CA-7507-1411, Mass. July 1974. Townley[1976], "Symbolic Evaluation DO analysis". of Programs Proceedings posium on Symbolic August Information Kildall[1973]. : a look at lo- of the 1876 ACM Sym- and Algebraic Comeutation. 1976. "A Unified Aooroach to Global tion", of ACM Symposium Conf. Record ples of Programming pp. 194-2U8, Program Ootimiza- Languages, October on Princi- Boston, Mass., 1973. Cousot[1975J. "Static Determination Programs". U,S.FI.G., in the proceedings of Dynamic RR 25, Nov. 1975, of "Colloque sot la orogrammation", Properties of Kleene[1952]. to apoear "introduction International Parls 13-15 April Publishing LIS[1974]. Cousot Couset[1976] model for static truction Fourth : a unified analysis ming Languages. lattice of programs or approximation ACM Symposium Manual by cons- on Principles Los Angeles, of "Colloque tion", to appear 13-15 April languages Pascal[1974]. in the proceedings "PASCAL, sur la Programma1976, 18, Mitchell J.G., "Report Homing "SINULA and formal [1875), O.J., J.C., LIS"Reference Oecem- 1976. London R.L., 1875" [S. Sehuman, in Algo- ed.] Paris. K., Wirth N. Science, and Report", Nb.18, 8irtwistle, Dahl, Lecture Notes Springer-Verla Z, 1974. Myhrhaug, Begin",Student]itteratur, on Soecific Jan. Language in "New Oirections Lund, Nygaard. 1874. Sintzoff[1972] Popek G.J. on the Programming IRIA, Jensen "Calculating B.W., Language Report 4549 EI/EN, January User Manual in Comouter Ounod. non determinacy of programs", CACM, Lamoson Heliard : pp. 453-457. EUCLID[1976]. Languages SIMULA[1974]. commands, derivation Revised "~Iode and Meaning", 1977. Oijkstra[1975]. "Guarded Implemen<ation pc. 125-138, International Paris, J.P., Meertens[1975]. of Program- January type in programming a new approach", Rissen , CII Technical ber 1974, and Schwartz[1878]. "(he pointer J.O., of fixooints". rithmic Oembinski Ichbiah P. "The System Interpretation North-Holland 1978, Ounod. "Abstract to Metamathematics", Co, Amsterdam. EUCLID", 93 1972, Properties Models", of Programs SIGPLAN DO. 203-207. by Valuations Notices, Vol. ~ Nb.1, Sintzoff[1975]. "Verifications utilisables d'assertions comme valeurs riables ext6rieures", programs", oour les fomc~ions et affectant des va- in "Proving and Improving ed. G. Huet and G. Kahn, IRiA. Tarski[1955]. "A lattice-theoretical its Apo]ications", matics, 5 11955), F±xooint Theorem and Pacific Journal'of Mathepp. 235-30B. We~breit[1975]. "Property Sets", Vol. Extraction i.E.E.E. SE-I, in Well-FoundeO Trans. on Software Property Engineering, nb. 3, pp. 270-285. WirthE1975]. "An Assessment CAL", SIGPLAN of the Programming Notices, Vol. 10, Language PAS- nb. 6, June 1975, pp. 23-30. 94