Next: 5. Industrial Example Up: Diploma Thesis: Utility Support Previous: 3. Code Instrumentation   Contents

4. Model Information

The OCL compiler needs model information for type checking. How this works is explained in [FF00] section 5.3.3.

One possible source of model information is a UML model exported from a CASE tool. This is probably the most elegant way. But since most real-world projects don't have an (up-to-date) UML representation of their business model, this isn't feasible in practice.

Another source is the java code itself, accessed through the reflection API. This is very convenient, since no additional model is needed. However, java reflection lacks some model properties which are important for type checking.

  1. Element types of collections, particularly collections representing associations. From a C++ perspective, java lacks templates implementing parameterized container classes.
  2. Qualifier types of maps, representing qualified associations.
  3. The isQuery tag of operations. Note that OCL expressions may use only operations without side effects (queries).
This chapter presents a solution to the first two items above. The information needed is put into the source code. Section 4.1 explains how this information is stored, while section 4.2 presents several approaches to generating it.

The third item could be solved in a similar way, by putting an isQuery tag into the source code. However, this is not an urgent problem. Without an explicit solution, the developer has to be careful to call only side-effect-free java methods in OCL expressions.
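To illustrate the distinction, consider the following hypothetical example (not taken from the toolkit): the first method is a query and may safely appear in an OCL expression, while the second modifies state and must not.

```java
// Hypothetical example illustrating the isQuery distinction.
class Account {
    private int balance = 100;

    // A query: reads state only, safe to call from an OCL expression.
    int getBalance() {
        return balance;
    }

    // Not a query: modifies state, must not be called from OCL.
    int withdraw(int amount) {
        balance -= amount;
        return balance;
    }
}
```

Calling getBalance any number of times leaves the object unchanged; a single call to withdraw does not.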


4.1 Representing Element Types

Element types and qualifier types are specified using special tags in javadoc comments. See the example below.

class Company
{
  /**
     All persons employed by this company.
     @element-type Person
  */
  Collection employees;
}
The @element-type tag takes a parameter specifying a java class or interface. Thus, it's similar to @see as defined in [GJS96], section 18.4.1. The @element-type tag is valid for attributes only, and there must be at most one such tag per javadoc comment. The tag is not restricted to attributes of type java.util.Collection, since future implementations could use other collection APIs as well.

Analogously, the @key-type tag is introduced for association qualifiers.

class Bank
{
  /**
     Customers qualified by their account number.
     @element-type Person
     @key-type Integer
  */
  Map customers;
}
Note that the reflection model is restricted to qualified associations with one qualifier only. [UML] allows multiple qualifiers, but there is no convenient representation for this in java.

Furthermore, UML specifies that a qualified association which has not been qualified in the OCL expression yields a set, i.e. there must be no duplicates. For the java example above, this means that the following invariant must hold:

context Bank inv: 
  customers->size()=customers->asSet()->size()
Since this is not enforced by java.util.Map (only keys are guaranteed to be unique), the OCL library provides an appropriate runtime check.
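Such a check could look like the following sketch (names are hypothetical; the actual implementation in the OCL library may differ): the values of the map, viewed without their qualifier, must form a set.

```java
import java.util.HashSet;
import java.util.Map;

// Sketch of the runtime check: the values of a qualified association,
// viewed without the qualifier, must contain no duplicates.
class QualifierCheck {
    // Corresponds to: customers->size() = customers->asSet()->size()
    static boolean valuesAreUnique(Map<?, ?> association) {
        return association.values().size()
            == new HashSet<Object>(association.values()).size();
    }
}
```

A map with two keys pointing at the same value would violate the invariant and make this check return false.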

4.1.0.0.1 Implementation.

A really comfortable implementation would let the java compiler do the parsing and provide the information through an extended reflection API. This would be similar to the @deprecated tag. However, this approach would require the java compiler, the JVM, and the standard runtime library to be modified. Apart from the effort of making these modifications, most java developers probably have a profound aversion to using a dedicated java environment just for checking OCL constraints.

The implementation developed with this paper extends the reflection facade by scanning the source code for these comments on demand. This implies that the java source code is needed for type checking OCL constraints in addition to the class files.
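A minimal sketch of such a scan, assuming the javadoc comment text is already available as a string (the real facade parses the source files on demand; class and method names here are hypothetical):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch: extract the parameter of an @element-type tag from a
// javadoc comment. A @key-type scanner would work analogously.
class TagScanner {
    private static final Pattern ELEMENT_TYPE =
        Pattern.compile("@element-type\\s+([\\w.]+)");

    // Returns the tag's parameter, or null if the tag is absent.
    static String elementTypeOf(String javadocComment) {
        Matcher m = ELEMENT_TYPE.matcher(javadocComment);
        return m.find() ? m.group(1) : null;
    }
}
```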

There is a crucial question left: where do the tags come from? Possible sources are discussed in section 4.2.

Collection attributes with type tags are verified at runtime by the instrumented code. Note that this kind of type information is also useful for reverse engineering a UML model from given java code.


4.2 Reverse Engineering

Section 4.1 explained how to store additional type information of a java model in javadoc tags. This section discusses how to create this information.

In the end, these type tags have to be created manually. None of the automated procedures is perfect, so these procedures are suitable for decision support only. This chapter tries to support the developer with an interactive tool for inserting @element-type and @key-type tags into the code. There are two main features of this tool:

  1. Graphical user interface: clear presentation of missing type tags and comfortable editing facilities.
  2. Decision support: giving hints to the developer. These hints are either derived statically (sections 4.2.1 and 4.2.3) or gathered dynamically at runtime (section 4.2.2). There should be a special indication if several hints suggest different types.
A prototype of this tool according to the ideas presented in this section has been developed by Steffen Zschaler. The prototype currently features the graphical user interface and the runtime analysis of section 4.2.2.


4.2.1 Source Code Analysis

Information about element types may be derived from static properties of the class, such as parameter types of methods and other tags in javadoc comments.

The following example suggests some of these properties. The element type of employees is obviously Person, but this information is not yet available to the OCL compiler. The tool could derive an appropriate hint for the developer from each of these features.

/**
   All employed {@link Person persons} of this company.
   @see Person
*/
Collection employees;
 
boolean isEmployee(Person p);
void addEmployee(Person p);
void removeEmployee(Person p);
Note that the example above requires linguistic knowledge about the plural and singular forms of nouns (employee here). This gets far more difficult if the identifiers are not English.
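A naive sketch of such a heuristic for English identifiers (purely illustrative, clearly not a full linguistic solution) could derive a singular candidate from a plural attribute name and then look for methods such as addEmployee:

```java
// Naive heuristic sketch: derive a singular form from a plural
// attribute name, to match it against methods like addEmployee(Person).
class PluralHeuristic {
    static String singularOf(String plural) {
        if (plural.endsWith("ies"))    // companies -> company
            return plural.substring(0, plural.length() - 3) + "y";
        if (plural.endsWith("s"))      // employees -> employee
            return plural.substring(0, plural.length() - 1);
        return plural;                 // no plural form detected
    }
}
```

Even for English this covers only the regular cases; irregular plurals (people, children) would need a dictionary.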


4.2.2 Runtime Analysis

This section describes how to trace element types of collections at runtime. This is useful if no static type information is available, as described in the previous section.

For each collection attribute, the object types encountered during a run of the program are collected and fed into the interactive tool. This requires the program to be executable. Additionally, extensive test cases must be available, otherwise only a subset of all possible element types will be encountered.

The interactive tool presents the set of object types for every collection attribute. Additionally, the tool highlights all types for which there is no supertype in this set. Formally, these are the minima of the set with respect to the generalization partial order. These minima are good candidates for an element type, especially if there is only one minimum. Presenting minima simplifies the decision if many types were encountered in the collection attribute.
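The minima can be computed with plain reflection. A sketch, assuming the encountered types are available as Class objects (class and method names are hypothetical):

```java
import java.util.HashSet;
import java.util.Set;

// Sketch: find all types in the set that have no proper supertype in
// the same set (the minima w.r.t. the generalization partial order).
class TypeMinima {
    static Set<Class<?>> minima(Set<Class<?>> encountered) {
        Set<Class<?>> result = new HashSet<Class<?>>();
        for (Class<?> t : encountered) {
            boolean hasSuperInSet = false;
            for (Class<?> u : encountered) {
                // u is a proper supertype of t
                if (u != t && u.isAssignableFrom(t)) {
                    hasSuperInSet = true;
                    break;
                }
            }
            if (!hasSuperInSet)
                result.add(t);
        }
        return result;
    }
}
```

For example, if a collection attribute was observed holding Integer, Long and Number objects, only Number would be highlighted as a candidate element type.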

4.2.2.0.1 Implementation.

The instrumented code causes a static method traceTypes to be executed whenever a collection attribute changes its contents. The class TypeTracer maintains a static data structure containing all element types and key types for all attributes, as well as the minima of these type sets. This information is continuously written to a log file. The interactive tool can read this log file and display the information.
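The core of such a tracer could be sketched as follows (the class name follows the text, but the body is a simplified assumption; the actual TypeTracer also records key types, computes minima, and writes the log file):

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Sketch of the TypeTracer idea: record the classes of all elements
// ever observed in each collection attribute.
class TypeTracerSketch {
    private static final Map<String, Set<Class<?>>> elementTypes =
        new HashMap<String, Set<Class<?>>>();

    // Called by instrumented code whenever the attribute changes.
    static void traceTypes(String attribute, Collection<?> contents) {
        Set<Class<?>> types = elementTypes.get(attribute);
        if (types == null) {
            types = new HashSet<Class<?>>();
            elementTypes.put(attribute, types);
        }
        for (Object element : contents)
            if (element != null)
                types.add(element.getClass());
    }

    static Set<Class<?>> typesOf(String attribute) {
        return elementTypes.get(attribute);
    }
}
```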


4.2.3 Byte Code Analysis

This section describes how type information can be extracted from java byte code. This technology and its implementation (called Superwomble) were developed by Daniel Jackson and Allison Waingold at MIT. This section outlines the parts of their paper [JW99] related to type information, together with experiences from practical experiments with the tool.

Superwomble is a powerful reverse engineering solution. It generates object graphs from nothing but java byte code. Object graphs are, roughly speaking, a subset of UML class diagrams. They feature classes with generalization relationships and associations between them. The graph is finally fed into a tool named dot, which produces a nice layout for the graph.

One of the tricky parts of this tool is the detection of element types for object containers, which is exactly what this whole chapter is about. How this works is explained using the Company-Person example from section 4.1.

Suppose company is a variable of type Company and person of type Person. A typical program around this example would probably contain a statement like this:

company.employees.add(person);
The operation add takes an argument of type Object, but is called with a variable of type Person. This is a good hint that the element type of employees is Person.

The same works for objects returned from the container. The expression

person=(Person)(company.employees.iterator().next());
strongly suggests the element type Person.

Another highlight of Superwomble is that container classes are detected even if they don't implement java.util.Collection. In fact, the declared type is not considered at all. Instead, heuristics are applied to decide whether a class is an object container or not.

The tool was used on several parts of both the OCL toolkit and the net-linx code, and it produced good and reliable results.

Integration of Superwomble results into the interactive tool should be possible, since the object graph is exported to a human-readable text file. However, this task is outside the scope of this paper.

4.2.4 Comparison

This section provides a comparison between the three approaches presented in the sections above.



                           Source Code   Byte Code       Runtime
                                         (Superwomble)
  Source code required     yes           no              yes
  Required code quality    fairly        fully           up and
                           parseable     compileable     running
  Availability of results  intermediate  good            good
  Reliability of results   good          very good       intermediate
  Application effort       low           low             high
  Implementation effort    low           high            very low
    (starkly subjective)
  Availability             LGPL          binary at       LGPL
                                         no cost


Runtime analysis requires the source code to be instrumented beforehand. Only byte code analysis requires no source code. This argument is weakened by the fact that the type information is to be inserted into the source code anyway. However, byte code analysis may cover libraries which aren't available in source code but provide useful type information about other parts of the program.

Anyway, byte code analysis requires that fully compilable source code exists somewhere, even if it's not available to the user. Source code analysis even makes do with incorrect source code, as long as the signature data is parsable (method headers etc.) and method bodies keep their brackets balanced. Most demanding on source code quality is runtime analysis, which requires a running system with complete test cases.

Source code analysis is most demanding on the ``beauty'' of the implementation. To deliver results, it requires some kind of getter/setter methods for the container attributes. By contrast, byte code and runtime analysis even work for public container attributes manipulated from outside the class.

For runtime analysis, the reliability of the results depends heavily on the completeness of the test cases. If the test cases are insufficient, the results may be wrong. Byte code analysis provides the best reliability; it is more difficult for poor-quality code to fool the analysis.

Runtime analysis also requires the most application effort from the user. The system must actually be run. In particular, all runtime requirements (libraries, database, configuration etc.) must be available.

The assessment of implementation effort is very subjective to this paper. Both source code and runtime analysis require parsing and instrumenting java source code, which was already built for the runtime verification of OCL constraints. Thus the implementation effort in this paper was low. Byte code analysis is something completely different.

Finally, availability is about whether it is allowed to use, review and adapt the implementation. According to [FSF00], the difference between LGPL and binary at no cost is the same as between free speech and free beer.

4.2.5 Summary

Three approaches for acquiring type information have been presented. Runtime analysis is implemented and fully integrated into the project. Source code analysis is not yet implemented, but should be easy to add. Experiences with byte code analysis were drawn from the tool Superwomble [JW99], which is fully implemented but not integrated into the OCL toolkit, and thus not ready to use.

Adding up the scores, byte code analysis is probably the best. However, most of the criteria listed above are largely orthogonal, so adding up scores might not be sufficient for a decision. Each application may emphasize different criteria, so a universal solution is not available.

All approaches have one thing in common: they are not perfect. Thus, they cannot be used directly in the type checker of the OCL compiler. The intermediate step of the @element-type tags is necessary to allow corrective intervention by a human user.


Ralf Wiebicke 2005-11-25