Section: Method guides
Date: 1998

Guide to UML

Document contents

Purpose of the guide
Purpose of UML
Version
Verdict
When to use
History
Definition
Initialisation
Analysis
Design
Construction
Reverse engineering
Support and other resources
Contact information

Purpose of the guide

The purpose of this document is to assess how much value UML can deliver to the process of developing computer systems. The document does not just seek to assess how well UML meets the criteria set out in the OMG RFP - it measures what UML delivers against what could potentially be delivered by a diagrammatic system specification and description language.

The result of this assessment is a high-level verdict and a set of standard marker sizes for use in our UML tool evaluations. The guide also aims to provide a clear and comprehensive explanation of the (sometimes rather confusing and contradictory) concepts in UML. Where concepts are so confusing that their true meaning or syntax cannot be determined, we say so.

Purpose of UML

UML (unified modelling language) is intended to be an analysis and design language, not a full software development methodology. It specifies notations and diagrams, but makes no suggestions regarding how these should be used in the software development process. Nevertheless, we will refer to it as a 'method' in this article, because this is the standard terminology for the CASE tools service.

UML aims to address a surprisingly wide range of software types, from standard information recording and retrieval systems (IS systems), to embedded, real-time and even safety-critical systems. Previously, it had been thought that different methods were required to address different market segments, so this attempt at integration is relatively ambitious.

UML is said to address the modelling of manual, as well as computerised, parts of systems, and to be capable of describing current systems, as well as specifying new systems.

The authors claim that UML can be used for business process modelling as well as information system modelling, but we will only evaluate its suitability for information systems modelling in this guide.

Version

In this guide, we review version 1.1 of UML, which was added to the Object Management Group's (OMG's) list of adopted technologies in September 1997. Although version 1.2 of UML had yet to be finalised when this guide was written, there is a definite policy that it will only address simple editing errors. Therefore, the vast majority of comments in this guide apply to version 1.2 as well as 1.1. A new version of this guide will be produced when version 1.3 is released.

Verdict

A big step forward

UML is a major step forward from the OMT, Booch and OOSE methods it replaces. UML can specify more of your application than its predecessors, and is more formally defined; it even includes an execution model. As a result, UML CASE tools should be able to generate more of your application than OMT, Booch or OOSE tools. There is also significant support for modelling distributed concurrent processing systems.

UML will become the OO modelling standard

Furthermore, the support of most of the major players in the application development field has been assembled behind UML, and OMG approval has been gained. CASE vendors are moving to UML in droves. Ovum therefore predicts that UML will become the standard OO modelling language. Reducing the number of modelling languages that CASE tool developers and users have to worry about should reduce costs and allow effort to be concentrated on developing better CASE tools and systems. As a by-product, the IT world now has (for the first time) an agreed definition of some of the fundamental concepts of object orientation, including non-obvious concepts such as frameworks and facades. The impact of this should not be underestimated: semantic confusion is one of the biggest time-wasters in the IT world. The UML initiative also gives the IT world an on-line tool interoperability standard - defined as a set of CORBA IDL interfaces, through which any component of a model can be addressed.

...but some disappointments

The development of a new method is a process fraught with difficulty, so it is no surprise that Ovum felt that UML 1.0 was unsatisfactory in several areas. Version 1.1 is a clear improvement on its predecessor, but considering the fact that it is now an OMG standard, we are a little disappointed that more progress has not been made. Some aspects of the problems identified in version 1.0 have been resolved, but progress is not yet sufficient to allow us to remove any of our concerns from the list:

incomplete coverage. UML does not yet allow you to specify complete applications. Particular weaknesses are limitations in its ability to specify conditional and iterative behaviour in interface and process design; no official syntax has yet been adopted for processing expressions, conditions or invariants; and there is a significant lack of any syntax for specifying the properties of domains. We do welcome the initial tentative moves made towards the adoption of a standard syntax for most of these concepts in version 1.1
incomplete formality. UML is still not fully formally defined. The current syntactical (grammatical) definition is still self-referential (UML concepts are used in the definition of UML) and incomplete. However, the addition of OCL rules (relating to how well-formed the product is) is to be welcomed. The semantics are informally and incompletely defined using natural language; the semantic definitions can themselves be self-referential. Some progress towards resolving the self-referential nature of UML syntax and semantics has been made by separating out the concepts used to define UML into a distinct section of the definition. However, the syntax and semantics of these 'core' elements still need to be properly defined using an abstract logical language.
no process. The UML authors decided not to recommend a system development process in UML because, they say, it was even more difficult to agree on a standard process than a standard language. It was therefore not possible to agree a language and a process within the project timescale, and indeed this was out of the scope of the OMG's request for proposals. Ovum feels that outline process guidelines should have been included - even if the best that could be done was to describe a number of alternatives. This would have helped users to understand the UML, and might also have helped the UML authors to simplify the language
logical/physical confusion. Ovum believes that a good method should help you to keep information about what a system should do (expressed in a logical model), separate from information on how that functionality should be implemented in a specific implementation environment (expressed in a physical model). Of course, the same modelling concepts can be used to create logical and physical models - indeed this is the most elegant approach - but the method should insist that any given model is declared to be either logical or physical. In line with this approach, UML allows key concepts such as classes to be used in both logical and physical models, but unfortunately provides no unambiguous way of declaring whether a given model is logical or physical. (Where concepts that might support this distinction are defined it is not clear that they imply a distinction between implementation dependence and independence - see Definition/semantics, Analysis/types/classes and General facilities/traceability relationships). The situation is further confused by the fact that UML's syntactic rules fail to prevent physical-only concepts being included in the same model as logical-only concepts. We are told that this is because some people do not want to make this distinction. We say that some - indeed many - relationships between logical-only and physical-only model elements are simply not sensible
superfluous complexity. UML syntax is more complex than is necessary to model the behaviour that it covers. This complexity is a consequence of several factors: merging methods without eliminating concepts that are close substitutes; the inappropriate use of meta-model inheritance; confusion between logical and physical concepts; inclusion of too many concepts specific to particular programming languages; and a failure to take advantage of the central position of object concepts in the UML meta-model. We welcome the fact that UML 1.1 simplifies UML 1.0 syntax in several areas (for example, the definition of collaborations, by demoting some model elements to stereotypes, and by removing some inherited capabilities through constraints), but much more simplification is possible
end-user reviewability. It would be easier to check analysis diagrams with end users (for example, as part of the process of quality assuring a specification) if the standard symbols were made more meaningful, and if less use was made of textual adornments
no specific support for generation. UML does not provide specific abstractions to support the translation of logical into physical models or the generation of code from these models. There was no clear consensus on how this should be achieved. However, a well thought-out conceptual approach does exist in the Shlaer-Mellor method
incomplete real-time and safety-critical support. Although UML provides some basic facilities, it does not provide fully-developed support for capacity and performance specification. No specific facilities are provided to support safety-critical systems
immaturity. UML is still maturing. Substantial changes to classes/types, signals and messages may be found between versions 1.0 and version 1.1/1.2. Changes in version 1.3 may well be restricted by the fact that all such changes have to be approved by the revision task force, although in many ways a reduced rate of change is a disadvantage as many substantive problems remain to be addressed
rough edges. Version 1.1 is a little better than 1.0 (in particular many discrepancies between the notation and semantics guides have been resolved), but numerous inconsistencies remain, and it is still very difficult to understand. We are extremely surprised that the OMG has not seen fit to 'quality assure' the UML document before adopting it as an international standard. We understand that the revision task force hopes to resolve most of these issues in version 1.2, and a glance at a draft document suggests they are making significant progress.

Versions 1.2, 1.3...and beyond

The UML revision task force plans a release (UML 1.2) incorporating editorial corrections in March/April 1998. The next significant release (UML 1.3) described as a 'technical cleanup' will be in the third quarter of 1998. A minor revision release (UML 1.4) is scheduled for 1999.

It is acknowledged that significant work (beyond 1.1/2) is still required in the areas of reviewability and support for real-time and safety-critical systems, but this work has not, as yet, been linked to a specific version. We hope that the real-time work will be carried out within the context of the revision task force, so that the continued integrity of the standard is ensured.

Finally, it is with great pleasure that Ovum can report that support for the translational approach to software engineering is being considered. The originator of Softeam's hypergenericity method has a seat in the revision task force, and discussions are ongoing with Steve Mellor, one of the originators of the Shlaer-Mellor method (see the Guide to Shlaer-Mellor and the Objecteering evaluation). We further commend the fact that the task force is considering the adoption of a formal syntax for UML expressions, possibly via a separate OMG request for proposals (RFP). We also understand that an RFP has been issued for a batch data interchange format.

When to use

The UML specification is still developing. We feel there are too many problems with version 1.1 of UML to view it as a mature release.

Consequently, we cannot yet recommend its use in large-scale, mission-critical projects (although organisations building such systems might well wish to start pilot programs). Such projects would still be better off using a structured method, one of the more mature object-oriented/structured method hybrids or, for real-time systems, one of the better developed object-oriented real-time methods.

If you are designing smaller scale, less critical systems using one of the less capable object-oriented methods, such as OMT, OOSE or Booch, you should consider switching to UML now. Your users will already be familiar with the basic concepts and UML's additional formality and coverage should give you sufficient advantage to overcome the disadvantages of early adoption.

If you are designing smaller scale, less critical systems using one of the more mature structured or hybrid methods, the learning curve will be steeper and the gains less dramatic. We would suggest that you delay your decision until early 1999, even later than we suggested in our first UML evaluation, reflecting the fact that version 1.1 has addressed fewer of 1.0's problems than we had hoped. By early 1999, version 1.3, which we hope will be a mature release, will have been issued for approximately six months, and the first tool implementations should be available. The 1.4 specification should also be available, though it will not yet have been implemented by any CASE tool.

Whatever the scale and importance of your application, if you wish to build real-time systems or adopt a fully translational approach to software development, you may have to wait even longer for proper UML support - but you should watch our method guide update for the outcome of the next phase of development, and to see which tool vendors are extending and enhancing UML.

History

Prior to the inception of UML, four methods (OMT, Booch, OOSE and Shlaer-Mellor) had become established as the dominant object-oriented analysis and design methods. The methods formed two groups: Shlaer-Mellor, a method with a firm belief in the generation of systems from a logical specification and specific constructs to support it, and the other three methods, which had no particular methodological support for generation, and whose authors were ambiguous about its benefits. These latter methods seemed to favour the progressive elaboration of high-level specifications into physical designs that are then 'filled in' with code. Ovum has previously characterised these two approaches as 'translationist' and 'elaborationist', and argued the advantages of the 'translationist' approach (see Ovum's Guide to Shlaer-Mellor).

Three object-oriented methods

UML has been created by merging the notation and syntax of OMT, Booch and OOSE (the second group of methods) into a single language and then gathering feedback from the user community. Input from the Catalysis method (on types and collaborations) has also been significant, as has input from Jacobson, Rumbaugh and Booch's work on workflow and business process modelling. We also believe that we can discern a significant influence from SDL (see Ovum's Guide to SDL). Prior to their merger, each of the three main methods had their strengths and weaknesses. Indeed, we have previously commented that these techniques provided a less complete and formal specification of system requirements than the structured methods (such as Information Engineering and SSADM) that they have largely replaced.

...are merged to form UML

The unification was carried out under the auspices of Rational Software, which now employs the methodologists responsible for all three methods: Grady Booch (originator of the Booch method), Jim Rumbaugh (originator of OMT), and Ivar Jacobson (originator of OOSE). Rational sees it as a community effort. It formed a supporters group, called the 'UML Partners', which included Digital Equipment, Hewlett-Packard, I-Logix, ICON computing, MCI Systemhouse, Microsoft, Oracle, Unisys and many others.

UML is submitted to OMG

Rational then decided to try to make UML a standard. With this in mind, it submitted version 1.0 of UML to the OMG (Object Management Group) in response to their request for proposals for an object-oriented analysis and design meta-model and specification exchange standard. Several other submissions were made to OMG, including, most prominently, the Open Group (Platinum et al, not the offshoot of the Open Software Foundation) submission and the IBM/ObjectTime submission. Each of these had its strengths and weaknesses. The Open Group, for example, suggested a standard software development process as well as a language. IBM's submission proposed a fully separate method-independent meta-meta-model, as well as a method-dependent meta-model.

With astonishing political dexterity, Rational sought and gained Microsoft's support for its submission (UML is to form one of the information models for the Microsoft Repository) and, from its strong position (Microsoft's support, plus that of Booch, Rumbaugh and Jacobson), proposed that the best elements of all the other OMG submissions be wrapped into UML (with the notable exception of the Open Group's process). By doing so, it gained the public support of all the other proposers for a joint submission. The merged method (UML version 1.1) was created by a multi-company task force from all the OMG submissions. It has now been adopted by the OMG.

Community ownership and control

Reflecting the increasing emphasis on community ownership, Cris Kobryn from MCI Systemhouse was appointed chair of the UML 1.1 semantic task force. Since the OMG approval of UML 1.1, control has passed to an OMG committee, the 'UML Revision Task Force', also chaired by Cris Kobryn.

Definition

UML 1.1 is defined in two volumes: UML semantics, and UML notation. The formal definition spans both documents. A glossary is provided for further explanation. Language extensions necessary to encompass business process modelling and the Objectory software development process are dealt with in further volumes, as is the definition of the CORBA interface facility (which allows tools to access each other's UML models). We do not cover these extensions in this review.

Approach to definition

The UML semantics volume describes the abstract syntax of the method. (This means that it specifies all the UML concepts, together with the relationships that can be established between them.) It also defines the semantics (meaning) of the concepts and their relationships. The UML notation guide describes the symbols that can be used to represent UML concepts, and thus defines part of the concrete syntax of the language. The other part of the concrete syntax, the rules governing the combination of notational elements to form diagrams, is taken to be implied by the abstract syntax in the semantics volume.

The notation guide also provides some further explanation of the meaning of UML concepts, together with some syntactic examples. Unfortunately, it sometimes conflicts with UML semantics. In this review, we worked from both volumes. Where inconsistencies occur, the UML definition says that you should use the UML semantics version, unless otherwise specified. We follow this convention: where inconsistencies are important, we comment on them.

Abstract syntax

UML concepts and their relationships

The abstract syntax of UML is defined in an abstract and self-referential manner. The main aspects of the abstract syntax are defined in a meta-model expressed as a standard UML class diagram. Each UML concept is represented by a class. The relationships that can be established between instances of UML concepts are described by UML associations and aggregations. So, for example, the meta-model states that a UML operation may have several parameters by drawing a class box for each of the concepts 'operation' and 'parameter', and drawing an associative relationship between them with the appropriate cardinality. Rules that are difficult to express on a UML class diagram are expressed using an object-oriented meta-language developed by IBM, the Object Constraint Language (OCL).

At least, this is the theory. In practice, OCL is sometimes used to express meta-associations that could easily be expressed on the meta-model UML class diagram. This leads to apparent contradictions - for example, a state machine can be attached to any meta-model element according to the meta-model diagram, but only to classifiers and behavioural features (such as operations) according to the OCL constraints. We are told that the constraints always override the meta-model diagram, but this is confusing, to say the least.
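The two readings of the state machine rule can be contrasted in a short Python sketch. This is our own illustration, not part of the UML definition: the class names follow the concepts named above, 'OtherElement' is a hypothetical stand-in for any model element that is neither a classifier nor a behavioural feature, and both attach functions are assumptions made for the example.

```python
class ModelElement:
    def __init__(self, name):
        self.name = name
        self.state_machine = None  # the diagram lets any element own one


class Classifier(ModelElement):
    pass


class BehavioralFeature(ModelElement):
    pass


class OtherElement(ModelElement):
    """Hypothetical element that is neither of the two kinds above."""


def attach_by_diagram(element, machine):
    """Reading 1: follow the meta-model diagram only."""
    element.state_machine = machine


def attach_by_constraint(element, machine):
    """Reading 2: apply the restriction stated in the constraints as well."""
    if not isinstance(element, (Classifier, BehavioralFeature)):
        raise TypeError(
            f"{type(element).__name__} may not own a state machine"
        )
    element.state_machine = machine
```

Under the first reading, attaching a state machine to an OtherElement succeeds; under the second, it is rejected. A tool builder has to know which reading applies, which is why we find the current presentation confusing.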

Figure 1 Part of the UML meta-model

This extract says: 'an operation can have many parameters'.
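As a rough illustration of what this extract expresses, the fragment can be written down in Python. This is a sketch under our own assumptions: the class and attribute names are chosen for readability rather than taken from the UML definition, and the 'many' cardinality is represented simply as a list.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Parameter:
    """Meta-model concept 'parameter' (illustrative naming)."""
    name: str


@dataclass
class Operation:
    """Meta-model concept 'operation'; the list models a 0..* association."""
    name: str
    parameters: List[Parameter] = field(default_factory=list)


# An instance of the meta-model fragment: one operation with two parameters.
transfer = Operation("transfer", [Parameter("amount"), Parameter("target")])
print(len(transfer.parameters))
```

An operation with no parameters is equally valid here, which is what a lower bound of zero on the association would imply.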

Model elements and inheritance

The key concepts in UML (classes/types, states, even relationships) are ultimately derived via an inheritance hierarchy from a very general concept of a 'model element'. (To be precise, there is an even higher level concept, 'element', that need not concern us here.) A model element corresponds roughly with Ovum's concept of an 'item type'; for example, the UML meta-model class 'message' is a sub-type of 'model element'. When you construct a diagram, you are creating instances of model elements like message (strictly instances of any non-abstract sub-type of model element, because the hierarchy has many levels and typically only the lowest level in any branch can be instantiated). These model element instances are equivalent to 'items' in Ovum's terminology.

The inheritance hierarchy has syntactic impact - the sub-types of a UML concept inherit all the syntactic facilities (the permitted relationships) of its supertypes. For example, model elements can have associated names, so a class, as an indirect sub-type of model element, can also have a name. (See Analysis/types and classes.)
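The naming example can be sketched in a few lines of Python. The hierarchy below is deliberately simplified (the intermediate levels of the real meta-model are omitted), and the class names are our own shorthand rather than the definition's.

```python
class ModelElement:
    """Top of the simplified hierarchy: every model element can be named."""

    def __init__(self, name):
        self.name = name


class Classifier(ModelElement):
    """Intermediate level; adds nothing in this sketch."""


class UmlClass(Classifier):
    """Indirect sub-type of ModelElement, so it inherits the name facility."""


account = UmlClass("Account")
print(account.name)
```

Nothing in UmlClass mentions names, yet every instance has one. The same mechanism passes down every other facility of the super-types, wanted or not.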

A more typical approach to defining a meta-model would be to separate out the more abstract concepts (such as a model-element) into a meta-meta-model that is instantiated into a meta-model. In fact, UML is said to have a meta-meta-model - the OMG's Meta Object Facility (MOF). This would imply some degree of overlap with concepts such as 'model element', were the MOF appropriate to this application. In fact, the MOF was set up to model very physical concepts (such as those in CORBA) at a meta-meta level, and thus seems rather at odds with some applications of UML, since UML can be used to represent logical or business concepts.

One disadvantage of UML's multi-layered 'inheritance' approach is that you may inherit unanticipated - and inappropriate or redundant - characteristics. For example, UML 1.1 says that use cases are 'classifiers' (class-like things). We cannot believe that all the syntactical and semantic properties of classifiers (including the ability to have operations, attributes and state models) are applicable to use cases. Many more examples will be encountered and commented on in this guide. In some cases, inappropriate features are removed by rules relating to how well-formed the product is; these are expressed as UML constraints. We think that the need to remove facilities in this way suggests that the current UML inheritance hierarchy needs significant revisions - it does not currently represent a valid classification of UML model elements.

A more extreme manifestation of this problem is that any syntactical capability available to a model element is also available to the syntactical capability itself. What does this mean in practice? It means, for example, that it is perfectly legal in UML to define a namespace of a namespace (see General facilities/visibility). It is very difficult to see what real-world meaning could be ascribed to such a syntactical monstrosity.

A brief tour of the inheritance hierarchy

Not all model elements are 'model elements with namespaces' (for example, operations and attributes are not), but those that are require the names of any model elements they contain to be unique - see General facilities/identity and Naming. Some 'model elements with namespaces' can be generalised (see General facilities/re-use). These are model elements that can be subject to inheritance relationships.
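The uniqueness requirement on namespaces can be sketched as follows. This is a minimal illustration, assuming a simple dictionary of owned elements; the Namespace class and its method names are hypothetical and do not come from the UML definition.

```python
class Namespace:
    """A model element that owns other named model elements."""

    def __init__(self, name):
        self.name = name
        self._owned = {}

    def add(self, element_name, element):
        # Names must be unique within one namespace.
        if element_name in self._owned:
            raise ValueError(
                f"'{element_name}' is already defined in namespace '{self.name}'"
            )
        self._owned[element_name] = element

    def names(self):
        return sorted(self._owned)


billing = Namespace("Billing")
billing.add("Invoice", object())
billing.add("Customer", object())
```

A second 'Invoice' in Billing would be rejected, while a different namespace remains free to define its own 'Invoice'.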

The importance of the 'classifier' concept

Some elements that can be generalised are classifiers - basically model elements that share most of the capabilities of classes. Use cases, nodes, types, components, subsystems, data types and classes are all classifiers (direct or indirect sub-types of classifier in the meta-model). The concept is important because many syntactical capabilities, such as the ability to have attributes, operations and associations, are granted to all classifiers.

The importance of containment

The notion of containment was implied in the above description of namespaces, and is important to an understanding of the UML meta-model. Some model elements are related to other model elements using a strong form of association called composite aggregation. These model elements cannot exist without the enclosing model element. For example, operations and attributes cannot exist outside the context of a classifier such as a class or type.
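A minimal Python sketch of this containment rule follows. It is an illustration under our own assumptions: the constructor check and the delete method are hypothetical ways of expressing that a part cannot exist without its enclosing element, not facilities defined by UML.

```python
class Operation:
    def __init__(self, name, owner):
        # A part must always be created inside an enclosing classifier.
        if owner is None:
            raise ValueError("an operation needs an owning classifier")
        self.name = name
        self.owner = owner


class Classifier:
    def __init__(self, name):
        self.name = name
        self.operations = []

    def add_operation(self, name):
        op = Operation(name, owner=self)
        self.operations.append(op)
        return op

    def delete(self):
        """Deleting the whole removes its parts as well."""
        for op in self.operations:
            op.owner = None
        self.operations.clear()


account = Classifier("Account")
deposit = account.add_operation("deposit")
```

Here 'deposit' is only reachable through 'account', and an attempt to construct an operation without an owner fails.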

Objects are central - or are they

UML's 'object' concepts: classes, types (indeed, all 'classifiers') stand at the centre of a UML model in the same way as they do in any object-oriented modelling language. Object-oriented languages regard most concepts as directly dependent on classifiers, but UML is not entirely consistent in allowing properties to be transmitted through this dependency. Some concepts (such as state behaviour) are indeed transmitted through dependency on classifiers. If you sub-type a classifier, you bring its behaviour with it. In other cases, separate syntactic provision is made. For example, in the case of refinement relationships, you can refine each model element separately (see General facilities/traceability), and in the case of UML's packaging mechanism (see General facilities/hiding detail) you can package every model item separately. Once again, we have unnecessary facilities in the meta-model that are likely to cause confusion.

Models and views

A set of UML model elements (strictly a set of instances of UML model elements), related as permitted by the meta-model associations, constitutes a UML model of a system. UML says that all the model elements in the model must be at the same level of abstraction (for example, use case, analysis or design) but provides no mechanism for indicating the level of abstraction. UML has no notion of the completeness of a model - although most practitioners would say that a complete model must include (the information content of) a class diagram and either a set of sequence diagrams that together reference all the objects, or a state diagram for each object. Indeed, in UML 1.1 (unlike UML 1.0) it is not clear whether a model can cover more than the information content of a single diagram - it says that a model 'specifies the system from a certain viewpoint and level of abstraction'. The scope of the model concept thus depends on the definition of the word 'viewpoint', which could well mean a view in the sense of 'state view' or 'architectural view', but unfortunately is not defined.

UML also defines the concept of a view element or 'view' (as opposed to a viewpoint), which it describes as 'a textual or graphical representation of one or more model elements'. As defined, it deals with presentation concerns, and should therefore not be part of the abstract syntax. We assume that it is meant to be a generalisation of the concept of a diagram. However, if this is the case, why are all the model elements in a view element not constrained to come from the same model?

To coin a phrase, we think UML is in a 'model muddle' that needs to be resolved as soon as possible.

How well defined

Theoretically, UML syntax is undefined

The fact that UML concepts are used to define the main elements of UML syntax means that definitions are circular (aggregation is, for example, defined using aggregation relationships) and this leaves UML formally undefined. The problem is mitigated, to some extent, by the fact that most object-oriented practitioners already understand the meaning of the concepts (classes, inheritance and associations) that are involved, and the fact that the model elements required to use UML are separated out into a distinct part of the definition. However, to ensure the syntax is fully defined, this core should be syntactically and semantically defined using a well established general purpose logical language.

The use of inheritance in the UML definition gives rise to another significant problem. The UML definition states that a sub-type always inherits the union of the associations possessed by its super-types. However, when UML is used to define UML, we find rules relating to how well-formed the product is (which are effectively expressed as UML constraints) that, in order to remove inappropriate syntactic capabilities, subtract associations (or other features) from sub-types in the meta-model. This is a contradiction - if constraints are allowed to subtract associations from sub-types (or more generally override all inherited facilities) then the definition of inheritance (or constraints) should make this clear, and it does not. In theory, this leaves all model elements affected by such rules (interfaces and subsystems are two important examples) syntactically undefined.
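The contradiction can be made concrete with a small Python sketch. It is illustrative only: the feature names and the particular features removed from Interface are assumptions made for the example, and the 'removed_by_rule' attribute is a hypothetical way of recording what a well-formedness rule takes away.

```python
class Classifier:
    # Features granted to every classifier (illustrative subset).
    associations = {"attributes", "operations", "state_machine"}


class Interface(Classifier):
    # Hypothetical well-formedness rule that subtracts inherited features.
    removed_by_rule = {"attributes", "state_machine"}


def inherited(cls):
    """Union of the associations declared by cls and all its super-types."""
    result = set()
    for klass in cls.__mro__:
        result |= vars(klass).get("associations", set())
    return result


def after_rules(cls):
    """The same union, minus whatever the well-formedness rules remove."""
    return inherited(cls) - getattr(cls, "removed_by_rule", set())


print(sorted(inherited(Interface)))
print(sorted(after_rules(Interface)))
```

The two functions give different answers for Interface. The inheritance rule as stated yields the first; the well-formedness rules imply the second. Until the definition says which one wins, the syntax of the affected elements is open to either reading.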

UML syntax is contradictory

Finally, UML 1.1, though better in this respect than 1.0, still contains far too many contradictory syntax rules, some of which arise from the general problems remarked on above. For example, comments (the UML equivalent of notes) are sub-types of model elements according to the meta-model diagram, but sub-types of view elements according to the UML semantics text - model elements and view elements have rather different syntactic capabilities. Many more inconsistencies are identified in the course of this guide.

How much duplication

UML has a high degree of syntactic redundancy. There are many alternative ways of describing a specific system structure or behaviour. For example, despite the fact that use cases are fundamentally a service-oriented concept (see Initialisation/use cases), their syntactic capabilities mean that you can give a use case attributes and operations, and thus use it in some senses as if it were a class. It is not clear what this would mean. Cris Kobryn, chair of the UML Revision Task Force, says that this redundancy is because UML is designed to support multiple processes. We do not believe that the need to support many processes implies a technical need for redundancy in the language - though of course, it may make the political task of agreeing a standard easier. Such redundancy may not bother the methodologist, but it confuses the user considerably. It can also lead to people attaching different meanings to the different possible ways of representing a given concept. Several examples of redundancy are presented in the course of this guide.

Figure 2 Extract from UML meta-model

Since model elements can have names, classes can too

Concrete syntax

UML concepts are made concrete by allocating one of three representations: a graphical symbol, textual representation, or 'tool hyperlink' representation. A diagram is the concrete expression of a 'view' onto a UML model.

Semantics

Logical/physical confusion

UML provides no clear way to define the level of abstraction to which a model relates. We show that UML does not clearly define logical-only (implementation independent) model elements, or define relationships that imply a clear logical/physical division in Analysis/types and classes and General facilities/traceability relationships. In one sense there is nothing wrong with this - a generalised modelling language could be applicable across many different domains.

However, if a language does not provide specific concepts to support different levels of abstraction, it should require the level of abstraction of any given model to be defined, since otherwise the model is effectively meaningless. Indeed, we feel that it should suggest standard levels of abstraction. Unfortunately, UML provides no specific place for you to declare whether a complete UML model represents things in the real world (such as people and objects), logical abstractions of those things (such as you would find in a logical application model) or physical program code. You could attach a UML comment to a model to indicate this, but we think that such an important concept needs a dedicated 'slot'. Standard abstraction levels should include 'wholly logical' (no necessary relationship between the model concepts and any implementation environment concepts) and 'wholly physical' (all model elements have 1:1 mappings to implementation environment concepts).

If you allow a language to be used at different levels of abstraction, you should not pollute it by defining some concepts (for example 'nodes' - see Design/deployment diagrams) that are specific to one level of abstraction (physical-only in this case). If you choose (for reasons that escape us) to pollute your language in this way, you should certainly ensure that the well-formedness rules for models prevent the use of the specific concepts at the wrong level of abstraction. In UML, there are no syntax rules to stop you including processor nodes in a logical specification.

Furthermore, if you provide model elements designed to implement traceability relationships between logical and physical models (one interpretation of UML 'refines' relationships - see General facilities/traceability relationships) you should separate these from the rest of the meta-model definition to prevent terminal confusion among readers of the document.

Current and required confusion

UML also fails to provide a standard mechanism for defining whether a model represents a description of a current system, or a model of a future, desired system. Again 'comment' could be used, but a specific slot is desirable.
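To make the idea of a dedicated 'slot' concrete, the following Python sketch shows one way a model could carry its abstraction level and its current/required status explicitly, and how a well-formedness check could then reject physical-only concepts in a logical model. It is only an illustration of what we are asking for: the names AbstractionLevel, ModelStatus, ClassElement, Node and Model are hypothetical and are not part of UML 1.1 or of any tool.

```python
from enum import Enum


class AbstractionLevel(Enum):
    WHOLLY_LOGICAL = "wholly logical"
    WHOLLY_PHYSICAL = "wholly physical"


class ModelStatus(Enum):
    CURRENT = "current system"
    REQUIRED = "required system"


class ClassElement:
    """Hypothetical stand-in for a concept usable at any level of abstraction."""
    physical_only = False

    def __init__(self, name):
        self.name = name


class Node(ClassElement):
    """Hypothetical stand-in for a physical-only concept such as a processor node."""
    physical_only = True


class Model:
    def __init__(self, name, level, status):
        # The two 'slots' are mandatory, so no model can omit them.
        self.name = name
        self.level = level
        self.status = status
        self.elements = []

    def add(self, element):
        # Assumed rule: no physical-only concepts in a wholly logical model.
        if self.level is AbstractionLevel.WHOLLY_LOGICAL and element.physical_only:
            raise ValueError(
                f"{element.name!r} is physical-only and cannot appear in "
                f"the {self.level.value} model {self.name!r}"
            )
        self.elements.append(element)
```

With such slots in place, adding a Node to a model declared as wholly logical fails at once, whereas a comment attached to the model could not be checked by a tool in this way.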

Formality and completeness

Even if you assume a certain level of abstraction and a purpose for your model, you find that UML does not provide you with a formal definition of the meaning of UML concepts and their relationships. Instead, the meaning of the concepts is explained in free text. These explanations are generally clearer than in UML 1.0, but in many areas (for example, the explanation of super-ordinate and sub-ordinate use cases) the text is opaque or insufficiently precise; in some areas (for example composite objects - See Analysis/class type diagrams), definition of meaning is missing or incomplete. Examples are provided in the notation guide, but these are essentially unchanged from version 1.0 and are of variable value.

Semantics and the inheritance hierarchy

In principle, the meaning of UML concepts, like their syntax, should be transmitted via the meta-model inheritance hierarchy. In practice, it is not clear that this is intended - there are several places in the meta-model where the meaning of sub-types seems inconsistent with the meaning of their super-types. For example, it is stated that dependencies (see General facilities/traceability) cannot go across levels of abstraction; however, one of the sub-types of dependency, 'refines' relationships, is specifically designed to connect model elements across levels of abstraction.

Execution rules

A language's execution rules govern the order of execution of well-formed models. They are thus an important part of the semantics of a language.

UML's execution model is informally defined using structured text. The model defines the order of execution of each time-dependent activity. Although a substantial piece of work, UML's execution model is incomplete. For example, the behavioural implications of combinations of nested state machines and nested objects are not fully worked through. The lack of a defined syntax for the detail of processing expressions in UML means that the execution model cannot take account of the impact of the evaluation of these expressions on behaviour. Other examples can be found under Analysis/state diagrams.

Overall, we conclude that UML's semantics are significantly less well defined than its syntax, which itself is inconsistently, imprecisely, and incompletely defined.

Process

UML does not aim to define which of its diagrams should be used at different stages of the software development process. It is content to state that it is designed for use in an iterative, incremental approach to system development.

During the last year, a number of alternative processes have been offered by different vendors. We do not analyse these here; they do not form part of the standard. Instead, we offer the main URLs: Icon Computing's Catalysis method (www.iconcomp.com/catalysis), SINTEF's TIMe 'The Integrated Method' (www.sintef.no), and Rational's Objectory Process (www.rational.com).

General facilities

In the Definition section we stated that the way UML is defined means that many facilities apply to many or all item types. We call these shared facilities 'general facilities' and discuss them in some detail in this section.

Identity and naming

UML 1.1, in contrast to UML 1.0, does not clearly state whether model elements must or may have an identifier or name. This would appear to be an unintentional omission. If names are allocated, UML requires uniqueness within an item's namespace, which for most items is set by the enclosing package (see Hiding detail) or classifier such as class/type. This means that names relating to items contained in packages or classifiers may not be unique across packages or classifiers. UML addresses this by allowing compound names to be formed in the same way as operating system path names, by the concatenation of the local name with the enclosing classifier name, package name, its enclosing package name and so on. For example: 'warehouse. despatch. despatch note'.

In the case of items that are contained by classifiers (for example, attributes and operations), the name space is set by the classifier; the full path name requires the classifier name as well as package names.
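The naming rules above can be sketched in a few lines of Python. This is a minimal illustration under our own assumptions, not a rendering of the UML meta-model: the Element class and its add and path_name methods are hypothetical, and the separator is a parameter that defaults to the full stop used in the example above.

```python
class Element:
    """A named model element that may also act as a namespace (hypothetical)."""

    def __init__(self, name, owner=None):
        self.name = name
        self.owner = None
        self.members = {}
        if owner is not None:
            owner.add(self)

    def add(self, element):
        # Names need only be unique within the enclosing namespace.
        if element.name in self.members:
            raise ValueError(f"duplicate name {element.name!r} in {self.name!r}")
        self.members[element.name] = element
        element.owner = self

    def path_name(self, separator="."):
        # Concatenate the enclosing names, outermost first, like a file path.
        parts = []
        node = self
        while node is not None:
            parts.append(node.name)
            node = node.owner
        return separator.join(reversed(parts))
```

Two packages may each contain a 'despatch note' without conflict, because the path names differ; a second 'despatch note' in the same package is rejected.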

Hiding detail

Like most system specification languages, UML provides more than one mechanism for hiding detail. The mechanisms provided fall into two main groups: simple information hiding mechanisms, and mechanisms specially designed to support multi-level hierarchical decomposition.

UML shares its main information hiding mechanisms with other object-oriented languages. It uses operation signatures (the specification) to hide method processing (the realisation), and encapsulates attributes through methods. However, UML extends these ideas - the 'specification/realisation dichotomy' - to UML's object concepts. So, types (see Analysis) are seen as specifications for classes. Types have operations, classes have methods.

The UML authors say that 'realises' and specification/realisation 'relationships' are not true UML relationships but instances of meta-associations qualified by role names. Analogous 'represents' relationships can be set up between operations or classifiers and collaborations to say that one model element expresses the same information as a second model element, but at a lower level of detail or possibly logical/physical abstraction.

You will note from the above that UML is not clear whether specification/realisation and 'realises' relationships imply an overview/detail relationship or an implementation dependent/independent distinction. On the one hand, the words specification and realisation, and the fact that UML talks about the relationships crossing levels of abstraction, seem to imply a logical/physical division, although the terms logical and physical are never used. On the other, the hiding of a detailed (method) processing specification by an operation signature seems more like detail hiding.

Furthermore, the current semantics of 'represents' and 'realises' relationships seem very similar: we think they should both have the same name or a semantic distinction should be implemented.

Packages

The most important hierarchical aggregation/decomposition mechanism in UML is the 'package'. Most UML item types, UML classifiers such as classes/types, relationships and even dynamic item types (such as state machines and sequences) can be grouped into 'packages'. (It is not clear why model elements that are dependent on classifiers need to be able to be grouped separately from classifiers.) This grouping creates a strong 'ownership' relationship: if the package is deleted, so are the items it contains or 'owns'. However, packages themselves can be defined before the items they contain and may themselves 'own' other packages. Packages may be used to conduct a top-down decomposition of a problem, as well as a bottom-up aggregation, although we think that the latter is the way in which they will usually be used.

Packages encapsulate the items they own. What items inside and outside the package can see is determined by adding visibility adornments to the items, and defining certain inter-package or item/package relationships. The rules involved in setting up package encapsulation, and controlling the ways in which it may be violated, are intricate to the point of being impenetrable. A great deal of effort is put into defining ways in which a touchstone of object-orientation, encapsulation, can be broken - mainly, it seems, because these mechanisms are present in some of the less disciplined object-oriented programming languages such as C++.

Packages also set a scope within which names must be unique (see Identity and naming).
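The strong 'ownership' relationship described above can be illustrated with a short Python sketch. The Item and Package classes below are hypothetical names chosen for the illustration; they ignore visibility and every other package rule, and show only the cascade on deletion.

```python
class Item:
    """Hypothetical owned model element."""

    def __init__(self, name):
        self.name = name
        self.deleted = False

    def delete(self):
        self.deleted = True


class Package(Item):
    """Hypothetical package: it may exist empty and may own other packages."""

    def __init__(self, name):
        super().__init__(name)
        self.owned = []

    def own(self, item):
        self.owned.append(item)
        return item

    def delete(self):
        # Deleting the owner deletes everything it owns, recursively.
        for item in self.owned:
            item.delete()
        self.owned.clear()
        super().delete()
```

Deleting a top-level package in this sketch marks every nested package and item as deleted, which is the behaviour the text attributes to package ownership.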

Class decomposition

Packages are not the only aggregation/decomposition mechanism - classifiers (though not use cases) can themselves contain other nested classifiers. In the semantics guide meta-model, the facility is represented by the ability for classes to be related by composite aggregations, implying that a containment relationship holds at the instance level, and that deletion of a container classifier instance will delete any contained classifier instances. Nesting at the class level is possible, but this means that creation or deletion of instances of the outer class will automatically create/delete linked instances of the inner class. Beyond this, the relationship between the behaviour of the outer and inner objects is ill-defined in UML. The semantics guide also says that 'some aspects' of the dynamic behaviour of the inner objects will be 'inherited' by the outer object. The only response to this is 'what aspects?'

Subsystems

In UML 1.1 a further aggregation/decomposition mechanism has been introduced: UML subsystems. A subsystem is a hybrid between a package and a class. Like a package, it owns other model elements, can be hierarchically nested, and has no behaviour of its own. Like a class, it can have operations (but not attributes), and can be instantiated. In addition, a subsystem must have a defined minimum modelling content that traverses two levels of abstraction (or possibly detail) - this technique inherits UML's logical/physical confusion. Subsystems must have a specification (high level or logical) part consisting of a use-case model and operations, and a realisation (detailed or physical) part consisting of a series of collaborations. The collaborations must together realise every use case and every operation in the subsystem (see Initialisation/Use Cases and Analysis/collaborations). This is confusing because there are two service-oriented concepts here, use cases and operations, and it is not at all clear what their respective roles or permitted relationships are - UML says subsystem use cases represent complete services, whereas subsystem operations represent fragments of services. This might mean that the use cases invoke the operations to provide the services - but UML doesn't say so.

UML could be argued to have an over-provision of decomposition mechanisms. However, the UML definition says that the three techniques are complementary - you should use packages when passive grouping is required, subsystems when you are undertaking top-down development, and class decomposition when the container needs to be instantiable. This breaks down somewhat, since subsystems are instantiable, and classes can offer the facilities offered by subsystems.

Traceability and dependencies

UML implements traceability by allowing explicit 'refines' relationships to be set up between model elements at different 'levels of abstraction' (for example, analysis and design). Such relationships may be taken to imply a logical-physical division between the linked elements, or by contrast, a detailing relationship - UML does not clarify what it means by 'levels of abstraction'. UML is equally unclear as to the meaning of specification/realisation and 'represents' links. Refines relationships and specification/realisation links seem to cover the same (rather blurry) ground - yet another area of redundancy in the UML meta-model.

Further confusion is introduced by the further concept of a 'trace', which it defines as a traceability link between model elements in different models. Since a refines relationship occurs between model elements at different levels of abstraction, which must (by the definition of the UML model concept) be in different models, the notion of refines and trace relationships can be seen to be equivalent. The only distinctions appear to be that trace relationships 'often deal with requirements' and their direction 'can usually be ignored'.

Trace and refines relationships are both sub-types of dependency and can therefore be assumed to inherit its semantics. Unfortunately, dependencies are explicitly prohibited from connecting different levels of abstraction (see the notation guide), and are required to be directional - both are direct contradictions of the properties of their sub-types.

Dependencies can be attached to any model element (via many-many relationships) thus allowing trace and refines relationships to link between any model elements and model element sets. This does not seem to make sense. For example, what would it mean to say that an object was refined by a state model, or an attribute by an operation? We think that there should be a mid-point between the original (very restricted) scope of these relationships in UML 1.0, and the current unlimited scope.

Further sub-types of dependency (in addition to refines and trace relationships) include 'usage' and 'binding'. The former is a dependency in the traditional sense of dependencies - it says model element x uses model element y and therefore cannot function without it; it may be affected by changes to it. Interestingly, in any modelling language such dependencies are fully derivable from other modelling elements (if dependencies exist there must be other model level links, probably signals or operation invocations, between the two model elements concerned). Thus, the concept may be redundant, or may be useful as a way of summarising such interactions for presentation purposes.

Re-use

UML supports simple re-use through the ability to create new instances of classifiers such as classes and types. Instances can also be created of related concepts, such as signals, parameters, associations and attributes.

In UML 1.1, instances do not have to belong to the same classifier for all of their lifecycle. They can become instances of a different type, in which case they take on all the characteristics of that type and lose all the characteristics of the previous type.

Terminology related to instantiation is a little confusing in UML. To address this problem, we will refer to instances of classes and types as objects. Instances of other UML concepts will be referred to as either 'attribute instance', 'component instance' or using a special purpose name if UML provides one. The term 'object' (no italics) will refer to the general object concept as in 'object-orientation'.

UML's information hiding facilities, in particular its 'specification/realisation' relationships, also encourage re-use. By defining high-level interfaces or specifications, they reduce the amount of information you need to know about a software design or physical part if you want to re-use it in a different context.

In common with most object-oriented languages, UML supports inheritance via generalisation/specialisation relationships. Signals, associations, stereotypes, packages, subsystems, models and all classifiers such as classes/types can participate in inheritance relationships since this facility belongs to a super-type of all of these concepts, imaginatively known as a 'generalisable element' (itself a sub-type of model element). Inheritance must be additive - a sub-class inherits the union of all the facilities of its superclasses. Redefinition of attributes and operation signatures is forbidden, though you can change the implementation (method) of an operation (see polymorphism, below). Operations with different names are considered different operations. Multiple inheritance is permitted, as are special inheritance types, such as overlapping and incomplete sub-types. Multiple inheritance is strictly controlled - UML says that a sub-class may not inherit attributes or operations with the same name from its parents.
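The additive rule and the name-clash restriction on multiple inheritance can be expressed as a simple check. The Python function below is a sketch under our own simplifying assumptions: features are reduced to bare names, and the function name inherit and its input format are hypothetical. In particular it treats a feature reached twice through a common ancestor as a clash, a case on which the text above is silent.

```python
def inherit(parents):
    """Return the union of the parents' features, or raise on a name clash.

    parents maps each (hypothetical) parent class name to the set of
    attribute and operation names that it offers.
    """
    origin = {}
    for parent, features in parents.items():
        for feature in sorted(features):
            if feature in origin:
                raise ValueError(
                    f"{feature!r} is inherited from both "
                    f"{origin[feature]!r} and {parent!r}"
                )
            origin[feature] = parent
    # Inheritance is additive: the sub-class receives every feature.
    return set(origin)
```

A sub-class of 'Vehicle' and 'Asset' receives the features of both, but the check fails if both parents offer a feature called 'name'.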

Despite all this well-intentioned syntactical control, UML fails to tackle one of the fundamental problems with inheritance - its real world meaning. Until this central problem (which may require the identification of several different forms of inheritance) is tackled, the definition of well-formedness rules that assist, rather than shackle, developers will not be possible. The fact that the UML meta-model definition overrides UML's own inheritance rules using constraints is indicative of this problem.

UML supports the notion of templates - the ability to create a model element instance (Ovum: item) that is specifically designed for controlled re-use. To create a template you take the model element that you wish to re-use, and replace context specific properties by abstract parameters. When you use the template, you create a new model element based on the template, and substitute context specific properties for the parameters. In principle, in order to enforce controlled re-use, a CASE tool should prevent you from making any other changes to the new model element, unless you separate it from its template. The provenance of the new model element is indicated by a link (a binding relationship - see Traceability and dependencies above) to the template from which it was created, annotated with the substitutions.
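The template mechanism can be sketched as follows. This Python fragment is an illustration only: the Template and BoundElement classes, their fields and the bind method are hypothetical names, and the sketch records the binding link and its substitutions without attempting the tool-level enforcement mentioned above.

```python
from dataclasses import dataclass


@dataclass
class BoundElement:
    name: str
    attributes: dict      # attribute name -> concrete type name
    template: "Template"  # the binding link back to the source template
    substitutions: dict   # annotation on the binding: parameter -> actual


@dataclass
class Template:
    name: str
    parameters: tuple     # abstract parameter names, for example ("T",)
    attributes: dict      # attribute name -> type name or parameter name

    def bind(self, new_name, **substitutions):
        # Every parameter must be substituted, and nothing else may be.
        if set(substitutions) != set(self.parameters):
            raise ValueError(f"expected substitutions for {self.parameters}")
        attributes = {
            attr: substitutions.get(type_name, type_name)
            for attr, type_name in self.attributes.items()
        }
        return BoundElement(new_name, attributes, self, dict(substitutions))
```

Binding a 'Queue' template with T replaced by 'Order' yields a new element whose 'head' attribute has type 'Order', and which still points back to the template it came from.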

Re-use will also be encouraged by the existence of the CORBA interchange facility for UML tools.

Figure 3 Inheritance relationship notation

Communication mechanisms and adaptability

UML 1.0 has two key communication concepts: signals and operation calls. Both are sub-types of a more general concept called 'request'. A message is the use of either in the provision of a specific system service (or more correctly interaction - see Analysis/Collaborations for a more detailed explanation).

Signals and operation calls

The two concepts seem very similar. UML says that operation/method calls are the invocation of an operation/method on one classifier (such as a class/type) by another classifier. Signals are sent by one classifier, and listed as received (at reception points or 'receptions') by another classifier - like operation calls, they invoke processing on the receiving classifier. Both have names and the same signature syntax (see State diagrams). Signals, in contrast to operation calls, can be received by more than one classifier even if the classifiers are in different inheritance hierarchies, and can thus be re-used across classifiers. However neither signals nor operation invocations can be truly broadcast - the target classifier and instances must always be specified.

Given their similarities, we are not sure why signals and operation calls are regarded by UML as separate concepts. It is argued that the ability for signals to exist independently of classes allows their re-use, and is particularly helpful in exception handling. We agree, but do not think that you should be re-using signals in logical models. Processing, like data, should be normalised to a single class. The class should act as the vehicle for re-use. Generic exceptions can be dealt with by referring the error message up the class inheritance tree to a suitably generic class. We are also told that signals are always asynchronous and operation calls may be synchronous or asynchronous. We feel that the fact that operation calls may be either synchronous or asynchronous weakens the argument. In our view, it is probable that the two separate concepts have been introduced because similar, but not identical, concepts are used by different object-oriented methods and programming languages.

We think the two concepts should be merged into one, which can be qualified to indicate whether a communication is synchronous or asynchronous.

Polymorphism

UML signals and operation calls have been designed to increase the adaptability of your specification, and hence of your final system. Both exhibit polymorphic behaviour.

If you define an operation/method for a classifier like a class/type, a sub-type of the classifier may redefine the way the operation is implemented (ie its method) to suit the characteristics of the sub-type, but still use the same name. (It may not however redefine its signature). This means that you can call the operation for all or any objects belonging to any classifiers in the inheritance tree by using the one name - you do not have to call a different function to perform the operation on objects belonging to different classifiers. Furthermore, if you want to add to the range of sub-types in future, you will not need to invent a new name - the new sub-type can have a redefined method with the same name. This means that your calling logic can remain unchanged. In this way polymorphism increases the adaptability of your specification.
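The effect on calling logic is easy to show in code. The Python sketch below uses hypothetical account classes of our own invention: each sub-type redefines the method behind the same operation name and signature, and the caller is written once.

```python
class Account:
    def monthly_fee(self):
        return 10


class StudentAccount(Account):
    def monthly_fee(self):  # same name and signature, different method
        return 0


def total_fees(accounts):
    # The caller names the operation once and never inspects the sub-type.
    return sum(account.monthly_fee() for account in accounts)


class PremiumAccount(Account):  # added later; total_fees is left untouched
    def monthly_fee(self):
        return 25
```

Adding PremiumAccount required no change to total_fees, which is the adaptability benefit described above.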

As signals have an existence independent of the classifiers that send them, they do not use the classifier inheritance hierarchy to implement polymorphic characteristics. Instead (since they are sub-types of generalisable elements), you can specify separate inheritance hierarchies of signals. Polymorphism is implemented as follows: if a classifier sends or responds to a specific signal, it must send or respond to all sub-types of the signal. Further, if a transition on a state machine (see Analysis) triggers on receipt of a given signal, it must trigger on receipt of any sub-types of that signal. This is just another way of achieving the same effects as polymorphism of operations.
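Signal polymorphism can be sketched in the same way. In the Python fragment below, the signal classes, the Receiver class and its on and receive methods are all hypothetical. The rule that a reception for a signal also accepts its sub-types comes from the text above; choosing the most specific reception when several match is our own assumption.

```python
class Signal:
    pass


class Alarm(Signal):
    pass


class FireAlarm(Alarm):
    pass


class Receiver:
    def __init__(self):
        self.receptions = {}  # signal type -> handler

    def on(self, signal_type, handler):
        self.receptions[signal_type] = handler

    def receive(self, signal):
        # Walk the signal's own inheritance chain, most specific type first.
        for signal_type in type(signal).__mro__:
            handler = self.receptions.get(signal_type)
            if handler is not None:
                return handler(signal)
        raise LookupError(f"no reception for {type(signal).__name__}")
```

A receiver that declares a reception for Alarm will therefore also respond to a FireAlarm, without the receiver's own class hierarchy being involved.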

Substitutability

An implication of the fact that a sub-type is the union of the properties of its super-types is object substitutability (as well as operation substitutability or polymorphism). An instance of a sub-class may be used wherever an instance of a super-class is expected.

Communication reliability

Regrettably, UML provides no specific support for defining the expected reliability level of communications. This will restrict the applicability of UML to the design of safety-critical systems.

Data types and expressions

UML defines no standard data types for attributes, nor does it require you to use any specific syntax for concepts usually specified in textual expressions. Therefore no syntax is mandated for processing conditions, guard conditions, pre- and post-conditions, invariants, constraints or the detailed processing code invoked by state transitions.

The UML definition suggests that one option is to use OCL - the declarative object constraint language in which the well-formedness rules are written. OCL is described in the UML OMG Standard Documentation, but it is regarded as one among a number of alternatives (including implementation languages) and the standard's statements about OCL are very tentative. We understand this is because OCL has not yet been proven to be adequate in this capacity. In sum, we do not feel that OCL can yet be described as an official expression language for UML.

Another option is to use a standard programming language such as C++ or Java. Of course if you do this you will lose the language independence of your specification.

We believe that the lack of an official implementation independent syntax for detailed expressions remains a key weakness of UML. Until a language is tested and officially adopted, tool vendors will not expend the considerable effort required to implement a detailed specification language in their tools.

Volumetrics

UML's support for volumetrics is rather weak. For example, it allows you to record the maximum number of objects that a class can have, but not the average number, nor does it allow you to define message frequencies (all of which are important if you want to control performance, especially in the real-time environment).

It was suggested to us that you could use one of UML's general extensibility features - constraints - to add this information. This information certainly could be recorded this way, but provision of a specific concept would make it much easier for CASE tools to separately report and make use of this information (for example, to automatically optimise data designs to reflect expected loads, or in performance testing).

Performance and time dependent behaviour

No support is provided for the specification of performance.

UML provides some support for making behaviour dependent on time. Transitions can be triggered by time events, which fire when the time elapsed from the point of entry to a state exceeds a specified value.
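A time event of this kind can be sketched as a state machine that is polled with the current time. The Python class below is a minimal illustration with hypothetical names (TimedMachine, tick); it assumes time is supplied by the caller as a number of seconds and ignores every other kind of trigger.

```python
class TimedMachine:
    def __init__(self, initial, timeouts, now=0.0):
        # timeouts: state name -> (limit in seconds, target state name)
        self.state = initial
        self.timeouts = timeouts
        self.entered_at = now

    def tick(self, now):
        """Fire the time event if the time spent in the state exceeds its limit."""
        rule = self.timeouts.get(self.state)
        if rule is None:
            return False
        limit, target = rule
        if now - self.entered_at > limit:
            self.state = target
            self.entered_at = now  # the clock restarts on entry to the new state
            return True
        return False
```

With a limit of 30 seconds on a 'waiting' state, polling at exactly 30 seconds does nothing, and the first poll after that moves the machine to the target state.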

Parallel processing

UML supports parallel, as well as single-threaded, processing. Parallel processing is supported by defining execution semantics for special classes (active classes) whose instances and children can queue requests and process them concurrently. The execution semantics of asynchronous calls between objects are also defined, and state transitions are allowed to process in parallel. (See Analysis/State Machines for further description).

Workgroup and configuration management

Support for management control information (for example, date that an item was first defined, who defined it, version and other configuration management information) is lacking. We are told that this is process-related, and therefore falls outside the UML brief. It is also, we are told, difficult to include in a method since the level of granularity at which it should be recorded (against packages, a class or an attribute) is not clear.

UML does not propose any measure of specification complexity. This will make it more difficult to estimate the effort that would be required to implement the system a model specifies.

Customisability

UML offers several extensibility mechanisms, which it says are designed to be easy to implement on a standard CASE tool. Meta-CASE functionality should not be required.

UML allows you to create new sub-types of model element, or indeed of any of the progeny of model elements (these are called 'stereotypes'); add new characteristics to existing model elements or stereotypes (these are called 'tagged values') and to qualify the meaning of an instance of a model element or stereotype (by attaching 'constraints').

The definitions of the latter two concepts are rather confused. It is still not clear whether UML intends that users should be able to apply constraints and tagged values at the meta-model level (thereby changing the properties or meaning of all future instances of the model element) or at the level of a specific instance of a model element (thereby changing the meaning of just that specific instance). UML declares that the former is its intention (p57 of the semantics guide) and then gives examples of the latter when it describes the standard stereotypes provided by UML. It is not clear whether a specific constraint is re-usable across multiple model elements, and how that relationship is distinguished in the UML meta-model from a specific constraint that applies to the relationship between multiple model elements (an example of the latter is the {OR} constraint that, attached to several associations, says that one or another of them may hold). It is still not clear how a constraint is different from a UML comment. The UML definition says that a constraint is written in a constraint language, and then destroys the distinction by saying that this language may be natural language. (It also suggests that a comment has no semantics. What is the point of putting a note on the diagram if no meaning is attached to it?) Finally, it is not clear whether constraints can be used to override the rule that a sub-type inherits all the facilities of its super-types.

There is no mechanism for defining new syntactical rules for any stereotypes (new model elements) you may define - they simply inherit the rules applicable to their parents. Nevertheless, the ability to create new item types at this level of abstraction is a very powerful mechanism. It will require careful tool support.

Initialisation

Strategic planning

UML provides no support for strategic planning of information systems.

Business Process Modelling

UML does provide support for business process modelling. This is outside the scope of the current guide, and so is not scored.

Exploratory analysis

UML supports top-down exploratory analysis via package decomposition, class/type nesting and use case models.

Package, subsystem and class/type decomposition

You can break your system down into packages or preferably subsystems (see General facilities/Hiding detail). The top level subsystem or package represents the system. You then break down these packages or subsystems into further subsystems or packages, and so on until the units involved are simple enough to be represented by types or classes rather than packages or subsystems. The bottom level package or subsystem will enclose these prototypical classes/types. Packages and subsystems can be related using the same inheritance relationships that are available for types and classes; however, class/type associations and properties such as attributes and (for packages) operations are not available. Packages and subsystems can also be related by usage dependencies, to show that the contents of one package uses the contents of another. Please see the diagram below (which shows packages rather than subsystems as no notation is currently defined for the latter).

An alternative means of architectural decomposition is to use nested types or classes. In this case you will be able to record attributes and operations. (However there are problems with the syntax and semantics of this technique, see General facilities/Information Hiding)

Figure 4 Package diagram showing dependencies (dashed lines) and inheritance relationships (solid lines)

Package and subsystem decomposition references specification model elements, rather than defining them, so we award reference markers. However, class/type decomposition is sufficiently expressive to justify a major contribution in the objects/entities and static rules columns.

Use cases

A use case itself (as distinct from a use case diagram) represents a complete system service, including all its variations, viewed from outside the system (i.e. without revealing the service's internal structure). It is thus analogous to the SSADM notion of a user function: 'a service providing automated support for a complete user task'. Use cases are a high level concept - they can be broken down into lower level UML concepts.

Use cases can be used to specify the services available from any subsystem or class according to the text on p93 of the semantics guide, or any classifier according to the diagram on p94. The former seems most likely to be the true intention since if the latter is true they could be applied directly to themselves, which would seem to conflict with a rule that prohibits direct decomposition. They are most frequently applied to the top level subsystem, in which case they specify the services required of, or provided by the system as a whole. If applied to a class they specify how one or more of its operations can be strung together to create a service.

Use case diagrams show services and the users of services

A use case diagram (see Figure 5) shows which entities outside the system boundary can 'communicate with' which use cases within the system boundary by using association lines. Use case diagrams thus straddle the human machine boundary. The entities outside the system boundary are called 'actors' by UML, a term that embraces external systems as well as users. 'Actors' like subsystems are regarded as a sub-type of classifier, and these concepts should thus be regarded as user roles and system types, rather than users or system instances. Many-to-many relationships can be established between actors and use cases, thus allowing access permissions to be expressed (fulfilling the requirements of our organisation column). Actors may be formed into inheritance hierarchies which inherit access permissions, a very useful facility.
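The inheritance of access permissions through an actor hierarchy can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the `Actor` class, its `permitted` operation and the actor and use case names are hypothetical and are not part of UML.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class Actor:
    """A user role (not an individual user), as on a use case diagram."""
    name: str
    parent: Optional["Actor"] = None                    # generalisation to a super-actor
    use_cases: Set[str] = field(default_factory=set)    # direct 'communicates with' links

    def permitted(self) -> Set[str]:
        """Use cases reachable directly or through inherited permissions."""
        inherited = self.parent.permitted() if self.parent else set()
        return inherited | self.use_cases


# Hypothetical roles for a hire service.
member = Actor("member", use_cases={"browse catalogue"})
borrower = Actor("borrower", parent=member, use_cases={"hire item"})
```

Here `borrower.permitted()` includes 'browse catalogue' without that association being drawn a second time, which is the saving the actor hierarchy offers.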

The use case diagram also shows relationships among use cases. A use case can make use of another use case, or extend another use case. The 'uses' dependency says that one use case will use the behaviour indicated in the other use case. The 'extends' dependency says that any given instance of a use case may use (subject to conditions specified in the extension) the behaviour specified in another use case. There is a problem with the 'extends' dependency - although the UML says a condition should be specified, no meta-model slot is provided (whereas such a slot is provided for example for a state model guard condition).
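The difference between the two dependencies can be sketched in code, treating each use case as a function. This is an analogy of our own rather than anything UML prescribes, and the use case names and the `has_overdue_items` condition are hypothetical.

```python
def check_membership(trace):
    """A use case that others make use of."""
    trace.append("check membership")


def charge_late_fee(trace):
    """A use case whose behaviour is only sometimes required."""
    trace.append("charge late fee")


def hire_item(trace, has_overdue_items):
    # 'uses': the behaviour of the other use case is always performed.
    check_membership(trace)
    trace.append("hire item")
    # 'extends': the other behaviour is performed only when a condition holds.
    # In this sketch the condition lives in the code; as noted above, the
    # UML meta-model gives it no slot of its own.
    if has_overdue_items:
        charge_late_fee(trace)
    return trace
```

Calling `hire_item([], False)` and `hire_item([], True)` gives the two instances of the use case that the 'extends' dependency allows for.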

Figure 5 Example of a use case diagram

The purpose of uses and extends relationships is to prevent repetition, and begin to analyse services into sensible parts. The relationships are represented by generalisation arrows. We think the relationship should instead be a usage dependency, as such a connection would seem to have little to do with generalisation.

Figure 6 Use case dependencies

...and who can use them

The way in which a service is used may vary from one occasion to another, according to decisions made by the user. Use case diagrams are drawn to represent all the ways a service or services can be used (the path taken through the service may vary from use to use according to the decisions of the user), not the specific way that a service is used.

UML also defines the concept of a use case instance diagram. This records a specific way in which a service can be used - that is, an example of the use of the service 'with all decisions definite', otherwise known as a 'scenario'.

Use cases can be broken down into collaborations

Use cases can be broken down to other lower level UML items, typically collaborations (see Analysis). The collaboration shows how classifiers like classes/types collaborate to provide the service. If the collaboration is expressed as a scenario then many collaborations will be needed to fulfil a use case.

As we noted above, use cases cannot be directly decomposed into other use cases. If you want to decompose a use case you must do this indirectly, by decomposing the subsystem or class to which it is attached. Subsystems, for example, may be decomposed into other subsystems or classes, either of which may have more detailed use cases attached. UML says that consistency must be maintained between the more detailed (sub-ordinate) use cases and the more general (super-ordinate) use cases; indeed, it goes into a lengthy and confusing peroration on the subject, which it concludes with the ultimately confusing statement that 'sub-ordinate use cases ... are actors of each other'. We think that clarity rather than mental gymnastics is required. (The key point is that when you are constructing a use case diagram of a subsystem that happens to represent just part of a system, other classes or subsystems outside that part are depicted as - take on the local role of - actors in that subsystem's use case diagram.)

UML says that you can attach interfaces (see Analysis/Classes/Types) to use cases to specify a sub-set of the behaviour represented by the use case, and corresponding interfaces can be attached to an actor that communicates with the interface. It is difficult to see where this would be useful.

We have only discussed the main relationships that use cases can participate in. UML regards use cases as sub-types of classifier, and they are therefore able to participate in all the relationships that are defined for classifiers (except where explicitly restricted by rules relating to how well-formed the product is). Making a service-oriented concept a sub-type of an architectural concept does not seem a good idea. Not only is it confusing, but it is also bound to lead to inappropriate inheritance. In this case for example use-cases can have operations and attributes - UML tries, but in our view fails, to find a convincing use for this facility.

Defining the scope of the system

Use case diagrams show the boundary of the system from the services perspective. On a class/type diagram, the system boundary can be determined by the fact that types and classes outside the boundary (whether human or machine) are represented as actors, which are visually distinguishable from types and classes.

The UML provides no specific mechanism for distinguishing the scope of an entire information system (manual plus automated elements) from the scope of the automated system. This could be achieved through UML's general extensibility mechanisms by stereotyping packages or classes/types, but it would be better if a specific concept was provided.

Requirements management

In UML, textual requirements are stereotypes of comment, and, presumably, can have many-to-many relationships with model elements. We say presumably because this was the case in UML 1.0, but unfortunately the meta-model link is missing in UML 1.1, and the text only describes a one to many relationship.

Use cases specify the services required from a system and can thus be regarded as specifying service-related functional requirements (but not non-functional requirements, or requirements that hold across several services). We award a medium sized marker in the requirements column.


Analysis

Types and classes

Types are (probably) logical

The concept of class lies at the heart of most object-oriented methods; UML is no exception. What is surprising is that UML chooses to split the concept in two. It distinguishes between 'types', which represent specifications, and 'classes', which show how these specifications are fulfilled. This suggests that types can be regarded as logical, and classes as physical. In fact, in UML the distinction is not quite as clear as this. UML says that classes can be used to model implementation specific or implementation independent concepts - if you want to say a class is implementation dependent you stereotype it thus: <<implementation class>>. No such stereotype is provided for an implementation independent class, suggesting that types are the appropriate concept. But the UML semantics volume does not clearly state that types are implementation independent concepts, though we understand this is the intention. Instead UML distinguishes types by the fact that they support dynamic classification, whereas implementation classes do not.

Classes may be logical or physical

How you should make use of the class/type distinction is also unclear. There is nothing to stop you putting classes and types together in one model (at a single level of abstraction, e.g. 'logical') and using a specification/realisation relationship to say that one or more classes realise a type. This probably should be interpreted as the class detailing the specification of the type. Equally you could create separate type and class models and have the class model refining the type model. This probably should be interpreted as a physical class model refining a logical type model. It gets even more confusing when you realise that you could do the reverse (create a type model that refines a class model) or create a class model that refines a class model. All these creations would be syntactically correct.

Syntactic similarities, but one difference

As types are stereotypes of class (see General facilities/Extensibility Mechanisms), the syntactic and semantic characteristics of the two concepts are similar. However, there is one key difference, and this lies in the area of behavioural features - UML's name for operations, methods, and signal receptions.

The difference is that a type can only have operations, whereas a class can have both methods and operations - a method realises an operation in the same way as a class realises a type. (This distinction leads to a meta-model contradiction, which leaves the type concept formally undefined. In losing the ability to have methods types have lost a meta-model association possessed by classes. But UML does not allow meta-model associations to be lost or gained in stereotypes).
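The operation/method split has a rough analogue in languages that separate abstract declarations from their implementations. The sketch below uses Python's abstract base classes as a stand-in for a UML type; this mapping is our own assumption for illustration, and the names `AccountType` and `LedgerAccount` are hypothetical.

```python
from abc import ABC, abstractmethod


class AccountType(ABC):
    """Plays the part of a type: it declares an operation but has no method."""

    @abstractmethod
    def balance(self) -> int:
        """Operation: a signature only, with no processing defined."""


class LedgerAccount(AccountType):
    """Plays the part of a class that realises the type."""

    def __init__(self, entries):
        self._entries = list(entries)

    def balance(self) -> int:
        # Method: realises the operation declared on the type.
        return sum(self._entries)
```

The type cannot be instantiated on its own, whereas the realising class can; `LedgerAccount([5, -2]).balance()` exercises the method that realises the operation.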

Both types and classes can have signal receptions (see General facilities/communication mechanisms).

More on operations, methods and signal receptions

For operations and signal receptions, only the signature (operation/signal name, its arguments and their data types) and declarative pre- and post-conditions can be defined. (Pre- and post-conditions are equalities that must hold before and after execution. No syntax is proposed for these concepts, although OCL could be used. Also, strangely, on p40 of the semantics guide they are described as stereotypes of constraint - which means they could be applied anywhere, even to invariants - rather than constraints on operations.)
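Since UML proposes no syntax for pre- and post-conditions, any concrete rendering is a matter of choice. The following sketch shows one way they might be attached to an operation, using a small Python decorator of our own devising; `contract` and `withdraw` are hypothetical names, and nothing here is UML or OCL syntax.

```python
import functools


def contract(pre, post):
    """Wrap an operation so that its pre- and post-conditions are checked."""
    def wrap(operation):
        @functools.wraps(operation)
        def checked(*args):
            if not pre(*args):
                raise ValueError("pre-condition failed")
            result = operation(*args)
            if not post(result, *args):
                raise ValueError("post-condition failed")
            return result
        return checked
    return wrap


@contract(
    pre=lambda balance, amount: 0 < amount <= balance,
    post=lambda result, balance, amount: result == balance - amount,
)
def withdraw(balance, amount):
    """Return the new balance after withdrawing amount."""
    return balance - amount
```

The conditions are declarative in the sense used above: they state what must hold before and after execution without saying how the operation is processed.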

If an operation is on a class, it can have a method to describe how it is processed. Here UML suggests that methods should be described using program code. Strangely methods can have signatures as well as operations. This means that, for a class, you could define the signature of an operation twice - once in the operation and once in the method, though both versions are presumably constrained to be the same. This seems to be needless redundancy.

Signal receptions cannot have methods under any circumstances. So how do you define the processing carried out when a signal is received? Signals invoke action sequences (See Analysis/State Diagrams) or lines of program code (presumably only if they occur on classes). Now, operations also invoke action sequences, which leads us to the conclusion that if you eliminate the redundant signature information, methods are effectively the same as action sequences - i.e. one or the other is a redundant concept!

Powertypes and meta-classes

In addition to standard types and classes, UML provides for two (more abstract) concepts: powertypes and meta-classes. Both are stereotypes of classifier.

Powertypes are classifiers whose objects are the sub-types of another class. UML says that, in addition to the stereotype notation, you should indicate a powertype by qualifying the inheritance relationships between the sub-types and their supertype (but is confused as to whether the qualifying model element should be a constraint - page 40 of UML semantics - or a dependency - see the standard elements table). One example would be the class/type 'tree species', whose objects, 'birch' and 'oak', are sub-types of the class/type 'tree type'.

Meta-classes are a Smalltalk concept - they are classes/types with instances that are themselves classes/types. Meta-classes contain operations for creating and initialising new types or classes in the same way as classes contain operations for creating and initialising their objects. They might thus be seen as the object-oriented equivalent of self-modifying code. UML says that, in addition to the stereotype notation, you should indicate the relationship between the classifiers in some manner, but is once again confused as to whether the model element involved should be a constraint (page 40 of UML semantics) or a dependency (see the standard elements table).
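Both ideas can be illustrated in a language that supports meta-classes directly. The sketch below uses Python, where a meta-class is a sub-class of `type`; it reuses the tree example from above, but the code itself, including the `registry` list, is our own illustrative assumption.

```python
class TreeSpecies(type):
    """Meta-class: its instances are themselves classes."""

    registry = []

    def __init__(cls, name, bases, namespace):
        # Operation for initialising a newly created class.
        super().__init__(name, bases, namespace)
        if bases:  # skip the root supertype, which has no bases of its own
            TreeSpecies.registry.append(cls)


class Tree(metaclass=TreeSpecies):
    """The supertype whose sub-types are the objects of the powertype."""


class Birch(Tree):
    pass


class Oak(Tree):
    pass
```

In this sketch `Birch` and `Oak` are at once sub-types of `Tree` and instances of `TreeSpecies`, which is the powertype reading. One wrinkle of the Python rendering is that `Tree` is itself an instance of `TreeSpecies`, so the registry excludes it explicitly.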

We are not sure why meta-classes and powertypes need to be defined as UML concepts - they will only occasionally be used and can be invented, if needed, by appropriate use of standard instantiation and inheritance relationships. Like many other UML concepts, they seem to have been imported uncritically from a programming language.

Class/type diagrams

UML describes only class diagrams in detail; no separate description of type diagrams is given. However, a symbolic representation for types is proposed: they are to be represented as stereotypes of class, like this: <<type>>. (Most other classifiers have their own diagrammatic representation - see Initialisation/Use cases, Design/Deployment diagrams and Design/Component diagrams. We will discuss these classifiers in the appropriate sections. The exceptions are data types and subsystems, for which UML suggests no diagrammatic representation.)

Figure 7 Type and class symbols

As classes and types share similar syntax and semantics it may be assumed that diagrams are otherwise almost identical. The shift in levels of abstraction or detail between type and class diagrams suggests that types and classes will not normally appear (at least in full notation, with all relationships delineated) on the same diagram (though this is allowed) - the refines or realises relationships would make the diagrams too complex. Separate diagrams will probably be used, with refines or realises relationships being expressed as tool hyperlinks.

UML class/type diagrams are full implementations of object-oriented class diagrams. Classes/types can have other classes/types nested within them. Classes/types can be flagged as persistent or non-persistent, and you can define the maximum number of objects that can be created for a class/type. Taken with the ability to define active classes (classes/types whose instances can process concurrently) these facilities should please the real-time community. However, there is no support for specifying random failures in the processing of operations and methods. This restricts UML's usefulness in the design of safety-critical systems, where the impact of random system failures may need to be predicted. Classes/types can have invariants recorded against them - declarative rules that hold good throughout the life of any object. These may be seen as an implementation of Ovum's static rules, but unfortunately UML provides no official syntax - though you can use OCL if you wish.
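As UML offers no official syntax for invariants, the sketch below simply shows the idea in Python: a declarative rule that must hold throughout the life of every object, checked whenever the object's state is about to change. The `Borrower` class and its limit of three items are hypothetical.

```python
class Borrower:
    """A class with the invariant: items on loan never exceed MAX_ITEMS."""

    MAX_ITEMS = 3  # hypothetical business rule

    def __init__(self):
        self.items = []

    @classmethod
    def _check_invariant(cls, items):
        if len(items) > cls.MAX_ITEMS:
            raise ValueError("invariant violated: too many items on loan")

    def hire(self, item):
        # Check the proposed state first, so a failed operation
        # leaves the object unchanged.
        proposed = self.items + [item]
        self._check_invariant(proposed)
        self.items = proposed
```

The rule is stated once, against the class, rather than being repeated in the pre-conditions of every operation that could break it.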

...are a full class diagramming technique

The attributes and methods/operations of classes/types can be listed. The data type of an attribute is specified by defining a link to a data type or class. The class/type the attribute points to is likely to be a UML 'data type', but could in fact be a complex class from the class model (see General facilities). Several attributes can point to the same class, so the UML allows for attribute domains, but puts significant responsibility on the tool for correct and flexible implementation.

...support a rich set of associations

UML class diagrams support a rich set of associative relationships to add to the inheritance relationships described in the General facilities section. Only classes and types can participate in associative relationships, which UML sees as a general category. All associative relationships can be binary or n-ary, but you cannot hold attributes at the intersection of an n-ary relationship. All association relationships have two (or more, for an n-ary relationship) roles, which contain information about the relationship. This may vary according to the direction in which you read it. The information includes cardinality (UML calls this multiplicity), whether the relationship is navigable, whether the objects have a meaningful order, whether the relationship is aggregative (one object belonging to one class/type contains objects belonging to another class/type), and the mysterious 'ischangeable'. The latter states whether the identity of the aggregate can be affected by a change in the identity of the part referenced by the relationship. (For example, if you swapped one person's head for that of another person, you would have changed that person's identity.)
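The information held on an association role can be pictured as a small record. The Python sketch below covers only multiplicity, navigability and ordering; the field names are our own and do not reproduce the UML meta-model, and the two example roles are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Role:
    """One end of an association, read in one direction."""
    name: str
    lower: int                 # minimum number of linked objects
    upper: Optional[int]       # maximum; None stands for unbounded ('*')
    navigable: bool = True
    ordered: bool = False

    def allows(self, count: int) -> bool:
        """Does this number of linked objects satisfy the multiplicity?"""
        return count >= self.lower and (self.upper is None or count <= self.upper)


# A hire has exactly one borrower; a borrower has any number of hires, in order.
borrower_end = Role("borrower", lower=1, upper=1)
hires_end = Role("hires", lower=0, upper=None, ordered=True)
```

A lower bound of zero corresponds to an optional relationship and a lower bound of one to a mandatory one, so cardinality and optionality are carried by the same pair of numbers.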

UML regards associations as generalisable elements. This implies that associations can be sub-typed separately from the classifiers to which they refer. We would like to see a clear explanation of the allowed relationships between association super-types and their sub-types (should they be sub-sets, super-sets or what?) and a clearer indication of why association generalisations are necessary.

In general, UML's approach to associative relationships is a useful merging of structured and object-oriented concepts, but we question whether there is any meaning to an aggregative relationship that permits a cardinality of zero at the master end. UML says this is to deal with the situation where a container object can be destroyed without the objects it contains being destroyed. We feel this is best modelled by a straightforward association.

...but key information is shown using text not symbols

UML's diagrammatic symbols for associative relationships are shown in Figure 8.

Figure 8 UML's diagrammatic symbols for associative relationships

No symbols are provided to indicate cardinality and optionality in the case of non-aggregative relationships. This information must be shown as text. This is an unfortunate limitation, since such relationships are very common and cardinality/optionality information is key. UML allows you to define exclusivity between associative relationships using the {or} constraint.

Figure 9 Class diagram for home multi-media hire service, showing 'person' acting as a template class from which the 'borrower' class is created

Classes (and hence indirectly types) are sub-types of classifier and thence indirect sub-types of model element. Therefore all the facilities of model element are available to classifiers, classes and types including the ability to define dependency relationships with other model elements, and to be grouped by packages. It is intriguing to ponder what the semantic difference is between a dependency and a directed association. There would seem to be some redundancy here.

We judge UML types and classes to make a major contribution in the objects/entities and static rules columns.

Objects

Both types and classes can be instantiated as indeed can other classifiers. Instantiations of classes and types are referred to as objects (in italics in this method guide).

Objects can be represented on class or type diagrams - these are shown using the same notation as types and classes, but with their name underlined, and, if required, an instantiation dependency to their parents. A name can be supplied for the object as well as the class/type, using the underlined syntax 'object name: class/type name'. The possibility that instances may change class or type during their lifecycle is shown by drawing an additional dependency relationship.

We discuss diagrams which are made up entirely of objects below. (Other instances - i.e. instances of classifiers other than type or class - can be represented on instance level versions of the appropriate classifier diagrams - instance level use case, deployment and component diagrams - you simply underline the instance name. In the case of subsystems no notation is suggested. An instance of a data type is a value, which does not need a diagrammatic representation).

A diagram consisting entirely of objects and their relationships is called an 'object diagram'. This represents a snapshot of the objects, and the relationships existing between them, at a particular point in the stream of events that affect a system. UML regards each object diagram as an example of the structural relationships that may exist among a group of objects; when these relationships occur in reality, the names of the objects involved may be different from those shown on the object diagram.

Object diagrams

On object diagrams, associations between objects are represented by association instances, called 'links'. Different objects belonging to the same class/type are represented by placing more than one symbol on the diagram. Information attached to the roles on the parent classes may be suppressed or displayed; multiplicity information, in particular, is usually suppressed - relevant object links are shown by individual lines.

We award object diagrams the same object/entities markers as class/type diagrams, but reduce the static rules score to reflect the fact that the technique cannot (by definition) express full multiplicity information, and hence cannot distinguish mandatory from optional relationships. We also add a minor contribution in the dynamic rules section as a set of object diagrams drawn at different stages in the life of the system will illustrate the system's dynamic rules.

Interfaces

The purpose of interfaces is to collect together operations offered by classifiers (such as classes and types) into coherent services. The interfaces are attached to the classifier using specification/realisation relationships, so the classifier is said to realise the interface. Interfaces can be linked to many different classifiers. This says that each of the classifiers may act as alternatives in realising the interface (where 'alternatives' can imply modelling or run-time options). It does not mean that you can take some operations from one classifier and some from another - indeed each alternative classifier must be able to provide all the operations specified by the interface. This initially seemed to us to be a pity - the collection of operations to form coherent service definitions might very well require the selection of different operations from multiple classifiers. However, you could of course use a type/class to collect the operations (maybe a container class) and then define an interface to that type/class.
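The rule that every realising classifier must supply all of the interface's operations can be shown with Python's structural `Protocol` classes. This is an analogy under our own assumptions rather than UML notation, and `Payable`, `CardTerminal` and `CashDrawer` are hypothetical names.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Payable(Protocol):
    """Interface: a coherent service made up of two operations."""

    def charge(self, pence: int) -> str: ...
    def refund(self, pence: int) -> str: ...


class CardTerminal:
    """Provides every operation, so it can realise the interface."""

    def charge(self, pence: int) -> str:
        return f"charged {pence}p to card"

    def refund(self, pence: int) -> str:
        return f"refunded {pence}p to card"


class CashDrawer:
    """Provides only one operation, so it cannot realise the interface."""

    def charge(self, pence: int) -> str:
        return f"took {pence}p in cash"
```

With these definitions `isinstance(CardTerminal(), Payable)` holds and `isinstance(CashDrawer(), Payable)` does not: a classifier offering only some of the operations is not an acceptable alternative. Note that the run-time check only confirms the operations are present; it does not compare their signatures.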

Interfaces are another sub-type of classifier, and inherit most of the features of classifiers. However, interfaces are not allowed to contain other interfaces or have attributes. They are, on the other hand, allowed to have specification/realisation relationships with other interfaces. It is not clear what this means.

Interfaces are represented using a class box with the stereotype indicator <<interface>> - confusing, as interfaces are in fact sub-types of classifier. The 'realises' relationship from an interface to the classifier that realises it is shown by a dashed generalisation arrow. This is strange, as the specification/realisation relationship in the semantics guide apparently has no generalisation semantics.

Interfaces are class-like objects but their primary purpose is to organise and reference functionality defined in classes. We award reference markers in the entity, function, and interface columns.

Behaviour

Behaviour in the UML sense means all the dynamic properties of objects, including dynamic rules and processing. After a brief excursion into collaborations (which may be seen as a hybrid structural/behavioural concept) the rest of this Analysis section discusses UML's behavioural concepts.

Behaviour can be specified using state machines, which place the emphasis on dynamic rules, and are presented on state transition diagrams. State machines show all possible behaviour of the model element (typically a classifier like a class or type) to which they are attached. They usually show the behaviour of just one model element or classifier.

Behaviour can also be specified using interactions. These emphasise the processing involved in providing a particular service, and its interface with the user or external systems. They therefore show only part of the behaviour of, for example, a given type or class, and may (optionally), in contrast to state machines, only describe a few (or even just one) of the pathways through the sequence of interactions that constitutes the service. (Where just one pathway is shown - where all decision forks are pre-decided - an interaction is called a 'scenario'.) However, they show this behaviour across all the classes/types involved in providing the service.

State diagrams and the diagrams that display interactions (sequence or collaboration diagrams) can be seen as different views onto an underlying behavioural specification. If each set of diagrams is complete, each view can constitute a complete specification of behaviour in itself (although arguably the state machine view is more rigorous). It is helpful to have all three views, however, since they emphasise different aspects of the behaviour.

Collaborations and interactions

UML's collaboration and interaction concepts are intertwined and related to other UML concepts in a complex way. We discuss them together, first at the conceptual level and then at the diagrammatic level, in order to highlight their relationships.

Collaborations

A collaboration is a subset of the classifiers (types, classes etc.) and associations from a system specification, selected to represent just those involved in providing a specific service. The sub-setting is carried to its logical conclusion - properties of the classifiers and associations such as attributes, operations and cardinalities that are irrelevant to the service are omitted, and the resulting cut-down model elements are referred to as roles e.g. class-roles. Collaborations (the concept) as opposed to collaboration diagrams (see below) are largely structural - their only dynamic aspect is that the classes involved are selected as those that provide the service. Collaborations are defined at the classifier level in the semantics guide (no collaboration instance concept is defined and linked to the appropriate concepts under common behaviour) but the UML notation guide refers to their presentation at the instance level on diagrams. Clarification is required of this apparent inconsistency.
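The sub-setting described above can be sketched as a projection over a class model. In the Python below the class model, the feature names and the `class_role` helper are all hypothetical; the point is only that a collaboration keeps the classifiers and properties relevant to one service and omits the rest.

```python
# A toy class model: classifier name -> its attributes and operations.
class_model = {
    "Window": {"title", "bounds", "redraw", "close"},
    "Canvas": {"pixels", "paint", "clear"},
    "Printer": {"queue", "print"},
}


def class_role(classifier, features):
    """Cut a classifier down to the features one service needs."""
    unknown = set(features) - class_model[classifier]
    if unknown:
        raise KeyError(f"{classifier} has no features {sorted(unknown)}")
    return set(features)


# The collaboration for a 'redraw window' service: Printer is irrelevant
# and is left out; Window and Canvas appear only as cut-down roles.
redraw_collaboration = {
    "Window": class_role("Window", {"bounds", "redraw"}),
    "Canvas": class_role("Canvas", {"paint"}),
}
```

Because a role may only select features that the underlying classifier already has, the collaboration stays a projection of the class model rather than a second definition of it.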

A collaboration may have associated behaviours that show the dynamics of how the service is provided. According to UML, it does not have to. These behaviours are specified using 'interactions'.

Collaborations are indirect sub-types of model element via namespace and thus inherit all the facilities of model element, such as the ability to define dependency relationships and to be owned by packages, plus the ability to define a namespace for their contents. It is not clear why collaborations are allowed to define their own namespace - as projections of a classifier model they should inherit their namespace from the class model. (Also since collaborations are non-disjoint partitions of a classifier model, if collaborations define a name space, different collaborations could define different, mutually inconsistent names for the classifiers).

Interactions

Interactions define the messages involved in providing a service. Interactions may be used to specify the dynamics of a collaboration, and must be specified in the context of a collaboration.

Messages are defined by UML as the use of a signal or an operation call in an interaction. It is not clear why this concept is required as well as that of 'request', which is a super-type of signal or operation call in the meta-model. Could this be because request is made an abstract class in the meta-model to force a distinction between signal and message? In which case could this area of the meta-model not be made much more simple by conflating the message and signal concepts?

Application of collaborations and interactions

Collaborations can be attached to classifiers (for example use cases and classes) plus operations to more fully define services they provide. (For the reasons given under General Facilities/Hiding Detail it is not clear whether collaborations are also considered to be more 'physical' than the classifiers and operations they 'represent').

If attached to an operation, the classifiers in the collaboration may only include the data types of the operation's arguments together with the data types of the attributes of the classifier owning the operation. If attached to a classifier, for example a use case, any classifiers in the class model may be included. These rules do not bear close examination - if the classifier were a class then the collaboration would have to represent one or more of its operations. The collaboration would at the same time be constrained to include only the data types of those operations, and be permitted to include any classes it wanted - a direct contradiction.

Collaboration diagrams

Collaboration diagrams may use object diagrams (with all information irrelevant to the service removed) as a base. The fact that a message may be sent to multiple objects may be specified using the 'multi-object' symbol - a single object is linked to this multi-object to act as a source for further messages. This approach allows the diagram to act as a cross check on the multiplicities in the class model. Visibility adornments specific to the interaction (global, local, parameter and self) may be added.

On this skeleton, behaviour may be displayed by labelled messages attached to the links between the objects. Actors (see Initialisation/Use cases) can be shown as the source of messages from the environment, and may have many-to-many relationships with messages. The messages may represent the invocation 'activation' of an operation or signal, or the subsequent 'return' of values. Calls to another operation on the same object are shown by looped messages - an object calling itself. Where the operation called is the same as the operation which issues the request, the result is a recursive call. There are no graphical symbols for iterative calls - we think there should be. Different forms of processing behaviour are shown by different message symbols. These are illustrated in Figure 10.

Figure 10 Message symbols

The messages are named and described by message labels. These have a specified textual syntax that allows you to define the sequence of execution of the messages, including conditional and iterative sequencing, return values and the signature of the message. The sending of a message can be dependent on the completion of several predecessor messages - allowing the expression of synchronisation or rendezvous information to support parallel processing. The fact that iterative and branching behaviour is specified at the level of individual messages makes it very difficult to express some complex dialogues that involve the iteration and conditional execution of groups of messages.

UML also provides a place for the expression of processing logic within a message label, by allowing arguments and, potentially, actions and action expressions to be included, but does not define an official syntax for these expressions. However the data type of a parameter may be class, and thus (if say the class is a business object representing a fragment of an underlying data structure) it is arguable that a parameter may represent quite complex data structures. For example a parameter could be of data type 'order', which itself was defined as a business class whose instances consisted of instances of the 'order-info' class together with its associated order line class instances. When applied to messages from actors to objects, this facility allows you to state that complex data structures, such as users might enter onto database system forms, are to be input into the system. However defining a new complex class and linking it to a message does seem to be quite a lot of effort compared to the approaches that structured methods take to the same problem.

A collection of inbound messages can be adorned with the 'vote' constraint. This means that the effective return value is selected by the majority vote of all the values returned by the individual messages. In UML 1.0 messages could be grouped into named transactions or logical success units. This quite useful facility is missing in UML 1.1. There is no support for performance specification or the specification of time dependent behaviour in collaborations.
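
The 'vote' constraint can be pictured with a short sketch. This is our own illustration in Python, not anything defined by UML: the function name and the decision to reject a tie are assumptions, since UML does not say what should happen when no value commands a majority.

```python
from collections import Counter


def vote(return_values):
    """Return the value reported by a strict majority of the messages.

    Raising ValueError when there is no strict majority is our assumption;
    UML 1.1 does not define the outcome in that case.
    """
    if not return_values:
        raise ValueError("no return values to vote on")
    value, count = Counter(return_values).most_common(1)[0]
    if count * 2 <= len(return_values):
        raise ValueError("no majority among return values")
    return value


# Three redundant senders report a value; the odd one out is outvoted.
print(vote([42, 42, 41]))
```

A facility of this kind is most obviously useful in fault-tolerant designs, where several replicas answer the same request.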

The creation and destruction of an object may be shown by the stereotypes <<new>>, <<destroyed>> and <<transient>>. An example of a collaboration diagram is shown in Figure 11.

Figure 11 Collaboration diagram showing how a window redraw command is processed

UML collaboration diagrams therefore allow you to look at your system from a 'service' perspective. Within this perspective they address our function column, but they do not allow full specification of complex conditional or iterative processes, or of detailed processing logic. We award a moderate contribution in the function column. Since interactions with actors can be shown, they also address our interface and organisation columns. All our requirements for the organisation column are met, so we award a major contribution. However, their difficulty in expressing complex iterative and conditional message exchanges, when added to the fact that no attempt is made to define logical screens, means that we can only award a moderate contribution in the interface column. In addition, collaboration diagrams inherit the object/entity and static rules scores of object diagrams. We award a major contribution in the objects/entities column and a moderate contribution in the static rules column.

Defining patterns and frameworks

UML provides a major service to the IT community by formally defining those most ill-defined of terms: 'pattern' and 'framework'. UML defines a pattern as a template collaboration (but notes that additional textual documentation will be required). The instantiation of a template collaboration to fulfil a specific function can be shown as depicted in the diagram below.

The template collaboration, as the complete entity representing the design pattern, is shown as a dotted ellipse containing the name of the pattern. A dotted arrow (or dotted line depending on which bit of the UML semantics guide you read) is shown from the collaboration symbol to each of the specific objects or classifiers that participate in that specific application of the template collaboration. The dotted arrow thus represents the binding of the parameterised template collaboration classes to local classes. The name on the line states which class in the template collaboration is to be bound to the local class. Other template parameter bindings can be shown using a note box linked to the template collaboration symbol by a dotted line/arrow.
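
The binding of template roles to local classes can be sketched in code. The following is a minimal illustration of the idea only: the class name, the role names and the rule that every role must be bound are our assumptions, and the local class names are purely illustrative.

```python
class TemplateCollaboration:
    """A pattern: a collaboration whose participating classes are parameters."""

    def __init__(self, name, roles):
        self.name = name
        self.roles = frozenset(roles)

    def bind(self, **bindings):
        """Bind each template role to a local class name.

        Requiring every role to be bound, and no unknown role to be named,
        is our assumption rather than a stated UML rule.
        """
        given = set(bindings)
        if given != self.roles:
            missing = sorted(self.roles - given)
            unknown = sorted(given - self.roles)
            raise ValueError(f"missing roles {missing}, unknown roles {unknown}")
        return dict(bindings)


observer = TemplateCollaboration("Observer", ["subject", "observer"])
# Each keyword plays the part of the name written on a dotted binding line.
print(observer.bind(subject="CallQueue", observer="SlidingBarIcon"))
```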

UML defines a framework as a package consisting mainly of patterns. We are interested to see the community's judgement on these definitions.

Figure 12 Representation of a pattern

Sequence charts

Sequence charts emphasise the dynamic aspects of a collaboration or interaction, and de-emphasise the structural aspects. They are derived, with some modifications, from the message sequence chart notation. (This notation achieves its highest level of development in the context of the SDL method - see Ovum's Guide to SDL.)

Sequence diagrams, unlike collaboration diagrams, do not show the associations between participating objects. Instead, they show the sequence of messages graphically, along one dimension of the diagram - you read the order of the messages from the top to the bottom of the chart. The exchange may be quite complex, dealing with the processing of several inputs from the environment, thus constituting a dialogue, rather than a simple transaction. However UML does not state whether the scope of a sequence diagram should be a system transaction, or a user success unit (like an SSADM function), nor does it provide any specific mechanism to allow you to declare its scope, and this leaves one aspect of the meaning of these diagrams undefined.

The objects involved in providing the service are shown as vertical dashed lines, each with a named object symbol at its head. Objects can include external entities, such as actors. Unfortunately, the layout of the diagram means that if a sequence serves several actors, each one would need to be shown separately, and all message flows duplicated. The lines begin where the object is created (the creating message points at the object symbol) and finish when it is destroyed. For emphasis, destruction is marked with a bold 'X'. The vertical lines become long thin rectangles to indicate the period during which a specific object is performing an action (is the 'focus of control') either directly or indirectly via a subordinate procedure. This allows clear diagramming of nested flows of control.

Figure 13 Sequence chart showing focus of control, conditional message sends, recursion, object creation and destruction

Message symbols and labels are as used in collaboration diagrams with the following exceptions. Sequence information and return values are usually omitted from message labels, and processing logic may be drawn down the left-hand side of the diagram, instead of being expressed in the label. Iterations are shown by enclosing the iterated messages and writing a termination condition (for which UML does not prescribe an official syntax) below the enclosure. (Interestingly, the UML semantic guide does not appear to support such iterated groups of messages; the notation guide mentions them, but does not specify how they should be denoted on a diagram.) A message with a transmission time delay that is large in comparison with the execution time of the method or operation it invokes can be shown by a slanting arrow. You cannot declare a specific set of messages as a logical success unit or transaction. Recursive calls are indicated by offsetting a focus of control rectangle to the right of the current focus of control rectangle. Conditional execution and a branching of execution into two concurrent threads are shown by branching arrows with guard conditions (for which UML does not prescribe a syntax). The lifeline of an object may also be divided to show conditional execution.

This implementation of iterative and conditional processing is acceptable where the logical structure of iterations is simple. Where it is more complex, you really need to define a high level diagram that expresses the iteration and conditional execution of low level sequence chart fragments. This is supported in SDL (see Ovum's Guide to SDL), but not in UML.

In our service columns we award the same markers as for collaboration diagrams, for the same reasons. However, only reference markers are awarded in the objects/entities and static rules columns as sequence diagrams have no structural component. We make an award in the organisation column, but reduce it to a moderate contribution, to reflect the duplication required to indicate that multiple actors can use the same service.

State transition diagrams

These are state transition diagrams as referred to in the Guide to evaluations.

State transition diagrams will usually be attached to types or classes to describe the dynamic rules under which they operate, but they may also be attached to any classifier, operation (method) or reception according to the UML OCL rules. (The meta-model indicates that they can be attached to any model element, but we have decided to ignore this, as we despair of ever finding a useful meaning for a state model of, say, an invariant.) Strangely, multiple state behaviour specifications are allowed for a single class etc. It is not clear why this is useful.

A state transition diagram prepared for a type or class will usually show all the behaviour of which the type or class is capable, not just that associated with a given service or scenario. However the behaviour may be expressed at different levels of detail. A state transition diagram for a class will specify the action code which is executed on the transitions (though this may of course be 'hidden' by the CASE tool) whereas a state transition diagram for a type will only specify the order of execution of its operations or signal receptions.

It is not clear why you should want to specify a state diagram for an operation, method, or reception.

A full implementation of Harel state charts

The UML implementation of state transition diagrams is a full implementation of Harel state charts, with some minor modifications. Harel state charts allow you to define dynamic rules by describing the states that a type/class, for instance, may occupy and to define which events can cause transitions between specific pairs of states. They also allow you to define the processing associated with transitions, and, in the UML version, the processing that occurs while you occupy a state.

Support state decomposition

In UML state diagrams, states, represented by round cornered boxes, may themselves contain a state diagram nested within them. A nested state diagram shows the decomposition of a state into the states and transitions of which it is composed. This feature allows you to drill-down from an overview to the detail. You can also deal with situations in which the same set of transitions impact on many different states, as all transitions impacting the shell of the outer state are assumed to impact all the inner states. (By contrast, transitions that hit sub-states within the outer shell are presumed to be transitions to that state alone.)
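
The way a transition on the outer shell applies to every inner state can be sketched as a simple lookup. This is our own illustration with hypothetical state and event names. Searching the innermost state first is an assumption about how conflicts between nesting levels are resolved.

```python
# Hypothetical nesting: 'First' and 'Second' are sub-states of 'Moving'.
PARENT = {"First": "Moving", "Second": "Moving", "Moving": None, "Neutral": None}

# One transition leaves the shell of 'Moving'; one belongs to 'First' alone.
TRANSITIONS = {
    ("Moving", "declutch"): "Neutral",
    ("First", "shift_up"): "Second",
}


def next_state(state, event):
    """Return the state reached from `state` on `event`."""
    # Search the current state first, then each enclosing state in turn.
    candidate = state
    while candidate is not None:
        if (candidate, event) in TRANSITIONS:
            return TRANSITIONS[(candidate, event)]
        candidate = PARENT[candidate]
    return state  # no transition applies, so the event is ignored


print(next_state("Second", "declutch"))
```

The single 'declutch' transition serves both inner states, which is the economy that nesting is intended to provide.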

If the nested states are concurrent, the nested diagram can be partitioned into different regions, each of which can process concurrently. We note that UML thereby supports concurrency at the level of individual action sequences (or methods), a level that many might see as courageous. It will certainly be a challenge to formalise the execution model.

UML allows you to define start, finish and history pseudo-states, as well as normal states. As its name suggests, the 'start' state is the state where the state transition machine starts; the optional 'finish' state indicates the termination of processing and the death of the object. 'History' states allow you to resume processing of a nested state diagram (from the point at which you left it) after suspension by a transition on the main state diagram.
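
The effect of a history state can be shown with a small sketch. The class and its method names are hypothetical, and the sub-state names are illustrative. It simply records the sub-state occupied on exit and resumes there on a later entry through history.

```python
class CompositeState:
    """A composite state that remembers its last active sub-state."""

    def __init__(self, name, substates, initial):
        self.name = name
        self.substates = list(substates)
        self.initial = initial
        self.history = None   # sub-state occupied when last suspended
        self.current = None   # None while the composite state is inactive

    def enter(self, via_history=False):
        # Resume at the remembered sub-state only when entering through the
        # history pseudo-state and a previous visit has been recorded.
        if via_history and self.history is not None:
            self.current = self.history
        else:
            self.current = self.initial
        return self.current

    def move_to(self, substate):
        if substate not in self.substates:
            raise ValueError(f"unknown sub-state: {substate}")
        self.current = substate

    def exit(self):
        # Record where we were so that a later history entry can resume.
        self.history = self.current
        self.current = None


washing = CompositeState("Washing", ["Soak", "Rinse", "Spin"], initial="Soak")
washing.enter()
washing.move_to("Rinse")
washing.exit()                          # suspended by an outer transition
print(washing.enter(via_history=True))  # resumes where it left off
```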

...and a rich range of transition types

Transitions between states (represented by arrows) are caused by an event. This may be the receipt of a signal (signal event); the receipt of an operation call (call event); the achievement of a specific point in time or expiry of a duration (time event); a specified change of attribute value anywhere in the system (change event); or simply the fulfilment of a 'guard condition'. If a guard condition alone is specified, the transition will fire as soon as the condition is fulfilled. If a guard condition plus another type of event is specified, both have to be fulfilled before the transition fires. UML also supports the notion of 'internal transitions', meaning transitions that are said to occur on entering, when occupying, and on leaving, the state. This notion seems to us to be self-contradictory and confusing. The same behaviour could be expressed more clearly by creating an additional state and transition. The language also supports 'self-transitions'. These fire repeatedly until a termination condition is fulfilled. Each time they fire, they cause the execution of 'on entry', 'when occupying', and 'on leaving' transitions, together with any nested transitions. No official syntax is proposed for 'guard' or 'termination' conditions; these are simply regarded as UML Boolean expressions (see General facilities).
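
The rule that an event and its guard must both be satisfied before a transition fires can be sketched as follows. This is a minimal illustration of ours. The names are hypothetical, guards are written as ordinary Python functions because UML prescribes no syntax for them, and taking the first enabled transition is a simplification that sidesteps conflict resolution.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Transition:
    source: str
    target: str
    event: Optional[str] = None                      # None: guard-only transition
    guard: Optional[Callable[[dict], bool]] = None   # None: no guard condition

    def enabled(self, state, event, context):
        """True when this transition may fire in `state` on `event`."""
        if state != self.source:
            return False
        if self.event is not None and self.event != event:
            return False
        return self.guard is None or self.guard(context)


def fire(transitions, state, event, context):
    """Return the next state, or the unchanged state if nothing is enabled."""
    for transition in transitions:
        if transition.enabled(state, event, context):
            return transition.target
    return state


transitions = [
    Transition("Idle", "DialTone", event="lift_receiver"),
    Transition("DialTone", "Connecting", event="digit",
               guard=lambda ctx: ctx["number_complete"]),
]
print(fire(transitions, "DialTone", "digit", {"number_complete": False}))
print(fire(transitions, "DialTone", "digit", {"number_complete": True}))
```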

No syntax is defined for processing expressions

Transitions result in processing and the sending of signals or operation calls to other objects. The detail of this is expressed on the diagram in a 'transition string' (which has a defined syntax), which is similar to that used for messages (see Collaboration diagrams and Sequence charts). The detail includes the event signature (event name, its arguments or parameters, and their data types), an optional guard condition and a sequence of actions. The action sequence is an ordered list of actions, which includes algorithmic processing statements and signal send clauses. An official syntax is defined for signal send clauses, but not for actions. This means that the processing specification is incomplete although helpfully several types of action are recognised in version 1.1, including instance creations, deletions, operation invocations, and signal invocations (we think attribute value replacements and relationship 'cuts' and 'ties' should be added to this list). Furthermore, no syntax is proposed for defining the amount of time that processing should or does take.

State machine execution is specified in an informal textual model. We do not have space to discuss this in detail here, but we note that it has the following properties:

all actions run to completion without interruption
synchronous and asynchronous messaging is supported
instances of active classes can operate concurrently, actions can execute concurrently
signals and operation calls are queued
transitions cannot be assigned priorities, they acquire them according to their level of nesting within the nested state transition model.
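
The first and fourth properties together amount to run-to-completion processing over an event queue. The sketch below is our own reading of that behaviour, not the UML execution model itself: the class name and handler mechanism are hypothetical, and concurrency between objects is left out.

```python
from collections import deque


class Machine:
    """Processes queued events one at a time, each to completion."""

    def __init__(self, handlers):
        self.handlers = handlers   # event name -> callable taking the machine
        self.queue = deque()
        self.log = []
        self.busy = False

    def send(self, event):
        # Events arriving while an action runs are queued, never interleaved.
        self.queue.append(event)
        if not self.busy:
            self._drain()

    def _drain(self):
        self.busy = True
        try:
            while self.queue:
                event = self.queue.popleft()
                self.log.append(f"start {event}")
                self.handlers.get(event, lambda machine: None)(self)
                self.log.append(f"end {event}")
        finally:
            self.busy = False


# The action for 'a' raises event 'b' while 'a' is still being processed.
machine = Machine({"a": lambda m: m.send("b")})
machine.send("a")
print(machine.log)
```

The log shows 'a' finishing before 'b' starts, even though 'b' was sent part-way through 'a'.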

We also note however that UML appears to get into difficulty in resolving conflict where an event causes two non-concurrent transitions to fire simultaneously. We wonder whether a model that allows this should be regarded as well-formed.

Figure 14 State diagram for a telephone

Figure 15 State diagram for a gear box, showing a non-concurrent composite state

Figure 16 State diagram for a term's work at college showing concurrent states

You can, if you wish, show state diagrams for more than one classifier on one state diagram, together with the operation calls and signals that pass between them. Each state diagram is shown as enclosed within a class/type box. The send is shown by a labelled pecked arrow from the transition or the class/type box to another class/type box enclosing a state machine or to a transition within it.

UML state machines cannot be sub-typed separately from the classifiers etc. to which they refer. However if a classifier is sub-typed it follows that the sub-type will inherit some or all of the behaviour of the super-type, and add some behaviour of its own. UML defines a series of different policies which state what parts of the behaviour must be inherited, and what may be added. We note that the most flexible of these policies contradicts the requirement that operations may not be deleted on sub-types (presuming that any given operation must appear somewhere on an object's state chart). We think that UML should decide on a single policy for each type of classifier inheritance, and that more than one form of classifier inheritance may be required (only one is defined at the moment).

Another interesting relationship exists between the state models of classifiers which are related by composite aggregations, and other referential integrity rules. In the case of composite aggregations, for example, the death of a composite object must result in the death of its parts. UML currently stipulates no well-formedness rules for the relationship between state models under these circumstances.
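
The composite aggregation rule is easy to state in code, which makes the absence of a corresponding rule for state models more noticeable. The sketch below is ours and the class names are hypothetical. It shows only the cascade of destruction from the whole to its parts.

```python
class Part:
    def __init__(self, name):
        self.name = name
        self.alive = True

    def destroy(self):
        self.alive = False


class Composite(Part):
    """An object that owns its parts by composite aggregation."""

    def __init__(self, name, parts=()):
        super().__init__(name)
        self.parts = list(parts)

    def destroy(self):
        # The death of the whole forces the death of each of its parts.
        for part in self.parts:
            part.destroy()
        super().destroy()


order = Composite("order", [Part("order line 1"), Part("order line 2")])
order.destroy()
print([part.alive for part in order.parts])
```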

UML state diagrams look at the system from a dynamic rules perspective, and indeed are able to give a sound definition of these rules. They also contain high level information about the functions the system will perform, but are not organised to show this process-oriented information to best effect. The information about processing they do contain is high level - no syntax is proposed for the specification of detailed processing logic. We award a major contribution under dynamic rules, and a moderate contribution in the function column.

Activity diagrams

The purpose and semantics of activity diagrams are still unclear in version 1.1 of the method. We therefore present our high level interpretation of the technique, together with some criticism of the presentation of the concept in version 1.1 of UML. We will not attempt to give an example diagram until the technique is more clearly defined.

Ovum believes that UML activity diagrams are intended to present a detailed view of the processing involved in providing a service specified in a use case, operation, or signal reception - a view that is more detailed than that represented in a collaboration or sequence diagram. The key additional facility they provide is the ability to flow diagram the processing that takes place within an action sequence. These flow diagrams can optionally be extended to encompass the portions of each classifier's state diagram that are involved in providing the service. The state diagram information is re-presented onto an activity diagram so as to give a seamless flow representation of the service covering different objects' state models, and the action sequences within them.

(Activity diagrams may also be used to support business process and workflow modelling, but these applications are outside the scope of the current Guide.)

UML supports activity diagrams by adding some additional meta-model elements to the state machine syntax. The most important of these is the notion of an 'action state' - a state that represents the execution of an atomic action from an action sequence. Unfortunately no official syntax is provided for the actions. Transitions between action states are represented by state machine transition arrows, but these transitions are held to be triggered by the completion of the action in the preceding action state. Transitions between action states have no associated processing. This means that the meaning of activity diagram state and transition symbols is the inverse of the normal meanings associated with state diagram states and transition symbols - very confusing, particularly when you wish to model the execution of part of an object's state model. You will have to turn the state model inside out first!

Activity diagrams support concurrent states and transitions in the same way as state diagrams, but in the case of activity diagrams you don't have to enclose such concurrent regions by a super-state box - simple fork and join connectors will do. A specific symbol is provided for rendezvous that depend on the availability of a specific object in a specific state - the object flow state symbol.
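
A fork followed by a join has a direct counterpart in most programming environments. The sketch below uses Python's standard thread pool purely as an illustration. The function name is ours, and nothing here is prescribed by UML.

```python
from concurrent.futures import ThreadPoolExecutor


def fork_join(*actions):
    """Run the actions concurrently (fork) and wait for all of them (join)."""
    with ThreadPoolExecutor(max_workers=max(1, len(actions))) as pool:
        futures = [pool.submit(action) for action in actions]
        # result() blocks, so the list is complete only when every branch is.
        return [future.result() for future in futures]


# Two illustrative action states that may proceed in parallel.
results = fork_join(lambda: "invoice printed", lambda: "stock reserved")
print(results)
```

Processing after the join continues only once every forked branch has finished, which is the behaviour the join connector denotes.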

A major addition to the state machine syntax is the notion of 'swimlanes' (vertical dotted lines) which we think may be used to separate the processing belonging to different objects, when the diagram is used to model a complete service as described above. At least the example diagrams given in the UML manual appear to support this interpretation. However in places the text appears to conflict, suggesting that all the swimlanes occur within a class.

Activity diagrams, like state machines, normally include variations on standard behaviour, and hence must support conditional logic. In the case of activity diagrams this is shown using diamond shaped choice boxes with guard conditions on outgoing transitions. No syntax is specified for the conditions.

In sum, UML activity diagrams present a detailed service or processing oriented view of your system, but (since no official syntax is proposed for their component actions), contrive to do this without greatly improving your chances of generating from the specification. We award a moderate contribution in the function column.

Design

Transition from analysis to design

With the exception of types, all UML analysis techniques are re-used in design. We do not repeat discussion of the techniques, but credit them in the table for scoring purposes. The marks we award in the organisation column are modified from those we awarded under Analysis, because, in design, the organisation column refers to the ability to specify the distribution of processing.

UML also adds some new techniques specific to the design stage. Component diagrams show how functionality is packaged into separately deployable, re-usable units. Deployment diagrams show how these units are distributed across processors. The latter therefore attracts a major score in the organisation column.

Broadly speaking, there are three different ways to approach the development of implementation specific design models from implementation independent analysis models:

analysis models are modified to produce design models, without retaining the original analysis models - the elaborational approach
a separate design model is created (perhaps by copying then modifying the analysis model); however, the original analysis model is retained. Traceability links are established between the design model and the original analysis model. This is a hybrid approach
the analysis model can be translated into the design model (or even directly into code), by marking up the analysis model with translation directives and defining a set of translation rules. This process implies a well-developed conceptual model of the facilities of the implementation environment.

UML provides explicit support for the first two approaches, but cannot be said to fully support the third, as no directives or translation language is defined, nor is any explicit support provided for the modelling of implementation environment facilities. However, techniques (such as deployment diagrams) provided for other purposes could be used to address the latter need. Ovum has discussed the relative advantages of the translational approach elsewhere in this service (see article 'The software industry as an ostrich').
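
To make the translational approach concrete, the sketch below shows the kind of mechanism that UML leaves undefined: an analysis class marked up with a directive, and a rule table that maps analysis-level types to target-language types. Everything in it is hypothetical, including the 'persistent' directive, the type names and the PersistentObject base class.

```python
# Hypothetical translation rules: analysis-level type -> target-language type.
TYPE_RULES = {"Text": "str", "Number": "int", "Flag": "bool"}


def translate_class(name, attributes, directives=None):
    """Emit a Python class skeleton from a marked-up analysis class.

    attributes: mapping of attribute name -> analysis-level type name.
    directives: optional mark-up; only 'persistent' is understood here.
    """
    directives = directives or {}
    base = "(PersistentObject)" if directives.get("persistent") else ""
    params = ["self"] + [
        f"{attr}: {TYPE_RULES[kind]}" for attr, kind in attributes.items()
    ]
    body = [f"        self.{attr} = {attr}" for attr in attributes]
    lines = [f"class {name}{base}:", f"    def __init__({', '.join(params)}):"]
    lines += body or ["        pass"]
    return "\n".join(lines)


skeleton = translate_class(
    "Order", {"number": "Number", "customer": "Text"}, {"persistent": True}
)
print(skeleton)
```

The analysis model itself is untouched; changing the rule table or the directives changes the generated design, which is the essential property of this approach.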

Component diagrams

UML components are best viewed as a way of packaging design level items (classes) into physical units that can be deployed and re-used. The concrete manifestation of a component is usually a file of source or executable code.

In general, components may be type (in the sense of 'category', not UML type) or instance level concepts. The components on a component diagram are always type level concepts. So the concrete manifestation of such a component is a class from which instances (specific files) may be created. On a deployment diagram (see Figure 18), instance (as well as type level) components may be recorded. Strictly speaking, the source or binary code components represented are unlinked when they are shown on component diagrams - deployment diagrams indicate which environments the code will run in.

Component diagrams show components together with the dependencies (which component makes use of which other component) between them.

Given the similarity between the component and the package and subsystem concepts, it is reasonable to ask why it was felt that an additional concept was required. One answer is that packages and subsystems can define the logical organisation of your specification (though of course they are not prohibited from defining a physical organisation). There is value in being able to maintain a separate logical and physical organisation. Another answer is that components allow many-to-many 'contains' relationships with classes, whereas the package or subsystem 'owns' relationship is, by definition, one-to-many. Components can thus be used to model the inclusion of a class in more than one functionally distinct software unit (this facilitates the construction of modular software units), or in different instances of functionally identical software units (this facilitates the design of distributed systems).
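
The difference in multiplicity between 'owns' and 'contains' can be expressed as a small data structure. This is our own sketch, with hypothetical class, package and component names.

```python
class Model:
    def __init__(self):
        self.owner = {}         # class name -> the single package that owns it
        self.contained_in = {}  # class name -> set of components containing it

    def own(self, package, cls):
        # 'owns' is one-to-many: a class may have only one owning package.
        if self.owner.get(cls, package) != package:
            raise ValueError(f"{cls} is already owned by {self.owner[cls]}")
        self.owner[cls] = package

    def contain(self, component, cls):
        # 'contains' is many-to-many: several components may hold one class.
        self.contained_in.setdefault(cls, set()).add(component)


model = Model()
model.own("Linguistics", "Dictionary")
model.contain("SpellChecker", "Dictionary")
model.contain("GrammarChecker", "Dictionary")
print(sorted(model.contained_in["Dictionary"]))
```

A second call to own() naming a different package for the same class would be rejected, whereas any number of components may contain it.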

We also note from the semantics guide that components are sub-types of classifier and hence inherit all the facilities of classes. It is not clear why this should be - are they intended to be more than just packaging mechanisms? A more physical modelling technique than classes? We do not think so. Rather, we think that they are intended to be packaging mechanisms that expose operations, and which can define interfaces to facilitate re-use. The rest of the baggage brought with the class can be seen as accidental inheritance. In what way, then, is placing classes within components, different from nesting classes within composite classes? The answer is that components carry the implication that they are separately deployable physical units (files).

Figure 17 Example of component diagram for a spell checker

Figure 18 Example of deployment diagram for a spell checker

Notationally, components are shown using a complex symbol (which is difficult to draw, but does not seem to represent anything in particular) out of which may protrude a 'lollipop' that represents the component's interface. Components (or interfaces on components) are connected by a dashed arrow representing the dependency: component A uses component B. (See Figure 17).

Despite the fact that components inherit all the capabilities of classes, they are intended to be a packaging, rather than a specification concept, and we award reference markers in all columns except organisation.

Deployment diagrams

The notation guide says that deployment diagrams show the physical processors (hardware plus operating system) on which your system will be implemented; the physical communications links between the processors, and which components are to be implemented on which processors. They may be expressed at either the type or the instance level. In the former case, they might show that a particular component type is to be implemented on all Apple Macintoshes; in the latter case, that the accounts department's RS/6000 server is to have a specific instance of a component implemented on it.

Processors are shown as cubes (again difficult to draw and unintuitive). Components can be nested within the cubes, and objects can be nested within these components. Components may be expressed as types or instances - distinguished by the normal class/type vs. object naming conventions. Naming conventions are as for classes/types and objects. Communications links are shown using the association or link symbol, stereotyped to allow different kinds of communication link to be denoted. Instance level components can be shown as moving between nodes, by using the <<becomes>> stereotype on a dependency relationship (surely a misuse of dependency semantics). This allows you to model mobile processing agents.

As in the case of components, nodes are sub-types of classifier and hence inherit all the characteristics of classifiers (in addition to all the capabilities of model elements, such as dependency relationships and the ability to be included in packages). This raises the same issues as it does in the case of component diagrams. We feel that many of the facilities of classifiers are inappropriate to nodes (why, in a software development project, would you want to show an inheritance relationship between nodes?), so we ignore the inherited facilities in our scoring.

UML deployment diagrams allow you to define the distribution of software components on physical processors at both the type and the instance level, and hence fulfil the requirements for a major contribution in the organisation column.
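
The relationship between type level and instance level deployment can be sketched as a derivation of one from the other. The component, node type and node names below are illustrative only, and the function is our own. UML defines no such operation.

```python
# Type-level deployment: component type -> node type (illustrative names).
TYPE_DEPLOYMENT = {"SpellChecker": "Macintosh", "Dictionary": "Server"}

# Instance-level nodes: node instance name -> node type.
NODES = {
    "accounts_server": "Server",
    "mac_01": "Macintosh",
    "mac_02": "Macintosh",
}


def roll_out(type_deployment, nodes):
    """Derive an instance-level plan: node instance -> components to install."""
    plan = {node: [] for node in nodes}
    for component, node_type in type_deployment.items():
        for node, actual_type in nodes.items():
            if actual_type == node_type:
                plan[node].append(component)
    return plan


for node, components in roll_out(TYPE_DEPLOYMENT, NODES).items():
    print(node, components)
```

A tool holding both levels of information could, in principle, use a derivation of this kind to drive installation across a network.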

Construction

Two broad approaches can be taken to producing code from a model. The model can be:

used simply to guide manual coding. Method support for this approach can include guidance on the production of programmers' briefs from the design code
translated into code skeletons or full program code. Method support for this approach can include the ability to flag model elements according to desired translations, translation languages and rules (including standard mappings from method concepts to concepts in target languages), and the ability to model implementation environment facilities.

UML provides no specific facilities to support either approach, and could not (even if specific support for translation were provided) support the generation of complete applications, because UML cannot specify complete applications (see Analysis and Design). Therefore, we do not award any points for construction.

On a positive note, UML is more formal and more complete than past object-oriented methods. Tool vendors will undoubtedly provide some of the missing facilities. So UML tools should, in the long term, be able to generate more of an application than their object-oriented predecessors.

Furthermore, since deployment diagrams allow you to define what code should run on what specific processors, UML CASE tools could support the automatic roll-out of your application across a network.

Reverse engineering

UML gives no guidance on reverse engineering, so we award no markers in any column. Indeed, UML does not provide any means of distinguishing a model of your legacy (or current) system from a model of your planned (required) system, nor does it provide any specific concepts or notation to deal with the special problems of modelling legacy systems.

Support and other resources

A range of support resources are now available:

a brief guide to UML: UML distilled from Addison-Wesley
an overview guide on the use of UML in respect of real-time systems (available from the Rational web site), and a book on real-time UML: Real Time UML - Developing Efficient Objects for Embedded Systems by Bruce Powel Douglass, available from Addison-Wesley
CD-based tutorials: for example Rational's UML seminar, and ICONIX's 'Mastering UML with Rational Rose'
basic and advanced UML courses from Rational and many other IT training providers
a discussion group: the UML revision task force mailing list (uml-rtf@omg.org). To join this mailing list, send a request to request@omg.org
Web pages: the UML revision task force web page is at http://uml.systemhouse.mci.com/. Rational's Web site is at www.rational.com/UML.

There is still an urgent need for a thorough, but user-friendly, book on UML. We understand that Booch, Jacobson and Rumbaugh are currently working on user and reference guides, due for publication Q1/Q2 1998. Details from http://www2.awl.com/cseng/series/uml.

Tool vendors, in particular, may be interested to hear that Radmila Juric (South Bank University, London, UK) has published a statement of all the integrity rules in UML version 1.0. Radmila has also published a critical assessment of UML's approach to well-formedness rules, class diagrams and use cases, and has contributed to this method guide.

When puzzled about the meaning or purpose of different aspects of UML, I have found Booch's 1994 book, Object Oriented Analysis and Design, on the 'Booch' method to be helpful ancillary reading. You may also wish to consult Object Oriented Modelling and Design by Rumbaugh et al (1991); and Object Oriented Software Engineering by Ivar Jacobson (1992).

Contact information

For further information on UML you can contact the UML revision task force.

Cris Kobryn (chair)
UML Revision Task Force
MCI Systemhouse
PO Box 2320
Fallbrook
California 92088
USA

Tel: +1(800) 234-4586

Mike Budd, with assistance from Radmila Juric