UML Best Practice: Attribute or Association

Short

Use Associations for Classes and Attributes for DataTypes

Purpose

Make an informed choice between Attributes and Associations when modeling a relation between two Classifiers.

Details

When modeling the structure of your system there are basically two ways to express a structural relationship between two Classifiers. You could use an Association between the two Classifiers, or you could create an Attribute owned by one Classifier with its type set to the other Classifier.

Both ways, Association or Attribute, are pretty much equivalent. There’s not really a big difference between the two except for personal preference.

The problem with teams working on the same model is, of course, that you can’t leave this to personal preference; you have to make a clear choice about what to use in which circumstances.

To explore the details of the two approaches it is best to have a look at the UML meta model.

In this meta diagram we see that both the Attribute and the Association use the same Property object to link to a type.

The Association has two or more Properties as memberEnd. Each of these Properties has a Type, so that is the way the Association links two or more Classes. The link from Association to endType is derived from the types of the Properties in memberEnd.

The Attribute of a Class is in fact a Property in the ownedAttribute collection of that Class. Again, because a Property is a TypedElement and thus has a Type as “type”, we get the relation to another Classifier.
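To make that concrete, here is a minimal Python sketch of the meta model fragment described above. It is only an illustration, not the real UML meta model: the snake_case names mirror memberEnd, ownedAttribute and endType, and the Customer, Order and "places" names are invented for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Classifier:
    name: str


@dataclass
class Property:
    # A Property is a TypedElement: it points at a Classifier through "type".
    name: str
    type: Classifier


@dataclass
class Class(Classifier):
    # Attributes are simply Properties owned by the Class.
    owned_attribute: List[Property] = field(default_factory=list)


@dataclass
class Association:
    name: str
    member_end: List[Property] = field(default_factory=list)

    @property
    def end_type(self) -> List[Classifier]:
        # Derived: the types of the member ends.
        return [end.type for end in self.member_end]


customer = Class("Customer")
order = Class("Order")

# Option 1: an Attribute on Order whose type is Customer.
order.owned_attribute.append(Property("customer", customer))

# Option 2: an Association whose member ends are typed by the two Classes.
places = Association(
    "places",
    [Property("customer", customer), Property("orders", order)],
)
```

In both options the link to the other Classifier runs through a Property and its type, which is why the two approaches are so close to each other.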

In the years I’ve been working with different modelling teams I’ve found that the rule that works best is to use Associations for Classes and Attributes for DataTypes.

Now what’s the difference between a DataType and a Class? Well, they are actually pretty similar. The UML specification states:

A data type is a special kind of classifier, similar to a class. It differs from a class in that instances of a data type are identified only by their value.

So DataTypes are much like the primitive types and enumerations we know from the programming world. Because their instances are identified only by their value, they are generally treated as immutable value types. Think of things like Integer, Date and MoneyAmount, but also enumerations such as Color, DayOfTheWeek, etc.
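In code the difference between "identified by value" and "identified by identity" could look something like the sketch below. The field names on MoneyAmount and the Customer class are assumptions made up for this illustration.

```python
from dataclasses import dataclass
from decimal import Decimal
from enum import Enum


class Currency(Enum):
    EUR = "EUR"
    USD = "USD"


@dataclass(frozen=True)
class MoneyAmount:
    # DataType: immutable, and equality is based on the values alone.
    amount: Decimal
    currency: Currency


class Customer:
    # Class: no __eq__ override, so Python compares by object identity.
    def __init__(self, name: str) -> None:
        self.name = name


price_a = MoneyAmount(Decimal("9.99"), Currency.EUR)
price_b = MoneyAmount(Decimal("9.99"), Currency.EUR)
print(price_a == price_b)  # True: same value, so the same MoneyAmount

alice_1 = Customer("Alice")
alice_2 = Customer("Alice")
print(alice_1 == alice_2)  # False: two distinct customers with the same name
```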

If we add DataType and Enumeration to the meta diagram we get the following:

You can see that DataType is a subtype of Classifier, and that Enumeration is a subtype of DataType.

The following example shows how to use Classes and DataTypes when following this best practice.

In this diagram we see two Enumerations: Currency and ProductCategory. ProductCategory is being used as the type of the attribute Product.Category, while Currency is being used by the DataType MoneyAmount.

I’ve added dependencies to visually express which DataType is being used by which Classifier, but those dependencies are usually not there in a production model.
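If you carry the rule over to code, it might look roughly like this. The Supplier class and the enumeration literals are not in the diagram; I made them up so there is a Class-typed end to compare with the DataType-typed attribute.

```python
from enum import Enum
from typing import Optional


class ProductCategory(Enum):
    # Literal names are invented for this sketch.
    BOOK = "book"
    TOOL = "tool"


class Supplier:
    """A Class with its own identity (hypothetical, not in the diagram)."""

    def __init__(self, name: str) -> None:
        self.name = name


class Product:
    def __init__(self, name: str, category: ProductCategory) -> None:
        self.name = name
        # Typed by an Enumeration (a DataType): modeled as an Attribute.
        self.category = category
        # Typed by a Class: modeled as an Association in the diagram.
        self.supplier: Optional[Supplier] = None


acme = Supplier("Acme")
hammer = Product("Hammer", ProductCategory.TOOL)
hammer.supplier = acme
```

In Python both end up as plain fields, which shows once more how close the two options are. The distinction lives in the model, and the rule simply makes it predictable.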


UML Composition vs Aggregation vs Association

What makes a UML Composition different from an Aggregation or a regular Association?

The concepts of Association, Aggregation and Composition have existed in UML since the first published versions, but the exact meaning of these concepts, especially Aggregation, still leads to heated debates among UML experts.

But before we go into the details, let’s have a look at how these concepts are defined in UML. I guess every UML user is familiar with the graphical notation, but what do these concepts look like in the UML (v 2.5) meta model?

UML 2.5 Associations Meta Model

This is the part of the UML meta model that defines Association. (I’ve hidden the elements not relevant to the subject for clarity.)

What we see is that an Association has at least two Properties in the role of memberEnd. A Property has an attribute aggregation of type AggregationKind. It’s this AggregationKind that specifies the difference between a regular Association, an Aggregation and a Composition.

The three possible values for AggregationKind are defined in the UML specifications as follows:

  • none
    Indicates that the Property has no aggregation.
  • shared
    Indicates that the Property has a shared aggregation.
  • composite
    Indicates that the Property is aggregated compositely, i.e., the composite object has responsibility for the existence and storage of the composed objects (parts).

But a bit further, in the semantics section of Properties we find the same explanation except for a small addendum

  • shared
    […] Precise semantics of shared aggregation varies by application area and modeler.

So basically the OMG is saying: We don’t know what it means, make up your own definition.

Looking for more clues in the definition of Association we find the constraint:

Only binary associations can be aggregations.

memberEnd->exists(aggregation <> AggregationKind::none) implies (memberEnd->size() = 2 and memberEnd->exists(aggregation = AggregationKind::none))

OK, that doesn’t really help us. All it states is that Aggregations and Compositions can only exist in Associations that have exactly two member ends, but that’s the “normal” Association for most of us anyway. I haven’t seen many Associations with more than two members yet.

And the second part of the OCL constraint tells us that only one of the two ends can play the role of the whole, so the other end must play the role of the part.
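For those who don’t read OCL fluently, the constraint can be paraphrased in Python. This is only my translation of the expression above; the End class is a hypothetical stand-in for a Property used as memberEnd.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class AggregationKind(Enum):
    NONE = "none"
    SHARED = "shared"
    COMPOSITE = "composite"


@dataclass
class End:
    # Hypothetical stand-in for a Property used as memberEnd.
    name: str
    aggregation: AggregationKind = AggregationKind.NONE


def satisfies_aggregation_rule(member_ends: List[End]) -> bool:
    has_aggregate_end = any(
        e.aggregation != AggregationKind.NONE for e in member_ends
    )
    if not has_aggregate_end:
        return True  # "A implies B" holds trivially when A is false
    has_plain_end = any(
        e.aggregation == AggregationKind.NONE for e in member_ends
    )
    return len(member_ends) == 2 and has_plain_end


composition = [End("whole", AggregationKind.COMPOSITE), End("part")]
both_shared = [End("a", AggregationKind.SHARED), End("b", AggregationKind.SHARED)]
ternary = [End("a", AggregationKind.SHARED), End("b"), End("c")]
plain_ternary = [End("a"), End("b"), End("c")]

print(satisfies_aggregation_rule(composition))    # True
print(satisfies_aggregation_rule(both_shared))    # False
print(satisfies_aggregation_rule(ternary))        # False
print(satisfies_aggregation_rule(plain_ternary))  # True
```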

Looking further in the specs we find in the semantics section of the Property the following

Composite aggregation is a strong form of aggregation that requires a part object be included in at most one composite object at a time. If a composite object is deleted, all of its part instances that are objects are deleted with it.

So that paragraph already tells us a little bit more about the nature of the Composition. Let’s dissect this paragraph and figure out what to remember

  • that requires a part object be included in at most one composite object at a time
    So a part cannot play the role of part in two compositions at the same time. This implies that the multiplicity of a composite association can only be [0..1] or [1..1] on the composite end.
  • If a composite object is deleted, all of its part instances that are objects are deleted with it.
    This is one of the parts where v 2.5 is different from previous versions. Previous versions of the UML specifications had the phrase “are normally deleted with it”. By leaving the “normally” out there is no more ambiguity. Deleting the whole will always result in deleting the part in a composition.

But there is still a loophole in the delete story.

NOTE. A part object may (where otherwise allowed) be removed from a composite object before the composite object is deleted, and thus not be deleted as part of the composite object.

So before the whole is deleted we can remove the part to avoid having to delete the part as well.
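A small sketch of how these rules could be enforced in code, assuming a simple in-memory model. The Composite and Part classes and the "deleted" flag are my own invention to make the behaviour visible; the specification prescribes none of this.

```python
from typing import List, Optional


class Part:
    def __init__(self, name: str) -> None:
        self.name = name
        self.owner: Optional["Composite"] = None
        self.deleted = False


class Composite:
    def __init__(self, name: str) -> None:
        self.name = name
        self.parts: List[Part] = []

    def add(self, part: Part) -> None:
        # At most one composite at a time: refuse a part that is still owned.
        if part.owner is not None:
            raise ValueError(f"{part.name} already belongs to {part.owner.name}")
        part.owner = self
        self.parts.append(part)

    def remove(self, part: Part) -> None:
        # The loophole from the NOTE: detach a part before the whole is deleted.
        self.parts.remove(part)
        part.owner = None

    def delete(self) -> None:
        # Deleting the whole deletes every part it still contains.
        for part in self.parts:
            part.deleted = True
            part.owner = None
        self.parts.clear()


car = Composite("car")
engine, radio = Part("engine"), Part("radio")
car.add(engine)
car.add(radio)
car.remove(radio)  # rescued before the whole goes away
car.delete()
print(engine.deleted, radio.deleted)  # True False
```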

And then there is a last paragraph that deals with Compositions:

Compositions may be linked in a directed acyclic graph with transitive deletion characteristics; that is, deleting an object in one part of the graph will also result in the deletion of all objects of the subgraph below that object. The precise lifecycle semantics of composite aggregation is intentionally not specified. The order and way in which composed objects are created is intentionally not defined. The semantics of composite aggregation when the container or part is typed by a DataType are intentionally not specified.

  • Compositions may be linked in a directed acyclic graph with transitive deletion characteristics:
    Now this is a difficult one. The directed acyclic graph part tells us that, when following the links from whole to part, we can never end up back at an element we have already passed. Combined with the “at most one composite at a time” constraint this even means that composite relations form a hierarchical tree. The transitive deletion part means that deleting one element from the tree also deletes the whole branch under this element. Unfortunately the word may in the sentence means that this again is no hard constraint, but merely an indication of how it can be used.
  • […]are intentionally not specified
    This is simply sad… These few sentences basically tell us again nothing at all, except that we shouldn’t look for a specification of these aspects in the UML specifications.

So that’s about all the UML specification has to say about the different types of aggregation. All other constraints you find in books and on the internet are purely interpretations and added semantics of the authors.

Now let’s have a look at some typical examples of Composition and Aggregation (or aggregationKind = composite and aggregationKind = shared, as we learned from the specifications).

This example shows the class structure of the popular social networking site LinkedIn:

On the right we see the Group structure as a set of Compositions because
a) each “whole” can be considered as a grouping of “parts”
b) each “part” can belong to only one “whole” at a time
c) the “parts” should be deleted when the “whole” is deleted (except when we move them to another “whole” first)

Note that discussions can be moved from one section to another (usually from Discussions to Jobs or Promotions), but they cannot be part of two sections at the same time.

The relation between Group and User however is an Aggregation because
a) a Group can be considered a “grouping” of users
b) a User can be part of multiple Groups at the same time
c) a User should not be deleted when a Group is deleted.
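Translated to code, the difference between the two could be sketched as follows. The class names Section and Discussion are my guess at what the diagram contains, and the "deleted" flags are only there to make the effect of a delete visible.

```python
from typing import List


class Discussion:
    def __init__(self, title: str) -> None:
        self.title = title
        self.deleted = False


class Section:
    def __init__(self, name: str) -> None:
        self.name = name
        self.discussions: List[Discussion] = []  # composite parts
        self.deleted = False

    def delete(self) -> None:
        for discussion in self.discussions:
            discussion.deleted = True
        self.discussions.clear()
        self.deleted = True


class User:
    def __init__(self, name: str) -> None:
        self.name = name
        self.deleted = False


class Group:
    def __init__(self, name: str) -> None:
        self.name = name
        self.sections: List[Section] = []  # composite parts
        self.members: List[User] = []      # shared aggregation
        self.deleted = False

    def delete(self) -> None:
        # Composition: deletion cascades down to sections and their discussions.
        for section in self.sections:
            section.delete()
        self.sections.clear()
        # Shared aggregation: only the links go away, the users survive.
        self.members.clear()
        self.deleted = True


def move(discussion: Discussion, source: Section, target: Section) -> None:
    # A discussion changes section; it is never in two sections at once.
    source.discussions.remove(discussion)
    target.discussions.append(discussion)


uml = Group("UML")
archi = Group("Architecture")
general, jobs = Section("Discussions"), Section("Jobs")
uml.sections += [general, jobs]
post = Discussion("Hiring a modeler")
general.discussions.append(post)
move(post, general, jobs)

bob = User("Bob")
uml.members.append(bob)
archi.members.append(bob)  # the same user in two groups

uml.delete()
print(post.deleted, bob.deleted)  # True False
```

Note that only the Composition needed real code (the cascade in delete and the move function). The Aggregation is just a list of references.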

To summarize

The Composition is a type of Association with real constraints and impact on development, whereas the Aggregation is purely a functional indication of the nature of the Association with no technical impact.