August 19, 2021

Modeler’s Corner #4: Subtypes in Data Modeling

Data modeling
Team Ellie
Ellie editorial team

Modeler’s Corner is our series of blog posts on best practices and practical tips & tricks for all you Ellie modelers out there. The series focuses on everyday issues you might face as a Data Modeler. We aim to help you build the most informative, understandable, and efficient business-driven models with Ellie. For more comprehensive training needs, don’t hesitate to ask us!

Subtypes and supertypes

The first three entries in our Modeler’s Corner series focused on getting your building blocks straight: identifying entities, using Ellie’s entity categories to understand those entities better, and approaching the entities through simple language, that is, verbs and nouns. Now it’s time to think about using these blocks in the most efficient way.

A good modeler has a toolbox from which they can draw various instruments whenever they encounter a suitable situation. These tools are repeatable patterns, and this time we’re going to cover a simple yet very powerful one: entity subtypes.

What is a subtype?

Let’s imagine we run a travel agency (not a great choice in 2021, but perhaps we are optimistic about the future!). We have a system in which our users create Bookings. They can book Flights and Hotels. With a single booking, you book either a flight or a stay in a hotel. Clearly, there are two types of bookings: Flight booking and Hotel booking. Still, the two are very similar: in both cases the user logs into our fancy system, selects the desired flight or hotel, and books it. This is an example of a situation where we should reach into our toolbox for a subtype structure.

An entity can have two or more subtypes. Any single instance of the entity always represents exactly one subtype: in our travel agency, an individual booking is always for either a flight or a hotel, never both at the same time, and you clearly cannot book a ‘nothing’. This means that subtypes are always mandatory and mutually exclusive.
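
If you think in code, the "mandatory and mutually exclusive" rule can be sketched roughly as below. This is only an illustration in Python, not anything Ellie generates; the attribute names (`booking_id`, `flight_number`, `hotel_name`) and the `describe` method are hypothetical and not part of the example model.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class Booking(ABC):
    """Supertype. It is abstract, so a bare 'nothing' booking cannot be created."""
    booking_id: int  # hypothetical attribute

    @abstractmethod
    def describe(self) -> str:
        ...


@dataclass
class FlightBooking(Booking):
    flight_number: str  # hypothetical attribute

    def describe(self) -> str:
        return f"Booking {self.booking_id}: flight {self.flight_number}"


@dataclass
class HotelBooking(Booking):
    hotel_name: str  # hypothetical attribute

    def describe(self) -> str:
        return f"Booking {self.booking_id}: stay at {self.hotel_name}"
```

Every object created this way is exactly one of the two subtypes: a `FlightBooking` is a `Booking` but never a `HotelBooking`, and calling `Booking(1)` directly raises a `TypeError`.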

How to add a subtype entity in an Ellie model

In Ellie, we use the box-within-a-box notation for subtypes. Just click on an entity and drag it inside another, and they are glued together; click and drag outside to un-glue. Having done that, our travel agency example might look like this:

Travel agency example with Booking subtypes

As you can see from the diagram above, the sub-entities can be neatly connected to “outside” entities at the same time as the parent entity is connected to others. The diagram shows that User and Booking confirmation link to all types of bookings, whereas Hotel, for example, is linked only to the relevant subtype. (If you happen to wonder about the meaning of the various colors of the entities in this model, please refer to Modeler’s Corner #2!)

This is a very powerful pattern that, while giving you a better understanding of the big picture, also makes your models visually simpler, as you don’t have to repeat relationships from every outside entity to every subtype entity. If something is common for all subtypes, you can handle it on the supertype (parent) level.
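
The same idea, handling what is common on the supertype and what is specific on the subtype, can be sketched in Python as below. This is a loose analogy under assumptions rather than a description of how Ellie stores models: the `name` and `flight_number` attributes and the `booked_by` helper are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class User:
    name: str  # hypothetical attribute


@dataclass
class Hotel:
    name: str  # hypothetical attribute


@dataclass
class Booking:
    # Relationship defined once on the supertype; every subtype inherits it.
    user: User


@dataclass
class FlightBooking(Booking):
    flight_number: str  # hypothetical attribute


@dataclass
class HotelBooking(Booking):
    # Relationship that only exists for this subtype.
    hotel: Hotel


def booked_by(booking: Booking) -> str:
    """Works for any subtype, because `user` lives on the supertype."""
    return booking.user.name
```

Here `booked_by` accepts a `FlightBooking` or a `HotelBooking` alike, while only a `HotelBooking` carries a `hotel`, just as only the Hotel booking subtype is linked to Hotel in the diagram.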

You can even create subtypes within subtypes. In fact, Ellie sets no hard limit on how deeply subtypes can be nested!

What are the relationships between subtypes and supertypes?

Every entity in Ellie has its own Glossary entry, which you can access either via the Glossary menu or by double-clicking the entity on the canvas. In the entity’s Glossary view, you will find the Relationships tab. This is an extremely powerful source of information in general: it lists all the relationships your entity has in any model across the entire Ellie environment! Since Ellie’s v3.7 update, this view also shows subtype and supertype information.

An example of subtypes on the Relationships tab

You can see in the picture above that, in addition to the relationships the Supply item entity has to other entities, the tab also displays two subtype entities: Material and Service. By hovering your mouse over these, you can see in which models the subtypes are being used. Supertypes are displayed in a similar fashion, and naturally, you can navigate from one entity to another just by clicking on them.

When to use subtypes and what to do with them later?

Subtypes are best used in situations where you know there are a certain number of “kinds” of an entity that behave differently. They could have different relationships with “outside” entities, or they could have different attributes, or it might just make sense from a communication perspective to ensure that the different subtypes are clearly displayed on the modeling canvas and defined in the Glossary. However, if the number of subtypes grows very large (a dozen or more), the model usually becomes visually more complicated, which reduces its readability for the average user.

If you are modeling for a development initiative, such as data warehousing, you will continue from the business model to logical and physical modeling (note: logical modeling is an upcoming feature in Ellie, see our Q2 roadmap update). In these more technical and detailed models, there are usually three alternative ways of handling subtype structures coming from the business model:

  • Modeling only the supertype: in this solution, you are basically pushing all the details from the subtype entities back into the supertype – in effect, all the relationships and extra attributes are added to the supertype and the subtypes disappear as independent entities. This might make sense if e.g. your expected queries will often concern many subtypes at the same time, but it leads to your supertype’s attributes becoming “sparse”, as not all the attributes are relevant for all the instances.
  • Modeling only the subtypes: in this solution, you forgo the supertype entity and only keep the lower-level subtypes. All the relationships and attributes from the supertype (that are by definition common to all subtypes) need to be repeated for every subtype. This works rather well when there’s very little that is common to all the subtypes, but if there’s a lot of data on the supertype, it causes replication (i.e. denormalization of data).
  • Modeling the supertype and the subtypes as separate entities: in this solution, you will need to create explicit relationships between the subtypes and the supertype, as they are now separate. The supertype entity contains a union of all the keys from all the subtypes and all their common attributes. This solution might require some extra programming logic to make sure that the supertype and subtypes stay in sync (i.e. when encountering a new key, insert it into both supertype and the relevant subtype).
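
To make the three alternatives more concrete, here is a minimal sketch of how each could look as physical tables for our travel agency, using Python’s built-in `sqlite3` module. All table and column names, as well as the `add_hotel_booking` helper, are assumptions made up for this illustration; your own physical model will depend on your platform and conventions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

conn.executescript("""
-- Option 1: supertype only. Subtype attributes become nullable ("sparse") columns.
CREATE TABLE booking_single (
    booking_id    INTEGER PRIMARY KEY,
    user_name     TEXT NOT NULL,
    booking_type  TEXT NOT NULL CHECK (booking_type IN ('flight', 'hotel')),
    flight_number TEXT,
    hotel_name    TEXT
);

-- Option 2: subtypes only. Common columns are repeated in every table.
CREATE TABLE flight_booking_only (
    booking_id    INTEGER PRIMARY KEY,
    user_name     TEXT NOT NULL,
    flight_number TEXT NOT NULL
);
CREATE TABLE hotel_booking_only (
    booking_id INTEGER PRIMARY KEY,
    user_name  TEXT NOT NULL,
    hotel_name TEXT NOT NULL
);

-- Option 3: supertype and subtypes as separate, explicitly related tables.
CREATE TABLE booking (
    booking_id INTEGER PRIMARY KEY,
    user_name  TEXT NOT NULL
);
CREATE TABLE flight_booking (
    booking_id    INTEGER PRIMARY KEY REFERENCES booking (booking_id),
    flight_number TEXT NOT NULL
);
CREATE TABLE hotel_booking (
    booking_id INTEGER PRIMARY KEY REFERENCES booking (booking_id),
    hotel_name TEXT NOT NULL
);
""")


def add_hotel_booking(conn, booking_id, user_name, hotel_name):
    """Option 3 sync logic: write the new key to the supertype and the subtype together."""
    with conn:  # one transaction: both rows are written, or neither
        conn.execute(
            "INSERT INTO booking (booking_id, user_name) VALUES (?, ?)",
            (booking_id, user_name),
        )
        conn.execute(
            "INSERT INTO hotel_booking (booking_id, hotel_name) VALUES (?, ?)",
            (booking_id, hotel_name),
        )


add_hotel_booking(conn, 1, "Ada", "Grand Hotel")
```

In the third option, the foreign key from each subtype table to `booking` is the explicit relationship mentioned above: a subtype row cannot exist without its supertype row. Note that this sketch does not stop the same key from appearing in both subtype tables, so mutual exclusivity would still need its own rule or application logic.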

Whatever your end result is going to be, subtype-supertype structures are very powerful patterns for your business model. They’re very easy to do in Ellie – just click and drag! – so put this pattern in your mental toolbox and get modeling!