Modeling & Configuring Hierarchies in Semarchy
This article reviews the basics of modeling and configuring hierarchies in Semarchy. This is a work-in-progress and will be updated.
Examples of hierarchies
Hierarchical structures are common in many different domains. Some examples might be:
- Organizational structures (CEO / managers / other managers / my manager / me / the cat)
- Charts of accounts (Income statement / expenses / other roll-up accounts / bottom level expense account that can accept transactions)
- Product hierarchies (product family / product sub-family / product / item SKU/UPC)
- Location (sales region / country / state or province / city / store)
- NAICS (North American Industry Classification System) codes (business sector / sub-sector / industry group / industry / national industry)
- Linnaean taxonomy of organisms (domain / kingdom / phylum / class / order / family / genus / species)
Types of hierarchy
Most hierarchical structures can be categorized as:
- Balanced: The bottom-most "members" of a hierarchy (often referred to as leaves) are always at the same level. Product hierarchies are often structured this way. Each parent level of the hierarchy often has a specific business meaning.
- Ragged or Natural: The leaf level members of a hierarchy can be at any level. Organizational structures and financial hierarchies (e.g. charts of accounts) are typical examples. Ragged hierarchies enable more complexity in certain branches of a hierarchy, to meet the business requirements for classification and aggregation.
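The difference between the two types can be checked mechanically. The following is a minimal sketch in plain Python, not Semarchy code: it assumes a hierarchy is given as a mapping from each member to its parent (None for the root), and the member names are made up for illustration.

```python
from collections import defaultdict


def leaf_depths(parent_of):
    """Return {leaf: depth}, counting the root as depth 1.

    parent_of maps each member to its parent (None for the root).
    """
    children = defaultdict(list)
    for member, parent in parent_of.items():
        if parent is not None:
            children[parent].append(member)

    def depth(member):
        d = 1
        while parent_of[member] is not None:
            member = parent_of[member]
            d += 1
        return d

    # A leaf is any member that never appears as a parent.
    return {m: depth(m) for m in parent_of if m not in children}


def is_balanced(parent_of):
    """A hierarchy is balanced when every leaf sits at the same depth."""
    return len(set(leaf_depths(parent_of).values())) <= 1


# Hypothetical product hierarchy: family / sub-family / product / SKU.
product = {
    "Apparel": None,
    "Shirts": "Apparel",
    "Polo": "Shirts",
    "SKU-001": "Polo",
    "SKU-002": "Polo",
}

# Hypothetical organizational structure with leaves at different depths.
org = {
    "CEO": None,
    "VP Sales": "CEO",
    "Assistant": "CEO",
    "Sales Rep": "VP Sales",
}

print(is_balanced(product))  # True
print(is_balanced(org))      # False
```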
Hierarchies are established for two primary reasons:
- Taxonomy: To classify or categorize the leaf level members. A great example is the Linnaean taxonomy (a rank-based classification of organisms). These are often balanced hierarchies.
- Aggregation: To provide a way to aggregate (roll-up) granular data - often gathered at the leaf member level - to higher levels. This is how a chart of accounts hierarchy works. Such hierarchies might be balanced or (more likely) ragged.
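As an illustration of aggregation, here is a small sketch of a roll-up over a ragged chart of accounts. It is plain Python rather than anything Semarchy-specific, and the account names and amounts are invented for the example.

```python
def roll_up(parent_of, leaf_amounts):
    """Add each leaf-level amount to the leaf and to all of its ancestors."""
    totals = {account: 0 for account in parent_of}
    for leaf, amount in leaf_amounts.items():
        account = leaf
        while account is not None:
            totals[account] += amount
            account = parent_of[account]
    return totals


# Hypothetical ragged chart of accounts: "Office Supplies" is a leaf
# one level higher than "Airfare" and "Hotels".
accounts = {
    "Income Statement": None,
    "Expenses": "Income Statement",
    "Travel": "Expenses",
    "Airfare": "Travel",
    "Hotels": "Travel",
    "Office Supplies": "Expenses",
}
amounts = {"Airfare": 1200, "Hotels": 800, "Office Supplies": 150}

totals = roll_up(accounts, amounts)
print(totals["Travel"])    # 2000
print(totals["Expenses"])  # 2150
```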
Modeling hierarchies
Hierarchies can be modeled in Semarchy in three general patterns (or styles):
- Balanced hierarchies are usually (but not always) modeled as a series of different entities (one for each level of the hierarchy), with reference relationships linking the levels via foreign keys. Each entity will have attributes appropriate for that level of the hierarchy. Product hierarchies are a typical example, and our Tutorial that configures a "Product Retail Demo" uses such a structure to define a product family, a product sub family, a product, and product items (SKUs/UPCs).
- Give the relationship a meaningful name, and also assign meaningful names to the "Referenced Role Name" and the "Referencing Role Name". Why? Because these role names are what you will use to configure the navigation of the hierarchy in Semarchy's UI. The "Referenced Role Name" also defines the name of the foreign key that Semarchy creates, which is critical for an understanding of Semarchy's data model.
- Ragged hierarchies can be modeled using a self-join reference relationship within a single entity. This allows a recursive structure to be captured, to any depth:
- You add foreign keys in the usual way - most commonly in the Diagram view of the Semarchy Workbench: click Add Reference and then drag and drop to create the relationships between an entity and itself - drag from the entity slightly away and then back onto itself, and it will create a self-join foreign key.
- As with balanced hierarchies (above), be sure to give your relationships and roles meaningful names. This is even more important for a self-join.
- Here is an example of a recursive self-join for an "Office" entity (this is from a health care model). You can see how the "Referenced Role Name" also determines the name of the foreign key "ParentOffice" (in green).
- Ragged hierarchies can also be modeled using a separate relationship entity that contains the foreign keys of the parent and of the child records in a recursive relationship. This gives more flexibility than the simple self-join described above, because multiple relationships can be defined for different reasons, and that reason can become an additional attribute in the relationship entity (e.g. by adding a foreign key to an entity that defines the possible set of relationship types). This permits alternate roll-up structures to be modeled. For example, different aggregations in financial hierarchies are common in EPM, BI and reporting applications (e.g. to generate financial reports that are compliant with different global accounting standards, such as US GAAP vs. IFRS).
- You add foreign keys to a relationship entity in the usual way - most commonly in the Diagram view of the Workbench: click Add Reference and then drag and drop to create the many-to-one relationships between the relationship entity and the entity.
- In Semarchy, these relationship entities are usually defined as Basic entities (even though the entities whose structures they control might be Fuzzy Matched or ID Matched entities).
- Here is an example of a "Person Relationship" entity designed to capture different kinds of relationships between Persons, which could also reflect familial relationships.
- Note the use of a "RelationshipType" attribute (it is a List of Values) to categorise the relationship.
- Assigning meaningful names to relationships and roles is critical here, to avoid confusion! These determine the names of the foreign key attributes.
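To make the relationship-entity pattern concrete, here is a minimal sketch in plain Python. It is not Semarchy code, and every name in it (AccountRelationship, parent_account, child_account, relationship_type) is hypothetical. The placement of accounts under each standard is invented purely to show two alternate roll-ups and is not accounting guidance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccountRelationship:
    """One row of a hypothetical relationship entity."""
    parent_account: str     # foreign key to the parent Account record
    child_account: str      # foreign key to the child Account record
    relationship_type: str  # e.g. a value from a list of values


RELATIONSHIPS = [
    AccountRelationship("Operating Expenses", "Depreciation", "US_GAAP"),
    AccountRelationship("Operating Expenses", "Salaries", "US_GAAP"),
    AccountRelationship("Cost of Sales", "Depreciation", "IFRS"),
    AccountRelationship("Operating Expenses", "Salaries", "IFRS"),
]


def children_of(account, relationship_type):
    """Navigate down: match on the parent key, return the child key."""
    return [
        r.child_account
        for r in RELATIONSHIPS
        if r.parent_account == account
        and r.relationship_type == relationship_type
    ]


def parents_of(account):
    """Navigate up: match on the child key, return (type, parent) pairs."""
    return sorted(
        (r.relationship_type, r.parent_account)
        for r in RELATIONSHIPS
        if r.child_account == account
    )


print(children_of("Operating Expenses", "US_GAAP"))
# ['Depreciation', 'Salaries']
print(children_of("Operating Expenses", "IFRS"))
# ['Salaries']
print(parents_of("Depreciation"))
# [('IFRS', 'Cost of Sales'), ('US_GAAP', 'Operating Expenses')]
```

Note that navigating down the hierarchy filters on the parent key, and navigating up filters on the child key. This is why clear role names matter: the direction of travel and the key being matched are easy to mix up.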
Configuring hierarchies for navigation
Hierarchies can be displayed and navigated in Semarchy by using the hierarchy options when configuring transitions in a business view.
- The same process is used no matter which style of hierarchy modeling was used. The Semarchy xDM Developer's Guide describes how to configure hierarchies in Business Views.
- Navigating a series of cascading one-to-many relationships between different entities (for a fixed-level hierarchy) is configured as a transition in a Semarchy business view, just like any other transition. This will automatically add a tab to the base form of the business view in the UI, showing a list (as a Semarchy collection) of the related child records. You can continue configuring additional transitions as you drill down the hierarchy. Below is an example of a business view that navigates transitions from a region ("EG") to the airports in that region ("EGBB - Birmingham"), then to the runways at each airport ("15/33"), along with the details of how the transitions were configured. Important things to note:
- This business view is configured for the Region entity ("ICAORegion") since this is the top entity in the hierarchy.
- The tree view in the UI is enabled by the "Enable Hierarchy View" checkbox in the business view configuration. Separately, each transition can be enabled in the hierarchy (or not) with the "Enable in hierarchy" checkbox in the "Hierarchy Configuration" finger tab.
- Business view transitions are also used to configure the UI to navigate a separate relationship entity. Here is an example of a product hierarchy modeled as ProductType, ProductNode and ProductRelationship entities, followed by the configuration of the transitions in the business view. Features to note here are:
- The business view is configured for the ProductNode entity.
- It includes transitions to the child records (as ProductParentToChild) and also to its parents (ProductChildToParent), because this modeling approach permits a ProductNode to have multiple, alternate parents as well as multiple children.
- It is very important to name the transitions carefully and to understand their use. The transition down the hierarchy to the child records uses the ProductParent role in the data model, because it is navigating from a ProductNode record to the ProductRelationship records, using the value of the ProductParent foreign key in the child record. It is very easy to use the wrong relationship role when configuring these relationship entities in a business view transition!
- It is possible to display multiple hierarchical relationships in Semarchy - for example, in the Product Retail demo application it is possible to configure hierarchy navigation by Product Family, Product Sub-Family, Product and Item, and also to navigate (in the same tree view) into the permitted sizes that are appropriate for each Product Family.
- If a recursive self-join is used to represent a hierarchy, you need to add a Root Filter to the business view so that only the top-most member of the hierarchy is displayed. You can access all descendant members (see the definition of "descendant" below) by drilling down the hierarchy tree view. The root filter identifies the root member of the hierarchy (i.e. the member whose parent foreign key is null). Here is an example based upon an earlier data model example:
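The logic behind a root filter on a self-join can be sketched in a few lines of plain Python. This is not SemQL or Semarchy configuration; it only mimics the idea. The "ParentOffice" key follows the Office example mentioned earlier, while "OfficeID", "Name" and the sample records are assumptions made for illustration.

```python
# Hypothetical Office records; ParentOffice is the self-join foreign key.
offices = [
    {"OfficeID": 1, "Name": "Head Office", "ParentOffice": None},
    {"OfficeID": 2, "Name": "Regional Office East", "ParentOffice": 1},
    {"OfficeID": 3, "Name": "Clinic A", "ParentOffice": 2},
    {"OfficeID": 4, "Name": "Clinic B", "ParentOffice": 2},
    {"OfficeID": 5, "Name": "Regional Office West", "ParentOffice": 1},
]


def roots(records):
    """Root filter: keep only records whose parent foreign key is null."""
    return [r for r in records if r["ParentOffice"] is None]


def children(records, office_id):
    """Direct children: records whose parent key points at office_id."""
    return [r for r in records if r["ParentOffice"] == office_id]


def render_tree(records, node, depth=0):
    """Drill down from a node, returning one indented line per member."""
    lines = ["  " * depth + node["Name"]]
    for child in children(records, node["OfficeID"]):
        lines.extend(render_tree(records, child, depth + 1))
    return lines


for root in roots(offices):
    print("\n".join(render_tree(offices, root)))
```

Without the filter, every office would appear at the top level of the tree as well as under its parent. With it, only "Head Office" appears at the top, and all other offices are reached by drilling down.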
Modeling Tips
- Use consistent terminology when referring to hierarchy design. Here are some examples using the upside-down tree/root/branch paradigm:
- Member: Any record in a hierarchy ("node" is a common alternative nomenclature)
- Root: The top-most member of a hierarchy
- Leaf: The bottom-most member of a branch of a hierarchy. Lower level (child) members cannot be added to a leaf.
- Ancestors: All the members between a member up to (and including) the root member. This includes a member's "parent", "grand-parents", etc.
- Descendants: All the members below a member, in all branches and sub-branches.
- Siblings: All members who share the same parent member.
- Levels: Usually define the depth of a member within a hierarchy (i.e. the number of ancestor members plus the member itself), typically starting at Level 1 for the root member.
- In general, it is not a good idea to create separate reference relationships to define the relationship between a leaf record and each of its parents, grand-parents, and so on. For example, if you want to track a sales representative's Business Manager and their Area Manager, this is best captured in a self-join entity, in which the role of each employee (e.g. "Area Manager", "Business Manager", "Sales Representative") is captured in that employee's record. A separate reference relationship for each level will become unmaintainable, with inconsistent data, as employees change roles, get hired and fired, etc.
- When you define reference relationships between entities, always give them meaningful names. Also define meaningful names for the role names at each end of the reference relationships. You will be thankful when you come to configure transitions in business views (or elsewhere).
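The terminology above, and the advice to use a single self-join with a role on each record, can be illustrated together. The following is a sketch in plain Python with made-up employee names. The "manager" key stands in for the self-join foreign key and is an assumption, as is the choice to exclude a member from its own list of siblings.

```python
# Hypothetical employees in a self-join: each record holds its own role
# and a single reference to its direct manager (None for the root).
employees = {
    "Ava": {"manager": None, "role": "Area Manager"},
    "Ben": {"manager": "Ava", "role": "Business Manager"},
    "Cara": {"manager": "Ben", "role": "Sales Representative"},
    "Dan": {"manager": "Ben", "role": "Sales Representative"},
}


def ancestors(name):
    """Members from the parent up to and including the root, nearest first."""
    chain = []
    current = employees[name]["manager"]
    while current is not None:
        chain.append(current)
        current = employees[current]["manager"]
    return chain


def descendants(name):
    """All members below a member, in all branches and sub-branches."""
    result = []
    for other, record in employees.items():
        if record["manager"] == name:
            result.append(other)
            result.extend(descendants(other))
    return result


def siblings(name):
    """Other members that share the same parent."""
    parent = employees[name]["manager"]
    return [
        other
        for other, record in employees.items()
        if record["manager"] == parent and other != name
    ]


def level(name):
    """Depth within the hierarchy, with the root at Level 1."""
    return len(ancestors(name)) + 1


def manager_with_role(name, role):
    """Nearest ancestor holding the given role, or None if there is none."""
    for ancestor in ancestors(name):
        if employees[ancestor]["role"] == role:
            return ancestor
    return None


print(ancestors("Cara"))                          # ['Ben', 'Ava']
print(level("Cara"))                              # 3
print(siblings("Cara"))                           # ['Dan']
print(manager_with_role("Cara", "Area Manager"))  # Ava
```

Because the Area Manager is derived by walking up the single manager reference, a change of role or reporting line only has to be recorded in one place.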