Configure "last one in wins" consolidation strategy in MDM

Use case 

I have three publishers:

  1. PubA
  2. PubB
  3. PubC

If the record comes from PubA or PubB, the field from either PubA or PubB should win first. Otherwise the value from PubC should win. If there are records from both PubA and PubB, then the more recent record should win.

I have the following master records that consolidate to one golden record:

Day 1 


IDField1Field2
Master recordsPubA.123Evans1 pm
PubC.345Evans2pm
Golden recordsabc123Evans1 pm

Day 2


IDField1Field2
Master recordsPubA.123Evans1 pm

PubB.234Evans1:00 p.m.

PubC.345Evans 2pm
Golden recordsabc123Evans 1:00 p.m.

Problem

The system field UpdateDate doesn't always work as a tie breaker for the consolidation strategy. Why?

When a new record is submitted or there is a change to the match conditions, the master record might show a new UpdateDate because this record was "touched" from a system/technical perspective in calculating new match groups. However, this new UpdateDate could be considered nonsensical from a business user's perspective because there was no material business change to the record. For example, we could possibly get an update to the UpdateDate for all the master records in the group (they're showing Updated On to Sep 15, 2020) even if 4 out of 5 of the master records did not have any updates to the business values because 1 of the 5 master records was updated or considered to be part of another match group. 

To resolve this problem, we need greater control of the UpdateDate field that we rely on for survivorship logic. To achieve this, the solution below explains how you can create a new field that is updated whenever the record is pushed for processing in xDM. 

Solution

  1. Add a new field like SourceUpdateDate. Write the consolidation strategy that ranks publishers and uses the SourceUpdateDate as a tie breaker. 
  2. Add a new SemQL Enricher like RecordSourceUpdateDate

  3. Enrich into the SourceUpdateDate field the current timestamp. In the SemQL editor, use CURRENT_TIMESTAMP()

  4. Use the Largest Value for your field-level consolidation strategy for SourceUpdateDate
  5. Use the following Custom Ranking consolidation rule for your business fields where you care about 'last one in'. Check Skip Nulls. You don't want to take PubA's value for that field if it is null.
case /* consolidation rule ranking based on publishers */ 
  when PublisherID = 'A' or PublisherID = 'B' then 10 
  when PublisherID = 'C' then 20 
  else 30 /* good practice to define an else clause */
end ASC,
SourceUpdateDate DESC

This case statement includes the else clause which will take care of publishers that are not A, B, or C. That should answer your question about Pub D.

Important Note

If you simply have Pub A, Pub B and Pub C, and you want to want to make sure the "last one in" wins for each publisher and you are OK with using a Preferred Publisher strategy, then you can just add the expression SourceUpdateDate desc in the Additional Order By field as a tiebreaker instead of using this Custom Ranking SemQL consolidation strategy. It still requires you to go through step 1-4 but you don't need to configure a Custom Ranking consolidation rule. CAVEAT: This strategy only works if the SourceUpdateDate field is populated for ALL records. If you have any nulls in SourceUpdateDate, Oracle by default ranks Nulls first when ordering by descending. So using SourceUpdateDate desc in the Additional Order By field will not work. Instead, use a coalesce with SourceUpdateDate and CreationDatecoalesce( SourceUpdateDate, CreationDate ) desc