Skip to end of metadata
Go to start of metadata

You are viewing an old version of this page. View the current version.

Compare with Current View Page History

« Previous Version 5 Current »

Use case 

I have three publishers:

  1. PubA
  2. PubB
  3. PubC

If the record comes from PubA or PubB, the field from either PubA or PubB should win first. Otherwise the value from PubC should win. If there are records from both PubA and PubB, then the more recent record should win.

I have the following master records that consolidate to one golden record:

Day 1 


IDField1Field2
Master recordsPubA.123Evans1 pm
PubC.345Evans2pm
Golden recordsabc123Evans1 pm

Day 2


IDField1Field2
Master recordsPubA.123Evans1 pm

PubB.234Evans1:00 p.m.

PubC.345Evans 2pm
Golden recordsabc123Evans 1:00 p.m.

Problem

The system field UpdateDate doesn't work in this consolidation strategy. When a new record is submitted, the UpdateDate can be nonsensical because the last record in has an UpdateDate of NULL and the older record has an UpdateDate. For example:

In this case, the record created in September should win. But the UpdateDate attribute is NULL so basing the consolidation strategy on UpdateDate will be bad.

Solution

  1. Add a new field like SourceUpdateDate. Write the consolidation strategy that ranks publishers and uses the SourceUpdateDate as a tie breaker. 
  2. Add a new SemQL Enricher like RecordSourceUpdateDate

  3. Enrich into the SourceUpdateDate field the current timestamp. In the SemQL editor, use CURRENT_TIMESTAMP()

  4. Use the Largest Value for your field-level consolidation strategy for SourceUpdateDate
  5. Use the following Custom Ranking consolidation rule for your business fields where you care about 'last one in'. Check Skip Nulls. You don't want to take PubA's value for that field if it is null.
case /* consolidation rule ranking based on publishers */ 
  when PublisherID = 'A' or PublisherID = 'B' then 10 
  when PublisherID = 'C' then 20 
  else 30 /* good practice to define an else clause */
end ASC,
SourceUpdateDate DESC

This case statement includes the else clause which will take care of publishers that are not A, B, or C. That should answer your question about Pub D.

Important Note

If you simply have Pub A, Pub B and Pub C, and you want to want to make sure the "last one in" wins for each publisher and you are OK with using a Preferred Publisher strategy, then you can just add the expression SourceUpdateDate desc in the Additional Order By field as a tiebreaker instead of using this Custom Ranking SemQL consolidation strategy. It still requires you to go through step 1-4 but you don't need to configure a Custom Ranking consolidation rule. CAVEAT: This strategy only works if the SourceUpdateDate field is populated for ALL records. If you have any nulls in SourceUpdateDate, Oracle by default ranks Nulls first when ordering by descending. So using SourceUpdateDate desc in the Additional Order By field will not work. Instead, use a coalesce with SourceUpdateDate and CreationDatecoalesce( SourceUpdateDate, CreationDate ) desc

  • No labels