...

  • Using Transformations in Matching Rules
    Avoid functions that transform data (SOUNDEX, UPPER, etc.) in match and binning rules.
    • Reasons:
      • They may prevent the database from using indexes, and they are evaluated every time a record is compared.
    • Solution:
      • Materialize these values into attributes via enrichers.
  • Use Fuzzy Matching Functions with Care
    • Distance and distance similarity functions are the most costly functions.
      Sometimes, materializing a phonetized value with an enricher and then comparing it with equality gives functionally equivalent results.
  • Very Large Complex Rules
    • Avoid one big matching rule.
    • Each rule should address a functionally consistent set of attributes.
  • Consider Indexing
    • For very large volumes, consider adding one index on the significant columns involved in the binning, then one index on the columns involved in the matching rule.
      e.g.:
      create index S_<indexName> on MI_<entity> (B_BATCHID, B_BRANCHID, B_CLASSNAME, B_PUBID, B_SOURCEID, <columns involved in matching, those with more distinct values first>);
      -- Remove B_BRANCHID for v4.0 and above
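
As a rough sketch of the materialization advice above, the two match conditions below contrast a transformation evaluated at comparison time with an equality test on a pre-computed attribute. These are condition fragments, not complete statements. The attribute names LastName and PhoneticLastName are hypothetical, PhoneticLastName is assumed to be populated by an enricher, and the Record1/Record2 notation assumes the usual SemQL match-rule convention.

```sql
-- Costly: SOUNDEX is re-evaluated for every pair of records compared
SOUNDEX(Record1.LastName) = SOUNDEX(Record2.LastName)

-- Cheaper: plain equality on a value materialized once by an enricher
-- (PhoneticLastName is a hypothetical attribute holding the phonetized LastName)
Record1.PhoneticLastName = Record2.PhoneticLastName
```

With the second form, the materialized column can also be listed among the matching columns of the index described above.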

Issues in Other Certification Phases

...

  • Symptom: "ORA-01467: sort key too long" error
    • Solution: run alter session set "_windowfunc_optimization_settings" = 128; as a session initialization statement for the connection pool in the datasource configuration.

...

  • Using FDN_ columns instead of FID_ columns in the SemQL condition
  • Using functions that transform data (SOUNDEX, UPPER, etc.), which may prevent indexes from being used
  • Overly complex SemQL condition


Other Tips

Oracle Statistics

Make sure that Oracle statistics gathering is not turned off in the Semarchy process.
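
One way to check whether statistics are actually being collected is to look at the data dictionary. The query below is a minimal sketch, assuming it is run as the owner of the data location schema and that the tables of interest follow the MI_ prefix used in the index example; adjust the filter to the tables you care about.

```sql
-- List MI_ tables in the current schema whose optimizer statistics
-- are missing or flagged as stale by Oracle.
select table_name, num_rows, last_analyzed, stale_stats
from user_tab_statistics
where object_type = 'TABLE'
  and table_name like 'MI\_%' escape '\'
  and (last_analyzed is null or stale_stats = 'YES')
order by table_name;
```

Tables that still appear here after a load suggest that statistics gathering is disabled or not keeping up.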

Logging

Turning off logging can significantly speed up processing, but no Semarchy activity events are logged while it is off.